---
title: "Quick guide"
author: "Nicolò Rossi"
date: "28/3/2020"
output:
  html_document:
    df_print: paged
vignette: >
  %\VignetteIndexEntry{Quick guide}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(CIMICE)
library(dplyr)
library(igraph)
library(Matrix)
library(purrr)
```

## Reference for quick CIMICE analysis

This short tutorial shows how to run a CIMICE analysis step by step.

### Read input

To load the input, use the `CIMICE::read` family of functions, or the pair `CIMICE::make_dataset` and `CIMICE::update_df`, as follows:

```{r cars, results=FALSE}
# read from file
read_CAPRI(system.file("extdata", "example.CAPRI", package = "CIMICE", mustWork = TRUE))
```

```{r}
# from a string
read_CAPRI_string("
s\\g A B C D
S1 0 0 0 1
S2 1 0 0 0
S3 1 0 0 0
S4 1 0 0 1
S5 1 1 0 1
S6 1 1 0 1
S7 1 0 1 1
S8 1 1 0 1
")
```

```{r}
# using CIMICE::make_dataset and CIMICE::update_df
# genes
make_dataset(A,B,C,D) %>%
    # samples
    update_df("S1", 0, 0, 0, 1) %>%
    update_df("S2", 1, 0, 0, 0) %>%
    update_df("S3", 1, 0, 0, 0) %>%
    update_df("S4", 1, 0, 0, 1) %>%
    update_df("S5", 1, 1, 0, 1) %>%
    update_df("S6", 1, 1, 0, 1) %>%
    update_df("S7", 1, 0, 1, 1) %>%
    update_df("S8", 1, 1, 0, 1)
```

This last dataset is used in the rest of this example, where it is available as `example_dataset()`.

Any exploratory analysis of the input dataset and any feature selection should be done at this point, before preprocessing.

## Preprocess dataset

This early phase reorganizes the dataset to simplify further analysis.

```{r}
preproc <- dataset_preprocessing(example_dataset())
samples <- preproc[["samples"]]
freqs   <- preproc[["freqs"]]
labels  <- preproc[["labels"]]
genes   <- preproc[["genes"]]
```

## Build topology

By default, CIMICE organizes genotypes in a graph according to the "subset" relation among them.
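To make the "subset" relation concrete, here is a minimal base-R sketch. It is not CIMICE's implementation. It assumes the distinct genotypes of the example dataset plus an all-zero "Clonal" root (an assumption about the preprocessed data), and the names `genotypes`, `is_proper_subset`, `subset_rel`, and `edges` are hypothetical. An edge goes from genotype `x` to genotype `y` when the mutations of `x` are a proper subset of those of `y` and no third genotype lies strictly between them.

```r
# Distinct genotypes of the example dataset (rows) over genes A..D (columns).
# The all-zero "Clonal" row is an assumption made for this illustration.
genotypes <- matrix(
  c(0, 0, 0, 0,
    0, 0, 0, 1,
    1, 0, 0, 0,
    1, 0, 0, 1,
    1, 1, 0, 1,
    1, 0, 1, 1),
  ncol = 4, byrow = TRUE,
  dimnames = list(c("Clonal", "D", "A", "AD", "ABD", "ACD"),
                  c("A", "B", "C", "D"))
)

# TRUE when every mutation of x is also in y and y has at least one more
is_proper_subset <- function(x, y) all(x <= y) && any(x < y)

# Full (transitive) proper-subset relation between all pairs of genotypes
n <- nrow(genotypes)
subset_rel <- matrix(FALSE, n, n,
                     dimnames = list(rownames(genotypes), rownames(genotypes)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    subset_rel[i, j] <- is_proper_subset(genotypes[i, ], genotypes[j, ])
  }
}

# Remove transitive edges: drop i -> j if some k satisfies i < k < j
edges <- subset_rel
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    if (subset_rel[i, j] && any(subset_rel[i, ] & subset_rel[, j])) {
      edges[i, j] <- FALSE
    }
  }
}

# Edge list of the resulting non-transitive subset graph
idx <- which(edges, arr.ind = TRUE)
data.frame(from = rownames(edges)[idx[, "row"]],
           to   = colnames(edges)[idx[, "col"]])
```

In this toy case the direct edge from `Clonal` to `AD` is dropped because `A` and `D` both sit between them. The call in the next chunk performs the actual construction on the preprocessed `samples` and `labels`.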
```{r}
g <- graph_non_transitive_subset_topology(samples,labels)
```

```{r, echo=FALSE}
V(g)$vertex.size <- rep(10, length(V(g)))
plot(g, vertex.size=rep(55, length(V(g))))
```

## Compute Weights

CIMICE provides a strategy to estimate transition probabilities among genotypes. This phase is divided into four subphases, which can be executed together as follows:

```{r}
W <- compute_weights_default(g, freqs)
```

## Visualize output

CIMICE supports several output options; the recommended one is based on "visNetwork".

```{r}
out <- quick_run(example_dataset()) # quick_run summarizes all the previous steps
draw_visNetwork(out)
```
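
As a closing sanity check on the "Compute Weights" step: if the weights are read as transition probabilities, the outgoing weights of every genotype that has successors should sum to 1. The base-R sketch below illustrates that check on a hand-made toy matrix, not on CIMICE output. The values in `W_toy`, the node names, and the row-equals-source orientation are all assumptions made for illustration.

```r
# Hand-made toy weights (hypothetical values, not CIMICE output).
# Assumed orientation: rows = source genotype, columns = destination genotype.
nodes <- c("Clonal", "A", "D", "AD")
W_toy <- matrix(0, nrow = 4, ncol = 4, dimnames = list(nodes, nodes))
W_toy["Clonal", "A"] <- 0.75
W_toy["Clonal", "D"] <- 0.25
W_toy["A", "AD"] <- 1
W_toy["D", "AD"] <- 1

# Total outgoing weight per genotype
out_sums <- rowSums(W_toy)

# Leaves have no outgoing edges; every other row should sum to 1
is_leaf <- out_sums == 0
rows_ok <- is_leaf | abs(out_sums - 1) < 1e-8

all(rows_ok)
```

The same kind of check could be applied to a real weight matrix, provided its orientation matches the assumption stated above.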