---
title: "NetBoxR Tutorial"
author: "Eric Minwei Liu and Augustin Luna"
date: "`r format(Sys.time(), '%d %B, %Y')`"
output:
  BiocStyle::html_document:
    toc: yes
    toc_float: no
  html_document:
    df_print: paged
    toc: yes
  html_notebook: default
  md_document:
    toc: yes
    variant: gfm
  pdf_document:
    toc: yes
always_allow_html: yes
vignette: >
  %\VignetteIndexEntry{NetBoxR Tutorial}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
---

```{r knitrSetup, include=FALSE}
library(knitr)
opts_chunk$set(out.extra='style="display:block; margin: auto"', fig.align="center", fig.width=12, fig.height=12, tidy=TRUE)
```

```{r style, include=FALSE, echo=FALSE, results='asis'}
BiocStyle::markdown()
```

# Overview

The **netboxr** package provides a number of functions to retrieve and process genetic data from large-scale genomics projects (e.g. TCGA projects), including mutations, copy number alterations, gene expression and DNA methylation. The package implements the NetBox algorithm in R. The NetBox algorithm integrates genetic alterations with literature-curated pathway knowledge to identify pathway modules in cancer. It uses (1) a global network null model and (2) a local network null model to assess the statistical significance of the discovered pathway modules.

# Basics

## Installation

```{r installNetBoxr, eval=FALSE}
BiocManager::install("netboxr")
```

## Getting Started

Load the **netboxr** package:

```{r loadLibrary, message=FALSE, warning=FALSE}
library(netboxr)
```

A list of all accessible vignettes and methods is available with the following command:

```{r searchHelp, eval=FALSE, tidy=FALSE}
help(package="netboxr")
```

For help on any **netboxr** package function, use one of the following command formats:

```{r showHelp, eval=FALSE, tidy=FALSE}
help(geneConnector)
?geneConnector
```

# Example of Cerami et al. PLoS One 2010

This example reproduces the network discovered in [Cerami et al. (2010)](http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0008918).

The results presented here are comparable to those from Cerami et al. 2010, though the unadjusted p-values for linker genes are not the same. This is because the unadjusted p-values of linker genes in Cerami et al. 2010 were calculated as the probability of the observed data point, Pr(X). netboxr instead uses the probability of an observation as extreme or more extreme, assuming the null hypothesis is true, Pr(X>=x|H), as the unadjusted p-value for linker genes. The final number of linker genes after FDR correction is the same in the netboxr result and in the original Cerami et al. 2010 result.

## Load the Human Interactions Network (HIN)

Load the pre-defined HIN network and simplify it by removing loops and duplicated interactions. The network after reduction contains 9264 nodes and 68111 interactions.

```{r netboxrExampleNetwork}
data(netbox2010)
sifNetwork <- netbox2010$network
graphReduced <- networkSimplify(sifNetwork, directed = FALSE)
```

## Load altered gene list

The altered gene list contains 517 candidates from mutations and copy number alterations.

```{r netboxrExampleGene}
geneList <- as.character(netbox2010$geneList)
length(geneList)
```

## Map altered gene list on HIN network

The geneConnector function in the netboxr package takes the altered gene list as input and maps the genes onto the curated network to find the local processes represented by the gene list.

```{r netboxrExampleGeneConnector, fig.width=12, fig.height=12}
## Use the Benjamini-Hochberg method to do multiple hypothesis
## correction for linker candidates.

## Use the edge-betweenness method to detect community structure in the network.
threshold <- 0.05
results <- geneConnector(geneList=geneList,
                         networkGraph=graphReduced,
                         directed=FALSE,
                         pValueAdj="BH",
                         pValueCutoff=threshold,
                         communityMethod="ebc",
                         keepIsolatedNodes=FALSE)

# Add edge annotations: assign one color per interaction type
library(RColorBrewer)
edges <- results$netboxOutput
interactionType <- unique(edges[,2])
interactionTypeColor <- brewer.pal(length(interactionType), name="Spectral")
edgeColors <- data.frame(interactionType, interactionTypeColor, stringsAsFactors = FALSE)
colnames(edgeColors) <- c("INTERACTION_TYPE", "COLOR")

netboxGraphAnnotated <- annotateGraph(netboxResults = results,
                                      edgeColors = edgeColors,
                                      directed = FALSE,
                                      linker = TRUE)

# Check the p-values of the selected linkers
# (assumption: rows are kept with a "less than" comparison
# on the FDR-adjusted p-value)
linkerDF <- results$neighborData
linkerDF[linkerDF$pValueFDR < 0.1, ]
```

# References

* Cerami E, Demir E, Schultz N, Taylor BS, Sander C. Automated Network Analysis Identifies Core Pathways in Glioblastoma. PLoS ONE. 2010;5(2):e8918. doi:10.1371/journal.pone.0008918
* Cerami EG, Gross BE, Demir E, Rodchenkov I, Babur O, Anwar N, Schultz N, Bader GD, Sander C. Pathway Commons, a web resource for biological pathway data. Nucleic Acids Res. 2011;39(Database issue):D685-90. doi:10.1093/nar/gkq1039

# Session Information

```{r sessionInfo}
sessionInfo()
```