---
title: "TF activity inference from bulk transcriptomic data with DoRothEA as regulon resource."
author:
  - name: Christian H. Holland
    affiliation: Institute for Computational Biomedicine, Heidelberg University
    email: christian.holland@bioquant.uni-heidelberg.de
  - name: Alberto Valdeolivas
    affiliation: Institute for Computational Biomedicine, Heidelberg University
  - name: Julio Saez-Rodriguez
    affiliation: Institute for Computational Biomedicine, Heidelberg University
package: dorothea
output:
  BiocStyle::html_document
bibliography: references.bib
abstract: |
  This vignette describes how to infer transcription factor activity from bulk transcriptome data by using DoRothEA's curated regulons with viper.
license: GNU-GPLv3, please check http://www.gnu.org/licenses/
vignette: |
  %\VignetteIndexEntry{TF activity inference from bulk transcriptomic data with DoRothEA as regulon resource.}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

# Introduction

**DoRothEA** is a comprehensive resource containing a curated collection of transcription factors (TFs) and their transcriptional targets. The set of genes regulated by a specific transcription factor is known as its regulon. DoRothEA's regulons were gathered from different types of evidence. Each TF-target interaction is assigned a confidence level based on the amount of supporting evidence. The confidence levels range from A (highest confidence) to E (lowest confidence) [@GarciaAlonso2019].

While DoRothEA was originally developed for human data, it can also be applied to mouse data, with comparable performance but better coverage than dedicated mouse regulons [@Holland2019].

**DoRothEA** regulons are usually coupled with the statistical method **VIPER** [@Alvarez2016]. In this context, the activity of a TF is computed from the mRNA expression levels of its targets. We can therefore consider TF activity as a proxy of a given transcriptional state [@Dugourd2019].
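
As a rough illustration of how an activity score can be derived from the expression of a TF's targets, the sketch below computes a signed mean over a toy regulon. It is not VIPER and not part of the dorothea package; the gene names, sample names, expression values and the column names of `toy_regulon` are all invented for this example.

```{r "toy signed mean"}
# Toy expression matrix: 4 hypothetical genes x 2 hypothetical samples
# (all names and values are invented for illustration).
expr <- matrix(
  c( 2.0, -1.0,
     1.0, -2.0,
    -1.5,  0.5,
     0.3,  0.1),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC", "geneD"),
                  c("sample1", "sample2"))
)

# Toy regulon of one hypothetical TF: the mode of regulation (mor) is
# +1 for activated and -1 for repressed targets.
toy_regulon <- data.frame(
  tf = "TF1",
  target = c("geneA", "geneB", "geneC"),
  mor = c(1, 1, -1)
)

# Signed mean: flip the sign of repressed targets, then average per sample.
# The mor vector has one entry per target row, so it scales the rows.
target_expr <- expr[toy_regulon$target, , drop = FALSE]
signed_mean <- colMeans(target_expr * toy_regulon$mor)
signed_mean
```

Here `sample1` receives a positive and `sample2` a negative score, which would be read as higher and lower activity of the hypothetical TF, respectively.
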
It is, however, up to the user to decide which statistical method to use. Alternatives include, for instance, classical Gene Set Enrichment Analysis or a simple mean statistic.

# Installation

First of all, you need a current version of R (http://www.r-project.org). In addition, you need `r Biocpkg("dorothea")`, a freely available package deposited on http://bioconductor.org/ and https://github.com/saezlab/dorothea.

You can install it by running the following commands in an R console:

```{r "installation", eval=FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("dorothea")
```

We also load here the packages required to run this vignette:

```{r "load packages", message = FALSE}
## We load the required packages
library(dorothea)
library(bcellViper)
library(dplyr)
library(viper)
```

# Example of usage

Following the vignette of the `r Biocpkg("viper")` package, we demonstrate how to combine viper with regulons from DoRothEA.

## Accessing example data and DoRothEA regulons

Similar to the `r Biocpkg("viper")` vignette, we use the gene expression matrix from the `r Biocpkg("bcellViper")` package. Click [here](https://bioconductor.org/packages/release/bioc/vignettes/viper/inst/doc/viper.pdf) for more information about the gene expression matrix. The regulons from DoRothEA are provided within the dorothea package and can be accessed via the `data()` function. As the gene expression matrix contains human data, we also load the human version of DoRothEA.

```{r "load data"}
# accessing expression data from bcellViper
data(bcellViper, package = "bcellViper")

# accessing (human) dorothea regulons
# for mouse regulons: data(dorothea_mm, package = "dorothea")
data(dorothea_hs, package = "dorothea")
```

## Running VIPER with DoRothEA regulons

We implemented a wrapper for the viper function that can deal with different input types such as matrix, data frame, ExpressionSet or Seurat objects (see the dedicated vignette for single-cell analysis). We subset DoRothEA to the confidence levels A and B to include only the high-quality regulons.

```{r "run viper", message=FALSE}
# keep only interactions with confidence level A or B
regulons <- dorothea_hs %>%
  filter(confidence %in% c("A", "B"))

# dset is the ExpressionSet loaded by data(bcellViper)
tf_activities <- run_viper(dset, regulons,
                           options = list(method = "scale", minsize = 4,
                                          eset.filter = FALSE, cores = 1,
                                          verbose = FALSE))
```

# Session info

```{r sessionInfo, echo=FALSE}
sessionInfo()
```

# References