--- title: > The `iSEEfier` User's Guide author: - name: Najla Abassi affiliation: - Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), Mainz email: najla.abassi@uni-mainz.de - name: Federico Marini affiliation: - Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), Mainz - Research Center for Immunotherapy (FZI), Mainz email: marinif@uni-mainz.de date: "`r BiocStyle::doc_date()`" output: BiocStyle::html_document: toc: true toc_float: true vignette: > %\VignetteIndexEntry{iSEEfier_userguide} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} %\VignettePackage{iSEEfier} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", error = FALSE, warning = FALSE, eval = TRUE, message = FALSE ) ``` # Introduction {#introduction} This vignette describes how to use the `r BiocStyle::Biocpkg("iSEEfier")` package to configure various initial states of iSEE instances, in order to simplify the task of visualizing single-cell RNA-seq, bulk RNA-seq data, or even your proteomics data in `r BiocStyle::Biocpkg("iSEE")`. In the remainder of this vignette, we will illustrate the main features of `r BiocStyle::Biocpkg("iSEEfier")` on a publicly available dataset from Baron et al. "A Single-Cell Transcriptomic Map of the Human and Mouse Pancreas Reveals Inter- and Intra-cell Population Structure", published in Cell Systems in 2016. [doi:10.1016/j.cels.2016.08.011](https://doi.org/10.1016/j.cels.2016.08.011). The data is made available via the `r BiocStyle::Biocpkg("scRNAseq")` Bioconductor package. We'll simply use the mouse dataset, consisting of islets isolated from five C57BL/6 and ICR mice. # Getting started {#gettingstarted} To install `r BiocStyle::Biocpkg("iSEEfier")` package, we start R and enter: ```{r install, eval=FALSE} if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager") BiocManager::install("iSEEfier") ``` Once installed, the package can be loaded and attached to the current workspace as follows: ```{r setup} library("iSEEfier") ``` # Create an initial state for gene expression visualization using `iSEEinit()` When we have all input elements ready, we can create an `iSEE` initial state by running: ```{r runfunc, eval=FALSE} iSEEinit(sce = sce_obj, features = feature_list, reddim.type = reduced_dim, clusters = cluster, groups = group, add_markdown_panel = FALSE) ``` To configure the initial state of our `iSEE` instance using `iSEEinit()`, we need five parameters: 1. `sce` : A `SingleCellExperiment` object. This object stores information of different quantifications (counts, log-expression...), dimensionality reduction coordinates (t-SNE, UMAP...), as well as some metadata related to the samples and features. We'll start by loading the `sce` object: ```{r loaddata} library("scRNAseq") sce <- BaronPancreasData('mouse') sce ``` Let's add the normalized counts ```{r logNorm} library("scuttle") sce <- logNormCounts(sce) ``` Now we can add different dimensionality reduction coordinates ```{r reddim} library("scater") sce <- runPCA(sce) sce <- runTSNE(sce) sce <- runUMAP(sce) ``` Now our `sce` is ready, we can move on to the next argument. 2. `features` : which is a vector or a dataframe containing the genes/features of interest. Let's say we would like to visualize the expression of some genes that were identified as marker genes for different cell population. ```{r genelist} gene_list <- c("Gcg", # alpha "Ins1") # beta ``` 3. `reddim_type` : In this example we decided to plot our data as a t-SNE plot. ```{r reddim-type} reddim_type <- "TSNE" ``` 4. `clusters` : Now we specify what clusters/cell-types/states/samples we would like to color/split our data with ```{r cluster-id} # cell populations cluster <- "label" #the name should match what's in the colData names ``` 5. `groups` : Here we can add the groups/conditions/cell-types ```{r group-id} # ICR vs C57BL/6 group <- "strain" #the name should match what's in the colData names ``` We can choose to include in this initial step a `MarkdownBoard` by setting the arguments `add_markdown_panel` to `TRUE`. At this point, all the elements are ready to be transferred into `iSEEinit()` ```{r initial1} initial1 <- iSEEinit(sce = sce, features = gene_list, clusters = cluster, groups = group, add_markdown_panel = TRUE) ``` In case our `features` parameter was a data.frame, we could assign the name of the column containing the features to the `gene_id` parameter. Now we are one step away from visualizing our list of genes of interest. All that's left to do is to run `iSEE` with the initial state created with `iSEEinit()` ```{r iSEEviz1, eval=FALSE} library("iSEE") iSEE(sce, initial= initial1) ``` This instance, generated with `iSEEinit()`, returns a combination of panels, linked to each other, with the goal of visualizing the expression of certain marker genes in each cell population/group: - A `ReducedDimensionPlot`, `FeatureAssayPlot` and `RowDataTable` for each single gene in `features`. - A `ComplexHeatmapPlot` with all genes in `features` - A `ColumnDataPlot` panel - A `MarkdownBoard` panel # Create an initial state for feature sets exploration using `iSEEnrich()` Sometimes it is interesting to look at some specific feature sets and the associated genes. That's when the utility of `iSEEnrich` becomes apparent. We will need 4 elements to explore feature sets of interest: - `sce`: A SingleCellExperiment object - `collection`: A character vector specifying the gene set collections of interest (it is possible to use GO or KEGG terms) - `gene_identifier`: A character string specifying the identifier to use to extract gene IDs for the organism package. This can be **"ENS"** for ENSEMBL ids, **"SYMBOL"** for gene names... - `organism`: A character string of the `org.*.eg.db` package to use to extract mappings of gene sets to gene IDs. - `reddim_type`: A string vector containing the dimensionality reduction type - `clusters`: A character string containing the name of the clusters/cell-type/state...(as listed in the colData of the sce) - `groups`: A character string of the groups/conditions...(as it appears in the colData of the sce) ```{r set-param} GO_collection <- "GO" Mm_organism <- "org.Mm.eg.db" gene_id <- "SYMBOL" cluster <- "label" group <- "strain" reddim_type <- "PCA" ``` Now let's create this initial setup for `iSEE` using `iSEEnrich()` ```{r initial2} results <- iSEEnrich(sce = sce, collection = GO_collection, organism = Mm_organism, gene_identifier = gene_id, clusters = cluster, groups = group, reddim_type = reddim_type) ``` `iSEEnrich` will specifically return a list with the updated `sce` object and its associated `initial` configuration. To start the `iSEE` instance we run: ```{r iSEEviz2, eval=FALSE} iSEE(results$sce, initial = results$initial) ``` # Create an initial state for marker gene exploration using `iSEEmarker()` In many cases, we are interested in determining the identity of our clusters, or further subset our cells types. That's where `iSEEmarker()` comes in handy. Similar to `iSEEinit()`, we need the following parameters: - `sce`: a `SingleCellExperiment` object - `clusters`: the name of the clusters/cell-type/state - `groups`: the groups/conditions - `selection_plot_format`: the class of the panel that we will be using to select the clusters of interest. ```{r initial3} initial3 <- iSEEmarker( sce = sce, clusters = cluster, groups = group, selection_plot_format = "ColumnDataPlot") ``` This function returns a list of panels, with the goal of visualizing the expression of marker genes selected from the `DynamicMarkerTable` in each cell cell type. Unlike `iSEEinit()`, which requires us to specify a list of genes, `iSEEmarker()` utilizes the `DynamicMarkerTable` that performs statistical testing through the `findMarkers()` function from the `r BiocStyle::Biocpkg("scran")` package. To start exploring the marker genes of each cell type with `iSEE`, we run: ```{r iSEEviz3, eval=FALSE} iSEE(sce, initial = initial3) ``` # Visualize a preview of the initial configurations with `view_initial_tiles()` Previously, we successfully generated three distinct initial configurations for iSEE. However, understanding the expected content of our iSEE instances is not always straightforward. That's when we can use `view_initial_tiles()`. We only need as an input the initial configuration to obtain a graphical visualization of the expected the corresponding `iSEE` instance: ```{r panelgraph} library(ggplot2) view_initial_tiles(initial = initial1) view_initial_tiles(initial = results$initial) ``` # Visualize network connections between panels with `view_initial_network()` As some of these panels are linked to each other, we can visualize these networks with `view_initial_network()`. Similar to `iSEEconfigviewer()`, this function takes the initial setup as input: This function always returns the `igraph` object underlying the visualizations that can be displayed as a side effect. ```{r networkviz} library("igraph") library("visNetwork") g1 <- view_initial_network(initial1, plot_format = "igraph") g1 initial2 <- results$initial g2 <- view_initial_network(initial2, plot_format = "visNetwork") ``` # Merge different initial configurations with `glue_initials()` Sometimes, it would be interesting to merge different `iSEE` initial configurations to visualize all different panel in the same `iSEE` instance. ```{r glueconfig} merged_config <- glue_initials(initial1,initial2) ``` We can then preview the content of this initial configuration ```{r preview} view_initial_tiles(merged_config) ``` # Related work The idea of launching `iSEE()` with some specific configuration is not entirely new, and it was covered in some use cases by the `mode_` functions available in the `r BiocStyle::Biocpkg("iSEEu")` package.\ There, the user has access to the following: - `iSEEu::modeEmpty()` - this will launch `iSEE` without any panels, and let you build up the configuration from the scratch. Easy to start, easy to build. - `iSEEu::modeGating()` - this will open `iSEE` with multiple chain-linked FeatureExpressionPlot panels, just like when doing some in silico gating. This could be a very good fit if working with mass cytometry data. - `iSEEu::modeReducedDim()` - `iSEE` will be ready to compare multiple ReducedDimensionPlot panels, which is a suitable option to compare the views resulting from different embeddings (and/or embeddings generated with slightly different parameter configurations). The `mode`s directly launch an instance of `iSEE`, whereas the functionality in `r BiocStyle::Biocpkg("iSEEfier")` is rather oriented to obtain more tailored-to-the-data-at-hand `initial` objects, that can subsequently be passed as an argument to the `iSEE()` call. We encourage users to submit suggestions about their "classical ways" of using `iSEE` on their data - be that by opening an issue or already proposing a Pull Request on GitHub. # Session info {.unnumbered} ```{r} sessionInfo() ```