---
title: "SEtools"
author:
- name: Pierre-Luc Germain
  affiliation:
  - D-HEST Institute for Neurosciences, ETH Zürich
  - Laboratory of Statistical Bioinformatics, University Zürich
package: SEtools
output:
  BiocStyle::html_document:
    fig_height: 3.5
abstract: |
  Showcases the use of SEtools to merge objects of the SummarizedExperiment class.
vignette: |
  %\VignetteIndexEntry{SEtools}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, include=FALSE}
library(BiocStyle)
```
# Getting started
The `r Rpackage("SEtools")` package is a set of convenience functions for the _Bioconductor_ class `r Biocpkg("SummarizedExperiment")`. It facilitates merging, melting, and plotting `SummarizedExperiment` objects.
**NOTE that the heatmap-related and melting functions have been moved to a standalone package, `r Biocpkg("sechm")`.**
The old `sehm` function of `SEtools` should be considered deprecated, and most `SEtools` functions are retained for legacy and reproducibility reasons (or until they find a better home).
## Package installation
```{r, eval=FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("SEtools")
```
Or, to install the latest development version:
```{r, eval=FALSE}
BiocManager::install("plger/SEtools")
```
## Example data
To showcase the main functions, we will use an example object containing (a subset of) whole-hippocampus RNA-seq data from mice exposed to different stressors:
```{r}
suppressPackageStartupMessages({
  library(SummarizedExperiment)
  library(SEtools)
})
data("SE", package="SEtools")
SE
```
This is taken from [Floriou-Servou et al., Biol Psychiatry 2018](https://doi.org/10.1016/j.biopsych.2018.02.003).
## Merging and aggregating SEs
```{r}
se1 <- SE[,1:10]
se2 <- SE[,11:20]
se3 <- mergeSEs( list(se1=se1, se2=se2) )
se3
```
All assays were merged, along with rowData and colData slots.
By default, row z-scores are calculated for each object when merging. This can be prevented with:
```{r}
se3 <- mergeSEs( list(se1=se1, se2=se2), do.scale=FALSE)
```
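To make the effect of per-object scaling concrete, here is a minimal base-R sketch on made-up matrices. It assumes that a row z-score means centering each feature on its mean and dividing by its standard deviation within one object; `row_z`, `toyA` and `toyB` are hypothetical names, and the exact computation inside `mergeSEs` may differ.

```{r}
# Hypothetical helper (not part of SEtools): z-score each row of a matrix.
# scale() works on columns, so transpose, scale, and transpose back.
row_z <- function(m) t(scale(t(m)))

feats <- paste0("gene", 1:3)
toyA <- matrix(c(1, 2, 3,  10, 20, 40,  5, 7, 6), nrow = 3, byrow = TRUE,
               dimnames = list(feats, paste0("A.s", 1:3)))
toyB <- matrix(c(11, 12, 13,  1, 2, 4,  50, 70, 60), nrow = 3, byrow = TRUE,
               dimnames = list(feats, paste0("B.s", 1:3)))

# Scale each matrix on its own, then bind the columns together
toy_scaled <- cbind(row_z(toyA), row_z(toyB))
round(toy_scaled, 2)
```

In this toy example `gene1` sits at a much higher baseline in `toyB` than in `toyA`, yet after per-object scaling both halves show the same pattern. This is why scaling before merging can help when objects differ in overall signal, and why it may be worth disabling when absolute values matter.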
If more than one assay is present, one can specify a different scaling behavior for each assay:
```{r}
se3 <- mergeSEs( list(se1=se1, se2=se2), use.assays=c("counts", "logcpm"), do.scale=c(FALSE, TRUE))
```
Differences from the `cbind` method include prefixes added to column names, optional scaling, and handling of metadata (e.g. for `sechm`).
### Merging by rowData columns
It is also possible to merge by rowData columns, which are specified through the `mergeBy` argument.
This allows one-to-many and many-to-many mappings between features, for which two behaviors are possible:

* By default, all combinations will be reported, which means that the same feature of one object might appear multiple times in the output because it matches multiple features of another object.
* If a function is passed through `aggFun`, the features of each object will be aggregated by `mergeBy` using this function before merging.
```{r merging}
set.seed(123)  # make the random metafeature assignment reproducible
rowData(se1)$metafeature <- sample(LETTERS, nrow(se1), replace = TRUE)
rowData(se2)$metafeature <- sample(LETTERS, nrow(se2), replace = TRUE)
se3 <- mergeSEs( list(se1=se1, se2=se2), do.scale=FALSE, mergeBy="metafeature", aggFun=median)
sechm::sechm(se3, features=row.names(se3))
```
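The two behaviors can be illustrated outside of `SummarizedExperiment` with plain data frames. The following is only an analogy using base R's `merge` and `aggregate` on made-up tables (`toy1`, `toy2` and their columns are hypothetical); it does not reproduce how `mergeSEs` is implemented.

```{r}
# Two toy feature tables sharing a made-up 'metafeature' key
toy1 <- data.frame(metafeature = c("A", "A", "B"), x1 = c(1, 3, 10))
toy2 <- data.frame(metafeature = c("A", "B", "B"), y1 = c(100, 200, 400))

# Behavior 1: keep every matching pair of features (many-to-many join),
# so features that match several partners are repeated
all_pairs <- merge(toy1, toy2, by = "metafeature")
nrow(all_pairs)

# Behavior 2: collapse each table to one row per key (here with the median),
# then merge the collapsed tables
agg1 <- aggregate(x1 ~ metafeature, data = toy1, FUN = median)
agg2 <- aggregate(y1 ~ metafeature, data = toy2, FUN = median)
agg_merged <- merge(agg1, agg2, by = "metafeature")
agg_merged
```

Reporting all combinations keeps every original feature visible at the cost of duplicated rows, whereas aggregating first yields exactly one row per `mergeBy` value.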
### Aggregating a SE
A single SE can also be aggregated by using the `aggSE` function:
```{r aggregating}
se1b <- aggSE(se1, by = "metafeature")
se1b
```
If the aggregation function(s) are not specified, `aggSE` will try to guess decent aggregation functions from the assay names.
This is similar to `scuttle::sumCountsAcrossFeatures`, but preserves other SE slots.
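As a rough picture of what aggregating features involves, the base-R sketch below collapses a made-up count matrix and its log-transformed counterpart by a grouping vector. Summing counts and averaging log-scale values are assumptions chosen for illustration, not a statement of which functions `aggSE` actually picks; all object names are hypothetical.

```{r}
# Toy data: 4 features in 2 groups, 2 samples; all values are made up
toy_group  <- c("A", "A", "B", "B")
toy_counts <- matrix(c(1, 2,  3, 4,  10, 20,  30, 40), nrow = 4, byrow = TRUE,
                     dimnames = list(paste0("f", 1:4), c("s1", "s2")))
toy_log    <- log2(toy_counts + 1)

# Counts: add up the rows that share a group label
agg_counts <- rowsum(toy_counts, group = toy_group)
# Log-scale values: average within each group instead (assumed choice);
# both rowsum() and table() order the groups alphabetically here
agg_log <- rowsum(toy_log, group = toy_group) / as.vector(table(toy_group))
agg_counts
agg_log
```

Unlike this matrix-only sketch, `aggSE` returns a `SummarizedExperiment`, so the aggregated assays stay together with the sample annotation.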
***
## Other convenience functions
Calculate an assay of log2 fold changes relative to the controls:
```{r}
SE <- log2FC(SE, fromAssay="logcpm", controls=SE$Condition=="Homecage")
```
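The idea behind such an assay can be sketched in base R on a made-up log-expression matrix. The sketch assumes the baseline is the per-feature mean of the control samples and that the input is already on a log scale; `log2FC` itself may summarize the controls differently, and all names below are hypothetical.

```{r}
# Toy log-expression matrix: 2 features x 4 samples; values are made up
toy_logexpr <- matrix(c(5, 7, 8, 10,
                        3, 3, 2, 4), nrow = 2, byrow = TRUE,
                      dimnames = list(c("g1", "g2"), paste0("s", 1:4)))
toy_is_ctrl <- c(TRUE, TRUE, FALSE, FALSE)   # which samples are controls

# Per-feature baseline: mean of the control samples (assumed summary)
toy_baseline <- rowMeans(toy_logexpr[, toy_is_ctrl, drop = FALSE])
# Subtracting a length-nrow vector from a matrix recycles down the rows,
# so every column is compared against the same per-feature baseline
toy_lfc <- toy_logexpr - toy_baseline
toy_lfc
```

On a log scale, subtracting the control baseline is equivalent to dividing by it on the original scale, so the control samples average to zero for every feature.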
# Session info {.unnumbered}
```{r sessionInfo, echo=FALSE}
sessionInfo()
```