--- title: 'SummarizedExperiment object' author: - name: Lindsay Rutter date: '`r Sys.Date()`' package: bigPint bibliography: bigPint.bib output: BiocStyle::html_document: toc_float: true tidy: TRUE vignette: > \usepackage[utf8]{inputenc} %\VignetteIndexEntry{"Data metrics object"} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} %\VignettePackage{bigPint} --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo=TRUE) ``` ## SummarizedExperiment object Researchers who prefer using the `SummarizedExperiment` class can feed their data and associated metrics into the `bigPint` methods using one `SummarizedExperiment` object (instead of both a `data` and `dataMetrics` object). We demonstrate here how to create a `SummarizedExperiment` object. In the remaining articles of the vignette, the example code uses the `data` and `dataMetrics` objects as example input. However, at the bottom of each vignette, we also include the corresponding example code that uses instead the `SummarizedExperiment` object to create the exact same plots. ____________________________________________________________________________________ ## Example: two treatments As was shown in the article [Data object](https://lindsayrutter.github.io/bigPint/articles/data.html), the `data` object called `soybean_ir_sub` contained 5,604 genes and two treatment groups, N and P [@soybeanIR]. We can create a `SummarizedExperiment` object that combines aspects of the `data` object and `dataMetrics` object. We start by preparing the `data` and `dataMetrics` objects. ```{r, eval=TRUE, include=TRUE, message=FALSE} library(bigPint) library(edgeR) library(data.table) library(dplyr) library(DelayedArray) library(SummarizedExperiment) data(soybean_ir_sub) data = soybean_ir_sub rownames(data) = data[,1] y = DGEList(counts=data[,-1]) group = c(1,1,1,2,2,2) y = DGEList(counts=y, group=group) Group = factor(c(rep("N",3), rep("P",3))) design <- model.matrix(~0+Group, data=y$samples) colnames(design) <- levels(Group) y <- estimateDisp(y, design) fit <- glmFit(y, design) dataMetrics <- data.frame(ID = rownames(data)) for (i in 1:(ncol(fit)-1)){ for (j in (i+1):ncol(fit)){ contrast=rep(0,ncol(fit)) contrast[i]=1 contrast[j]=-1 lrt <- glmLRT(fit, contrast=contrast) lrt <- topTags(lrt, n = nrow(y[[1]]))[[1]] setDT(lrt, keep.rownames = TRUE)[] lrt <- as.data.frame(lrt) colnames(lrt) <- paste0(colnames(fit)[i], "_", colnames(fit)[j], ".", colnames(lrt)) colnames(lrt)[1] = "ID" reorderID <- full_join(data, lrt, by = "ID") dataMetrics <- cbind(dataMetrics, reorderID[,-c(1:ncol(data))]) } } ``` Now, we can tranform our `data` object into a `DelayedMatrix` class object and then input both our `data` and `dataMetrics` objects combined into a `SummarizedExperiment` class object. We can verify that the `assay()` and `rowData()` methods work for accessing our `SummarizedExperiment` object. ```{r, eval=TRUE, include=TRUE, message=FALSE} dataMetrics$ID <- as.character(dataMetrics$ID) data = data[,-1] data <- DelayedArray(data) se_soybean_ir_sub <- SummarizedExperiment(assays = data, rowData = dataMetrics) assay(se_soybean_ir_sub) rowData(se_soybean_ir_sub) ``` ____________________________________________________________________________________ ## Example: three treatments Similarly, as was shown in the data page, the `data` object called `soybean_cn_sub` contained 7,332 genes and three treatment groups, S1, S2, and S3 [@brown2015developmental]. We can create a `SummarizedExperiment` object that combines aspects of the `data` object and `dataMetrics` object. We start by preparing the `data` and `dataMetrics` objects. ```{r, eval=TRUE, include=TRUE, message=FALSE} data(soybean_cn_sub) data = soybean_cn_sub rownames(data) = data[,1] y = DGEList(counts=data[,-1]) group = c(1,1,1,2,2,2,3,3,3) y = DGEList(counts=y, group=group) Group = factor(c(rep("S1",3), rep("S2",3), rep("S3",3))) design <- model.matrix(~0+Group, data=y$samples) colnames(design) <- levels(Group) y <- estimateDisp(y, design) fit <- glmFit(y, design) dataMetrics <- data.frame(ID = rownames(data)) for (i in 1:(ncol(fit)-1)){ for (j in (i+1):ncol(fit)){ contrast=rep(0,ncol(fit)) contrast[i]=1 contrast[j]=-1 lrt <- glmLRT(fit, contrast=contrast) lrt <- topTags(lrt, n = nrow(y[[1]]))[[1]] setDT(lrt, keep.rownames = TRUE)[] lrt <- as.data.frame(lrt) colnames(lrt) <- paste0(colnames(fit)[i], "_", colnames(fit)[j], ".", colnames(lrt)) colnames(lrt)[1] = "ID" reorderID <- full_join(data, lrt, by = "ID") dataMetrics <- cbind(dataMetrics, reorderID[,-c(1:ncol(data))]) } } ``` Now, we can tranform our `data` object into a `DelayedMatrix` class object and then input both our `data` and `dataMetrics` objects combined into a `SummarizedExperiment` class object. We can verify that the `assay()` and `rowData()` methods work for accessing our `SummarizedExperiment` object. ```{r, eval=TRUE, include=TRUE, message=FALSE} dataMetrics$ID <- as.character(dataMetrics$ID) data = data[,-1] data <- DelayedArray(data) se_soybean_cn_sub <- SummarizedExperiment(assays = data, rowData = dataMetrics) assay(se_soybean_cn_sub) rowData(se_soybean_cn_sub) ``` ____________________________________________________________________________________ ## References