Bioconductor has advanced facilities for analysis of microarray platforms including Affymetrix, Illumina, Nimblegen, Agilent, and other one- and two-color technologies. Bioconductor includes extensive support for analysis of expression arrays, and well-developed support for exon, copy number, SNP, methylation, and other assays. Major workflows in Bioconductor include pre-processing, quality assessment, differential expression, clustering and classification, gene set enrichment analysis, and genetical genomics. Bioconductor offers extensive interfaces to community resources, including GEO, ArrayExpress, Biomart, genome browsers, GO, KEGG, and diverse annotation sources.
R version: R version 4.0.3 (2020-10-10)
Bioconductor version: 3.12
Package version: 1.16.0
The following code illustrates a typical R / Bioconductor session. It uses RMA from the affy package to pre-process Affymetrix arrays, and the limma package for assessing differential expression.
## Load packages
library(affy)   # Affymetrix pre-processing
library(limma)  # two-color pre-processing; differential expression

## import "phenotype" data, describing the experimental design
phenoData <-
    read.AnnotatedDataFrame(system.file("extdata", "pdata.txt",
                                        package="arrays"))

## RMA normalization
celfiles <- system.file("extdata", package="arrays")
eset <- justRMA(phenoData=phenoData,
                celfile.path=celfiles)
## Warning: replacing previous import 'AnnotationDbi::tail' by 'utils::tail' when
## loading 'hgfocuscdf'
## Warning: replacing previous import 'AnnotationDbi::head' by 'utils::head' when
## loading 'hgfocuscdf'
##
## differential expression
combn <- factor(paste(pData(phenoData)[,1],
                      pData(phenoData)[,2], sep = "_"))
design <- model.matrix(~combn)  # describe model to be fit
fit <- lmFit(eset, design)      # fit each probeset to model
efit <- eBayes(fit)             # empirical Bayes adjustment
topTable(efit, coef=2)          # table of differentially expressed probesets
##                 logFC   AveExpr         t      P.Value    adj.P.Val        B
## 204582_s_at  3.468416 10.150533  39.03471 1.969915e-14 1.732146e-10 19.86082
## 211548_s_at -2.325670  7.178610 -22.73165 1.541158e-11 6.775701e-08 15.88709
## 216598_s_at  1.936306  7.692822  21.73818 2.658881e-11 7.793180e-08 15.48223
## 211110_s_at  3.157766  7.909391  21.19204 3.625216e-11 7.969130e-08 15.24728
## 206001_at   -1.590732 12.402722 -18.64398 1.715422e-10 3.016740e-07 14.01955
## 202409_at    3.274118  6.704989  17.72512 3.156709e-10 4.626157e-07 13.51659
## 221019_s_at  2.251730  7.104012  16.34552 8.353283e-10 1.049292e-06 12.69145
## 204688_at    1.813001  7.125307  14.75281 2.834343e-09 3.115297e-06 11.61959
## 205489_at    1.240713  7.552260  13.62265 7.264649e-09 7.097562e-06 10.76948
## 209288_s_at -1.226421  7.603917 -13.32681 9.401074e-09 7.784531e-06 10.53327
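To see what coef=2 refers to, it can help to look at the design matrix itself. The sketch below uses base R only and a small made-up phenotype table (the column names treatment and time, and their values, are hypothetical and not taken from pdata.txt). It builds the combined factor the same way as the session above and prints the resulting column names.

```r
## Hypothetical phenotype table: 8 samples, two factors (illustrative values only)
pheno <- data.frame(
    treatment = rep(c("ctrl", "drug"), each = 4),
    time      = rep(rep(c("t0", "t1"), each = 2), times = 2)
)

## Combine the two columns into one factor, as in the session above
group <- factor(paste(pheno[, 1], pheno[, 2], sep = "_"))

## Default treatment coding: the intercept is the first level,
## and each remaining column is a difference from that level
design <- model.matrix(~group)
colnames(design)
```

With this coding, the second column compares the second factor level with the first (baseline) level, which is the kind of comparison that topTable(efit, coef=2) reports.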
A top table resulting from a more complete analysis, described in Chapter 7 of Bioconductor Case Studies, is shown below. For each Affymetrix probeset, the table lists the log-fold change between two experimental groups (logFC), the average expression across all samples (AveExpr), the moderated t-statistic for differential expression (t), the unadjusted and adjusted p-values (P.Value and adj.P.Val; the adjustment here controls the false discovery rate), and the log-odds that the probeset is differentially expressed (B). These results can be used in further analysis and annotation.
ID          logFC  AveExpr     t   P.Value  adj.P.Val      B
636_g_at     1.10     9.20  9.03  4.88e-14   1.23e-10  21.29
39730_at     1.15     9.00  8.59  3.88e-13   4.89e-10  19.34
1635_at      1.20     7.90  7.34  1.23e-10   1.03e-07  13.91
1674_at      1.43     5.00  7.05  4.55e-10   2.87e-07  12.67
40504_at     1.18     4.24  6.66  2.57e-09   1.30e-06  11.03
40202_at     1.78     8.62  6.39  8.62e-09   3.63e-06   9.89
37015_at     1.03     4.33  6.24  1.66e-08   6.00e-06   9.27
32434_at     1.68     4.47  5.97  5.38e-08   1.70e-05   8.16
37027_at     1.35     8.44  5.81  1.10e-07   3.08e-05   7.49
37403_at     1.12     5.09  5.48  4.27e-07   1.08e-04   6.21
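The relationship between the P.Value and adj.P.Val columns can be illustrated with base R's p.adjust(). The sketch below applies the Benjamini-Hochberg ("BH") adjustment, which controls the false discovery rate, to five made-up p-values (hypothetical, not taken from either table above). In a real analysis the adjustment is computed across all probesets, not only the rows shown in a top table, so the values in the tables cannot be reproduced from their ten rows alone.

```r
## Made-up unadjusted p-values (illustrative only)
p <- c(0.001, 0.008, 0.039, 0.041, 0.6)

## Benjamini-Hochberg adjustment, which controls the false discovery rate
adj <- p.adjust(p, method = "BH")

## Side-by-side view; adjusted values are never smaller than the raw ones
data.frame(P.Value = p, adj.P.Val = adj)
```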
Follow installation instructions to start using these packages. You can install affy and limma as follows:
if (!"BiocManager" %in% rownames(installed.packages()))
    install.packages("BiocManager")
BiocManager::install(c("affy", "limma"), dependencies=TRUE)
To install additional packages, such as the annotation package for the Affymetrix Human Genome U95Av2 array, use
BiocManager::install("hgu95av2.db", dependencies=TRUE)
Package installation is required only once per R installation. View the full list of available packages.
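Since installation is needed only once, a script can first check which packages are already present. The following is a minimal sketch using base R's requireNamespace(); the helper name missing_pkgs is made up for illustration, and nothing is installed by this code.

```r
## Packages this workflow relies on
pkgs <- c("affy", "limma")

## Hypothetical helper: return the packages whose namespace cannot be loaded
missing_pkgs <- function(pkgs) {
    ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
    pkgs[!ok]
}

to_install <- missing_pkgs(pkgs)
if (length(to_install) > 0) {
    message("Not yet installed: ", paste(to_install, collapse = ", "))
} else {
    message("All packages are already installed.")
}
```

Any names reported as missing could then be passed to BiocManager::install().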
To use the affy and limma packages, evaluate the commands
library("affy")
library("limma")
These commands are required once in each R session.
Packages have extensive help pages, and include vignettes highlighting common use cases. The help pages and vignettes are available from within R. After loading a package, use syntax like
help(package="limma")
?topTable
to obtain an overview of help on the limma package and the topTable function, and
browseVignettes(package="limma")
to view vignettes (providing a more comprehensive introduction to package functionality) in the limma package. Use
help.start()
to open a web page containing comprehensive help resources.
The following provide a brief overview of packages useful for pre-processing. More comprehensive workflows can be found in documentation (available from package descriptions) and in Bioconductor Books and monographs.
Annotation packages named in this overview include lumiHumanAll.db and lumiHumanIDMapping, and illuminaHumanv1BeadID.db and illuminaHumanV1.db.
sessionInfo()
## R version 4.0.3 (2020-10-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.5 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.12-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.12-bioc/R/lib/libRlapack.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] parallel stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] hgfocuscdf_2.18.0 affy_1.68.0 Biobase_2.50.0
## [4] BiocGenerics_0.36.0 limma_3.46.0 arrays_1.16.0
## [7] BiocStyle_2.18.0
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.5 AnnotationDbi_1.52.0 knitr_1.30
## [4] magrittr_1.5 IRanges_2.24.0 zlibbioc_1.36.0
## [7] bit_4.0.4 rlang_0.4.8 blob_1.2.1
## [10] stringr_1.4.0 tools_4.0.3 xfun_0.18
## [13] DBI_1.1.0 htmltools_0.5.0 bit64_4.0.5
## [16] yaml_2.2.1 digest_0.6.27 preprocessCore_1.52.0
## [19] bookdown_0.21 affyio_1.60.0 BiocManager_1.30.10
## [22] S4Vectors_0.28.0 vctrs_0.3.4 memoise_1.1.0
## [25] RSQLite_2.2.1 evaluate_0.14 rmarkdown_2.5
## [28] stringi_1.5.3 compiler_4.0.3 stats4_4.0.3