Robust Probabilistic Averaging (RPA) is a fully scalable algorithm for probe-level preprocessing and analysis of short oligonucleotide microarray collections of any size, from standard data sets of moderate size to large microarray atlases involving tens of thousands of samples or more.
Special wrappers are available for Affymetrix gene expression arrays and phylogenetic microarrays (HITChip).
RPA also provides explicit, data-driven estimates of probe-specific affinity and noise based on a rigorous probabilistic model. In benchmarking tests it significantly outperforms the standard RMA model while remaining fully scalable (NAR 2013).
The method is available in R/Bioconductor, and documented in the publications listed below. For all installation and usage instructions, kindly see the RPA wiki.
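As a rough illustration of what probe-specific noise estimates make possible, the sketch below alternates between a precision-weighted average over probes and a regularized variance estimate for each probe. It is a simplified toy written in Python, not the RPA implementation and not the R package's API: the function name `estimate_signal` and the prior parameters `alpha` and `beta` are hypothetical, the inverse-gamma prior values are arbitrary, and probe affinities and the online updating used for large collections are left out.

```python
import random


def estimate_signal(probes, n_iter=10, alpha=1.0, beta=0.1):
    """Estimate a probeset signal from noisy probes (toy sketch).

    probes: list of probes, each a list of log-intensities, one per array.
    Returns (signal, variances): one signal value per array and one noise
    variance per probe.
    """
    n_arrays = len(probes[0])
    variances = [1.0] * len(probes)  # start with equal trust in every probe
    signal = [0.0] * n_arrays
    for _ in range(n_iter):
        # Signal update: precision-weighted mean over probes for each array.
        weights = [1.0 / v for v in variances]
        total = sum(weights)
        signal = [
            sum(w * probe[t] for w, probe in zip(weights, probes)) / total
            for t in range(n_arrays)
        ]
        # Variance update: posterior mode under an inverse-gamma(alpha, beta)
        # prior, which keeps every variance strictly positive.
        variances = [
            (beta + 0.5 * sum((probe[t] - signal[t]) ** 2 for t in range(n_arrays)))
            / (alpha + 0.5 * n_arrays + 1.0)
            for probe in probes
        ]
    return signal, variances


# Synthetic data (assumed for illustration): 50 arrays, two reliable probes
# and one noisy probe measuring the same underlying signal.
random.seed(0)
truth = [random.gauss(8.0, 1.0) for _ in range(50)]
noise_sd = [0.1, 0.1, 2.0]
probes = [[x + random.gauss(0.0, sd) for x in truth] for sd in noise_sd]

signal, variances = estimate_signal(probes)
print("estimated probe variances:", [round(v, 3) for v in variances])
```

In this toy setting the noisy probe receives a large variance estimate and therefore a small weight, so the summary follows the two reliable probes more closely than a plain mean would.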
The RPA methodology has been documented in these two publications. Kindly cite if appropriate:
A fully scalable online pre-processing algorithm for short oligonucleotide microarray atlases. Leo Lahti, Aurora Torrente, Laura L. Elo, Alvis Brazma, Johan Rung. Nucleic Acids Research 41(10):e110, 2013.
Probabilistic analysis of probe reliability in differential gene expression studies with short oligonucleotide arrays. Leo Lahti, Laura L. Elo, Tero Aittokallio, Samuel Kaski. IEEE/ACM Transactions on Computational Biology and Bioinformatics 8(1):217-25, 2011.