--- title: "Get started with limpca." author: "Benaiche Nadia, Sébastien Franceschini, Martin Manon, Thiel Michel, Govaerts Bernadette" date: '`r format(Sys.time(), "%B %d, %Y")`' package: limpca output: BiocStyle::html_document: toc: true toc_depth: 4 toc_float: true vignette: > %\VignetteIndexEntry{Get started with limpca} %\VignetteEncoding{UTF-8} %\VignetteEngine{knitr::rmarkdown} editor_options: chunk_output_type: console bibliography: references.bib --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", out.width = "70%", fig.align = "center", eval = TRUE ) rm(list = ls()) library("limpca") if (!requireNamespace("SummarizedExperiment", quietly = TRUE)) { stop("Install 'SummarizedExperiment' to knit this vignette") } library(SummarizedExperiment) ``` [![R-CMD-check](https://github.com/ManonMartin/limpca/actions/workflows/check-standard.yaml/badge.svg)](https://github.com/ManonMartin/limpca/actions/workflows/check-standard.yaml) # Introduction ## About the package This package was created to analyse models with high-dimensional data and a multi-factor design of experiment. `limpca` stands for **li**near **m**odeling of high-dimensional designed data based on the ASCA (ANOVA-Simultaneous Component Analysis) and A**PCA** (ANOVA-Principal Component Analysis) family of methods. These methods combine ANOVA with a General Linear Model (GLM) decomposition and PCA. They provide powerful visualization tools for multivariate structures in the space of each effect of the statistical model linked to the experimental design. Details on the methods used and the package implementation can be found in the articles of @Thiel2017, @Guisset2019 and @Thiel2023. Therefore, ASCA/APCA are highly informative modeling and visualisation tools to analyse -omics data tables in a multivariate framework and act as a complement to differential expression analyses methods such as `limma` (@limma2015). ## Vignettes description - `Get started with limpca` (this vignette): This vignette is a short application of `limpca` on the `UCH` dataset with data visualisation, exploration (PCA), GLM decomposition and ASCA modelling. The ASCA model used in this example is a three-way ANOVA with fixed effects. - [Analysis of the UCH dataset with limpca](UCH.html): This vignette is an extensive application of `limpca` on the `UCH` dataset with data visualisation, exploration (PCA), GLM decomposition and ASCA/APCA/ASCA-E modelling. The applied model is a three-way ANOVA with fixed effects. This document presents all the usual steps of the analysis, from importing the data to visualising the results. - [Analysis of the Trout dataset with limpca](Trout.html): This vignette is an extensive application of `limpca` on the `Trout` dataset with data visualisation, exploration (PCA), GLM decomposition and ASCA/APCA/ASCA-E modelling. The applied model involves three main effects and their two-way interaction terms. It also compares the results of ASCA to a univariate ANOVA modeling. # Installation and loading of the `limpca` package `limpca` can be installed from Bioconductor: ```{r Install, eval=FALSE} if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager") BiocManager::install("limpca") ``` And then loaded into your R session: ```{r Load, results=FALSE, message=FALSE} library("limpca") ``` For any enquiry, you can send an email to the package authors: [bernadette.govaerts@uclouvain.be](mailto:bernadette.govaerts@uclouvain.be) ; [michel.thiel@uclouvain.be](mailto:michel.thiel@uclouvain.be) or [manon.martin@uclouvain.be](mailto:manon.martin@uclouvain.be) # Short application on the `UCH` dataset ## Data object In order to use the limpca core functions, the data need to be formatted as a list (informally called an lmpDataList) with the following elements: `outcomes` (multivariate matrix), `design` (data.frame) and `formula` (character string). The `UCH` data set is already formatted appropriately and can be loaded from `limpca` with the `data` function. ```{r} data("UCH") str(UCH) ``` Alternatively, the lmpDataList can be created with the function `data2LmpDataList` : - from scratch: ```{r, message = TRUE} UCH2 <- data2LmpDataList( outcomes = UCH$outcomes, design = UCH$design, formula = UCH$formula ) ``` - or from a `SummarizedExperiment`: ```{r, message = TRUE} se <- SummarizedExperiment( assays = list( counts = t(UCH$outcomes)), colData = UCH$design, metadata = list(formula = UCH$formula) ) UCH3 <- data2LmpDataList(se, assay_name = "counts") ``` `SummarizedExperiment` is a generic data container that stores rectangular matrices of experimental results. See @SummarizedExperiment23 for more information. ## Data visualisation The design can be visualised with `plotDesign()`. ```{r dataVisu} # design plotDesign( design = UCH$design, x = "Hippurate", y = "Citrate", rows = "Time", title = "Design of the UCH dataset" ) # row 3 of outcomes plotLine( Y = UCH$outcomes, title = "H-NMR spectrum", rows = c(3), xlab = "ppm", ylab = "Intensity" ) ``` ## PCA ```{r PCA} ResPCA <- pcaBySvd(UCH$outcomes) pcaScreePlot(ResPCA, nPC = 6) pcaScorePlot( resPcaBySvd = ResPCA, axes = c(1, 2), title = "PCA scores plot: PC1 and PC2", design = UCH$design, color = "Hippurate", shape = "Citrate", points_labs_rn = FALSE ) ``` ## Model estimation and effect matrix decomposition ```{r modelEst} # Model matrix generation resMM <- lmpModelMatrix(UCH) # Model estimation and effect matrices decomposition resEM <- lmpEffectMatrices(resMM) ``` ## Effect matrix test of significance and importance measure ```{r effectImpSign} # Effects importance resEM$varPercentagesPlot # Bootstrap tests resBT <- lmpBootstrapTests(resLmpEffectMatrices = resEM, nboot = 100) resBT$resultsTable ``` ## ASCA decomposition ```{r ASCA} # ASCA decomposition resASCA <- lmpPcaEffects(resLmpEffectMatrices = resEM, method = "ASCA") # Scores Plot for the hippurate lmpScorePlot(resASCA, effectNames = "Hippurate", color = "Hippurate", shape = "Hippurate" ) # Loadings Plot for the hippurate lmpLoading1dPlot(resASCA, effectNames = c("Hippurate"), axes = 1, xlab = "ppm" ) # Scores ScatterPlot matrix lmpScoreScatterPlotM(resASCA, PCdim = c(1, 1, 1, 1, 1, 1, 1, 2), modelAbbrev = TRUE, varname.colorup = "Citrate", varname.colordown = "Time", varname.pchup = "Hippurate", varname.pchdown = "Time", title = "ASCA scores scatterplot matrix" ) ``` # sessionInfo ```{r} sessionInfo() ```