---
title: specL automatic report
author:
- name: Christian Panse
  email: cp@fgcz.ethz.ch
- name: Witold E. Wolski
  email: wew@fgcz.ethz.ch
  affiliation: Functional Genomics Center Zurich
date: "`r doc_date()`"
package: "`r pkg_ver('specL')`"
references:
- id: bfabric
  title: "B-Fabric: the Swiss Army Knife for life sciences"
  author:
  - given: Can
    family: Türker
  - given: Fuat
    family: Akal
  - given: Dieter
    family: Studer-Joho
  - given: Christian
    family: Panse
  - given: Simon
    family: Barkow-Oesterreicher
  - given: Hubert
    family: Rehrauer
  - given: Ralph
    family: Schlapbach
  container-title: EDBT 2010, 13th International Conference on Extending Database Technology, Lausanne, Switzerland, March 22-26, 2010, Proceedings
  volume: 11
  URL: 'http://doi.acm.org/10.1145/1739041.1739135'
  DOI: 10.1145/1739041.1739135
  page: 717-720
  type: article-journal
  issued:
    year: 2010
    month: 3
- id: pmid25712692
  title: "specL—an R/Bioconductor package to prepare peptide spectrum matches for use in targeted proteomics"
  author:
  - given: Christian
    family: Panse
  - given: Christian
    family: Trachsel
  - given: Jonas
    family: Grossmann
  - given: Ralph
    family: Schlapbach
  container-title: Bioinformatics
  volume: 31
  URL: 'http://dx.doi.org/10.1093/bioinformatics/btv105'
  DOI: 10.1093/bioinformatics/btv105
  number: 13
  page: 2228-2231
  type: article-journal
  issued:
    year: 2015
    month: 7
- id: pmid18428681
  title: Using BiblioSpec for Creating and Searching Tandem MS Peptide Libraries
  author:
  - family: Frewen
    given: Barbara
  - given: Michael J.
    family: MacCoss
  container-title: Curr Protoc Bioinformatics
  DOI: 10.1002/0471250953.bi1307s20
  type: article-journal
  issued:
    year: 2007
    month: 12
abstract: >
  This file contains all the commands for performing a default SWATH ion
  library generation at the FGCZ. This document is usually triggered by the
  B-Fabric system [@bfabric] and is meant for training and reproducibility.
vignette: >
  %\VignetteIndexEntry{Automatic Workflow}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
output:
  BiocStyle::html_document
---

# Requirements

First, the peptide identification result is generated by a standard shotgun proteomics experiment and has to be processed using the _bibliospec_ software [@pmid18428681]. The ion library is then generated using the `r Biocpkg('specL')` package. The workflow is described in [@pmid25712692].

The following R package has to be installed on the compute box.

```{r library}
library(specL)
```

This file can be rendered using the following code snippet.

```{r render, eval=FALSE}
library(rmarkdown)
library(BiocStyle)
# copy the report template shipped with specL to a temporary file and render it
report_file <- tempfile(fileext = '.Rmd')
file.copy(system.file("doc", "report.Rmd", package = "specL"), report_file)
rmarkdown::render(report_file,
                  output_format = 'html_document',
                  output_file = '/tmp/report_specL.html')
```

# Input

## Parameter

If no `INPUT` is defined, the report uses the `r Biocpkg("specL")` package's data and the following default parameters.
```{r defineInput}
if(!exists("INPUT")){
  INPUT <- list(
    FASTA_FILE = system.file("extdata", "SP201602-specL.fasta.gz", package = "specL"),
    BLIB_FILTERED_FILE = system.file("extdata", "peptideStd.sqlite", package = "specL"),
    BLIB_REDUNDANT_FILE = system.file("extdata", "peptideStd_redundant.sqlite", package = "specL"),
    MIN_IONS = 5,
    MAX_IONS = 6,
    MZ_ERROR = 0.05,
    MASCOTSCORECUTOFF = 17,
    FRAGMENTIONMZRANGE = c(300, 1250),
    FRAGMENTIONRANGE = c(5, 200),
    NORMRTPEPTIDES = specL::iRTpeptides,
    OUTPUT_LIBRARY_FILE = tempfile(fileext ='.csv'),
    RDATA_LIBRARY_FILE = tempfile(fileext ='.RData'),
    ANNOTATE = TRUE
  )
}
```

The library generation workflow was performed using the following parameters:

```{r cat, echo=FALSE, eval=FALSE}
cat(
  " MASCOTSCORECUTOFF = ", INPUT$MASCOTSCORECUTOFF, "\n",
  " BLIB_FILTERED_FILE = ", INPUT$BLIB_FILTERED_FILE, "\n",
  " BLIB_REDUNDANT_FILE = ", INPUT$BLIB_REDUNDANT_FILE, "\n",
  " MZ_ERROR = ", INPUT$MZ_ERROR, "\n",
  " FRAGMENTIONMZRANGE = ", INPUT$FRAGMENTIONMZRANGE, "\n",
  " FRAGMENTIONRANGE = ", INPUT$FRAGMENTIONRANGE, "\n",
  " FASTA_FILE = ", INPUT$FASTA_FILE, "\n",
  " MAX_IONS = ", INPUT$MAX_IONS, "\n",
  " MIN_IONS = ", INPUT$MIN_IONS, "\n"
)
```

```{r kableParameter, echo=FALSE, results='asis'}
library(knitr)
# kable(t(as.data.frame(INPUT)))
# keep only character and numeric parameters; vectors are collapsed into one string
ii <- lapply(INPUT, function(x){
  if(typeof(x) %in% c("character", "double")){
    paste(x, collapse = ', ')
  }else{
    NULL
  }
})
parameter <- as.data.frame(unlist(ii))
names(parameter) <- 'parameter.values'
kable(parameter, caption = 'INPUT parameters used')
```

## Define the fragment ions of interest

The following R helper function is used for composing the in-silico fragment ions using `r CRANpkg("protViz")`.
```{r defineFragmenIons}
fragmentIonFunction_specL <- function (b, y) {
  Hydrogen <- 1.007825
  Oxygen <- 15.994915
  Nitrogen <- 14.003074

  # singly charged b- and y-ions are passed through unchanged
  b1_ <- (b )
  y1_ <- (y )

  # doubly charged ions: add one hydrogen mass and divide by the charge of 2
  b2_ <- (b + Hydrogen) / 2
  y2_ <- (y + Hydrogen) / 2

  return( cbind(b1_, y1_, b2_, y2_) )
}
```

## Read the sqlite files

```{r readSqliteFILTERED, warning=FALSE}
BLIB_FILTERED <- read.bibliospec(INPUT$BLIB_FILTERED_FILE)
summary(BLIB_FILTERED)
```

```{r readSqliteREDUNDANT, warning=FALSE}
BLIB_REDUNDANT <- read.bibliospec(INPUT$BLIB_REDUNDANT_FILE)
summary(BLIB_REDUNDANT)
```

## Protein (re)-annotation

After the PSMs have been processed with bibliospec, the protein information is gone, so the proteins are re-annotated from the FASTA file. The `read.fasta` function is provided by the CRAN package `r CRANpkg("seqinr")`.

```{r read.fasta}
if(INPUT$ANNOTATE){
  FASTA <- read.fasta(INPUT$FASTA_FILE, seqtype = "AA", as.string = TRUE)
  BLIB_FILTERED <- annotate.protein_id(BLIB_FILTERED, fasta = FASTA)
}
```

## Peptides used for RT normalization

The following peptides are used for the RT normalization. The last column indicates by FALSE/TRUE whether a peptide is included in the data. The rows are ordered by RT value.
```{r checkIRTs, echo=FALSE, results='asis'}
library(knitr)
# flag the iRT peptides that occur in the redundant library
incl <- INPUT$NORMRTPEPTIDES$peptide %in% sapply(BLIB_REDUNDANT, function(x){x$peptideSequence})
INPUT$NORMRTPEPTIDES$included <- incl
if (sum(incl) > 0){
  res <- INPUT$NORMRTPEPTIDES[order(INPUT$NORMRTPEPTIDES$rt),]
  # row.names(res) <- 1:nrow(res)
  kable(res, caption='peptides used for RT normalization.')
}
```

# Generate the ion library

```{r specL::genSwathIonLib, message=FALSE}
specLibrary <- specL::genSwathIonLib(
  data = BLIB_FILTERED,
  data.fit = BLIB_REDUNDANT,
  max.mZ.Da.error = INPUT$MZ_ERROR,
  topN = INPUT$MAX_IONS,
  fragmentIonMzRange = INPUT$FRAGMENTIONMZRANGE,
  fragmentIonRange = INPUT$FRAGMENTIONRANGE,
  fragmentIonFUN = fragmentIonFunction_specL,
  mascotIonScoreCutOFF = INPUT$MASCOTSCORECUTOFF,
  iRT = INPUT$NORMRTPEPTIDES
)
```

## Library Generation Summary

The total number of PSMs with Mascot e-value < 0.05 in your search is __`r length(BLIB_REDUNDANT)`__. The number of unique precursors is __`r length(BLIB_FILTERED)`__. The size of the generated ion library is __`r length(specLibrary@ionlibrary)`__. That means that __`r round(length(specLibrary@ionlibrary)/length(BLIB_FILTERED) * 100, 2)`__ % of the unique precursors fulfilled the filtering criteria.

```{r summarySpecLibrary}
summary(specLibrary)
```

In the following two code snippets the first element of the ion library is displayed:

```{r showSpecLibrary}
# slotNames(specLibrary@ionlibrary[[1]])
specLibrary@ionlibrary[[1]]
```

```{r plotSpecLibraryIons, fig.retina=3}
plot(specLibrary@ionlibrary[[1]])
```

```{r plotSpecLibrary, fig.retina=3}
plot(specLibrary)
```

plots an overview of the whole ion library.

Please note that the iRT peptides used for the normalization of RT do not have to be included in the resulting `specLibrary`.
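
The role of the iRT peptides in the RT normalization can be illustrated with a minimal sketch. It assumes a simple linear relation between observed and reference retention times. The peptide names, all RT values, the `observed` table, and the helper `normalize_rt` are hypothetical and only illustrate the idea; they are not the model fitted inside `genSwathIonLib`, which may differ.

```r
# Toy reference table with the same columns as INPUT$NORMRTPEPTIDES
# (peptide, rt); names and values are made up for illustration.
irt <- data.frame(
  peptide = c("PEPA", "PEPB", "PEPC", "PEPD"),
  rt      = c(0, 25, 50, 100)
)

# Hypothetical observed retention times (minutes) of the same peptides
# in a single run.
observed <- data.frame(
  peptide     = c("PEPA", "PEPB", "PEPC", "PEPD"),
  rt_observed = c(10, 20, 30, 50)
)

# Calibrate: regress the reference RT on the observed RT.
calib <- merge(irt, observed, by = "peptide")
fit   <- lm(rt ~ rt_observed, data = calib)

# Map any observed RT onto the normalized scale using the fitted line.
normalize_rt <- function(rt_observed) {
  unname(predict(fit, newdata = data.frame(rt_observed = rt_observed)))
}

# For these toy values the fit is rt = 2.5 * rt_observed - 25,
# so this yields about 12.5 and 75.
normalize_rt(c(15, 40))
```

Only the calibration peptides have to be observed in the data used for the fit; every other precursor is mapped through the fitted line, which is why the iRT peptides themselves need not appear in the final library.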
# Output

```{r write.spectronaut, eval=TRUE}
write.spectronaut(specLibrary, file = INPUT$OUTPUT_LIBRARY_FILE)
```

```{r save, eval=TRUE}
save(specLibrary, file = INPUT$RDATA_LIBRARY_FILE)
```

saves the result object to a file.

# Remarks

For questions and suggestions for improvement, please contact the authors of the [specL](https://bioconductor.org/packages/specL/) package.

This R Markdown report file was written by WEW and is maintained by CP.

# Session info

Here is the output of `sessionInfo()` on the system on which this document was compiled:

```{r sessionInfo, echo=FALSE}
sessionInfo()
```

# References