---
title: specL automatic report
author:
  - name: Christian Panse
    email: cp@fgcz.ethz.ch
  - name: Witold E. Wolski
    email: wew@fgcz.ethz.ch
    affiliation: Functional Genomics Center Zurich
date: "`r doc_date()`"
package: "`r pkg_ver('specL')`"
references:
- id: bfabric
  title: "B-Fabric: the Swiss Army Knife for life sciences"
  author:
  - given: Can 
    family: Türker
  - given: Fuat 
    family: Akal
  - given: Dieter 
    family: Studer-Joho
  - given: Christian 
    family: Panse                
  - given: Simon 
    family: Barkow-Oesterreicher
  - given: Hubert 
    family: Rehrauer
  - given: Ralph 
    family: Schlapbach
  container-title: EDBT 2010, 13th International Conference on Extending Database Technology, Lausanne, Switzerland, March 22-26, 2010, Proceedings
  volume: 11
  URL: 'http://doi.acm.org/10.1145/1739041.1739135'
  DOI: 10.1145/1739041.1739135
  page: 717-720
  type: article-journal
  issued:
    year: 2010
    month: 3
- id: pmid25712692
  title: "specL—an R/Bioconductor package to prepare peptide spectrum matches for use in targeted proteomics"
  author:
  - given: Christian 
    family: Panse   
  - given: Christian
    family: Trachsel
  - given: Jonas
    family: Grossmann
  - given: Ralph 
    family: Schlapbach
  container-title: Bioinformatics
  volume: 31
  URL: 'http://dx.doi.org/10.1093/bioinformatics/btv105'
  DOI: 10.1093/bioinformatics/btv105
  number: 13
  page: 2228-2231
  type: article-journal
  issued:
    year: 2015
    month: 7
- id: pmid18428681
  title: Using BiblioSpec for Creating and Searching Tandem MS Peptide Libraries
  author:
  - family: Frewen
    given: Barbara
  - given: Michael J. 
    family: MacCoss
  container-title: Curr Protoc Bioinformatics
  DOI: 10.1002/0471250953.bi1307s20
  type: article-journal
  issued:
    year: 2007
    month: 12
abstract: >
  This file contains all the commands needed to perform a default SWATH ion
  library generation at the FGCZ. The document is usually triggered by the
  B-Fabric system [@bfabric] and is meant for training and reproducibility.
vignette: >
  %\VignetteIndexEntry{Automatic Workflow}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
output: 
  BiocStyle::html_document
---

# Requirements

In a first step, the peptide identification result is generated by a standard 
shotgun proteomics experiment and has to be processed using the _bibliospec_
software [@pmid18428681].


The ion library is generated with the `r Biocpkg('specL')` package. The workflow
is described in [@pmid25712692].

The following R package has to be installed on the compute box.

```{r library}
library(specL)
```

This file can be rendered using the following code snippet.

```{r render, eval=FALSE}
library(rmarkdown)
library(BiocStyle)
report_file <- tempfile(fileext = '.Rmd')
file.copy(system.file("doc", "report.Rmd",
                      package = "specL"),
          report_file)
rmarkdown::render(report_file,
                  output_format = 'html_document',
                  output_file = '/tmp/report_specL.html')
```


# Input

## Parameter
If no `INPUT` is defined, the report uses the data shipped with the
`r Biocpkg("specL")` package and the following default parameters.
```{r defineInput}
if(!exists("INPUT")){
  INPUT <- list(FASTA_FILE 
      = system.file("extdata", "SP201602-specL.fasta.gz",
                    package = "specL"),
    BLIB_FILTERED_FILE 
      = system.file("extdata", "peptideStd.sqlite",
                    package = "specL"),
    BLIB_REDUNDANT_FILE 
      = system.file("extdata", "peptideStd_redundant.sqlite",
                    package = "specL"),
    MIN_IONS = 5,
    MAX_IONS = 6,
    MZ_ERROR = 0.05,
    MASCOTSCORECUTOFF = 17,
    FRAGMENTIONMZRANGE = c(300, 1250),
    FRAGMENTIONRANGE = c(5, 200),
    NORMRTPEPTIDES = specL::iRTpeptides,
    OUTPUT_LIBRARY_FILE = tempfile(fileext ='.csv'),
    RDATA_LIBRARY_FILE = tempfile(fileext ='.RData'),
    ANNOTATE = TRUE
    )
} 
```
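
To run the report on other data, `INPUT` has to exist before the `defineInput` chunk is evaluated. By default `rmarkdown::render` evaluates the chunks in the calling environment, so defining `INPUT` in the session that calls `render` should be sufficient. Because the defaults are skipped entirely once `INPUT` exists, the list has to be complete. The following is only a minimal sketch: it starts from a copy of the defaults and overrides selected entries with `modifyList`. All file paths and the changed values are hypothetical placeholders, not recommendations.

```r
# Copy of the non-file defaults listed above.
defaults <- list(
  MIN_IONS = 5,
  MAX_IONS = 6,
  MZ_ERROR = 0.05,
  MASCOTSCORECUTOFF = 17,
  FRAGMENTIONMZRANGE = c(300, 1250),
  FRAGMENTIONRANGE = c(5, 200),
  ANNOTATE = TRUE
)

# Hypothetical overrides; every path below is a placeholder.
overrides <- list(
  FASTA_FILE = "/path/to/my_proteins.fasta",
  BLIB_FILTERED_FILE = "/path/to/my_filtered.sqlite",
  BLIB_REDUNDANT_FILE = "/path/to/my_redundant.sqlite",
  OUTPUT_LIBRARY_FILE = tempfile(fileext = ".csv"),
  RDATA_LIBRARY_FILE = tempfile(fileext = ".RData"),
  MZ_ERROR = 0.02,   # illustrative value only
  MAX_IONS = 10      # illustrative value only
)

# Entries in 'overrides' replace entries of the same name in 'defaults';
# all other defaults are kept.
INPUT <- modifyList(defaults, overrides)

# For a real run NORMRTPEPTIDES is required as well, e.g.
# INPUT$NORMRTPEPTIDES <- specL::iRTpeptides
str(INPUT)
```

After `INPUT` has been defined in this way, the `render` call shown in the Requirements section can be used unchanged.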

The library generation workflow was performed using the following parameters:
```{r cat, echo=FALSE, eval=FALSE}
  cat(
  " MASCOTSCORECUTOFF = ", INPUT$MASCOTSCORECUTOFF, "\n",
  " BLIB_FILTERED_FILE = ", INPUT$BLIB_FILTERED_FILE, "\n",
  " BLIB_REDUNDANT_FILE = ", INPUT$BLIB_REDUNDANT_FILE, "\n",
  " MZ_ERROR = ", INPUT$MZ_ERROR, "\n",
  " FRAGMENTIONMZRANGE = ", INPUT$FRAGMENTIONMZRANGE, "\n",
  " FRAGMENTIONRANGE = ", INPUT$FRAGMENTIONRANGE, "\n",
  " FASTA_FILE = ", INPUT$FASTA_FILE, "\n",
  " MAX_IONS = ", INPUT$MAX_IONS, "\n",
  " MIN_IONS = ", INPUT$MIN_IONS, "\n"
  )

```

```{r kableParameter, echo=FALSE, results='asis'}
library(knitr)
# kable(t(as.data.frame(INPUT)))
# keep only character and numeric parameters; collapse vectors into one string
ii <- lapply(INPUT, function(x) {
  if (typeof(x) %in% c("character", "double")) {
    paste(x, collapse = ', ')
  } else {
    NULL
  }
})

parameter <- as.data.frame(unlist(ii))
names(parameter) <- 'parameter.values'
kable(parameter, caption = 'INPUT parameters used')
```

## Define the fragment ions of interest

The following R helper function is used for composing the in-silico 
fragment ions using `r CRANpkg("protViz")`.
```{r defineFragmenIons}
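fragmentIonFunction_specL <- function (b, y) {
  # monoisotopic masses in Da; only Hydrogen is used below
  Hydrogen <- 1.007825
  Oxygen <- 15.994915
  Nitrogen <- 14.003074
  # the b- and y-ion values are passed through unchanged (charge 1)
  b1_ <- (b )
  y1_ <- (y )
  # doubly charged ions: add one hydrogen mass and divide by the charge
  b2_ <- (b + Hydrogen) / 2
  y2_ <- (y + Hydrogen) / 2 
  # one column per ion type
  return( cbind(b1_, y1_, b2_, y2_) )
}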
```
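
To illustrate what such a helper returns, the following sketch applies a compact copy of the function to three b- and y-ion values. The name `ionSketch` and the input numbers are introduced for illustration only, and it is assumed here that `b` and `y` arrive as numeric vectors of singly charged m/z values.

```r
# Compact copy of the helper above, renamed to keep this sketch self-contained.
ionSketch <- function(b, y) {
  Hydrogen <- 1.007825
  b1_ <- b
  y1_ <- y
  b2_ <- (b + Hydrogen) / 2
  y2_ <- (y + Hydrogen) / 2
  cbind(b1_, y1_, b2_, y2_)
}

# Illustrative singly charged fragment ion m/z values of a short peptide.
b <- c(114.0913, 227.1754, 340.2595)
y <- c(147.1128, 260.1969, 373.2810)

# One row per fragment position, one column per ion type.
ions <- ionSketch(b, y)
ions
```

Each row corresponds to one fragment position; the columns `b2_` and `y2_` hold the doubly charged counterparts of `b1_` and `y1_`.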


## Read the sqlite files

```{r readSqliteFILTERED, warning=FALSE}
BLIB_FILTERED <- read.bibliospec(INPUT$BLIB_FILTERED_FILE) 

summary(BLIB_FILTERED)
```


```{r readSqliteREDUNDANT, warning=FALSE}
BLIB_REDUNDANT <- read.bibliospec(INPUT$BLIB_REDUNDANT_FILE) 
summary(BLIB_REDUNDANT)
```


## Protein (re)-annotation
After the PSMs have been processed with BiblioSpec, the protein information is
gone. The protein identifiers are therefore re-assigned from the FASTA file.

The `read.fasta` function is provided by the CRAN package `r CRANpkg("seqinr")`.

```{r read.fasta}
if(INPUT$ANNOTATE){
  FASTA <- read.fasta(INPUT$FASTA_FILE, 
                    seqtype = "AA", 
                    as.string = TRUE)

  BLIB_FILTERED <- annotate.protein_id(BLIB_FILTERED, 
                                       fasta = FASTA)
}
```
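
Conceptually, re-annotation means finding every protein whose sequence contains a given peptide. The following base R sketch shows that idea with plain substring matching. The protein identifiers and sequences are made up, and whether `annotate.protein_id` proceeds in exactly this way is an assumption.

```r
# Toy protein database: hypothetical identifiers with made-up sequences.
proteins <- c(
  P1 = "MKWVTFISLLLLFSSAYSR",
  P2 = "GVFRRDTHKSEIAHRFK",
  P3 = "AAASEIAHRLLL"
)

# Peptides to annotate; the last one occurs in none of the proteins.
peptides <- c("SEIAHR", "WVTFISLL", "ELVISK")

# For each peptide, collect the names of all proteins that contain it
# as an exact substring (fixed = TRUE disables regular expressions).
annotation <- lapply(peptides, function(p) {
  names(proteins)[grepl(p, proteins, fixed = TRUE)]
})
names(annotation) <- peptides
annotation
```

A peptide that maps to more than one protein, such as `SEIAHR` in this toy example, is not proteotypic; a peptide without any match yields an empty character vector.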

## Peptides used for RT normalization

The following peptides are used for the RT normalization. The last column 
indicates by FALSE/TRUE whether a peptide is included in the data. The rows are 
ordered by their RT values.

```{r checkIRTs, echo=FALSE, results='asis'}
library(knitr)
incl <-  INPUT$NORMRTPEPTIDES$peptide %in% sapply(BLIB_REDUNDANT, function(x){x$peptideSequence})
INPUT$NORMRTPEPTIDES$included <- incl

if (sum(incl) > 0){
  res <- INPUT$NORMRTPEPTIDES[order(INPUT$NORMRTPEPTIDES$rt),]
  # row.names(res) <- 1:nrow(res)
  kable(res, caption='peptides used for RT normalization.')
}
```
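
The inclusion check behind this table can be sketched with toy data as follows. The list `psms` is a hypothetical stand-in for the redundant BiblioSpec object, and the small reference table only mimics the `peptide` and `rt` columns used above; its values are illustrative.

```r
# Hypothetical stand-in for the PSM list: each element has a peptideSequence.
psms <- list(
  list(peptideSequence = "LGGNEQVTR"),
  list(peptideSequence = "AAAAAK"),
  list(peptideSequence = "GAGSSEPVTGLDAK")
)

# Illustrative reference table, deliberately not sorted by rt.
ref <- data.frame(
  peptide = c("VEATFGVDESNAK", "LGGNEQVTR", "GAGSSEPVTGLDAK"),
  rt = c(12.39, -24.92, 0.00),
  stringsAsFactors = FALSE
)

# Flag every reference peptide that was observed among the PSMs.
observed <- vapply(psms, function(x) x$peptideSequence, character(1))
ref$included <- ref$peptide %in% observed

# Order the rows by the reference retention time.
ref <- ref[order(ref$rt), ]
ref
```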

# Generate the ion library

```{r specL::genSwathIonLib, message=FALSE}
specLibrary <- specL::genSwathIonLib(
  data = BLIB_FILTERED,
  data.fit = BLIB_REDUNDANT,
  max.mZ.Da.error = INPUT$MZ_ERROR,
  topN = INPUT$MAX_IONS,
  fragmentIonMzRange = INPUT$FRAGMENTIONMZRANGE,
  fragmentIonRange = INPUT$FRAGMENTIONRANGE,
  fragmentIonFUN = fragmentIonFunction_specL,
  mascotIonScoreCutOFF = INPUT$MASCOTSCORECUTOFF,
  iRT = INPUT$NORMRTPEPTIDES
  )
```
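
The effect of the parameters `fragmentIonMzRange` and `topN` can be pictured with a single toy spectrum: fragment ions outside the m/z window are discarded and the most intense remaining ions are kept. The sketch below is a simplified illustration in base R. The order of the filtering steps, the inclusive window boundaries, and all numbers are assumptions and do not reproduce the internals of `genSwathIonLib`.

```r
# Toy fragment ion list of one spectrum (made-up values).
mz <- c(250.1, 310.2, 455.3, 620.4, 880.5, 1300.6)
intensity <- c(900, 120, 560, 75, 430, 800)

mz_range <- c(300, 1250)  # same window as FRAGMENTIONMZRANGE above
top_n <- 3                # hypothetical value standing in for topN

# Step 1: indices of the ions inside the m/z window.
candidates <- which(mz >= mz_range[1] & mz <= mz_range[2])

# Step 2: rank the candidates by decreasing intensity and keep at most top_n.
ranked <- candidates[order(intensity[candidates], decreasing = TRUE)]
keep <- ranked[seq_len(min(top_n, length(ranked)))]

selected <- data.frame(mz = mz[keep], intensity = intensity[keep])
selected
```

In this toy case the two most intense ions of the spectrum lie outside the window and therefore never reach the ranking step.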

## Library Generation Summary

The total number of PSMs with a Mascot e-value < 0.05
in your search is __`r length(BLIB_REDUNDANT)`__.
The number of unique precursors is __`r length(BLIB_FILTERED)`__.
The size of the generated ion library is __`r length(specLibrary@ionlibrary)`__.
That means that __`r round(length(specLibrary@ionlibrary)/length(BLIB_FILTERED) * 100, 2)`__ % 
of the unique precursors fulfilled the filtering criteria.


```{r summarySpecLibrary}
summary(specLibrary)
```


The following two code snippets print and plot the first element of the ion library:
```{r showSpecLibrary}
#  slotNames(specLibrary@ionlibrary[[1]])
specLibrary@ionlibrary[[1]]
```

```{r plotSpecLibraryIons, fig.retina=3}
plot(specLibrary@ionlibrary[[1]])
```


```{r plotSpecLibrary, fig.retina=3}
plot(specLibrary)
```
The `plot` call above gives an overview of the whole ion library. 
Please note that the iRT peptides used for the normalization of RT do not have
to be included in the resulting `specLibrary`.

# Output

```{r write.spectronaut, eval=TRUE}
write.spectronaut(specLibrary, file =  INPUT$OUTPUT_LIBRARY_FILE)
```

```{r save, eval=TRUE}
save(specLibrary, file = INPUT$RDATA_LIBRARY_FILE)
```
The first chunk writes the ion library with `write.spectronaut`; the second
saves the result object to an RData file.
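
A saved object can be restored later with `load`. The sketch below shows the round trip with a small stand-in object in base R; for the report itself, the file would be `INPUT$RDATA_LIBRARY_FILE` and the restored object would be named `specLibrary`.

```r
# Stand-in for the result object (hypothetical content).
result <- list(n = 3, label = "toy library")

# Save to a temporary RData file.
rdata_file <- tempfile(fileext = ".RData")
save(result, file = rdata_file)

# Load into a separate environment so that nothing in the current
# session is overwritten; load() returns the names of the restored objects.
restored <- new.env()
loaded_names <- load(rdata_file, envir = restored)
loaded_names

# The restored object is identical to the original one.
identical(restored$result, result)
```

Loading into a dedicated environment is a cautious choice, because `load` otherwise silently replaces objects of the same name in the calling environment.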

# Remarks

For questions and suggestions for improvement, please
contact the authors of the [specL](https://bioconductor.org/packages/specL/) package.

This R Markdown report file was written by WEW and
is maintained by CP.

# Session info

Here is the output of `sessionInfo()` on the system on which this
document was compiled:

```{r sessionInfo, echo=FALSE}
sessionInfo()
```

# References