---
title: "FScanR: detect programmed ribosomal frameshifting events from various genomes"
author: "Xiao Chen\\ Columbia University Medical Center"
date: "`r Sys.Date()`"
output:
  prettydoc::html_pretty:
    toc: true
    theme: cayman
    highlight: github
  pdf_document:
    toc: true
vignette: >
  %\VignetteIndexEntry{FScanR}
  %\VignetteEngine{knitr::rmarkdown}
  %\usepackage[utf8]{inputenc}
---

```{r style, echo=FALSE, results="asis", message=FALSE}
knitr::opts_chunk$set(tidy = FALSE, message = FALSE)
```

```{r echo=FALSE}
CRANpkg <- function (pkg) {
    cran <- "https://CRAN.R-project.org/package"
    fmt <- "[%s](%s=%s)"
    sprintf(fmt, pkg, cran, pkg)
}

Biocpkg <- function (pkg) {
    sprintf("[%s](http://bioconductor.org/packages/%s)", pkg, pkg)
}
```

```{r echo=FALSE, results='hide', message=FALSE}
library(FScanR)
```

# Abstract

'FScanR' identifies Programmed Ribosomal Frameshifting (PRF) events from BLASTX homolog sequence alignments of targeted genomic/cDNA/mRNA sequences against the peptide library of the same species or a close relative.

The output of BLASTX or diamond BLASTX is used as the input of 'FScanR' and should be in a tabular format with 14 columns.

For BLASTX, the output parameter should be:

`-outfmt '6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore qframe sframe'`

For diamond BLASTX, the output parameter should be:

`-outfmt 6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore qframe qframe`

For details, please visit .

# Introduction

Ribosomal frameshifting, also known as translational frameshifting or translational recoding, is a biological phenomenon that occurs during translation and results in the production of multiple, unique proteins from a single mRNA. The process can be programmed by the nucleotide sequence of the mRNA and is sometimes affected by the secondary, 3-dimensional mRNA structure.
It has been described mainly in viruses (especially retroviruses), retrotransposons and bacterial insertion elements, and also in some cellular genes.

For details, please visit [Ribosomal frameshift](https://en.wikipedia.org/wiki/Ribosomal_frameshift).

## Install FScanR

```{r eval = FALSE}
## Install FScanR in R (>= 3.5.0)
if(!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("FScanR")
```

## Load FScanR and test data

The dataset _test_ in this vignette is a homolog sequence alignment between _Euplotes vannus_ mRNA and protein sequences, output by BLASTX, from [_Chen et al., 2019_](https://doi.org/10.1111/1755-0998.13023).

```{r}
## loading package
library(FScanR)
## loading test data (tab-separated BLASTX output with a header row)
test_data <- read.table(system.file("extdata", "test.tab", package = "FScanR"), header=TRUE, sep="\t")
```

## PRF events detection

The default cutoffs for _E-value_ and _frameDist_ are _1e-05_ and _10_ (nt), respectively.

A low _E-value_ cutoff ensures the fidelity of the sequence alignment, but an overly strict cutoff may also lead to false-negative detection.

A small _frameDist_ cutoff avoids the false-positive PRF events introduced by introns, especially when genomic sequences are used as the query. The _frameDist_ cutoff should be at least 4 nt.

Detected PRF events are output in tabular format with 7 columns. The column _FS_type_ contains the type (-2, -1, +1, +2) of each PRF event.

```{r}
## detect PRF events with the default cutoffs, then count events per type
prf <- FScanR(test_data, evalue_cutoff = 1e-05, frameDist_cutoff = 10)
table(prf$FS_type)
```

## PRF event types plot

In this vignette, the number of detected events of each of the four PRF types is presented in a pie chart.

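The chart is built from a per-type tally. As a minimal sketch of that step (not part of FScanR itself), the block below uses a hypothetical toy vector `toy_types` in place of `prf$FS_type` and assumes the type codes are stored as the numbers -2, -1, 1 and 2. Fixing the factor levels keeps all four types in the tally even when one of them has no events, which a plain `table()` call would silently drop.

```r
## hypothetical toy values standing in for prf$FS_type (not real FScanR output)
toy_types <- c(-1, -1, 1, 2, -1)

## fix the levels so that every PRF type is tallied, even with zero events
## (assumption: type codes are the numbers -2, -1, 1, 2)
toy_counts <- table(factor(toy_types, levels = c(-2, -1, 1, 2)))
print(toy_counts)
```

Whether zero-count types should appear in a chart is a presentation choice; the plot in this vignette tallies only the types that were actually detected.
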
```{r}
## plot the 4-type PRF events detected
mytable <- table(prf$FS_type)
lbls <- paste(names(mytable), " : ", mytable, sep="")
pie(mytable, labels = NA, main=paste0("PRF events"), cex=0.5, col=cm.colors(length(mytable)))
legend("right",legend=lbls[!is.na(lbls)], bty="n", cex=1, fill=cm.colors(length(mytable))[!is.na(lbls)])
```

# Citation

If you use [FScanR](https://github.com/seanchen607/FScanR) in published research, please cite the most appropriate paper(s) from this list:

1.  **X Chen**, Y Jiang, F Gao\*, W Zheng, TJ Krock, NA Stover, C Lu, LA Katz & W Song (2019). Genome analyses of the new model protist Euplotes vannus focusing on genome rearrangement and resistance to environmental stressors. ***Molecular Ecology Resources***, 19(5):1292-1308. doi: [10.1111/1755-0998.13023](https://doi.org/10.1111/1755-0998.13023).

# Session Information

Here is the output of `sessionInfo()` on the system on which this document was compiled:

```{r echo=FALSE}
sessionInfo()
```