%\VignetteIndexEntry{miRNAtap}
%\VignetteKeywords{miRNAtap, miRNA, targets}
%\VignettePackage{miRNAtap}

\documentclass{article}

\usepackage{amsmath}
\usepackage{amscd}
\usepackage[tableposition=top]{caption}
\usepackage{ifthen}
\usepackage{natbib}
\usepackage[utf8]{inputenc}

\usepackage{Sweave}
\SweaveOpts{prefix.string = tP}


\begin{document}
\SweaveOpts{concordance=TRUE}

\title{{\bf miRNAtap} example use}
\author{Maciej Pajak, Ian Simpson}
\date{\today}

\maketitle
\tableofcontents
\newpage


\section{Introduction}
{\tt miRNAtap} package is designed to facilitate implementation of 
workflows requiring miRNA prediction.
Aggregation of commonly used prediction algorithm outputs in a way that 
improves on performance of every single one of them on their own when 
compared against experimentally derived targets.
microRNA (miRNA) is a 18-22nt long single strand that binds with RISC (RNA 
induced silencing complex) and targets mRNAs effectively reducing their 
translation rates.

Targets are aggregated from 5 most commonly cited prediction algorithms: 
DIANA \citep{Maragkakis2011}, Miranda \citep{Enright2003}, PicTar 
\citep{Lall2006}, TargetScan \citep{Friedman2009}, and miRDB \citep{Wong2015}.

Programmatic access to sources of data is crucial when streamlining the 
workflow of our analysis, this way we can run similar analysis for multiple 
input miRNAs or any other parameters.
Not only does it allow us to obtain predictions from multiple sources 
straight into R but also through aggregation of sources it improves the 
quality of predictions.

Finally, although direct predictions from all sources are only available for 
{\it Homo sapiens} and {\it Mus musculus}, this package includes an 
algorithm that allows to translate target genes to other speices (currently 
only {\it Rattus norvegicus}) using homology information where direct 
targets are not available.


\section{Installation}

This section briefly describes the necessary steps to get {\tt 
miRNAtap} running on your system. We assume that the user has the R 
program (see the R project at http://www.r-project.org) already installed 
and is familiar with it. 
You will need to have R 3.2.0 or later to be able to install and run {\tt 
miRNAtap}.
The miRNAtap package is available from the Bioconductor repository at 
http://www.bioconductor.org To be
able to install the package one needs first to install the core Bioconductor 
packages. If you have already
installed Bioconductor packages on your system then you can skip the two 
lines below.

<<eval = FALSE>>=
if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install()
@

Once the core Bioconductor packages are installed, we can install the 
miRNAtap and accompanying database {\tt miRNAtap.db} package by

<<eval = FALSE>>=
if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("miRNAtap")
BiocManager::install("miRNAtap.db")
@

\section{Workflow}

This section explains how {\tt miRNAtap} package can be integrated in 
the workflow aimed at predicting which processes can be regulated by a given 
microRNA.

In this example workflow we'll use {\tt miRNAtap} as well as another 
Bioconductor package {\tt topGO} together with Gene Ontology (GO) 
annotations. 
In case we don't have {\tt topGO} or GO annotations on our machine we need 
to install them first:

<<eval = FALSE>>=
if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("topGO")
BiocManager::install("org.Hs.eg.db")
@

Then, let's load the required libraries

<<results = hide>>=
library(miRNAtap)
library(topGO)
library(org.Hs.eg.db)
@

Now we can start the analysis. First, we will obtain predicted targets for 
human miRNA {\it miR-10b} 

<<>>=
mir = 'miR-10b'
predictions = getPredictedTargets(mir, species = 'hsa',
                                    method = 'geom', min_src = 2)
@

Let's inspect the top of the prediction list.

<<>>=
head(predictions)
@

We are using {\it geometric mean} aggregation method as it proves to perform 
best when tested against experimental data from MirBase 
\citep{Griffiths-Jones2008}. 

We can compare it to the top of the list of the output of {\it mimumum} method:

<<>>=
predictions_min = getPredictedTargets(mir, species = 'hsa',
                                    method = 'min', min_src = 2)
head(predictions_min)
@


Where predictions for rat genes are not available we can obtain predictions 
for mouse genes and translate them into rat genes through homology. 
The operation happens automatically if we specify species as {\tt rno} (for 
{\it Rattus norvegicus})

<<>>=
predictions_rat = getPredictedTargets(mir, species = 'rno',
                                    method = 'geom', min_src = 2)
@

Now we can use the ranked results as input to GO enrichment analysis. For 
that we will 
use our initial prediction for human {\it miR-10b}

<<results = hide>>=
rankedGenes = predictions[,'rank_product']
selection = function(x) TRUE 
# we do not want to impose a cut off, instead we are using rank information
allGO2genes = annFUN.org(whichOnto='BP', feasibleGenes = NULL,
                mapping="org.Hs.eg.db", ID = "entrez")
GOdata =  new('topGOdata', ontology = 'BP', allGenes = rankedGenes, 
            annot = annFUN.GO2genes, GO2genes = allGO2genes, 
            geneSel = selection, nodeSize=10)
@

In order to make use of the rank information we will use Kolomonogorov 
Smirnov (K-S) test instead of Fisher exact test which is based only on counts.

<<>>=
results.ks = runTest(GOdata, algorithm = "classic", statistic = "ks")
results.ks
@

We can view the most enriched GO terms (and potentially feed them to further 
steps in our workflow)

<<>>=
allRes = GenTable(GOdata, KS = results.ks, orderBy = "KS", topNodes = 20)
allRes[,c('GO.ID','Term','KS')]
@

For more details about GO analysis refer to {\tt topGO } package vignette 
\citep{topgo}.

Finally, we can use our predictions in a similar way for pathway enrichment 
analysis based on KEGG \citep{Kanehisa2000}, for example using 
Bioconductor's {\tt KEGGprofile} \citep{keggprofile}.
\section{Session Information}

<<echo=FALSE,results=tex>>=
toLatex(sessionInfo())
@

\addcontentsline{toc}{section}{References} \label{references}


\bibliography{refs}
\bibliographystyle{apalike}


\end{document}