\name{pileup}
\alias{pileup}

\title{Calculate a pile-up representation of short-read mappings}

\description{
Given short-read mappings or similar data, this function calculates a pile-up, i.e., an
integer vector representing the reference sequence (typically one of the chromosomes), such
that its length is the number of base pairs of the reference sequence, and each integer
is the number of reads (or fragments, see below) mapped to the corresponding base pair.
}

\usage{
pileup( start, fraglength, chrlength, 
   dir = factor( "+", levels=c("-","+") ), 
   readlength = fraglength,
   offset = 1 )
}

\arguments{

\item{start}{A vector with the start positions of each read on the reference sequence. All
reads must correspond to the same reference sequence.}

\item{fraglength}{A vector of the same length as 'start' with the lengths of all the fragments.
Alternatively, a single integer, specifying one constant length to assume for all fragments.}

\item{chrlength}{The length of the reference sequence. You may use the function
\code{\link{readBfaToc}} to extract this information from the .bfa file.}

\item{dir}{A factor with levels "-" and "+" of the same length as 'start', specifying whether the
fragment extends to the right (towards higher index values, '+') or to the left (towards lower
index values, '-') beyond the read. See below for more explanation.}

\item{readlength}{The length of the reads, either as a vector of the same length as 'start' or as
a single number. This parameter makes sense only if 'dir' is used, too. If not specified, read
lengths and fragment lengths are taken to be the same.}

\item{offset}{The index of the first base pair in the result vector. The default is 1, i.e.,
the 'start' positions are assumed to be in 1-based chromosome coordinates.}

}

\value{An integer vector of length 'chrlength', each element counting how many fragments map to
the corresponding base pair.}

\note{

1. This function is not (yet) suitable for paired-end reads.


2. If the arguments 'dir' and 'readlength' are not used, the fragments are assumed to start at the
positions given in 'start' and extend to the right by the number of base pairs given in
'fraglength'. If 'dir' and 'readlength' are supplied, then the interval starting at 'start' and
extending to the right by the number of base pairs given in 'readlength' marks the position of
the read, which is one end of the fragment. If 'dir' is '+', the read is taken as the left end and
the fragment is extended to the right to have the total length given by 'fraglength'. If
'dir' is '-', the read is taken as the right end and the fragment is extended to the left. Note
that in the latter case, the 'start' position marks the border between the read and the rest of
the fragment, not an actual end of the fragment. If this is confusing, look at the examples below.

3. Sorry for the inconsistent use of 'width' and 'length', which are used interchangeably here.
}

\examples{\dontrun{ 

# Example 1: Assuming that 'lane' is an AlignedRead object containing aligned reads from a
# Solexa lane, you may get a pile-up representation of chromosome 13 as follows.
# Note that 'width(lane)' is subset in the same way as the start positions, so that
# both vectors have the same length.

chr13length <- 114142980   # the length of human chromosome 13
pu <- pileup( position(lane)[chromosome(lane)=="13"],
   width(lane)[chromosome(lane)=="13"], chr13length )

# Example 2: Even though the width of the reads (as reported by width(lane)) is only 24,
# these 24 bp are just one end of a longer fragment. Assuming that all fragments have been
# sonicated to about the same length, say 150 bp, we may get a better pile-up representation by:

pu2 <- pileup( position(lane)[chromosome(lane)=="13"], 150, chr13length,
   strand(lane)[chromosome(lane)=="13"], width(lane)[chromosome(lane)=="13"] )
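
# Example 3 (illustrative sketch only): the interval rules described in the note above
# can be written out in plain R. 'pileupSketch' is a hypothetical helper made up for
# this illustration; it is not the package's implementation and is probably much slower.
# It assumes that fragments reaching beyond the reference sequence are simply clipped,
# which may differ from what pileup() itself does.

pileupSketch <- function( start, fraglength, chrlength,
      dir = factor( "+", levels=c("-","+") ),
      readlength = fraglength, offset = 1 ) {
   n <- length( start )
   fraglength <- rep( fraglength, length.out=n )
   readlength <- rep( readlength, length.out=n )
   dir <- rep( as.character( dir ), length.out=n )
   # '+': the read is the left end, so the fragment begins at 'start'.
   # '-': the read is the right end, so the fragment ends where the read ends.
   left <- ifelse( dir == "+", start, start + readlength - fraglength )
   right <- left + fraglength - 1
   # Convert to 1-based vector indices and clip to the reference (assumption).
   left <- pmax( left - offset + 1, 1 )
   right <- pmin( right - offset + 1, chrlength )
   # Mark where each fragment begins and where it has ended; a cumulative sum
   # over these marks then gives the coverage at each base pair.
   delta <- integer( chrlength + 1 )
   for( i in seq_len( n ) ) {
      if( left[i] <= right[i] ) {
         delta[ left[i] ] <- delta[ left[i] ] + 1L
         delta[ right[i] + 1 ] <- delta[ right[i] + 1 ] - 1L
      }
   }
   cumsum( delta )[ seq_len( chrlength ) ]
}

# Toy data: two reads of length 2 on a reference of length 10, fragments of length 4.
# The '+' fragment covers positions 2 to 5; the '-' fragment covers positions 4 to 7.
pileupSketch( c(2, 6), 4, 10, factor( c("+","-"), levels=c("-","+") ), 2 )
# gives 0 1 1 2 2 1 1 0 0 0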

}}

\author{Simon Anders, EMBL-EBI, \email{sanders@fs.tum.de}}