%\VignetteIndexEntry{biomvRCNS package introduction} %\VignetteKeyword{segmentation} %\VignetteKeyword{HSMM} %\VignetteKeyword{transcriptome mapping} %\VignetteKeyword{copy number analysis} %\VignetteKeyword{max gap min run} %\VignettePackage{biomvRCNS} %\VignetteDepends{methods, IRanges, GenomicRanges, GenomicFeatures, Gviz, TxDb.Hsapiens.UCSC.hg19.knownGene, Rsamtools} \documentclass[10pt]{article} \usepackage{natbib} \usepackage{hyperref} \usepackage[utf8]{inputenc} \usepackage[nogin]{Sweave} \usepackage[margin=2cm]{geometry} \newcommand{\Robject}[1]{{\texttt{#1}}} \newcommand{\Rfunction}[1]{{\texttt{#1}}} \newcommand{\Rpackage}[1]{{\textit{#1}}} \newcommand{\Rclass}[1]{{\textit{#1}}} \newcommand{\Rmethod}[1]{{\textit{#1}}} \newcommand{\Rfunarg}[1]{{\textit{#1}}} \begin{document} \setkeys{Gin}{width=0.8\textwidth} \setlength{\parskip}{10pt plus 1pt minus 1pt} \title{\Rpackage{biomvRCNS}: Copy Number study and Segmentation for multivariate biological data.} \author{Yang Du\footnote{email: yang.du@uni-rostock.de}} \date{July 16, 2014} \maketitle \begin{abstract} With high throughput experiments like tiling array and NGS, researchers are looking for continuous homogeneous segments or signal peaks, which would represent chromatin states, methylation ratio, transcripts or genome regions of deletion and amplification. While in a normal experimental set-up, these profiles would be generated for multiple samples or conditions with replicates. In the package \Rpackage{biomvRCNS}, a Hidden Semi Markov Model and one homogeneous segmentation model are implemented and tailored to handle multiple genomic profiles, with the aim of assisting in transcripts detection using high throughput technology like RNA-seq or tiling array, and copy number analysis using aCGH or targeted sequencing. \end{abstract} % \section{Introduction} % To begin with \Rpackage{biomvRCNS}, load the package and read the manual page. <>= library(biomvRCNS) @ <>= options(width = 95) @ In the package, 3 main functions are provided for the batch processing of multiple chromosome regions across samples: \Rfunction{biomvRhsmm}, a hidden semi Markov model (HSMM) \citep{biomvrcns2013fbn}; \Rfunction{biomvRseg}, a maximum likelihood based homogeneous segmentation model; and a third \Rfunction{biomvRmgmr}, custom batch function using max-gap-min-run algorithm. In the following sections we will illustrate their functionalities using example data. Currently the package does not deal with data correction, so input should be normalized by reference or paired sample and corrected for factor of interest before passing down. % \section{Example of array CGH data set of Coriell cell lines} % Extracted from packge \Rpackage{DNACopy} \citep{cbsdnacopy}, the \Robject{coriell} data contains two aCGH studies (GM05296 and GM13330) of Corriel cell lines taken from \citet{snijders2001assembly}. In particular, with 2271 mapped features in total across 22 autosomes and chromosome X. All three main functions accept common data matrix plus positional information as input or a \Rclass{GRanges} object with data matrix stored in the meta columns. To get started, we first build a \Rclass{GRanges} object from \Rclass{data.frame}. <>= data('coriell', package='biomvRCNS') head(coriell, n=3) xgr<-GRanges(seqnames=coriell[,2], IRanges(start=coriell[,3], width=1, names=coriell[,1])) values(xgr)<-DataFrame(coriell[,4:5], row.names=NULL) xgr<-sort(xgr) head(xgr, n=3) @ Please be sure that the data is sorted with respect to their positions before feeding to the models. % \subsection{Genomic segmentation with Hidden-semi Markov model} % First we use the hidden-semi Markov model with the batch function \Rfunction{biomvRhsmm}, which will sequentially process each chromosome identified by the \Rfunarg{seqnames} (using dummy name when no \Rclass{GRanges} supplied in \Robject{x} or \Robject{xRange}), thus for non-continuous regions on the same chromosome user should give different \Rfunarg{seqnames} to each part of the data. Within this package, there is one argument \Rfunarg{grp}, for all main batch functions, which is used to assign data columns to groups according to the experimental design, say technical replicates or biological replicates. Sample columns within the same group could be treated simultaneously in the modelling process as well as iteratively. \footnote{Simultaneous treatment within group is currently available for \Rfunarg{emis.type} equals 'mvnorm' or 'mvt' in \Rfunction{biomvRhsmm}, \Rfunarg{poolGrp=TRUE} in \Rfunction{biomvRmgmr} and \Rfunarg{twoStep=FALSE} in \Rfunction{biomvRseg}.} In this example, the two profiles are considered independent and not similar, thus been given different values in the \Rfunarg{grp} vector. Additionally there is a built-in automatic grouping method, given a valid clustering method \Rfunarg{cluster.m} and \Rfunarg{grp} set to \Robject{NULL}. By default, all data columns are assumed to be from the same group. <>= rhsmm<-biomvRhsmm(x=xgr, maxbp=1E5, J=3, soj.type='gamma', com.emis=T, emis.type='norm', prior.m='quantile', grp=c(1,2)) @ <>= show(rhsmm) @ In the above run, we limit the model complexity by setting the \Rfunarg{maxbp} to $1E5$, which will restrict the maximum sojourn time to \Rfunarg{maxbp}. \Rfunarg{J} is the number of states in the HSMM model, this argument can be given explicitly or estimated from prior information provided in \Rfunarg{xAnno}. Argument \Rfunarg{soj.type} defines the type of sojourn distribution; with Gamma distributed sojourn, the neighbouring position will tend to have the same state, and transit to other states if far apart. In this way the sojourn distribution fully incorporate the positional information into the probabilistic framework. Argument \Rfunarg{emis.type} controls the distribution of emission probability, in this case the log2 ratio of aCGH data is considered to follow Normal distribution. The emission density could be estimated using all data or only data on the respective region or chromosome (identified by unique \Rfunction{seqnames}), controlling via \Rfunarg{com.emis}. In this case, the ratios cross chromosomes are directly comparable, thus \Rfunarg{com.emis} was set to true. The prior of the emission parameters could be controlled by supplying \Rfunarg{q.alpha} and \Rfunarg{r.var} with \Rfunarg{prior.m='quantile'}, or automatically determined through a clustering process with \Rfunarg{prior.m='cluster'}. The function will then call C codes and estimate the most likely state sequence, with either \Rfunarg{cMethod='F-B'} or \Rfunarg{cMethod='Viterbi'}. The \Rfunarg{F-B} method (default) uses a forward-backward algorithm described in \citet{Guedon2003}, which gives a smooth state sequence, whereas the Viterbi algorithm with \Rfunarg{cMethod='Viterbi'} will use the state profile estimated by the forward-backward algorithm and rebuild the most likely state sequence. The parameter \Rfunarg{maxit} controls the maximum iteration of the EM algorithm. When assessing aCGH data, the quantile method should be able to give a good estimation of the emission desity priors, one can also adjust \Rfunarg{q.alpha} and \Rfunarg{r.var} for better control over the mean-variance relationships in extreme states. SInce we are not training a prediction model, but trying to derive the most likely state sequence, one iteration of the EM procedure is sufficient. The function returns an object of class \Rclass{biomvRCNS}, in which the \Robject{res} slot is a \Rclass{GRanges} object containing the summary of each estimated segments. There are three meta columns: column \Robject{SAMPLE} gives the column name of which sample this segment belongs to; column \Robject{STATE}, the estimated state for each segment, the lower state number represents state with lower mean value, thus in this example, a state of $1$ could represent region of deletion and $3$ for region of duplication, whereas state $2$ could be considered copy neutral; column \Robject{AVG}, gives the segment average value, which could take the form of (trimmed) mean or median controlled by \Rfunarg{avg.m}. The original input is also kept and returned in slot \Robject{x} with the estimated most likely state assignment and associated probability. A \Rfunction{plot} method has been implemented for \Rclass{biomvRCNS} object using package \Rpackage{Gviz}, by default the \Rfunction{plot} method tries to output graphics to multiple EPS/PDF files for each chromosome region and sample. Multiple samples could also be overlaid on the same image, by passing \Rfunarg{sampleInOne=TRUE} in the \Rfunction{plot} method. Here we set \Rfunarg{tofile=FALSEALSE} to output graphics to the current device, and only show resulting graphics for chromosome 11 from sample Coriell.05296. \begin{center} <>= obj<-biomvRGviz(exprgr=xgr[seqnames(xgr)=='11', 'Coriell.05296'], seggr=rhsmm@res[mcols(rhsmm@res)[,'SAMPLE']=='Coriell.05296'], tofile=FALSE) @ \end{center} % \subsection{Using other methods provided in the package} % In this section, we use the other two batch functions to process the \Robject{coriell} data. First we use \Rfunction{biomvRseg}, in which a similar segmentation method like in the package \Rpackage{tillingArray} \citep{Hubertiling} is implemented and extended to handle Poisson and Negative binomial distributed data. The function shares several argument with \Rfunction{biomvRhsmm}, like \Rfunarg{maxbp} and \Rfunarg{grp}. The \Rfunarg{maxseg} gives the maximum number of segment per chromosome region, while the optimal number of segment per chromosome region is determined internally by assessing the likelihood with optional penalty terms, by default \Rfunarg{penalty='BIC'} is used. Another option is to use modified Bayes information criterion \Rfunarg{penalty='mBIC'} \citep{mbic2007}, as in the CBS algorithm used in \Rpackage{DNAcopy}. The function proceed in the following manner: assuming within each group sample columns exhibit similar patterns, and thus be processed simultaneously in the first step. By maximizing the likelihood the optimal number of segments is selected for each group. And in a second step if \Rfunarg{twoStep=TRUE} or merging is necessary, the candidate segments produced in the first step are merged with respect to each sample, thus forcing sample columns in the same group to have a more unified segmentation result yet keeping it possible to have sample specific pattern. <>= rseg<-biomvRseg(x=xgr, maxbp=4E4, maxseg=10, family='norm', grp=c(1,2)) @ <>= head(rseg@res) @ After the example run, the function returns a \Rclass{biomvRCNS} object, containing similar information as the previous \Rfunction{biomvRhsmm} run, except that the \Rfunarg{STATE} column now only have a binary state value of either "HIGH" or "LOW", which is simply graded as 'HIGH' if the segment average is higher than the grand average of the whole region, and 'LOW' otherwise. It is also possible to use the simple max-gap-min-run algorithm to segment aCGH profiles, by calling \Rfunction{biomvRmgmr}. But due to the binary nature of the algorithm, one have to run twice in order to get both extremely high and low segments, then combine the resulting \Rclass{GRanges} manually. <>= rmgmrh<-biomvRmgmr(xgr, q=0.9, high=T, maxgap=1000, minrun=2500, grp=c(1,2)) rmgmrl<-biomvRmgmr(xgr, q=0.1, high=F, maxgap=1000, minrun=2500, grp=c(1,2)) res<-c(rmgmrh@res, rmgmrl@res) @ % \section{Example of transcript detection with RNA-seq data from ENCODE} % The data contains gene expressions and transcript annotations in the region of the human TP53 gene (chr17:7,560,001-7,610,000 from the Human February 2009 (GRCh37/hg19) genome assembly), which is part of the long RNA-seq data generated by ENCODE \citep{ENCODEproj} /Cold Spring Harbor Lab, containing 2 cell types (GM12878 and K562) with 2 replicates each. The libraries were sequenced on the Illumina GAIIx platform as paired-ends for 76 or 101 cycles for each read. The average depth of sequencing was ~200 million reads (100 million paired-ends). The data were mapped against hg19 using Spliced Transcript Alignment and Reconstruction (STAR). To generate local read counts, alignment files were pulled from UCSC (\url{http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeCshlLongRnaSeq/}) using package \Rpackage{Rsamtools}. And subsequently reads were counted in each non-overlapping unit sized window for the region (chr17:7,560,001-7,610,000). In the pre-compiled data \Robject{encodeTP53} , a window size of 25bp was used with the chunk of code below. <>= winsize<-25 cgr<-GRanges("chr17", strand='-', IRanges(start=seq(7560001, 7610000, winsize), width =winsize)) bf<-system.file("extdata", "encodeFiles.txt", package = "biomvRCNS") bamfiles<-read.table(bf, header=T, stringsAsFactors=F) library(Rsamtools) which<-GRanges("chr17", IRanges(7560001, 7610000)) param<-ScanBamParam(which=which, what=scanBamWhat()) for(i in seq_len(nrow(bamfiles))){ frd<-scanBam(bamfiles[i,1], param=param) frdgr<-GRanges("chr17", strand=frd[[1]]$strand, IRanges(start=frd[[1]]$pos , end = frd[[1]]$pos+frd[[1]]$qwidth-1)) mcols(cgr)<-DataFrame(mcols(cgr), DOC=countOverlaps(cgr, frdgr)) } @ Alternatively one can also operate on base pair resolution, in which case a \Rclass{Rle} object should be preferred to store the count data for lower memory footprint and better efficiency. Also to speed things up, one could set \Rfunarg{useMC=T} to enable parallel processing of multiple \Robject{seqnames}, the number of parallel process could be set by \Rfunarg{options(mc.cores=n)}. <>= cgr<-GRanges("chr17", strand='-', IRanges(seq(7560001, 7610000), width=1)) bf<-system.file("extdata", "encodeFiles.txt", package = "biomvRCNS") bamfiles<-read.table(bf, header=T, stringsAsFactors=F) library(Rsamtools) which<-GRanges("chr17", IRanges(7560001, 7610000)) param<-ScanBamParam(which=which, flag=scanBamFlag(isMinusStrand=TRUE)) for(i in seq_len(nrow(bamfiles))){ cod<-coverage(BamFile(bamfiles[i,1]), param=param)[['chr17']][7560001:7610000] mcols(cgr)<-DataFrame(mcols(cgr), DOC=cod) } @ The pre-compiled data \Robject{encodeTP53} also includes the regional annotation of TP53 RNAs isoforms, \Robject{gmgr}, which were derived from the ENCODE Gene Annotations (GENCODE), \url{http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeGencodeV4/wgEncodeGencodeManualV4.gtf.gz)} , and subset to only isoforms of TP53 gene and neighboring genes in the region. The annotation object \Robject{gmgr} could be rebuilt with the following lines using the included file under \Robject{extdata}. The additional file 'gmodTP53.gff' can also be directly imported as \Rclass{GRanges} using \Rpackage{rtracklayer}. <>= af<-system.file("extdata", "gmodTP53.gff", package = "biomvRCNS") gtfsub<-read.table(af, fill=T, stringsAsFactors=F) gmgr<-GRanges("chr17", IRanges(start=gtfsub[, 4], end=gtfsub[, 5], names=gtfsub[, 13]), strand=gtfsub[, 7], TYPE=gtfsub[, 3]) @ We first load the \Robject{encodeTP53} data, poll the read counts for each cell type and add 1 to the base count to increase stability. <>= data(encodeTP53) cgr<-encodeTP53$cgr gmgr<-encodeTP53$gmgr mcols(cgr)<-DataFrame( Gm12878=1+rowSums(as.matrix(mcols(cgr)[,1:2])), K562=1+rowSums(as.matrix(mcols(cgr)[,3:4])) ) @ For count data from sequencing, the \Rfunarg{emis.type} could be set to either \Robject{'pois'} or \Robject{'nbinom'}, though \Robject{'pois'} is preferred for sharp boundary detection. For the sojourn settings, instead of using the uninformative flat prior, we here use estimates from other data source as a prior. We load the \Rpackage{TxDb.Hsapiens.UCSC.hg19.knownGene} known gene database, and pass the \Rclass{TxDb} object to \Rfunarg{xAnno}. Then internally sojourn parameters and state number \Rfunarg{J} will be estimated from \Rfunarg{xAnno} by calling function \Rfunction{sojournAnno}. When given a \Rclass{TxDb} object to \Rfunarg{xAnno}, state number would be set to 3 and each represents 'intergenic', 'intron' and 'exon'. One can also supply a named \Rclass{list} object with initial values for parameters of distribution specified by \Rfunarg{soj.type}. For emission, since the highly dispersed nature of count data, we set the prior for emission mean to be more extreme, with \Rfunarg{q.alpha=0.01}. <>= library(TxDb.Hsapiens.UCSC.hg19.knownGene) txdb<-TxDb.Hsapiens.UCSC.hg19.knownGene rhsmm<-biomvRhsmm(x=cgr, xAnno=txdb, maxbp=1E3, soj.type='gamma', emis.type='pois', prior.m='quantile', q.alpha=0.01) @ As in the ENCODE guide \citep{ENCODEguide}, the study identified the p53 isoform observed in K562 cells has a longer 3'UTR than the isoform seen in the GM12878 cell line. So here we plot our model estimates and consider the third state, namely 'exon', to represent detected transcripts. And the HSMM model clearly picked up the extra transcripts of the K562 cell line at the 3'UTR. <>= rhsmm@res[mcols(rhsmm@res)[,'STATE']=='exon'] @ \begin{center} <>= g<-mcols(rhsmm@res)[,'STATE']=='exon' & mcols(rhsmm@res)[,'SAMPLE']=='Gm12878' obj<-biomvRGviz(exprgr=cgr[,'Gm12878'], gmgr=gmgr, seggr=rhsmm@res[g], plotstrand='-', regionID='TP53', tofile=FALSE) @ \end{center} \begin{center} <>= k<-mcols(rhsmm@res)[,'STATE']=='exon' & mcols(rhsmm@res)[,'SAMPLE']=='K562' obj<-biomvRGviz(exprgr=cgr[,'K562'], gmgr=gmgr, seggr=rhsmm@res[k], plotstrand='-', regionID='TP53', tofile=FALSE) @ \end{center} Now we can locate those novel detected fragments in K562 cell line comparing to the annotation and those detected in Gm12878 cell line. One can then follow up those findings either by gene structure prediction using local nucleotides composition or by experimental validation. <>= nK2gm<-queryHits(findOverlaps(rhsmm@res[k], gmgr)) nK2G<-queryHits(findOverlaps(rhsmm@res[k], rhsmm@res[g])) rhsmm@res[k][setdiff(seq_len(sum(k)), unique(c(nK2G, nK2gm)))] @ The other 2 batch functions could also be similarly applied here. <>= rseg<-biomvRseg(x=cgr, maxbp=1E3, maxseg=20, family='pois') rmgmr<-biomvRmgmr(x=cgr, q=0.99, maxgap=50, minrun=100) @ \section{Example of differentially methylated region (DMR) detection} As an example, we include a toy dataset extracted from \Rpackage{BiSeq} \citep{biseq}, which is a small subset of a published study \citep{variodata}, comprising intermediate differential methylation results prior to DMR detection. We first load the \Robject{variosm} data, the data contains a \Rclass{GRanges} object \Robject{variosm} with two meta columns: '\Robject{meth.diff}', methylation difference between the two sample groups; \Robject{'p.val'}, significance level from the Wald test. What we will show here latter could be applied on other pipelines as well, using similar data input. <>= data(variosm) head(variosm, n=3) @ In the \Rpackage{BiSeq} work-flow, they use an approach similar to the max-gap-min-run algorithm to define the DMR boundaries, by prior filtering and comparing the differential test statistics with a user specified significance level in the candidate regions. The positional information of methylation sites is taking into account by locating and testing highly correlated cluster regions in the filtering process. We now use the \Rfunction{biomvRhsmm} model to detect DMR, since there are mainly two types of measurement associated with differential methylation studies like we have here, one is the difference in the methylation ratio and the other one is the significance level from differential test. The methylation difference gives information about the directionality of the change as well as the size, and the significance level gives the confidence in claiming differential events. So here we utilize both information for the DMR detection. We implicitly ask the model to give 3 states, since \Rfunarg{J} is default to 3, in which case the three states may each represent hypomethylated regions, undefined null regions, and hypermethylated regions respectively when modelling \Robject{meth.diff}; While modelling significance level these states would represent highly confident regions, lowly confident regions or / and null results. For both scenarios, we are more interested in extreme states where we have consistent differences and low P-values. However the distribution of \Robject{p.val} and \Robject{meth.diff} are both highly asymmetric, we thus enable the cluster mode for emission prior initialization by setting \Rfunarg{prior.m}='cluster' . And due to the non-uniformly located CpG sites, one may split inter-spreading long segments with parameter \Rfunarg{maxgap}=100. <>= rhsmm<-biomvRhsmm(x=variosm, maxbp=100, prior.m='cluster', maxgap=100) @ <>= hiDiffgr<-rhsmm@res[mcols(rhsmm@res)[,'STATE']!=2 & mcols(rhsmm@res)[,'SAMPLE']=='meth.diff'] dirNo<-mcols(hiDiffgr)[,'STATE']=='1' & mcols(hiDiffgr)[,'AVG']>0 | mcols(hiDiffgr)[,'STATE']=='3' & mcols(hiDiffgr)[,'AVG']<0 hiDiffgr<- hiDiffgr[!dirNo] loPgr<-rhsmm@res[mcols(rhsmm@res)[,'STATE']==1 & mcols(rhsmm@res)[,'SAMPLE']=='p.val'] DMRs<-reduce(intersect(hiDiffgr, loPgr), min.gapwidth=100) idx<-findOverlaps(variosm, DMRs, type='within') mcols(DMRs)<-DataFrame(cbind(TYPE='DMR', aggregate(as.data.frame(mcols(variosm[queryHits(idx)])), by=list(DMR=subjectHits(idx)), FUN=median)[,-1])) names(DMRs)<-paste0('DMRs', seq_along(DMRs)) DMRs @ After the model fitting, by intersecting regions with extreme \Robject{meth.diff} and regions with low \Robject{p.val}, we can locate those detected DMRs, returned with their average \Robject{meth.diff} and \Robject{p.val}. Comparing to the regions detected in the \Rpackage{BiSeq} vignette, the two sets of regions are largely similar except for two regions: (chr1:872335,872386), which in our case the \Robject{meth.diff} has not been considered high enough due to the highly asymmetric distribution of ‘meth.diff’; another region (chr2:46915,46937) resides in the tail of chromosome 2 with low density of methylation sites, which has been sorted to the intermediate states due to the lack of support from both the emission level and the sojourn time. However it is worth mentioning that due to the filtering applied in their work-flow, they built wider regions out of a smaller set of more significant sites; while in our case, the regions are more refined and especially we identified two hypomethylated regions. \begin{center} <>= plot(rhsmm, gmgr=DMRs, tofile=FALSE) @ \end{center} Other than the iterative fitting of individual models for the two profiles, it is also possible to take advantage of the multivariate nature of the data. Since in this case we are interested in locating regions show high difference in the methylation level, while also achieving high significance in the statistical test. To better capture the covariance structure of possible combination of the two profiles, we use the multivariate Normal for the emission and raise the number of state to a relatively high number, for example 6. Also we enable the '\Rfunarg{com.emis}' mode to learn from the whole data rather than individual chromosome. After the model fitting, we inspect the estimated segment profile produced by the \Rfunction{plot} method. We could see that segments labelled with state '6' could be consider as the most significant DMRs, which show high \Robject{meth.diff} and with very low \Robject{p.val}, while state '5' could be considered as potential candidates for DMRs with relatively lower confidence. After the model fitting, one can also access the fitted emission and sojourn parameters. One can see from the emission parameter and also from the figure, the state 1 could be potentially linked to hypomethylated regions, where the methylation difference average is under 0 yet with relatively low \Robject{p.val}. <>= rhsmm<-biomvRhsmm(x=variosm, J=6, maxbp=100, emis.type='mvnorm', prior.m='cluster', maxgap=100, com.emis=T) @ \begin{center} <>= plot(rhsmm, tofile=FALSE) @ \end{center} <>= DMRs<-reduce(rhsmm@res[mcols(rhsmm@res)[,'STATE']=='6'], min.gapwidth=100) idx<-findOverlaps(variosm, DMRs, type='within') mcols(DMRs)<-DataFrame(cbind(TYPE='DMR', aggregate(as.data.frame(mcols(variosm[queryHits(idx)])), by=list(DMR=subjectHits(idx)), FUN=median)[,-1], stringsAsFactors=F)) names(DMRs)<-paste0('DMRs', seq_along(DMRs)) DMRs cDMRs<-reduce(rhsmm@res[mcols(rhsmm@res)[,'STATE']=='5'], min.gapwidth=100) idx<-findOverlaps(variosm, cDMRs, type='within') mcols(cDMRs)<-DataFrame(cbind(TYPE='cDMR', aggregate(as.data.frame(mcols(variosm[queryHits(idx)])), by=list(cDMRs=subjectHits(idx)), FUN=median)[,-1], stringsAsFactors=F)) names(cDMRs)<-paste0('cDMRs', seq_along(cDMRs)) cDMRs rhsmm@param$emis.par['chr1',][[1]] rhsmm@param$soj.par['chr1',][[1]] @ \begin{center} <>= plot(rhsmm, gmgr=c(DMRs, cDMRs), tofile=FALSE) @ \end{center} \section{More} To be continued ... \section{Session information} <>= sessionInfo() @ \bibliographystyle{plainnat} \bibliography{biomvRCNS} \end{document}