%\VignetteIndexEntry{Iterators in SeqVarTools} %\VignetteDepends{SeqVarTools, GenomicRanges} \documentclass[11pt]{article} <>= BiocStyle::latex() @ \begin{document} \title{Using Iterators in SeqVarTools} \author{Stephanie M. Gogarten} \maketitle \tableofcontents \section{Introduction} Iterators can be used to apply a user function to a \Rclass{SeqVarData} object. Creating the iterator defines the sets of variants returned on every subsequent call to \Rfunction{iterateFilter}. \Rfunction{iterateFilter} returns TRUE if there are more variants remaining, and FALSE when all variants have been returned. \section{Block iterators} The simplest type of iterator, a \Rclass{SeqVarBlockIterator}, returns variants in consecutive blocks. <<>>= library(SeqVarTools) gds <- seqOpen(seqExampleFileName("gds")) seqData <- SeqVarData(gds) iterator <- SeqVarBlockIterator(seqData, variantBlock=500) var.info <- list(variantInfo(iterator)) i <- 2 while(iterateFilter(iterator)) { var.info[[i]] <- variantInfo(iterator) i <- i + 1 } lapply(var.info, head) seqResetFilter(seqData) @ A filter can be applied before the iterator is created, and only variants included in the filter will be returned by the iterator. <<>>= seqSetFilter(seqData, variant.sel=1:100) iterator <- SeqVarBlockIterator(seqData, variantBlock=500) var.info <- variantInfo(iterator) nrow(var.info) iterateFilter(iterator) seqResetFilter(seqData) @ \section{Range iterators} A \Rclass{GRanges} object can be used to create a \Rclass{SeqVarRangeIterator}, where every iteration returns the next range. <<>>= library(GenomicRanges) gr <- GRanges(seqnames=rep(1,3), ranges=IRanges(start=c(1e6, 2e6, 3e6), width=1e6)) iterator <- SeqVarRangeIterator(seqData, variantRanges=gr) var.info <- list(variantInfo(iterator)) i <- 2 while(iterateFilter(iterator)) { var.info[[i]] <- variantInfo(iterator) i <- i + 1 } lapply(var.info, head) seqResetFilter(seqData) @ \section{Window iterators} Window iterators (\Rclass{SeqVarWindowIterator}) are a special class of range iterators. When the object is created, the ranges are generated automatically with a specified width and step size, covering the entire genome. <<>>= seqSetFilterChrom(seqData, include="22") iterator <- SeqVarWindowIterator(seqData, windowSize=10000, windowShift=5000) var.info <- list(variantInfo(iterator)) i <- 2 while(iterateFilter(iterator)) { var.info[[i]] <- variantInfo(iterator) i <- i + 1 } lapply(var.info, head) seqResetFilter(seqData) @ \section{List iterators} A \Rclass{SeqVarListIterator} can be used to specify particular variants to include in each iteration. The input is a \Rclass{GRangesList}, and each list element defines an iteration set. <<>>= gr <- GRangesList( GRanges(seqnames=rep(22,2), ranges=IRanges(start=c(16e6, 17e6), width=1e6)), GRanges(seqnames=rep(22,2), ranges=IRanges(start=c(18e6, 20e6), width=1e6))) iterator <- SeqVarListIterator(seqData, variantRanges=gr) var.info <- list(variantInfo(iterator)) i <- 2 while(iterateFilter(iterator)) { var.info[[i]] <- variantInfo(iterator) i <- i + 1 } lapply(var.info, head) @ After the last iteration, any methods used on the iterator object will return 0 variants. The \Rfunction{resetIterator} method can be used to reset an iterator back to the beginning. <<>>= variantInfo(iterator) resetIterator(iterator) variantInfo(iterator) @ <<>>= seqClose(gds) @ \end{document}