\documentclass[a4paper, 12pt]{article}

\author{\href{mailto:m.robinson@garvan.org.au}{Mark Robinson} \href{mailto:a.statham@garvan.org.au}{Aaron Statham} \href{mailto:d.strbenac@garvan.org.au}{Dario Strbenac}}

\usepackage[pdftex]{hyperref}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{amscd}
%\usepackage{attachfile}
\usepackage{graphicx}
\usepackage[tableposition=top]{caption}
\usepackage{ifthen}
\usepackage[utf8]{inputenc}

\topmargin -.5in
\headheight 0in
\headsep 0in
\oddsidemargin -.5in
\evensidemargin -.5in
\textwidth 176mm
\textheight 245mm

\usepackage{color}
\usepackage{Sweave}

\begin{document}
\SweaveOpts{engine=R}
%\VignetteIndexEntry{Using Repitools for Epigenomic Sequencing Data}

\title{Integrative Analysis of Epigenomic sequencing (and microarray) data with \texttt{Repitools}}
\date{}
\maketitle

\begin{center}
Last compiled on: \today
\end{center}

\section{Introduction}
\texttt{Repitools} is a package that allows exploratory as well as targeted statistical analysis of absolute and differential binding for ChIP-seq and MeDIP-seq data types, and gives visual summaries in a variety of formats. Some basic quality checking utilities are available for sequencing data. Much of the functionality available is implemented for both tiling microarrays and sequencing data, with very similar function calls for both types of data. \\
In this vignette, we highlight various features within the package. Further description of the package can be found in the associated Bioinformatics Applications Note\footnote{\href{http://bioinformatics.oxfordjournals.org/content/26/13/1662.abstract}{Repitools: an R package for the analysis of enrichment-based epigenomic data}} as well as in the help documents.
\\
To start with, set a random seed and load the \texttt{Repitools} package:

<<>>=
options(prompt = " ", continue = " ")
set.seed(4)
library(Repitools)
@

\section{Example Datasets}
\input{datasets}

\section{Quality Checking}
\input{qc}

\section{Analyses}
\input{analyses1}

\subsection{Domains of Concordant Change}
Another analysis of interest is the detection of {\em regions} where changes in expression (or an epigenetic mark, etc.) occur on a particular chromosome. The function \texttt{findClusters} addresses this need. Clusters are found by searching through the column of scores (e.g. t-statistics) for a persistent change. Significance of clusters is determined by randomization. The order of the statistics is permuted a large number of times, and the number of clusters found in the true statistics column and in the permuted statistics columns is counted at a range of cutoffs, from loose to tight. A cutoff is then chosen to control the FDR at the user-specified level. Importantly, the table must be pre-sorted in positional order. This allows the user to use whatever definition of position they want. Note that the distance between features is not taken into account in this implementation.

<