% \VignetteIndexEntry{XdeParameterClass Vignette} % \VignetteKeywords{microarray, differential expression} % \VignettePackage{XDE} \documentclass{article} \usepackage{amsmath} \usepackage{graphicx} \usepackage{natbib} \usepackage{hyperref} \usepackage[mathcal]{euscript} \parindent 0in % Left justify \newcommand{\scscst}{\scriptscriptstyle} \newcommand{\scst}{\scriptstyle} \newcommand{\Rpackage}[1]{\textit{#1}} \newcommand{\Rfunction}[1]{\texttt{#1}} \newcommand{\Robject}[1]{\texttt{#1}} \newcommand{\stan}{\texttt{Stanford}} \newcommand{\xde}{\textit{XDE}} \newcommand{\eset}{\Robject{ExpressionSet}} \newcommand{\esets}{\Robject{ExpressionSet}s} \newcommand{\esetList}{\Robject{ExpressionSetList}} \newcommand{\esetLists}{\Robject{ExpressionSetList}s} \newcommand{\xparam}{\Robject{XdeParameter}} \newcommand{\xmcmc}{\Robject{XdeMcmc}} \newcommand{\R}{\textsf{R}} \newcommand{\bioc}{http://www.bioconductor.org} \newcommand{\xdeVignette}{\xde{} vignette} \textwidth=6.2in \textheight=8.5in \parskip=.3cm \oddsidemargin=.1in \evensidemargin=.1in \headheight=-.3in \title{The \xparam{} Class} \author{H\aa kon Tjelmeland and Robert B. Scharpf} \begin{document} \maketitle <>= options(width=60) @ \section{Introduction} The goal of this vignette is to briefly describe the Bayesian hierarchical model to estimate differential expression across multiple studies. This vignette necessarily assumes familiarity with hierarchical models and Bayesian methods of computation, such as MCMC. After reading this vignette, one should be able to: \begin{itemize} \item create an instance of the \xparam{} class that will provide all the necessary parameters for fitting the Bayesian model, including starting values of all the MCMC chains \item change values from their default setting to custom setting. This includes \begin{itemize} \item changing the values of any of the hyperparameters \item changing the starting values of the MCMC chains \item change the number of updates per MCMC iteration for any of the parameters \item store all, some, or none of the MCMC chains \end{itemize} \end{itemize} Apart from these pragmatic goals, the vignette provides a means to look-up the role of any parameters in our formulation of the Bayesian model. A more detailed discussion of the Bayesian model is provided elsewhere \cite{Scharpf2007d}. If you are content with the default settings provided when creating an instance of the \xparam{} class, fitting the model, assessing convergence, and generating alternative measures of differential expression are discussed in the \xdeVignette{}. \section{\label{sec:model} The Stochastic Model} \newcommand{\bDelta}{\mbox{\boldmath $\Delta$}} \newcommand{\bnu}{\mbox{\boldmath $\nu$}} \newcommand{\bx}{\mbox{\boldmath $x$}} \newcommand{\bpsi}{\mbox{\boldmath $\psi$}} Let $x_{gsp}$ denote observed expression value for gene $g\in \{ 1,\ldots,G\}$ and sample (array) $s\in \{ 1,\ldots,S_q\}$ in study $p \in \{ 1,\ldots,P\}$, where $P\geq 2$. We assume that some clinical variable (with two possible values) is available for each of the arrays in all studies. We denote this by $\psi_{sp}\in\{ 0,1\}$ for sample (array) number $s$ in study number $p$. We will assume that the studies have been somehow standardized so that the values in each study is centered around zero and are approximately Gaussian distributed. Our model defined below is based on the following assumptions, $(i)$ for some of the genes, the expression values $x_{gsp}$ are differentially expressed (have different mean values) for arrays where $\psi_{sp}=0$ and arrays where $\psi_{sp}=1$, and $(ii)$ any gene is either differentially expresses in all studies or in no studies (the ``$\delta_g$'' model). Assumption $(ii)$ can be relaxed by fitting the ``$\delta_{gp}$'' model that allows for a given gene to be differentially expressed in some studies but not in others. The following description of the Bayesian model is for the $\delta_g$" model. The $\delta_{gp}$ model is very similar, substituting $\delta_{gp}$ for $\delta_g$ and independent beta priors with parameter $\xi_p$ instead of $\xi$. We assume the following hierarchical Bayesian model for the expression data. Conditionally on underlying unobserved parameters we assume the expression values to be independently Gaussian distributed, \begin{equation}\label{x} x_{gsp}|\ldots \sim \mbox{N}(\nu_{gp} + \delta_g (2\psi_{sp} - 1)\Delta_{gp}, \sigma^2_{g\psi_{sp}p}), \end{equation} where $\delta_g\in \{ 0,1\}$ indicates whether gene $g$ is differentially expressed ($\delta_g=1$) or not ($\delta_g=0$). Thus, if $\delta_g = 0$, the mean of $x_{gsp}$ is $\nu_{gp}$, whereas if $\delta_g=1$, the mean is $\nu_{gp}-\Delta_{gp}$ and $\nu_{gp}+\Delta_{gp}$ for samples with $\psi_{sp}=0$ and $\psi_{sp}=1$, respectively. Given hyper-parameters, we assume $\bnu_g = (\nu_{g1},\ldots,\nu_{gP})^T$ and $\bDelta_g = (\Delta_{g1},\ldots,\Delta_{gP})^T$ to be multi-Gaussian distributed, \begin{equation} \bnu_g |\ldots \sim \mbox{N}(0,\Sigma) \mbox{~~~~~and~~~~~} \bDelta_g |\ldots \sim \mbox{N}(0,R), \end{equation} where $\Sigma = [\Sigma_{pq}]\in\Re^{P\times P}$ and $R = [R_{pq}]\in\Re^{P\times P}$ with \begin{equation}\label{Sigma} \Sigma_{pq} = \gamma^2 \rho_{pq} \sqrt{ \tau_p^2 \tau_q^2 \sigma_{gp}^{2a_p} \sigma_{gq}^{2a_q}} \end{equation} and \begin{equation} R_{pq} = c^2 r_{pq} \sqrt{\tau_p^2 \tau_q^2 \sigma_{gp}^{2b_p} \sigma_{gq}^{2b_q}}. \end{equation} The $a_p$ and $b_p, p=1,\ldots P$ are assumed apriori independent with discrete probabilities in $0$ and $1$ and a continuous density on $(0,1)$. More precisely, we let \begin{equation} \mbox{P}(a_p = 0) = p_a^0 \mbox{~~~,~~~} \mbox{P}(a_p = 1) = p_a^1 \mbox{~~~,~~~} a_p | a_p \in (0,1) \sim \mbox{Beta}(\alpha_a,\beta_a), \end{equation} and \begin{equation} \mbox{P}(b_p = 0) = p_b^0 \mbox{~~~,~~~} \mbox{P}(b_p = 1) = p_b^1 \mbox{~~~,~~~} b_p | b_p \in (0,1) \sim \mbox{Beta}(\alpha_b,\beta_b). \end{equation} To $c^2$ we assign a uniform prior distribution on $[0,c^2_{\mbox{\tiny Max}}]$, where $c^2_{\mbox{\tiny Max}}$ is a parameter specified by the user, and for $\gamma^2$ we use an improper uniform distributions on $(0,\infty)$. We restrict $\tau^2_p>0,p=1,\ldots,P$ and $\tau_1^2 \cdot \ldots \cdot \tau_P^2 = 1$ and assume an (improper) uniform distribution for $(\tau_1^2,\ldots,\tau_P^2)$ under this restriction. The $[\rho_{pq}] \in \Re^{P\times P}$ and $[r_{pq}] \in \Re^{P\times P}$ are restricted to be correlation matrices and we assign apriori independent Barnard et al (2000) distributions for them, with $\nu_\rho$ and $\nu_r$ degrees of freedom for $[\rho_{pq}]$ and $[r_{pq}]$, respectively. We assume the indicators $\delta_1,\ldots,\delta_G$ to be apriori independent, given a hyper-parameter $\xi$, with \begin{equation} \mbox{P}(\delta_g = 1 | \xi ) = \xi, \mbox{~~~~~where~~~~~} \xi \sim \mbox{Beta}(\alpha_\xi,\beta_\xi). \end{equation} We assume $\sigma^2_{gp}$ in (\ref{Sigma}) to be given from $\sigma^2_{g\psi p}$ in (\ref{x}) via the following relations \begin{equation} \sigma^2_{g0p} = \sigma^2_{gp} \varphi_{gp} \mbox{~~~~~and~~~~~} \sigma^2_{g1p} = \frac{\sigma^2_{gp}} {\varphi_{gp}}. \end{equation} For $\sigma^2_{gp}$ we assume independent prior distributions, given hyper-parameters, \begin{equation} \sigma^2_{gp}|\ldots \sim \mbox{Gamma}\left(\frac{l_p^2}{t_p},\frac{l_p}{t_p}\right), \end{equation} where the mean $l_p$ and variance $t_p$ for $p=1,\ldots,P$ are assigned independent (improper) uniform distributions on $(0,\infty)$. The $\varphi_{gp}$ are assigned independent gamma priors, given hyper-parameters, \begin{equation} \varphi_{gp}|\ldots \sim \mbox{Gamma}\left(\frac{\lambda_p^2}{\theta_p},\frac{\lambda_p}{\theta_p}\right), \end{equation} where the mean $\lambda_p$ and the variance $\theta_p$ for $p=1,\ldots,P$ are assigned independent (improper) uniform distributions on $(0,\infty)$. \section{\label{sec:MH} Metropolis-Hastings algorithm} To simulate from the posterior distribution resulting from the above specified Bayesian model, we use a Metropolis--Hastings algorithm. Default values for the tuning parameters in this algorithm and how these tuning parameters may be modified are discussed in Section \ref{sec:xparamClass}. The Metropolis--Hastings algorithm uses the following moves: \begin{enumerate} \item The $\nu_{pg}$'s are updated in a Gibbs step. \item The $\Delta_{pg}$'s are updated in a Gibbs step. \item \label{Proposal:a_p}Each $a_p,p=1,\ldots,P$ is updated in turn, where a potential new value for $a_p$, $\widetilde{a}_p$, is generated as follows. If $a_p = 0$, $\widetilde{a}_p \sim \mbox{Uniform}(0,\varepsilon)$, if $a_p = 1$, $\widetilde{a}_p \sim \mbox{Uniform}(1-\varepsilon,1)$, and if $a_p\in (0,1)$ we use \begin{equation} \mbox{P}(\widetilde{a}_p = 0) = \max\{ -(a_p - \varepsilon),0\} \cdot \mbox{I}(p_a^0 \neq 0), \end{equation} \begin{equation} \mbox{P}(\widetilde{a}_p = 1) = \max\{ a_p + \varepsilon - 1,0\} \cdot \mbox{I}(p_a^1 \neq 0) \end{equation} and \begin{equation} \widetilde{a}_p |\widetilde{a}_p\in (0,1) \sim \mbox{Uniform}(\max\{a_p-\varepsilon,0\}, \min\{a_p+\varepsilon,1\}). \end{equation} \item Each $b_p,p=1,\ldots,P$ is updated in turn, where the proposal distribution is of the same type as in \ref{Proposal:a_p}. \item A block Gibbs update for $c^2$ and $\bDelta_g$ for $g$'s where $\delta_g=0$. \item The $\gamma^2$ is updated in a Gibbs step. \item \label{Proposal:r}A block update for the correlation matrix $[r_{pq}]$ and the variance $c^2$. First, potential new values for $[r_{pq}]$, $[\widetilde{r}_{pq}]$, is set by \begin{equation} \widetilde{r}_{pq} = (1-\varepsilon) r_{pq} + \varepsilon T_{pq}, \end{equation} where $[T_{pq}]$ is a correlation matrix which with probability a half is generated from the prior for $[r_{pq}]$, or with probability a half set equal to unity on the diagonal and with all off diagonal elements set equal to the same value $b$. In the latter case, the value $b$ is sampled from a uniform distribution on $(-1/(P-1),1)$. Second, the potential new value for $c^2$ is sampled from the corresponding full conditional (given the potential new values $[\widetilde{r}_{pq}]$). The $\varepsilon$ is a tuning parameter. \item A block update for the correlation matrix $[\rho_{pq}]$ and the variance $\gamma^2$. The potential new values are generated similar to what is done in \ref{Proposal:r}. \item For each $g=1,\ldots,G$ in turn, a block update for $\delta_g$ and $\bDelta_g$. First, the potential new value for $\delta_g$ is set equal to $\widetilde{\delta}_g = 1 - \delta_g$. Second, the potential new value for $\bDelta_g$ is sampled from the full conditional (given the potential new value $\widetilde{\delta}_g$). No tuning parameter for this update. \item The $\xi$ is updated in a Gibbs step. \item \label{Proposal:sigma2}Each $\sigma^2_{gp}$ is updated in turn, where the potential new value is given as $\widetilde{\sigma}^2_{gp} = \sigma_{gp}^2 \cdot u$, where $u\sim \mbox{Uniform}(1/(1+\varepsilon),1+\varepsilon)$. The $\varepsilon$ is a tuning parameter. \item \label{Proposal:t}Each $t_p,p=1,\ldots,P$ is updated in turn. The potential new value is given as $\widetilde{t}_p = t_p \cdot u$, where $u\sim\mbox{Uniform}(1/(1+\varepsilon),1+\varepsilon)$. The $\varepsilon$ is a tuning parameter. \item \label{Proposal:l}Each $l_p,p=1,\ldots,P$ is updated in turn. The potential new value is given as $\widetilde{l}_p = l_p \cdot u$, where $u\sim\mbox{Uniform}(1/(1+\varepsilon),1+\varepsilon)$. The $\varepsilon$ is a tuning parameter. \item Each $\varphi_{gp}$ is updated in turn, where the proposal distribution is of the same type as in \ref{Proposal:sigma2}. \item Each $\theta_p,p=1,\ldots,P$ is updated in turn, where the proposal distribution is of the same type as in \ref{Proposal:t}. \item Each $\lambda_p,p=1,\ldots,P$ is updated in turn, where the proposal distribution is of the same type as in \ref{Proposal:l}. \item The $(\tau_1^2,\ldots,\tau_P^2)$ is updated by first uniformly at random drawing a pair $p,q\in \{1,\ldots,P\}$ so that $p\neq q$. Potential new values for $\tau_p^2$ and $\tau_q^2$ are generated by setting \begin{equation} \widetilde{\tau}_p^2 = \tau_p^2 \cdot u \mbox{~~~~and~~~~} \widetilde{\tau}_q^2 = \tau_q^2 / u, \end{equation} where $u \sim \mbox{Uniform}(1/(1+\varepsilon),1+\varepsilon)$. The $\varepsilon$ is again a tuning parameter. \end{enumerate} \section{\label{sec:xparamClass} The \xparam{} class} The Bayesian hierarchical model can be tuned and output modified in a number of ways. The \xparam{} class is an effort to organize these options and to facilitate tweaking. In our experience, it is easier to change and keep track of the parameters in the class than to provide a function for fitting the model with numerous arguments. The \xparam{} class is not a container for the gene expression data (see \esetList{} in the \xdeVignette{}). \subsection{Initializing the \xparam{} class} There are numerous options for customizing the fit of the Bayesian model as well as the output. Default values are defined in the initialization method for the \xparam{} class. Initialization requires an object of class \esetList{} and the name of the classification variable used to define differential expression. (The initialization method will produce an error if the supplied phenotypeLabel is not present in the \Robject{varLabel}s of each element of the \esetList{}). Using the example dataset discussed in the \xdeVignette{}, we have defined the covariate ``adenoVsquamous'' that takes the value 0 for adenocarcinomas and 1 for squamous carcinomas. The \Rfunction{show} method only lists the first few parameters in each slot of the \xparam{} object. <>= library(XDE) data(expressionSetList) xlist <- expressionSetList params <- new("XdeParameter", esetList=xlist, phenotypeLabel="adenoVsquamous") params @ \subsection{Starting values for MCMC chains} A complete listing of initial values for the MCMC can be obtained by <>= initialValues <- firstMcmc(params) str(initialValues) @ The initial values of the MCMC chains stored in \texttt{firstMcmc} are generated by random sampling from the prior distributions for these parameters. One may replace any one of the named elements in this list with different starting values. At this time, we do not provide checks that the replacement vectors are of the appropriate length. WARNING: providing a replacement vector of the incorrect length may cause the MCMC algorithm to crash without warning. Note that specifiedInitialValues in the \xparam{} object is always \Robject{TRUE} as the values created here are the initial values used to run the MCMC, regardless of the starting values were explicitly defined or simulated from the priors. <>= params@specifiedInitialValues @ \noindent If one were to change \Robject{specifiedInitialValues} to \Robject{FALSE}, the MCMC would begin at a different set of initial values drawn from the prior and not the values provided in the \xparam{} class. \subsection{Hyper-parameters} When initializing \xparam for three studies as we are here, the default values for the hyper-parameters are as follows: <>= hyperparameters(params) @ The names used in the software implementation that correspond to the notation used in Section \ref{sec:model} are as follows: \[ \begin{array}{rl} \mbox{hyperparameter} & \mbox{\xde{} name} \\ \hline \alpha_a & \mbox{alpha.a} \\ \beta_a & \mbox{beta.a} \\ p^0_a & \mbox{p0.a} \\ p^1_a & \mbox{p1.a} \\ \alpha_b & \mbox{alpha.b} \\ \beta_b & \mbox{beta.b} \\ p^0_b & \mbox{p0.b} \\ p^1_b & \mbox{p1.b} \\ \nu_r & \mbox{nu.r} \\ \nu_{\rho} & \mbox{nu.rho} \\ \alpha_{\xi} & \mbox{alpha.xi} \\ \beta_{\xi} & \mbox{beta.xi} \\ c^2_{\mbox{max}} & \mbox{c2max} \end{array} \] In situations in which there is no signal (all noise), the $c^2_{\mbox{max}}$ is provided as a precaution to avoid sampling from infinity. However, in simulations of complete noise, the $c^2_{\mbox{max}}$ parameter was not generally needed and remains in the present model primarily for debugging purposes. Modification of any of the above hyperparameters can be performed in the usual way: <>= hyperparameters(params)["alpha.a"] <- 1 @ \subsection{Tuning parameters for Metropolis-Hastings proposals} A default $\epsilon$ for each of the Metropolis-Hastings proposals discussed in Section \ref{sec:MH} is created when initializing the \xparam{} class. See <>= tuning(params) @ for a named vector of the default values. A replacement method has been defined to facilitate the tuning of the proposals. To illustrate, the $\epsilon$ used in the proposal for the parameter $a$ (the power conjugate parameter for $\nu$) could be adjusted by the following command: <>= tuning(params)["a"] <- tuning(params)["a"]*0.5 @ \subsection{\label{sec:updates} Frequency of updates for each MCMC iteration} One may control the frequency at which any of the parameters described in Section \ref{sec:MH} are updated at each iteration of the MCMC through the method \Robject{updates}. See <>= updates(params) @ Specifying zero updates forces a parameter to remain at its initial value. For instance, one may impose conjugacy between location and scale by initializing the power conjugate parameter, $b$, to 1 and changing the number of updates per MCMC iteration to zero: <>= firstMcmc(params)$B <- rep(1,3) updates(params)["b"] <- 0 @ Alternatively, one could force independence of location and scale by setting $b$ to zero: <>= firstMcmc(params)$B <- rep(0, 3) updates(params)["b"] <- 0 @ By default, $b$ is allowed to vary between zero and 1. Block updates are denoted by 'And'. For instance, $r$ and $c^2$ are updated as a block (as described above). Therefore, to update the $r$ parameter 3 times per MCMC iteration, one would use the command <>= updates(params)["rAndC2"] <- 3 @ \subsection{MCMC output} \xde{} provides options to save all, some, or none of the chains produced by MCMC. Options for writing MCMC chains to file are provided in the \Robject{output} slot of the \xparam class. <>= output(params) @ The first value, thin, tells the MCMC algorithm how often to write the chain to file. For instance, if thin is two every other iteration is written to file. The remaining numbers in the output vector are either zero (nothing is written to file) or one. If the output is one, the simulated value from the current iteration is added to the log file. The log files are plain text files and are by default stored in the current working directory. One may change the location of where to store the log files by providing a different path <>= directory(params) <- "logFiles" @ If the directory ``logFiles'' does not exist, the above replacement method for directory will automatically create the directory. In general, directory should provide the path relative to the current working directory or the complete path to the desired directory. An additional slot in the \xparam class that may be useful is \texttt{burnin}. By default, \texttt{burnin} is \texttt{TRUE} and no chains are written to file. One reason for this setting as a default is to easily check whether the \xde{} model will run with the default values without producing voluminous output in the MCMC chains. For instance, we often set <>= burnin(params) <- TRUE iterations(params) <- 5 @ Note the following behavior of the replacement method for burnin: <>= output(params)[2:22] <- rep(1, 21) output(params) burnin(params) <- TRUE output(params) @ Hence, when setting burnin to \texttt{TRUE} we assume that none of the iterations are to be saved. Similarly, when setting \texttt{burnin} to FALSE, we assume that one wishes to save all of the parameters: <>= ##Specify a thin of 1 and save none of the parameters output(params)[2:22] <- rep(0, 21) output(params) burnin(params) <- FALSE output(params) @ To save a subset of the parameters, we recommend setting the \texttt{burnin} argument to \texttt{FALSE} and turning off the parameters that one does not wish to save. Warning: the parameters $\Delta_{gp}, \delta_{g}, \mbox{probDelta}_g, \sigma_{gp}^2, \nu_{gp}, \phi_{gp}$ are all indexed by genes and/or study and may therefore require a large amount of memory to save. Use the following commands to avoid writing these chains to file: <>= burnin(params) <- FALSE output(params)[c("nu", "DDelta", "delta", "probDelta", "sigma2", "phi")] <- 0 output(params) @ \section{Session Information} The version number of \R{} and packages loaded for generating the vignette were: <>= toLatex(sessionInfo()) @ \bibliography{genomics} \bibliographystyle{plain} \end{document}