\name{deds.stat.linkC}
\alias{deds.stat.linkC}
\title{Differential Expression via Distance Summary of Multiple Statistics}
\description{
  \code{deds.stat.linkC} integrates different statistics of differential
  expression (DE) to rank and select a set of DE genes.
}
\usage{
deds.stat.linkC(X, L, B = 1000, tests = c("t", "fc", "sam"),
                tail = c("abs", "lower", "higher"), extras = NULL,
                distance = c("weuclid", "euclid"), adj = c("fdr", "adjp"),
                nsig = nrow(X), quick = TRUE)
}
\arguments{
  \item{X}{A matrix, with \eqn{m} rows corresponding to variables
    (hypotheses) and \eqn{n} columns corresponding to observations.
    In the case of gene expression data, rows correspond to genes and
    columns to mRNA samples. The data can be read using
    \code{\link{read.table}}.}
  \item{L}{A vector of integers corresponding to observation (column)
    class labels. For \eqn{k} classes, the labels must be integers
    between 0 and \eqn{k-1}.}
  \item{B}{The number of permutations. For a complete enumeration,
    \code{B} should be 0 (zero) or any number not less than the total
    number of permutations.}
  \item{tests}{A character vector specifying the statistics to be used
    to test the null hypothesis of no association between the variables
    and the class labels. \code{tests} can be any of the following: \cr
    \tabular{ll}{
      \code{"t"}: \tab one or two sample t-statistics; \cr
      \code{"f"}: \tab F-statistics; \cr
      \code{"fc"}: \tab fold changes among classes; \cr
      \code{"sam"}: \tab SAM-statistics; \cr
      \code{"modt"}: \tab moderated t-statistics; \cr
      \code{"modf"}: \tab moderated F-statistics; \cr
      \code{"B"}: \tab B-statistics.}
  }
  \item{tail}{A character string specifying the type of rejection region.\cr
    If \code{tail="abs"}, two-tailed tests, the null hypothesis is rejected for large absolute values of the test statistic.\cr
    If \code{tail="higher"}, one-tailed tests, the null hypothesis is rejected for large values of the test statistic.\cr
    If \code{tail="lower"}, one-tailed tests, the null hypothesis is rejected for small
    values of the test statistic.
  }
  \item{extras}{Extra parameter needed for the test specified; see
    \code{\link{deds.genExtra}}.}
  \item{distance}{A character string specifying the type of distance
    measure used for the calculation of the distance to the extreme
    point (E). \cr
    If \code{distance="weuclid"}, weighted Euclidean distance, the weight
    for statistic \eqn{t} is \eqn{\frac{1}{MAD(t)}}{1/MAD(t)}; \cr
    If \code{distance="euclid"}, Euclidean distance.
  }
  \item{adj}{A character string specifying the type of multiple testing
    adjustment. \cr
    If \code{adj="fdr"}, the False Discovery Rate is controlled and
    \eqn{q} values are returned. \cr
    If \code{adj="adjp"}, adjusted \eqn{p} values that control the
    family-wise type I error rate are returned.}
  \item{nsig}{If \code{adj = "fdr"}, \code{nsig} specifies the number of
    top differentially expressed genes whose \eqn{q} values will be
    calculated; we recommend setting \code{nsig < m}, as the computation
    of \eqn{q} values is computationally intensive. \eqn{q} values for
    the remaining genes will be approximated by 1. If
    \code{adj = "adjp"}, the adjusted \eqn{p} values are calculated for
    the whole dataset.}
  \item{quick}{A logical variable specifying whether a quick but
    memory-intensive procedure is used. If \code{quick=TRUE}, the
    permutation is carried out once and stored in memory; if
    \code{quick=FALSE}, a fixed-seed sampling procedure is employed,
    which requires more computation time because the permutation is
    carried out twice, but uses no extra memory for storage.}
}
\details{
  \code{deds.stat.linkC} summarizes multiple statistical measures for
  the evidence of DE. The DEDS methodology treats each gene as a point
  whose coordinates are that gene's vector of DE measures. An "extreme
  origin" is defined as the point whose coordinates are the maxima of
  all statistics. The distance from each gene to this extreme is
  computed, and genes are ranked for DE by their closeness to the
  extreme.
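
  As a minimal sketch of the distance summary described above, the
  following R code ranks simulated genes by weighted Euclidean distance
  to the extreme point. It is illustrative only and is not the
  package's C implementation: the matrix \code{S} of statistics is
  simulated, the names \code{S}, \code{E}, \code{w} and \code{d} are
  hypothetical, and the sketch assumes that the weight
  \eqn{\frac{1}{MAD(t)}}{1/MAD(t)} scales each coordinate difference
  before squaring.

  \preformatted{
# Illustrative sketch only; assumes two absolute-value statistics per gene.
set.seed(1)
S <- cbind(t = abs(rnorm(100)), fc = abs(rnorm(100, sd = 0.5)))

E <- apply(S, 2, max)        # extreme point: column-wise maxima
w <- 1 / apply(S, 2, mad)    # assumed weights 1/MAD(t), one per statistic

dev <- sweep(S, 2, E)                        # subtract E from each row
d <- sqrt(rowSums(sweep(dev, 2, w, "*")^2))  # weighted Euclidean distance
rank.genes <- order(d)                       # genes closest to E come first
  }

  Setting all weights to 1 in this sketch gives the unweighted
  Euclidean variant.
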
  To determine a cutoff for declaration of DE, null referent
  distributions are generated by permuting the data matrix.

  Statistical measures currently in the DEDS package include t
  statistics (\code{tests="t"}), fold changes (\code{tests="fc"}), F
  statistics (\code{tests="f"}), SAM (\code{tests="sam"}), moderated t
  (\code{tests="modt"}), moderated F statistics (\code{tests="modf"}),
  and B statistics (\code{tests="B"}). The function
  \code{deds.stat.linkC} interfaces to C functions for the tests and
  the computation of DEDS. For more flexibility, the user can also use
  \code{deds.stat}, which has the same functionality as
  \code{deds.stat.linkC} but is written completely in R (and is
  therefore slower); it also lets the user supply their own function
  for a statistic not covered in the DEDS package.

  DEDS can also summarize p values from different statistical models;
  see \code{\link{deds.pval}}.
}
\value{
  An object of class \code{\link{DEDS}}. See \code{\link{DEDS-class}}.
}
\references{
  Yang, Y.H., Xiao, Y. and Segal, M.R.: Selecting differentially
  expressed genes from microarray experiment by sets of
  statistics. \emph{Bioinformatics} 2005 21:1084-1093.
}
\author{Yuanyuan Xiao, \email{yxiao@itsa.ucsf.edu}, \cr
  Jean Yee Hwa Yang, \email{jean@biostat.ucsf.edu}.
}
\seealso{\code{\link{deds.pval}}, \code{\link{deds.stat}}.}
\examples{
X <- matrix(rnorm(1000, 0, 0.5), ncol = 10)
L <- rep(0:1, c(5, 5))

# genes 1-10 are differentially expressed
X[1:10, 6:10] <- X[1:10, 6:10] + 1
# DEDS summarizing t, fc and sam
d <- deds.stat.linkC(X, L, B = 200)
}
\keyword{htest}