%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Comments:
% * set opts_chunk$set(eval=FALSE) for 'knitting' 
%   without running R code
%

\documentclass{article}

%\VignetteEngine{knitr::knitr}
%\VignetteIndexEntry{nethet}

\newcommand{\X}{\mathbf{X}}
\newcommand{\argmin}{\mathop{\arg \min}\limits}
\newcommand{\argmax}{\mathop{\arg \max}\limits}


\usepackage{amsmath}
\usepackage{amsfonts}

<<style-knitr, eval=TRUE, echo=FALSE, results="asis" >>=
BiocStyle::latex()
@

\bioctitle[nethet]{A Bioconductor package for investigation of network heterogeneity from high-dimensional data
  % Heterogeneous Clustering and Network Inference with the MixGLasso
}
\author{Nicolas St\"{a}dler\footnote{staedler.n@gmail.com}, Frank Dondelinger\footnote{fdondelinger.work@gmail.com}}

\begin{document}

\maketitle


<<echo=FALSE, include=FALSE, warning=FALSE>>=
# Load package
library(nethet)

set.seed(1)
@

\section{Introduction}

Data analysis in systems biology and medicine often requires analysing 
data whose dynamics can be described as a network of observed and unobserved 
variables. A simple example is a protein signalling network in a cell. 

Simplifying the process greatly, signalling proteins known as kinases can be 
unphosphorylated (inactive) or phosphorylated (active). Cell signalling 
uses the phosphorylation machinery to pass messages from the exterior of the 
cell to the interior where they will be acted upon. This message passing is 
achieved via a relay of kinases and other proteins (the signalling pathway), 
which can be thought of as a network.

Numerous software packages exist for reconstructing networks from observational
data (e.g. \cite{ARACNE}, \cite{bnlearn}, \cite{RTN}, \cite{EDISON}). 
However, most of these packages assume that there is a single underlying network. 
Package \texttt{nethet} was designed with the intent of handling heterogeneous 
datasets arising from a collection of (possibly related) networks. 

Take for example protein measurements of breast cancer tumor cells. It
is known that there exist several subtypes of breast cancer with different 
molecular profiles \cite{tcga_breast}. We might be interested in whether 
the signalling pathways (networks) reconstructed from two subtypes are 
statistically different. If they are not, then we might want to identify new 
subtypes that present different molecular profiles, and reconstruct the 
networks for each identified subtype. The \texttt{nethet} package contains 
functionalities to tackle all of these tasks.

To the best of our knowledge, \texttt{nethet} is currently the only 
implementation of statistical solid methodology enabling the analysis of 
network heterogeneity from high-dimensional data. Package \texttt{nethet} 
combines several implementations of recent statistical innovations useful for 
estimation and comparison of networks in a heterogeneous, high-dimensional 
setting. In particular, we provide code for formal two-sample testing in 
Gaussian graphical models (differential network and GGM-GSA; 
\cite{staedler_diffnet}, \cite{ggmgsa2014}) and make a novel network-based 
clustering algorithm available (mixed graphical lasso, 
\cite{staedler_mixglasso}).

\section{Statistical setup}

We consider independent samples $X_i\in \mathbb{R}^p$
($i=1,\ldots,n$), measuring $p$ molecular variables. We assume that
the collected data can be divided into $K$ different groups. Let
$S_i\in\{1,\ldots,K\}$ be the group assignment of sample $i$, denote
with $n_k$ the group specific sample size and write $\X_k$ for the
$n_k\times p$ data matrix consisting of all samples belonging to group
$k$.

To describe networks we use Gaussian graphical models (\emph{GGMs},
\cite{rue2005}). These models use an undirected graph (or network) to
describe probabilistic relationships between variables.  For each
group $k$, we assume that $\X_k$ is sampled from a multivariate
Gaussian distribution with (unknown) mean $\mu_k$ and (unknown) $p
\times p$ concentration matrix $\Omega_k=\Sigma_k^{-1}$. The matrix
$\Omega_k$ defines the group-specific graph $G_k$ via
\begin{eqnarray*}
&(j,j') \in
E(G_k) \Leftrightarrow \Omega_{k;jj'}\neq 0,\\
&j,j' \in \{ 1, \ldots, p \}\;\textrm{and}\; j \neq j',
\end{eqnarray*}
where $E(G)$ denotes the edge set of graph $G$. 

%The different groups are characterized by group-specific mean vectors
%$\mu_k$ and networks $G_k$ ($k=1,\ldots,K$). These are unknown at the
%outset and have to be estimated from the data. 
 Learning of networks $G_k$ is a so-called high-dimensional statistical problem.
% because of the large
%number of parameters relative to the sample sizes available at the
%group level. 
We employ regularization to learn sparse, parsimonious networks and
thereby control over-fitting. In particular, we use the popular
\emph{graphical Lasso} \cite{friedman2007sic,huge2012}. Frequently 
the group assignments $S_i$, as well as the number of groups
$K$, are unknown at the outset and have to be inferred simultaneously
with the group-specific mean vectors and networks. The method
\emph{mixglasso}, implemented in this package, is a novel tool for
high-dimensional, network-based clustering. It is based on a finite 
mixture of GGMs and employs an adaptive and automatic penalization scheme 
\cite{staedler_mixglasso}.


Network inference is subject to statistical uncertainty and observed
differences between estimated networks may be due to noise in the data
and variability in estimation rather than any true difference in
underlying network topology. Testing hypotheses of the form
\begin{eqnarray*}
  \label{eq:hypothesis}
  \mathbf{H_0}: G_k=G_{k'}, \quad k,k'\in \{1,\ldots,K\},\; k\neq k'
\end{eqnarray*}
is challenging. %and involves non-nested model comparison
%\citep{vuong1989}. 
We build upon a recent approach called \emph{differential network}
\cite{staedler_diffnet,ggmgsa2014} which allows formal two-sample
testing in high-dimensional GGMs.

\section{Package functionalities}

The package consists of the following main parts: 

\begin{itemize}
 \item Simulation functions for creating synthetic data from the underlying 
       Gaussian mixture (network) model.
 \item Network inference using the \texttt{het\_cv\_glasso} function for 
       reconstructing heterogeneous networks from data with the graphical 
       Lasso \cite{glasso} when the group structure is known.
     \item High-dimensional hypothesis testing capabilities, including
       the \texttt{diffnet} functions implementing a statistical test
       for whether the networks underlying a pair of dataset are
       different, the \texttt{ggmgsa} functions allowing for
       differential gene set testing and the \texttt{diffregr}
       functions testing whether two high-dimensional regression
       models are statistically different \cite{staedler_diffnet,ggmgsa2014}.
 \item The \texttt{mixglasso} functions implementing a 
       network-based clustering and reconstruction algorithm also based on the 
       graphical Lasso, for unknown group structure \cite{staedler_mixglasso}.
 \item Plotting and export functions for displaying and saving the results of
       the analysis in a sensible way.
\end{itemize}

\section{Simulate data}
\label{sec:simulation}

In order to demonstrate the functionalities of the package, we will
first simulate data from a Gaussian mixture model with a known
covariance structure. The \textit{nethet} package includes code for
generating random covariance matrices with a given sparsity, and for
simulating from a Gaussian mixture model with given means and
covariances. The function \texttt{sim\_mix\_networks} provides a
convenient wrapper for both:

<<echo=TRUE, include=TRUE>>=
# Specify number of simulated samples and dimensionality
n = 100
p = 25

# Specify number of components of the mixture model and mixture probabilities
n.comp = 4

mix.prob = c(0.1, 0.4, 0.3, 0.2)

# Specify sparsity in [0,1], indicating fraction of off-diagonal zero entries.
s = 0.9


# Generate networks with random means and covariances. Means will be drawn from 
# a standard Gaussian distribution, non-zero covariance values from a 
# Beta(1,1) distribution.
sim.result = sim_mix_networks(n, p, n.comp, s, mix.prob)
@

The data is contained in \texttt{sim.result\$data}, and the components
that each data point belongs to are contained in
\texttt{sim.result\$comp}. Let's check that the mixture probabilities
are correct and then plot the first two dimensions of the data. Note
that we do not expect these to be well-separated in any way.


<<echo=TRUE, include=TRUE, fig.width=5, fig.height=3, fig.align='center'>>=
print(table(sim.result$comp)/n)

component = as.factor(sim.result$comp)

library('ggplot2')
qplot(x=sim.result$data[,1], y=sim.result$data[,2], 
      colour=component) + 
  xlab('Dimension 1') +
  ylab('Dimension 2')
  
@

The means and covariances of the data are contained in
\texttt{sim.result\$Mu} and \texttt{sim.result\$Sig}. If desired, they
can also be specified when calling \texttt{sim\_mix\_networks}.

<<echo=TRUE, include=TRUE, fig.width=5, fig.height=3, fig.align='center'>>=
# Generate new dataset with the same covariances, but different means
sim.result.new = sim_mix_networks(n, p, n.comp, s, mix.prob, Sig=sim.result$Sig)

component = as.factor(sim.result.new$comp)

qplot(x=sim.result.new$data[,1], y=sim.result.new$data[,2], 
      colour=component) + 
  xlab('Dimension 1') +
  ylab('Dimension 2')
@

When the covariance matrices for the components are not specified in
advance, the \texttt{sim\_mix\_networks} function implicitly assumes
that they are generated independently of each other. In order to test
the \texttt{diffnet} functions, we also want to be able to generate
simulated data from pairs of networks that present some common
edges. The \texttt{generate\_2networks} function is used to generate
pairs of networks with an arbitrary overlap.

<<echo=TRUE, include=TRUE, fig.width=7, fig.height=3, fig.align='center'>>=
## Sample size and number of nodes
n <- 40
p <- 10

## Specify sparse inverse covariance matrices,
## with number of edges in common equal to ~ 0.8*p
gen.net <- generate_2networks(p,graph='random',n.nz=rep(p,2),
                              n.nz.common=ceiling(p*0.8))

invcov1 <- gen.net[[1]]
invcov2 <- gen.net[[2]]

plot_2networks(invcov1,invcov2,label.pos=0,label.cex=0.7)
@

\section{Network estimation with known group labels}

If it is known a priori to which component each sample belongs, then
the problem of reconstructing the network reduces to a simple
application of the graphical Lasso to each component. For convenience,
we have included a wrapper function \texttt{het\_cv\_glasso} in
\texttt{nethet} that applies the graphical Lasso \cite{glasso} to each 
component in a heterogeneous dataset with specified component labels. The 
penalisation hyperparameter is tuned individually for each component using
cross-validation.

To demonstrate \texttt{het\_cv\_glasso}, we will generate some data in
the same way as in the previous section:

<<echo=TRUE, include=TRUE>>=
set.seed(10)

n = 100
p = 25

# Generate networks with random means and covariances. 
sim.result = sim_mix_networks(n, p, n.comp, s, mix.prob)

test.data = sim.result$data
test.labels = sim.result$comp

# Reconstruct networks for each component
networks = het_cv_glasso(data=test.data, grouping=test.labels)
@

One way of checking if the reconstructed networks are sensible is
plotting the covariance matrices used for generating the networks
against the reconstructed covariance matrices.

<<echo=TRUE, include=TRUE, fig.width=5, fig.height=3, fig.align='center'>>=
# Component labels for covariance values
components = as.factor(rep(1:n.comp, each=p^2))

qplot(x=c(networks$Sig), y=c(sim.result$Sig),
			colour=components) + 
  xlab('Reconstructed Covariances') +
  ylab('True Covariances')
  
@

\section{High-dimensional two-sample testing}\label{sec:diffnet}

We have demonstrated how to use our package to estimate networks from
heterogeneous data. Often, we would like to perform a statistical
comparison between networks. \emph{Differential network} allows formal
hypothesis testing regarding network differences. It is based on a
novel and very general methodology for high-dimensional two-sample
testing. Other useful tools based on this technology are
\emph{GGM-GSA} (``multivariate gene-set testing based on GGMs'') and
\emph{differential regression} which allows formal two-sample testing
in the high-dimensional regression model. For details on this
methodology we refer the reader to \cite{staedler_diffnet,ggmgsa2014}.

\subsection{Differential network}
Let us consider datasets generated from GGMs $G_1$ and $G_2$
respectively. We would like to know whether networks inferred from
these datasets differ in a statistical significant manner, that is we
would like to test the hypothesis
\begin{eqnarray*}
  \mathbf{H}_0:\quad G_1=G_2.
\end{eqnarray*}
The function \texttt{diffnet\_multisplit} uses repeated sample
splitting to address this task. The main steps are:
\begin{enumerate}
\item Both datasets are randomly split into two halves: the ``\emph{in}-'' and ``\emph{out}-sample''.
\item Networks are inferred using only the \emph{in}-sample (``screening step'').
\item Based on the \emph{out}-sample, a p-value is computed which
  compares the networks obtained in step 2 (``cleaning step'').
\item Steps 1-3 are repeated many times (e.g. 50 times); the resulting
  p-values are aggregated and the final aggregated p-value is reported.
\end{enumerate}

We now illustrate the use of \texttt{diffnet\_multisplit} with an
example. We consider GGMs (i.e. inverse covariance matrices)
previously generated in Section \ref{sec:simulation}.

<<eval=TRUE, echo=TRUE, include=TRUE, warning=TRUE, results="asis">>=
## Set seed
set.seed(1)
 
## Sample size and number of nodes
p <- 30
 
## Specify sparse inverse covariance matrices
gen.net <- generate_2networks(p,graph='random',n.nz=rep(p,2),                               
                             n.nz.common=ceiling(p*0.8))
invcov1 <- gen.net[[1]]
invcov2 <- gen.net[[2]] 

## Get corresponding correlation matrices
cor1 <- cov2cor(solve(invcov1))
cor2 <- cov2cor(solve(invcov2))
@ 

We start with generating data under the ``null-scenario'' where both datasets have
the same underlying network.

<<echo=TRUE, include=TRUE, warning=TRUE, results="asis">>=
## Generate data under null hypothesis
library(mvtnorm) # To generate multivariate Gaussian random samples

## Sample size
n <- 70

x1 <- rmvnorm(n,mean = rep(0,dim(cor1)[1]), sigma = cor1)
x2 <- rmvnorm(n,mean = rep(0,dim(cor1)[1]), sigma = cor1)
@ 

Then, we run a differential network analysis:
<<echo=TRUE, include=TRUE, warning=TRUE, results="asis">>=
## Run diffnet (under null hypothesis)
dn.null <- diffnet_multisplit(x1,x2,b.splits=1,verbose=FALSE)
@ 

We obtain the p-value \Sexpr{dn.null$ms.pval}, which is stored in 
\texttt{dn.null\$ms.pval}.

The same analysis can be performed for data generated under the
alternative hypothesis. 
<<echo=TRUE, include=TRUE, warning=TRUE, results="asis">>=

## Generate data under alternative hypothesis (datasets have different networks)
x1 <- rmvnorm(n,mean = rep(0,dim(cor1)[1]), sigma = cor1)
x2 <- rmvnorm(n,mean = rep(0,dim(cor1)[2]), sigma = cor2)

## Run diffnet (under alternative)
dn.altn <- diffnet_multisplit(x1,x2,b.splits=1,verbose=FALSE)
@ 

The resulting p-value is \Sexpr{dn.altn$ms.pval} which indicates a highly 
significant network difference.

The variable \texttt{b.splits} specifies the number of data splits
used in the differential network procedure. The p-values in the
previous examples were obtained using only a single data split
(\texttt{b.splits=1}). P-values heavily depend on the random split of
the data. This amounts to a "p-value lottery". To get stable and
reproducible results we therefore would typically choose a larger
number for the variable \texttt{b.split} and report the aggregated
p-value.
<<eval=TRUE, echo=TRUE, include=TRUE, warning=TRUE, fig.width=3.5, fig.height=3.5, fig.align='center'>>=
## Typically we would choose a larger number of splits
# Use parallel library (only available under Unix) for computational efficiency
if(.Platform$OS.type == "unix") {
	dn.altn <- diffnet_multisplit(x1,x2,b.splits=50,verbose=FALSE,mc.flag=TRUE)
} else {
  dn.altn <- diffnet_multisplit(x1,x2,b.splits=25,verbose=FALSE,mc.flag=FALSE)
}

par(cex=0.7)
plot(dn.altn, cex=0.5) # histogram over 50 p-values
cat('p-value:',dn.altn$medagg.pval,'\n') # median aggregated p-value
@


\subsection{Multivariate gene-set testing based on GGMs}
In the case where molecular variables can be grouped into various sets
of biologically related features (e.g. gene-sets or pathways),
\texttt{ggmgsa\_multisplit} can be used to perform differential
network analyses iteratively for all gene-sets. This allows us to
identify gene-sets which show a significant network difference. For
illustration we consider data generated from the following networks.

<<echo=TRUE, include=TRUE, warning=TRUE, results="asis", fig.width=5, fig.height=2.5, fig.align='center'>>=
## Generate new networks
set.seed(1)
p <- 9 # network with p nodes
n <- 40
hub.net <- generate_2networks(p,graph='hub',n.hub=3,n.hub.diff=1)#generate hub networks
invcov1 <- hub.net[[1]]
invcov2 <- hub.net[[2]]
plot_2networks(invcov1,invcov2,label.pos=0,label.cex=0.7,
               main=c('network 1', 'network 2'),cex.main=0.7)

## Generate data
library('mvtnorm')
x1 <- rmvnorm(n,mean = rep(0,p), sigma = cov2cor(solve(invcov1)))
x2 <- rmvnorm(n,mean = rep(0,p), sigma = cov2cor(solve(invcov2)))
@ 

%## Run DiffNet
%# fit.dn <- diffnet_multisplit(x1,x2,b.splits=2,verbose=FALSE)
%# fit.dn$medagg.pval

The nodes can be grouped into three gene-sets where only the first has
a different underlying network.
<<eval=TRUE, echo=TRUE, include=TRUE, warning=TRUE, results="asis">>=
## Identify groups with 'gene-sets'
gene.names <- paste('G',1:p,sep='')
gsets <- split(gene.names,rep(1:3,each=3))
@ 

We run GGM-GSA with a single data split (\texttt{b.splits=1}) and note
that only the p-value for the first gene-set has small magnitude. Again, we
would typically use a larger number of data splits in order to obtain
stable p-values.
<<eval=TRUE, echo=TRUE, include=TRUE, warning=TRUE, results="asis">>=
## Run GGM-GSA
fit.ggmgsa <- ggmgsa_multisplit(x1,x2,b.splits=1,gsets,gene.names,verbose=FALSE)

library(xtable)
print(xtable(summary(fit.ggmgsa),digits=6))
@

\subsection{Differential regression}

In addition to differential network, this \textbf{R}-package also
provides an implementation of differential regression. In particular,
the function \texttt{diffregr\_multisplit} allows formal two-sample
testing in the high-dimensional regression model. It is also based on
sample splitting and is very similar to the previously introduced
\texttt{diffnet\_multisplit}.

Consider the following sparse regression models.
<<echo=TRUE, include=TRUE, warning=TRUE, results="asis">>=
## Number of predictors and sample size
p <- 100
n <- 80

## Predictor matrices
x1 <- matrix(rnorm(n*p),n,p)
x2 <- matrix(rnorm(n*p),n,p)

## Active-sets and regression coefficients
act1 <- sample(1:p,5)
act2 <- c(act1[1:3],sample(setdiff(1:p,act1),2))
beta1 <- beta2 <- rep(0,p)
beta1[act1] <- 0.7
beta2[act2] <- 0.7
@ 

We generate data under the null-hypothesis and run differential regression. The histogram shows the distribution of the p-values obtained form ten data splits.
<<echo=TRUE, include=TRUE, warning=TRUE, results="asis", fig.align='center', fig.width=3.5,fig.height=3.5>>=
## Response vectors under null-hypothesis
y1 <- x1%*%as.matrix(beta1)+rnorm(n,sd=1)
y2 <- x2%*%as.matrix(beta1)+rnorm(n,sd=1)

## Differential regression; b.splits=10
fit.null <- diffregr_multisplit(y1,y2,x1,x2,b.splits=10)
par(cex=0.7)
plot(fit.null,cex=0.5) # histogram of p-values from b.split data splits
cat('p-value: ',fit.null$medagg.pval,'\n') # median aggregated p-value
@ 

The following example illustrates differential regression in scenario with different regression models. 
<<echo=TRUE, include=TRUE, warning=TRUE, results="asis",fig.align='center',fig.width=3.5,fig.height=3.5>>=
## Response vectors under alternative-hypothesis
y1 <- x1%*%as.matrix(beta1)+rnorm(n,sd=1)
y2 <- x2%*%as.matrix(beta2)+rnorm(n,sd=1)

## Differential regression (asymptotic p-values)
fit.alt <- diffregr_multisplit(y1,y2,x1,x2,b.splits=10)
par(cex=0.7)
plot(fit.alt)
cat('p-value: ',fit.alt$medagg.pval,'\n')
@ 

For differential regression we have the option to compute
permutation-based p-values by choosing a number of permutations
\texttt{n.perm}.
<<echo=TRUE, include=TRUE, warning=TRUE, results="asis",fig.align='center',fig.width=3.5,fig.height=3.5>>=
## Differential regression (permutation-based p-values; 100 permutations)
fit.alt.perm <- diffregr_multisplit(y1,y2,x1,x2,b.splits=5,n.perm=100)
@ 
The default option (\texttt{n.perm=NULL}) uses
an asymptotic approximation to calculate p-values.


\section{Network estimation and model-based clustering with unknown group labels}
% Network reconstruction with unknown components using \texttt{mixglasso}


Often we do not know a priori which component each sample belongs to. For example 
in the case of samples corresponding to protein measurements in breast cancer 
patients, the particular subtype of breast cancer that a patient suffers from
may be unknown. In these cases, our package allows for network-based clustering
of the samples using the mixture graphical Lasso (mixglasso), which jointly 
clusters the samples and reconstructs the networks for each group or cluster.

To demonstrate the \texttt{mixglasso} function, let us first generate some data 
in the same way as before, but with means defined to ensure separability 
of the groups:

<<echo=TRUE, include=TRUE>>=
# Generate networks with random means and covariances. 
n = 1000
p = 10
s = 0.9
n.comp = 3

# Create different mean vectors
Mu = matrix(0,p,n.comp)

# Define non-zero means in each group (non-overlapping)
nonzero.mean = split(sample(1:p),rep(1:n.comp,length=p))

# Set non-zero means to fixed value
for(k in 1:n.comp){
  Mu[nonzero.mean[[k]],k] = -2/sqrt(ceiling(p/n.comp))
}

# Generate data
sim.result = sim_mix_networks(n, p, n.comp, s, Mu=Mu)
@

Now we will run mixglasso on this dataset to retrieve the original clustering
and reconstruct the underlying networks.

<<echo=TRUE, include=TRUE>>=
# Run mixglasso
mixglasso.result = mixglasso(sim.result$data, n.comp=3)

# Calculate adjusted rand index to judge how accurate the clustering is
# Values > 0.7 indicate good agreement.
library(mclust, quietly=TRUE)
adj.rand = adjustedRandIndex(mixglasso.result$comp, sim.result$comp)
cat('Adjusted Rand Index', round(adj.rand, digits=2), '\n')
@

Table \ref{table:crosstab} shows the cross-tabulation of the number of samples
in predicted versus true groups.

<<echo=FALSE, include=TRUE, cache=FALSE, results='asis'>>=
crosstab <- function(class1,class2){
  tab <- matrix(NA,length(levels(class1)),length(levels(class2)))
  colnames(tab) <- levels(class2)
  rownames(tab) <- levels(class1)
  for (i in levels(class1)){
    tab[i,] <- table(class2[which(class1==i)])
  }
  return(tab)
}

# Relabel true groupings to avoid confusion
sim.grouping = c('A', 'B', 'C')[sim.result$comp]

cross.table = crosstab(factor(mixglasso.result$comp), factor(sim.grouping))

# Generate table
library(xtable)
latex.table = xtable(cross.table, caption=
										 	paste('Cross-tabulation of mixglasso clusters (rows) with',
                    'true group assignments (columns).'),
										 label='table:crosstab')

print(latex.table)
@


What if we don't know the true number groups? Luckily, \texttt{mixglasso} supports
model comparison using BIC \cite{BIC} and minimum description length \cite{MMDL}. 
In the following example we will use BIC to find the correct number of components:

<<echo=TRUE, include=TRUE, fig.width=5, fig.height=3, fig.align='center'>>=

# Run mixglasso over a range of numbers of components
mixglasso.result = mixglasso(sim.result$data, n.comp=1:6)

# Repeat with lambda=0 and lambda=Inf for comparison
mixglasso.result.0 = mixglasso(sim.result$data, n.comp=1:6, lambda=0)
mixglasso.result.Inf = mixglasso(sim.result$data, n.comp=1:6, lambda=Inf)

# Aggregate BIC results for plotting
BIC.vals = c(mixglasso.result$bic, mixglasso.result.0$bic,
						 mixglasso.result.Inf$bic)

lambda.labels = rep(c('Default', 'Lambda = 0', 'Lambda = Inf'), each=6)

# Plot to verify that minimum BIC value corresponds with true 
library(ggplot2)
plotting.frame <- data.frame(BIC=BIC.vals, Num.Comps=rep(1:6, 3), Lambda=lambda.labels)

p <- ggplot(plotting.frame) + 
	geom_line(aes(x=Num.Comps, y=BIC, colour=Lambda)) + 
	geom_vline(xintercept=3, linetype='dotted')

print(p)
@

We note that mixglasso involves a penalization parameter $\lambda$
which trades off goodness-of-fit and model complexity. We recommend to
use the default which employs an adaptive and automatic penalization
scheme \cite{staedler_mixglasso}. Note that in this simplified
example, $\lambda=0$ (no penalization) performs well because %the true
%inverse covariance matrices are not particularly sparse and 
$n >> p$. $\lambda=\infty$ constrains inverse covariance matrices to
be diagonal, hence the inferior performance.

\section{Plotting and exporting results}

Our package includes several functions for plotting and exporting the networks
and results that have been obtained. 

\subsection{Plotting results}

The output of \texttt{het\_cv\_glmnet} and \texttt{mixglasso} can be plotted either
in network form or as individual edges in the networks. For the network plots,
we use the \texttt{network} package \cite{network_package}. This is the default 
plotting when \texttt{plot} is invoked on an object of class 
\texttt{nethetclustering}, and produces one global plot showing edges that 
occur in any group, as well as one plot for each group. For this example we 
will use the networks and clustering obtained using \texttt{mixglasso} in the 
previous section.

<<echo=TRUE, include=TRUE, fig.width=3, fig.height=3, fig.align='center'>>=

# Retrieve best clustering and networks by BIC
mixglasso.clustering = mixglasso.result$models[[mixglasso.result$bic.opt]]

# Plot networks, omitting edges with absolute partial correlation < 0.5 in 
# every group.
# NOTE: Not displayed.
# plot(mixglasso.clustering, p.corrs.thresh=0.5)
@

Usually we are only interested in specific edges, and perhaps we wish to compare
them among groups. Function \texttt{dot\_plot} generates a plot with edges above
a certain threshold along the y-axis, and one circle for each group showing the
smallest mean of the two nodes that make up the edge. We use the \texttt{ggplot2}
package to make the plots \cite{ggplot2_package}.

<<echo=TRUE, include=TRUE, fig.width=5, fig.height=3, fig.align='center'>>=

# Plot edges, omitting those with absolute partial correlation < 0.5 in every 
# group.
g = dot_plot(mixglasso.clustering, p.corrs.thresh=0.5, dot.size.range=c(1,5))
@

Finally, we might want to compare the observed values of the nodes linked by
specific edges across groups. Function \texttt{scatter\_plot} will generate plots for
a specified list of edges.

<<echo=TRUE, include=TRUE, fig.width=7, fig.height=7, fig.align='center'>>=

# Specify edges
node.pairs = rbind(c(9,10), c(2,5),c(4,9))

# Create scatter plots of specified edges
g = scatter_plot(mixglasso.clustering, data=sim.result$data,
						node.pairs=node.pairs, cex=0.5)
@

\subsection{Exporting Results}

Our package offers the option to export the inferred networks as a comma-separated
values (CSV) text file. Like the plotting functions, function 
\texttt{export\_network} can be invoked on the output of \texttt{het\_cv\_glmnet} 
and \texttt{mixglasso}.

<<echo=TRUE, include=TRUE, eval=FALSE>>=

# Save network in CSV format, omitting edges with absolute partial correlation
# less than 0.25.
#export_network(mixglasso.clustering, file='nethet_network.csv',
#							 p.corrs.thresh=0.25)
@

This creates a CSV file encoding a table with one row for each edge with partial
correlation above the threshold, and columns indicating the nodes linked by the
edge, the absolute partial correlation, the sign of the partial correlation, and
the group or cluster in which the edge occurred.

If the user wishes to use the Cytoscape \cite{cytoscape} software to analyse 
the network further, we note that the output of \texttt{export\_network} can be
loaded into Cytoscape, provided the option \texttt{quote=FALSE} is set.

<<echo=TRUE, include=TRUE, eval=FALSE>>=

# Save network in CSV format suitable for Cytoscape import
#export_network(mixglasso.clustering, file='nethet_network.csv',
#							 p.corrs.thresh=0.25, quote=FALSE)
@

<<echo=TRUE, include=TRUE, eval=TRUE>>=
sessionInfo()
@

\bibliography{nethet}
\end{document}