%\VignetteIndexEntry{Analysis of Flow Cytometry Bead Data}
%\VignetteDepends{flowBeads}
%\VignetteKeywords{}
%\VignettePackage{flowBeads}
\documentclass[11pt]{article}

\SweaveOpts{keep.source=TRUE,pdf=TRUE,eps=FALSE}


\newcommand{\scscst}{\scriptscriptstyle}
\newcommand{\scst}{\scriptstyle}
\newcommand{\Rfunction}[1]{{\texttt{#1}}}
\newcommand{\Rcode}[1]{{\texttt{#1}}}
\newcommand{\Robject}[1]{{\texttt{#1}}}
\newcommand{\Rpackage}[1]{{\textsf{#1}}}
\newcommand{\Rdata}[1]{{\textsf{#1}}}
\newcommand{\Rclass}[1]{{\textit{#1}}}


\title{flowBeads: Bead Normalisation in Flow Cytometry} 
\date{\today}


\usepackage[text={7.5in,9in},centering]{geometry}
\usepackage{Sweave}
\setkeys{Gin}{width=0.95\textwidth}
\usepackage[round]{natbib}

\usepackage{graphicx}
\usepackage{url}
\usepackage{hyperref} 
\usepackage{amsmath}


% \usepackage{setspace}
% \setlength{\parindent}{0in}


\begin{document}
\SweaveOpts{concordance=TRUE}
\maketitle


\begin{abstract}
    The \Rpackage{flowBeads} package is an extension of \Rpackage{flowCore} for bead data.
    It provides basic functionality for loading, gating and doing normalisation with bead data.
    Beads specially manufactured to known fluoresence,  defined in terms of standard units of fluorescence, are routinely run in flow cytometry
    for the purpose of instrument quality control and normalisation.
    The transformation of measured intensity (Mean Fluorescence Intensity) to standard units of fluorescence,
    Molecules of Equivalent Fluorochrome, allows for sensible comparison of data acquired on different days and on different instruments.
    The parameters of the transform also correspond to basic quality control estimates of the detector linearity and the background.
\end{abstract}


<<loadPackage, echo=false,results=hide>>=
library(flowBeads)
@


\section{Theory}

The expected fluorescent signal of a bead is determined by:
\begin{itemize}
    \item the amount of fluorochrome carried by the bead
    \item the properties of the excitation source (wavelength of laser)
    \item the properties of the detector channel (bandpass and voltage)
\end{itemize}
Given these properties, the \textbf{Molecules of Equivalent Fluorochrome (MEF)} or Molecules of Equivalent Soluble Fluorochrome (MESF)
standard unit of fluorescence is calculated by the manufacturer.
This theoretical value provides an absolute scale for measuring fluorescence 
to compare samples analysed at different times or under different laser/detector configurations \citep{Schwartz:1996,Dendrou:2009}.
The transform from relative fluorescence to standard fluorescence is a linear transform which is estimated 
by linear regression of the \textbf{Mean Fluorescence Intensity (MFI)} of beads belonging to a number (usually six) of different populations
of increasing brightness against their expected MEF fluorescence.
Since the MEF of the bead populations scales multiplicatively, a chosen transform \texttt{f} is appropriate to linearise the data.
In the case of FCS2 data, \texttt{f} is $\text{log}_{10}$, and in the case of FCS3, the default choice is the \Rfunction{logicleTransform} of the \Rpackage{flowCore}
package.
On the \texttt{f} linearised fluorescence scale the transform is therefore:

\[
\text{f}(\text{MEF}) = \beta \times \text{f}(\text{MFI}) + \alpha
\]

\[
\text{MEF} = \text{f}^{-1} ( \beta \times \text{f}(\text{MFI}) + \alpha )
\]

In the special case where the transform \texttt{f} is $\text{log}_{10}$, this can be further simplified to:

\[
\text{MEF} = 10^\alpha \times \text{MFI}^\beta 
\]


Provided the linearity of the detector is good, the $\beta$ parameter, representing the slope, is generally close to one.
When the beads are run on different days, the MFIs of the bead populations move little relative to each other but instead shift together as a whole,
thus the intercept $\alpha$, which can be interpreted as the background fluorescence, varies more than the the slope $\beta$.
%In fact, \citet{Dendrou:2009}  assumed in their MEF transform that $\beta=1$ and only estimated the $\alpha$ parameter.

In order to apply the MEF transform, the MEF of the beads for a given laser/detector setup, as supplied by the manufacturer, 
needs to be matched to the laser/detector configuration provided in the FCS bead file.
However, since not all required laser/detector properties are stored as part of the FCS 2 or 3 file format,
we rely instead on the names of the detectors channels in the FCS file matching those in the MEF configuration file.
As part of the \Rpackage{flowBeads} package,
there is support for 
\textbf{Dakocytomation FluoroSpheres} beads (see Table 1)
and
\textbf{ThermoFischer Scientific Cytocal} 
for the standard \textbf{LSRII} and \textbf{LSRFortessa} laser/detector setup,
but any other type of bead can be supported provided the MEF configuration file is specified.
\noindent
And to load the Dakocytomation configuration file into the current workspace:
<<<eval=F>>=
data(dakomef)
@

<<MEF, echo=FALSE, results=tex>>=
library(xtable)
data(dakomef)
print(xtable(dakomef,
             caption=
             "
             FluoroSpheres from Dakocytomation.
             The Molecules of Equivalent Fluorochromes (MEF) values for the six bead populations as provided by the manufacturer
             for the LSRII.
             The first bead population are blank as they contain contain no flurochrome by design.
             "
             ))
@

\noindent
To load the Cytocal configuration file into the current workspace:
<<>>=
data(cytocalmef)
@

Note that the underlying assumption in using beads as a reference is that the physical MEF property of these beads is more stable
than the detected MFI of the bead population as reported by the instrument.
For this to be true, the quality of the beads must not be compromised by age or poor storage conditions. 
Also it is important to keep in that in mind that if any properties of the laser/detector change, for example the voltage of the detector,
then the beads need to be run again to recompute the correct transform.


%Calibration can be accomplished by tweaking the parameters of the instrument so that the recorded MFI is always the same across days.
%However this method is quite coarse and may lead to other undesired effects.
%The preferred approach is instead to keep the instrumental parameters stable and instead use the beads to measure the change in the MFIs of the bead populations and apply a transformation to normalise the data.

%The instrument we used is the LSR II from BD Bioscience.
%<<<instrument>>=
%data(gbeads1)
%gbeads1@description[['$CYT']]
%@


%LSRII setup
%Lasers: Blue 488; Red 633; Violet 405
%Detectors: "FITC";"PE 575/26";"PE-Tx Red 610/20";"PerCP/Cy5.5 695/40";"PE Cy5 695/40";"PE Cy5 660/20";"PE Cy7 780/60";"APC 660/20";"AlexaFluor 700 730/45";"APC Cy7 780/60";"Pac Blue 440/50";"Pac Orange 525/50"

\section{Loading Bead Files}

Two example FCS 3 bead files, Dakocytomation beads ran on two different days, are included as part of the \Rpackage{flowBeads} package.
These files may be loaded like so:

<<loadBeads, eval=F>>=
beads1 <- BeadFlowFrame(fcs.filename=system.file('extdata', 'beads1.fcs', package='flowBeads'),
                        bead.filename=system.file('extdata', 'dakomef.csv', package='flowBeads'))
beads2 <- BeadFlowFrame(fcs.filename=system.file('extdata', 'beads2.fcs', package='flowBeads'),
                        bead.filename=system.file('extdata', 'dakomef.csv', package='flowBeads'))
@

<<fastLoadBeads, eval=T, echo=F>>=
data(beads1)
data(beads2)
@


\noindent
%\Rclass{BeadFlowFrame} is an extension of \Rclass{flowFrame}.
Here are a few ways of extracting information from \Rclass{BeadFlowFrame} objects:

<<beads1>>=
show(beads1)
@

<<beads2>>=
length(beads1)
getDate(beads1)
getParams(beads1)
getMEFparams(beads1)
@


\noindent
Once the bead files are loaded we can gate them to identify the distinct populations and compute the MEF transform.


\section{Gating Bead Data}

%Beads are manufactured to the same size so provided clumping of beads is rare it is straightforward to identifiy the main population of singlet beads by gating on forward and side scatter.
Gating of bead data is straightforward as the number of bead populations is specified in the MEF configuration file or is known a priori.
%In the forward and side scatter detector channels, a single population is expected since all beads are known to be of identical shape and size.
%Using the \Rfunction{norm2Filter} from \Rpackage{flowCore} \citep{rpackage:flowcore}, we fit a bivariate Gaussian to the data.
%Events lying more than one standard deviation away from the mean of the main bead population are excluded.% as not being singlets.
%Once we have gated on the main bead population in the scatter channels, we know from the MEF file that the beads belong to six populations of increasing brightness.
We know from the MEF file that the beads belong to six populations of increasing brightness.
All channels are gated with the number of expected clusters set to the number of bead populations reported in the bead type file, which is six in the case of Dakocytomation beads.
%The gating is done separately on each fluorescent channel using the \Rfunction{pam} function in the \Rpackage{cluster} \citep{rpackage:cluster} which is an implementation of the K-medoids algorithm:
The cluster assignment is done separately on each fluorescent channel using the \Rfunction{kmeans} function and then each event is assigned to the cluster on which most channels agree on:

<<gateBeads, eval=T>>=
gbeads1 <- gateBeads(beads1)
gbeads2 <- gateBeads(beads2)
@

\noindent
The initial centers for \Rfunction{kmeans} are chosen based on the assumption that there is a similar number of beads in each cluster.
%The \Rfunction{gateBeads} is quite fast as the \Rfunction{kmeans} function is quite computationally intensive.
\Rcode{gbeads1} and \Rcode{gbeads2} are \Rclass{GatedBeadFlowFrame} objects which contain the results of the gating.

\noindent
%The K-medoids algorithm computes a complete distance matrix which makes it quite
To visualise the results of the gating (see Figure \ref{gbeads1plot} for \Robject{gbeads1} and Figure \ref{gbeads2plot} for \Robject{gbeads2}):

<<gbeads1plot, fig=TRUE, include=FALSE, echo=TRUE, prefix=FALSE, height=8, width=8>>= 
plot(gbeads1)
@

<<gbeads2plot, fig=TRUE, include=FALSE, echo=FALSE, prefix=FALSE, height=8, width=8>>= 
plot(gbeads2)
@

\noindent
Individual channels can be plotted like so (Figure not shown):
<<beadplotAPC, eval=F, fig=TRUE, include=FALSE, echo=TRUE, prefix=FALSE, height=4.5, width=8>>= 
plot(gbeads1, 'APC')
@

\noindent
Clustering statistics are also calculated and stored in the \Robject{clustering.stats} slot as
a three way array indexed
by statistic (count, mean, standard deviation, coefficient of variation),
channel (\texttt{ALEXA.488, PEC.Y7, APC, PE, ALEXA.700} and \texttt{PACIFIC.BLUE})
and bead population (one to six).
For example, the clustering stats of bead population one (the blank beads):

<<>>=
getClusteringStats(gbeads1)[,,1]
@

<<echo=FALSE>>=
beads1.mean.fi.APC <- getClusteringStats(gbeads1)['mean.fi','APC',]
beads2.mean.fi.APC <- getClusteringStats(gbeads2)['mean.fi','APC',]
@

The \Rclass{GatedBeadFlowFrame} defines \Robject{mef.transform} slot which contains a list indexed by channel
name, where each element is a list containing the transformation function to apply as well as the coefficients of the transform.
As we do not have MEF values for all detector channels, we only define an MEF transform for ones with matching names in the bead configuration file (in this case APC).
See Figure \ref{fig:absoluteNormalisation} for absolute normalisation of the APC channel for \Rcode{gbeads1} and \Rcode{gbeads2}.

<<>>=
mef.transform <- getMEFtransform(gbeads1)
names(mef.transform)
@

\noindent
Each MEF transform defines the parameters of the affine transform (\Robject{alpha} and \Robject{beta}):
<<>>=
mef.transform$APC$alpha
mef.transform$APC$beta
@

\noindent
As well as the transform itself (\Robject{fun}):
<<>>=
mef.transform$APC$fun
@

<<plotMEFrepeatability, fig=TRUE, include=FALSE, echo=FALSE, prefix=FALSE, height=5, width=10>>= 
old.par <- par(no.readonly=T)
par(mfrow=c(1,2))
plot(getTransformFunction(beads1)(beads1.mean.fi.APC),
     getTransformFunction(beads2)(beads2.mean.fi.APC),
     xlim=c(0,5),
     ylim=c(0,5),
     xlab='APC MFI Beads Day 1',
     ylab='APC MFI Beads Day 2')
abline(b=1,a=0)
plot(getTransformFunction(beads1)(getMEFtransform(gbeads1)$APC$fun(beads1.mean.fi.APC)),
     getTransformFunction(beads2)(getMEFtransform(gbeads2)$APC$fun(beads2.mean.fi.APC)),
     xlim=c(0,5),
     ylim=c(0,5),
     xlab='APC MEF Beads Day 1',
     ylab='APC MEF Beads Day 2')
abline(b=1,a=0)
par(old.par)
@

\noindent
The \Rcode{toMEF} function takes a \Rclass{GatedBeadFlowFrame} and a \Rclass{flowFrame} and normalises the channels
for which there is an MEF transform defined:

<<toMEF, eval=FALSE>>=
toMEF(gbeads1, flow.data)
@

\section{Relative Normalisation}

The MEF provides an absolute reference but we can still normalise in the absence of MEF provided we can align the MFIs across days.
An advantage of relative normalisation is that we can also align the blank bead population as we do not require the MEF.
Let $\text{MFI}_{1}$ be the MFI obtained from the beads on day one, and $\text{MFI}_{2}$ be the MFI obtained from the beads on day two,
then the relative normalisation to compare samples from day one to day two is:

\[
\text{f}(\text{MFI}_{2}) = \beta \times \text{f}(\text{MFI}_{1}) + \alpha
\]

\[
\text{MFI}_{2} = \text{f}^{-1} ( \beta \times \text{f}(\text{MFI}_{1}) + \alpha )
\]

\noindent
To compute the transform:
<<>>=
relative.transforms <- relativeNormalise(gbeads1, gbeads2)
names(relative.transforms)
@
We can then apply the transform, see Figure \ref{fig:relativeNormalisation} for result of applying relative normalisation to \Rcode{gbeads1} and \Rcode{gbeads2}.
<<>>=
fun <- relative.transforms$APC$fun
mfi1 <- getTransformFunction(gbeads1)(getClusteringStats(gbeads1)['mean.fi','APC',])
mfi2 <- getTransformFunction(gbeads2)(getClusteringStats(gbeads2)['mean.fi','APC',])
fun.mfi1 <- fun(mfi1)
@

<<plotMFIrepeatability, fig=TRUE, include=FALSE, echo=FALSE, prefix=FALSE, height=5, width=10>>= 
old.par <- par(no.readonly=T)
par(mfrow=c(1,2))
plot(mfi1,
     mfi2,
     xlim=c(0,5),
     ylim=c(0,5),
     xlab='APC MFI Beads Day 1',
     ylab='APC MFI Beads Day 2')
abline(b=1,a=0)
plot(fun.mfi1,
     mfi2,
     xlim=c(0,5),
     ylim=c(0,5),
     xlab='APC MFI Beads Day 1 (Relative Normalisation)',
     ylab='APC MFI Beads Day 2')
abline(b=1,a=0)
par(old.par)
@

\section{Generating a Report}

Once the bead data has been gated it is possible to generate an HTML report from a template written using Markdown.
These reports can then be viewed as web pages and linked to from a summary page which shows timeline data.
This function is not strictly necessary as one may easily implement his own template.

<<generateReport, eval=FALSE>>=
generateReport(gbeads1, output.file='report.html')
@


%
\begin{figure}
  \centering
  \includegraphics{./gbeads1plot}
  \caption{
  \label{gbeads1plot}
  Plot of \Robject{gbeads1}.
  On each detector channel, six bead populations are clustered.
  The MEF transform is only computed for APC since it is the only channel for which we have MEF values.
  %The detector channel fnamed \texttt{PE.CY7} is not tuned to pick up the signal hence in the noise.
  }
\end{figure}

%
\begin{figure}
  \centering
  \includegraphics{./gbeads2plot}
  \caption{
  \label{gbeads2plot}
  Plot of \Robject{gbeads2}.
  The $\alpha$ for APC is higher than in Figure~\ref{gbeads1plot} which implies a higher background.
  The bead populations on APC are aslo noisier which implies a poorer signal-to-noise ratio on the APC channel on this day.
  %The channel for \texttt{PE.CY7} is not tuned to pick up the signal hence in the noise.
  }
\end{figure}

%
\begin{figure}
  \centering
  \includegraphics{./plotMEFrepeatability}
  \caption{
  \label{fig:absoluteNormalisation}
  The result of the MEF transform is to align the MFI of the five (non-blank) bead populations across days.
  Notice that that the alignment of the blank bead population is not perfect since it is not used in estimating the normalisation parameters ($\alpha$ and $\beta$).
  }
\end{figure}

%
\begin{figure}
  \centering
  \includegraphics{./plotMFIrepeatability}
  \caption{
  \label{fig:relativeNormalisation}
  The result of the relative MFI transform is to align the MFI of the six bead populations across both days.
  Note that after relative normalisation, the MFIs from all bead populations are perfectly aligned since they are all used in estimating the normalisation parameters.
  }
\end{figure}


\clearpage

\bibliographystyle{plainnat} 
\bibliography{beads}

\end{document}