---
title: "miaSim: Microbiome Data Simulation"
author: "miaSim authors"
date: "`r Sys.Date()`"
package: miaSim
output: 
    BiocStyle::html_document:
        fig_height: 7
        fig_width: 10
        toc: yes
        toc_depth: 2
        number_sections: true
vignette: >
    %\VignetteIndexEntry{miaSim}
    %\VignetteEngine{knitr::rmarkdown}
    %\VignetteEncoding{UTF-8}
    \usepackage[utf8]{inputenc}
---

```{r, echo=FALSE}
knitr::opts_chunk$set(cache = FALSE,
                        fig.width = 9,
                        message = FALSE,
                        warning = FALSE)
```

# Introduction

`miaSim` implements tools for microbiome data simulation based on
varying ecological modeling assumptions. These can be used to simulate
species abundance matrices, including time series. Detailed function
documentation is available at the [function reference](https://microbiome.github.io/miaSim/reference/index.html)

The miaSim package supports the R/Bioconductor framework based on the
TreeSummarizedExperiment data container. For more information on
operating with this data format in microbial ecology, see the [online tutorial](https://microbiome.github.io/OMA).


## Installation

The stable Bioconductor release version can be installed as follows.

```{r install-bioc, eval=FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
if (!requireNamespace("miaSim", quietly = TRUE))    
    BiocManager::install("miaSim")
```

The experimental Bioconductor devel version can be installed as follows.

```
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
# The following initializes usage of Bioc devel
BiocManager::install(version='devel')
BiocManager::install("miaSim")
```


Load the library

```{r load, eval=TRUE}
library(miaSim)
```


## Examples



### Generating species interaction matrices

Some of the models rely on interaction matrices that represents interaction
heterogeneity between species. The interaction matrix can be generated with
different distributional assumptions.

Generate interactions from normal distribution:

```{r Anormal}
A_normal <- powerlawA(n_species = 4, alpha = 3)
```

Generate interactions from uniform distribution:

```{r Auniform}
A_uniform <- randomA(n_species = 10,
	     	     diagonal = -0.4,
                     connectance = 0.5,
		     interactions = runif(n = 10^2, min = -0.8, max = 0.8))
```


### Ricker model

Ricker model is a discrete version of the gLV:

```{r ricker}
tse_ricker <- simulateRicker(n_species=4, A = A_normal, t_end=100, norm = FALSE)
```

The number of species specified in the interaction matrix must be the
same as the species used in the models.

### Hubbell model

Hubbell Neutral simulation model characterizes diversity and relative
abundance of species in ecological communities assuming migration,
births and deaths but no interactions. Losses become replaced by
migration or birth.

```{r}
tse_hubbell <- simulateHubbell(n_species = 8,
                               M = 10,
			       carrying_capacity = 1000,
                               k_events = 50,
			       migration_p = 0.02,
			       t_end = 100)
```

One can also simulate parameters for the Hubbell model.

```{r}
params_hubbell <- simulateHubbellRates(x0 = c(0,5,10),
    migration_p = 0.1, metacommunity_probability = NULL, k_events = 1, 
    growth_rates = NULL, norm = FALSE, t_end=1000)
```

### Self-Organised Instability (SOI)

The Self-Organised Instability (SOI) model generates time series for
communities and accelerates stochastic simulation.

```{r}
tse_soi <- simulateSOI(n_species = 4, carrying_capacity = 1000,
                       A = A_normal, k_events=5,
		       x0 = NULL,t_end = 150, norm = TRUE)
```

### Stochastic logistic model

Stochastic logistic model is used to determine dead and alive counts
in community.

```{r}
tse_logistic <- simulateStochasticLogistic(n_species = 5)
```

### Consumer-resource model

The consumer resource model requires the `randomE` function. This
returns a matrix containing the production rates and consumption rates
of each species. The resulting matrix is used as a determination of
resource consumption efficiency.

```{r cr}
# Consumer-resource model as a TreeSE object
tse_crm <- simulateConsumerResource(n_species = 2,
                                    n_resources = 4,
                                    E = randomE(n_species = 2, n_resources = 4))
```

You could visualize the simulated dynamics using tools from the [miaTime](https://microbiome.github.io/miaTime/) package.


### Generalized Lotka-Volterra (gLV)

The generalized Lotka-Volterra simulation model generates time-series assuming
microbial population dynamics and interaction.

```{r glv}
tse_glv <- simulateGLV(n_species = 4,
                       A = A_normal,
		       t_start = 0, 
                       t_store = 1000,
		       stochastic = FALSE,
		       norm = FALSE)

```


## Data containers

The simulated data sets are returned as `TreeSummarizedExperiment`
objects. This provides access to a broad range of tools for microbiome
analysis that support this format (see
[microbiome.github.io](http://microbiome.github.io)). More examples on
the object manipulation and analysis can be found at [OMA Online
Manual](https://microbiome.github.io/OMA).

For instance, to plot population density we can use the `miaViz` package:

```{r}
library(miaViz)
p1 <- plotAbundanceDensity(tse_hubbell, assay.type = "counts")
p2 <- plotSeries(tse_hubbell, x = "time")
print(p1+p2)
```


## Case studies 

Source code for replicating the published case studies using the
miaSim package ([Gao et
al. 2023](https://doi.org/10.1111/2041-210X.14129)) is available in
the phyloseq folder (based on the phyloseq data container). Some of
the original case studies have now been replicated also with
TreeSummarized, see the TreeSummarizedExperiment folder.


## Related work

- [micodymora](https://github.com/OSS-Lab/micodymora) Python package for microbiome simulation

# Session info

```{r}
sessionInfo()
```