---
title: A DelayedArray backend for TileDB
author:
- name: Aaron Lun
  email: infinite.monkeys.with.keyboards@gmail.com
date: "Revised: June 12, 2020"
output:
  BiocStyle::html_document:
    toc_float: yes
package: TileDBArray
vignette: >
  %\VignetteIndexEntry{User guide}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}  
---

```{r, echo=FALSE, results="hide"}
knitr::opts_chunk$set(error=FALSE, message=FALSE, warning=FALSE)
library(BiocStyle)
```

# Introduction

TileDB implements a framework for local and remote storage of dense and sparse arrays.
We can use this as a `DelayedArray` backend to provide an array-level abstraction,
thus allowing the data to be used in many places where an ordinary array or matrix might be used.
The `r Biocpkg("TileDBArray")` package implements the necessary wrappers around `r Githubpkg("TileDB-Inc/TileDB-R")`
to support read/write operations on TileDB arrays within the `r Biocpkg("DelayedArray")` framework.

# Creating a `TileDBArray`

Creating a `TileDBArray` is as easy as:

```{r}
X <- matrix(rnorm(1000), ncol=10)
library(TileDBArray)
writeTileDBArray(X)
```

Alternatively, we can use coercion methods:

```{r}
as(X, "TileDBArray")
```

This process also works for sparse matrices:

```{r}
Y <- Matrix::rsparsematrix(1000, 1000, density=0.01)
writeTileDBArray(Y)
```
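
As a quick sanity check, the sketch below writes a small sparse matrix and asks whether sparsity was preserved.
It assumes that `TileDBArray` objects report their sparsity through `is_sparse()` from `r Biocpkg("DelayedArray")`;
the object names `sp` and `sp_arr` are purely illustrative.

```{r}
library(TileDBArray)

# Small sparse matrix, kept separate from the objects used elsewhere.
sp <- Matrix::rsparsematrix(100, 50, density=0.05)
sp_arr <- writeTileDBArray(sp)

# Assumption: is_sparse() reflects the sparse layout of the backend.
is_sparse(sp_arr)

# The values should survive the round trip through the backend.
all.equal(as.matrix(sp_arr), as.matrix(sp), check.attributes=FALSE)
```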

Logical and integer matrices are supported:

```{r}
writeTileDBArray(Y > 0)
```

As are matrices with dimension names:

```{r}
rownames(X) <- sprintf("GENE_%i", seq_len(nrow(X)))
colnames(X) <- sprintf("SAMP_%i", seq_len(ncol(X)))
writeTileDBArray(X)
```

# Manipulating `TileDBArray`s

`TileDBArray`s are simply `DelayedArray` objects and can be manipulated as such.
The usual conventions for extracting data from matrix-like objects work as expected:

```{r}
out <- as(X, "TileDBArray")
dim(out)
head(rownames(out))
head(out[,1])
```
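
To pull values back into memory, the standard `as.matrix()` coercion can be used on the whole array or on a subset.
The following is a minimal sketch using illustrative objects (`mat`, `arr`) rather than those defined above:

```{r}
library(TileDBArray)

mat <- matrix(rnorm(200), ncol=10)
arr <- as(mat, "TileDBArray")

# as.matrix() reads all values from the backend into an ordinary matrix.
in_memory <- as.matrix(arr)
class(in_memory)
all.equal(in_memory, mat)

# Subsets can be realized in the same way.
all.equal(as.matrix(arr[1:5, 1:3]), mat[1:5, 1:3])
```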

We can also perform manipulations like subsetting and arithmetic.
Note that these operations do not modify the data in the TileDB backend.
Instead, they are recorded and delayed until the values are explicitly required,
which is why each operation returns a `DelayedMatrix` object.

```{r}
out[1:5,1:5] 
out * 2
```
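
One way to see the delayed operations is `showtree()` from `r Biocpkg("DelayedArray")`, which prints the operations stacked on top of the seed.
The sketch below uses illustrative objects (`mat`, `arr`, `delayed`) and checks that the backend is unchanged after the manipulation:

```{r}
library(TileDBArray)

mat <- matrix(rnorm(200), ncol=10)
arr <- as(mat, "TileDBArray")

# Arithmetic and subsetting are recorded, not executed.
delayed <- (arr * 2)[1:5, ]
showtree(delayed)

# Realizing the result applies the recorded operations to the stored values.
all.equal(as.matrix(delayed), mat[1:5, ] * 2)

# The backend itself still holds the original values.
all.equal(as.matrix(arr), mat)
```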

We can also do more complex matrix operations that are supported by `r Biocpkg("DelayedArray")`:

```{r}
colSums(out)
out %*% runif(ncol(out))
```
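
These results should agree with the same operations on an ordinary matrix.
A minimal check, again with illustrative objects, might look like this;
`as.matrix()` is applied to the product so that the comparison works whether the result is an ordinary matrix or a delayed one.

```{r}
library(TileDBArray)

mat <- matrix(rnorm(200), ncol=10)
arr <- as(mat, "TileDBArray")

# Column sums computed from the backend versus in memory.
all.equal(colSums(arr), colSums(mat))

# Matrix-vector product computed from the backend versus in memory.
v <- runif(ncol(mat))
all.equal(as.matrix(arr %*% v), mat %*% v, check.attributes=FALSE)
```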

# Controlling backend creation

We can adjust some parameters for creating the backend by passing the appropriate arguments to `writeTileDBArray()`.
For example, we can specify the path to the backend
as well as the name of the attribute containing the data.

```{r}
X <- matrix(rnorm(1000), ncol=10)
path <- tempfile()
writeTileDBArray(X, path=path, attr="WHEE")
```
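
Knowing the path is useful because the backend can be opened again later.
The sketch below assumes that the `TileDBArray()` constructor accepts the path to an existing backend and, when no attribute is named, uses the only attribute present;
`demo_path` and `reopened` are illustrative names.

```{r}
library(TileDBArray)

mat <- matrix(rnorm(200), ncol=10)
demo_path <- tempfile()
written <- writeTileDBArray(mat, path=demo_path)

# Assumption: TileDBArray() opens an existing backend from its path.
reopened <- TileDBArray(demo_path)
all.equal(as.matrix(reopened), mat)
```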

As these arguments cannot be passed during coercion,
we instead provide global settings that can be set or unset to control where and how `as()` creates the backend.

```{r}
path2 <- tempfile()
setTileDBPath(path2)
as(X, "TileDBArray") # uses path2 to store the backend.
```
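
Because the setting is global, it can be worth restoring the previous value once the coercion is done.
The sketch below assumes that `getTileDBPath()` returns the current setting (or `NULL` if none is set) and that `setTileDBPath()` accepts that value back;
`demo_path` and `old` are illustrative names.

```{r}
library(TileDBArray)

# Remember the current setting so that it can be restored afterwards.
old <- getTileDBPath()

demo_path <- tempfile()
setTileDBPath(demo_path)
getTileDBPath()

# The coercion writes the backend to the path set above.
mat <- matrix(rnorm(200), ncol=10)
arr <- as(mat, "TileDBArray")
dir.exists(demo_path)

# Restore the previous setting; passing NULL is assumed to unset the path.
setTileDBPath(old)
```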

# Session information

```{r}
sessionInfo()
```