---
title: "***tanggle***: Visualization of phylogenetic networks in a *ggplot2* framework"
author:
- name: Klaus Schliep
  affiliation: Graz University of Technology
- name: Marta Vidal-García
  affiliation: University of Calgary
- name: Leann Biancani
  affiliation: University of Rhode Island
- name: Francisco Henao Diaz
  affiliation: University of British Columbia
- name: Eren Ada
  affiliation: University of Rhode Island
- name: Claudia Solís-Lemus
  affiliation: University of Wisconsin-Madison
  email: klaus.schliep@gmail.com
package: tanggle
output:
  BiocStyle::html_document
vignette: |
  %\VignetteIndexEntry{***tanggle***: Visualization of phylogenetic networks in a *ggplot2* framework}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
bibliography: tanggle_references.bib
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
suppressPackageStartupMessages({
  library(tanggle, quietly=TRUE)
  library(phangorn, quietly=TRUE)
  library(ggtree, quietly=TRUE)
})
```

\center
![](../inst/extdata/logo.png){width=25%}
\center

# Introduction

Here we present a vignette for the R package ***tanggle***, and provide an overview of its functions and their usage. ***Tanggle*** extends the *ggtree* R package [@Yu2017] to allow for the visualization of several types of phylogenetic networks using the *ggplot2* [@Wickham2016] syntax. More specifically, *tanggle* contains functions to allow the user to effectively plot: (1) split (i.e. implicit) networks (unrooted, undirected) and (2) explicit networks (rooted, directed) with reticulations. It offers an alternative to the plot functions already available in *ape* [@Paradis2018] and *phangorn* [@Schliep2011].

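To make the comparison with the existing plot functions concrete, the following sketch draws the same split network twice: once with the base-graphics `plot` method that *phangorn* provides for *networx* objects, and once with `ggsplitnet`. The choice of the `yeast` alignment bundled with *phangorn*, the default model of `dist.ml`, and the chunk label are assumptions made for illustration only; the chunk is marked `eval=FALSE`, so it is not run when the vignette is built.

```{r intro-sketch, eval=FALSE}
# Illustrative sketch (not evaluated): the data set and settings are assumptions.
library(phangorn)
library(tanggle)
library(ggtree)

# Example alignment shipped with phangorn (assumed here purely for illustration)
data(yeast, package = "phangorn")

# Pairwise distances under the default model, then a neighbor-net
dm <- dist.ml(yeast)
nnet <- neighborNet(dm)      # object of class "networx"

# Base-graphics drawing from phangorn
plot(nnet)

# The same network drawn through tanggle, with ggtree layers added on top
ggsplitnet(nnet) + geom_tiplab2()
```

Because `ggsplitnet` returns a ggplot object, further layers can be added with `+`, which the base-graphics version does not allow.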
# List of functions

| Function name | Brief description |
| :-------- | :-------------------------------------------- |
| `geom_splitnet` | Adds a *splitnet* layer to a ggplot, to combine visualising data and the network |
| `ggevonet` | Plots an explicit network from a *phylo* object |
| `ggsplitnet` | Plots an implicit network from a *phylo* object |
| `minimize_overlap` | Reduces the number of reticulation lines crossing over in the plot |
| `node_depth_evonet` | Returns the depths or heights of nodes and tips in the phylogenetic network |

# Getting started

Install the package from Bioconductor directly:

```{r install-bioc, eval=FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("tanggle")
```

Or install the development version of the package from [GitHub](https://github.com/KlausVigo/tanggle).

```{r install-gh, eval=FALSE}
if (!requireNamespace("remotes", quietly=TRUE))
    install.packages("remotes")
remotes::install_github("KlausVigo/tanggle")
```

If you need to install *ggtree* from GitHub:

```{r install-ggtree, eval=FALSE}
remotes::install_github("YuLab-SMU/ggtree")
```

And load all the libraries:

```{r load-pkg, echo=TRUE, results='hide'}
library(tanggle)
library(phangorn)
library(ggtree)
```

***

# Split Networks

Split networks are data-display objects that can represent two (or more) incompatible splits at the same time. They are most often used to visualize consensus networks [@Holland2004] or neighbor-nets [@Bryant2004]. These can be computed with the `consensusNet` or `neighborNet` functions in *phangorn* [@Schliep2011], or imported from nexus files written by SplitsTree [@Huson2006].

## Data Types

*tanggle* accepts three forms of input data for split networks. The following input options all generate a *networx* object for plotting.

1. Nexus file created with SplitsTree [@Huson2006] and read with the `read.nexus.networx` function in *phangorn* [@Schliep2011].

Read in a split network in nexus format:

```{r}
# Locate the example nexus file shipped with phangorn and read it as a networx object
fdir <- system.file("extdata/trees", package = "phangorn")
Nnet <- phangorn::read.nexus.networx(file.path(fdir,"woodmouse.nxs"))
```

2. A collection of gene trees (e.g. from RAxML [@Stamatakis2014RAxML]) in one of the following formats:
    + Nexus file read with the function `read.nexus`
    + Text file in Newick format (one gene tree per line) read with the function `read.tree`

    A consensus split network is then computed using the function `consensusNet` in *phangorn* [@Schliep2011].

3. Sequences in nexus, fasta or phylip format, read with the function `read.phyDat` in *phangorn* [@Schliep2011] or the function `read.dna` in *ape* [@Paradis2018]. Distance matrices are then computed for specific models of evolution using the function `dist.ml` in *phangorn* [@Schliep2011] or `dist.dna` in *ape* [@Paradis2018]. From the distance matrix, a split network is reconstructed using the function `neighborNet` in *phangorn* [@Schliep2011]. ***Optional***: branch lengths may be estimated using the function `splitsNetwork` in *phangorn* [@Schliep2011].

## Plotting a Split Network

We can plot the network with the default options:

```{r}
p <- ggsplitnet(Nnet) + geom_tiplab2()
p
```

We can set the limits of the x and y axes so that the labels are readable.

```{r}
p <- p + xlim(-0.019, .003) + ylim(-.01,.012)
p
```

You can rename tip labels. Here we replace the names with the numbers 1 to 15:

```{r}
Nnet$translate$label <- seq_along(Nnet$tip.label)
```

We can include the tip labels with `geom_tiplab2`, and customize some of the options. For example, here the tip labels are blue and in bold italics, and the internal nodes are shown in green:

```{r}
ggsplitnet(Nnet) + geom_tiplab2(col = "blue", font = 4, hjust = -0.15) +
    geom_nodepoint(col = "green", size = 0.25)
```

Nodes can also be annotated with `geom_point`.

```{r}
# Use different shapes and colours for tips and internal nodes
ggsplitnet(Nnet) + geom_point(aes(shape=isTip, color=isTip), size=2)
```

# Plotting Explicit Networks

The function `ggevonet` plots explicit networks (phylogenetic trees with reticulations). A recent addition to *ape* [@Paradis2018] made it possible to read in trees in extended Newick format [@Cardona2008].

Read in an explicit network (example from Fig. 2 in Cardona et al. 2008):

```{r}
z <- read.evonet(text = "((1,((2,(3,(4)Y#H1)g)e,(((Y#H1,5)h,6)f)X#H2)c)a,((X#H2,7)d,8)b)r;")
```

Plot an explicit network:

```{r}
# Rectangular layout with tip and node labels
ggevonet(z, layout = "rectangular") + geom_tiplab() + geom_nodelab()
# Slanted layout, stored so that further layers can be added
p <- ggevonet(z, layout = "slanted") + geom_tiplab() + geom_nodelab()
p + geom_tiplab(size=3, color="purple")
p + geom_nodepoint(color="#b5e521", alpha=1/4, size=10)
```

# Summary

This vignette introduces the main plotting functions in the R package ***tanggle***, and provides some examples of how to plot both explicit and implicit networks. Split network plots should accept most of the *ggtree* functions that are compatible with unrooted trees. The layout options for explicit network plots are *rectangular* or *slanted*.

\newpage

\appendix

# Session info {.unnumbered}

```{r}
sessionInfo()
```

# References

\bibliography{tanggle}