How to merg/standardize GatingSets ======================================================== Usage ------------------------------------------------------ ```{r eval=FALSE} gs_split_by_tree(x) gs_check_redundant_nodes(x) gs_remove_redundant_nodes(x,toRemove) gs_remove_redundant_channels(gs, ...) ``` Arguments ------------------------------------------------------ **x** 'GatingSet' objects or or list of groups (each group member is a list of 'GatingSet`) **toRemove** list of the node sets to be removed. its length must equals to the length of argument **x** **...** other arguments ```{r echo=FALSE, message=FALSE, results='hide'} library(flowWorkspace) flowDataPath <- system.file("extdata", package = "flowWorkspaceData") gs <- load_gs(file.path(flowDataPath,"gs_manual")) gs1 <- gs_clone(gs) sampleNames(gs1) <- "1.fcs" # simply the tree nodes <- gs_get_pop_paths(gs1) for(toRm in nodes[grepl("CCR", nodes)]) Rm(toRm, gs1) # remove two terminal nodes gs2 <- gs_clone(gs1) sampleNames(gs2) <- "2.fcs" Rm("DPT", gs2) Rm("DNT", gs2) # remove singlets gate gs3 <- gs_clone(gs2) Rm("singlets", gs3) gs_pop_add(gs3, gs_pop_get_gate(gs2, "CD3+"), parent = "not debris") for(tsub in c("CD4", "CD8")) { gs_pop_add(gs3, gs_pop_get_gate(gs2, tsub), parent = "CD3+") for(toAdd in gs_pop_get_children(gs2, tsub)) { thisParent <- gs_pop_get_parent(gs2[[1]], toAdd,path="auto") gs_pop_add(gs3, gs_pop_get_gate(gs2, toAdd), parent = thisParent) } } sampleNames(gs3) <- "3.fcs" # spin the branch to make it isomorphic gs4 <- gs_clone(gs3) # rm cd4 branch first Rm("CD4", gs4) # add it back gs_pop_add(gs4, gs_pop_get_gate(gs3, "CD4"), parent = "CD3+") # add all the chilren back for(toAdd in gs_pop_get_children(gs3, "CD4")) { thisParent <- gs_pop_get_parent(gs3[[1]], toAdd) gs_pop_add(gs4, gs_pop_get_gate(gs3, toAdd), parent = thisParent) } sampleNames(gs4) <- "4.fcs" gs5 <- gs_clone(gs4) # add another redundant node gs_pop_add(gs5, gs_pop_get_gate(gs, "CD4/CCR7+ 45RA+")[[1]], parent = "CD4") gs_pop_add(gs5, gs_pop_get_gate(gs, "CD4/CCR7+ 45RA-")[[1]], parent = "CD4") sampleNames(gs5) <- "5.fcs" library(knitr) opts_chunk$set(fig.show = 'hold', fig.width = 4, fig.height = 4, results= 'asis') ``` ## Remove the redudant leaf/terminal nodes ```{r echo=FALSE} plot(gs1) plot(gs2) ``` Leaf nodes **DNT** and **DPT** are redudant for the analysis and should be **removed** before merging. ## Hide the non-leaf nodes ```{r echo=FALSE} plot(gs2) plot(gs3) ``` **singlets** node is not present in the second tree. But we **can't** remove it because it will remove all its descendants. We can **hide** it instead. ```{r} invisible(gs_pop_set_visibility(gs2, "singlets", FALSE)) plot(gs2) plot(gs3) ``` Note that even gating trees look the same but **singlets** still physically exists so we must refer the populations by **relative path** (`path = "auto"`) instead of **full path**. ```{r results='hold'} gs_get_pop_paths(gs2)[5] gs_get_pop_paths(gs3)[5] ``` ```{r results='hold'} gs_get_pop_paths(gs2, path = "auto")[5] gs_get_pop_paths(gs3, path = "auto")[5] ``` ## Isomorphism ```{r echo=FALSE} #restore gs2 invisible(gs_pop_set_visibility(gs2, "singlets", TRUE)) ``` ```{r echo=FALSE} plot(gs3) plot(gs4) ``` These two trees are **not identical** due to the **different order** of **CD4** and **CD8**. However they are still mergable thanks to the **reference by gating path** instead of `by numeric indices` ## convenient wrapper for merging To ease the process of merging large number of batches of experiments, here is some **internal wrappers** to make it **semi-automated**. ### Grouping by tree structures ```{r} gslist <- list(gs1, gs2, gs3, gs4, gs5) gs_groups <- gs_split_by_tree(gslist) length(gs_groups) ``` This divides all the `GatingSet`s into different groups, each group shares the same tree structure. Here we have `4` groups, ## Check if the discrepancy can be resolved by dropping leaf nodes ```{r error=TRUE} res <- try(gs_check_redundant_nodes(gs_groups), silent = TRUE) print(res[[1]]) ``` Apparently the non-leaf node (`singlets`) fails this check, and it is up to user to decide whether to hide this node or keep this group separate from further merging.Here we try to hide it. ```{r} for(gp in gs_groups) plot(gp[[1]]) ``` Based on the tree structure of each group (usually there aren't as many groups as `GatingSet` objects itself), we will hide `singlets` for `group 2` and `group 4`. ```{r} for(i in c(2,4)) for(gs in gs_groups[[i]]) invisible(gs_pop_set_visibility(gs, "singlets", FALSE)) ``` Now check again with `.gs_check_redundant_nodes` ```{r} toRm <- gs_check_redundant_nodes(gs_groups) toRm ``` Based on this, these groups can be consolidated by dropping * `CCR7+ 45RA+` and `CCR7+ 45RA-` from `group 1`. * `DNT` and `DPT` from `group 2`. Sometime it could be difficult to inspect and distinguish tree difference by simply plotting the enire gating tree or looking at this simple flat list of nodes (especially when the entire subtree is missing from cerntain groups). It is helpful to visualize and highlight only the tree hierarchy difference with the helper function `gs_plot_diff_tree` ```{r, message=FALSE} gs_plot_diff_tree(gs_groups) ``` To proceed the deletion of these nodes, `.gs_remove_redundant_nodes` can be used instead of doing it manually ```{r results='hide'} gs_remove_redundant_nodes(gs_groups, toRm) ``` Now they can be merged into a single `GatingSetList`. ```{r} GatingSetList(gslist) ``` Remove the redundant channels from `GatingSet` ------------------------------------------------------ Sometime there may be the extra `channels` in one data set that prevents it from being merged with other. If these channels are not used by any gates, then they can be safely removed. ```{r} gs_remove_redundant_channels(gs1) ```