---
title: "5. HPAanalyze use case: Export Human Protein Atlas (HPA) data as JSON"
author:
- name: Anh N. Tran
  affiliation: Northwestern University, Illinois, USA
  email: trannhatanh89@gmail.com
date: 6/8/2019
output:
  BiocStyle::html_document:
    toc: true
    toc_depth: 2
    toc_float: true
    number_sections: true
vignette: >
  %\VignetteIndexEntry{"5. HPAanalyze use case: Export Human Protein Atlas (HPA) data as JSON"}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
    collapse=TRUE,
    comment="#>",
    warning=FALSE,
    error=FALSE,
    eval=FALSE
)
```

```{r library, message=FALSE, warning=FALSE, error=FALSE}
library(BiocStyle)
library(HPAanalyze)
library(dplyr)
library(jsonlite)
```

# The case

In certain situations, users may want to export downloaded HPA data to JavaScript Object Notation (JSON) format, for purposes such as asynchronous, real-time server-to-browser communication. To reduce package dependencies, `HPAanalyze` does not support exporting to JSON via the `hpaExport` function. However, this can be done using a short script as described below.

# The solution

Data can be exported to JSON by converting the dataframes returned by `hpaDownload`/`hpaSubset` to JSON format with the `jsonlite` package, then writing each result to a `.json` file.

## Download and subset data

The datasets need no special processing. You can download and subset data as usual. The resulting object is a list of dataframes.

```{r}
data <- hpaDownload(downloadList = "histology", version = "example")

data_subset <- hpaSubset(data, targetGene = c('TP53', 'EGFR', 'CD44', 'PTEN', 'IDH1'))
```

## Convert dataframes to JSON

The list of dataframes is then converted to a list of `json` objects using `jsonlite::toJSON`.
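To make the conversion concrete, here is a minimal sketch of what `jsonlite::toJSON` does with a single dataframe. The `toy` dataframe and its values are hypothetical and only illustrate the shape of the output; they are not HPA data.

```{r}
library(jsonlite)

# Toy dataframe with hypothetical values (not real HPA data)
toy <- data.frame(
    gene = c("TP53", "EGFR"),
    level = c("High", "Low"),
    stringsAsFactors = FALSE
)

# By default, toJSON() serializes a dataframe row by row,
# giving a JSON array with one object per row
toy_json <- jsonlite::toJSON(toy)
class(toy_json)

# pretty = TRUE adds line breaks and indentation for readability
toy_pretty <- jsonlite::toJSON(toy, pretty = TRUE)

# fromJSON() parses the JSON string back into a dataframe
toy_back <- jsonlite::fromJSON(toy_json)
```

Applying the same call to every element of a list of dataframes gives one `json` object per dataset.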
```{r}
# Convert each dataframe in the list to a JSON string
data_json <- lapply(data_subset, jsonlite::toJSON)

str(data_json)

# List of 3
# $ normal_tissue : 'json' chr "[{\"ensembl\":\"ENSG00000026508\",\"gene\":\"CD44\",\"tissue\":\"adrenal gland\",\"cell_type\":\"glandular cell"| __truncated__
# $ pathology : 'json' chr "[{\"ensembl\":\"ENSG00000026508\",\"gene\":\"CD44\",\"cancer\":\"breast cancer\",\"high\":1,\"medium\":6,\"low\"| __truncated__
# $ subcellular_location: 'json' chr "[{\"ensembl\":\"ENSG00000026508\",\"gene\":\"CD44\",\"reliability\":\"Enhanced\",\"enhanced\":\"Golgi apparatus"| __truncated__
```

## Write JSON file

Finally, the `.json` files can be saved to your working directory using the following code. Note that one `.json` file is written for each dataset.

```{r}
# Write one file per dataset, named after the list element
for (i in seq_along(data_json)) {
    write(data_json[[i]],
          file = paste0("hpa_data_", names(data_json)[i], ".json"))
}
```

# In one function

If you routinely export HPA data to JSON format, the following function allows you to do so with the same syntax as `hpaExport`.

```{r}
## The function (note that you don't need to put .json into the file name)
hpaExportJSON <- function(data, fileName) {
    # Convert each dataframe in the list to a JSON string
    data_json <- lapply(data, jsonlite::toJSON)
    # Write one file per dataset: <fileName>_<dataset name>.json
    for (i in seq_along(data_json)) {
        write(data_json[[i]],
              file = paste0(fileName, "_", names(data_json)[i], ".json"))
    }
}

## Export data subset
hpaExportJSON(data_subset, fileName = "hpa_data")
```

# Copyright

```{r child = 'data/copyright', eval = TRUE}
```