The gDRimport package is part of the gDR suite. It prepares raw drug response data for downstream processing and mainly contains helper functions for importing, loading, and validating dose-response data provided in different file formats.
Four test datasets are currently available. They illustrate the input data that gDRimport expects.
# primary test data
td1 <- get_test_data()
summary(td1)
## Length Class Mode
## 1 gdr_test_data S4
td1
## class: gdr_test_data
## slots: manifest_path result_path template_path ref_m_df ref_r1_r2 ref_r1 ref_t1_t2 ref_t1
# test data in Tecan format
td2 <- get_test_Tecan_data()
summary(td2)
## Length Class Mode
## m_file 1 -none- character
## r_files 1 -none- character
## t_files 1 -none- character
## ref_m_df 1 -none- character
## ref_r_df 1 -none- character
## ref_t_df 1 -none- character
# test data in D300 format
td3 <- get_test_D300_data()
summary(td3)
## Length Class Mode
## f_96w 6 -none- list
## f_384w 6 -none- list
# test data obtained from EnVision
td4 <- get_test_EnVision_data()
summary(td4)
## Length Class Mode
## m_file 1 -none- character
## r_files 28 -none- character
## t_files 2 -none- character
## ref_l_path 1 -none- character
The load_data function is the key function. It wraps the load_manifest, load_templates, and load_results functions and supports different file formats.
ml <- load_manifest(manifest_path(td1))
summary(ml)
## Length Class Mode
## data 4 data.table list
## headers 27 -none- list
t_df <- load_templates(template_path(td1))
summary(t_df)
## WellRow WellColumn Gnumber Concentration
## Length:768 Length:768 Length:768 Length:768
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
## Gnumber_2 Concentration_2 Template
## Length:768 Length:768 Length:768
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
r_df <- suppressMessages(load_results(result_path(td1)))
summary(r_df)
## Barcode WellRow WellColumn ReadoutValue
## Length:4587 Length:4587 Min. : 1.00 Min. : 12627
## Class :character Class :character 1st Qu.: 6.50 1st Qu.: 67905
## Mode :character Mode :character Median :12.00 Median : 140865
## Mean :12.49 Mean : 263996
## 3rd Qu.:18.00 3rd Qu.: 324707
## Max. :24.00 Max. :2423054
## BackgroundValue
## Min. :332.0
## 1st Qu.:351.0
## Median :374.0
## Mean :453.2
## 3rd Qu.:570.0
## Max. :704.0
l_tbl <- suppressMessages(
  load_data(manifest_path(td1), template_path(td1), result_path(td1))
)
summary(l_tbl)
## Length Class Mode
## manifest 4 data.table list
## treatments 7 data.table list
## data 5 data.table list
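The summary above shows that load_data returns a list with three tables named manifest, treatments, and data. As a minimal sketch, the check below confirms that these elements are present and reports their row counts. The helper check_loaded_tables is hypothetical and not part of gDRimport; the package calls are guarded so the snippet still runs when gDRimport is not installed.

```r
# Hypothetical helper (not part of gDRimport): confirm that the three tables
# listed in the summary above are present and return their row counts.
check_loaded_tables <- function(x,
                                expected = c("manifest", "treatments", "data")) {
  stopifnot(is.list(x))
  absent <- setdiff(expected, names(x))
  if (length(absent) > 0) {
    stop("Missing element(s): ", paste(absent, collapse = ", "))
  }
  vapply(x[expected], NROW, integer(1))
}

# Run against the primary test data only if gDRimport is available.
if (requireNamespace("gDRimport", quietly = TRUE)) {
  td <- gDRimport::get_test_data()
  l_tbl <- suppressMessages(
    gDRimport::load_data(
      gDRimport::manifest_path(td),
      gDRimport::template_path(td),
      gDRimport::result_path(td)
    )
  )
  print(check_loaded_tables(l_tbl))
}
```

A check like this can catch a failed or partial import before the tables are passed further down the pipeline.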
PRISM, a multiplexed cancer cell line screening platform, enables rapid, high-throughput screening of a broad spectrum of drugs across more than 900 human cancer cell line models. Publicly available PRISM data can be downloaded from the DepMap website.
The gDRimport package supports processing PRISM data at two levels: LEVEL5 and LEVEL6.
LEVEL5 Data: This format stores all information about drugs, cell lines, and viability in a single file. To process LEVEL5 PRISM data, use the convert_LEVEL5_prism_to_gDR_input() function. Besides transforming and cleaning the data, it also runs the gDR pipeline for further analysis.
LEVEL6 Data: In LEVEL6, PRISM data is distributed across three separate files:
prism_data: collapsed log fold change data for viability assays.
cell_line_data: information about cell lines.
treatment_data: treatment data.
To process LEVEL6 PRISM data, use the convert_LEVEL6_prism_to_gDR_input() function, which takes the paths to these three files as input arguments.
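To make "collapsed log fold change" more concrete, the sketch below converts log fold changes into relative viability. It assumes the values are log2 fold changes relative to a vehicle control, which is an assumption rather than something stated here, and the helper lfc_to_viability is hypothetical. It is not necessarily how convert_LEVEL6_prism_to_gDR_input() transforms the data internally.

```r
# Hypothetical helper (not part of gDRimport).
# Assumption: lfc holds log2 fold changes relative to the vehicle control,
# so 0 means no effect and -1 means half of the control signal.
lfc_to_viability <- function(lfc) {
  stopifnot(is.numeric(lfc))
  2^lfc
}

# Illustrative values only, not real PRISM data.
lfc <- c(0, -1, -2)
lfc_to_viability(lfc)
# [1] 1.00 0.50 0.25
```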
To process LEVEL5 PRISM data, you can use the following function:
convert_LEVEL5_prism_to_gDR_input("path_to_file")
Replace “path_to_file” with the actual path to your LEVEL5 PRISM data file. This function will handle the transformation, cleaning, and execution of the gDR pipeline automatically.
To process LEVEL6 PRISM data, you can use the following function:
convert_LEVEL6_prism_to_gDR_input("prism_data_path", "cell_line_data_path", "treatment_data_path")
Replace “prism_data_path”, “cell_line_data_path”, and “treatment_data_path” with the respective paths to your LEVEL6 PRISM data files.
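Because the LEVEL6 converter depends on three separate files, it can help to confirm that all of them exist before calling it. The following is a minimal sketch: check_prism_paths is a hypothetical helper, not a gDRimport function, and the file names are placeholders to be replaced with the files downloaded from DepMap.

```r
# Hypothetical helper (not part of gDRimport): stop with a clear message
# when any of the supplied paths does not exist.
check_prism_paths <- function(...) {
  paths <- c(...)
  not_found <- paths[!file.exists(paths)]
  if (length(not_found) > 0) {
    stop("File(s) not found: ", paste(not_found, collapse = ", "))
  }
  invisible(paths)
}

# Placeholder file names (assumptions for illustration only).
prism_data_path <- "prism_data.csv"
cell_line_data_path <- "cell_line_data.csv"
treatment_data_path <- "treatment_data.csv"

# Report missing files instead of failing, then convert only if all is well.
paths_ok <- tryCatch({
  check_prism_paths(prism_data_path, cell_line_data_path, treatment_data_path)
  TRUE
}, error = function(e) {
  message(conditionMessage(e))
  FALSE
})

if (paths_ok && requireNamespace("gDRimport", quietly = TRUE)) {
  prism_input <- gDRimport::convert_LEVEL6_prism_to_gDR_input(
    prism_data_path, cell_line_data_path, treatment_data_path
  )
}
```

Checking the paths up front gives a single message naming every missing file, rather than an error raised partway through the conversion.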
The installAllDeps function assists in installing package dependencies.
sessionInfo()
## R Under development (unstable) (2024-10-21 r87258)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.21-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] BiocStyle_2.35.0 MultiAssayExperiment_1.33.1
## [3] gDRimport_1.5.4 PharmacoGx_3.11.0
## [5] CoreGx_2.11.0 SummarizedExperiment_1.37.0
## [7] Biobase_2.67.0 GenomicRanges_1.59.1
## [9] GenomeInfoDb_1.43.2 IRanges_2.41.2
## [11] S4Vectors_0.45.2 MatrixGenerics_1.19.0
## [13] matrixStats_1.4.1 BiocGenerics_0.53.3
## [15] generics_0.1.3
##
## loaded via a namespace (and not attached):
## [1] bitops_1.0-9 formatR_1.14 readxl_1.4.3
## [4] testthat_3.2.2 rlang_1.1.4 magrittr_2.0.3
## [7] shinydashboard_0.7.2 compiler_4.5.0 vctrs_0.6.5
## [10] reshape2_1.4.4 relations_0.6-14 stringr_1.5.1
## [13] pkgconfig_2.0.3 crayon_1.5.3 fastmap_1.2.0
## [16] backports_1.5.0 XVector_0.47.0 caTools_1.18.3
## [19] utf8_1.2.4 promises_1.3.2 rmarkdown_2.29
## [22] UCSC.utils_1.3.0 coop_0.6-3 xfun_0.49
## [25] zlibbioc_1.53.0 cachem_1.1.0 jsonlite_1.8.9
## [28] SnowballC_0.7.1 later_1.4.1 DelayedArray_0.33.3
## [31] BiocParallel_1.41.0 parallel_4.5.0 sets_1.0-25
## [34] cluster_2.1.7 R6_2.5.1 stringi_1.8.4
## [37] bslib_0.8.0 RColorBrewer_1.1-3 qs_0.27.2
## [40] limma_3.63.2 boot_1.3-31 cellranger_1.1.0
## [43] brio_1.1.5 jquerylib_0.1.4 bookdown_0.41
## [46] assertthat_0.2.1 Rcpp_1.0.13-1 knitr_1.49
## [49] downloader_0.4 httpuv_1.6.15 Matrix_1.7-1
## [52] igraph_2.1.2 tidyselect_1.2.1 abind_1.4-8
## [55] yaml_2.3.10 stringfish_0.16.0 gplots_3.2.0
## [58] codetools_0.2-20 lattice_0.22-6 tibble_3.2.1
## [61] plyr_1.8.9 shiny_1.9.1 BumpyMatrix_1.15.0
## [64] evaluate_1.0.1 lambda.r_1.2.4 futile.logger_1.4.3
## [67] RcppParallel_5.1.9 bench_1.1.3 BiocManager_1.30.25
## [70] pillar_1.9.0 lsa_0.73.3 KernSmooth_2.23-24
## [73] checkmate_2.3.2 DT_0.33 shinyjs_2.1.0
## [76] piano_2.23.0 ggplot2_3.5.1 munsell_0.5.1
## [79] scales_1.3.0 RApiSerialize_0.1.4 gtools_3.9.5
## [82] xtable_1.8-4 marray_1.85.0 glue_1.8.0
## [85] slam_0.1-55 tools_4.5.0 data.table_1.16.4
## [88] gDRutils_1.5.3 fgsea_1.33.0 visNetwork_2.1.2
## [91] fastmatch_1.1-4 cowplot_1.1.3 grid_4.5.0
## [94] colorspace_2.1-1 GenomeInfoDbData_1.2.13 cli_3.6.3
## [97] futile.options_1.0.1 fansi_1.0.6 S4Arrays_1.7.1
## [100] rematch_2.0.0 dplyr_1.1.4 gtable_0.3.6
## [103] sass_0.4.9 digest_0.6.37 SparseArray_1.7.2
## [106] htmlwidgets_1.6.4 htmltools_0.5.8.1 lifecycle_1.0.4
## [109] httr_1.4.7 statmod_1.5.0 mime_0.12