This document serves as a reporting tool for errors that occur when running our utility functions on the cBioPortal datasets.
Typically, the number of errors encountered via the API is low. Only a handful of datasets error when we apply the utility functions to provide a MultiAssayExperiment data representation.
First, we load the error data, which is stored as a JSON file:
api_errs <- system.file(
    "extdata", "api", "err_api_info.json",
    package = "cBioPortalData", mustWork = TRUE
)
err_api_info <- jsonlite::fromJSON(api_errs)
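As a rough illustration of what this step produces, the sketch below parses a small JSON string with `fromJSON()`. The JSON is a made-up miniature, not the packaged file: the key/array layout is an assumption, and "Hypothetical error message" and `example_study_1` are placeholders introduced only for this example.

```r
library(jsonlite)

# Hypothetical miniature of an error file: each key is an error message,
# each value is an array of study IDs (this layout is an assumption).
toy_json <- '{
  "Inconsistent build numbers found": ["skcm_tcga", "stad_tcga"],
  "Hypothetical error message": ["example_study_1"]
}'

toy_errs <- fromJSON(toy_json)

# Each JSON array is simplified to a character vector, so the result
# is a named list with one element per error message.
str(toy_errs)
```

Under that assumption, the error messages become the list names and the study IDs become the values.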
We can now inspect the contents of the data:
## [1] "list"
## [1] 6
## Barcodes must start with 'TCGA'
## 2
## group length is 0 but data length > 0
## 1
## Frequency of NA values higher than the cutoff tolerance
## 2
## Inconsistent build numbers found
## 33
## `n` must be a single number, not an integer `NA`.
## 1
## Argument 1 must be a data frame or a named atomic vector.
## 1
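The calls that produced the output above are not shown here. Output of this shape is consistent with base R's `class()`, `length()`, and `lengths()`, so the following is a minimal sketch on a toy list. `toy_errs` and its placeholder entry are assumptions for illustration, not the real `err_api_info`.

```r
# Toy stand-in for err_api_info
# (assumption: a named list of character vectors of study IDs).
toy_errs <- list(
  "Inconsistent build numbers found" = c("skcm_tcga", "stad_tcga", "stad_tcga_pub"),
  "Hypothetical error message" = c("example_study_1")
)

class(toy_errs)    # container type: "list"
length(toy_errs)   # number of distinct error messages: 2
lengths(toy_errs)  # number of studies per error message: 3 and 1
```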
There were 6 unique errors during the last build run:
## [1] "Barcodes must start with 'TCGA'"
## [2] "group length is 0 but data length > 0"
## [3] "Frequency of NA values higher than the cutoff tolerance"
## [4] "Inconsistent build numbers found"
## [5] "`n` must be a single number, not an integer `NA`."
## [6] "Argument 1 must be a data frame or a named atomic vector."
The most common error was "Inconsistent build numbers found". This is due to annotations from different build numbers that could not be resolved.
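One way to find the most common error programmatically, rather than reading it off the counts, is to sort the per-error counts. This is only a sketch on a toy list; `toy_errs` and the two "Hypothetical error" entries are assumptions for illustration.

```r
# Toy stand-in for the error list (hypothetical entries apart from the
# "Inconsistent build numbers found" study IDs, which appear in this document).
toy_errs <- list(
  "Hypothetical error A" = c("example_study_1"),
  "Inconsistent build numbers found" = c("skcm_tcga", "stad_tcga", "stad_tcga_pub"),
  "Hypothetical error B" = c("example_study_2", "example_study_3")
)

# Count studies per error and order from most to least frequent.
counts <- sort(lengths(toy_errs), decreasing = TRUE)

# The first name is the most frequent error message.
most_common <- names(counts)[1]
most_common
```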
To see which datasets (cancer_study_ids) have that error, we can use:
err_api_info[['Inconsistent build numbers found']]
## [1] "msk_ch_2020" "msk_access_2021"
## [3] "mixed_msk_tcga_2021" "mixed_impact_subset_2022"
## [5] "pan_origimed_2020" "prad_msk_stopsack_2021"
## [7] "pancan_pcawg_2020" "prad_pik3r1_msk_2021"
## [9] "skcm_tcga" "stad_tcga"
## [11] "stad_tcga_pub" "skcm_tcga_pan_can_atlas_2018"
## [13] "stad_tcga_pan_can_atlas_2018" "stes_tcga_pub"
## [15] "summit_2018" "cfdna_msk_2019"
## [17] "blca_bcan_hcrn_2022" "nsclc_ctdx_msk_2022"
## [19] "thyroid_mskcc_2016" "skcm_mskcc_2014"
## [21] "tmb_mskcc_2018" "rectal_msk_2019"
## [23] "skcm_tcga_pub_2015" "msk_spectrum_tme_2022"
## [25] "ucec_ccr_cfdna_msk_2022" "paired_bladder_2022"
## [27] "mtnn_msk_2022" "pog570_bcgsc_2020"
## [29] "sarcoma_msk_2023" "bowel_colitis_msk_2022"
## [31] "luad_mskcc_2023_met_organotropism" "coad_silu_2022"
## [33] "paac_msk_jco_2023"
We can also have a look at the entirety of the dataset.
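Printing the list object directly shows everything. As an alternative view, a named list of character vectors can be flattened into a long table with base R's `stack()`. The sketch below uses a toy list; `toy_errs`, its placeholder entry, and the column names `study_id` and `error` are assumptions chosen for this example.

```r
# Toy stand-in for err_api_info (assumption: named list of character vectors).
toy_errs <- list(
  "Inconsistent build numbers found" = c("skcm_tcga", "stad_tcga"),
  "Hypothetical error message" = c("example_study_1")
)

# stack() returns a data frame with one row per study:
# column `values` holds the study ID, column `ind` the list name.
long <- stack(toy_errs)
names(long) <- c("study_id", "error")
long
```

The long format can be convenient for filtering or for joining with other study metadata.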
Now let’s look at the errors in the packaged datasets:
pack_errs <- system.file(
    "extdata", "pack", "err_pack_info.json",
    package = "cBioPortalData", mustWork = TRUE
)
err_pack_info <- jsonlite::fromJSON(pack_errs)
We can do the same for this data:
## [1] 5
## more columns than column names
## 12
## Frequency of NA values higher than the cutoff tolerance
## 5
## invalid class "ExperimentList" object: \n Non-unique names provided
## 2
## non-character argument
## 2
## 'wget' call had nonzero exit status
## 13
We can get a list of all the errors present:
## [1] "more columns than column names"
## [2] "Frequency of NA values higher than the cutoff tolerance"
## [3] "invalid class \"ExperimentList\" object: \n Non-unique names provided"
## [4] "non-character argument"
## [5] "'wget' call had nonzero exit status"
And finally the full list of errors:
## $`more columns than column names`
## [1] "ccrcc_utokyo_2013" "gbm_cptac_2021"
## [3] "luad_mskimpact_2021" "mbl_dkfz_2017"
## [5] "pan_origimed_2020" "sarcoma_msk_2022"
## [7] "bowel_colitis_msk_2022" "prad_msk_mdanderson_2023"
## [9] "brca_tcga_pan_can_atlas_2018" "coadread_tcga_pan_can_atlas_2018"
## [11] "ov_tcga_pan_can_atlas_2018" "sarc_tcga_pan_can_atlas_2018"
## $`Frequency of NA values higher than the cutoff tolerance`
## [1] "ihch_mskcc_2020" "mixed_selpercatinib_2020"
## [3] "ucec_ccr_msk_2022" "mixed_msk_tcga_2021"
## [5] "ihch_msk_2021"
## $`invalid class "ExperimentList" object: \n Non-unique names provided`
## [1] "mpnst_mskcc" "stad_tcga_pub"
## $`non-character argument`
## [1] "pcpg_tcga_pub" "mbn_mdacc_2013"
## $`'wget' call had nonzero exit status`
## [1] "braf_msk_impact_2024" "braf_msk_archer_2024" "prostate_msk_2024"
## [4] "pcnsl_msk_2024" "pdac_msk_2024" "ucs_msk_2024"
## [7] "asclc_msk_2024" "lms_msk_2024" "crc_orion_2024"
## [10] "brca_aurora_2023" "hcc_msk_2024" "pancreas_msk_2024"
## [13] "pancan_mimsi_msk_2024"
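Some studies appear in both the API and the packaged-data error lists. A possible way to find them is to flatten both lists and intersect the study IDs. The sketch below uses small subsets of the lists shown in this document; `toy_api` and `toy_pack` are illustrative names, not objects from the package.

```r
# Small subsets of the two error lists printed in this document.
toy_api <- list(
  "Inconsistent build numbers found" =
    c("pan_origimed_2020", "stad_tcga_pub", "skcm_tcga")
)
toy_pack <- list(
  "more columns than column names" = c("pan_origimed_2020", "gbm_cptac_2021"),
  "non-character argument" = c("pcpg_tcga_pub", "mbn_mdacc_2013")
)

# Flatten each list to a plain vector of study IDs, then intersect.
both <- intersect(
  unlist(toy_api, use.names = FALSE),
  unlist(toy_pack, use.names = FALSE)
)
both

# Reverse lookup: which packaged-data error(s) does a shared study fall under?
names(Filter(function(ids) "pan_origimed_2020" %in% ids, toy_pack))
```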
