---
title: "Quick Guide to AssessORFData"
shorttitle: "Quick Guide to AssessORFData"
author: "Deepank Korandla"
date: "`r doc_date()`"
package: "`r pkg_ver('AssessORFData')`"
output: BiocStyle::pdf_document
vignette: >
  %\VignetteIndexEntry{Using AssessORFData}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

# Introduction

AssessORFData is an accompaniment to the AssessORF package, providing access to mapping and results objects generated by AssessORF as well as the genome sequences for the strains corresponding to those objects. Briefly, a mapping object stores the mapping of proteomics evidence and evolutionary conservation evidence to a particular strain's genome, and a results object stores how much evidence there is for or against each gene in a set of predicted genes for a particular strain's genome. Detailed descriptions of the structure and content of those two types of objects can be found in the documentation for the AssessORF package.

# Strains and Gene Sources

AssessORFData has data for 20 strains, and their IDs (within the package) are listed below:

```{r}
AssessORFData::GetStrainIDs()
```

For each strain, there are 5 objects: 1 mapping object and 4 results objects. The 4 results objects per strain differ in that, for each one, the set of predicted genes came from a different program or database. The same 4 gene sources were used for all 20 strains, and their IDs (within the package) are listed below:

```{r}
AssessORFData::GetGeneSources()
```

# Getting the Objects

While the `data` function can be used to pull the desired object from the package into the user's workspace, using the `data` function may be inconvenient for some users because there are 100 mapping and results objects, each of which has a long name. For this reason, AssessORFData has alternative functions, `GetDataMapObj` and `GetResultsObj`, to accomplish a similar task.
These functions allow the user to get the object of interest and then save it in their workspace under a different name. Examples of how to use the two functions are shown below:

```{r}
library(AssessORFData)

## Can replace the character string specifying the strain ID (first
## parameter) with any of the other 19 strain IDs listed above
mapObj <- GetDataMapObj("MGAS5005")

resObj1 <- GetResultsObj("MGAS5005", "Prodigal")
resObj2 <- GetResultsObj("MGAS5005", "GenBank")
resObj3 <- GetResultsObj("MGAS5005", "GeneMarkS2")
resObj4 <- GetResultsObj("MGAS5005", "Glimmer")
```

# Saving the Genome

The `SaveGenomeToPath` function saves the genome for a strain of the user's choosing to a file path on their local machine. This is useful when the user wants to run their own analyses on that strain's genome, e.g. predicting genes for the genome using a different gene-finding program. An example of how to use the function is provided below:

```{r}
library(AssessORFData)

## A path to a temporary file is used in this example.
tmpFile <- paste0(tempfile(), ".fasta")

## Replace the second parameter below with a character string specifying
## the desired file path, making sure the file is of type FASTA.
SaveGenomeToPath("MGAS5005", tmpFile)

unlink(tmpFile)
```

\newpage

# Session Info

All of the output in this vignette was produced under the following conditions:

```{r echo = FALSE}
print(sessionInfo(), locale = FALSE)
```