% % NOTE -- ONLY EDIT THE .Rnw FILE!!! The .tex file is % likely to be overwritten. % %\VignetteIndexEntry{ontoCAT package} %\VignetteDepends{} %\VignetteKeywords{Classification, DataRepresentation} %\VignettePackage{ontoCAT} \documentclass[]{article} \usepackage{times} \usepackage{hyperref} \textwidth=6.2in \textheight=8.5in %\parskip=.3cm \oddsidemargin=.1in \evensidemargin=.1in \headheight=-.3in \newcommand{\R}{\textsf{R}} \newcommand{\ontoCAT}{\Rpackage{ontoCAT}} \newcommand{\Rpackage}[1]{{\textit{#1}}} \newcommand{\Rclass}[1]{{\textit{#1}}} \newcommand{\Robject}[1]{{\texttt{#1}}} \newcommand{\Rfunction}[1]{{\texttt{#1}}} \newcommand{\term}[1]{\emph{#1}} \newcommand{\mref}[2]{\htmladdnormallinkfoot{#2}{#1}} \begin{document} \title{\ontoCAT{}: package for basic operations with ontologies} \author{\mref{mailto:natalja@ebi.ac.uk}{Natalja Kurbatova}, Tomasz Adamusiak, Pavel Kurnosov, Morris Swertz and Misha Kapushesky } \date{15 March, 2011} \maketitle Version 1.2.1. released 6 June, 2011. \section{Introduction} The \ontoCAT{} package: \begin{itemize} \item gives unified, format-independent access to ontology terms and the ontology hierarchy represented in OWL and OBO formats; \item provides basic methods for ontology traversal, such as searching for terms, listing a specific term's relations, showing paths to the term from the root element of the ontology, showing flattened-tree representations of the ontology hierarchy; \item supports working with groups of ontologies and with major public ontology repositories: searching for terms across ontologies, listing available ontologies and loading ontologies for further analysis as necessary. \end{itemize} In \ontoCAT{} the subsumption ``subclass/superclass'' is supported in a user friendly form of ``child -- parent'' relationship. No distinction is made between universals (classes) and particulars (instances) as they are both treated as ontology terms. The package is based primarily on the Ontology Common API Tasks Java library, on the OWL API and depends on rJava R package. HermiT reasoner is used to support relationships. OBO ontologies are translated by OWL API into valid OWL format that can be reasoned over. We provide two versions of \ontoCAT{}: \begin{itemize} \item Light-weight \ontoCAT{} package version is available in Bioconductor starting from release 2.7, and includes all single-ontology functionality except for methods to work with multiple ontologies and search in OLS and BioPortal. \item Full version includes batch methods and due to package size limitations is available only from the project website \url{http://www.ontocat.org/wiki/r}. \end{itemize} \section{Basic operations with ontology} The \ontoCAT{} package can load an ontology from a local file or on-the-fly from a URI. An inferred ontology view is created internally, using \textbf{Pellet} to classify the ontology. Ontologies described in OWL or OBO format are supported. \subsection{Create Ontology object} To create an object of \Rclass{Ontology} you can use one of the three methods: \begin{itemize} \item \Rfunction{getEFO()} loads the latest EFO version on the fly from the EFO SVN repository and creates \Robject{Ontology} object. \item \Rfunction{getOntology("pathToOntology")} loads the ontology described in OWL or OBO format from a local file or a UR and creates \Robject{Ontology} object. Reasoning over ontologies and extracting relationships is supported by using HermiT reasoner. \item \Rfunction{getOntologyNoReasoning("pathToOntology")} loads the ontology described in OWL or OBO format from a local file or a UR and creates \Robject{Ontology} object without reasoning over loaded ontology. \end{itemize} <>= library(ontoCAT) efo <- getEFO() biotop <- getOntology("http://purl.org/biotop/biotop.owl") @ \subsection{Ontology traversal and search} \begin{itemize} \item To find all ontology terms two functions can be used: \Rfunction{getAllTerms} - returns a list of \Rclass{OntologyTerm} objects. In turn, \Rfunction{getAllTermIds} - returns a list of term accessions. <>= getAllTerms(biotop) getAllTermIds(efo) @ \item Function \Rfunction{getTermById} returns the accession number of the term. In turn, \Rfunction{getTermNameById} returns the name of the term. <>= term_efo <- getTermById(efo,"EFO_0000322") term_biotop <- getTermById(biotop,"DeadBody") getTermNameById(efo,"EFO_0000311") getTermNameById(biotop,"EmbryonicStructure") @ \item To find out all term parents or children the following functions can be used. <>= getAllTermParentsById(efo,"EFO_0000322") getAllTermChildrenById(biotop,"DeadBody") getAllTermParents(efo,term_efo) getAllTermChildren(biotop,term_biotop) @ Arguments: appropriate \Robject{Ontology} object, the term's accession or \Robject{OntologyTerm} object. \item To find out only direct parents or children of the term functions \Rfunction{getTermParents} or \Rfunction{getTermChildren} can be used. <>= getTermParentsById(efo,"EFO_0000322") getTermChildrenById(efo,"EFO_0000322") getTermParents(efo,term_efo) getTermChildren(biotop,term_biotop) @ Arguments: appropriate \Robject{Ontology} object, the term's accession or \Robject{OntologyTerm} object. \item One more function to get term children together with the queried term accession is \Rfunction{getTermAndAllChildrenIds}. <>= getTermAndAllChildren(efo,term_efo) getTermAndAllChildrenById(biotop,"DeadBody") @ Arguments: appropriate \Robject{Ontology} object, the term's accession or \Robject{OntologyTerm} object. \item To create a flat subtree representation of the ontology "opened" down to the specified term function \Rfunction{showHierarchyDownToTerm} can be used. Hierarchy of ontology will be shown down to requested term. <>= showHierarchyDownToTermById(efo,"EFO_0000322") showHierarchyDownToTerm(efo,term_efo) @ Arguments: appropriate \Robject{Ontology} object, the term's accession or \Robject{OntologyTerm} object. \item A few simple functions allow to get term definition and synonyms: <>= getTermDefinitionsById(efo,"EFO_0000322") getTermSynonymsById(efo,"EFO_0000322") getTermDefinitions(efo,term_efo) getTermSynonyms(efo,term_efo) @ Arguments: appropriate \Robject{Ontology} object, the term's accession or \Robject{OntologyTerm} object. \item A few simple functions allow to get some metadata about the used ontology: <>= getOntologyAccession(efo) getOntologyDescription(efo) @ Arguments: appropriate \Robject{Ontology}. \item To check if the term is present in the ontology function \Rfunction{hasTerm} can be used. <>= hasTerm(efo,"CL000023") hasTerm(efo,"EFO_0000322") @ Arguments: appropriate \Robject{Ontology} and the term accession. \item The following functions can be used to search terms in the ontology: <>= searchTerm(efo,"thymus") searchTermPrefix(efo,"thym") @ Arguments: appropriate \Robject{Ontology} and string\/string prefix to search for. \item The following functions can be used to investigate the ontology hierarchy: <>= isRootById(efo,"EFO_0000322") isRoot(efo,term_efo) getRoots(efo) getRootIds(efo) @ For some ontologies these functions might fail when the ontology used was not design to have root classes. \end{itemize} \subsection{Relations} \begin{itemize} \item To list all supported relations in ontology the following function can be used: <>= getOntologyRelationNames(efo) @ \item To list relations that has particular term the following function can be used: <>= getTermRelationNamesById(efo,"EFO_0000322") getTermRelationNames(efo,term_efo) @ Arguments: appropriate \Robject{Ontology} object, the term's accession or \Robject{OntologyTerm} object. \item Please use the following functions to find out terms that are in some relation with the term of interest: <>= getTermRelationsById(efo,"EFO_0000322","has_part") getTermRelations(efo,term_efo,"has_part") @ Arguments: appropriate \Robject{Ontology} object, the term's accession or \Robject{OntologyTerm} object, relation name. \end{itemize} \subsection{Functions specific for EFO ontology} There are few functions specific for EFO class hierarchy to work with EFO branch roots. <>= getEFOBranchRootIds(efo) isEFOBranchRootById(efo,"EFO_0000322") isEFOBranchRoot(efo,term_efo) @ \subsection{Ontology term object} There are only three functions for the \Rclass{OntologyTerm} class. \begin{itemize} \item \Rfunction{getLabel} to get term name; \item \Rfunction{getAccession} to get term accession; \item \Rfunction{show} to view the term. \end{itemize} <>= term <- getTermById(efo,"EFO_0000322") getLabel(term) getAccession(term) term @ \section{Methods to Work Across Multiple Ontologies} The \ontoCAT{} package provides methods to work with groups or ``batches'' of ontologies, local or web-based. Users can search for the terms across such resources and load specified individual ontologies by accession number. An Internet connection is required to load remote ontologies. \subsection{Create Ontology Batch} To create an object of \Rclass{OntologyBatch} you can use one of the two methods: \begin{itemize} \item \Rfunction{getEFOBatch()} without any arguments loads the EFO ontology. Ontologies can be added to an existing batch as needed via the \Rfunction{addOntology()} method. \item \Rfunction{getOntologyBatch("pathToDir")} creates a local batch of ontologies, taking a single argument: the path to the local directory containing ontology files. \end{itemize} <>= library(ontoCAT) batch <- getEFOBatch() @ \subsection{Term Searching} After a batch of ontologies is created, various methods for term searching become available. \begin{itemize} \item To search term in all ontlogies included into the batch: <>= searchTermInBatch(batch,"thymus") @ \item Searches in two public ontology repositories are supported. The NCBO BioPortal is a web-based application for accessing and sharing biomedical ontologies, currently hosting \textbf{241} ontologies. The Ontology Lookup Service (OLS) is a web services SOAP API for querying multiple ontologies, hosting \textbf{81} ontologies. <>= searchTermInBioportal(batch,"thymus") searchTermInOLS(batch,"thymus") @ \item Search by using all available sources will search for the term in local batch and addionally in OLS and BioPortal <>= searchTermInAll(batch,"thymus") @ \end{itemize} \subsection{Other Batch Methods} \begin{itemize} \item To add ontology into the batch: <>= addOntology(batch,"http://purl.org/biotop/biotop.owl") addEFO(batch) @ \item To list all available ontologies in the local batch: <>= listLoadedOntologies(batch) @ \item To get ontology parser in order to work with single ontology parsing and querying methods: <>= ontology <- getOntologyFromBatch(batch,"EFO") @ The second argument is the ontology accession. \end{itemize} \section*{References} \begin{enumerate} \item Adamusiak T, Burdett T, van der Velde K J, Abeygunawardena N, Antonakaki D, Parkinson H and Swertz M: OntoCAT -- a simpler way to access ontology resources. \emph{Available from Nature Precedings} \url{http://dx.doi.org/10.1038/npre.2010.4666.1} (2010) \item Malone J, Holloway E, Adamusiak T, Kapushesky M, Zheng J, Kolesnikov N, Zhukova A, Brazma A, Parkinson H: Modeling Sample Variables with an Experimental Factor Ontology. \emph{Bioinformatics} 2010, \textbf{26}(8):1112--1118 \item Experimental Factor Ontology \url{http://www.ebi.ac.uk/efo} \item Ontology Common API Tasks java library \url{http://www.ontocat.org} \item OntoCAT website \url{http://www.ontocat.org/wiki/r} \item Java sources and javadocs: \url{http://sourceforge.net/projects/ontocat/files/} \end{enumerate} \end{document}