Ontology integration for biomedical data

Abstract

Gene similarity has been an area of great interest in numerous fields for decades, as it can provide insights into the evolutionary relationships among different species. This knowledge is particularly useful for advancing biotechnologies, discovering new drugs and treatments for various conditions, and improving the characteristics of bred crops and animals. DNA sequencing enables gene annotation, which facilitates the identification of similarities between genes. Similar genes from different species are interesting candidates for studying gene functional similarity. The Gene Ontology (GO) provides a standardized vocabulary for gene annotation, and its annotations are considered the ground truth when describing gene properties. Nevertheless, the plethora of biomedical articles accessible online can be a valuable source of additional information about genes. These observations motivate this paper, which investigates how graph theory could help scientists transfer knowledge about gene functionality between different plants. We present an overview of the methodology and the design of the system we used to assess to what extent subgraph similarity based on annotated data yields results similar to those already established as ground truth for the model plant Arabidopsis thaliana and its counterpart, Solanum lycopersicum. We also present our conclusions and a discussion of the quality of the datasets used throughout this research.