Electronic Theses and Dissertations

Identifier

782

Date

2013

Document Type

Dissertation

Degree Name

Doctor of Philosophy

Major

Electrical and Computer Engr

Committee Chair

Mohammed Yeasin

Committee Member

Ramin Homayouni

Committee Member

Russell Jerry Deaton

Committee Member

Eugene C Eckstein

Abstract

Effective mining of biological literature can provide a range of services, such as hypothesis generation, semantic-sensitive information retrieval, and knowledge discovery, which can be important for understanding the confluence of different diseases, genes, and risk factors. Furthermore, integration of different tools at specific levels could be valuable. The main focus of this dissertation is developing and integrating tools for finding networks of semantically related entities. The key contribution is the design and implementation of Adaptive Robust and Integrative Analysis for finding Novel Associations (ARIANA). ARIANA is a software architecture and a web-based system for efficient and scalable knowledge discovery. It integrates semantic-sensitive analysis of text data through ontology mapping with database search technology to ensure the required specificity. ARIANA was prototyped using the Medical Subject Headings ontology and the PubMed database and has demonstrated great success as a dynamic data-driven system. ARIANA has five main components: (i) Data Stratification, (ii) Ontology-Mapping, (iii) Parameter Optimized Latent Semantic Analysis, (iv) Relevance Model, and (v) Interface and Visualization. A further contribution is the integration of ARIANA with the Online Mendelian Inheritance in Man database and the Medical Subject Headings ontology to provide gene-disease associations. Empirical studies produced several notable instances of knowledge discovery. Among them was the connection between hexamethonium and pulmonary inflammation and fibrosis. In 2001, a research study at Johns Hopkins used the drug hexamethonium on a healthy volunteer, which ended in a tragic death due to pulmonary inflammation and fibrosis. This accident might have been prevented had the researcher known of a published case report. Since the original case report in 1955, there have not been any publications regarding that association.
ARIANA extracted this knowledge even though its database contains publications from 1960 to 2012. Out of 2,545 concepts, ARIANA ranked “Scleroderma, Systemic”, “Neoplasms, Fibrous Tissue”, “Pneumonia”, “Fibroma”, and “Pulmonary Fibrosis” 13th, 16th, 38th, 174th, and 257th, respectively. Had the researcher had access to such knowledge, this drug would likely not have been used on healthy subjects. In today's world, where data and knowledge are moving away from each other, semantic-sensitive tools such as ARIANA can bridge that gap and advance the dissemination of knowledge.

Comments

Data is provided by the student.

Library Comment

Dissertation or thesis originally submitted to the local University of Memphis Electronic Theses & Dissertation (ETD) Repository.
