A Machine Learning-Based Information Retrieval Framework for Molecular Medicine Predictive Models

Wehbe, Firas Hazem

dc.creator	Wehbe, Firas Hazem
dc.date.accessioned	2020-08-22T00:10:49Z
dc.date.available	2013-04-16
dc.date.issued	2011-04-16
dc.identifier.uri	https://etd.library.vanderbilt.edu/etd-03282011-223440
dc.identifier.uri	http://hdl.handle.net/1803/11612
dc.description.abstract	Molecular medicine encompasses the application of molecular biology techniques and knowledge to the prevention, diagnosis and treatment of diseases and disorders. Statistical and computational models can predict clinical outcomes, such as prognosis or response to treatment, based on the results of molecular assays. For advances in molecular medicine to translate into clinical results, clinicians and translational researchers need to have up-to-date access to high-quality predictive models. The large number of such models reported in the literature is growing at a pace that overwhelms the human ability to manually assimilate this information. Therefore, the problem of retrieving and organizing the vast amount of published information within this domain needs to be addressed. The inherent complexity of this domain and the fast pace of scientific discovery make this problem particularly challenging. This dissertation describes a framework for retrieval and organization of clinical bioinformatics predictive models. A semantic analysis of this domain was performed and informed the design of the framework. Specifically, it allowed the development of a specialized annotation scheme for published articles that can be used for meaningful organization, indexing, and efficient retrieval. This annotation scheme was codified using an annotation form and an accompanying guidelines document, which multiple human experts used to annotate over 1000 articles. These datasets were then used to train and test support vector machine (SVM) classifiers. The classifiers were designed to provide a scalable mechanism to replicate human experts’ ability (1) to retrieve relevant MEDLINE articles and (2) to annotate these articles using the specialized annotation scheme.
The machine learning classifiers showed very good predictive ability that was also shown to generalize to different disease domains and to datasets annotated by independent experts. The experiments highlighted the need for providing unambiguous operational definitions of the complex concepts used for semantic annotations. The impact of the semantic definitions on the quality of manual annotations and on the performance of the machine learning classifiers was discussed.
dc.format.mimetype	application/pdf
dc.subject	information retrieval
dc.subject	molecular medicine
dc.subject	translational bioinformatics
dc.subject	machine learning
dc.title	A Machine Learning-Based Information Retrieval Framework for Molecular Medicine Predictive Models
dc.type	dissertation
dc.contributor.committeeMember	Steven H. Brown
dc.contributor.committeeMember	Daniel R. Masys
dc.contributor.committeeMember	Pierre Massion
dc.contributor.committeeMember	Hua Xu
dc.type.material	text
thesis.degree.name	PHD
thesis.degree.level	dissertation
thesis.degree.discipline	Biomedical Informatics
thesis.degree.grantor	Vanderbilt University
local.embargo.terms	2013-04-16
local.embargo.lift	2013-04-16
dc.contributor.committeeChair	Cynthia S. Gadd
dc.contributor.committeeChair	Constantin Aliferis

Files in this item

Name: Firas_Wehbe_Dissertation_20110 ...
Size: 3.623Mb
Format: PDF


This item appears in the following Collection(s)

Electronic Theses and Dissertations
Electronic theses and dissertations of masters and doctoral students submitted to the Graduate School.
