Evaluating uses of machine learning in propensity score estimation on time-to-event data: A simulation study

dc.contributor.advisor: Hackstadt, Amber
dc.contributor.advisor: Stewart, Thomas
dc.creator: Ji, Xiangyu
dc.date.accessioned: 2022-01-10T16:47:12Z
dc.date.created: 2021-12
dc.date.issued: 2021-11-19
dc.date.submitted: December 2021
dc.identifier.uri: http://hdl.handle.net/1803/16993
dc.description.abstract: Observational studies with time-to-event outcomes based on electronic health records have been widely used to estimate the effects of treatments, exposures, and medical interventions on health outcomes in a typical clinical setting. Unlike randomized clinical trials, observational studies lack random treatment assignment, so the effects of exposures are confounded by potential differences in the distribution of covariates between treatment and control groups. It is essential to minimize these confounding effects with statistical methods such as propensity scores (PS). Propensity scores are commonly estimated from baseline covariates with parametric methods such as logistic regression, which rely on strong assumptions about the data distribution and model specification. Machine learning methods have recently been recognized as alternative PS estimation methods, and Super Learner allows an optimal combination of multiple algorithms. The objective of this study is to implement a parametric model (logistic regression) and data-driven models (machine learning methods) to estimate PS, adjust the treatment effect estimation models with or without PS-based weighting, and evaluate the performance of different combinations of PS estimation and application methods on simulated survival data inspired by the Right Heart Catheterization dataset. We found that machine learning methods did not perform better than logistic regression for PS estimation in terms of the bias of the treatment effect estimator. Compared to model weighting, adjusting for covariates in the treatment effect estimation model might reduce bias in large samples but could increase bias in small samples.
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: machine learning
dc.subject: super learner
dc.subject: propensity score weighting
dc.subject: propensity score matching weight
dc.subject: propensity score overlap weight
dc.subject: survival data
dc.subject: electronic health record
dc.title: Evaluating uses of machine learning in propensity score estimation on time-to-event data: A simulation study
dc.type: Thesis
dc.date.updated: 2022-01-10T16:47:12Z
dc.type.material: text
thesis.degree.name: MS
thesis.degree.level: Masters
thesis.degree.discipline: Biostatistics
thesis.degree.grantor: Vanderbilt University Graduate School
local.embargo.terms: 2023-12-01
local.embargo.lift: 2023-12-01
dc.creator.orcid: 0000-0002-4875-1108
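
The subject entries above name propensity score weighting, matching weights, and overlap weights. As a rough illustration only, the sketch below computes these weights from already-estimated propensity scores using their standard textbook definitions. It is not taken from the thesis: the function name `ps_weights`, its arguments, and the example values are hypothetical, and the thesis may define or apply the weights differently.

```python
def ps_weights(ps, treated, kind="overlap"):
    """Return one weight per subject from estimated propensity scores.

    ps      : estimated probabilities of receiving treatment, each in (0, 1)
    treated : 1 if the subject was treated, 0 if control
    kind    : "iptw", "overlap", or "matching" (hypothetical option names)
    """
    weights = []
    for e, z in zip(ps, treated):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        # Probability of the treatment arm the subject actually received.
        p_own = e if z == 1 else 1.0 - e
        if kind == "iptw":
            # Inverse probability of treatment weight: 1/e or 1/(1 - e).
            w = 1.0 / p_own
        elif kind == "overlap":
            # Overlap weight: probability of the opposite arm.
            w = 1.0 - p_own
        elif kind == "matching":
            # Matching weight: min(e, 1 - e) over the own-arm probability.
            w = min(e, 1.0 - e) / p_own
        else:
            raise ValueError("unknown weight type: " + kind)
        weights.append(w)
    return weights


# Illustrative values only, not data from the thesis.
example_ps = [0.2, 0.5, 0.8]
example_treated = [1, 0, 1]
overlap = ps_weights(example_ps, example_treated, kind="overlap")
matching = ps_weights(example_ps, example_treated, kind="matching")
iptw = ps_weights(example_ps, example_treated, kind="iptw")
```

In a weighted analysis, such weights would typically be passed as case weights to the outcome model (for example a Cox model). That step is left out here because it depends on the modeling library used.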

