Empirical Bayes Methods for Everyday Statistical Problems

Smith, Derek Kyle

dc.creator	Smith, Derek Kyle
dc.date.accessioned	2020-08-23T16:25:00Z
dc.date.available	2019-01-27
dc.date.issued	2017-01-27
dc.identifier.uri	https://etd.library.vanderbilt.edu/etd-12232016-090335
dc.identifier.uri	http://hdl.handle.net/1803/15344
dc.description.abstract	This work develops an empirical Bayes approach to statistical difficulties that arise in real-world applications. Empirical Bayes methods use Bayesian machinery to obtain statistical estimates, but rather than assuming a prior distribution for the model parameters, they estimate the prior from the observed data. Misuse of these methods, as though the resulting “posterior distributions” were true Bayes posteriors, has led to limited adoption, but careful application can result in improved point estimation in a wide variety of circumstances. The first problem solved via an empirical Bayes approach deals with surrogate outcome measures. Theory for using surrogate outcomes for inference in clinical trials has been developed over the last 30 years, starting with the development of the Prentice criteria for surrogate outcomes in 1989. Theory for using surrogates outside of the clinical trials arena or to develop risk score models is lacking. In this work we propose criteria similar to the Prentice criteria for using surrogates to develop risk scores. We then identify a particular type of surrogate that violates the proposed criteria in a particular way, which we term a partial surrogate. The behavior of partial surrogates is investigated through a series of simulation studies, and an empirical Bayes weighting scheme is developed that alleviates their pathologic behavior. It is then hypothesized that a common clinical measure, change in perioperative serum creatinine level from baseline, is actually a partial surrogate. This measure is shown to display the same sort of pathologic behaviors seen in the simulation study, and these behaviors are similarly rectified using the proposed method. The result is a more accurate predictive model for both short- and long-term measures of kidney function. The second problem solved deals with likelihood support intervals. Likelihood intervals are a way to quantify statistical uncertainty.
Unlike other, more common methods for interval estimation, every value that is included in a support interval must be supported by the data at a specified level. Support intervals have not seen wide usage in practice due to a philosophic belief amongst many in the field that frequency-based or probabilistic inference is somehow stronger than inference based solely on the likelihood. In this work we develop a novel procedure based on the bootstrap for estimating the frequency characteristics of likelihood intervals. The resulting intervals have the frequency properties of the set prized by frequentists, and each individual member of the set attains a specified support level. An R package, supportInt, was developed to calculate these intervals and published on the Comprehensive R Archive Network. The third problem addressed deals with the design of clinical trials when the potential protocols for the intervention are highly variable. A meta-analysis is presented in which the difficulties this situation presents become apparent. The results of this analysis of randomized trials of perioperative beta-blockade as a potential intervention to prevent myocardial infarction in the surgical setting are completely dependent on the statistical model chosen. In particular, the choice of which elements of the trial protocol are pooled and which are allowed by the model to impact the estimate of treatment efficacy completely determines the inference drawn from the data. This problem occurs largely because the trials conducted on the intervention of interest are not richly variable in some aspects of protocol. In this section it is demonstrated that the large single-protocol designs that are frequently advocated can be replaced by multi-arm protocols to more accurately assess the question of an intervention’s potential efficacy. Simulation studies are conducted that make use of a novel adaptive randomization scheme based on an empirically estimated likelihood function.
A tool is made available in a Shiny app that allows for the conduct of further studies by the reader under a wide variety of conditions.
dc.format.mimetype	application/pdf
dc.subject	empirical bayes
dc.subject	shrinkage
dc.subject	probabilistic calibration
dc.subject	clinical trial design
dc.subject	acute kidney injury
dc.title	Empirical Bayes Methods for Everyday Statistical Problems
dc.type	dissertation
dc.contributor.committeeMember	Jeffrey Blume
dc.contributor.committeeMember	Sonya Sterba
dc.contributor.committeeMember	Robert Greevy
dc.type.material	text
thesis.degree.name	PHD
thesis.degree.level	dissertation
thesis.degree.discipline	Biostatistics
thesis.degree.grantor	Vanderbilt University
local.embargo.terms	2019-01-27
local.embargo.lift	2019-01-27
dc.contributor.committeeChair	William Dupont

Files in this item

Name: DerekKSmith.pdf
Size: 1.790Mb
Format: PDF

This item appears in the following Collection(s)

Electronic Theses and Dissertations
Electronic theses and dissertations of masters and doctoral students submitted to the Graduate School.