Exploring Adverse Drug Effect Discovery from Data Mining of Clinical Notes
Smith, Joshua Carl
2012-07-05
Abstract
Many medications have potentially serious adverse effects that are detected only after FDA approval. After 80 million people worldwide received prescriptions for the drug rofecoxib (Vioxx), its manufacturer withdrew it from the marketplace in 2004. Epidemiological data showed that it increases the risk of heart attack and stroke. Recently, the FDA warned that the commonly prescribed statin drug class (e.g., Lipitor, Zocor, Crestor) may increase the risk of memory loss and Type 2 diabetes. These incidents illustrate the difficulty of identifying adverse effects of prescription medications during premarketing trials. Some types of adverse effects (e.g., those requiring years of exposure) can be detected only through postmarketing surveillance. We explored the use of data mining on clinical notes to detect novel adverse drug effects. Using UMLS and other data sources, we constructed a knowledge base that classifies drug-finding pairs as “currently known adverse effects” (drug causes finding), “known indications” (drug treats/prevents finding), or “unknown relationship”. We used natural language processing (NLP) to extract current medications and clinical findings (including diseases) from 360,000 de-identified history and physical examination (H&P) notes. We identified 35,000 “interesting” co-occurrences of medication-finding concepts that exceeded threshold probabilities of appearance. These involved ~600 drugs and ~2000 findings. The identified pairs include several that the FDA recognized as harmful in postmarketing surveillance, including rofecoxib and heart attack, rofecoxib and stroke, statins and diabetes, and statins and memory loss. Our preliminary results illustrate both the problems and the potential of using data mining of clinical notes for adverse drug effect discovery.
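The co-occurrence step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which statistic or thresholds were used, so the lift ratio, the `min_count` and `min_lift` parameters, the function names, and the placeholder concept names (`drug_a`, `finding_x`, and so on) below are all assumptions made for illustration, not the method of the thesis. The sketch assumes each note has already been reduced by NLP to a set of drug concepts and a set of finding concepts.

```python
from collections import Counter
from itertools import product


def interesting_pairs(notes, min_count=2, min_lift=1.5):
    """Flag drug-finding pairs that co-occur more often than expected.

    notes: iterable of (set_of_drugs, set_of_findings), one tuple per note.
    The lift statistic and both thresholds are illustrative assumptions.
    """
    n_notes = 0
    drug_counts = Counter()
    finding_counts = Counter()
    pair_counts = Counter()
    for drugs, findings in notes:
        n_notes += 1
        drug_counts.update(drugs)
        finding_counts.update(findings)
        # Every drug in the note is paired with every finding in the note.
        pair_counts.update(product(drugs, findings))

    flagged = {}
    for (drug, finding), n_pair in pair_counts.items():
        if n_pair < min_count:
            continue
        # lift = P(drug, finding) / (P(drug) * P(finding))
        lift = n_pair * n_notes / (drug_counts[drug] * finding_counts[finding])
        if lift >= min_lift:
            flagged[(drug, finding)] = lift
    return flagged


def classify(pair, knowledge_base):
    """Look up a pair in a (hypothetical) knowledge base of known relations."""
    return knowledge_base.get(pair, "unknown relationship")


# Placeholder data only; these are not real drugs, findings, or results.
toy_notes = [
    ({"drug_a"}, {"finding_x"}),
    ({"drug_a"}, {"finding_x"}),
    ({"drug_a"}, {"finding_x", "finding_y"}),
    ({"drug_b"}, {"finding_y"}),
    ({"drug_b"}, {"finding_y"}),
    ({"drug_b"}, set()),
]
toy_kb = {("drug_b", "finding_y"): "known indication"}

for pair, lift in interesting_pairs(toy_notes).items():
    print(pair, round(lift, 2), classify(pair, toy_kb))
```

Pairs that pass the threshold and come back as "unknown relationship" would be the candidates for further review. Co-occurrence alone does not establish that the drug caused the finding.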