Friday, August 30, 2013
Data scientists must also be research methodology scientists
I had the chance last week to attend a conference in Singapore, Big Data and Analytics in Health Care. It was an interesting blend of academics, operational health information technology professionals, and data scientists from companies in the emerging analytics market. I was also in Singapore for the final in-person session of the 10x10 ("ten by ten") introductory informatics course we offer there.
The talks were all interesting, but I was struck by the difference in content and tone between the academic and clinical operations speakers and those from analytics companies who called themselves "data scientists." Whereas the academic and clinical operational types were cautious about their methods and results, the data scientists implied their techniques would revolutionize healthcare and threw around terms like "big data" and "analytics" at every turn.
One of the latter types showed a "model" of the pathways leading to good (conservative) and bad (surgery) outcomes in back pain, with the intermediate nodes representing actions along the path, such as medication use, physical therapy, and chiropractic care. It was not clear to me how this model could be used to improve care, and I am not sure the speaker really understood that correlations do not prove causality. A second such speaker showed some interesting correlations between words and phrases that occur in clinical narratives of patients with diabetes and aspects of their care. I understand machine learning and how it might be used to "learn" things about patients with diabetes, but I did not see any evidence that this work would lead to any kind of improved patient outcomes.
Another concern I have about proponents of clinical data analytics is their presumption that their algorithms can somehow take all of the growing amount of operational electronic health record (EHR) data and automatically turn it into medical knowledge, as if they could turn a crank with data going in and knowledge emerging. I do have great enthusiasm for some of what can be done with this data, but I also have concerns about the quality and completeness of this data as well as the causality issues that arise without controlling observations in experimental ways.
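To make the causality concern concrete, here is a minimal sketch of confounding by indication. Every count in it is invented purely for illustration and comes from neither the conference talks nor any real data set. It assumes a hypothetical back pain cohort in which sicker patients are more likely to get surgery, while treatment itself has no effect on outcome within either severity group.

```python
from fractions import Fraction

# Hypothetical cohort; all counts are invented for illustration only.
# Each row: (severity, treatment, patients, patients_with_good_outcome)
COHORT = [
    ("mild", "surgery", 100, 80),
    ("mild", "conservative", 900, 720),
    ("severe", "surgery", 700, 280),
    ("severe", "conservative", 300, 120),
]


def good_outcome_rate(rows, treatment, severity=None):
    """Share of patients with a good outcome for one treatment,
    optionally restricted to a single severity stratum."""
    selected = [
        r for r in rows
        if r[1] == treatment and (severity is None or r[0] == severity)
    ]
    patients = sum(r[2] for r in selected)
    good = sum(r[3] for r in selected)
    return Fraction(good, patients)


def rate_difference(rows, severity=None):
    """Good-outcome rate for surgery minus the rate for conservative care."""
    return (good_outcome_rate(rows, "surgery", severity)
            - good_outcome_rate(rows, "conservative", severity))


# Naive comparison that ignores severity, as a raw correlation would.
naive = rate_difference(COHORT)

# The same comparison made separately within each severity stratum.
by_stratum = {s: rate_difference(COHORT, s) for s in ("mild", "severe")}

print("naive difference:", float(naive))
for stratum, diff in by_stratum.items():
    print(f"difference within {stratum} stratum:", float(diff))
```

In this made-up cohort the naive comparison shows surgery with a good-outcome rate 25 percentage points lower than conservative care, yet the difference within each severity stratum is exactly zero. The apparent harm comes entirely from who was selected for surgery. Real EHR data rarely record every such confounder, which is why stratifying or adjusting cannot fully substitute for experimental control.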
I had the opportunity to speak at the conference as well, and gave a talk pulling together my cautious enthusiasm for using operational clinical data for research and other analytical purposes. This was the first public talk I have given on this topic since the publication, with 10 colleagues, of a paper on caveats for the use of operational electronic health record data in comparative effectiveness research in the journal Medical Care (1). The paper was commissioned by AcademyHealth and is part of a special supplement of the journal devoted to electronic data methods.
Our paper notes that while there are many opportunities for using clinical data for research and analytics, we also must remember the limitations of such data. In particular, EHR and other clinical data may be:
--Inaccurate - data entry is not always a top priority for clinicians, and they may take shortcuts, such as copy-and-paste
--Incomplete - patients do not get all of their care in one setting
--Transformed in ways that undermine meaning - coding for billing is the best known example of this
--Unrecoverable for research - data may be in clinical narratives or other less accessible places
--Of unknown provenance - we need to know where data comes from and how likely it is to be accurate
--Of inappropriate granularity - data too coarse for research purposes
--Incompatible with research protocols - patients are not always diagnosed and treated consistently with best practices
Despite these caveats, I am optimistic that there will be uses for this data, especially if we can generate it in a standards-based way and otherwise improve its quality. Hopefully clinicians, researchers, patients, public health authorities, quality improvement leaders, and others who might benefit from the data will have an incentive to improve it through more meticulous entry as well as use of standards, such as those prescribed by Stage 2 of the meaningful use program (2). For many clinicians, especially these days, the EHR can be a data sinkhole into which they enter data, spending a great deal of time but getting little in return.
The bottom line is that while data scientists may be able to generate interesting and important results with their methods, they must also understand basic principles of research science, such as inferential statistics, clinical significance, and cause and effect. In addition, they must demonstrate their methods lead to improvements in health and/or healthcare, and are not just generating interesting associations. In other words, they must show evidence that their methods add value, just as medical care and informatics are required to do.
1. Hersh, WR, Weiner, MG, et al. (2013). Caveats for the use of operational electronic health record data in comparative effectiveness research. Medical Care. 51(Suppl 3): S30-S37, http://journals.lww.com/lww-medicalcare/Fulltext/2013/08001/Caveats_for_the_Use_of_Operational_Electronic.7.aspx.
2. Metzger, J and Rhoads, J (2012). Summary of Key Provisions in Final Rule for Stage 2 HITECH Meaningful Use. Falls Church, VA, Computer Sciences Corp. http://assets1.csc.com/health_services/downloads/CSC_Key_Provisions_of_Final_Rule_for_Stage_2.pdf.
William Hersh, MD, FACP, Professor and Chair, Department of Medical Informatics & Clinical Epidemiology, Oregon Health & Science University, posts his thoughts on various topics related to biomedical and health informatics.