Friday, March 22, 2013
Data mining systems improve cost and quality of health care, or do they?
Several e-mail lists I am on were abuzz last week about the publication of a paper described in a press release from Indiana University as demonstrating that "machine learning, the same computer science discipline that helped create voice recognition systems, self-driving cars and credit card fraud detection systems, can drastically improve both the cost and quality of health care in the United States." The press release referred to a study published by an Indiana faculty member in the journal Artificial Intelligence in Medicine.
While I am a proponent of computer applications that aim to improve the quality and cost of healthcare, I also believe we must be careful about the claims being made for them, especially those derived from results from scientific research.
After reading and analyzing the paper, I am skeptical of the claims made not only by the press release but also by the authors themselves. My concern lies less with their research methods, although I have some serious qualms about them that I will describe below, than with the press release issued by their university public relations office. Furthermore, as always seems to happen when technology is hyped, the press release was picked up and echoed across the Internet, followed by the inevitable conflation of its findings. Sure enough, one high-profile blogger wrote, "physicians who used an AI framework to make patient care decisions had patient outcomes that were 50% better than physicians who did not use AI." It is clear from the paper that physicians did not actually use such a framework, which was only applied retrospectively to clinical data.
What exactly did the study show? Basically, the researchers obtained a small data set for one clinical condition in one institution's electronic health record and applied some complex data mining techniques to show that lower cost and better outcomes could be achieved by following the options suggested by the machine learning algorithm instead of what the clinicians actually did. The claim, therefore, is that if the clinicians had followed the data mining output instead of their own decision-making, better and cheaper care would have ensued.
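The framework underlying the paper is a Markov decision process (MDP), in which each state of a patient's condition carries treatment options with costs and probabilistic outcomes, and an optimal treatment policy is computed from retrospective transition data. As a rough illustration only, not the authors' actual model, a toy MDP solved by value iteration might look like this (all states, treatments, costs, and probabilities below are invented for demonstration):

```python
# Toy illustration of the Markov decision process (MDP) formalism on which the
# paper's framework is based. All states, treatments, costs, and transition
# probabilities here are invented; none come from the study.

def value_iteration(states, actions, transitions, reward, gamma=0.9, tol=1e-6):
    """Compute the optimal value function and greedy policy of a finite MDP.

    actions[s]        -> treatments available in state s
    transitions[s][a] -> list of (next_state, probability) pairs
    reward[s][a]      -> immediate reward (outcome benefit minus treatment cost)
    """
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(
                reward[s][a] + gamma * sum(p * V[s2] for s2, p in transitions[s][a])
                for a in actions[s]
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    # The policy picks, in each state, the action maximizing expected long-run value.
    policy = {
        s: max(
            actions[s],
            key=lambda a: reward[s][a]
            + gamma * sum(p * V[s2] for s2, p in transitions[s][a]),
        )
        for s in states
    }
    return V, policy


# Two states: the patient is "ill" or in "remission" (absorbing).
states = ["ill", "remission"]
actions = {"ill": ["drug_A", "drug_B"], "remission": ["none"]}
transitions = {
    "ill": {
        "drug_A": [("remission", 0.5), ("ill", 0.5)],  # cheaper, less effective
        "drug_B": [("remission", 0.8), ("ill", 0.2)],  # pricier, more effective
    },
    "remission": {"none": [("remission", 1.0)]},
}
reward = {
    "ill": {"drug_A": -1.0, "drug_B": -3.0},  # negative reward = treatment cost
    "remission": {"none": 1.0},               # ongoing benefit of staying well
}

V, policy = value_iteration(states, actions, transitions, reward)
```

With these made-up numbers the cheaper drug happens to maximize long-run value, which is the kind of result the paper reports; the point of the critique that follows is that such a result is only as trustworthy as the cost and outcome data fed into the model.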
As in many scientific papers about technology, the paper goes into exquisite detail about the data mining algorithms and the experiments comparing them. But the paper unfortunately provides very little description of the clinical data itself. There is a reference to another paper from a conference that appears to describe the data set, but it is still not clear how the data were applied to evaluate the algorithms.
I have a number of methodological problems with the paper. First is the paucity of clinical detail about the data. The authors refer to a metric called the "outcomes rating scale" of the "client-directed outcome informed (CDOI) assessment." No details are provided as to exactly what this scale measures or how differences in measurement correlate with improved clinical outcomes. Furthermore, the care-process variables over which the data mining algorithm supposedly outperforms the clinicians are not described either. Therefore anyone hoping to understand the clinical value this approach is claimed to deliver is unable to do so.
A second problem is that there is no discussion of the cost data or of what cost perspective (e.g., system, clinician, societal) is taken. This is a common problem that plagues many studies in healthcare that attempt to measure costs. Given the relatively modest amounts of money spent on the care reported in their results, amounting to only a few hundred dollars per patient, it is unlikely that the data capture the full cost of treatment for each patient, or over an appropriate time period. If my interpretation of the low cost figures is correct (which is difficult to discern from reading the paper, again due to the lack of details), the data do not include the cost of clinician time, facilities, or longer-term costs beyond the time frame of the data set.
If that is indeed the case, then it is particularly problematic for a machine learning system, since such systems make inferences limited to the data provided to the model. If poor data are provided to the model, its "conclusions" are suspect. (This raises a side issue as to whether there is truly "artificial intelligence" here, since the only intelligence applied by the system comes from the models developed by its human creators.)
A third concern is that this is a modeling study. As every evaluation methodologist knows, modeling studies are limited in their ability to assign cause and effect. There is certainly a role in informatics science for modeling studies, although we saw recently that such studies have their limits, especially when revisited over the long run. In this study, there may have been reasons why the clinicians followed the more expensive path, or confounding reasons why such patients had worse outcomes, that cannot be captured by the approach used.
This is related to the final and most serious problem of the work, which is that a modeling evaluation is a very weak form of evidence for demonstrating the value of an intervention. If the authors truly wanted to show the benefits of the system and approach they developed, they should have performed a randomized controlled trial comparing their intervention with an appropriate control group. That would have been the type of study the blogger mentioned above erroneously described this one to be. Such a study design would address some of the more vexing problems we face in informatics, such as whether advice coming from a computer will change clinician behavior, or whether, when such systems are introduced into the "real world," the "advice" provided will prospectively lead to better outcomes.
I do believe that the kind of work addressed by this paper is important, especially as we move into the era of personalized medicine. As eloquently described by Stead and colleagues, healthcare will soon be reaching the point where the number of data points required for clinical decisions will exceed the bounds of human cognition. (It probably already has.) Therefore clinicians will require aids to their cognition provided by information systems, perhaps one like that described in the study.
But such aids require, like everything else in medicine, robust evaluative research to demonstrate their value. The methods used in this paper may indeed be the ones to provide this value, but the implementation and evaluation described here miss the mark. That miss is further exacerbated by the hype and conflation that ensued after the paper was published.
What can we learn from this paper and its ensuing hype? First, bold claims require bold evidence to back them up. In the case of showing value for an approach in healthcare, be it a test, a treatment, or an informatics application, we must use evaluation methods that provide the best evidence for the claim. That is not always a randomized controlled trial, but in this situation it would be, and the modeling techniques used here are really just preliminary data that might justify an actual clinical trial. Second, when we perform technology evaluation, we need to describe, and ideally release, all of the clinical data so that others can analyze and even replicate the results. Finally, while we all want to disseminate the results of our research to the widest possible audience, we need to be realistic in explaining what we accomplished and what its larger implications are.
Bennett, C. and K. Hauser (2013). Artificial intelligence framework for simulating clinical decision-making: a Markov decision process approach. Artificial Intelligence in Medicine. Epub ahead of print.
Bennett, C., T. Doub, A. Bragg, J. Luellen, C. VanRegenmorter, J. Lockman and R. Reiserer (2011). Data mining session-based patient reported outcomes (PROs) in a mental health setting: toward data-driven clinical decision support and personalized treatment. 2011 First IEEE International Conference on Healthcare Informatics, Imaging and Systems Biology (HISB 2011), San Jose, CA. 229-236.
Drummond, M. and M. Sculpher (2005). Common methodological flaws in economic evaluations. Medical Care. 43(7 Suppl): 5-14.
Stead, W., J. Searle, H. Fessler, J. Smith and E. Shortliffe (2011). Biomedical informatics: changing what physicians need to know and how they learn. Academic Medicine. 86: 429-434.
This post by William Hersh, MD, FACP, Professor and Chair, Department of Medical Informatics & Clinical Epidemiology, Oregon Health & Science University, appeared on his blog Informatics Professor, where he posts his thoughts on various topics related to biomedical and health informatics.