Friday, July 12, 2013
What is a thinking informatician to think of IBM's Watson?
One of the computer applications that has received the most attention in health care is Watson, the IBM system that achieved fame by beating humans at the television game show, Jeopardy!. Sometimes it seems there is such hype around Watson that people do not realize what the system actually does.
Watson is a type of computer application known as a "question-answering system." It works similarly to a search engine, but instead of retrieving "documents" (e.g., articles, Web pages, images, etc.), it outputs "answers" (or at least short snippets of text that are likely to contain answers to questions posed to it).
As one who has done research in information retrieval (IR, also sometimes called "search") for over two decades, I am interested in how Watson works and how well it performs on the tasks for which it is used. As someone also interested in IR applied to health and biomedicine, I am even more curious about its health care applications.
Since winning at Jeopardy!, Watson has "graduated medical school" and "started its medical career." The latter reference touts Watson as an alternative to the "meaningful use" program providing incentives for electronic health record (EHR) adoption, but I see Watson as a very different application, and one that could benefit from the growing quantity of clinical data, especially the standards-based data we will hopefully see in Stage 2 of the program. (I am also skeptical of some of the proposed uses of Watson, such as its "crunching" through EHR data to "learn" medicine. Those advocating that Watson perform this task need to understand the limits of observational studies in medicine.)
One concern I have had about Watson is that the publicity around it has been mostly news articles and press releases. As an evidence-based informatician, I would like to see more scientific analysis, i.e., what does Watson do to improve healthcare and how successful is it at doing so? I was therefore pleased to come across a journal article evaluating Watson [1]. In this first evaluation in the medical domain, Watson was trained using several resources from internal medicine, such as ACP Medicine, PIER, Merck Manual, and MKSAP.
Watson was applied, and further trained with 5,000 questions, in Doctor's Dilemma, a competition somewhat like Jeopardy! that is run by the American College of Physicians and in which medical trainees participate each year. A sample question from the paper is "Familial adenomatous polyposis is caused by mutations of this gene," with the answer being "APC Gene." (Googling the text of the question gives the correct answer at the top of its ranking for this and the two other sample questions provided in the paper.)
Watson was evaluated on an additional 188 unseen questions [1]. The primary outcome measure was recall at 10 (whether the correct answer appeared among the top ten answers shown), and performance varied from 0.49 for the baseline system to 0.77 for the fully adapted and trained system. In other words, for 77% of these 188 questions, the correct answer was among the top ten answers Watson provided.
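To make the metric concrete, here is a minimal sketch of recall at 10, read as the fraction of questions whose correct answer shows up among the top ten candidates. The function name, the case-insensitive exact-match rule, and the toy data (apart from the APC gene example quoted above) are my own assumptions for illustration, not details from the paper.

```python
def recall_at_k(ranked_answers, gold_answers, k=10):
    """Fraction of questions whose gold answer appears in the top k candidates.

    ranked_answers: one ranked candidate list per question (best first).
    gold_answers:   one correct answer string per question.
    Matching is case-insensitive exact match, which is an assumption;
    a real evaluation would likely need synonym-aware answer matching.
    """
    if len(ranked_answers) != len(gold_answers):
        raise ValueError("need one ranked list per gold answer")
    if not gold_answers:
        return 0.0

    hits = 0
    for candidates, gold in zip(ranked_answers, gold_answers):
        top_k = [c.strip().lower() for c in candidates[:k]]
        if gold.strip().lower() in top_k:
            hits += 1
    return hits / len(gold_answers)


# Toy data: the first question is the sample from the paper; the candidate
# lists and the second question's answers are made up for illustration.
ranked = [
    ["MUTYH", "APC gene", "TP53"],  # correct answer ranked second
    ["aspirin", "warfarin"],        # correct answer never returned
]
gold = ["APC Gene", "heparin"]

print(recall_at_k(ranked, gold, k=10))  # 0.5
print(recall_at_k(ranked, gold, k=1))   # 0.0
```

Note that a question counts as a success whether the right answer is ranked first or tenth, so the measure says little about how much reading a user must do to find it.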
We can debate whether this is good performance for a computer system, particularly one being touted as providing knowledge to expert users. A more disappointing aspect of the study is its limitations, which I would have raised had I been asked to peer-review the paper.
The first question I had was, how does Watson's performance compare with other systems, including IR systems such as Google or PubMed? As noted above, for the three example questions provided in the paper, Google gave the answer in the snippet of text from the top-ranking Web page each time. It would be interesting to know how other online systems would compare with Watson's performance on the questions used in this study.
Another problem with the paper is that none of the 188 questions were provided, not even as an appendix. In all of the evaluation studies I have performed (e.g., [2-4]), I have always provided some or all of the questions used in the study so the reader could better assess the results.
A final concern was that Watson was not evaluated in the context of a real user. While systems usually need to be evaluated from the "system perspective" before being assessed with users, it would have been informative to see whether Watson provided novel information or altered decision-making in real-world clinical scenarios.
Nonetheless, I am encouraged that a study like this was done, and I hope that more comprehensive studies will be undertaken in the near future. I do maintain enthusiasm for systems like Watson and am confident they will find a role in medicine. But we need to be careful about hype, and we must employ robust evaluation methods to test our claims as well as to determine how such systems are best used.
1. Ferrucci, D, Levas, A, et al. (2012). Watson: Beyond Jeopardy! Artificial Intelligence. Epub ahead of print.
2. Hersh, WR, Pentecost, J, et al. (1996). A task-oriented approach to information retrieval evaluation. Journal of the American Society for Information Science. 47: 50-56.
3. Hersh, W, Turpin, A, et al. (2001). Challenging conventional assumptions of automated information retrieval with real users: Boolean searching and batch retrieval evaluations. Information Processing and Management. 37: 383-402.
4. Hersh, WR, Crabtree, MK, et al. (2002). Factors associated with success for searching MEDLINE and applying evidence to answer clinical questions. Journal of the American Medical Informatics Association. 9: 283-293.
This post by William Hersh, MD, FACP, Professor and Chair, Department of Medical Informatics & Clinical Epidemiology, Oregon Health & Science University, appeared on his blog Informatics Professor, where he posts his thoughts on various topics related to biomedical and health informatics.