Big Data & Health Care: Speaking with Dr. Hallie Prescott
For the latest installment in our series on Big Data & Health Care, we sat down with Dr. Hallie Prescott to discuss the use of structured data and unstructured data in continuously learning health systems. Hallie Prescott, MD, MSc, is Assistant Professor in Internal Medicine in the Division of Pulmonary & Critical Care Medicine at the University of Michigan Health System. She is also a research scientist with the HSR&D Center for Clinical Management Research and staff physician at the VA Ann Arbor Healthcare System.
Real World Health Care: Why do you think Big Data is pervasive in the business world, but not in the health care world?
Hallie Prescott: That’s a fairly common observation and one that is difficult to get to the bottom of. There are probably several factors limiting the uptake of big data in health care. First, there is the issue of information privacy. Health care data needs to be highly secure, which can make it difficult to share data across health systems. This type of roadblock tends to limit big data initiatives in health care. The health care systems leading the way in data analytics — the VA and Kaiser Permanente, for example — are successful because they are integrated health care delivery systems.
A second reason why big data initiatives are more widely pursued in the business world is the clear financial incentive to do so. Just look at Netflix. Their use of big data algorithms has given them a competitive advantage. We don’t have that sort of free market environment in health care.
Finally, there is the issue of physician and clinician acceptance of big data tools. Physicians still value the art of medicine and like to use their individual decision-making talents to diagnose and manage disease. So, we see some resistance to having a computer tell us what to do.
But even with all these limitations, progress is being made.
RWHC: Health care seems to be moving from the use of structured data to unstructured data. What is the difference between the two when it comes to clinical utility and improving patient outcomes?
HP: Structured data is data that already exists in a spreadsheet format. For example, when vital signs (temperature, heart rate, blood pressure, etc.) get entered into the electronic medical record, they are stored in a spreadsheet. This data can be examined easily, but it does not contain all the information necessary for answering many questions.
There is a vast amount of patient information that’s not entered into basic spreadsheets: things like doctors’ written notes, radiologists’ interpretations of chest x-rays, or pathology reports. This non-spreadsheet data is so-called “unstructured” data, and it often contains very useful information for predicting patients’ health outcomes. For example, important lifestyle indicators of health, such as smoking status, are often included within doctors’ notes, but not in a structured format.
Traditionally, the only way to learn from unstructured data was to review the medical chart by hand. Fortunately, we now have automated tools for extracting information from unstructured data sources. For example, natural language processing tools can search for specific words to determine whether a patient smokes and how much they smoke.
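The kind of keyword-based extraction Dr. Prescott describes can be sketched with a few lines of rule-based text processing. This is an illustrative toy only, not any specific clinical NLP system: the note text, the negation phrases, and the regular expressions are all invented for this example, and real tools handle far more linguistic variation.

```python
import re

# Hypothetical clinical note text, invented for illustration only.
note = (
    "Patient reports smoking about 1 pack per day for 20 years. "
    "Denies alcohol use."
)

def extract_smoking_status(text):
    """Minimal rule-based sketch: check for negated smoking mentions
    first, then for a smoking mention and an optional pack-per-day
    quantity. Returns a dict of what was found."""
    lowered = text.lower()
    # Negation check: phrases like "denies smoking" or "never smoked".
    if re.search(r"\b(denies|no history of|never)\s+(smok\w*|tobacco)", lowered):
        return {"smoker": False, "packs_per_day": None}
    # Any mention of smoking at all?
    if not re.search(r"smok\w*", lowered):
        return {"smoker": None, "packs_per_day": None}  # not documented
    # Optional quantity, e.g. "1 pack per day" or "2 packs/day".
    qty = re.search(r"(\d+(?:\.\d+)?)\s*packs?\s*(?:per|/|a)\s*day", lowered)
    return {
        "smoker": True,
        "packs_per_day": float(qty.group(1)) if qty else None,
    }

print(extract_smoking_status(note))
```

Even this crude sketch shows why negation handling matters: "denies smoking" contains the keyword "smoking" but means the opposite of a positive match.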
RWHC: How can big data make positive impacts in a continuously learning health system?
HP: The Institute of Medicine published a report in 2012 on continuously learning health care systems. In such a system, information is reliably captured, curated, and delivered back to clinicians in order to improve clinical decision-making for individual patients and to improve the efficiency and quality of the overall health care system. Learning health care systems require an infrastructure to capture and analyze large amounts of data to inform patient care and system improvement. So, big data is key to a continuously learning health care system.
One way health systems can become better and more efficient is by learning from mistakes at the macro level. As an example, consider what happens to patients in the Emergency Department (ED). As clinicians, we make decisions on where patients should go next: the intensive care unit, a general medical admission, or even home. Sometimes those decisions are wrong, and a patient you send to the hospital ward (or even home) quickly deteriorates and ends up in the ICU. If we build data-driven models of the various factors that go into that decision and apply real-time data analytics, we can use them to inform policies and protocols in the ED and provide safer care for future patients.
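A data-driven disposition model of the kind described above could, at its simplest, be a risk score fit on historical ED outcomes. The sketch below is purely illustrative: the vital-sign features, weights, and intercept are invented for this example, and a real model would be trained and validated on actual patient data.

```python
import math

# Invented weights for a toy logistic model of post-ED deterioration
# risk. These numbers are illustrative assumptions, not clinical values.
WEIGHTS = {
    "heart_rate": 0.03,    # beats/min; faster -> higher risk
    "resp_rate": 0.10,     # breaths/min; faster -> higher risk
    "systolic_bp": -0.02,  # mmHg; lower pressure -> higher risk
}
INTERCEPT = -4.0

def deterioration_risk(vitals):
    """Return a modeled probability (0..1) that a patient deteriorates
    after leaving the ED, given a dict of vital signs."""
    score = INTERCEPT + sum(WEIGHTS[k] * vitals[k] for k in WEIGHTS)
    return 1 / (1 + math.exp(-score))  # logistic function

stable = {"heart_rate": 72, "resp_rate": 14, "systolic_bp": 120}
unstable = {"heart_rate": 128, "resp_rate": 28, "systolic_bp": 88}

print(round(deterioration_risk(stable), 3))
print(round(deterioration_risk(unstable), 3))
```

In practice such a score would not replace the clinician's judgment; it would surface high-risk patients so that ED protocols can route them more safely, which is exactly the system-level feedback loop a learning health system is meant to close.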
RWHC: Can you give us an example of how you’ve applied big data in your practice to improve patient outcomes?
HP: At this stage, it’s rare to find individual physicians using big data to inform their personal clinical practice. But there are tremendous benefits when you look system-wide. I’m currently studying hospital readmissions after sepsis. We’re developing a tool to predict who is at high risk of coming back to the hospital for specific problems after sepsis, such as kidney failure, heart failure, or infection. Because each individual type of hospital readmission happens to only a small portion of the population, we need to identify patterns, and those patterns are only visible when you have huge amounts of data. I’m now looking at the issue within the VA Health System, using over eight years of data to understand these patterns and feed them back to the clinical community to improve patient care.