Big data promises a health care remedy

 

Government agencies are making strides testing uses of big data to predict risks of disease or the path of a killer virus, but hurdles remain, including linking legacy datasets and setting up common vocabularies.

The use of big data to rapidly analyze costs, understand public behaviors and anticipate security threats continues to attract the interest of government agencies that see the technology as a way to gain measurable insights into their most demanding problems.

Nowhere are researchers more active in exploring the uses of big data than in government health care organizations, where data scientists are working toward creating reliable tools for predicting a patient’s risk of disease or a virus’s path of infection.

To some extent health care programs are an obvious target for big data investment. Agencies already have large databases with years of information on diseases and patient health, and they have an urgent need to provide better and more productive information for researchers, doctors and nurses.

The Veterans Health Administration (VHA), for example, has created several big data analytics tools to help it improve health services to its 6.5 million primary care patients.

The VHA’s Care Assessment Need (CAN) score is a predictive analytic tool that indicates how a given veteran compares with other individuals in terms of likelihood of hospitalization or death. The scores feed into VHA’s Patient Care Assessment System (PCAS), which combines them with other data to help medical teams coordinate patient care.
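To make the idea of comparing one patient with everyone else concrete, here is a minimal Python sketch that expresses a predicted risk as a percentile within a population. The article does not describe how the CAN score is actually calculated, so the function name, the percentile framing and the sample risks below are illustrative assumptions only.

```python
from bisect import bisect_left


def percentile_rank(risk, population_risks):
    """Return the percentage of the population with a strictly lower risk."""
    ordered = sorted(population_risks)
    below = bisect_left(ordered, risk)  # number of values strictly below `risk`
    return 100.0 * below / len(ordered)


# Hypothetical predicted probabilities of hospitalization or death
# (made-up numbers, not VHA data).
population = [0.02, 0.05, 0.07, 0.10, 0.15, 0.22, 0.30, 0.41, 0.55, 0.80]

print(percentile_rank(0.41, population))  # 70.0
```

In this made-up population, a patient with a predicted risk of 0.41 sits above 70 percent of the group, which is the kind of relative ranking a care team could use to decide whom to contact first.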

The technology has changed the whole approach at the VHA from being purely reactive to one in which patients at the highest risk of being hospitalized can be identified in advance and provided services that can help keep them out of emergency rooms and other critical care facilities, according to Stephan Fihn, director of the VHA’s Office of Analytics and Business Intelligence.

While still considered fairly rudimentary tools, the CAN score and PCAS demonstrate that big data predictive analytics can work for large populations.

The agency now needs to “markedly ramp that effort up,” Fihn said, and to that end the VHA is working on dozens of predictive models that can be deployed over the next decade. The models will tell patients, “This is what we know about you, here’s what we think you need,” he said, and will do so in a rapid, medically relevant manner.

Big data, open data

Big data tools are also being rapidly developed by the Department of Health and Human Services, a sprawling, 90,000-person enterprise that both creates and uses data for genomics research, disease surveillance and epidemiology studies.

“There are efforts across the department to try and leverage the data we have,” said Bryan Sivak, HHS’ chief technology officer.

“At the same time a lot of the datasets we maintain, collect, create or curate can be extended to external entities to help them understand aspects of the HHS ecosystem and try to improve on them, such as with CMS (Centers for Medicare and Medicaid Services) claims data,” he said.

One such effort is the OpenFDA project, which essentially took three massive Food and Drug Administration datasets through an intensive cleaning process, Sivak said, and then added an application programming interface (API) so people could access the data in machine-readable ways.

OpenFDA was also linked to other data sources, so that users could access related information from the National Institutes of Health and the National Library of Medicine’s MedlinePlus.

The project, which launched as a beta program in June 2014, has already helped to create “a lot of different applications that have the potential to really help reshape that part of the (HHS) ecosystem,” Sivak said.
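As a rough illustration of what machine-readable access looks like in practice, the Python sketch below builds a query URL for the openFDA drug adverse event endpoint. The helper name, the search field and the example drug are assumptions chosen for illustration; check the endpoint path and field names against the current openFDA documentation before relying on them.

```python
from urllib.parse import urlencode

# Base address of the public openFDA API.
OPENFDA_BASE = "https://api.fda.gov"


def build_query_url(endpoint, search, limit=1):
    """Build an openFDA query URL (hypothetical helper, not part of any SDK)."""
    params = urlencode({"search": search, "limit": limit})
    return f"{OPENFDA_BASE}/{endpoint}.json?{params}"


# Example: adverse event reports that mention a given drug name.
# The field name is an assumption taken from the public adverse event schema.
url = build_query_url(
    "drug/event",
    'patient.drug.medicinalproduct:"aspirin"',
    limit=5,
)
print(url)
```

Fetching that URL with any HTTP client should return JSON, which is what lets outside developers build applications on the data without handling the raw FDA files.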

Also within HHS, the National Institutes of Health has committed to several big data programs, including its Big Data to Knowledge (BD2K) initiative. The program, begun in late 2013, is aimed at improving researchers’ use of biomedical data to predict who is at increased risk of conditions such as breast cancer and heart disease and to come up with better treatments.

BD2K’s goal is to help develop a “vibrant biomedical data science ecosystem” that will include standards for describing datasets; tools and methods for finding, accessing and working with datasets stored in other locations; and training for biomedical scientists in big data techniques.

In October last year NIH announced grants of nearly $32 million for fiscal 2014 to create 11 centers of excellence for big data computing, a consortium to develop a data discovery index and measures to boost data science training and workforce development. NIH hopes to invest a total of $656 million in these projects through 2020.

While physical infrastructure for computational biomedical research has been growing for many years, the NIH said, as data gets bigger and more widely distributed, “an appropriate virtual infrastructure becomes vital.”

Fundamental challenges

There are significant challenges to applying big data to health care, especially with so many legacy datasets to be integrated and shared. Even the use of the term big data can cause confusion.

“Within agencies there are different definitions and types of big data,” said Tim Hayes, senior director for customer health solutions at Creative Computing Solutions, Inc., and a former HHS employee who worked on data analytics there.

“You need to be sure, when mapping data from one database to another, that you can match various labels that are used. Two different agencies might use the term ‘research,’ for example, but they may not be compatible.”

There are “very arcane differences” between what you would assume are fundamental and consistent definitions that turn out not to be consistent at all, Sivak agreed. “It’s a big problem for sure.”
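The label-matching problem can be sketched in a few lines of Python. The example below assumes a hand-built crosswalk that maps each agency’s local label to a shared vocabulary term; the agency names, labels and shared terms are all hypothetical and are not drawn from any real HHS dataset.

```python
# Hypothetical crosswalk: (agency, local label) -> shared vocabulary term.
# The same local label, "research", maps to different shared terms.
CROSSWALK = {
    ("agency_a", "research"): "clinical_trial",
    ("agency_b", "research"): "basic_science_study",
}


def harmonize(records, crosswalk):
    """Split records into those with a known mapping and those without."""
    mapped, unmapped = [], []
    for rec in records:
        key = (rec["agency"], rec["category"].strip().lower())
        if key in crosswalk:
            # Copy the record, replacing the local label with the shared term.
            mapped.append({**rec, "category": crosswalk[key]})
        else:
            # Keep unknown labels aside for human review instead of guessing.
            unmapped.append(rec)
    return mapped, unmapped


records = [
    {"agency": "agency_a", "category": "Research", "id": 1},
    {"agency": "agency_b", "category": "research", "id": 2},
    {"agency": "agency_c", "category": "research", "id": 3},
]
mapped, unmapped = harmonize(records, CROSSWALK)
print([r["category"] for r in mapped])  # ['clinical_trial', 'basic_science_study']
print([r["id"] for r in unmapped])      # [3]
```

Keying the crosswalk on both the agency and the label is the point of the design: the same word is allowed to mean different things depending on where it came from.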

Another barrier is the lack of data scientists capable of working with and understanding the needs of data analytics programs. The solution starts with recognizing that such people are not IT workers, but occupy a niche all their own.

“A lot of what they do is not working with technology, but is in understanding data,” said Brand Niemann, a former senior enterprise architect and data scientist at the Environmental Protection Agency, who now heads up the Federal Big Data Working Group, an interest group of federal and non-federal big data experts.

The fact is, many agencies may already have people with such expertise on staff but don’t recognize it. It’s a matter of identifying the statisticians who are already working with data, and giving them more of a mandate and outlet to mine the agency’s data, Niemann said.

Get it right, and the results can be transformative.

Accuracy counts

Any analysis of big data has limited usefulness if the information in the dataset is not accurate to begin with. Until recently, Fihn said, he had been skeptical that data analytics could reach the levels of accuracy required for clinical use across the VHA. One reason is that, until just a few years ago, the only data available was from health insurance claims.

“In terms of predictive accuracy we use what we call a C statistic,” he said. “A wholly accurate predictive model has a C level of 1.0, and the least accurate has a level of zero. Using (health insurance) claims data, the most accurate level we could get was around 0.65, which is not much better than flipping a coin.”
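For readers who want to see what the C statistic measures, the Python sketch below computes it the standard way: as the share of patient pairs (one who had the event, one who did not) in which the model gave the higher score to the patient who had the event, with ties counted as half. Under this standard definition, a model that ranks patients at random scores about 0.5, which is why 0.65 is close to a coin flip. The scores and outcomes are made up for illustration and are not VHA data.

```python
def c_statistic(scores, outcomes):
    """Concordance (C) statistic for binary outcomes; ties count as half."""
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0   # model ranked the event case higher
            elif e == n:
                concordant += 0.5   # tie: half credit
    return concordant / (len(events) * len(non_events))


# Hypothetical predicted risks and observed outcomes (1 = hospitalized).
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
outcomes = [1, 0, 1, 0, 0]

print(round(c_statistic(scores, outcomes), 3))  # 0.833
```

Here five of the six event and non-event pairs are ranked correctly, giving about 0.83. The nested loop is fine for a small example; large populations would call for a rank-based calculation instead.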

Between 2010 and 2011, however, the VHA brought online a corporate data warehouse that combined clinical data from some 126 different versions of the VISTA (Veterans Health Information Systems and Technology Architecture) electronic health record the agency had been using since the late 1990s.

With that, Fihn said, and greater availability of data on patient medications and vital signs, predictive models are regularly reaching C levels of 0.85, and are pushing 0.9.

It was a “quantum jump” in terms of the usefulness of predictive analytics, he said, and VHA medical staff feel they can now predict with confidence who the high-risk patients are. And even though predictions are still being published using claims data alone, he said, “for our considerations, we now reject those below C levels of 0.85, and we are actually moving to push things as close as we can to 0.9.”

HHS doesn’t have any global metrics or milestones it wants to reach for big data, Sivak said, though there are specific goals for individual programs. In fact, NIH may have the most expansive set of goals, with BD2K just part of a larger portfolio of activities that NIH is promoting, including cross-agency and international collaboration on big data initiatives and policies.

It’s all a marker for just how quickly minds have changed over big data, Sivak believes. “Back in the day,” nobody would have given any thought to making datasets public or making them available widely within HHS. But over the past five years the value of that has been conclusively demonstrated, he said, “and as a result, the default setting within HHS has changed from closed to open.”
