Big data presents big challenges, big opportunities in environmental health
By John Yewell
The challenges of processing today’s avalanche of information and the opportunities gained by making it available to scientists and the public were the subjects of a June 24 webinar, "Integrating Data from Multidisciplinary Research."
The program was the first in a series exploring how complex data sets are being used to solve environmental health problems. It was part of the National Institutes of Health (NIH) Big Data to Knowledge (BD2K) initiative, which aims to advance understanding of human health and disease by taking advantage of the wealth of information contained in biomedical big data.
Underscoring the NIEHS strategic plan goal of integrating data tools across disciplines, webinar speakers drew from both in-house and grantee research experience in their presentations.
From floppy disks to terabytes
How big has Big Data become? Consider this: one system alone, the Intergovernmental Oceanographic Commission’s Global Ocean Observing System, collects a terabyte of data every day, according to Stephen DiMarco, Ph.D., from Texas A&M University. His presentation described how scientists use the data to address impacts on human health and society.
Tackling modern data challenges provides new opportunities to the NIEHS Superfund Research Program (SRP), said SRP Director William Suk, Ph.D., another presenter. “The next step is increased collaboration to enable sharing of data, which is essential for expedited translation of research results into knowledge to improve human health.”
Privacy and precision medicine
According to Allen Dearry, Ph.D., director of the NIEHS Office of Scientific Information Management, the confluence of better data collection, greater computing power, and heightened public perception around environmental health has made possible the large-scale studies necessary to test medical interventions. These studies can be tailored to individuals and are known as the precision medicine initiative.
“Seventy-four percent of the public are now willing to share their health information to improve prevention and treatment of disease,” Dearry said. “It’s really just now that we have the capacity to carry out this kind of effort.”
Technical capacity alone is not enough; issues of privacy and confidentiality must also be addressed. “NIH is trying to ensure that we are able to protect confidentiality, especially of electronic health records,” said Dearry, explaining how work to make data more widely available through cloud-based platforms has been slowed by concerns over data security.
PROTECT: breaking the exposure-disease link
David Kaeli, Ph.D., from Northeastern University, described his work managing and integrating data from the Puerto Rico Testsite for Exploring Contamination Threats (PROTECT), which is supported by SRP. Researchers study the effects of environmental contaminants on preterm birth in Puerto Rico by following 1,800 expectant mothers and by gathering data about dozens of chemicals, trace metals, and pesticides from more than 1,000 wells, springs, and other sources.
Scientists hope to prevent exposure to environmental hazards through better detection, while minimizing the environmental impact of cleanup activities, an approach known as green remediation. This contributes to the central goal of SRP, which is to understand and break the links between chemical exposure and disease.
Michelle Heacock, Ph.D., health scientist administrator for the NIEHS SRP, served as moderator for the webinar, which was hosted by SRP and the U.S. Environmental Protection Agency.
(John Yewell is a contract writer for the NIEHS Office of Communications and Public Liaison.)