Environmental health science research in the 21st century generates massive amounts of data, creating many challenges for the field. At the Feb. 11-12 meeting of the National Advisory Environmental Health Sciences Council, these challenges were front and center.
Acting NIEHS and National Toxicology Program Director Rick Woychik, Ph.D., described the considerable resources and energy that the institute has devoted in recent years to solidifying its data science infrastructure, including both people and technology.
“One of the problems we had when I first got to NIEHS in 2011 was that we weren’t developing plans for how to manage our cyberinfrastructure, our IT infrastructure, or our data management,” he said. “We now have people whose job it is to develop and integrate those plans and determine what we are going to do, when we are going to do it, and how much it is going to cost.”
In the past two years, NIEHS has created an Office of Environmental Science Cyberinfrastructure, an Office of Data Science (ODS), and an Office of Scientific Computing.
ODS Director Charles Schmitt, Ph.D., said the group is charged with fulfilling the NIEHS commitment to systems that embody the principle known as FAIR-plus: findable, accessible, interoperable, and reusable (FAIR), plus computable, socialized, and mineable.
The diversity of data in environmental health makes getting to FAIR-plus especially challenging, he said. For example, research involving air pollution, personal exposure sensors, epidemiology, in vivo animal studies, and in vitro testing generates extensive and diverse datasets that are often difficult to integrate and analyze.
The quest for commonality
The lack of common terminology is a major obstacle to integrating environmental health science data. Several council members emphasized the importance of establishing common language standards to enable data integration and sharing.
Schmitt noted that NIEHS plans to hire a semantic engineer soon to work on the problem. Council member Lynn Goldman, M.D., from George Washington University, endorsed that idea and other plans to expand the NIEHS data workforce. “I would be very supportive of an effort that brings more data science expertise into NIEHS to help you develop a more standardized set of resources,” she said.
Former council member Kenneth Fasman, Ph.D., from The Jackson Laboratory, agreed. “Given the unique mission of NIEHS and the diversity of data types and scientific disciplines that you work with, it’s really important that you continue to do exactly what you’ve been describing today. Keep going!” he said.
J. Patrick Mastin, Ph.D., acting director of the Division of Extramural Research and Training, summarized the challenges to advancing data science in environmental health research:
- Inconsistent formats and vocabularies.
- Cultural issues, including distinct perspectives of researchers and study participants.
- High cost of data resources.
- Insufficient data science expertise.
- Diversity of research topics and data types.
Stay the course
Woychik told the attendees that as acting director, his intent is to continue to pursue the three primary themes of the 2018-2023 NIEHS Strategic Plan: advancing environmental health science, promoting translation, and enhancing environmental health science through stewardship and support.
“I’m often asked as acting director, what are we going to be doing during your tenure?” he said. “My response is that we have a great strategic plan, so we’re not going to change course.”
He described holding 34 listening sessions across the institute since October 2019. The overall themes that emerged were communication, collaboration, career development, resources, strategic focus, and management of science. “We are committed to addressing all of the things that have come up in the listening sessions,” he added.
(Ernie Hood is a contract writer for the NIEHS Office of Communications and Public Liaison.)