Environmental Factor

Your Online Source for NIEHS News

September 2018

Building data-competent scientists, training ideas explored

Experts grappled with approaches to training environmental health scientists to make the most of big data at an NIEHS gathering in August.

Specialized training will help environmental health scientists take advantage of the promise of big data, according to an interdisciplinary group of researchers who gathered Aug. 15-16 at NIEHS.

Experts in data science, epidemiology, genetics, biostatistics, and other fields exchanged insights about training students, postdoctoral researchers, and junior faculty to be the next generation of environmental health scientists. Discussion also covered how to increase data science skills, at all educational levels, among biomedical scientists who work on environmental health studies.

Shreffler said that bringing together environmental health scientists and data scientists proved to be fun and fruitful. (Photo courtesy of Steve McCaw)

Organizers in the NIEHS Division of Extramural Research and Training (DERT) sought to develop strategic recommendations for data science training and NIEHS priorities, said organizer Carol Shreffler, Ph.D., the DERT program director for training and career development.

NIEHS and National Toxicology Program Director Linda Birnbaum, Ph.D., challenged participants in her welcoming remarks. “Work with us to develop an overall strategy to build [a] data science-competent environmental health science workforce,” she said, pointing out that the 2018-2023 NIEHS Strategic Plan (see related story) names data science as a key goal in each of its focus areas.

Participants responded enthusiastically. “Everyone is grappling with the issue of how best to incorporate more data science training into biomedical programs,” reported Jenny Collins of DERT, a member of the organizing committee.

The challenge of data

As vast amounts of data are generated by rapidly developing technology, the relatively new field of data science is meeting the challenges of sharing, accessing, analyzing, and interpreting big data.

Miranda said that Rice University has trained 42 of 50 state health departments in geospatial methods. (Photo courtesy of Steve McCaw)

For example, Marie Lynn Miranda, Ph.D., from Rice University, demonstrated how big data can reframe research questions. She reported that racial isolation is geographically associated with fundamental causes of racial disparities in health. “This shifts the conversation from race, which is nonmodifiable, to the experience of minorities in segregated communities, which is modifiable,” she said.

The National Institutes of Health (NIH) Big Data to Knowledge (BD2K) Initiative made early efforts to address the skills gap in biomedical data science expertise through investments in training and education.

“Methods work really well until you encounter real data,” said BD2K grantee John Quackenbush, Ph.D., from the Harvard T.H. Chan School of Public Health. He explained how epidemiology and laboratory projects help students understand the challenges they will face.

The challenge of education

Speakers discussed challenges such as course development, recruitment of students and lab members, funding strategies, and attracting and retaining data science students in environmental health sciences, or EHS.

“I’ve never seen a generation of students that cares more about making a difference in the world,” said Miranda.

“EHS is perceived as being more meaningful and interesting, but students are worried about salary levels,” said Jim Gauderman, Ph.D., from the University of Southern California. His experience suggests that early successes are important. “Published papers, software development, etc., help trainees feel both value and success,” he said.

“[Students] need to be able to tell a story about big data,” Gauderman said, “so they must invest in understanding the data, be able to work in diverse groups, and be able to explain it to the outside world.” (Photo courtesy of Steve McCaw)

Partnership is key

“EHS researchers draw on data from genetic sequences to geographic data to survey data,” said attendee Lisa Federer, from the National Library of Medicine (NLM). “Data science involves many different types of expertise and knowledge, and there’s not just one training model.” Her suggestion that data science training will require engaging with researchers who have not typically worked with NIH was echoed by others.

Society of Toxicology (SOT) Vice-President Ronald Hines, Ph.D., from the U.S. Environmental Protection Agency, said that SOT is reaching out to other societies for symposia collaborations. Similarly, Miranda advised NIEHS to exchange ideas with leaders of professional groups in data science, computer science, electrical engineering, and applied mathematics.

Hines, right, served on a panel with Jeanette Stingone, Ph.D., from the Icahn School of Medicine at Mount Sinai. Icahn researcher Susan Teitelbaum, Ph.D., described Stingone as a translator who is skilled at interfacing between epidemiologists and data. (Photo courtesy of Steve McCaw)

The data challenges or hackathons such groups offer could be more widely used in EHS, according to Charles Schmitt, Ph.D., a contractor in the NIEHS Office of Data Science. “In natural language processing, there are conferences that have challenges every year, aimed at advancing the state of the art,” he said. “Results from prior years make a great teaching resource.”

“Partnerships are key,” emphasized grantee Cheryl Walker, Ph.D., from the Baylor College of Medicine and moderator of the meeting’s final session.

“Some of the most powerful and innovative data science research comes out of interdisciplinary teams,” agreed Federer, adding that data science is a major focus of the new NLM strategic plan and that NLM hopes to collaborate with colleagues across NIH.

After a day and a half of productive exchange, DERT Director Gwen Collman, Ph.D., suggested that the next group to weigh in should be trainees.
“All of them recognize that they have to get some programming experience to operate in the 21st century,” said Daniel Gatti, Ph.D., from the Jackson Laboratory, describing a two-day class in data-intensive genetics analyses that attracted participants from graduate students to full professors.
DERT scientist Danielle Carlin, Ph.D., right, spoke during a break with Chirag Patel, Ph.D., left, from Harvard University. Patel described the challenge of accommodating both quantitative and biomedical students in a course.
Panelists Fred Wright, Ph.D., left, from North Carolina State University, and Wesley Gray, Ph.D., from Southern University and A&M College, shared thoughts on institutional barriers to interdisciplinary programs, among other topics.
Walker, left, shown speaking with NIEHS Deputy Director Rick Woychik, Ph.D., said the meeting was the best-organized workshop she had ever participated in.