Using advanced data science tools to support environmental health research in Africa was the focus of a Sept. 23 seminar, part of a series on the state of data science. The series is sponsored by Harnessing Data Science for Health Discovery and Innovation in Africa (DS-I Africa), a program of the National Institutes of Health (NIH) Common Fund (see sidebar).
Integrating diverse data sources
Collman emphasized the complex challenge of documenting a person’s exposures. Researchers draw on sources such as those listed below, combined with geospatial (geographic information system, or GIS) or temporal satellite data, she explained.
- Sampling air, water, and soil either directly or with sensors.
- Personal biomonitoring.
- Molecular-level markers of exposure in biosamples.
“One of the major challenges of our field is integrating all these streams of data [so] that we can usefully depict exposure over a person’s lifetime, a field known as exposomics,” Collman said.
Three panelists shared their experiences.
- NIEHS grantee Kiros Berhane, Ph.D., from Columbia University.
- Engineer Bainomugisha, Ph.D., from Makerere University, Uganda.
- Caradee Wright, Ph.D., from the South African Medical Research Council and the University of Pretoria.
Overcoming human limitations
Exposome data are collected in Africa, but gaps remain in both the content of the data and access to them. Computational methods and tools are needed to make complex data useful to health professionals and policymakers.
Berhane discussed machine learning, particularly predictive models. Because more data are not necessarily better data, researchers can use human-supervised machine learning to weed out messy or incomplete information. For example, hospital data, available in electronic form in much of the developed world, are collected manually in African hospitals.
“Africa is already facing multiple challenges, such as a wide range of exposures combined with rapid urbanization and industrialization,” said Berhane. “In the face of all this, there’s a lack of high-quality data and limited human capacity in these areas.”
End-to-end data systems are being developed to address pressing environmental health problems like air pollution. From data collection hardware to applying machine learning and data science methods, such systems can generate spatial and temporal air quality patterns across a city.
In Uganda’s capital, Kampala, Bainomugisha applies computational methods and tools to the city’s environmental health challenges. He is the project lead for AirQo (see sidebar), which builds custom internet-connected devices to measure air quality and deploys them by, for example, mounting them on roofs or motor scooters. Policymakers can use the data to develop regulations to protect health, he explained.
“When experts in fields like computer science work on environmental health issues, we can get new innovations,” Collman observed. “For example, the boda boda scooters are one of the main sources of transportation around Kampala. The novelty here is that they take readings every 90 seconds and create very large, geo-tagged, real-time datasets that can map air pollution levels around the city.”
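To make the idea of mapping geo-tagged readings concrete, here is a minimal Python sketch of one way such data could be summarized. It is an illustration only, not AirQo's actual pipeline: the function names, the 0.01-degree grid size, and the tuple layout of a reading are all assumptions made for this example. Each reading is assigned to a grid cell and an hour, and the particulate values in each bucket are averaged.

```python
import math
from collections import defaultdict
from statistics import mean

# Hypothetical grid size in degrees (about 1 km near the equator).
CELL_DEG = 0.01


def grid_cell(lat, lon, cell_deg=CELL_DEG):
    """Return integer (row, col) indices of the grid cell containing a point."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def aggregate(readings, cell_deg=CELL_DEG):
    """Average pollutant readings per (grid cell, hour).

    readings: iterable of (timestamp, lat, lon, pm25) tuples, where
    timestamp is a datetime and pm25 is a concentration value.
    This tuple layout is assumed for illustration.
    """
    buckets = defaultdict(list)
    for ts, lat, lon, pm25 in readings:
        # Skip missing or physically impossible values.
        if pm25 is None or pm25 < 0:
            continue
        # Truncate the timestamp to the start of its hour.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[(grid_cell(lat, lon, cell_deg), hour)].append(pm25)
    return {key: mean(values) for key, values in buckets.items()}
```

Calling `aggregate` on a list of readings returns a dictionary keyed by grid cell and hour, which could then be drawn as a map layer. A real system would also need sensor calibration and more careful quality checks, which this sketch leaves out.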
Taking data back to the people
In South Africa, Wright merges GIS, meteorological, socioeconomic, qualitative, clinical, and other data to collect evidence for action and decision-making. Her studies of the human health effects of the increase in hotter days supported the development of a heat health early warning system based on predictive modeling.
However, high-level policymakers are not the only audience of interest, Wright emphasized. The people most affected by exposures should also have access to the data so they can make local decisions. “We really need to think about how to make data meaningful,” she said, suggesting science communicators can help close the gap.
New research funding
“This program aims to build [research] capacity and advance data science research to catalyze innovation in health research and health care on the African continent,” said Duncan.
“There are several new funding opportunities on the street,” he noted. “This symposium is associated with the launch of the new program.” The first due dates for applying are in late November.
(Kelley Christensen is a contract writer and editor for the NIEHS Office of Communications and Public Liaison.)