DNTP’s semi-automated tool makes data extraction easy
A semi-automated data-extraction tool called Dextr could improve the speed and accuracy of literature reviews, according to researchers from the NIEHS Division of the National Toxicology Program (DNTP).
Data extraction is a time- and resource-intensive step in the analysis of scientific literature. Machine-learning methods for automating this step have been explored, but previous approaches have had limited utility, particularly in the field of environmental health sciences.
To address this need, the researchers developed a data-extraction tool that combines machine-learning models with a user interface that lets reviewers oversee and verify the extracted data. The flexible, web-based tool is designed for use with scientific articles. Unlike other workflows, it supports extraction of complex concepts such as multiple experiments, exposures, or doses, allowing users to connect related elements within a study.
When tested on environmental health animal studies, Dextr performed as well as, or better than, manual extraction. The tool reduced the time required for data extraction by 47% and achieved precision and recall similar to those of manual extraction for entities such as species, strain, and sex. According to the researchers, Dextr could reduce the workload and resources required for systematic literature reviews in various fields without compromising the necessary rigor and transparency. (JW)
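For readers unfamiliar with the two metrics, the sketch below shows how precision and recall are commonly computed when a set of extracted entities is compared against a reference set. It is an illustration only, not the evaluation code used in the study: the function name `precision_recall` and the sample entities are hypothetical.

```python
def precision_recall(extracted, reference):
    """Compare extracted entities with a reference ("gold standard") set.

    Precision: share of extracted items that are correct.
    Recall: share of reference items that were found.
    """
    extracted, reference = set(extracted), set(reference)
    true_positives = len(extracted & reference)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical (field, value) pairs for a single animal study.
reference = {
    ("species", "rat"),
    ("strain", "Sprague-Dawley"),
    ("sex", "male"),
    ("sex", "female"),
}
extracted = {
    ("species", "rat"),
    ("strain", "Sprague-Dawley"),
    ("sex", "male"),
    ("species", "mouse"),  # incorrect extraction, lowers precision
}

precision, recall = precision_recall(extracted, reference)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

In this made-up example, three of the four extracted items are correct and three of the four reference items were found, so a missed entity lowers recall while a spurious one lowers precision.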
Citation: Walker VR, Schmitt CP, Wolfe MS, Nowak AJ, Kulesza K, Williams AR, Shin R, Cohen J, Burch D, Stout MD, Shipkowski KA, Rooney AA. 2022. Evaluation of a semi-automated data extraction tool for public health literature-based reviews: Dextr. Environ Int 159:107025.