Published in

F1000Research, F1000Research, (9), p. 210, 2020

DOI: 10.12688/f1000research.22781.2

F1000Research, F1000Research, (9), p. 210, 2020

DOI: 10.12688/f1000research.22781.1

Data extraction methods for systematic review (semi)automation: A living review protocol

This paper is made freely available by the publisher.

Preprint: archiving forbidden
Postprint: archiving forbidden
Published version: archiving allowed
Data provided by SHERPA/RoMEO

Abstract

Background: Researchers in evidence-based medicine cannot keep up with the volume of primary research articles, both old and newly published. Support for the early stages of the systematic review process (searching and screening studies for eligibility) is necessary because it is currently impossible to search for relevant research with precision. Better automated data extraction may not only facilitate the stage of review traditionally labelled ‘data extraction’, but also change earlier phases of the review process by making it possible to identify relevant research. Exponential improvements in computational processing speed and data storage are fostering the development of data mining models and algorithms. This, in combination with quicker pathways to publication, has led to a large landscape of tools and methods for data mining and extraction.

Objective: To review published methods and tools for data extraction to (semi)automate the systematic reviewing process.

Methods: We propose to conduct a living review. With this methodology we aim to carry out continuous evidence surveillance, bi-monthly search updates, and review updates every 6 months if new evidence permits. In a cross-sectional analysis we will extract methodological characteristics and assess the quality of reporting in our included papers.

Conclusions: We aim to increase transparency in the reporting and assessment of automation technologies to the benefit of data scientists, systematic reviewers and funders of health research. This living review will help to reduce duplicated effort among data scientists who develop data mining methods. It will also inform systematic reviewers about options for supporting their data extraction.