A global review of publicly available datasets for ophthalmological imaging: barriers to access, usability, and generalisability.

Saad M Khan; Xiaoxuan Liu; Siddharth Nath; Edward Korot; Livia Faes; Siegfried K Wagner; Pearse A Keane; Neil J Sebire; Matthew J Burton; Alastair K Denniston (2020) A global review of publicly available datasets for ophthalmological imaging: barriers to access, usability, and generalisability. The Lancet Digital Health, 3 (1). e51-e66. ISSN 2589-7500 DOI: 10.1016/S2589-7500(20)30240-5

Health data that are publicly available are valuable resources for digital health research. Several public datasets containing ophthalmological imaging have been frequently used in machine learning research; however, the total number of datasets containing ophthalmological health information and their respective content is unclear. This Review aimed to identify all publicly available ophthalmological imaging datasets, detail their accessibility, describe which diseases and populations are represented, and report on the completeness of the associated metadata. With the use of MEDLINE, Google's search engine, and Google Dataset Search, we identified 94 open access datasets containing 507 724 images and 125 videos from 122 364 patients. Most datasets originated from Asia, North America, and Europe. Disease populations were unevenly represented, with glaucoma, diabetic retinopathy, and age-related macular degeneration disproportionately overrepresented in comparison with other eye diseases. The reporting of basic demographic characteristics such as age, sex, and ethnicity was poor, even at the aggregate level. This Review provides greater visibility for ophthalmological datasets that are publicly available as powerful resources for research. Our paper also exposes an increasing divide in the representation of different population and disease groups in health data repositories. The improved reporting of metadata would enable researchers to access the most appropriate datasets for their needs and maximise the potential of such resources.


Published Version
Available under Creative Commons: NC-ND 3.0
