Published in

American Association for the Advancement of Science, Science Advances, 32(8), 2022

DOI: 10.1126/sciadv.abq6147

Links

Tools

Export citation

Search in Google Scholar

Disparities in dermatology AI performance on a diverse, curated clinical image set

This paper is made freely available by the publisher.
This paper is made freely available by the publisher.

Full text: Download

Green circle
Preprint: archiving allowed
Red circle
Postprint: archiving forbidden
Green circle
Published version: archiving allowed
Data provided by SHERPA/RoMEO

Abstract

An estimated 3 billion people lack access to dermatological care globally. Artificial intelligence (AI) may aid in triaging skin diseases and identifying malignancies. However, most AI models have not been assessed on images of diverse skin tones or uncommon diseases. Thus, we created the Diverse Dermatology Images (DDI) dataset—the first publicly available, expertly curated, and pathologically confirmed image dataset with diverse skin tones. We show that state-of-the-art dermatology AI models exhibit substantial limitations on the DDI dataset, particularly on dark skin tones and uncommon diseases. We find that dermatologists, who often label AI datasets, also perform worse on images of dark skin tones and uncommon diseases. Fine-tuning AI models on the DDI images closes the performance gap between light and dark skin tones. These findings identify important weaknesses and biases in dermatology AI that should be addressed for reliable application to diverse patients and diseases.