Intersectional Fairness in Vision-Language Models for Medical Image Disease Classification
Published in arXiv preprint (under review), 2025
Vision–language models such as CLIP, BLIP-2, and BioMedGPT are increasingly used for medical image disease classification, but they can behave unfairly across demographic subgroups. This work proposes a cross-modal alignment consistency framework that equalises diagnostic confidence distributions across intersectional subgroups. The framework reduces fairness disparities by 20–35% across multiple benchmarks, including external out-of-distribution evaluation.
Recommended citation: Yupeng Zhang, Adam G. Dunn, Usman Naseem, Jinman Kim. (2025). "Intersectional Fairness in Vision-Language Models for Medical Image Disease Classification." arXiv preprint arXiv:2512.15249. https://arxiv.org/abs/2512.15249
