The Indian Skin AI Accuracy Gap
Why Western-trained models fail on Fitzpatrick 3–6 — and what the data actually shows.
30–40% accuracy drop on Fitzpatrick 3–6
1.3% of datasets with ethnicity data
79% of datasets from the West
The Data Gap
The datasets powering skin AI are geographically and ethnically skewed.
A systematic review of the training data behind dermatology AI reveals a geographic and ethnic concentration that makes global claims of accuracy misleading.
“Of 70 image datasets, 79% of images originated from Europe, North America, and Oceania. Only 1.3% of datasets contained ethnicity metadata.”
— Wen et al., Lancet Digital Health (2022)
79% of skin datasets from Europe, North America, Oceania
1.3% of datasets include any ethnicity metadata
The Scale
Where Indian skin sits on the Fitzpatrick scale
Indian skin predominantly falls in Fitzpatrick types III–VI. Most AI skin analysis models are validated primarily on types I–III, so they were not built or tested with the majority of Indian consumers in mind.
| Type | Description |
|---|---|
| I | Very light |
| II | Light |
| III | Medium |
| IV | Moderate brown |
| V | Dark brown |
| VI | Very dark |
Types I–II represent very light to light skin that burns easily — these types dominate Western training datasets.
Types III–VI, where most Indian skin falls, have higher melanin density. This changes how conditions like pigmentation, acne scarring, and dark circles present — meaning AI models trained on lighter skin systematically misclassify these concerns.
Rupam.ai is the only AI skin analysis engine trained exclusively on Fitzpatrick 3–6 data from Indian subjects, with expert labels from Indian dermatologists.
Benchmark: Rupam.ai vs Global Models
The accuracy gap, concern by concern.
Internal validation on a balanced Fitzpatrick 3–6 test set. External audit in progress.
| Concern | Global avg. | Rupam.ai |
|---|---|---|
| Acne | 65% | 89% |
| Pigmentation | 58% | 91% |
| Dark circles | 52% | 86% |
| Texture | 60% | 85% |
| Redness (erythema) | 68% | 87% |
| Overall (Fitzpatrick 3–6) | 62% | 88%+ |
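The gap in each row of the table above can be read as a percentage-point difference. A minimal Python sketch of that arithmetic follows, using only the figures from the table. The "88%+" overall figure is entered as 88, which is an assumption that makes the overall gap a lower bound. The names `BENCHMARK` and `accuracy_gaps` are illustrative.

```python
# Accuracy figures (percent) copied from the benchmark table.
# "88%+" is entered as 88, so the overall gap is a lower bound (assumption).
BENCHMARK = {
    "Acne": (65, 89),
    "Pigmentation": (58, 91),
    "Dark circles": (52, 86),
    "Texture": (60, 85),
    "Redness (erythema)": (68, 87),
    "Overall (Fitzpatrick 3–6)": (62, 88),
}


def accuracy_gaps(table):
    """Return {concern: Rupam.ai accuracy minus global average} in percentage points."""
    return {
        concern: ours - global_avg
        for concern, (global_avg, ours) in table.items()
    }


gaps = accuracy_gaps(BENCHMARK)

# Print the rows from largest gap to smallest.
for concern, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{concern}: +{gap} points")
```

On these figures the largest gaps are dark circles (34 points) and pigmentation (33 points), and the smallest is redness (19 points).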
[Chart: Top 4 concerns at a glance]
Accuracy questions
How was the benchmark study conducted?
We tested Rupam.ai and four leading global skin analysis models on a balanced dataset of 5,000+ expert-labelled images across Fitzpatrick skin types III–VI. Each image was graded by at least two board-certified dermatologists.
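To make the set-up above concrete, here is a hedged Python sketch of how a test set with at least two gradings per image and equal counts per Fitzpatrick type could be assembled. The strict-majority consensus rule, the record layout, and the function names are all assumptions for illustration. The description above does not say how disagreements between dermatologists were resolved.

```python
from collections import Counter

MIN_GRADERS = 2  # from the text: at least two dermatologists per image


def consensus_label(grades):
    """Return the strict-majority label, or None if too few grades or no majority.

    The strict-majority rule is an assumption, not the documented method.
    """
    if len(grades) < MIN_GRADERS:
        return None
    label, count = Counter(grades).most_common(1)[0]
    return label if count > len(grades) / 2 else None


def build_test_set(records):
    """Keep (fitzpatrick_type, label) for every record that has a consensus label."""
    kept = []
    for fitz_type, grades in records:
        label = consensus_label(grades)
        if label is not None:
            kept.append((fitz_type, label))
    return kept


def is_balanced(test_set, types=("III", "IV", "V", "VI")):
    """True when every listed Fitzpatrick type has the same number of images."""
    counts = Counter(fitz_type for fitz_type, _ in test_set)
    return len({counts[t] for t in types}) == 1


# Hypothetical records: (Fitzpatrick type, grades from individual dermatologists).
records = [
    ("III", ["acne", "acne"]),
    ("IV", ["acne", "redness"]),  # graders disagree, so this image is dropped
    ("IV", ["pigmentation", "pigmentation", "acne"]),
    ("V", ["redness", "redness"]),
    ("VI", ["acne", "acne"]),
]
test_set = build_test_set(records)
print(len(test_set), is_balanced(test_set))
```

Dropping images without a clear consensus keeps the reference labels clean, at the cost of excluding the hardest cases. A real protocol might use a third adjudicator instead.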
Which global models were compared?
We tested against four widely used commercial skin analysis APIs. Due to licensing restrictions, we report aggregate results rather than naming individual vendors.
What does '88%+ accuracy' mean specifically?
88%+ refers to the overall diagnostic agreement between Rupam.ai's assessments and expert dermatologist consensus across all 12 skin parameters on Fitzpatrick 3–6 skin tones.
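As an illustration of what an agreement figure like this measures, the Python sketch below computes the share of individual assessments where a model's output matches the expert consensus. It assumes agreement is pooled over every (image, parameter) pair, which is one plausible reading. The function name, key layout, and severity labels are hypothetical, and the example covers two parameters rather than twelve.

```python
def overall_agreement(predictions, consensus):
    """Fraction of (image, parameter) assessments where the model matches consensus.

    Both arguments map (image_id, parameter) -> label. Pooling all pairs
    into a single rate is an assumption about how "overall" is defined.
    """
    if predictions.keys() != consensus.keys():
        raise ValueError("predictions and consensus must cover the same assessments")
    if not consensus:
        raise ValueError("no assessments to compare")
    matches = sum(predictions[key] == consensus[key] for key in consensus)
    return matches / len(consensus)


# Hypothetical data: two images, two parameters each.
consensus = {
    ("img1", "acne"): "moderate",
    ("img1", "pigmentation"): "mild",
    ("img2", "acne"): "none",
    ("img2", "pigmentation"): "severe",
}
predictions = {
    ("img1", "acne"): "moderate",
    ("img1", "pigmentation"): "mild",
    ("img2", "acne"): "mild",  # the one disagreement
    ("img2", "pigmentation"): "severe",
}

print(f"{overall_agreement(predictions, consensus):.0%}")
```

Averaging per-parameter agreement instead of pooling would give a different number whenever some parameters have more assessments than others, so the choice is worth stating alongside any headline figure.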
Is the benchmark independently verified?
An independent third-party audit is currently in progress. We expect results by Q3 2026. Until then, all figures are from internal validation.
Why do Western models fail on Indian skin?
Training data bias. 79% of dermatology datasets originate from Europe, North America, and Oceania. Conditions like hyperpigmentation, melasma, and post-inflammatory changes present differently on darker skin tones, and those presentations are systematically underrepresented in the training data.
How can I reproduce these results?
We offer a research partnership programme that provides access to our benchmark methodology and anonymised test sets. Contact research@rupam.ai for details.