Scoring Methodology

This page provides a detailed technical description of the AI scoring pipeline for alopecia severity assessment. It covers the segmentation model, regional SALT weighting, score computation, severity classification, and reproducibility characteristics.

Scalp segmentation model

Pixel-level hair loss detection

The segmentation stage uses a convolutional neural network (CNN) trained for semantic segmentation on scalp images. The model classifies each pixel in a scalp photograph as either hair-bearing or non-hair-bearing, producing a detailed map of the hair loss pattern.

This approach differs fundamentally from the acne or psoriasis scoring pipelines, which use object detection (bounding boxes) or component scoring. For alopecia, the relevant measurement is the area of scalp affected by hair loss, not the detection of individual lesions.

Training data

The model was trained on a dataset of scalp images captured from multiple perspectives (left, right, top, back), annotated by board-certified dermatologists specialising in hair disorders. Annotations consist of pixel-level segmentation masks delineating hair-bearing and non-hair-bearing regions.

Confidence thresholding

The segmentation model produces a per-pixel probability map for hair loss. A calibrated threshold converts these probabilities into binary hair-bearing/non-hair-bearing classifications. The threshold was optimised to balance sensitivity (detecting subtle patches of hair loss) against specificity (not misclassifying normal scalp partings or shadows as hair loss). The threshold is fixed at inference time and applies identically across all images and sites.

SALT score computation

The SALT formula

The Severity of Alopecia Tool (SALT) partitions the scalp into four quadrants and computes a weighted sum of regional hair loss percentages:

\text{SALT} = \sum_{r \in \{\text{left}, \text{right}, \text{top}, \text{back}\}} w_r \cdot S_r

where:

$S_r$ = percentage of hair loss in scalp region $r$ , estimated by the AI segmentation model (0–100%)
$w_r$ = weight for region $r$ , reflecting its proportion of total scalp area

Regional weights

The SALT methodology defines four scalp quadrants with anatomically determined weights:

Region	Weight	Description
Top	40%	The largest scalp region; encompasses the vertex and crown
Back	24%	Posterior scalp from occiput to nape
Left	18%	Left lateral scalp
Right	18%	Right lateral scalp

These weights ensure that hair loss in larger scalp regions contributes proportionally more to the total score. For example, 50% hair loss in the top quadrant (which covers 40% of the scalp) contributes 20 SALT points, while the same 50% hair loss in a lateral quadrant (18% of scalp) contributes only 9 SALT points.

How the AI estimates regional hair loss percentage

For each quadrant image, the AI:

Segments the image into hair-bearing and non-hair-bearing pixels
Identifies the scalp boundary (excluding background, clothing, and non-scalp areas)
Computes the ratio: $S_r = \frac{\text{non-hair-bearing pixels within scalp boundary}}{\text{total pixels within scalp boundary}} \times 100$

This produces a continuous percentage with sub-percentage resolution, compared to the 5–10% increments typical of manual visual estimation.

Severity classification

The total SALT score maps to severity categories that are standard in the alopecia areata literature:

SALT score	Severity	Clinical description
0	None	No detectable hair loss
1	Limited	1–24% scalp hair loss; localised patches
2	Moderate	25–49% scalp hair loss; multiple or larger patches
3	Severe	50–74% scalp hair loss; extensive involvement
4	Very Severe	75–100% scalp hair loss; near-total (alopecia totalis) or total loss

These categories are used for patient stratification, responder analysis, and severity-based inclusion/exclusion criteria in clinical trial protocols.

SALT response thresholds

The system automatically computes SALT response categories relative to each patient's baseline:

Response	Calculation	Example
SALT 50	$\frac{\text{Baseline} - \text{Current}}{\text{Baseline}} \geq 50\%$	Baseline SALT 60, current SALT 25 = 58% reduction (response)
SALT 75	$\frac{\text{Baseline} - \text{Current}}{\text{Baseline}} \geq 75\%$	Baseline SALT 60, current SALT 12 = 80% reduction (response)
SALT 90	$\frac{\text{Baseline} - \text{Current}}{\text{Baseline}} \geq 90\%$	Baseline SALT 60, current SALT 5 = 92% reduction (response)
SALT 100	Current SALT = 0	Complete hair regrowth

These responder definitions are consistent with FDA guidance and are used as primary and co-primary endpoints in alopecia areata registration trials.

Reproducibility

Inter-rater variability: eliminated

Different investigators scoring the same patient produce different results. AI-powered scoring eliminates this entirely — the same image always produces the same score, regardless of which site captures it.

Intra-rater variability: eliminated

The same investigator may score the same patient differently on different occasions due to fatigue, time pressure, learning effects, or subjective drift over a long study. The AI has no such drift — it is the same model, with the same weights, producing the same output deterministically.

Site-to-site consistency

In multi-center trials, scoring consistency across sites is critical for endpoint integrity. Manual scoring requires extensive calibration exercises, training sessions, and ongoing monitoring for rater drift. AI scoring requires none of this — scores are inherently consistent across all sites.

Impact on trial design

Reduced scoring variability means cleaner endpoint data, which translates to:

Smaller required sample sizes — less noise means smaller samples can detect the same treatment effect
Faster data lock — no queries related to scoring inconsistencies
Stronger regulatory submissions — consistent, reproducible data with documented methodology

Model versioning

The AI model version is locked at study initiation. This ensures every patient in the study is scored by the same model throughout the trial:

No mid-trial model updates: the model version is frozen when the study is configured
Version tracking: the model version number is recorded in every scored report and in the audit trail
Change control: any model changes follow the formal change control process under IEC 62304, including risk assessment per ISO 14971
Reproducibility guarantee: any image can be re-scored at any time and will produce the identical result

Comparison with manual SALT assessment

Dimension	Manual counting	AI counting
Time per patient	5–10 minutes (visual estimation per quadrant)	<2 seconds
Inter-rater variability	Significant; percentage estimation is subjective	Zero — deterministic
Intra-rater variability	Documented; same rater may score differently on different days	Zero — no fatigue
Scalability	Limited by rater availability	Unlimited — API-based
Consistency across sites	Requires calibration exercises and photographic training	Inherent — same model everywhere
Granularity	Raters typically estimate in 5–10% increments	Continuous percentage with sub-percentage resolution
Automated alerts	Not available; requires manual comparison to baseline	Configurable threshold-based notifications (e.g., ≥25% increase from baseline)
Cost	Central reader fees; per-patient costs	Fixed per-study licensing

The fundamental limitation of manual SALT scoring is the subjective estimation of percentage hair loss from visual inspection. Studies have shown that dermatologists typically estimate in 5–10% increments and that inter-rater variability is significant, particularly for intermediate severity levels. The AI eliminates this variability by computing percentages from pixel-level segmentation, providing a continuous, reproducible measurement.

Counting methodology reference

The standard protocol follows the SALT (Severity of Alopecia Tool) (Olsen et al. (2004)), which the SALT score is a validated, standardised method for quantifying the extent of scalp hair loss. The scalp is divided into four quadrants, each assessed for percentage of hair loss, and the results are combined using anatomical weights to produce a total score from 0 (no hair loss) to 100 (complete hair loss).

SALT (Severity of Alopecia Tool) grade	Range
None	SALT 0 (0% hair loss)
Limited	SALT 1–24 (1–24% hair loss)
Moderate	SALT 25–49 (25–49% hair loss)
Severe	SALT 50–74 (50–74% hair loss)
Very Severe	SALT 75–100 (75–100% hair loss)

Scalp segmentation model​

Pixel-level hair loss detection​

Training data​

Confidence thresholding​

SALT score computation​

The SALT formula​

Regional weights​

How the AI estimates regional hair loss percentage​

Severity classification​

SALT response thresholds​

Reproducibility​