T15: Predicting hip Kellgren-Lawrence score


Objective:
This task aims to classify each radiology report by its Kellgren-Lawrence score for hip osteoarthritis, which ranges from 0 to 4.

Patient Population:
Radiology reports (in Dutch) for hip osteoarthritis were collected at Radboudumc.

Imaging Data:
Not applicable. The task is based solely on textual data — radiology reports written in Dutch.

Test Data:
The test set contains 172 cases.

Reference Standard:

  • Reports were manually annotated by a trained investigator.
  • Difficult cases were reviewed by a radiologist.
  • Labels include Kellgren-Lawrence scores (0–4) and two additional categories: "hip prosthesis" and "not determinable."
  • The dataset is imbalanced, and while the scores are ordinal, the additional categories are not.

Evaluation Metrics:
Performance is assessed using unweighted Kappa.
Predictions for the left and right hip are concatenated, and unweighted Kappa is computed on the combined predictions.
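The concatenate-then-score procedure can be sketched as follows. This is a minimal illustration, not the official evaluation code: it assumes "unweighted Kappa" means Cohen's kappa without disagreement weights, and the function names (`unweighted_kappa`, `hip_kappa`) and the label strings are hypothetical choices made for this example.

```python
from collections import Counter

# Hypothetical label strings; the encoding used by the challenge may differ.
LABELS = ["0", "1", "2", "3", "4", "hip prosthesis", "not determinable"]


def unweighted_kappa(y_true, y_pred):
    """Cohen's kappa with no disagreement weighting (assumed metric)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    n = len(y_true)
    # Observed agreement: fraction of cases where the labels match exactly.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        # Both sides use a single identical label; kappa is undefined here.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


def hip_kappa(true_left, true_right, pred_left, pred_right):
    """Concatenate left and right hip labels, then score them jointly."""
    y_true = list(true_left) + list(true_right)
    y_pred = list(pred_left) + list(pred_right)
    return unweighted_kappa(y_true, y_pred)
```

Because no weights are applied, every mismatch counts equally: predicting 3 for a true score of 4 is penalized as much as predicting "hip prosthesis". This fits a label set that mixes ordinal scores with non-ordinal categories.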

Relation to Existing Challenges:

  • Task 15 is derived from Task018 (Hip Kellgren-Lawrence scoring) of the DRAGON challenge.
  • Unlike the DRAGON challenge, UNICORN allows training on the platform using only a small set of few-shot examples per task.

Additional Resources: