Tasks
The tasks of the challenge are twofold:
 To segment vertebrae from the given spine images that include fractured and non-fractured cases, and provide vertebra segmentation results in the form of corresponding masks.
 To classify vertebrae from the given spine images into fractured and non-fractured cases along with specific morphological grades and cases of vertebral fractures, and provide fracture classification results in the form of corresponding fracture scores.
Participants are invited to develop automated or semi-automated computer-assisted algorithms to solve both tasks or an individual task, and to submit their results through this website. The results for the vertebra segmentation task and the fracture classification task will be evaluated and ranked separately, so that the tasks can be approached individually or jointly; a high-quality vertebra segmentation does not necessarily imply a high-quality fracture classification, and vice versa.
Rules
This is an open online challenge, meaning that anyone can participate by entering the challenge (open), and that results are regularly updated and posted through this website (online). If a sufficient number of participants enter the challenge, the organizers may decide to proceed to a live challenge with corresponding paper submissions at a major medical imaging conference (e.g. MICCAI, ISBI, SPIE MI) and/or prepare a joint journal paper summarizing the challenge outcomes for a high-impact journal in the corresponding field (e.g. IEEE TMI, MedIA). The following rules apply:
 The database can be downloaded by anyone; however, we kindly ask users to register with their contact information before downloading it, so that we can keep track of its usage. Moreover, each registered user will be informed via email about future developments and news related to the challenge.
 Contributions to the challenge are not limited to new and unpublished methods, which means that the application of existing methods is allowed. Participants agree to specifically describe any manual operations in their contributions, since such operations may influence their final ranking.
 Registering with contact information is mandatory at result submission; however, participation in the challenge is anonymous to the extent that the identities of participants are known only to the organizers. When submitting results, each participant will be given a unique identifier, the so-called participant ID (PID), by which the participant will be identified in the challenge results posted online.
 Online challenge results will be reported by PID in the form of ranks only. Ranking will be performed separately for the vertebra segmentation task and fracture classification task. Along with PID, each participant will also receive a password for accessing a more detailed report of the corresponding results.
 Participants that originate from the same group can submit at most two (2) substantially different contributions to the challenge. Participants are allowed to resubmit an improved version of an existing contribution up to twice (2×), meaning that a maximum of three (3) versions per contribution are allowed; the version resulting in the highest rank will be assigned to the corresponding participant.
 If the challenge organizers decide to proceed to a live challenge and/or prepare a joint journal paper, they reserve the right to decline selected participants (e.g. participants with trivial contributions producing very poor results). Accepted participants must agree to disclose their identity and provide a short description of the contributed method. For the joint journal paper, a maximum of two (2) co-authors per contribution will be allowed.
Evaluation Metrics
Submitted results will be evaluated separately for the vertebra segmentation and fracture classification tasks.
Vertebra Segmentation Evaluation Metrics
For vertebra segmentation, the following metrics will be considered to evaluate the quality of volume masks \(M_{seg}\) of segmented vertebrae against corresponding reference volume masks \(M_{ref}\):
 DSC
 The Dice similarity coefficient (DSC) is defined as:
$$DSC = \frac{2N(M_{seg} \cap M_{ref})}{N(M_{seg})+N(M_{ref})}$$
where \(N(M_{seg} \cap M_{ref})\) is the number of voxels in the overlap between the volume masks \(M_{seg}\) and \(M_{ref}\), \(N(M_{seg})\) is the number of voxels in the segmented volume mask \(M_{seg}\), and \(N(M_{ref})\) is the number of voxels in the reference volume mask \(M_{ref}\).
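As a concrete illustration, the DSC can be computed from two binary NumPy volume masks as sketched below. This is not the challenge's official evaluation code, and the convention for two empty masks is an assumption (it is not specified above):

```python
import numpy as np

def dice(m_seg: np.ndarray, m_ref: np.ndarray) -> float:
    """Dice similarity coefficient between two binary volume masks."""
    m_seg = m_seg.astype(bool)
    m_ref = m_ref.astype(bool)
    overlap = np.logical_and(m_seg, m_ref).sum()
    total = int(m_seg.sum()) + int(m_ref.sum())
    # Convention for two empty masks is an assumption (not specified above).
    return 2.0 * overlap / total if total > 0 else 0.0

# Toy example: two 4-voxel masks overlapping in exactly 1 voxel.
a = np.zeros((3, 3, 1), dtype=bool); a[:2, :2, 0] = True
b = np.zeros((3, 3, 1), dtype=bool); b[1:, 1:, 0] = True
print(dice(a, b))  # 2*1 / (4+4) = 0.25
```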
 MSSD
 The mean symmetric surface distance (MSSD) is defined as:
$$MSSD = \frac{1}{N(M_{seg}) + N(M_{ref})}\left(\sum_{i \in M_{seg}}\min_{j \in M_{ref}}d(i,j) + \sum_{j \in M_{ref}}\min_{i \in M_{seg}}d(j,i)\right)$$
where \(\min_{j \in M_{ref}}d(i,j)\) is the Euclidean distance from surface voxel \(i\) in the segmented volume mask \(M_{seg}\) to the closest surface voxel \(j\) in the reference volume mask \(M_{ref}\), and \(\min_{i \in M_{seg}}d(j,i)\) is the Euclidean distance from surface voxel \(j\) in the reference volume mask \(M_{ref}\) to the closest surface voxel \(i\) in the segmented volume mask \(M_{seg}\).
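The MSSD can be sketched in Python as below. This is a brute-force illustration for small masks, not the official evaluation code; it interprets the normalizing counts as numbers of surface voxels (since the sums above run over surface voxels), extracts surfaces by morphological erosion, and assumes isotropic unit voxel spacing unless given — all of which are assumptions:

```python
import numpy as np
from scipy.ndimage import binary_erosion
from scipy.spatial.distance import cdist

def surface_voxels(mask: np.ndarray) -> np.ndarray:
    """Coordinates of surface voxels: the mask minus its erosion."""
    mask = mask.astype(bool)
    return np.argwhere(mask & ~binary_erosion(mask))

def mssd(m_seg, m_ref, spacing=(1.0, 1.0, 1.0)):
    """Mean symmetric surface distance (brute force, small masks only)."""
    s = surface_voxels(m_seg) * np.asarray(spacing)
    r = surface_voxels(m_ref) * np.asarray(spacing)
    d = cdist(s, r)  # all pairwise Euclidean surface-to-surface distances
    # Sum of each surface voxel's distance to the closest voxel on the
    # opposite surface, in both directions, normalized by the total
    # number of surface voxels.
    return (d.min(axis=1).sum() + d.min(axis=0).sum()) / (len(s) + len(r))
```

For identical masks the MSSD is zero, and the metric is symmetric in its two arguments by construction.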

Fracture Classification Evaluation Metrics
For fracture classification, the following metrics will be considered to evaluate the correctness of detected fracture scores \(S_{det} = (g_{det},c_{det})\) against corresponding reference fracture scores \(S_{ref} = (g_{ref},c_{ref})\), where \(g_{det}\) and \(g_{ref}\) are morphological grades, and \(c_{det}\) and \(c_{ref}\) are morphological cases of vertebral fractures (if either \(g=0\) or \(c=0\), the only possible score is \(S=(0,0)\), representing a non-fractured vertebra):
 MSPP
 The shortest path penalty (SPP) is defined as the sum of individual penalties accumulated along the shortest path from the detected score \(S_{det}\) to the corresponding reference score \(S_{ref}\). In this case, each individual penalty equals 1, representing each change in morphological grade and/or case starting from the detected score and reaching the corresponding reference score. For all vertebrae, the mean shortest path penalty (MSPP) is therefore computed as:
$$MSPP = \frac{\sum\left(|g_{det} - g_{ref}| + [c_{det} \neq c_{ref}]\right)}{\sum[c_{ref} \geq 0]}$$
where \([x]=1\) if condition \(x\) is satisfied, and \([x]=0\) otherwise. For example, for a vertebra with the detected score \(S_{det}=(1,3)\) (mild crush fracture) and the corresponding reference score \(S_{ref}=(2,1)\) (moderate wedge fracture), the penalty is \(SPP=2\) (one change in grade plus one change in case).
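The SPP and MSPP can be sketched as follows, assuming the grade component contributes the absolute grade difference and the case component contributes one penalty unit when the cases differ (consistent with the worked example above); this is an illustration, not the official evaluation code:

```python
def spp(s_det, s_ref):
    """Shortest-path penalty between a detected and a reference score (g, c)."""
    g_det, c_det = s_det
    g_ref, c_ref = s_ref
    # One penalty unit per unit change in grade, plus one if the case differs.
    return abs(g_det - g_ref) + int(c_det != c_ref)

def mspp(detected, reference):
    """Mean shortest-path penalty over all vertebrae."""
    return sum(spp(d, r) for d, r in zip(detected, reference)) / len(reference)

print(spp((1, 3), (2, 1)))  # 2, as in the example above
```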

 F-score
 The F-score quantitatively describes the quality of a binary classification and is defined as:
$$F = \frac{2 \cdot PPV \cdot TPR}{PPV + TPR}$$
where the positive predictive value (PPV), also known as precision, is defined as the proportion of vertebrae detected as fractured that are truly fractured, and the true positive rate (TPR), also known as sensitivity or recall, is defined as the proportion of truly fractured vertebrae that are correctly detected as fractured:
$$\begin{split}
PPV &= \frac{TP}{TP + FP} \\[0.5em]
TPR &= \frac{TP}{TP + FN}
\end{split}$$
where true positives \(TP = \sum\left([c_{det} > 0]\cdot[c_{ref} > 0]\right)\) represent the number of fractured vertebrae that are correctly detected as fractured, false negatives \(FN = \sum\left([c_{det} = 0]\cdot[c_{ref} > 0]\right)\) represent the number of fractured vertebrae that are incorrectly detected as non-fractured, true negatives \(TN = \sum\left([c_{det} = 0]\cdot[c_{ref} = 0]\right)\) represent the number of non-fractured vertebrae that are correctly detected as non-fractured, and false positives \(FP = \sum\left([c_{det} > 0]\cdot[c_{ref} = 0]\right)\) represent the number of non-fractured vertebrae that are incorrectly detected as fractured. Positives \(P=TP+FN=\sum[c_{ref}>0]\) represent the number of all fractured vertebrae, and negatives \(N=FP+TN=\sum[c_{ref}=0]\) represent the number of all non-fractured vertebrae, where \([x]=1\) if condition \(x\) is satisfied, and \([x]=0\) otherwise.
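Counting TP, FP, and FN directly from the detected and reference morphological cases yields PPV, TPR, and the F-score, as sketched below (an illustration, not the official evaluation code; the zero-division conventions are assumptions):

```python
def f_score(c_det, c_ref):
    """F-score over all vertebrae, where case 0 denotes a non-fractured vertebra."""
    tp = sum(1 for d, r in zip(c_det, c_ref) if d > 0 and r > 0)
    fp = sum(1 for d, r in zip(c_det, c_ref) if d > 0 and r == 0)
    fn = sum(1 for d, r in zip(c_det, c_ref) if d == 0 and r > 0)
    # Zero-division conventions below are assumptions (not specified above).
    ppv = tp / (tp + fp) if tp + fp else 0.0  # precision
    tpr = tp / (tp + fn) if tp + fn else 0.0  # sensitivity/recall
    return 2 * ppv * tpr / (ppv + tpr) if ppv + tpr else 0.0

# One TP, one FN, one FP, one TN -> PPV = TPR = F = 0.5.
print(f_score([1, 0, 2, 0], [1, 2, 0, 0]))  # 0.5
```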

Ranking
Ranking of the results will be performed on the basis of the evaluation metrics separately for vertebra segmentation and fracture classification.
Vertebra Segmentation Ranking
Ranking of vertebra segmentation results will be performed in the following order:
 for each segmented vertebra in each spine image, the DSC and MSSD values will be computed for each participant,
 then, for each segmented vertebra in each spine image, ranks for the DSC and MSSD values will be computed across all participants (if the DSC equals zero, the participant will be assigned the worst rank for both the DSC and MSSD),
 then, for each segmented vertebra in each spine image, the mean of the DSC and MSSD ranks will be computed for each participant, resulting in the participant's rank for that vertebra,
 then, the mean of the participant's ranks across all vertebrae will be computed, resulting in the participant's final vertebra segmentation rank.
Ranks for the DSC and MSSD values are natural numbers, while all remaining ranks, including the participant's final vertebra segmentation rank, are rational numbers.
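The steps above can be sketched as below, given per-vertebra DSC and MSSD matrices of shape (participants, vertebrae). The tie-handling method ('min' ranks) and the choice of the number of participants as the worst rank for zero-DSC cases are assumptions not specified above:

```python
import numpy as np
from scipy.stats import rankdata

def segmentation_rank(dsc, mssd):
    """Final vertebra segmentation rank per participant.

    dsc, mssd: arrays of shape (participants, vertebrae).
    """
    dsc, mssd = np.asarray(dsc, float), np.asarray(mssd, float)
    # Per-vertebra ranks across participants: higher DSC and lower MSSD are better.
    dsc_rank = np.apply_along_axis(lambda v: rankdata(-v, method='min'), 0, dsc)
    mssd_rank = np.apply_along_axis(lambda v: rankdata(v, method='min'), 0, mssd)
    # A zero DSC receives the worst rank for both metrics (assumed here to be
    # the number of participants).
    worst = dsc.shape[0]
    dsc_rank[dsc == 0] = worst
    mssd_rank[dsc == 0] = worst
    per_vertebra = (dsc_rank + mssd_rank) / 2.0  # participant's rank per vertebra
    return per_vertebra.mean(axis=1)             # final segmentation rank
```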
Fracture Classification Ranking
Ranking of fracture classification results will be performed in the following order:
 for all classified vertebrae, the MSPP and F-score values will be computed for each participant,
 then, ranks for the MSPP and F-score values will be computed across all participants,
 then, the mean of the MSPP and F-score ranks will be computed for each participant, resulting in the participant's final fracture classification rank.
Ranks for the MSPP and F-score values are natural numbers, while the participant's final fracture classification rank is a rational number.
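Analogously, the fracture classification ranking can be sketched from each participant's MSPP and F-score values (again an illustration; the 'min' tie-handling method is an assumption):

```python
from scipy.stats import rankdata

def classification_rank(mspp_values, f_scores):
    """Final fracture classification rank per participant.

    Lower MSPP and higher F-score are better.
    """
    mspp_rank = rankdata(mspp_values, method='min')
    f_rank = rankdata([-f for f in f_scores], method='min')
    # Mean of the two metric ranks gives the final (rational) rank.
    return [float(m + f) / 2.0 for m, f in zip(mspp_rank, f_rank)]

print(classification_rank([0.4, 1.2], [0.9, 0.6]))  # [1.0, 2.0]
```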