Locomotion scores are used for lameness detection in dairy cows. In research, locomotion scores with 5 levels are used most often. Analysis of scores, however, is done after transformation of the original 5-level scale into a 4-, 3-, or 2-level scale to improve reliability and agreement. The objective of this study was to evaluate different ways of merging levels to optimize resolution, reliability, and agreement of locomotion scores for dairy cows. Locomotion scoring was done by 10 experienced raters, using a 5-level scale, in 2 different scoring sessions from videos of 58 cows. Intra- and interrater reliability and agreement were calculated as weighted kappa coefficient (κw) and percentage of agreement (PA), respectively. Overall intra- and interrater reliability and agreement and specific intra- and interrater agreement were determined for the 5-level scale and after transformation into 4-, 3-, and 2-level scales by merging different combinations of adjacent levels. For the 5-level scale, intrarater reliability (κw) ranged from 0.63 to 0.86, whereas intrarater agreement (PA) ranged from 60.3 to 82.8%. Interrater κw ranged from 0.28 to 0.84 and interrater PA ranged from 22.6 to 81.8% for the 5-level scale. The specific intrarater agreement was 76.4% for locomotion level 1, 68.5% for level 2, 65% for level 3, 77.2% for level 4, and 80% for level 5. Specific interrater agreement was 64.7% for locomotion level 1, 57.5% for level 2, 50.8% for level 3, 60% for level 4, and 45.2% for level 5. Specific intra- and interrater agreement suggested that levels 2 and 3 were more difficult to score consistently than the other levels in the 5-level scale. The acceptance threshold for overall intra- and interrater reliability (κw and κ ≥0.6) and agreement (PA ≥75%) and specific intra- and interrater agreement (≥75% for all levels within locomotion score) was exceeded only for the 2-level scale when the 5 levels were merged as (12) (345) or (123) (45).
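As an illustration of the measures named above, the following minimal Python sketch computes PA, a weighted kappa, and a specific agreement for two sets of scores, before and after merging levels. Several points are assumptions rather than details taken from the study: the kappa uses linear disagreement weights (the weighting scheme is not stated here), specific agreement is computed as twice the number of joint assignments to a level divided by the total number of assignments to that level by both ratings, and the scores in the usage example are invented toy data. All function names are hypothetical.

```python
from collections import Counter


def merge_levels(scores, groups):
    """Map original levels to merged levels, e.g. groups=[(1, 2), (3, 4, 5)]."""
    lookup = {level: new for new, grp in enumerate(groups, start=1) for level in grp}
    return [lookup[s] for s in scores]


def percentage_agreement(a, b):
    """Percentage of animals given the same level in both ratings."""
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def weighted_kappa(a, b, n_levels):
    """Weighted kappa with linear weights (assumed weighting scheme)."""
    n = len(a)
    max_dist = n_levels - 1
    joint = Counter(zip(a, b))
    count_a, count_b = Counter(a), Counter(b)
    # Observed and chance-expected mean disagreement, both scaled to [0, 1].
    observed = sum(abs(x - y) / max_dist * c for (x, y), c in joint.items()) / n
    expected = sum(
        abs(i - j) / max_dist * count_a[i] * count_b[j]
        for i in range(1, n_levels + 1)
        for j in range(1, n_levels + 1)
    ) / n ** 2
    return 1.0 - observed / expected


def specific_agreement(a, b, level):
    """Agreement for one level: 2 * joint count / (count in a + count in b)."""
    both = sum(x == level and y == level for x, y in zip(a, b))
    total = sum(x == level for x in a) + sum(y == level for y in b)
    return 100.0 * 2 * both / total


# Toy data (invented): one rater, two sessions, 10 cows, 5-level scale.
session_1 = [1, 2, 2, 3, 3, 4, 5, 1, 2, 3]
session_2 = [1, 2, 3, 3, 2, 4, 5, 1, 3, 4]

# Merge the 5 levels into 2 levels as (12) (345).
merged_1 = merge_levels(session_1, [(1, 2), (3, 4, 5)])
merged_2 = merge_levels(session_2, [(1, 2), (3, 4, 5)])

print(percentage_agreement(session_1, session_2))
print(weighted_kappa(session_1, session_2, n_levels=5))
print(percentage_agreement(merged_1, merged_2))
print(weighted_kappa(merged_1, merged_2, n_levels=2))
print(specific_agreement(merged_1, merged_2, level=1))
```

With only 2 levels, the linearly weighted kappa reduces to the unweighted kappa, which may be why the acceptance threshold is stated for both κw and κ.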
In conclusion, when locomotion scoring was performed by experienced raters without further joint training, the lowest specific intra- and interrater agreement was obtained in levels 2 and 3 of the 5-level scale. Acceptance thresholds for overall intra- and interrater reliability and agreement and specific intra- and interrater agreement were exceeded only in the 2-level scale.
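To make "merging different combinations of adjacent levels" concrete, the short Python sketch below enumerates every way of collapsing an ordered 5-level scale into fewer levels by joining neighbouring levels only. It is an illustrative assumption about how the candidate transformations could be listed, not the authors' procedure, and the function names are hypothetical.

```python
from itertools import combinations


def adjacent_merges(n_levels, n_groups):
    """List all ways to split levels 1..n_levels into n_groups adjacent groups."""
    levels = list(range(1, n_levels + 1))
    merges = []
    # Choosing n_groups - 1 cut points between neighbouring levels defines a merge.
    for cuts in combinations(range(1, n_levels), n_groups - 1):
        bounds = (0, *cuts, n_levels)
        merges.append([tuple(levels[i:j]) for i, j in zip(bounds, bounds[1:])])
    return merges


def label(groups):
    """Format a merge in the notation used in the text, e.g. '(12) (345)'."""
    return " ".join("(" + "".join(map(str, g)) + ")" for g in groups)


for n_groups in (4, 3, 2):
    options = [label(m) for m in adjacent_merges(5, n_groups)]
    print(f"{n_groups}-level scale: {len(options)} options: {options}")
```

Under this enumeration, a 5-level scale can be reduced in 4 ways to a 4-level scale, 6 ways to a 3-level scale, and 4 ways to a 2-level scale; the merges (12) (345) and (123) (45) are two of the four 2-level options.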