The Hawaii high school cross country championship was held this weekend. At first, only the live time and place results were displayed, and no team scores had been calculated, so I wanted to work out the team scores from the live results myself. Doing so also let me explore some hypotheticals, such as how the teams would have placed if Division 1 and Division 2 had run as a single race.
Data Munging
I downloaded the data from https://hhsaa.org/sports/cross_country/tournament/2019.
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(lubridate)
##
## Attaching package: 'lubridate'
## The following object is masked from 'package:base':
##
## date
# tbl_df() is deprecated in current dplyr; as_tibble() does the same job
dc1 <- as_tibble(read.csv("../datasets/xcountry2019.csv", colClasses = "character"))
dc1$place_overall <- as.numeric(dc1$place_overall)
dc1$place_team <- as.numeric(dc1$place_team)
dc1$division <- as.factor(dc1$division)
dc1$school <- as.factor(dc1$school)
dc1$gender <- as.factor(dc1$gender)
The running times come in mm:ss format, so I converted them to a lubridate period and to total seconds, both of which are easier to manipulate.
dc1$runtime <- ms(dc1$run_time)
dc1$runtime_seconds <- period_to_seconds(ms(dc1$run_time))
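As a sanity check on the conversion, the same arithmetic can be done by hand in base R. This is only a sketch: `mmss_to_seconds` is a hypothetical helper, not a lubridate function, and it assumes every time is in mm:ss.s form with no hours field.

```r
# Hypothetical helper (not part of lubridate): convert "mm:ss.s" strings to seconds
mmss_to_seconds <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  # minutes * 60 + seconds for each split string
  vapply(parts, function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]), numeric(1))
}

# The first three run_time values in the data
mmss_to_seconds(c("16:15.6", "16:19.5", "16:34.0"))
```

The results should agree with the first `runtime_seconds` values that lubridate produces for the same runners (975.6, 979.5 and 994.0).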
Here’s the data format.
glimpse(dc1)
## Observations: 394
## Variables: 11
## $ run_time <chr> "16:15.6", "16:19.5", "16:34.0", "16:46.9", "16:…
## $ place_overall <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1…
## $ place_team <dbl> 1, 2, 3, 4, 5, NA, 6, NA, 7, 8, 9, 10, 11, 12, 1…
## $ bib <chr> "1", "21", "3", "2", "149", "47", "33", "137", "…
## $ division <fct> 1, 2, 1, 1, 1, 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, …
## $ name <chr> "Hunter Shields", "Adam Harder", "Damon Wakefiel…
## $ school <fct> Maui High, Hanalani, Maui High, Maui High, Moana…
## $ pace <chr> "05:15", "05:16", "05:20", "05:25", "05:26", "05…
## $ gender <fct> m, m, m, m, m, m, m, m, m, m, m, m, m, m, m, m, …
## $ runtime <Period> 16M 15.6S, 16M 19.5S, 16M 34S, 16M 46.9S, 16M…
## $ runtime_seconds <dbl> 975.6, 979.5, 994.0, 1006.9, 1011.8, 1014.2, 101…
The place within each division and gender (e.g., place among Division 1 boys) was not provided in the original table, so I had to calculate it.
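# Keep runners who have a team place, then rank by time within each division and gender
dc1 <- dc1 %>% filter(!is.na(place_team)) %>%
group_by(division, gender) %>%
arrange(runtime_seconds) %>%
mutate(place_division = dense_rank(runtime_seconds)) %>%
ungroup()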
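One detail worth illustrating is how `dense_rank()` treats ties compared with `min_rank()`. The sketch below uses made-up times, not the race data, and ranks within each division the same way as the step above.

```r
library(dplyr)

# Toy data (made up for illustration): two divisions, one tie in division 1
toy <- data.frame(
  division = c(1, 1, 1, 1, 2, 2),
  runtime_seconds = c(975.6, 994.0, 994.0, 1006.9, 979.5, 1015.1)
)

ranked <- toy %>%
  group_by(division) %>%
  mutate(
    place_dense = dense_rank(runtime_seconds),  # tied runners share a place, next place is not skipped
    place_min   = min_rank(runtime_seconds)     # tied runners share a place, next place is skipped
  ) %>%
  ungroup()  # drop the grouping so later steps start clean

ranked
```

When no two runners in a group share a time, the two functions give identical places, so the choice only matters if there are ties.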
Results: Division Winners by Team Score
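A team's score is the sum of the places of its five fastest runners, and the lowest score wins. Computed this way, my team scores matched those on the official website.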
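The same pipeline is repeated for each race in the sections that follow, so it could be wrapped in a small function. The sketch below is one way to do that, not the code used for the results in this post: `team_scores` and its `place` column are hypothetical names, and it uses `slice_min()` (available in dplyr 1.0.0 and later) in place of `top_n()`.

```r
library(dplyr)

# Hypothetical helper: score each team as the sum of its fastest runners' places.
# Assumes `results` has columns school, runtime_seconds, and place.
team_scores <- function(results, n_scorers = 5) {
  results %>%
    group_by(school) %>%
    slice_min(runtime_seconds, n = n_scorers, with_ties = FALSE) %>%
    summarize(team_score = sum(place)) %>%
    arrange(team_score)  # lowest score wins
}

# Toy race (made up): two schools, six runners each, finishing in places 1 to 12
toy <- data.frame(
  school = rep(c("A", "B", "B", "A"), times = 3),
  place = 1:12,
  stringsAsFactors = FALSE
)
toy$runtime_seconds <- 900 + 10 * toy$place

team_scores(toy)
```

School A's five fastest runners finish 1st, 4th, 5th, 8th and 9th for 27 points, and school B's finish 2nd, 3rd, 6th, 7th and 10th for 28 points. Each school's sixth runner does not count toward its score.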
Division 1 Boys
dc1 %>% filter(!is.na(place_team)) %>%
filter(division == 1, gender == "m") %>%
group_by(school) %>%
top_n(5, -runtime_seconds) %>%
summarize(team_score = sum(place_division)) %>%
arrange(team_score)
## # A tibble: 14 x 2
## school team_score
## <fct> <int>
## 1 Punahou 41
## 2 Maui High 57
## 3 Kalani High School 129
## 4 Kamehameha-Kapalama 135
## 5 Waiakea 160
## 6 Radford High School 164
## 7 Pearl City High School 187
## 8 Moanalua High School 207
## 9 Kalaheo High School 227
## 10 Mililani High School 228
## 11 Kealakehe 238
## 12 Hilo 286
## 13 Campbell High School 316
## 14 Kea'au 378
Division 2 Boys
dc1 %>% filter(!is.na(place_team)) %>%
filter(division == 2, gender == "m") %>%
group_by(school) %>%
top_n(5, -runtime_seconds) %>%
summarize(team_score = sum(place_division)) %>%
arrange(team_score)
## # A tibble: 7 x 2
## school team_score
## <fct> <int>
## 1 Seabury Hall 47
## 2 Hawaii Baptist Academy 51
## 3 Island School 86
## 4 Kamehameha-Hi 112
## 5 Kauai High School 118
## 6 Hawaii Prep 119
## 7 Hanalani 120
Combined Boys
As a hypothetical, I calculated the team scores with the two divisions combined.
dc1 %>% filter(!is.na(place_team)) %>%
filter(gender == "m") %>%
group_by(school) %>%
top_n(5, -runtime_seconds) %>%
summarize(team_score = sum(place_team)) %>%
arrange(team_score)
## # A tibble: 21 x 2
## school team_score
## <fct> <dbl>
## 1 Punahou 56
## 2 Maui High 85
## 3 Hawaii Baptist Academy 138
## 4 Seabury Hall 141
## 5 Kalani High School 197
## 6 Kamehameha-Kapalama 210
## 7 Waiakea 244
## 8 Radford High School 248
## 9 Island School 251
## 10 Pearl City High School 293
## # … with 11 more rows
The top Division 2 team would have finished 3rd in the combined race. Interestingly, that team (HBA) did not win the Division 2 race; Seabury Hall did. Presumably a number of Division 1 runners finished between the last scoring runners for HBA and Seabury Hall, which adds more to Seabury Hall's combined score than to HBA's.
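A small made-up example shows how this kind of reversal can happen. The finish order below is invented and uses three scorers per team to keep it short: `S` and `H` stand in for two Division 2 teams, and `D1` for Division 1 runners.

```r
# Toy finish order (made up): S and H are division 2 teams, D1 marks division 1 runners
finish <- data.frame(
  team = c("S", "S", "H", "H", "H", "D1", "D1", "D1", "D1", "S"),
  division = c(2, 2, 2, 2, 2, 1, 1, 1, 1, 2),
  stringsAsFactors = FALSE
)
finish$place_combined <- seq_len(nrow(finish))

# Division 2 runners only, re-ranked among themselves
d2 <- finish[finish$division == 2, ]
d2$place_division <- seq_len(nrow(d2))

# Sum of places per team under each scoring
division_score <- tapply(d2$place_division, d2$team, sum)
combined_score <- tapply(d2$place_combined, d2$team, sum)

division_score  # H = 12, S = 9: S wins the division race
combined_score  # H = 12, S = 13: H wins once the D1 runners are counted
```

Team S's third scorer is 6th among Division 2 runners but 10th overall, because four Division 1 runners finished ahead of that runner and behind all of team H.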
Division 1 Girls
dc1 %>% filter(!is.na(place_team)) %>%
filter(division == 1, gender == "f") %>%
group_by(school) %>%
top_n(5, -runtime_seconds) %>%
summarize(team_score = sum(place_division)) %>%
arrange(team_score)
## # A tibble: 17 x 2
## school team_score
## <fct> <int>
## 1 Punahou 26
## 2 Iolani 92
## 3 Hilo 123
## 4 Kamehameha-Kapalama 131
## 5 Radford High School 147
## 6 Kalaheo High School 152
## 7 King Kekaulike 218
## 8 Kealakehe 238
## 9 Campbell High School 254
## 10 Waiakea 281
## 11 Kalani High School 293
## 12 Roosevelt High School 329
## 13 Kea'au 340
## 14 Baldwin 346
## 15 Pearl City High School 365
## 16 Mililani High School 367
## 17 Hawaii Prep 396
Punahou won the Division 1 race by a huge margin, scoring 26 points to second-place Iolani's 92.
Division 2 Girls
dc1 %>% filter(!is.na(place_team)) %>%
filter(division == 2, gender == "f") %>%
group_by(school) %>%
top_n(5, -runtime_seconds) %>%
summarize(team_score = sum(place_division)) %>%
arrange(team_score)
## # A tibble: 4 x 2
## school team_score
## <fct> <int>
## 1 Hawaii Baptist Academy 29
## 2 Seabury Hall 40
## 3 Kamehameha-Hi 85
## 4 Kauai High School 88
HBA was able to hold off Seabury Hall in the Division 2 race, 29 points to 40.
Combined Girls
dc1 %>% filter(!is.na(place_team), gender == "f") %>%
group_by(school) %>%
top_n(5, -runtime_seconds) %>%
summarize(team_score = sum(place_team)) %>%
arrange(team_score)
## # A tibble: 21 x 2
## school team_score
## <fct> <dbl>
## 1 Punahou 43
## 2 Hawaii Baptist Academy 68
## 3 Iolani 131
## 4 Seabury Hall 137
## 5 Hilo 182
## 6 Kamehameha-Kapalama 194
## 7 Radford High School 206
## 8 Kalaheo High School 210
## 9 King Kekaulike 288
## 10 Kealakehe 320
## # … with 11 more rows
In the hypothetical combined race, HBA (Division 2) would have finished 2nd, handily defeating Iolani, the Division 1 runner-up, 68 points to 131.
Results: Division Winners by Time
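I recently noticed that collegiate cross country results show the total team time as well, so I wanted to see what that would look like for the high schoolers. Here, team time is the sum of the running times, in seconds, of each team's five fastest runners.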
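Team score and team time can rank teams differently, because places ignore the size of the gaps between runners. The sketch below uses made-up results with three scorers per team to show one way that happens.

```r
library(dplyr)

# Toy results (made up), using three scorers per team to keep the example short
toy <- data.frame(
  school = c("P", "P", "M", "M", "M", "P"),
  place = 1:6,
  runtime_seconds = c(960, 965, 970, 975, 980, 1100),
  stringsAsFactors = FALSE
)

team_totals <- toy %>%
  group_by(school) %>%
  summarize(team_score = sum(place), team_time = sum(runtime_seconds))

team_totals
```

School P wins on places (9 to 12), but school M has the lower total time (2925 seconds to 3025) because P's third runner finished far behind the rest of the field.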
Division 1 Boys
dc1 %>% filter(!is.na(place_team)) %>%
filter(division == 1, gender == "m") %>%
group_by(school) %>%
top_n(5, -runtime_seconds) %>%
summarize(team_score = sum(place_division), team_time = sum(runtime_seconds)) %>%
arrange(team_time)
## # A tibble: 14 x 3
## school team_score team_time
## <fct> <int> <dbl>
## 1 Maui High 57 5162.
## 2 Punahou 41 5181.
## 3 Kalani High School 129 5461.
## 4 Kamehameha-Kapalama 135 5512.
## 5 Radford High School 164 5528.
## 6 Waiakea 160 5539.
## 7 Moanalua High School 207 5611.
## 8 Pearl City High School 187 5616.
## 9 Kalaheo High School 227 5664
## 10 Mililani High School 228 5682
## 11 Kealakehe 238 5683
## 12 Hilo 286 5835.
## 13 Campbell High School 316 5866.
## 14 Kea'au 378 6001.
Interestingly, Maui High finished second on team score but first on total time, about 19 seconds ahead of Punahou. Radford and Moanalua also each moved up one place when ranked by total time.
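The shifts between the two rankings can also be computed rather than read off the table. The sketch below re-enters the Division 1 boys values from the table above, with `team_time` rounded to whole seconds as printed; `places_gained` is a name made up for this example.

```r
# Division 1 boys: values copied from the table above (team_time rounded as printed)
d1_boys <- data.frame(
  school = c("Maui High", "Punahou", "Kalani High School", "Kamehameha-Kapalama",
             "Radford High School", "Waiakea", "Moanalua High School",
             "Pearl City High School", "Kalaheo High School", "Mililani High School",
             "Kealakehe", "Hilo", "Campbell High School", "Kea'au"),
  team_score = c(57, 41, 129, 135, 164, 160, 207, 187, 227, 228, 238, 286, 316, 378),
  team_time = c(5162, 5181, 5461, 5512, 5528, 5539, 5611, 5616, 5664, 5682, 5683,
                5835, 5866, 6001),
  stringsAsFactors = FALSE
)

d1_boys$rank_score <- rank(d1_boys$team_score)
d1_boys$rank_time <- rank(d1_boys$team_time)
# Positive values mean the team places higher on total time than on team score
d1_boys$places_gained <- d1_boys$rank_score - d1_boys$rank_time

# Show only the teams whose position changes
d1_boys[d1_boys$places_gained != 0,
        c("school", "rank_score", "rank_time", "places_gained")]
```

Only six teams change position, and each moves by a single place: Maui High, Radford and Moanalua gain one, while Punahou, Waiakea and Pearl City lose one.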
Division 2 Boys
dc1 %>% filter(!is.na(place_team)) %>%
filter(division == 2, gender == "m") %>%
group_by(school) %>%
top_n(5, -runtime_seconds) %>%
summarize(team_score = sum(place_division), team_time = sum(runtime_seconds)) %>%
arrange(team_time)
## # A tibble: 7 x 3
## school team_score team_time
## <fct> <int> <dbl>
## 1 Seabury Hall 47 5357.
## 2 Hawaii Baptist Academy 51 5375.
## 3 Island School 86 5538.
## 4 Hanalani 120 5592.
## 5 Hawaii Prep 119 5604
## 6 Kamehameha-Hi 112 5622.
## 7 Kauai High School 118 5652.
In the Division 2 race, Hanalani and Hawaii Prep moved up when ranked by total time, from 7th and 6th on team score to 4th and 5th.
Combined Boys
dc1 %>% filter(!is.na(place_team)) %>%
filter(gender == "m") %>%
group_by(school) %>%
top_n(5, -runtime_seconds) %>%
summarize(team_score = sum(place_team), team_time = sum(runtime_seconds)) %>%
arrange(team_time)
## # A tibble: 21 x 3
## school team_score team_time
## <fct> <dbl> <dbl>
## 1 Maui High 85 5162.
## 2 Punahou 56 5181.
## 3 Seabury Hall 141 5357.
## 4 Hawaii Baptist Academy 138 5375.
## 5 Kalani High School 197 5461.
## 6 Kamehameha-Kapalama 210 5512.
## 7 Radford High School 248 5528.
## 8 Island School 251 5538.
## 9 Waiakea 244 5539.
## 10 Hanalani 330 5592.
## # … with 11 more rows
Unlike in the place-based combined scoring, the Division 2 winner (Seabury Hall) stays ahead of the Division 2 runner-up (HBA) on total time. Two other Division 2 teams also made the top 10: Island School (8th) and Hanalani (10th).
Division 1 Girls
dc1 %>% filter(!is.na(place_team)) %>%
filter(division == 1, gender == "f") %>%
group_by(school) %>%
top_n(5, -runtime_seconds) %>%
summarize(team_score = sum(place_division), team_time = sum(runtime_seconds)) %>%
arrange(team_time)
## # A tibble: 17 x 3
## school team_score team_time
## <fct> <int> <dbl>
## 1 Punahou 26 6112.
## 2 Iolani 92 6370.
## 3 Hilo 123 6559.
## 4 Kalaheo High School 152 6584.
## 5 Radford High School 147 6591.
## 6 Kamehameha-Kapalama 131 6619.
## 7 King Kekaulike 218 6774.
## 8 Kealakehe 238 6896.
## 9 Campbell High School 254 6902.
## 10 Waiakea 281 6951.
## 11 Kalani High School 293 6970.
## 12 Roosevelt High School 329 7058.
## 13 Baldwin 346 7100.
## 14 Hawaii Prep 396 7203.
## 15 Kea'au 340 7242.
## 16 Pearl City High School 365 7291.
## 17 Mililani High School 367 7309.
Near the top, the standings on total time differ little from those on team score, apart from Kamehameha-Kapalama falling from 4th to 6th. Further down, Hawaii Prep moves up from 17th to 14th.
Division 2 Girls
dc1 %>% filter(!is.na(place_team)) %>%
filter(division == 2, gender == "f") %>%
group_by(school) %>%
top_n(5, -runtime_seconds) %>%
summarize(team_score = sum(place_division), team_time = sum(runtime_seconds)) %>%
arrange(team_time)
## # A tibble: 4 x 3
## school team_score team_time
## <fct> <int> <dbl>
## 1 Hawaii Baptist Academy 29 6179.
## 2 Seabury Hall 40 6389.
## 3 Kamehameha-Hi 85 6931.
## 4 Kauai High School 88 6987.
The Division 2 standings on total time were exactly the same as those on team score.
Combined Girls
dc1 %>% filter(!is.na(place_team)) %>%
filter(gender == "f") %>%
group_by(school) %>%
top_n(5, -runtime_seconds) %>%
summarize(team_score = sum(place_team), team_time = sum(runtime_seconds)) %>%
arrange(team_time)
## # A tibble: 21 x 3
## school team_score team_time
## <fct> <dbl> <dbl>
## 1 Punahou 43 6112.
## 2 Hawaii Baptist Academy 68 6179.
## 3 Iolani 131 6370.
## 4 Seabury Hall 137 6389.
## 5 Hilo 182 6559.
## 6 Kalaheo High School 210 6584.
## 7 Radford High School 206 6591.
## 8 Kamehameha-Kapalama 194 6619.
## 9 King Kekaulike 288 6774.
## 10 Kealakehe 320 6896.
## # … with 11 more rows
No real surprises here either, other than Kamehameha-Kapalama moving down (6th to 8th) and Kalaheo moving up (8th to 6th) on total time.