Cross Country Results

The Hawaii high school cross country championship was held this weekend. At first, only the live time and place results were displayed, with no team scores calculated. That made me want to work out the team scores from the live results myself. Doing so also let me explore some hypotheticals, such as what would happen if the division 1 and division 2 teams were combined in one race.

Data Munging

I downloaded the data from https://hhsaa.org/sports/cross_country/tournament/2019.

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(lubridate)
## 
## Attaching package: 'lubridate'
## The following object is masked from 'package:base':
## 
##     date
# as_tibble() replaces tbl_df(), which is deprecated in current dplyr releases
dc1 <- as_tibble(read.csv("../datasets/xcountry2019.csv", colClasses = "character"))
dc1$place_overall <- as.numeric(dc1$place_overall)
dc1$place_team <- as.numeric(dc1$place_team)
dc1$division <- as.factor(dc1$division)
dc1$school <- as.factor(dc1$school)
dc1$gender <- as.factor(dc1$gender)

The running times come in mm:ss.s format (for example, "16:15.6"), so I had to convert them into seconds that I could sort and sum.

dc1$runtime <- ms(dc1$run_time)
dc1$runtime_seconds <- period_to_seconds(dc1$runtime)
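To make the conversion concrete, here is a minimal base R sketch of the same arithmetic: minutes times 60 plus seconds. The helper name `mmss_to_seconds` is my own invention for illustration and is not part of lubridate; the three input strings are the first three `run_time` values in the data.

```r
# Convert "mm:ss.s" strings to seconds using base R only.
# mmss_to_seconds is a hypothetical helper, not a lubridate function.
mmss_to_seconds <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(
    parts,
    function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]),
    numeric(1)
  )
}

times <- c("16:15.6", "16:19.5", "16:34.0")
secs <- mmss_to_seconds(times)
print(secs)
```

This assumes every time is under an hour and always has exactly one colon, which holds for the values shown here.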

Here’s the data format.

glimpse(dc1)
## Observations: 394
## Variables: 11
## $ run_time        <chr> "16:15.6", "16:19.5", "16:34.0", "16:46.9", "16:…
## $ place_overall   <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1…
## $ place_team      <dbl> 1, 2, 3, 4, 5, NA, 6, NA, 7, 8, 9, 10, 11, 12, 1…
## $ bib             <chr> "1", "21", "3", "2", "149", "47", "33", "137", "…
## $ division        <fct> 1, 2, 1, 1, 1, 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, …
## $ name            <chr> "Hunter Shields", "Adam Harder", "Damon Wakefiel…
## $ school          <fct> Maui High, Hanalani, Maui High, Maui High, Moana…
## $ pace            <chr> "05:15", "05:16", "05:20", "05:25", "05:26", "05…
## $ gender          <fct> m, m, m, m, m, m, m, m, m, m, m, m, m, m, m, m, …
## $ runtime         <Period> 16M 15.6S, 16M 19.5S, 16M 34S, 16M 46.9S, 16M…
## $ runtime_seconds <dbl> 975.6, 979.5, 994.0, 1006.9, 1011.8, 1014.2, 101…

Place within each division (e.g., among division 1 boys) was not provided in the original table, so I had to calculate it.

# Keep only team-scoring runners, then rank them within each division and gender
dc1 <- dc1 %>% filter(!is.na(place_team)) %>%
  group_by(division, gender) %>%
  arrange(runtime_seconds) %>%
  mutate(place_division = dense_rank(runtime_seconds)) %>%
  ungroup()
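As a small illustration of ranking within a group, here is a base R sketch on made-up data. The runners, times, and the helper `dense_rank_base` are all hypothetical. Like dplyr's `dense_rank()`, the helper gives tied runners the same place and leaves no gap afterwards, whereas `min_rank()` would skip a place after a tie. I have not checked whether the real data contains any ties.

```r
# Hypothetical finishers; times (seconds) are invented for illustration.
runners <- data.frame(
  name     = c("a", "b", "c", "d", "e", "f"),
  division = c(1, 2, 1, 1, 2, 2),
  time     = c(975, 980, 994, 994, 1001, 1010)
)

# Dense rank: position of each value among the sorted unique values.
dense_rank_base <- function(x) match(x, sort(unique(x)))

# ave() applies the ranking separately within each division.
runners$place_division <- ave(runners$time, runners$division,
                              FUN = dense_rank_base)
print(runners)
```

Runners c and d tie in division 1 and both get place 2 within that division.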

Results: Division Winners by Team Score

I was able to replicate the team scores from the official website.
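The scoring rule that the pipelines in this section rely on is simple: add up the places of each school's five fastest runners, and the lowest total wins. Here is a minimal base R sketch of that rule on invented data. The team names, places, and the `score_team` helper are hypothetical.

```r
# Hypothetical results: six finishers per team, places invented for illustration.
results <- data.frame(
  school = rep(c("Team A", "Team B"), each = 6),
  place  = c(1, 4, 5, 8, 9, 12,   2, 3, 6, 7, 10, 11)
)

# Sum the n_scorers best (lowest) places for one team.
score_team <- function(places, n_scorers = 5) {
  sum(head(sort(places), n_scorers))
}

scores <- tapply(results$place, results$school, score_team)
print(sort(scores))
```

Each team's sixth runner (places 12 and 11) is left out of the total, just as `top_n(5, -runtime_seconds)` drops everyone after a school's fifth finisher.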

Division 1 Boys

dc1 %>% filter(!is.na(place_team)) %>%
  filter(division == 1, gender == "m") %>%
  group_by(school) %>%
  top_n(5, -runtime_seconds) %>%
  summarize(team_score = sum(place_division)) %>%
  arrange(team_score)
## # A tibble: 14 x 2
##    school                 team_score
##    <fct>                       <int>
##  1 Punahou                        41
##  2 Maui High                      57
##  3 Kalani High School            129
##  4 Kamehameha-Kapalama           135
##  5 Waiakea                       160
##  6 Radford High School           164
##  7 Pearl City High School        187
##  8 Moanalua High School          207
##  9 Kalaheo High School           227
## 10 Mililani High School          228
## 11 Kealakehe                     238
## 12 Hilo                          286
## 13 Campbell High School          316
## 14 Kea'au                        378

Division 2 Boys

dc1 %>% filter(!is.na(place_team)) %>%
  filter(division == 2, gender == "m") %>%
  group_by(school) %>%
  top_n(5, -runtime_seconds) %>%
  summarize(team_score = sum(place_division)) %>%
  arrange(team_score)
## # A tibble: 7 x 2
##   school                 team_score
##   <fct>                       <int>
## 1 Seabury Hall                   47
## 2 Hawaii Baptist Academy         51
## 3 Island School                  86
## 4 Kamehameha-Hi                 112
## 5 Kauai High School             118
## 6 Hawaii Prep                   119
## 7 Hanalani                      120

Combined Boys

As a hypothetical, I calculated the team scores with the two divisions combined. This uses place_team, which ranks the scoring runners across both divisions.

dc1 %>% filter(!is.na(place_team)) %>% 
  filter(gender == "m") %>%
  group_by(school) %>%
  top_n(5, -runtime_seconds) %>%
  summarize(team_score = sum(place_team)) %>%
  arrange(team_score)
## # A tibble: 21 x 2
##    school                 team_score
##    <fct>                       <dbl>
##  1 Punahou                        56
##  2 Maui High                      85
##  3 Hawaii Baptist Academy        138
##  4 Seabury Hall                  141
##  5 Kalani High School            197
##  6 Kamehameha-Kapalama           210
##  7 Waiakea                       244
##  8 Radford High School           248
##  9 Island School                 251
## 10 Pearl City High School        293
## # … with 11 more rows

The top division 2 team would have finished 3rd in the combined race. Interestingly, that team (HBA) did not win the division 2 race; Seabury Hall did. Going from the division 2 race to the combined race, Seabury Hall's score grew by 94 points (47 to 141) while HBA's grew by only 87 (51 to 138). There must have been a bunch of division 1 runners finishing between the last scoring runners for HBA and Seabury Hall.
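To show how merging two fields can flip the order of two teams, here is a toy base R sketch. Everything in it is invented: teams A and B stand in for two division 2 schools, team X for a division 1 school, and every runner counts toward the score (the real races score only the top five).

```r
# Hypothetical field: teams A and B are division 2, team X is division 1.
# All times (seconds) are invented for illustration.
field <- data.frame(
  team     = c("A", "A", "B", "B", "B", "X", "X", "X", "X", "A"),
  division = c(2, 2, 2, 2, 2, 1, 1, 1, 1, 2),
  time     = c(100, 101, 102, 103, 104, 105, 106, 107, 108, 109)
)

# Score = sum of each team's finishing places within the given field.
team_scores <- function(df) {
  place <- rank(df$time)
  tapply(place, df$team, sum)
}

div2_scores     <- team_scores(field[field$division == 2, ])
combined_scores <- team_scores(field)
print(div2_scores)
print(combined_scores)
```

In the division 2 race, A beats B 9 to 12. In the combined field, the four X runners all finish ahead of A's last runner, pushing that runner from 6th to 10th, so B now beats A 12 to 13.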

Division 1 Girls

dc1 %>% filter(!is.na(place_team)) %>%
  filter(division == 1, gender == "f") %>%
  group_by(school) %>%
  top_n(5, -runtime_seconds) %>%
  summarize(team_score = sum(place_division)) %>%
  arrange(team_score)
## # A tibble: 17 x 2
##    school                 team_score
##    <fct>                       <int>
##  1 Punahou                        26
##  2 Iolani                         92
##  3 Hilo                          123
##  4 Kamehameha-Kapalama           131
##  5 Radford High School           147
##  6 Kalaheo High School           152
##  7 King Kekaulike                218
##  8 Kealakehe                     238
##  9 Campbell High School          254
## 10 Waiakea                       281
## 11 Kalani High School            293
## 12 Roosevelt High School         329
## 13 Kea'au                        340
## 14 Baldwin                       346
## 15 Pearl City High School        365
## 16 Mililani High School          367
## 17 Hawaii Prep                   396

Punahou won the division 1 race by a huge margin: 26 points to second-place Iolani's 92.

Division 2 Girls

dc1 %>% filter(!is.na(place_team)) %>%
  filter(division == 2, gender == "f") %>%
  group_by(school) %>%
  top_n(5, -runtime_seconds) %>%
  summarize(team_score = sum(place_division)) %>%
  arrange(team_score)
## # A tibble: 4 x 2
##   school                 team_score
##   <fct>                       <int>
## 1 Hawaii Baptist Academy         29
## 2 Seabury Hall                   40
## 3 Kamehameha-Hi                  85
## 4 Kauai High School              88

HBA held off Seabury Hall in the division 2 race.

Combined Girls

dc1 %>% filter(!is.na(place_team), gender == "f") %>%
  group_by(school) %>%
  top_n(5, -runtime_seconds) %>%
  summarize(team_score = sum(place_team)) %>%
  arrange(team_score)
## # A tibble: 21 x 2
##    school                 team_score
##    <fct>                       <dbl>
##  1 Punahou                        43
##  2 Hawaii Baptist Academy         68
##  3 Iolani                        131
##  4 Seabury Hall                  137
##  5 Hilo                          182
##  6 Kamehameha-Kapalama           194
##  7 Radford High School           206
##  8 Kalaheo High School           210
##  9 King Kekaulike                288
## 10 Kealakehe                     320
## # … with 11 more rows

In the hypothetical combined race, HBA (division 2) would have finished 2nd, handily beating Iolani, the division 1 runner-up, 68 points to 131.

Results: Division Winners by Time

I recently noticed that collegiate cross country scores show the total team time as well. I wanted to see what that would look like for the high schoolers.

Division 1 Boys

dc1 %>% filter(!is.na(place_team)) %>%
  filter(division == 1, gender == "m") %>%
  group_by(school) %>%
  top_n(5, -runtime_seconds) %>%
  summarize(team_score = sum(place_division), team_time = sum(runtime_seconds)) %>%
  arrange(team_time)
## # A tibble: 14 x 3
##    school                 team_score team_time
##    <fct>                       <int>     <dbl>
##  1 Maui High                      57     5162.
##  2 Punahou                        41     5181.
##  3 Kalani High School            129     5461.
##  4 Kamehameha-Kapalama           135     5512.
##  5 Radford High School           164     5528.
##  6 Waiakea                       160     5539.
##  7 Moanalua High School          207     5611.
##  8 Pearl City High School        187     5616.
##  9 Kalaheo High School           227     5664 
## 10 Mililani High School          228     5682 
## 11 Kealakehe                     238     5683 
## 12 Hilo                          286     5835.
## 13 Campbell High School          316     5866.
## 14 Kea'au                        378     6001.

Interestingly, Maui High was second on place score but actually first on total time! Radford and Moanalua also moved up in the rankings based on total time.
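A team can lose on places and still win on total time when its fastest runner finishes far ahead of everyone else. Here is a toy base R sketch with invented numbers. Teams M and P are hypothetical and all three runners per team count.

```r
# Hypothetical two-team race; times (seconds) are invented for illustration.
race <- data.frame(
  team = c("M", "M", "M", "P", "P", "P"),
  time = c(90, 120, 121, 110, 111, 112)
)

race$place <- rank(race$time)

team_score <- tapply(race$place, race$team, sum)  # lower is better
team_time  <- tapply(race$time, race$team, sum)   # lower is better

print(data.frame(
  team  = names(team_score),
  score = as.vector(team_score),
  time  = as.vector(team_time)
))
```

Team P wins on places (9 to 12) because its pack finishes ahead of M's second and third runners. Team M wins on time (331 to 333) because its lead runner's 20-second gap counts in full toward the total but is worth only one place.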

Division 2 Boys

dc1 %>% filter(!is.na(place_team)) %>%
  filter(division == 2, gender == "m") %>%
  group_by(school) %>%
  top_n(5, -runtime_seconds) %>%
  summarize(team_score = sum(place_division), team_time = sum(runtime_seconds)) %>%
  arrange(team_time)
## # A tibble: 7 x 3
##   school                 team_score team_time
##   <fct>                       <int>     <dbl>
## 1 Seabury Hall                   47     5357.
## 2 Hawaii Baptist Academy         51     5375.
## 3 Island School                  86     5538.
## 4 Hanalani                      120     5592.
## 5 Hawaii Prep                   119     5604 
## 6 Kamehameha-Hi                 112     5622.
## 7 Kauai High School             118     5652.

The division 2 race also showed some schools moving up based on total time, including Hanalani (7th to 4th) and Hawaii Prep (6th to 5th).

Combined Boys

dc1 %>% filter(!is.na(place_team)) %>% 
  filter(gender == "m") %>%
  group_by(school) %>%
  top_n(5, -runtime_seconds) %>%
  summarize(team_score = sum(place_team), team_time = sum(runtime_seconds)) %>%
  arrange(team_time)
## # A tibble: 21 x 3
##    school                 team_score team_time
##    <fct>                       <dbl>     <dbl>
##  1 Maui High                      85     5162.
##  2 Punahou                        56     5181.
##  3 Seabury Hall                  141     5357.
##  4 Hawaii Baptist Academy        138     5375.
##  5 Kalani High School            197     5461.
##  6 Kamehameha-Kapalama           210     5512.
##  7 Radford High School           248     5528.
##  8 Island School                 251     5538.
##  9 Waiakea                       244     5539.
## 10 Hanalani                      330     5592.
## # … with 11 more rows

Unlike in the place-based scoring, the division 2 winner (Seabury Hall) stays ahead of the division 2 runner-up (HBA). Two other division 2 teams also made the top 10 on time: Island School (8th) and Hanalani (10th).

Division 1 Girls

dc1 %>% filter(!is.na(place_team)) %>%
  filter(division == 1, gender == "f") %>%
  group_by(school) %>%
  top_n(5, -runtime_seconds) %>%
  summarize(team_score = sum(place_division), team_time = sum(runtime_seconds)) %>%
  arrange(team_time)
## # A tibble: 17 x 3
##    school                 team_score team_time
##    <fct>                       <int>     <dbl>
##  1 Punahou                        26     6112.
##  2 Iolani                         92     6370.
##  3 Hilo                          123     6559.
##  4 Kalaheo High School           152     6584.
##  5 Radford High School           147     6591.
##  6 Kamehameha-Kapalama           131     6619.
##  7 King Kekaulike                218     6774.
##  8 Kealakehe                     238     6896.
##  9 Campbell High School          254     6902.
## 10 Waiakea                       281     6951.
## 11 Kalani High School            293     6970.
## 12 Roosevelt High School         329     7058.
## 13 Baldwin                       346     7100.
## 14 Hawaii Prep                   396     7203.
## 15 Kea'au                        340     7242.
## 16 Pearl City High School        365     7291.
## 17 Mililani High School          367     7309.

There wasn’t much difference in the standings based on time, other than Kamehameha-Kapalama falling a couple of places, from 4th to 6th.

Division 2 Girls

dc1 %>% filter(!is.na(place_team)) %>%
  filter(division == 2, gender == "f") %>%
  group_by(school) %>%
  top_n(5, -runtime_seconds) %>%
  summarize(team_score = sum(place_division), team_time = sum(runtime_seconds)) %>%
  arrange(team_time)
## # A tibble: 4 x 3
##   school                 team_score team_time
##   <fct>                       <int>     <dbl>
## 1 Hawaii Baptist Academy         29     6179.
## 2 Seabury Hall                   40     6389.
## 3 Kamehameha-Hi                  85     6931.
## 4 Kauai High School              88     6987.

The division 2 standings by total time were exactly the same as the standings by team score.

Combined Girls

dc1 %>% filter(!is.na(place_team)) %>% 
  filter(gender == "f") %>%
  group_by(school) %>%
  top_n(5, -runtime_seconds) %>%
  summarize(team_score = sum(place_team), team_time = sum(runtime_seconds)) %>%
  arrange(team_time)
## # A tibble: 21 x 3
##    school                 team_score team_time
##    <fct>                       <dbl>     <dbl>
##  1 Punahou                        43     6112.
##  2 Hawaii Baptist Academy         68     6179.
##  3 Iolani                        131     6370.
##  4 Seabury Hall                  137     6389.
##  5 Hilo                          182     6559.
##  6 Kalaheo High School           210     6584.
##  7 Radford High School           206     6591.
##  8 Kamehameha-Kapalama           194     6619.
##  9 King Kekaulike                288     6774.
## 10 Kealakehe                     320     6896.
## # … with 11 more rows

No real surprises here either, other than Kamehameha-Kapalama moving down and Kalaheo moving up on total time.