Association between Years at a Private School and Academic Achievement

The kids brought home their yearbooks this past week, and it was time to settle a question that I have had for a long time. Do kids who enter their private school at kindergarten do better or worse than the kids who enter later?

Methods

At this school there are two honor rolls (Headmaster’s List and Honor Roll). To get Headmaster’s List, a student must achieve a 3.5 grade point average with no grade below a B-. To get Honor Roll a student must get a 3.0 grade point average with no grade below a C-.

In the senior section of the yearbook, each student lists how many years they have attended the school, which if any of those honor rolls they achieved, and what other activities they did. So a student’s bio might say: “13 years, Headmaster’s List, Honor Roll, Orchestra 5, JV Basketball”. Another student might say: “4 years, Honor Roll, Speech and Debate, Orchestra.” Yet another might say: “11 years, French Latin Honor Society.”

Given the two honor rolls, a student’s achievement might fall into one of 4 categories, from highest to lowest achieving:

  1. Headmaster’s List Only
  2. Headmaster’s List and Honor Roll
  3. Honor Roll Only
  4. None

I coded each student into one of these 4 categories. I also recorded the gender and the number of years they had attended the school (those entering kindergarten having 13).

Results

Here’s a brief look at the data.

library(tidyverse)
## Loading tidyverse: ggplot2
## Loading tidyverse: tibble
## Loading tidyverse: tidyr
## Loading tidyverse: readr
## Loading tidyverse: purrr
## Loading tidyverse: dplyr
## Conflicts with tidy packages ----------------------------------------------
## filter(): dplyr, stats
## lag():    dplyr, stats
x <- read.csv("../datasets/grad_data.csv")
x <- tbl_df(x)

There were 229 students represented. There were 124 females and 105 males.

Here was the distribution of students by number of years attended. There were 57 students who attended the school from kindergarten through graduation (13 years).

table(years = x$years)
## years
##  2  3  4  5  6  7  8 11 12 13 
##  7  5 42 20 50 42  4  1  1 57

This was the distribution of the lists across the senior class. HR stands for Honor Roll and HL stands for Headmaster’s List.

x$lists_f <- factor(x$lists)
levels(x$lists_f) <- c("No Lists", "HR", 
                       "HR+HL", "HL")
table(Achievement = x$lists_f)
## Achievement
## No Lists       HR    HR+HL       HL 
##       44       47       42       91

Years of Attendance and Achievement

This was the distribution of student achievement by whether the student had started in kindergarten or later.

tab1 <- table(achievement = x$lists_f, started_in_K = x$sdoi)
tab1
##            started_in_K
## achievement FALSE TRUE
##    No Lists    35    9
##    HR          37   10
##    HR+HL       28   14
##    HL          67   24

To help with comparison, I also calculated the percentages for each category of achievement by whether students had started in kindergarten. It was not clear from looking at this table whether there were significant differences between the two groups of students.

round(prop.table(tab1, 2), 3)
##            started_in_K
## achievement FALSE  TRUE
##    No Lists 0.210 0.158
##    HR       0.222 0.175
##    HR+HL    0.168 0.246
##    HL       0.401 0.421

To determine if these was a significant difference, I performed a Mann Whitney U test (aka Wilcoxon rank sum test). The null hypothesis was that there was no difference in the distribution of the two groups of students. The resulting p-value was 0.398. I was not able to reject the null hypothesis. There did not appear to be a significant difference in the achievement between the two groups.

wilcox.test(x$lists~x$sdoi)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  x$lists by x$sdoi
## W = 4418.5, p-value = 0.3981
## alternative hypothesis: true location shift is not equal to 0

Gender and Achievement

Out of curiousity I also looked at gender effects. There were more girls than boys in the class, and girls appeared to outperform boys (p < 0.00001).

table(x$lists_f, x$gender)
##           
##             f  m
##   No Lists 14 30
##   HR       17 30
##   HR+HL    25 17
##   HL       64 27
round(prop.table(table(x$lists_f, x$gender), 2), 3)
##           
##                f     m
##   No Lists 0.117 0.288
##   HR       0.142 0.288
##   HR+HL    0.208 0.163
##   HL       0.533 0.260
wilcox.test(x$lists~x$gender)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  x$lists by x$gender
## W = 8479.5, p-value = 1.215e-06
## alternative hypothesis: true location shift is not equal to 0

Discussion

There was no statistically significant difference in achievement between students who started at this school in kindergarten vs. after kindergarten. There was a statistically significant difference in achievement between boys and girls.

There are some assumptions here (like maybe a student was terrible except for one quarter when they turned it on and got Headmaster’s List). I suspect that this is a very low number and unlikely to significantly change the results.

What about people who left the school? There were probably people who entered in kindergarten or later whose families moved away, couldn’t afford the tuition, or decided to switch schools for some reason. These students could potentially bias the result but it is difficult to predict in what way.

What conclusion should a parent draw from this? One conclusion might be that we know that kids entering in kindergarten were at least as good as those who enter later so the kindergarten selection methods appear to be as good as the standardized testing that later students have to do.

It’s possible that students who enter later are a mix of lower performing star atheletes and higher performing star students and so the overall mix is not changed. This would suggest that the standardized selection tests are better at predicting success than kindergarten selection methods, at least for those star students.

I think most parents would be interested to see what is the effect of the school on their student. That is, does paying the tuition confer some benefit vs. going to public school? The achievement of students attending 13 years of this school does not appear to be different from students who entered after kindergarten. This might suggest that if you think your student is smart enough to get in, there is no benefit to attending this school starting at kindergarten vs starting later.

However we don’t know the counterfactual here. What if the situation was reversed and the 13 year students started in 7th grade and vice versa? Would the original 13 year students even get in? Would the 7th grade entrants be even higher achieving?

Conclusion

Because we don’t have the whole picture, we can’t determine whether this school is better for students than another school. Nevertheless, we can still see that when it comes to the end result, the 13 year students as a group fare no better or worse than the students who come in later. Girls did better than boys.