Over the last 12 years, I have studied and evaluated early childhood programs. I have seen many people work under the assumption that early childhood education can be a key to closing performance gaps. I don’t think there is any question that early childhood programs, from efforts like the Dolly Parton Imagination Library to the Cecil Picard LA4 program, can be used to advance the later school performance of children in need. Still, as a data-driven researcher, I understand the need to see empirical evidence.
Why is it difficult to identify the effects of early childhood education?
Null Hypothesis: The scientific community sets specific standards for investigation. A key element is to remove researcher bias by turning the tables on the process. The null hypothesis (at least implied) in serious research actually asks the researcher to disprove what they believe. The tests are designed to fail in the absence of clear evidence. Thus, a lack of support is not necessarily equivalent to a lack of effect. It could mean that the researcher (or the methodology) could not find sufficient evidence to support the effect.
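The point that a lack of support is not a lack of effect can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a model of any real preK study: outcomes are drawn from a normal distribution, the true effect is fixed at a hypothetical 0.2 standard deviations, and a simple large-sample z-test stands in for whatever test a real evaluation would use. The function name and sample sizes are illustrative only.

```python
import random
from statistics import NormalDist, mean, stdev


def detection_rate(true_effect, n_per_group, trials=500, alpha=0.05, seed=0):
    """Share of simulated studies that reject the null of 'no effect'.

    Every simulated study has a real effect built in, so each failure
    to reject is a case of 'no support' despite a true effect.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided cutoff
    rejections = 0
    for _ in range(trials):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
        # Standard error of the difference in means (large-sample z-test).
        se = (stdev(control) ** 2 / n_per_group
              + stdev(treated) ** 2 / n_per_group) ** 0.5
        z = (mean(treated) - mean(control)) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


# Same hypothetical true effect, two different (hypothetical) sample sizes.
underpowered = detection_rate(true_effect=0.2, n_per_group=50)
well_powered = detection_rate(true_effect=0.2, n_per_group=1000)
print(f"small study detects the effect in {underpowered:.0%} of runs")
print(f"large study detects the effect in {well_powered:.0%} of runs")
```

In this sketch the effect is real in every run, yet the small study misses it most of the time. A null result from a design like that says more about the study than about the program.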
Program Differences: Education is generally underfunded and preK is no exception. Despite advances in standards, there are implementation differences within state-funded preK and even greater differences in nonpublic preK. A preK program can be little more than much-needed babysitting or a scientifically designed curriculum with the highest quality teachers. Louisiana’s developing system of evaluating preK programs is an example of a serious effort to create a measure of program quality. Soon, evaluations will assess more than the mythical presence or absence of preschool. It will be possible to look at the differential effect of preschool quality.
Demographic Differences: Those who are offered and accept public preschool tend to be demographically different from those who do not. This is not a reflection on the quality of the preschool programs. States (especially Louisiana) have made a sincere effort to provide high quality, data-driven preschool. I have visited preschools and am convinced it is not a school issue. Still, the children from affluent families have more options. The children from these families who do not go to public preschool do not simply stay at home. There are private preschools, retired relatives and stay-at-home parents to fill in the gaps. Quality preschools for children in need are there because the children come from a different starting point.
Longitudinal Effects: Many evaluation programs for preK attempt to demonstrate a long-term return on investment. Does the preK program affect performance later, or does the effect fade after the first year? Longitudinal studies can be valuable but present their own set of confounding variables. The challenges that justify the need for preK do not fade away after the program. Students who most need the preK programs are more likely to attend schools that face similar funding and quality challenges. Without factoring in the quality of later education, it is difficult to isolate a preK effect.
Creating the counterfactual: Providing high quality preK to some children while denying it to others is practically and ethically difficult. Lipsey et al. (2018, https://doi.org/10.1016/j.ecresq.2018.0) made a serious attempt to establish a control group in a study of the Tennessee Prekindergarten program. The natural result of a popular program was that it could not handle the demand. Education leaders created a lottery system to randomly choose children lucky enough to enter the program. The non-served children became the comparison group. Researchers were able to show an effect of the program in kindergarten but not later in the third grade. Thus, many use the study to question the long-lasting effect of early childhood education.
Unfortunately, there are issues with the method.
• There is an inherent assumption that children who did not enter the Tennessee preschool program did not have the benefits of preschool. The fact that the Tennessee program was oversubscribed is itself evidence of demand. It is likely that many of the children who did not enter the program found other options. In the two cohorts, 13% of the students who were given the opportunity to enter the program did not attend.
• Of course, it was not possible to administer a pretest for children considered for the program, especially unserved children. The research team did compare the study group to the state on several key demographic measures, and random assignment assumes that the distribution of subgroups was equivalent in the study and comparison groups. However, the groups were uneven in geographic distribution and program type. In addition, the widely accepted intent-to-treat protocol resulted in 21% of the analytic sample crossing over to the other condition. Given the assumption that students who did not enter the Tennessee Prekindergarten program received no preschool at all, the crossover effect is problematic.
• Attrition was a major problem with the third grade analytic sample. Obtaining parental consent to study, combined with finding the preK students four years later, resulted in a third grade analytic sample of around 24% of the original study group. The research team provided evidence that the entire analytic sample for third grade could be weighted to account for differences between it and the kindergarten/statewide demographics. However, the team did not provide evidence that the treatment and control conditions were still similar. With such high attrition, random assignment is no longer an acceptable assumption for baseline equivalence.
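The concern about no-shows and crossover in the points above can be made concrete with simple arithmetic. The sketch below rests on simplifying assumptions that are mine, not the study's: preschool has the same effect on every child who attends any comparable program, and being offered a seat matters only through attendance. Under those assumptions, the expected intent-to-treat difference is the effect of attending multiplied by the gap in attendance rates between the two groups. The function name is hypothetical, the 0.87 comes from the 13% of offered students who did not attend, and the 0.30 rate for unserved children is an invented number for illustration only.

```python
def diluted_itt(effect_of_attending, attend_rate_offered, attend_rate_not_offered):
    """Expected intent-to-treat difference when assignment is not followed.

    Assumes a constant effect of attending and that the offer affects
    outcomes only through attendance (illustrative assumptions).
    """
    if not (0.0 <= attend_rate_not_offered <= attend_rate_offered <= 1.0):
        raise ValueError("rates must satisfy 0 <= not_offered <= offered <= 1")
    return effect_of_attending * (attend_rate_offered - attend_rate_not_offered)


# Effect of attending is set to 1.0 so results read as "share of the true effect".
full_compliance = diluted_itt(1.0, 1.0, 0.0)

# 13% of offered children did not attend (figure from the text above).
no_shows_only = diluted_itt(1.0, 0.87, 0.0)

# Hypothetical: 30% of unserved children found a comparable preschool elsewhere.
with_alternatives = diluted_itt(1.0, 0.87, 0.30)

for label, value in [
    ("everyone follows assignment", full_compliance),
    ("no-shows only", no_shows_only),
    ("no-shows plus alternative preschool", with_alternatives),
]:
    print(f"{label}: comparison recovers {value:.0%} of the effect of attending")
```

The more children in the comparison group who find preschool elsewhere, the smaller the measured difference becomes, even if the benefit of attending is unchanged.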
The Lipsey study is a serious attempt to track the long-term effects of high quality preschool. The research team did a great job creating a study group with a real-world random assignment. However, the project is a study in the difficulty attached to real-world measures for an issue as complicated as long-term effects. Lipsey’s work should be considered, but in the context of many other sources of evidence.
There is no such thing as a perfect study. All can be criticized in one way or another. It is best to look at the evidence across studies and realize the limitations of each. I will talk about other studies soon.