1 Introduction

Spatial information has always played a central role in public health practice, especially at the local scale. Recent developments, including a spatial turn in academic public health, have resulted in renewed interest in the effects that neighborhoods have on population health and an increasingly important role for Spatial Demography (Diez Roux, 2001, Krieger, Chen, Waterman, Rehkopf, & Subramanian, 2003). We describe the development of New York City’s (NYC) system to monitor childhood obesity at fine spatial and demographic scales and to use the system to evaluate public policy. The data consists of administrative records describing longitudinal growth trajectories for NYC public school children. Detailed home and school information allow for analysis at fine spatial resolution allowing for the characterization of NYC neighborhoods and the investigation of community-level effects. The longitudinal trajectories can be matched with additional data sources describing school-based interventions and potentially obesogenic neighborhood-level built environment measures. We argue that quantile growth curve models can be used to evaluate whether and to what extent interventions alter the growth curve trajectories of children while controlling for demographic and neighborhood effects. We demonstrate the potential of the models and their interpretation using a neighborhood poverty indicator, an individual-level poverty indicator, and a real obesity reduction intervention (installation of water jets).

2 Background and Data

2.1 Childhood Obesity in New York City

In the United States, childhood obesity is a major public health concern and increasing trends since the 1980s have been documented (Ogden, Carroll, Kit, & Flegal, 2014, Ogden, Flegal, Carroll, & Johnson, 2002). Recently there has been much discussion about the current situation with some suggesting childhood obesity may be intractable (Skinner, Perrin, & Skelton, 2016), others noting improvements (RWJF, 2012, Dietz, 2016), and still others re-focusing attention on severe obesity trends (Skinner & Skelton, 2014).

In NYC, child obesity has been a focus of public health for over a decade(Thorpe, List, Marx, May, Helgerson, & Frieden, 2004). Since 2006, basic anthropometric measures have been collected annually in public schools as part of the NYC FITNESSGRAM assessment. These measures are linked to enrollment, attendance, physical fitness and academic outcomes to form a longitudinal dataset that records individual growth trajectories while providing detailed geographic and demographic information. Using this data, obesity patterns and trends are monitored at various scales (Konty, Day, Napier, Irvin, Thompson, & MD’Agostino, 2022, Berger, Konty, Day, Silver, Nonas, Kerker, Greene, Farley, & Harr, 2011; Day, Konty, Leventer-Roberts, Nonas, & Harris, 2014) and links to academic performance have been established (Egger, Konty, Bartley, Benson, Bellino, & Kerker, 2009).

The NYC public school system is the largest and most diverse system in the US with approximately 1.1 million children each year. The student population is characterized by substantial socio-economic heterogeneity and residential segregation by income. Over 40% of NYC public school students speak a language other than English at home, many of whom live in various ethnic enclaves. Administrative school records that include student residential address can be used to describe the demography of households in NYC at arbitrarily fine geographic scale. Additionally, the data goes beyond simply locating residence. It captures home address, school address, identifies siblings, classmates and their addresses. It also records movement by families in NYC, both residential mobility and change in schools.

2.2 Child Obesity Data

The NYC FITNESSGRAM includes approximately 900,000 unique measurements of height and weight per year in kindergarten through 12th grade. For cross-sectional reporting, non-response is currently adjusted for through a post-stratification procedure that gives weights to those measured based on demographic and geographic characteristics. These weights are then used to produce tables at various demographic and geographic scales. Repeated cross sections are compared to monitor for population change (Konty et al. 2022). Comparison of these changes to policy is common (such as difference-in-difference methods). Nonetheless, these methods do not take advantage of the longitudinal nature of the data. The data provides a rich longitudinal resource with a large share of students followed having three or more measurements in consecutive years.

Data of this form are becoming increasingly common. Multiple jurisdictions including Philadelphia (Robbins, Mallya, Polansky, & Schwarz, 2015), California (Babey, Wolstein, Diamant, Bloom, & Goldstein, 2011, ** & Jones-Smith, 2015), and Texas (Welk, Meredith, Ihmels, & Seeger, 2010) currently collect and report on child obesity using school-based collection of body composition data. Additionally, the Institute of Medicine has recommended school-based collection of physical fitness measures including body composition (Pate et al., 2012, Kohl III et al., 2013) and the Presidential Youth Fitness Program has endorsed FITNESSGRAM as a physical fitness assessment to be used in conjunction with an expanded physical education curriculum (Presidential Youth Fitness Program, 2014, Welk & Meredith, 2010). Increasingly this data can be linked to academic and other student outcomes as local governments and school districts adopt integrated data systems.

2.3 Local Public Policy

A number of local initiatives address childhood obesity including school- and community-level interventions; many targeted based on perceived or established patterns of obesity. School-based interventions include changes to food served in schools, changes to the food environment, regulations on vending machines, and targeted programs and interventions addressing diet and exercise. Other policy efforts address food access in communities including establishing green carts and farmers markets, restrictive licensing, increasing access to fresh produce in existing stores and incentivizing the opening of new grocery stores in neglected areas. Finally, efforts to increase community walkability and safety, to enable active transportation (e.g. adding bicycle lanes), or that increase recreation opportunities by adding playgrounds, parks or other centers also serve to address obesity.

Collecting data describing targeted policies is a key challenge to linking demographic data, health outcomes and policy. Often multiple interventions are in effect simultaneously and at various geographic scales. Our preferred approach would record the timing of (all) school-based interventions while establishing a longitudinal dataset describing changes to the built environment resulting from community-based policy. There is currently no monitoring system of interventions or the built environment operating in NYC. Such systems would enable causal interpretations of policy evaluations by quantifying changes in the environment for comparison to changes in outcomes.

Within this data and policy context, extensive efforts are underway to evaluate school- and community-based obesity interventions. Examples from NYC include evaluations of the Breakfast-in-Classroom program (Corcoran, Elbel, & Schwartz, 2014), the installation of water jets in school cafeterias(Schwartz, Leardo, Aneja, & Elbel, 2016), and the format of the information provided to parents on their children’s fitness assessments (Almond, Lee, & Schwartz, 2016). In California, researchers have looked at Native-American obesity in response to the establishment or expansion of casinos on reservations (Jones-Smith, Dow, & Chichlowska, 2014). These evaluations each utilized the data form we address here and each took distinct analytic approaches. We propose a general approach that can be applied to a variety of evaluation questions.

After introducing the quantile regression framework in the next section, we demonstrate its use in evaluating the impact of poverty and in assessing NYC’s water jet intervention. While poverty level is not an intervention it does provide an important baseline measure for effect sizes. Poverty is a known correlate of childhood obesity (Bennett, Wolin, & Duncan, 2008, Schroeder, Day, Konty, Dumenci, & Lipman, 2020). As noted above, student’s in NYC schools come from a wide range of household incomes and there is a high degree of residential income segregation. This is reflected in the neighborhood distribution of poverty prevalence (see Fig. 1b). For practical purposes, we use NYC School “District” as our neighborhood definition. There are 32 school “districts” in NYC which were defined in 1969 to partition the school system into administrative units corresponding with historic communities. As neighborhood definitions, School Districts are coarse (for comparison, there are 59 Community Districts in New York City and 189 Neighborhood Tabulation Areas) and, when first constructed, size criteria required adjacent smaller neighborhoods to be grouped (Bowser & Devadutt, 2019, p. 57). Nonetheless, they are designed to represent communities and are the primary geographic scale for the analysis of data collected within schools. Our area-based poverty measure classifies each district along an ordinal scale (low, medium, high, and extreme) (Toprani, Li, & Hadler, 2016). Given the composite of factors picked up in an area-based measure of poverty (lack of access to quality foods, other adverse neighborhood effects, etc.), we expect that children living in higher poverty areas will have higher rates of obesity; reflected in a rightward shift in the location of the 90th centile. We can also measure individual level poverty exposure using data from the reduced price/free-lunch program. Specifically each child record indicates whether they pay full price for lunch, a reduced price for lunch, or have free lunch. Here again, we expect higher poverty (free lunch status) to be associated with higher rates of obesity (a rightward shift in the 90th centile). Both poverty measures are treated as non-varying covariates across years. For the assessment of poverty (area and individual), we obtained subsamples of the NYC fitnessgram data. Specifically,we randomly extract two sets of records of approximately 100,000 female students each. One data set is used for model fitting and the other is used for validation.

Fig. 1
figure 1

Visual association between obesity and poverty

Our third assessment of the modeling framework uses data from NYC’s Water Jet Initiative. Starting in 2008, NYC installed 236 water jets in cafeterias serving 483 schools with the expectation that access to chilled, filtered water would increase child uptake of water relative to other higher calorie beverages. The initiative was funded by a CDC Communities Putting Prevention to Work grant and was explicitly designed and funded as an obesity intervention (CDC, 2012). We measure exposure to the water jet access based on the date of installation and the date of the child’s height and weight measurement. Restricting to the subset of children with between 3 months and 2 years of exposure yields a population of 4,744; which we considered exposed to treatment. We then construct two data sets for analysis. The first includes the 4,744 exposed children and a random sample of children not exposed to water jets equal to three times the size of the treatment group (3x4,744=14,232). We note that water jets were not installed across all districts and that child obesity has a strong spatial pattern. Our second data set combines the 4,744 treatment group with a non-treatment group that is three times that size and is randomly drawn in proportion to the districts represented in the treatment group. Our expectation is that children exposed to water jets will have lower rates of obesity (a leftward shift in the 90th centile) relative to the non-treatment group.

3 Methods

The framing of current childhood obesity reporting and analysis relies on standard centile reference charts from the US Centers for Disease Control (CDC, 2000, 2010) for height, weight, and BMI. The growth standards represent historic U.S. population age distributions for a given measure (height, weight, or BMI). Current CDC growth charts are estimated using the LMS model (Cole & Green, 1992); each age-group distribution is converted to normal using a Box-Cox transformation and the estimated location, scale, and skewness parameters are constrained to vary smoothly across age groups. For any raw measure, the age-specific LMS parameters are used to convert the measure to a z-score that relates to the standard reference centile chart. The notion of obesity or severe obesity among children by demographic subgroup or local district is made in reference to the share of the population that is above age-specific z-scores, or equivalently centiles, of the reference distribution.

Policy analysis to evaluate the effectiveness of city-wide, district, or school-based interventions to reduce obesity have relied on this same framing in relation to standard growth charts. A policy effect is conceived of as, \(Pr(z>z^*|\text {no treatment})-Pr(z>z^*|\text {treatment})=\delta\), where z is the z-score in the study population and \(z^*\) is a threshold from the reference growth chart, \(\delta\) is the policy effect size, and the probability model may include additional controls depending on the research design. It is recognized that the z-scores produced by current LMS methods perform poorly in the tails of the distribution (Flegal, Wei, Ogden, Freedman, Johnson, & Curtin, 2009) and can display erratic behavior across age (Koenker, 2018). This limitation occurs within the subgroup most of interest– the obese and severely obese– and alternate thresholds have been suggested (Gulati, Kaplan, & Daniels, 2012) to address it. Further, a substantial number of children are observed at these extreme values; over 5 percent of NYC public school children are severely obese (Day et al., 2014).

Another key feature of current reporting and analysis that relies on standard growth charts is that all measurement is cross-sectional. As noted by Wei (2004) and Wei et al. (2006), it is possible to develop an alternative framing based on longitudinal analysis when repeated measures are available for each individual. Indeed, comparing individual growth paths to reference growth charts based on cross-sectional data could be inappropriate (see examples, (Wei, 2004)). This alternative framing relies on quantile regression longitudinal growth curve models. Defining BMI for individual i as \(Y_i\) and age as \(t_i\), a baseline model of smooth quantile growth curves is:

$$\begin{aligned} Q_{Y_i|t_i,x_i}[\tau | t_i,x_i] = g_{\tau }(t_i) + x_i^T\gamma _\tau + e_i. \end{aligned}$$
(1)

Estimation of the smooth term \(g_\tau\) results in an empirical growth curve such that approximately \(\tau\), proportion of the observations lie below the fitted curve. Details of estimation are available in Koenker and Bassett Jr (1978), Koenker et al. (1994), Wei (2004). The model can also include covariates that act to shift the growth curves. In this case we define \(x_i\) as an individual-level covariate indicating policy treatment or no policy treatment. The policy effect in the quantile model is, \(Q_{Y_i|t_i}[\tau | t_i,x_i=\text {treatment}]-Q_{Y_i|t_i}[\tau | t_i,x_i=\text {no treatment}]=\gamma _\tau\); that is, the difference between the \(\tau\)th growth curve percentile under treatment or no treatment. Note that the policy effects will vary depending on the growth curve percentile selected such that the impact of a policy could reflect a shift in the location and shape of the growth curve distributions under the policy.

Wei (2004) refers to model (1) as the unconditional model. The model can be extended to condition on one or more prior measures of BMI when longitudinal data is available. Following (Wei, 2004), a model for irregular follow-up measurement times and including one prior measure of BMI is,

$$\begin{aligned} \begin{aligned} Q_{Y_i(t_{i,j})}(\tau |t_{i,j},.,X_{i,j}) = g_\tau (t_{i,j})&+ [\beta _{1,\tau } + \beta _{2,\tau }(t_{i,j}-t_{i,j-1})]Y_i(t_{i,j-1}) + X_{i,j}^T\gamma _\tau + e_{i,j}. \end{aligned} \end{aligned}$$
(2)

and with two prior measures is,

$$\begin{aligned} \begin{aligned} Q_{Y_i(t_{i,j})}(\tau |t_{i,j},.,X_{i,j}) = g_\tau (t_{i,j})&+ [\beta _{1,\tau } + \beta _{2,\tau }(t_{i,j}-t_{i,j-1})]Y_i(t_{i,j-1}) \\&+ [\beta _{3,\tau } + \beta _{4,\tau }(t_{i,j}-t_{i,j-2})]Y_i(t_{i,j-2})+ X_{i,j}^T\gamma _\tau + e_{i,j}. \end{aligned} \end{aligned}$$
(3)

The autoregressive terms are approximated by a linear function of the elapsed time between current and past measures. The smooth term remains and predictions again yield empirical growth curves for the \(\tau ^{th}\) centile. Notice that the policy effect is now also indexed by j to indicate that the covariate references a particular period and could also be specified as a lagged term. As noted by (see pages 32 and 61, (Wei, 2004)), the interpretation of autoregressive estimate \({\hat{\beta }}(\tau )\) is complicated under heteroskedastic error distribution. If \((t_{i,j}-t_{i,j-1})\) and \((t_{i,j}-t_{i,j-2})\) are held fixed, the resulting estimates do reflect the scale of the autoregressive effects.

For each model it will be necessary to interpret the magnitude, sign, and significance of parameter estimates and overall model fit. Hypothesis testing of parameter estimates is based on standard errors and p-values produced using the R package quantreg (Koenker, 2015). To assess model fit and aid in model selection we compare AIC across models specification and for different values of \(\tau\).Footnote 1 We also use visual fit diagnostics suggested by Wei (2004), that are based on comparing the share of the population in a subgroup below a predicted quantile to the nominal quantile; that is, \(\tau -{\hat{\tau }}\) where \({\hat{\tau }}=\sum _{i=1}^n n^{-1}I[Y_i(t) \le {\hat{Q}}_{Y(t)}(\tau )]\). For example, for children aged 12.50 to 12.75 the share of children below the predicted 0.97 quantile for model (1) is 0.9717 and for model (3) is 0.9659, giving fit measures respectively of -0.0017 and 0.0041. Positive valued fit indicates the model predicted a threshold such that fewer individuals than expected were below the predicted quantile, and more than expected were above it. Another interpretation, and supposing \(\tau =0.97\) is a marker for severe obesity, is that a positive fit measure suggests that the subpopulation has a larger share of extremely obese children than predicted by the model. We produce graphical measure of fit by age groups from 7 to 18 in steps of quarter year. Also, noting that these fit measures can be constructed for any definition of subgroup, we extend Wei’s approach to measure fit by school district. The spatial assessment of fit is graphically displayed as a map and we also use the information to assess spatial autocorrelation (a global Moran’s I)Footnote 2. Presence of spatial autocorrelation is suggestive of possible neighborhood spillover effects that are not captured in our model specifications.

While Wei introduced this framework almost 20 years ago (Wei, 2004, Wei, Pere, Koenker, & He, 2006), it has not been adopted for policy evaluation of the type we propose here. The data used in her applications is from the early 1960s with height and weight measures for roughly 2,500 Finnish children. They specifically model height, not BMI, which is better behaved as it is monotonically increasing. We believe we are the first to adopt the framework for use in evaluation of policy interventions. We also extend the fit diagnostics for age (Wei, 2004) to the spatial domain.

4 Results

We proposed three assessments of the modeling framework: an area-poverty covariate, an individual-level poverty covariate, and an intervention indicator based on exposure to water jets. We review the results for each of these three analyses below.

As noted already, socio-economic status (SES) and childhood obesity are correlated and both have a strong spatial signal (see Fig. 1a and b). Several districts in east Brooklyn bordering Queens and in the south Bronx, are characterized by high prevalence of both child obesity and extreme poverty. The connection between the two is well documented in the literature (Singh, Siahpush, & Kogan, 2010, Singh, Siahpush, Hiatt, & Timsina, 2011, ** & Jones-Smith, 2015).

Given the established strong connection in the literature, the model results for area- and individual-poverty covariates provide a useful benchmark for effect sizes and significance. Using the calibration data of 100,000 female student records, we fit the unconditional model (1) without a covariate and the AR(1) and AR(2) models with and without each of our poverty covariates. The area poverty covariate includes four levels – low, medium, high, or extreme poverty – with low being the reference category; the individual poverty covariate includes three levels – full price, reduced price, and free – with full price being the reference category. Model selection is based on AIC (see Table 1) and the visual fit diagnostics in the domains of age and school district (see Figs. 2 and 3). The AR(2) model either without poverty covariate (\(\tau =0.1\)), with area-based poverty (\(\tau =\{0.03,0.5,0.75,0.9,0.97\}\)), or with individual-poverty (\(\tau =0.25)\) have the lowest AIC and thus the best overall fit for specified quantile.

Fig. 2
figure 2

Fit and validation by age group

Fig. 3
figure 3

Fit by school district

Table 1 AIC for alternative model specifications

The visual diagnostics concur with AIC and reinforce the selection of the AR(2) model. Fig. 2 indicates the model fit across age groups for the unconditional, AR(2) model with area poverty covariate, and AR(2) with individual poverty covariate. All three fit almost perfectly in the age domain to both the training and validation data sets. There is no indication of overfitting to the training data in the age domain.

The same model specifications are evaluated in the spatial domain in Fig. 3 focusing on the 0.5 and 0.9 centiles to assess differences in fit comparing the center and right tail of the distribution. For the unconditional model (Fig. 3a), the same areas of east Brooklyn and south Bronx have positive residuals, indicating a larger share of obese children in those districts than predicted by the model using only age structure. The spatial dependence in the unconditional model can be quantified using a Moran’s I measure of spatial autocorrelation (see Table 2); indeed there is significant positive spatial autocorrelation present in the pattern. The spatial fit is improved by including prior BMI in either the AR(1) or AR(2) specifications (Table 2). However, note that even after conditioning on having two prior BMI measures, children in those districts are becoming more obese than elsewhere in the city. That is, there are areas in the city where a child’s BMI increases greater than a child from another area with same past BMI trajectory and equal age.After conditioning on prior BMI in the AR(2) models with a poverty covariate (see Fig. 3b and c), the spatial fit is improved over the baseline AR(1) or AR(2) models. In the AR(2) model with area poverty covariate the spatial pattern is absent in the 0.9 centile. This means that the estimate for the poverty covariate has captured the spatial variation in childhood obesity. In the AR(2) model with individual poverty the spatial pattern persists but it is much reduced as measured by the Moran’s I (Table 2).

Table 2 Moran’s I and significance. Significance is based on permutation tests of 1000 iterations.

Detailed parameter estimates for the AR(2) models with poverty covariates are provided in Table 3 and the composite autoregressive effects, assuming 1 year duration to the one-step lagged BMI and a 2 year duration to two-step lagged BMI, are provided in Table 4. For the area poverty covariate, there is no measurable impact in the lower tail (\(\tau =0.1\)), the impact is present in the center of the distribution (\(\tau =0.5\)), and has the most impact in the upper tail of the distribution (\(\tau =0.9\)). The largest effect size is for children living in extreme poverty districts, with a rightward shift of 0.345 in the 90th centile, after controlling for two prior BMI measurements and age. This equates to a roughly 1.5% to 2% increase in the 90th centile BMI depending on the child’s age. Given the literature base on poverty and obesity, we can regard this as a large effect size.Footnote 3. Also notice there is strong dependence on prior BMI measurements (Table 4); at the 90th centile, roughly 3/4 of the value of the one-step lag and 1/3 of the value of the two-step lag is propogated into the current BMI measure. The effect of individual-level poverty is more muted but still in the expected direction. Notably, comparing students receiving free lunch (high poverty) versus full price lunch (low poverty), the former group has a 90th centile 0.208 higher than the latter group. The lag effects are almost identical to the area-poverty model. While we don’t display full results for the full set of centiles we estimate (\(\tau =\{0.3,0.1,0.25,0.5,0.75,0.9,0.97\}\) due to space constraints, the basic pattern for poverty is as characterized in Table 3; comparing extreme poverty to low poverty districts, the whole BMI distribution shifts higher and in increasing amounts the farther out the centile is in the right tail. Essentially, the distribution shifts rightward and is also more right skewed under extreme poverty.

Table 3 AR(2) model specifications with area-level poverty covariate (1) and individual-level poverty proxy (2). Quantile results displayed are for \(\tau =\{0.1,0.5,0.9\}\)
Table 4 Estimated autoregressive effects given \(t_{i,j}-t_{i,j-1}=1\) and \(t_{i,j}-t_{i,j-2}=2\) for model specifications (1) and (2) in Table 2

The final assessment of the modeling framework compares students exposed to water jet access (treatment group) against those without access to water jets (control group). We do not present an AIC table but simply note that the AR(2) models are again preferred. Parameter estimates, standard errors, and significance indicators are provided in Table 5. The first specification (3) is based on data for the 4,744 student treatment group and a randomly selected 14,232 student control group. Results indicate a roughly symmetric leftward shift in the BMI distribution for the students in the treatment group relative to the control group. We know that district is important because of income segregation, and not all districts received water jets. The second (4) and third (5) specifications in Table 5 use a control group that is proportionately matched to the distribution of districts represented in the case group. This stratification results in an increased estimated leftward shift in the 90th centile. We can also include statistical controls for individual level poverty, and the water jet impact then increases further; a statistically significant -0.234 shift in the location of the 90th centile for children given access to water jets versus those without access.

Table 5 AR(2) model specifications for water jets interventions. Specification (3) uses data for water jet exposed students with a random sample of non-exposed students. Specification (4) uses water jet exposed students with a random sample that a matched proportionally by district. Specification (5) uses the same data as (4) but includes statistical adjustment for individual level poverty. Quantile results displayed are for \(\tau =\{0.1,0.5,0.9\}\).

5 Discussion

It is already well established that for a variety of reasons, health conditions covary with SES and the link between childhood obesity and poverty is just a single instance of that broader pattern (Barr, 2014; Schroeder et al., 2020). Our point here is to demonstrate the utility of quantile growth curve modeling of longitudinal records to characterize this relationship. The framework presented here can be used to identify social determinants, measures and monitor disparities in child obesity, and describe the relationship between BMI and neighborhood-level characteristics. Such descriptive epidemiology provides critical input to public health practice informing decision making including the targeting of interventions at the individual and community scale. We have also demonstrated how the framework can be used to evaluate these interventions longitudinally.

We have quantified the relationship of poverty to BMI growth curves at both the individual and neighborhood level. There is a clear gradient of BMI increasing with poverty especially at higher BMI values. Identifying and monitoring health disparities is a central function of public health and the highly detailed characterization presented can directly contribute to these efforts. By quantifying the magnitude of poverty/BMI relationship we also establish a scale to interpret effect sizes for public health interventions.

The evaluation of water jets presented here provides strong evidence in support of their impact on childhood obesity. The evaluation builds on our intial models by incorporating an intervention impact in addition to either an AR(1) or AR(2) growth trajectory. As such, the impact of water jets is measured as a shift in the potential BMI trajectory. These evaluations were performed over two specifications with similar results. The measured impacts at higher BMI percentiles were in the −0.15 to −0.25 BMI points range. While these impacts may appear modest, they have magnitudes approaching the observed impact of extreme poverty. Further, a shift in BMI of -0.15 to -0.25 points, the estimated impact of water jets, would correspond to a 2.6 to 4.8% decline in obesity, or 3,500 to 6,500 children in New York City.

We have also presented a mechanism to characterize the spatial pattern of obesity across neighborhoods. As presented here, the spatial patterns are established as a type of area-aggregated model residual derived by comparing the share of the sample population below a predicted quantile to the nominal quantile. This could be further enhanced by directly incorporating neighborhood-level characteristics or interventions as shown with neighborhood poverty. It would also be possible to use interesting cross-over research designs that would use the information on a child’s change in residence to treat neighborhood features as a kind of natural experiment (Schroeder et al., 2020).

The key point in this paper is that we have demonstrated that it is possible to develop models of child BMI trajectories that incorporate longitudinal records and a policy control with individual covatiates without resorting to an external centile reference chart or the myriad assumptions inherent in converting raw measures to the domain of reference charts. It is also important to note that because the quantile growth models can be used to characterize differences in the overall shape and location of distributions under policy treatment or no policy treatment, we can also translate our results back into the domain of CDC growth charts. That is, we can use the models to indicate how CDC defined severe obesity increases or decreases.

The efficacy of childhood obesity policy should be evaluated as the impact on individuals’ BMI trajectories. Quantile regression of longitudinal growth curves can describe full BMI distributions while incorporating demographic and policy-specific information. Further, the impacts of obesity-specific policies are of varying importance across the distribution of BMIs. This is a key feature of quantile regression (Koenker, 2018) that has begun to receive attention at the research design or policy formation stage (Wang, Zhou, Song, & Sherwood, 2018). As long as policy can be applied to individual children, either by their school or by the residence (community), the impact can be quantified. Likewise, changes in the built environment (community or school) can be incorporated into these models to assess their impact on individual growth trajectories.

The statistical models developed in this paper rely on the availability of highly detailed longitudinal administrative data. Such data are becoming increasingly common and the detail in these databases is also expanding with the establishment of integrated data systems. Our framework is particularly relevant when evaluating targeted policies that are typically evaluated in an ad hoc manner. Even rigorous evaluations of such policies routinely ignore the full policy context in which it was implemented. Our proposed approach takes advantage of the demographic, geographic, and temporal detail within school records. In countries without health registries, such as the United States, demographics taken from school records offer incredibly high coverage with high detail and the ability to see individuals change through time. Our analytic approach takes full advantage of longitudinal information potentially enabling causal interpretations of the link between policies and outcomes. By focusing on BMI the approach achieves greater power than using an obesity indicator, while the quantile regression allows us to quantify the impact on the right tail of the distribution where health risk is known to occur.