1 Introduction

India was one of the highest contributors to the world’s stunted children. In 2015, the prevalence of stunting among children under the age of 5 years was 38.4% in India and 23.1% for the world as a whole. The prevalence of stunting in India decreased by merely 3 percentage points to reach 35.5% by 2019–21 (IIPS, 2021). The rural sector was the major contributor to this national decline: rural stunting fell from 41.2% in 2015–16 to 37.3% in 2019–21, whereas the decline in the urban sector over the same period was only 0.9 percentage points (31% in 2015–16 and 30.1% in 2019–21).

Only a few rigorous studies show that milk consumption is associated with lower stunting rates, a lower likelihood of stunting, or higher HAZ scores (de Beer, 2012; Choudhury & Headey, 2018; Headey et al., 2018; Herber et al., 2020). A few studies analyse the association between milk intake and child’s height for African countries, and even fewer do so for South Asian countries. This could be because collecting individual-level dietary intake data for very small children in a nationwide large-scale survey is challenging and prone to reporting errors arising from the lack of knowledge and understanding among developing country respondents.

The National Sample Survey Organisation (NSSO) collects data on the quantity and expenditure of milk and other dairy products at a household level. The National Family Health Survey (NFHS)—which is a part of the worldwide DHS surveys—collects qualitative information on dietary habits for young children aged between 0 and 23 months and also for the mothers of children aged 0–59 months. Anthropometric information is also available for both mothers and under-five children in the NFHS data. Thus, this rich source of qualitative information on diets and quantitative information on nutritional outcomes is used for the empirical analysis in this study.

The aim of the study is threefold. First, we ask whether key covariates differ between children who consumed milk the previous day and those who did not. Second, we ask whether the covariates associated with whether milk was consumed are similar to or different from those associated with the frequency of milk consumed among the children who consumed it. Third, we ask what the nature of the association is between the frequency of milk consumed (including none) and the child’s height-for-age Z-score (HAZ). We find that all three aspects have received limited attention in the Indian context.

The findings of the study are intuitive and along expected lines, but the contribution of this study lies in quantitatively establishing it in a manner that is very different from the earlier studies. The Zero-Inflated Poisson model is used to analyse the factors associated with a large number of ‘no milk given’ reported by the primary caregiver and then the factors associated with the frequency of milk given in the last 24 h. This enables one to understand whether the factors for the two aspects of decision-making are similar or different. Further, by considering the frequency of milk consumed as an endogenous variable in a model that explains the factors associated with HAZ, the association between the higher frequency of milk given and better HAZ scores is significantly evident; once again, a feature not well-established in the Indian context.

Section 2 briefly discusses the relationship between milk consumption and HAZ in very young children and Sect. 3 summarizes the findings of the Indian studies on this aspect. Section 4 discusses the data and the variables of this study followed by Sect. 5 which gives an empirical analysis of the milk consumption pattern and then a brief analysis of its association with HAZ. The last section briefly highlights the policy options to increase the intake of dairy milk.

2 Milk Consumption and Linear Growth Among Young Children

The age group of 6–23 months is ideal for the analysis of milk consumption (given) [Footnote 1], as a child’s dietary requirements increase with age, and complementing optimal breastfeeding with nutritionally adequate diets is recommended by WHO and UNICEF (IYCF, 2021). The Infant and Young Child Feeding training module for mothers, part of the National Health Mission [Footnote 2], mentions that breastfeeding, though relevant, provides only half the nutrition required at 6–12 months of age and only one-third of the requirement at 1–2 years of age. One of the complementary foods recommended for this age group is dairy milk, which is a natural energy drink that provides balanced nutrition in the form of macronutrients (proteins and fats), vitamins (B12), and micronutrients (calcium), and can be fortified with vitamins A and D to address widespread deficiencies in those micronutrients (Dror & Allen, 2011).

Milk is an Animal Source Food (or ASF) and is accepted as part of both vegetarian and non-vegetarian diets in India, though it has less acceptance among the tribal population. ASFs are found to be better than plant-based proteins for treating moderate malnutrition. An important feature of ASFs is their low anti-nutrient content. Anti-nutrients are compounds that reduce the body’s ability to absorb essential nutrients. They are naturally found in animals and many plant-based foods and include phytates, lectins, saponins, lysinoalanine, polyphenols, and tannins. Michaelsen et al. (2009) find that high phytate in foods has a strong negative effect on the bioavailability of important minerals like iron and zinc. Phytates, along with tannins (polyphenolic compounds), have a negative effect on growth, and so low phytate content should be given priority in the diets of children with moderate malnutrition. Since ASFs have little or no anti-nutrients, they recommend a diet with a high content of ASFs for moderately malnourished children.

The focus on 6–23-month-old children is also relevant as the first 1000 days of a child’s life are very important in terms of care and nutrition, and any prolonged setback results in growth faltering. This faltering is essentially associated with linear growth or height and captures chronic undernutrition. One of the reasons for not attaining age-appropriate height is the inadequacy of infant and young child feeding practices. Unlike weight as a measure of nutritional status, catching up on height is usually very slow at a later stage of growth. With a predominantly stunted and anaemic adult population and increasing overweight and obesity in many developing countries including India, the focus on linear growth of children is very important.

As per WHO convention, a child’s growth and general nutritional status can be assessed using height-for-age Z-scores (HAZ), weight-for-age Z-scores (WAZ), weight-for-height Z-scores (WHZ), and body-mass-index-for-age Z-scores (BMIZ). The Z-score system expresses the anthropometric value as a number of standard deviations or Z-scores below or above the reference mean or median value. A child with HAZ score between −1 and −2 is marginally stunted, between −2 and −3 is moderately stunted, and below −3 is severely stunted. Stunting can hurt a child’s physical and cognitive development and in the long run, it can lead to less productivity, lower Intelligence Quotient (IQ), health issues, or even premature death.
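The Z-score convention and the stunting cut-offs described above can be illustrated with a minimal sketch. The function names and the reference median and standard deviation used in the example are hypothetical placeholders; actual WHO reference values vary by age and sex and are not reproduced here. How a score lying exactly on a cut-off is treated is also an assumption of this sketch.

```python
def z_score(value, ref_median, ref_sd):
    """Express a measurement as standard deviations from a reference median.

    ref_median and ref_sd are placeholders for the age- and sex-specific
    reference values; this is the simple form described in the text.
    """
    return (value - ref_median) / ref_sd


def haz_category(haz):
    """Classify a height-for-age Z-score using the cut-offs in the text.

    Boundary handling (strict inequalities) is an assumption of this sketch.
    """
    if haz < -3:
        return "severely stunted"
    if haz < -2:
        return "moderately stunted"
    if haz < -1:
        return "marginally stunted"
    return "not stunted"


# Hypothetical child: 70 cm tall against a made-up reference of 75 cm (SD 2 cm).
example_haz = z_score(70.0, 75.0, 2.0)
example_label = haz_category(example_haz)
```

With these made-up reference values the child falls 2.5 standard deviations below the reference median, which the cut-offs place in the moderately stunted band.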

3 Milk Consumption and Nutritional Status of Indian Children: Evidence So Far

Vanderhout and Corsi (2021), based on NFHS-4 data for 6–59 months, showed that the coefficient for dairy milk consumption was positive and statistically significant for both HAZ and weight for age (WAZ) after controlling for child, mother, household, and regional characteristics. Interestingly, only 41% of the sample was included in the HAZ model. Parasar et al. (2021) also used the same data set and noted that across districts of Odisha, when the proportion of children reporting milk consumed exceeds 30%, the underweight rate declines as the proportion of such children increases. Further, using the unit-level data for Odisha, WAZ was found to be positively associated with milk consumption, and economic status, as well as urban residence, played an important role. This study, however, did not analyse the association of milk consumption with HAZ for this less developed state of India. The study further focused on a primary survey to understand the variations across districts and in particular the tribal/non-tribal difference in milk consumption. An important contribution of this survey was to understand other commercial substitutes for dairy milk, like Amulspray, Cerelac, Horlicks, Lactogen, and biscuits. This analysis is very insightful as it shows not only which commercial products consumers consider to be substitutes for milk for their young children but also how this varies across rural–urban, tribal/non-tribal, and age groups of young children.

None of these studies pays attention to the fact that DHS surveys restrict the dietary questions to only the recent births in the last three years, and this primarily covers 6–23-month-olds. Headey and Palloni (2020), while taking cognisance of this, consider the 24-h recall (of a child’s dietary intake) not to be a good representation of dietary patterns and instead focus on the mother’s dietary habits. However, their analysis focuses mainly on vegetarianism and its association with stunting, separately among 6–23-month-old children and among 24–59-month-old children.

Agrawal et al. (2019) analyse the dietary pattern of children 6–23 months based on the same data and two results stand out: as a mother’s education increases or economic status improves, the proportion of children with no milk consumption declines sharply.

4 Data and Description of Variables

The main database used in this study to analyse the pattern of milk given among very young children is the National Family Health Survey-4 (IIPS, 2017), with the survey conducted between January 2015 and December 2016 in two phases. Milk consumption is recorded in the survey as milk given to the child by the primary caregiver in the last 24 h, and the frequency is recorded as 0, 1, 2, and so on up to seven or more times. The survey design of NFHS is such that milk consumption frequency data are collected for children born in the last 2 years, which means that they are recorded only for 0–23-month-old children. In Fig. 1a, the proportion of children that had milk at least once in the previous day cuts off at about 23 months, and the same holds for the proportion of children reporting intake of staple food like grains, tubers, etc. at least once, as shown in Fig. 1b.

Fig. 1
Two line graphs labeled A and B plot the consumption of milk and grains, roots, or tubers by children in rural and urban areas. In both graphs, the lines for the rural and urban rise till 22 months, decrease sharply till around 24 months, and then slightly increase with slight fluctuations.

Source Author’s own calculation based on NFHS data

Type of diet consumed in the past 24 h across age (in months): proportion of children.

Thus, the analysis of milk consumption patterns of children and their growth is restricted to 6–23-month-old children. We exclude 0–6-month-old children mainly because the WHO recommendation is for exclusive breastfeeding, and any other food given is simply a violation of the norm, and analysis of such a feature is beyond the scope of this study.

As mentioned earlier, a child’s nutritional status is assessed by Height for Age and Weight for Age, measured as Z-scores. Figure 2a shows that height faltering is more severe in rural than in urban areas, with the HAZ scores starting at almost similar values at about 6 months but declining more rapidly in rural areas and settling at a lower value around the age of 24 months. Comparing HAZ (Fig. 2a) with WAZ (Fig. 2b), we observe that faltering is more severe for height than for weight. There is also a persistent rural–urban gap in the 6–23-month age group, which motivates us to focus on height rather than on weight. Further, a large part of the existing literature assesses linear growth in relation to milk intake.

Fig. 2
Two line graphs plot Z-scores for H A Z and W A Z versus the age of a child in months. The H A Z scores in both rural and urban areas initially decline till 20 months of a child's age while the W A Z scores initially decline till 18 months of a child's age and then slightly fluctuate.

Source Author’s own calculation based on NFHS data

Z-scores for height-for-age and weight-for-age across age (in months): rural–urban comparison.

In Table 1, summary statistics are presented for the covariates that are used in the econometric models for analysing the variations in the milk intake pattern (binary response of Yes/No and the frequency) and the association between milk intake pattern and HAZ after controlling for other covariates. Alongside milk, a total of 21 food items that the child could have been given in the previous 24 h are reported as binary responses (given or not). An important variable linked to nutritional intake is breastfeeding, which is reported as a binary variable (whether currently breastfed or not) and is not a 24-h recall like the other food items.

Table 1 Mean and standard deviation of variables for 6–23-month-old children: rural and urban areas (2015–16)

The average HAZ score is below zero for both rural and urban 6–23-month-old children in India in 2015–16, indicating that many children in rural areas are far below the normal range of values, while the standard deviation is nearly the same for both regions. About 68% (55%) of children in rural (urban) areas did not consume milk in the past 24 h, and among those who reported consumption, the modal value is two times in a day. The rural–urban gap in the proportion of children is smaller for the frequencies once and thrice than for other frequencies of milk consumed. The categories for other food items mentioned in Table 1, based on Agrawal et al. (2019), show that rural rates are lower than urban rates, with the gap being largest for any other solid/semi-solid or liquid food and smallest for grains/roots/tubers. These lower rural rates for other diets appear ‘compensated’ by (current) breastfeeding rates of 87% in rural and 79% in urban areas.

In rural areas, the number of female children is marginally lower than the number of male children, while the average age is similar at 14–15 months. Due to larger family size, more children in rural areas have a higher birth order than in urban areas, while the qualitative assessment of birth size is not very different between rural and urban areas. All these child characteristics are known from earlier studies to be associated with HAZ and will be used as control variables in the relevant econometric models discussed later.

The average age of a mother is about 26 years, and secondary-educated mothers have the largest share in both rural and urban areas, but urban women are better educated than rural women. The mother’s height captures her long-term cumulative nutritional status since her birth. This would be influenced by both genetic and environmental factors, including the socio-economic and health conditions in her early childhood. Mothers’ heights are categorized into 4 quartiles, which show that a higher proportion of rural women are in the lower quartiles, with a reverse pattern for urban women [Footnote 3]. This captures an effect of deprivation (rather than genetics) from having grown up in less privileged circumstances for rural women compared to urban women. The short-term nutritional status of mothers is based on BMI, which also shows that more women in rural areas are underweight, while malnourishment among urban women arises not only from being underweight but also from being overweight and obese. These characteristics of mothers are controlled for in the HAZ regression model to account for intergenerational transmission of nutritional status. Mother’s education is included in the milk consumption and HAZ models to account for the mother’s awareness and empowerment, which play a key role in the care and feeding practices of her children.

About 13% (11%) of household heads are women, the average age of household heads is 45 years, and 23% (29%) are from households with fewer than 5 members in rural (urban) areas. Access to basic amenities for a healthy and hygienic living environment is far poorer in rural than in urban areas, whether it is clean cooking fuel that would reduce indoor air pollution, better-quality water, or sanitation (high open defecation or improper disposal of child’s stools). Cattle at home is a strong predictor of milk consumption among rural children, and more so in the absence of milk markets (Hoddinott et al., 2015). In this database, the only information available is on household ownership of bovines, which is 51% (8%) in rural (urban) areas. This binary variable captures whether bovine ownership makes a difference to a child’s milk consumption after controlling for other factors, including the economic status of the household. It is, however, important to acknowledge that bovines may not be the only animal source of milk from self-production, and that some bovines may not be providing any milk and may be used for other purposes.

Expectedly, Hindus are predominant, more so in rural than in urban areas, followed by Muslims (lower share in rural and urban), with equal representation for other religions. The caste distribution in rural areas is skewed towards the less privileged groups, like the scheduled tribe (ST) and scheduled caste (SC), whereas urban areas have a higher share of other castes; OBCs are equally represented. The economic status of the household is assessed based on the wealth index score (first principal component based on multidimensional dichotomous asset-possession variables) computed separately for rural and urban areas, and this is used in the analysis as opposed to the common wealth index based on all-India scores. These scores are then classified into five quintiles, each with 20% of households, and the lowest (topmost) quintile is the relatively poorest (richest). For this sample of 6–23-month-old children, 23% (26%) were from the relatively poorest rural (urban) households and 19% (17%) from the topmost rural (urban) wealth quintile.

Lastly, we consider the district-level average per consumer unit protein from liquid milk intake by households having at least one child aged 0–2 years in the year 2011–12. This information is estimated from the NSSO consumption data after accounting for the age–sex composition of the household (NSSO, 2015). This consumption information thereby proxies for the district-level availability of liquid milk in an earlier period. The NSSO data precede the NFHS survey by a few years, which avoids potential endogeneity, and we do not expect any major changes in milk consumption across rural and urban districts of India within this small gap of four years.

5 Examining the Covariates Associated with Milk Consumption

5.1 Breastfeeding, Other Food Items, and Milk: Rural/Urban Comparison

5.1.1 Comparison Across Age Groups

The proportion of children who were currently breastfeeding was rather high among 6–23-month-olds and declined more slowly with age in rural than in urban areas (Fig. 3a, b). The rate was highest among 6–11-month-old children, and the decline was modest even by the 18–23-month group, in which about 78% of rural children and 68% of urban children were still breastfeeding. Note that, unlike the food items assessed over the 24-h reference period preceding the survey, the breastfeeding question asked whether the child is currently breastfeeding or not and hence refers to a different time duration. It is unclear whether breast milk is a substitute or a complement, as its frequency was not reported. As a result, it would be difficult to conclude whether it was beneficial to the child or not. However, we could examine this empirically: if its coefficient was either statistically insignificant or significant with a negative value in the model for milk consumption or in the HAZ model, then it could be inferred that breastfeeding was, on average, a complementary feeding. We now turn to other food items.

Fig. 3
Two grouped bar graphs plot the proportion of children in 3 age groups who breastfeed or consume dairy, eggs, fish, chicken, and other items in rural and urban areas. The proportion of breastfeeding children is the highest in both rural and urban areas.

Source Author’s own calculation based on NFHS data

Rural/urban comparison of breastfeeding, other food items, and milk consumption in 2015–16 across 6-month age groups.

The highest share of reported food items in the last 24 h was for grains, roots, tubers, etc., followed by liquid milk alone or all dairy products, for the 6–11-month group, while the next highest share was for vegetables and fruits in the 11–23-month age group; all of these items show an increase in reported rates with an increase in age. Clearly, age plays an important role in the reporting of 24-h recall of milk consumption, yet the percentage of children consuming milk in rural and urban areas is rather low compared to the consumption of grains, etc. The ease of feeding breast milk as long as mothers are lactating could play an important role in the decision (on average) not to give dairy milk. Thus, the zero milk consumption reported for many children in the last 24 h would be a random event of missing out on a given day for only some children, while for many it could be a systematic feature arising from lack of availability, limited affordability, or limited awareness.

5.1.2 Comparison Across Mother’s Education

When the proportion of children reporting milk consumption in the last 24 h is classified by the mother’s education level, a higher consumption rate is reported for children of better-educated mothers (Fig. 4a, b). There is a significant jump in the proportion of children reporting milk (and dairy products) consumption among mothers who have higher secondary and above education. The rate of milk consumption increases from 33% (for no education) to 52% in rural areas, while in urban areas it increases from 38 to 55%. Further, compared to rural children, a higher proportion of urban children, irrespective of their mother’s education level, consume milk or dairy products at least once during the reference period; more importantly, the rural–urban gap narrows at the highest education level.

Fig. 4
Two grouped bar graphs compare rural and urban areas in terms of children’s food consumption, categorized by their mother’s education level. In both graphs, breastfeeding emerges as a prominent choice across all educational backgrounds.

Source Author’s own calculation based on NFHS data

Rural/urban comparison of breastfeeding, other food items, and milk consumption in 2015–16 across mother’s education level.

5.1.3 Comparison Across Household’s Economic Status

NFHS combines the possession of several household consumer durables into a wealth index score based on the first principal component. This score is then categorized into five groups with 20% of the households in each ‘wealth’ category by ranking the scores in the increasing order. The bottom 20% scores are relatively the poorest households in terms of these household assets, followed by poor, middle, rich, and richest (top 20%). These relative rankings of the economic status of the households are available separately for rural and urban areas as well as for all of India. Since we observe that the pattern of milk consumption varies significantly between rural and urban areas, we consider the relative economic status separately for rural and urban areas. Figure 5a, b shows that breastfeeding declines as the wealth index score increases and compared to any other food item, milk and dairy products show a significant positive gradient with economic status.
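The grouping of wealth index scores into five equal-sized categories can be sketched as follows. This is only an illustration under simplifying assumptions: the scores are taken as given (the principal component step is not shown), households are unweighted, and ties are broken by list order. The function name and the example scores are hypothetical, and the sketch does not reproduce the NFHS construction itself. To mirror the sector-specific rankings used here, the function would be applied separately to rural and to urban scores.

```python
def assign_quintiles(scores):
    """Rank scores in increasing order and split them into five equal groups.

    Assumes unweighted households and takes the scores as given; ties are
    broken by position in the input list.
    """
    labels = ["poorest", "poor", "middle", "rich", "richest"]
    n = len(scores)
    # Indices of households ordered from lowest to highest score.
    order = sorted(range(n), key=lambda i: scores[i])
    groups = [None] * n
    for rank, idx in enumerate(order):
        # rank * 5 // n maps the bottom 20% of ranks to 0, the next 20% to 1, etc.
        groups[idx] = labels[rank * 5 // n]
    return groups


# Hypothetical wealth index scores for ten rural households.
rural_scores = [-1.2, 0.4, 2.1, -0.3, 0.9, -2.0, 1.5, 0.1, -0.8, 3.0]
rural_groups = assign_quintiles(rural_scores)
```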

Fig. 5
Two grouped bar graphs plot children's consumption of food in rural and urban areas, based on their household's economic status. In both graphs, breastfeeding emerges as a prominent choice across all economic statuses.

Source Author’s own calculation based on NFHS data

Rural/urban comparison of breastfeeding, other food items, and milk consumption in 2015–16 across wealth quintiles.

The proportion of children reporting consumption of grains etc. does not vary across the sub-categories of either mother’s education or household economic status. Consumption of vegetables and fruits increases across all levels of education, while among the wealth status categories this increase is observed only for the topmost wealth quintile. Though the patterns of consumption within the sub-categories of either mother’s education or household’s economic status remain the same, the proportion of children reporting a diverse consumption of food items is higher in urban than in rural areas. This could be attributed to higher availability, higher affordability, or both. More importantly, irrespective of whether the household is in a rural or an urban area, a mother’s education level of higher secondary and above is associated with age-appropriate diets even in a 24-h recall period. For instance, diets like dairy (inclusive of liquid milk), other solid and liquid food, pulses, etc., and vegetables and fruits, alongside calorie-rich food like grains and tubers, are observed for a higher proportion of children with better-educated mothers. Nutrition awareness, as assessed by the mother’s education status, is highly likely to influence the dietary diversity of young children.

5.2 How Big is the Rich–Poor Gap and How Does One Bridge This Gap?

In this section, we estimate a logit regression model for the binary variable of milk consumed or not. The intention is to understand the role of other covariates in reducing the rich–poor gap in milk consumption and hence provide insights on pathways for policy intervention, including behavioural change. The results from the logit model are easy to interpret as odds ratios, where the odds are the ratio of the probability of milk consumption to the probability of no milk consumption [Footnote 4]. If the estimated odds ratio is more (less) than one for a particular covariate, then the covariate is favourable (unfavourable) to the probability of milk consumption as opposed to the probability of no milk consumption. Table 2 reports the estimates of the odds ratios from the logit model for milk consumption, for three models with different covariates, for rural and urban areas.
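The odds and odds ratio used to read the logit results can be sketched as below. The function names are illustrative, not part of any library. The worked example reuses the descriptive rural milk consumption rates by mother’s education quoted in Sect. 5.1.2 (33% and 52%); it is an unadjusted comparison of raw proportions and not an estimate from the logit model in Table 2.

```python
import math


def odds(p):
    """Odds of an event: probability it occurs over probability it does not."""
    return p / (1.0 - p)


def odds_ratio(p_group, p_reference):
    """Odds in a group relative to the odds in a reference group."""
    return odds(p_group) / odds(p_reference)


def odds_ratio_from_logit(beta):
    """A logit coefficient converts to an odds ratio by exponentiation."""
    return math.exp(beta)


# Unadjusted illustration with the rural rates from Sect. 5.1.2:
# 52% (higher secondary and above) versus 33% (no education).
unadjusted_or = odds_ratio(0.52, 0.33)
```

With these two proportions the unadjusted odds ratio is roughly 2.2, that is, above one and hence favourable to milk consumption in the sense described above. A logit coefficient of zero corresponds to an odds ratio of exactly one.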

Table 2 Estimates of odds ratio (logit regression model) for probability of milk consumption: rural/urban children (6–23 months)

As shown in Table 2, the estimates of Model-1 for the rural sector indicate that the odds of children from richer households reporting milk consumption in the last 24 h are 2–3 times those of a child from the poorest households. The odds of milk consumption change marginally when other diets and age are controlled for. However, when we include the mother’s education level and milk availability (captured either by bovine at home or the district-level average per consumer unit protein from liquid milk), the rich–poor gap in milk consumption declines, thereby highlighting the relevance of the mother’s awareness and improved access to milk.

5.3 Frequency of Milk Consumption

5.3.1 Broad Patterns

For milk consumption, the data is also available for the number of times milk is given in the last 24 h with the response recorded as 0, 1, 2, 3, 4, 5, 6, and 7 plus. If we exclude zero milk consumption, then the largest rate is for ‘Twice’ and is the median value (Table 3). The next largest rate is three times followed by once, and then the frequency of four times and above can be combined for the sake of counting all such children.

Table 3 Frequency of milk given in the last 24 h: distribution of children (%)

Regrouping this milk frequency into 0, 1, 2, 3, and 4 plus, it is observed that in rural areas the share of zero milk given declines substantially among those with bovines, and the shift seems to be more towards three times and to a lesser extent towards twice, while in urban areas the change is similar but minimal (Table 4). Owning and managing dairy animals and livestock is largely a rural activity, so the share of zero times milk given in urban areas is hardly affected when the data on milk consumed are tabulated by ownership of cattle. In the absence of markets and with limited affordability, ownership of bovines (as in the NFHS questionnaire) is an important determinant of the frequency of milk given to the child.

Table 4 Frequency of milk consumed and (a) ownership of bovine or (b) currently breastfeeding

The largest change from zero to a higher frequency of milk consumption is when the child is currently not breastfeeding (Table 4). The share of children reported as not given milk in the last 24 h declines from about 60–65% among those currently breastfeeding to about 35–41% among those not currently breastfeeding. Children reported as not currently breastfeeding are a very small percentage, but this once again gives a sense of both the availability of other sources of milk and their affordability. This is compounded by limited awareness, and managing the child’s daily dietary requirement could be an intensive task for many mothers given their daily chores, including caregiving for other members of the household.

5.3.2 Zero-Inflated Poisson Model: Frequency of Milk Consumption Among Rural Children

As rural areas show high stunting rates and lower milk consumption, this section analyses the covariates associated with variations in the frequency of milk consumed after accounting for the large number of zero values in the last 24-h reference period. As we observed in the context of zero versus more than zero frequency of milk consumption in Sect. 5.1.3, aspects other than economic status play an important role in improving milk given to (and hence intake by) these very young children. Most studies in the Indian context have not used this information on milk given in the last 24 h, partly because it is considered not very reliable (Headey & Palloni, 2020) and partly because there are few instances of positive values. However, we find that this need not be the case and analyse the data further.

We explain the variation in the milk pattern using the Zero-Inflated Poisson (ZIP) model, in which the covariates are used first to model the excess zeros and then to model the count of events, that is, the frequency of milk consumption. On the one hand, this model allows some zero values to arise as a natural outcome of a Poisson random variable, alongside the number of times the non-zero event happens; on the other hand, the unusually high number of zero values is analysed within the same framework but treated as arising from a different data-generating process.

The estimated model has two components. The first part analyses the covariates associated with the ‘inflated zeros’ using a logit model for the probability that milk consumed (Y) is zero as opposed to the probability that Y has a positive value, with all positive values assigned one [Footnote 5]. Unlike a selection model, the logit model here predicts the probability of Y = 0 and not Y = 1. The second part of the model analyses the covariates associated with the frequency of milk consumed using a Poisson regression model. The ZIP specification is relevant here because it distinguishes no milk consumed on a random day (due to illness, etc.) from a systematic pattern of not giving milk as a dietary preference of the household or the primary caregiver. The results of the ZIP model are reported in Table 5, with the logit estimates for no milk consumed in Panel (a) and the Poisson estimates for the frequency of milk consumed in the 24 h prior to the survey in Panel (b).
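The two-part structure described above corresponds to the standard textbook form of the zero-inflated Poisson distribution, which can be sketched as follows. The function and parameter names are illustrative. In a regression setting the zero-inflation probability would typically come from a logistic function of covariates and the Poisson mean from an exponential function of covariates; those links are assumptions of this sketch, as is ignoring the top-coding of the survey responses at seven or more times.

```python
import math


def poisson_pmf(k, lam):
    """Probability of observing count k under a Poisson with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def logistic(x):
    """Inverse logit, mapping a linear index to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def zip_pmf(k, pi_zero, lam):
    """Zero-inflated Poisson probability of observing count k.

    pi_zero is the probability of a systematic ('inflated') zero; with
    probability 1 - pi_zero the count is drawn from a Poisson with mean lam,
    which can itself produce a zero on a given day.
    """
    if k == 0:
        return pi_zero + (1.0 - pi_zero) * math.exp(-lam)
    return (1.0 - pi_zero) * poisson_pmf(k, lam)


# Hypothetical values: linear index 0.4 for the zero part, Poisson mean 2.
pi_example = logistic(0.4)
p_zero_zip = zip_pmf(0, pi_example, 2.0)
p_zero_poisson = poisson_pmf(0, 2.0)
```

In this sketch the probability of a zero under the ZIP distribution exceeds that under the plain Poisson with the same mean whenever the zero-inflation probability is positive, which is the sense in which the model accommodates an unusually high number of zeros.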

Table 5 Coefficient estimates for Zero-Inflated Poisson (ZIP) model for frequency of milk consumption

The aim here is to understand, given the past literature (and scientific understanding), which covariates are relevant for the logit model of the probability of zero milk consumption and which are relevant for the Poisson model of the frequency of milk consumption. Two specifications, one with a larger and one with a smaller set of variables in the zero-inflated (logit) part, were compared using AIC/BIC, and the one with the lowest AIC/BIC values was chosen. The main finding is that the model with the larger set of variables in the zero-inflated part reduces the AIC/BIC more than the one with fewer variables. Correspondingly, we observe that a few variables that were insignificant in the Poisson model are highly significant in the zero-inflated part.
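The mechanics of comparing specifications by AIC/BIC can be illustrated with a small sketch. It is deliberately simpler than the comparison in this section: it uses made-up counts (not NFHS data), has no covariates, and contrasts a plain Poisson model with an intercept-only ZIP model fitted by a crude grid search. All function and variable names are hypothetical.

```python
import math

def zip_pmf(y, pi, lam):
    """P(Y = y) for a zero-inflated Poisson.
    pi  : probability of a systematic (inflated) zero
    lam : mean of the Poisson count component
    Setting pi = 0 gives the ordinary Poisson pmf.
    """
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    if y == 0:
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson

def log_likelihood(freq, pi, lam):
    """Log-likelihood for a frequency table {count value: number of children}."""
    return sum(n * math.log(zip_pmf(y, pi, lam)) for y, n in freq.items())

def aic(loglik, n_params):
    return 2 * n_params - 2 * loglik

def bic(loglik, n_params, n_obs):
    return n_params * math.log(n_obs) - 2 * loglik

# Hypothetical toy data: times milk was given in the last 24 h, with many zeros.
freq = {0: 60, 1: 15, 2: 15, 3: 7, 4: 3}
n_obs = sum(freq.values())

# Plain Poisson (pi fixed at 0): the MLE of lam is the sample mean.
lam_pois = sum(y * n for y, n in freq.items()) / n_obs
ll_pois = log_likelihood(freq, 0.0, lam_pois)

# Intercept-only ZIP: crude grid search over (pi, lam) instead of a proper optimiser.
ll_zip, pi_hat, lam_hat = max(
    (log_likelihood(freq, i / 100, j / 20), i / 100, j / 20)
    for i in range(1, 100)
    for j in range(1, 81)
)

aic_pois, bic_pois = aic(ll_pois, 1), bic(ll_pois, 1, n_obs)
aic_zip, bic_zip = aic(ll_zip, 2), bic(ll_zip, 2, n_obs)

print(f"Poisson: AIC={aic_pois:.1f} BIC={bic_pois:.1f}")
print(f"ZIP:     AIC={aic_zip:.1f} BIC={bic_zip:.1f} (pi={pi_hat:.2f}, lam={lam_hat:.2f})")
```

The same logic carries over when covariates are added: each additional parameter in the zero-inflated part is penalised, so a richer specification is preferred only if it raises the log-likelihood by enough to lower AIC/BIC.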

The interpretation of the coefficients in terms of their exact magnitudes is avoided as causal inference is difficult to establish with such data. Instead, the statistical significance, sign and relative magnitudes of the estimated coefficients are used for the interpretation of the results. In the last 24 h, the probability of no milk given was higher for female children than for male children, and a 6–11-month-old child (the reference group) was more likely not to consume dairy milk than children in the next two age groups. The gendered aspect of milk consumption could go either way in terms of benefitting or not benefitting a male child compared to a female child. If an older male child was weaned off mother’s milk at an appropriate age and given dairy milk as a substitute, this would be nutritionally beneficial, while the reverse would be true for a child in the 6–11-month age group.

Currently being breastfed was positively associated with the likelihood of no milk given, while food supplements given in the last 24 h, such as grains and tubers, other solid food, and chicken and meat, were all negatively associated with it, with the relative magnitudes of the coefficients declining in that order. Controlling for other food items, egg being given was positively associated with the likelihood of no milk given; egg was perhaps considered nutritionally equivalent to milk and, like milk, can be easily fed to children in that age group. Similarly, other liquid food including dairy products (primarily yogurt) has a statistically significant and positive coefficient, implying that this is also a substitute for milk.

Mothers with above secondary level schooling were more likely to give milk, and those with below secondary level schooling were more likely to give none; households that owned a bovine were also more likely to give milk. This result for bovine ownership is interesting even though it is not known whether the animal was providing milk and, if so, how much of that milk was used for self-consumption. The economically better-off were more likely to give milk (and the less well-off more likely to give none), and so were children residing in the rural area of a district which had higher per consumer unit protein intake from milk consumption in 2011–12 based on NSSO data. This district average is based on only those households which had at least one child in the 0–2-year age group and, as mentioned earlier, is used as a proxy for the availability of milk at a geographically aggregated district level.

Once the probability of zero milk consumption is accounted for, the Poisson regression estimates in Table 5b show that the frequency of milk consumption is not different for girls and boys. The frequency of milk intake increases more in the older age group of 17–23 months. It must, however, be mentioned that the Poisson regression model imposes a monotonic association between the number of times milk was consumed and each of its covariates, and may not be able to capture non-linearity. For instance, milk given two or three times and supplemented with other foods would be more beneficial. So, the choice of the model could impose an a priori structure that prevents us from distinguishing whether the factors associated with milk given twice or thrice differ from those associated with milk given once or more than thrice. A multinomial logit model with selection may be better if the different milk frequencies were considered as separate nominal categories rather than ordinal ones, after allowing for the selection of zero/non-zero milk given. Though this would allow the observed outcome to be appropriately modelled, the selection model would treat all zeros alike, while in reality some zeros are expected and many could be inflated. Further, in this context, a higher frequency of milk could also be beneficial from a nutritional perspective, given the young age of the child who cannot consume much in one feeding. So small quantities of milk, interspersed with other food items and fed periodically over the 24 h, could be the preferred option provided the mother or other caregivers in the family have the time. On both these accounts, the choice of the Poisson regression model seems appropriate.

Currently being breastfed reduces the frequency of milk given, in addition to its significant association with the likelihood of no milk given. In contrast, the dummy variables for consumption of different food items were all statistically insignificant and hence excluded from this model. The frequency of milk consumption increases with a mother’s secondary and post-secondary education compared to below secondary education. Children in Muslim, scheduled tribe, or scheduled caste households in rural areas were given milk less frequently. After accounting for the observed zero inflation in milk consumption in the past 24 h, the association with milk frequency is positive and statistically significant at 5% (10%) only for the relatively richer (richest) households compared to all other rural households. This is different from the zero-inflated part of the results, where the probability of no milk given declined steadily in relative magnitude with higher economic status.

Thus, the Zero-Inflated Poisson regression model has enabled us to make use of the frequency of milk given to 6–23-month-olds based on a 24-h recall prior to the survey date. It has also enabled us to control for other demographic characteristics and dietary intake behaviour of such children while also understanding the role of accessibility to milk through either markets or home production. This analysis fills a gap, as no quantitative assessment based on a nationally representative survey existed for India, while such studies exist for neighbouring Bangladesh and some African nations. Our findings corroborate that the presence of a bovine at home and current breastfeeding are important factors for milk consumed by very young children. Economic status and mother’s education are other factors associated with the number of times milk is given, while the results for a girl child can at best be interpreted ambiguously. The following section further analyses such ambiguities and examines the relevance of milk intake for linear growth.

5.4 Milk Consumption and HAZ for Rural Children

In most recent studies that analysed the role of milk consumption in a child’s HAZ (or linear growth), the milk consumption variable was considered as an exogenous variable. Given the findings in the previous section, we treat the five categories (frequencies) of milk consumption as endogenous. The endogeneity arises from the fact that such a large number of households surveyed over two years would not be reporting zero milk consumption for 6–23-month-old children in the past 24 h unless a decision was involved in doing so. Herber et al. (2020) find huge variation in their sample of 73 countries covering two decades: the proportion of children who consumed milk in the last 24 h ranges from 20 to 95% across countries. Thus, economic, social, and cultural factors that influence food habits seem to determine the decision to give milk regularly to children in this age group, and this is reflected even when milk consumption is recorded with a short recall period of 24 h.

We estimate a linear regression model for HAZ (the dependent variable) of 6–23-month-olds with the frequency of milk consumed in the previous day as an endogenous variable; the results are reported in Table 6 (Footnote 6). We estimate the model with a slightly different specification in which endogenous milk consumption is specified as an ordered probit model, with no milk given as the reference category and once, twice, thrice, and more than thrice as the four other categories.
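One common way to write a linear outcome equation with an endogenous ordered regressor is sketched below. The text does not give the equations, so the notation ($M_i$, $M_i^{*}$, $\delta_k$, $\kappa_k$, $\rho$, $x_i$, $w_i$) and the joint normality assumption are illustrative assumptions, not a statement of the exact specification estimated for Table 6.

```latex
% M_i = 0,1,2,3,4 : no milk, once, twice, thrice, more than thrice
% x_i : covariates in the HAZ equation; w_i : covariates in the milk equation
\begin{align}
\mathrm{HAZ}_i &= x_i'\beta + \sum_{k=1}^{4} \delta_k \,\mathbf{1}[M_i = k] + \varepsilon_i, \\
M_i^{*} &= w_i'\alpha + u_i, \qquad
M_i = k \ \text{ if } \ \kappa_k < M_i^{*} \le \kappa_{k+1}, \quad k = 0, \dots, 4, \\
(\varepsilon_i, u_i) &\sim \mathcal{N}\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 \end{pmatrix} \right),
\qquad \kappa_0 = -\infty, \ \kappa_5 = +\infty .
\end{align}
```

In this form, treating milk frequency as exogenous amounts to imposing $\rho = 0$, while the endogenous model lets unobserved factors behind the decision to give milk be correlated with unobserved determinants of HAZ. Variables that enter $w_i$ but not $x_i$ play the role of excluded instruments.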

Table 6 Coefficient estimates for HAZ with frequency of milk consumption as exogenous and endogenous

Table 6 reports only a part of the results of the regression model, mainly to focus on the difference in the results when the frequency of milk consumption is exogenous vis-à-vis when it is endogenous. When the frequency of milk consumed is exogenous, it is statistically significant and has a positive association only when milk was consumed once; all higher frequencies are no different from no milk consumed and hence have no association with HAZ. When the frequency of milk consumed is endogenous, not only is each of the milk frequency variables statistically significant but the higher the frequency (compared to the reference category of no milk consumption), the larger is its coefficient magnitude after controlling for the usual covariates based on child, mother, household, and regional characteristics.

The results for the other variables that impinge upon the child’s milk consumption (estimates in columns 4 and 5) are similar to those of the ZIP model in statistical significance and sign for the endogenous model. The estimates for the other variables in this table are not directly comparable since in the exogenous model these variables are associated with HAZ while in the endogenous model these variables are associated with the frequency of milk consumption and are also based on the ordered probit model. Equally important to note is that the household’s ownership of bovine and the district average of protein intake of milk as a proxy for district-level variations in milk availability are both insignificant in the exogenous model, indicating that they are not directly associated with child HAZ. This also justifies the use of these two variables as instruments in the zero-inflated specification of the ZIP model.

Grains and tubers, and other solid/semi-solid food are the two consumption variables that are significant and positively associated with HAZ (Table 6). However, in the milk consumption equation of the endogenous model, several complementary foods are positively associated with higher milk frequency, indicating that additional foods are given to some children along with milk, as may be appropriate for that child on a given day. The relative magnitudes of the different food coefficients decline in that order, indicating a level effect: the higher the share of children reporting a particular food item, the larger is its association with higher milk frequency. As in the ZIP model, if a higher proportion of mothers report that egg was given in the previous day, then it is associated with a lower frequency of milk given. This perhaps reflects the vegetarian versus non-vegetarian preferences of households: both are animal-sourced foods that may have similar nutritional implications, and both have the advantage of being easy to feed to these very young children. One additional advantage of milk is that it can be given in small quantities and multiple times in a given day. This is perhaps the reason that higher milk frequency is associated with higher HAZ. However, there are two cautions here. Estimating milk frequency as an ordinal variable imposes, through the ordered probit model, the structure that a higher frequency of consumption is associated with higher HAZ. A multinomial probit model could treat the different milk frequencies as a nominal variable so that there are separate estimates for each frequency of milk intake. This would enable us to understand whether a higher frequency reflects more closely spaced feedings for children of this age group, which are perhaps more appropriate for them. On the other hand, if higher milk frequency reflects more (and perhaps diluted) milk being used as a substitute for other food items, then it will be a nutritionally inadequate diet and hence have no impact, or even an adverse impact, on a child’s growth.

Thus, even though we do not interpret the magnitudes of the milk frequency coefficients as causal effects on HAZ, not accounting for the endogeneity of milk frequency within an appropriate empirical framework leads to completely different implications.

6 Conclusion

Several studies in recent years have shown the importance of crop diversity for the nutritional intake of farmers in several countries including India (Ruel Marie et al., 2018; Viswanathan & David, 2021; Nithya et al., 2021). However, work extending this to the allied sectors of agriculture, in particular livestock, poultry, and fishing, is very limited. As per the latest livestock census of India, 98% of the livestock is in rural areas. Here lies an opportunity to harness this potential by providing better resources and care to manage livestock and thereby improve the availability and accessibility of milk to rural children. Chaturvedi (2022), in a newspaper article, mentions that managing livestock through better technology will improve not only income but also employment; however, the article misses an important aspect, namely the consumption of milk from self-production. This could address the needs of those who have land and also own some livestock. For rural labour, whether in agriculture or non-agriculture, subsidies could be provided to both consumers and producers so that demand and supply are ensured in the local economy, given that milk is also a highly perishable food item. Alongside this, milk cooperatives and agri-food businesses producing milk powder could also be considered. As was observed in the study in Odisha, milk powder was readily available in remote and small villages of rural areas but was put to varied uses rather than being given to young children (Parasar et al., 2021). Awareness programmes on giving dairy milk to children from the 6th month onwards could play a significant role. The manual on infant and young child feeding practices by the National Health Mission hardly emphasizes the role of dairy milk and focuses more on other complementary foods and their feeding practices.

India has a significant vegetarian population, yet its diverse food cultures encourage the consumption of ASFs, especially dairy products. Milk is a major source of dietary energy, protein, fats, lactose, and a variety of vitamins and minerals. In comparison with many Plant-Based Foods (PBF), milk has a lower anti-nutrient content and is easily digestible by people of various ages. Different types of dairy products with different nutritive values can be prepared from milk to increase variety and avoid monotony in feeding children.

In India, production and transportation are also not major hurdles to the availability or accessibility of milk especially because of its local availability. Over the years from 1993–94 to 2011–12, the NSSO data shows that there has been a consistent increase in milk’s share in calorie and protein consumption in both rural and urban India (Purohit, 2020). However, there has not been a systematic spatial or temporal analysis of milk consumption among very young children based on several earlier surveys of the same database. More recent data for NFHS-5 covering the years 2019–20 is now available and can be used in future studies to enrich the analysis. More importantly, Purohit (2020) finds that states that have higher child undernourishment rates are also the states that do not report the district-level milk supply data thereby indicating the poor capacity of such institutions in enabling better and higher quality supply of milk within smaller administrative divisions.

Availability of the right kind of information on milk supply and consumption is a challenge, yet India’s well-equipped machinery for providing publicly available statistics has made it possible to analyse milk consumption patterns this far. Some researchers feel that the 24-h recall may not reflect the exact consumption pattern, but the one-week recall used by the DHS in other countries has been shown to be even worse. The statistical analysis from this study is worthwhile as it provides a sense of variation based on a single cross-section of children reporting milk consumption and its association with other covariates. Large sample surveys like NFHS are also very useful in carrying out a regional analysis, and combining the two rounds of such data for 2015–16 and 2019–20 could give further scope to analyse the districts with high rates of child undernutrition. Understanding seasonal patterns of consumption is also very relevant in the context of young children, as severe deprivations during the rainy season, harsh winters, or hot summers can set off growth faltering that may be difficult to address at a later stage. The NFHS surveys are not evenly spread across the quarters like the NSSO surveys and are limited in that respect.

This study used 2011–12 NSSO data on per-consumer unit proteins from milk intake for households with children in the 0–2-year age group as a proxy for the availability of milk at the district level. Some more analysis can be done from such consumption data, but it is rather dated now. A decade without an NSSO survey on household consumption has created a huge void, making it difficult to further examine milk intake patterns among households with young children and to understand the inequities that exist in the country on such aspects, which in turn impinge on the health of children.

A hitherto less used econometric specification was considered in this study to analyse the factors associated with milk consumption, and this is perhaps among the very few studies to analyse the frequency of milk consumed by very young children in India. Though there are methodological limitations in establishing a causal relationship between milk consumption and a child’s growth using observational data, the findings of this study show a strong positive association between milk consumption and a child’s linear growth.