Abstract
Panel data regression models are among the most widely applied statistical approaches in many fields of research, including the social, behavioral, and environmental sciences and econometrics. However, the traditional least-squares-based techniques commonly used for panel data models are vulnerable to data contamination and outlying observations, which may result in biased and inefficient estimates and misleading statistical inference. In this study, we propose a minimum density power divergence estimation procedure for panel data regression models with random effects to achieve robustness against outliers. The robustness and the asymptotic properties of the proposed estimator are rigorously established. The finite-sample properties of the proposed method are investigated through an extensive simulation study and an application to climate data in Oman. Our results demonstrate that the proposed estimator performs better than some traditional and robust methods in the presence of data contamination.
References
Aquaro, M., Cizek, P. (2013). One-step robust estimation of fixed-effects panel data models. Computational Statistics and Data Analysis, 57(1), 536–548.
Athey, S., Bayati, M., Doudchenko, N., et al. (2021). Matrix completion methods for causal panel data models. Journal of the American Statistical Association, 1–15.
Bakar, N. M. A., Midi, H. (2015). Robust centering in the fixed effect panel data model. Pakistan Journal of Statistics, 31(1), 33–48.
Balestra, P., Nerlove, M. (1966). Pooling cross-section and time series data in the estimation of a dynamic model: The demand for natural gas. Econometrica, 34(3), 585–612.
Baltagi, B. H. (2005). Econometric analysis of panel data. Chichester: John Wiley and Sons.
Basak, S., Basu, A., Jones, M. (2021). On the ‘optimal’ density power divergence tuning parameter. Journal of Applied Statistics, 48(3), 536–556.
Basu, A., Ghosh, A., Mandal, A., et al. (2017). A Wald-type test statistic for testing linear hypothesis in logistic regression models based on minimum density power divergence estimator. Electronic Journal of Statistics, 11(2), 2741–2772.
Basu, A., Harris, I. R., Hjort, N. L., et al. (1998). Robust and efficient estimation by minimising a density power divergence. Biometrika, 85(3), 549–559.
Basu, A., Mandal, A., Martin, N., et al. (2013). Testing statistical hypotheses based on the density power divergence. Annals of the Institute of Statistical Mathematics, 65(2), 319–348.
Basu, A., Mandal, A., Martin, N., et al. (2018). Testing composite hypothesis based on the density power divergence. Sankhya, Ser. B, 80(2), 222–262.
Beyaztas, B. H., Bandyopadhyay, S. (2020). Robust estimation for linear panel data models. Statistics in Medicine, 39(29), 4421–4438.
Bramati, M. C., Croux, C. (2007). Robust estimators for the fixed effects panel data model. Econometrics Journal, 10(3), 521–540.
Cameron, A. C., Trivedi, P. K. (2005). Microeconometrics: Methods and applications. New York: Cambridge University Press.
Cizek, P. (2010). Reweighted least trimmed squares: an alternative to one-step estimators. CentER Discussion Paper Series 91/2010.
Cox, D. R., Hall, P. (2002). Estimation in a simple random effects model with nonnormal distributions. Biometrika, 89(4), 831–840.
Diggle, P. J., Heagerty, P., Liang, K.-Y., et al. (2002). Analysis of longitudinal data. United Kingdom: Oxford University Press.
Ferguson, T. S. (1996). A course in large sample theory. Texts in Statistical Science Series. London: Chapman & Hall.
Fitzmaurice, G. M., Laird, N. M., Ware, J. H. (2004). Applied longitudinal analysis. New York: John Wiley and Sons.
Fujisawa, H. (2013). Normalized estimating equation for robust parameter estimation. Electronic Journal of Statistics, 7, 1587–1606.
Fujisawa, H., Eguchi, S. (2008). Robust parameter estimation with a small bias against heavy contamination. Journal of Multivariate Analysis, 99(9), 2053–2081.
Gardiner, J. C., Luo, Z., Roman, L. A. (2009). Fixed effects, random effects and GEE: What are the differences? Statistics in Medicine, 28(2), 221–239.
Gervini, D., Yohai, V. J. (2002). A class of robust and fully efficient regression estimators. The Annals of Statistics, 30(2), 583–616.
Ghosh, A., Basu, A. (2013). Robust estimation for independent non-homogeneous observations using density power divergence with applications to linear regression. Electronic Journal of Statistics, 7, 2420–2456.
Ghosh, A., Mandal, A., Martin, N., et al. (2016). Influence analysis of robust Wald-type tests. Journal of Multivariate Analysis, 147, 102–126.
Greene, W. H. (2017). Econometric analysis. New York: Prentice Hall.
Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J., et al. (1986). Robust statistics: The approach based on influence functions. New York: John Wiley & Sons Inc.
Hsiao, C. (1985). Benefits and limitations of panel data. Econometric Reviews, 4(1), 121–174.
Hsiao, C. (2007). Panel data analysis - advantages and challenges. Test, 16(1), 1–22.
Huber, P. J. (1981). Robust statistics. Wiley Series in Probability and Mathematical Statistics. New York: John Wiley & Sons, Inc.
Jana, S., Basu, A. (2019). A characterization of all single-integral, non-kernel divergence estimators. IEEE Transactions on Information Theory, 65(12), 7976–7984.
Jirata, M. T., Chelule, J. C., Odhiambo, R. O. (2014). Deriving some estimators of panel data regression models with individual effects. International Journal of Science and Research, 3(5), 53–59.
Kennedy, P. (2003). A guide to econometrics. Cambridge: The MIT Press.
Kuchibhotla, A. K., Mukherjee, S., Basu, A. (2019). Statistical inference based on bridge divergences. Annals of the Institute of Statistical Mathematics, 71(3), 627–656.
Kutner, M. H., Nachtsheim, C. J., Neter, J. (2004). Applied linear regression models. New York: McGraw-Hill Education.
Laird, N. M., Ware, J. H. (1982). Random-effects models for longitudinal data. Biometrics, 38(4), 963–974.
Lamarche, C. (2010). Robust penalized quantile regression estimation for panel data. Journal of Econometrics, 157(2), 396–408.
Lehmann, E. L. (1999). Elements of large-sample theory. Springer Texts in Statistics. New York: Springer-Verlag.
Maciak, M. (2021). Quantile LASSO with changepoints in panel data models applied to option pricing. Econometrics and Statistics, 20, 166–175.
Maddala, G. S., Mount, T. D. (1973). A comparative study of alternative estimators for variance components models used in econometric applications. Journal of the American Statistical Association, 68(342), 324–328.
Mandal, A., Ghosh, S. (2019). Robust variable selection criteria for the penalized regression. arXiv preprint arXiv:1912.12550.
Maronna, R. A., Martin, R. D., Yohai, V. J. (2006). Robust statistics. Theory and methods. New York: John Wiley and Sons.
Maronna, R. A., Yohai, V. J. (2000). Robust regression with both continuous and categorical predictors. Journal of Statistical Planning and Inference, 89(1–2), 197–214.
Midi, H., Muhammad, S. (2018). Robust estimation for fixed and random effects panel data models with different centering methods. Journal of Engineering and Applied Sciences, 13(17), 7156–7161.
Mundlak, Y. (1978). On the pooling of time series and cross section data. Econometrica, 46(1), 69–85.
Rousseeuw, P. J. (1984). Least median of squares regression. Journal of the American Statistical Association, 79(388), 871–880.
Rousseeuw, P. J., Leroy, A. M. (2003). Robust regression and outlier detection. New York: John Wiley and Sons.
Rousseeuw, P. J., van Zomeren, B. C. (1990). Unmasking multivariate outliers and leverage points. Journal of the American Statistical Association, 85(411), 633–639.
Sherman, J., Morrison, W. J. (1950). Adjustment of an inverse matrix corresponding to a change in one element of a given matrix. The Annals of Mathematical Statistics, 21(1), 124–127.
Sugasawa, S., Yonekura, S. (2021). On selection criteria for the tuning parameter in robust divergence. Entropy, 23(9), 1147.
Visek, J. A. (2015). Estimating the model with fixed and random effects by a robust method. Methodology and Computing in Applied Probability, 17(4), 999–1014.
Wallace, T. D., Hussain, A. (1969). The use of error components models in combining cross section and time-series data. Econometrica, 37(1), 55–72.
Warwick, J., Jones, M. (2005). Choosing a robustness tuning parameter. Journal of Statistical Computation and Simulation, 75(7), 581–588.
Acknowledgements
The authors gratefully acknowledge the comments of two anonymous referees, which led to an improved version of the manuscript.
Appendix
A The Estimating Equations
Using Equation (17) of Supplementary Material, the DPD measure in Eq. (8) can be simplified as
where \(B_i = (y_i - x_i \beta )^T {{\Omega }}^{-1} (y_i - x_i \beta )\). Using the Sherman–Morrison formula (Sherman and Morrison 1950), we get
It further simplifies \(B_i\) as follows
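As a quick numerical check of the two simplifications above, the following Python sketch assumes the usual random-effects covariance \(\Omega = \sigma_\epsilon^2 I_T + \sigma_\alpha^2 1_T 1_T^T\) (an assumption here, since the definition is not restated in this appendix). Under that assumption, the Sherman–Morrison formula gives \(\Omega^{-1} = \sigma_\epsilon^{-2}\{I_T - \sigma_\alpha^2 (\sigma_\epsilon^2 + T\sigma_\alpha^2)^{-1} 1_T 1_T^T\}\), so that \(B_i\) depends on the residuals \(e_{it} = y_{it} - x_{it}^T\beta\) only through \(\sum_t e_{it}^2\) and \(\sum_t e_{it}\). The function names are illustrative and not taken from the paper.

```python
import numpy as np

def omega_inverse(sigma_alpha2, sigma_eps2, T):
    """Closed-form inverse of Omega = sigma_eps2 * I_T + sigma_alpha2 * 1 1^T
    (assumed random-effects covariance), via Sherman-Morrison."""
    ones = np.ones((T, 1))
    shrink = sigma_alpha2 / (sigma_eps2 + T * sigma_alpha2)
    return (np.eye(T) - shrink * (ones @ ones.T)) / sigma_eps2

def quadratic_form(resid, sigma_alpha2, sigma_eps2):
    """B_i = e^T Omega^{-1} e computed from residual sums only."""
    T = resid.shape[0]
    shrink = sigma_alpha2 / (sigma_eps2 + T * sigma_alpha2)
    return (np.sum(resid**2) - shrink * np.sum(resid) ** 2) / sigma_eps2

rng = np.random.default_rng(0)
T, sigma_alpha2, sigma_eps2 = 5, 1.5, 0.7
omega = sigma_eps2 * np.eye(T) + sigma_alpha2 * np.ones((T, T))
resid = rng.normal(size=T)

# Compare the closed forms with brute-force linear algebra.
inverse_matches = np.allclose(
    omega_inverse(sigma_alpha2, sigma_eps2, T), np.linalg.inv(omega)
)
quad_matches = np.isclose(
    quadratic_form(resid, sigma_alpha2, sigma_eps2),
    resid @ np.linalg.solve(omega, resid),
)
print(inverse_matches, quad_matches)
```

Working with the two residual sums avoids forming or inverting a \(T \times T\) matrix for every unit, which matters when the objective is evaluated many times inside an optimizer.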
The estimating equations for \(\theta \) are obtained from the equation \(\frac{\partial }{\partial \theta } {\widehat{d}}_\gamma (f_\theta , g) =0\), and the equations corresponding to \(\beta \), \(\sigma _\alpha ^2\) and \(\sigma _\epsilon ^2\) simplify to
where \( {\bar{x}}_i = \frac{1}{T}\sum _{t=1}^T x_{it}\). The MDPDE of \(\theta \) is obtained by solving the above system of equations. One may use an iterative algorithm for this purpose or directly minimize the DPD measure in Eq. (8) with respect to \(\theta \in \Theta _0\).
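The direct-minimization route mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a Gaussian random-effects model \(y_i \sim N_T(x_i\beta , \Omega )\) with \(\Omega = \sigma_\epsilon^2 I_T + \sigma_\alpha^2 1_T 1_T^T\), uses the standard closed form of the DPD objective for a multivariate normal density (dropping the term that does not depend on \(\theta \)), and relies on a generic Nelder–Mead search from a pooled least-squares start. The simulated data, the value \(\gamma = 0.5\), and all function names are assumptions made for the example.

```python
import numpy as np
from scipy.optimize import minimize

def dpd_objective(params, y, x, gamma):
    """Empirical DPD (up to a theta-free constant) for y_i ~ N_T(x_i beta, Omega).

    params = (beta, log sigma_alpha^2, log sigma_eps^2); the log scale keeps
    both variances positive. y has shape (N, T), x has shape (N, T, p).
    """
    p, T = x.shape[2], y.shape[1]
    beta = params[:p]
    s_a2, s_e2 = np.exp(params[p]), np.exp(params[p + 1])
    resid = y - x @ beta                                   # shape (N, T)
    shrink = s_a2 / (s_e2 + T * s_a2)
    B = ((resid**2).sum(axis=1) - shrink * resid.sum(axis=1) ** 2) / s_e2
    log_det = (T - 1) * np.log(s_e2) + np.log(s_e2 + T * s_a2)
    scale = np.exp(-0.5 * gamma * (T * np.log(2 * np.pi) + log_det))
    integral_term = (1 + gamma) ** (-T / 2)                # from int f^(1+gamma)
    data_term = (1 + 1 / gamma) * np.mean(np.exp(-0.5 * gamma * B))
    return scale * (integral_term - data_term)

# Simulated panel (hypothetical values): 5% of units are shifted upwards.
rng = np.random.default_rng(1)
N, T = 300, 4
beta_true = np.array([1.0, 2.0])
x = np.stack([np.ones((N, T)), rng.normal(size=(N, T))], axis=2)
y = x @ beta_true + rng.normal(size=(N, 1)) + rng.normal(scale=np.sqrt(0.5), size=(N, T))
y[:15] += 15.0

# Pooled least squares gives the starting value (and a non-robust benchmark).
x_flat, y_flat = x.reshape(-1, 2), y.reshape(-1)
beta_ols = np.linalg.lstsq(x_flat, y_flat, rcond=None)[0]
var0 = np.var(y_flat - x_flat @ beta_ols)
start = np.concatenate([beta_ols, np.log([var0 / 2, var0 / 2])])

fit = minimize(dpd_objective, start, args=(y, x, 0.5), method="Nelder-Mead",
               options={"maxiter": 5000, "maxfev": 5000, "xatol": 1e-8, "fatol": 1e-12})
beta_hat = fit.x[:2]
sigma_alpha2_hat, sigma_eps2_hat = np.exp(fit.x[2:])
print("pooled LS:", beta_ols, "MDPDE sketch:", beta_hat)
```

In this setup the contaminated units receive weights \(\exp (-\gamma B_i/2)\) close to zero, which is why the fitted intercept is far less affected by them than the pooled least-squares intercept. A derivative-free search is used only for simplicity; solving the estimating equations iteratively is the alternative the text describes.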
B Regularity Conditions
For the asymptotic distribution of the MDPDE, we need the following assumptions:
- (A1) The true density g(y|x) is supported over the entire real line \(\mathbb {R}\).
- (A2) There is an open subset \(\omega \subset \Theta _0\) containing the best fitting parameter \(\theta \) such that \({{J}}\) is positive definite for all \(\theta \in \omega \).
- (A3) There exist functions \(M_{jkl}(x, y)\) such that \(|\partial ^3 \exp [(y - x \beta )^T {{\Omega }}^{-1} (y - x \beta )] /\partial \theta _j \partial \theta _k \partial \theta _l | \le M_{jkl}(x, y)\) for all \(\theta \in \omega \), where \(\int _x \int _y |M_{jkl}(x, y)| g(y|x) h(x) dy dx < \infty \) for all j, k and l.
Note that these regularity conditions hold for the contaminated model, defined in Lemma 1, when \(\eta (\gamma )\) is sufficiently small.
C Proof of Theorem 1
Proof
The proof of the first part closely follows the proof of consistency of the maximum likelihood estimator, with modifications along the lines of Theorem 3.1 of Ghosh and Basu (2013). For brevity, we present a detailed proof of the second part only.
Let \(\widehat{\theta }\) be the MDPDE of \(\theta \). Then
Thus, it can be written as the estimating equation of an M-estimator as follows
where
Let \(\theta _g\) be the true value of \(\theta \), then \( E\left( \sum _{i=1}^N \Psi _{\theta _g}(y_i|x_i)\right) = 0\) gives
Taking a Taylor series expansion of Eq. (23), we get
where \(R_N\) is the remainder term. Using the weak law of large numbers (WLLN), we have
So
From Eq. (25), we get
Now,
Following Section 5 of Ferguson (1996) or Section 2.7 of Lehmann (1999) and using Eqs. (29) and (30), the central limit theorem (CLT) for independent but not identically distributed random variables gives
Under regularity condition (A3), it can be shown that the remainder term satisfies \(\sqrt{N}R_N = o_p(1)\). Therefore, combining Eqs. (28) and (31), we get from Eq. (26)
This completes the proof. \(\square \)
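In generic M-estimation notation, the second part of the proof has the following familiar shape. This is only a schematic summary, under the assumption that \(J\) denotes the limiting average of \(-\partial \Psi _\theta /\partial \theta ^T\) at \(\theta _g\) and \(K\) the limiting average covariance matrix of \(\Psi _{\theta _g}(y_i|x_i)\); the signs and scalings used in the numbered equations of the proof may differ.

\[
0 = \frac{1}{N}\sum _{i=1}^N \Psi _{\widehat{\theta }}(y_i|x_i) = \frac{1}{N}\sum _{i=1}^N \Psi _{\theta _g}(y_i|x_i) + \left\{ \frac{1}{N}\sum _{i=1}^N \frac{\partial \Psi _{\theta }(y_i|x_i)}{\partial \theta ^T}\Big |_{\theta =\theta _g} \right\} (\widehat{\theta } - \theta _g) + R_N ,
\]

\[
\sqrt{N}(\widehat{\theta } - \theta _g) = J^{-1} \frac{1}{\sqrt{N}}\sum _{i=1}^N \Psi _{\theta _g}(y_i|x_i) + o_p(1) \rightarrow N\left( 0, J^{-1} K J^{-1}\right) \quad \text{in distribution.}
\]

The WLLN handles the averaged derivative term, the CLT handles the normalized sum of \(\Psi _{\theta _g}\), and \(\sqrt{N}R_N = o_p(1)\) removes the remainder.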
D \({{J}}\) and \({{K}}\) Matrices at the Model
Let us write the score function as
Suppose \( {\bar{x}}_i = \frac{1}{T}\sum _{t=1}^T x_{it}\). Then, it can be shown that
Note that if the true distribution g(y|x) is a member of the model family \(f_\theta (y|x)\) for some \(\theta \in \Theta _0\), then
In this case, the symmetric matrix \({{J}}^{(i)}\) can be partitioned as
where
and
Similarly, \(\xi ^{(i)}\) can be partitioned as \(\xi ^{(i)} = \left( \xi _\beta ^{(i)T}, \xi _{\sigma _\alpha ^2}^{(i)} , \xi _{\sigma _\epsilon ^2}^{(i)} \right) ^T\), and it is shown that
Note that if we write the matrix \({{J}}^{(i)}\) as a function of \(\gamma \), i.e., \({{J}}^{(i)} \equiv {{J}}^{(i)}(\gamma )\), then we have
Moreover, \(\xi _\beta ^{(i)}\) is constant for all values of \(i=1, 2, \cdots , N\). Therefore, \({{K}}\) can be written as
Cite this article
Mandal, A., Beyaztas, B.H. & Bandyopadhyay, S. Robust density power divergence estimates for panel data models. Ann Inst Stat Math 75, 773–798 (2023). https://doi.org/10.1007/s10463-022-00862-2