Quantifying Infinite-Dimensional Data: Functional Data Analysis in Action

Statistics in Biosciences

Abstract

Functional data analysis (FDA) is concerned with inherently infinite-dimensional data objects and can therefore be viewed as part of the methodology for big data. The size of functional data may vary from terabytes, as encountered in functional magnetic resonance imaging (fMRI) and other applications in brain imaging, to just a few kilobytes in longitudinal data with small or modest sample sizes. In this contribution, we highlight some applications of FDA methodology through various data illustrations. We briefly review some basic computational tools that can be used to accelerate implementations of FDA methodology. The analyses presented in this paper illustrate the principal analysis by conditional expectation (PACE) package for FDA, with applications that include both relatively simple and more complex functional data from the biomedical sciences. The data we discuss range from functional data that result from daily movement profile tracking and are modeled as repeatedly observed functions per subject, to medfly longitudinal behavior profiles, where the goal is to predict the remaining lifetime of individual flies. We also discuss the quantification of connectivity of fMRI signals, which is of interest in brain imaging, and the prediction of continuous traits from high-dimensional SNPs in genomics. The FDA methods that we demonstrate for these analyses include functional principal component analysis, functional regression and correlation, the modeling of dependent functional data, and the stringing of high-dimensional data into functional data; all of these can be implemented with the PACE package.
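As a rough illustration of the first method named in the abstract, functional principal component analysis, the following Python sketch handles the simplest case of curves observed densely on a common, equally spaced grid. It is not the PACE implementation and does not use conditional expectation, which PACE employs for sparsely sampled data; the function name `fpca_dense`, the simulated curves, and all parameter values are assumptions made only for this example.

```python
import numpy as np

def fpca_dense(curves, grid, n_components=2):
    """Discretized FPCA for curves on a common, equally spaced grid (illustrative only)."""
    curves = np.asarray(curves, dtype=float)
    grid = np.asarray(grid, dtype=float)
    dt = grid[1] - grid[0]                        # equal spacing is assumed
    mean = curves.mean(axis=0)                    # pointwise mean function
    centered = curves - mean
    cov = centered.T @ centered / (curves.shape[0] - 1)  # sample covariance surface
    # Approximate the covariance operator's eigenproblem with a Riemann sum.
    evals, evecs = np.linalg.eigh(cov * dt)
    order = np.argsort(evals)[::-1][:n_components]
    eigenvalues = evals[order]
    eigenfunctions = evecs[:, order].T / np.sqrt(dt)     # unit L2 norm on the grid
    # Scores: numerical integral of (X_i - mean) * phi_k over the grid.
    scores = centered @ eigenfunctions.T * dt
    return mean, eigenvalues, eigenfunctions, scores

# Simulated example (hypothetical data): two orthonormal modes plus small noise.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 101)
phi1 = np.sqrt(2) * np.sin(2 * np.pi * grid)
phi2 = np.sqrt(2) * np.cos(2 * np.pi * grid)
xi = rng.normal(size=(200, 2)) * np.array([2.0, 1.0])   # score sd 2 and 1
curves = xi @ np.vstack([phi1, phi2]) + rng.normal(scale=0.05, size=(200, 101))

mean, eigenvalues, eigenfunctions, scores = fpca_dense(curves, grid)
```

In this simulation the leading eigenvalues should land near the score variances 4 and 1, and the first estimated eigenfunction should align, up to sign, with the sine mode. Sparse or irregularly sampled data would need smoothing of the mean and covariance before the eigen-decomposition, which this sketch omits.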

Acknowledgments

Research supported by NSF Grants DMS-1228369 and DMS-1407852.

Author information

Corresponding author

Correspondence to Kehui Chen.

Additional information

Dedicated to the memory of Bitao Liu.

Bitao Liu graduated with a Ph.D. in statistics from UC Davis in 2008 on topics in functional data analysis and made substantial contributions to the PACE package. She worked at Affymetrix and suffered a premature and unexpected death in October 2014.

Cite this article

Chen, K., Zhang, X., Petersen, A. et al. Quantifying Infinite-Dimensional Data: Functional Data Analysis in Action. Stat Biosci 9, 582–604 (2017). https://doi.org/10.1007/s12561-015-9137-5

  • DOI: https://doi.org/10.1007/s12561-015-9137-5