Abstract
This chapter reviews statistical and machine learning models, arguably the most popular approaches to tackling data science problems. Both supervised and unsupervised algorithms are described, along with practical considerations for using these methods. Where applicable, empirical results on exemplar datasets illustrate how these methods apply to real-world problems.
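To give a concrete flavor of the two paradigms the abstract distinguishes, the sketch below (an illustration only, not code from the chapter) fits a supervised model, a random forest classifier, and an unsupervised one, k-means clustering, on scikit-learn's bundled iris dataset. The dataset choice and evaluation metrics are assumptions for the example, not taken from the chapter.

```python
# Supervised vs. unsupervised learning on the iris dataset (illustrative sketch).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

X, y = load_iris(return_X_y=True)

# Supervised: the labels y guide the fit; held-out accuracy measures generalization.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(f"test accuracy: {clf.score(X_te, y_te):.2f}")

# Unsupervised: no labels are used; clusters are seeded with k-means++ and
# assessed with the average silhouette width (higher = better-separated clusters).
km = KMeans(n_clusters=3, init="k-means++", n_init=10, random_state=0).fit(X)
print(f"silhouette: {silhouette_score(X, km.labels_):.2f}")
```

Note the asymmetry in evaluation: the supervised model can be scored against ground-truth labels, while the clustering is judged only by internal criteria such as silhouette width.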
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this chapter
Venugopal, D., Deng, LY., Garzon, M. (2022). Solutions to Data Science Problems. In: Garzon, M., Yang, CC., Venugopal, D., Kumar, N., Jana, K., Deng, LY. (eds) Dimensionality Reduction in Data Science. Springer, Cham. https://doi.org/10.1007/978-3-031-05371-9_2
DOI: https://doi.org/10.1007/978-3-031-05371-9_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-05370-2
Online ISBN: 978-3-031-05371-9
eBook Packages: Mathematics and Statistics (R0)