Dimensionality Reduction

Creating Manageable Training Datasets

Thinking Data Science

Part of the book series: The Springer Series in Applied Machine Learning ((SSAML))

Abstract

Having more features for inference may seem better than having just a handful. In machine learning, however, data scientists usually treat a large number of features as a “curse of dimensionality.” The reasoning is that high dimensionality makes data exploration and visualization difficult, and training models on high-dimensional datasets is computationally expensive. Moreover, not every dimension plays a significant role in learning. Reducing the number of dimensions is therefore usually an advantage. Several techniques are available for dimensionality reduction, and I cover nearly 14 of them in this chapter. Some are trivial and require manual inspection, while many of the advanced techniques are fully automated. To list a few, I discuss factor analysis, PCA, ICA, t-SNE, UMAP, SVD, and LDA. I describe each technique with implementation code on an appropriate dataset and perform a series of experiments to show its effectiveness. This will help you gain a solid knowledge of dimensionality reduction, which is a critical step in the data science process.
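
As a rough illustration of the idea of reducing dimensions, the following is a minimal sketch of PCA computed through an SVD with NumPy. It is not the chapter's own code: the function name `pca_reduce`, the synthetic dataset, and all parameter values are assumptions made only for this example.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X (n_samples x n_features) onto its top principal components.

    Hypothetical helper for illustration; not taken from the chapter.
    """
    # Center each feature so the SVD captures variance around the mean
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance explained by each direction, as a fraction of the total
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    ratio = explained_variance[:n_components] / explained_variance.sum()
    return X_centered @ components.T, ratio

# Synthetic data (an assumption): 10 observed features driven by 2 latent factors
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 10))

X_reduced, ratio = pca_reduce(X, n_components=2)
print(X_reduced.shape)  # 200 samples, now described by 2 features
print(ratio.sum())      # fraction of total variance kept by the 2 components
```

Because the ten observed features here are built from only two underlying factors, two components should retain almost all of the variance. Real datasets rarely compress this cleanly, so the retained-variance fraction is worth checking before discarding dimensions.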

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Sarang, P. (2023). Dimensionality Reduction. In: Thinking Data Science. The Springer Series in Applied Machine Learning. Springer, Cham. https://doi.org/10.1007/978-3-031-02363-7_2
