The Shape of Data: Geometry-Based Machine Learning and Data Analysis in R



The Shape of Data is a practical guide to geometry-based machine learning, co-authored with Colleen M. Farrelly and published by No Starch Press. It works through persistent homology, the Mapper algorithm, metric geometry, and graph methods in R, for readers who want to apply these tools to real datasets rather than read about them in survey papers.

What you’ll learn

  • Topological Data Analysis (TDA): Persistent homology, persistence diagrams, Betti numbers, and the Mapper algorithm
  • Metric Geometry: Distance-based methods, embeddings, and curvature features for ML
  • Network Science: Graph representations, community detection, and centrality measures
  • Practical R Code: Hands-on implementations using real-world datasets

Who this book is for

The book is written for data scientists who want geometric tools beyond Euclidean distance, mathematicians curious about ML applications, and graduate students working at the intersection of topology and data. Assumed background: undergraduate math and basic R or Python.

Get the book