Data pre-processing with Python
Yaé Ulrich Gaba
Undergraduate | 30 hours | 10 chapters | EN
Description
A practical course covering the full data pre-processing pipeline: loading data from multiple sources, handling missing values, detecting and treating outliers, encoding and scaling, feature engineering, text pre-processing, temporal data, and building reproducible scikit-learn pipelines. The course uses real, messy datasets (Titanic, FIFA 21 Raw, Adult Census, Jena Climate).
Table of contents
- Chapter 1 Data landscape and pipelines
- Chapter 2 Data loading (CSV, JSON, SQL, APIs)
- Chapter 3 Missing data
- Chapter 4 Outlier detection and treatment
- Chapter 5 Data type transformations
- Chapter 6 Feature engineering
- Chapter 7 Text pre-processing
- Chapter 8 Time series and temporal data
- Chapter 9 Pipelines and automation
- Chapter 10 Capstone project
Prerequisites
Basic Python and pandas. Familiarity with NumPy and Matplotlib is helpful.