Data pre-processing with Python

Yaé Ulrich Gaba

Undergraduate | 30 hours | 10 chapters | EN

Description

A practical course covering the full data pre-processing pipeline: loading data from multiple sources, handling missing values, detecting outliers, encoding and scaling, feature engineering, text pre-processing, temporal data, and building reproducible scikit-learn pipelines. The course uses real, messy datasets (Titanic, FIFA 21 Raw, Adult Census, Jena Climate).

Table of contents

  1. Data landscape and pipelines
  2. Data loading (CSV, JSON, SQL, APIs)
  3. Missing data
  4. Outlier detection and treatment
  5. Data type transformations
  6. Feature engineering
  7. Text pre-processing
  8. Time series and temporal data
  9. Pipelines and automation
  10. Capstone project

Prerequisites

Basic Python and pandas. Familiarity with NumPy and Matplotlib is helpful.
