Week 06 — Modern ML: ANN, CNN, RNN

Module 6: deep learning end-to-end. MLP from scratch in NumPy, CNNs with a dataset-bias audit, LSTM on financial time series.


Deep learning end-to-end, with enough theory to know when *not* to use it.

What you ship this week

Three small deliverables: a 2-layer MLP built from scratch in NumPy and then ported to PyTorch, a CNN trained on a medical-imaging dataset with a dataset-bias audit, and an LSTM for sequence prediction on a financial time series.

Due: Friday 18:00 Africa/Lagos (UTC+1).
Submission: Drop the repo URL into the week's cohort channel. Peer-review pairing is announced the following Monday.
Rubric: Pass / revise. A pass requires green CI, tests covering the public API, and a README a stranger can follow to install and run the code.

Live sessions and labs

Default weekly cadence below. Cohort-specific dates and Zoom links are filled in at intake.

Day  Time         Block                                    Recording
Mon  09:00-12:00  Live instruction + code-along            post-session
Mon  14:00-16:00  Independent lab work + TA office hours   post-session
Tue  09:00-12:00  Live instruction + code-along            post-session
Tue  14:00-16:00  Independent lab work + TA office hours   post-session
Wed  09:00-12:00  Live instruction + code-along            post-session
Wed  14:00-16:00  Independent lab work + TA office hours   post-session
Thu  09:00-12:00  Live instruction + code-along            post-session
Thu  14:00-16:00  Independent lab work + TA office hours   post-session
Fri  10:00-11:00  Industry speaker                         post-session
Fri  11:30-12:30  Lab review                               post-session
Fri  14:00-15:00  Cohort retrospective                     post-session

Learning outcomes

By the end of the week, every participant will:

  1. Train a feed-forward neural network from scratch in NumPy, then in PyTorch.
  2. Build, train, and evaluate CNNs on image classification tasks.
  3. Build, train, and evaluate RNNs/LSTMs on sequence tasks.
  4. Diagnose training pathologies: vanishing gradients, overfitting, dead neurons, distribution shift (see the diagnostic sketch after this list).
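
Much of outcome 4 comes down to instrumentation. Below is a minimal PyTorch sketch of two such checks, assuming a standard `nn.Module` and a training loop that has already called `loss.backward()`; the function names are illustrative, not part of any lab's required API.

```python
import torch

@torch.no_grad()
def gradient_report(model: torch.nn.Module) -> None:
    """Print per-parameter gradient norms; call right after loss.backward().

    Norms shrinking toward zero in early layers suggest vanishing
    gradients; norms blowing up suggest a learning-rate or
    initialization problem.
    """
    for name, p in model.named_parameters():
        if p.grad is not None:
            print(f"{name:40s} |grad| = {p.grad.norm().item():.3e}")

def dead_relu_fraction(activations: torch.Tensor) -> float:
    """Fraction of ReLU units that are zero for every example in the
    batch (shape: batch x features); a persistently high value points
    to dead neurons."""
    return (activations <= 0).all(dim=0).float().mean().item()
```

To feed `dead_relu_fraction`, capture the hidden-layer activations with a forward hook (`module.register_forward_hook`) or by returning them from `forward`.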

Topics covered

Feed-forward networks, backpropagation, optimization (SGD, Adam, AdamW) · regularization (dropout, batch norm, weight decay, early stopping) · CNNs (LeNet, AlexNet, ResNet, modern architectures) · RNNs, LSTMs, GRUs · attention as a primitive · representation learning · the bitter lessons of deep learning (compute scaling, what doesn’t transfer).

Labs

Lab 1 — MLP from scratch then PyTorch

Implement a 2-layer MLP with manual backpropagation in NumPy and get it training on MNIST. Then port it to PyTorch and verify that the loss curves match to within stochastic noise.

Dataset: MNIST (canonical).
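
A minimal sketch of the forward and backward pass, assuming a ReLU hidden layer, softmax cross-entropy loss, and plain SGD; the sizes and names are illustrative, and the smoke test uses random data in place of real MNIST batches.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 784-dim inputs (flattened 28x28 MNIST), 128 hidden
# units, 10 classes. He initialization for the ReLU layer.
D, H, C, lr = 784, 128, 10, 0.1
W1 = rng.normal(0.0, np.sqrt(2.0 / D), (D, H)); b1 = np.zeros(H)
W2 = rng.normal(0.0, np.sqrt(2.0 / H), (H, C)); b2 = np.zeros(C)

def step(X, y):
    """One SGD step on a mini-batch; returns the mean cross-entropy loss."""
    n = X.shape[0]
    # Forward pass: affine -> ReLU -> affine -> softmax.
    z1 = X @ W1 + b1
    a1 = np.maximum(z1, 0.0)
    z2 = a1 @ W2 + b2
    z2 = z2 - z2.max(axis=1, keepdims=True)        # numerical stability
    p = np.exp(z2); p /= p.sum(axis=1, keepdims=True)
    loss = -np.log(p[np.arange(n), y]).mean()

    # Backward pass: for softmax + cross-entropy, dL/dz2 = (p - onehot) / n.
    dz2 = p.copy(); dz2[np.arange(n), y] -= 1.0; dz2 /= n
    dW2 = a1.T @ dz2; db2 = dz2.sum(axis=0)
    da1 = dz2 @ W2.T
    dz1 = da1 * (z1 > 0)                           # ReLU gradient gate
    dW1 = X.T @ dz1; db1 = dz1.sum(axis=0)

    # Plain SGD update, in place.
    for param, grad in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        param -= lr * grad
    return loss

# Smoke test on random data; swap in real MNIST mini-batches for the lab.
X = rng.normal(size=(64, D)); y = rng.integers(0, C, size=64)
for i in range(5):
    print(f"step {i}: loss = {step(X, y):.4f}")
```

The PyTorch port should reproduce exactly this computation with `nn.Linear`, `F.cross_entropy`, and `optim.SGD`, which is what makes the loss-curve comparison meaningful.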

Lab 2 — Medical-imaging CNN with dataset audit

Train a ResNet18 on a public chest-X-ray dataset. Then conduct a dataset-bias audit: does the model exploit hospital-specific artifacts? Write a 400-word memo on what you find.

Dataset: NIH ChestX-ray14 (small subset) + a second hospital's images for out-of-distribution test.
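
One way to set up the model, sketched under two assumptions: an ImageNet-pretrained torchvision ResNet18 as the backbone, and ChestX-ray14's 14 findings treated as a multi-label problem (an image can carry several findings at once), so the head is 14 independent logits trained with BCE rather than a softmax. Data loading is omitted and hyperparameters are illustrative.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_FINDINGS = 14  # ChestX-ray14 label set

# Swap the ImageNet classification head for a 14-logit multi-label head.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, NUM_FINDINGS)

criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-2)

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One update; images are (N, 3, 224, 224), labels are (N, 14) in {0, 1}."""
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(images), labels.float())
    loss.backward()
    optimizer.step()
    return loss.item()
```

For the audit itself, a simple first check is per-site evaluation: score the trained model separately on held-out images from each hospital and compare AUROC. A large gap between the training hospital and the second hospital is evidence the model leans on site-specific artifacts rather than pathology.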

Lab 3 — LSTM for financial time series

Predict 10-day-ahead returns on a multi-asset basket. Compare LSTM against a baseline AR(1) and a naive last-value predictor. Report the embarrassing gap, then explain why time-series prediction is harder than it looks.

Dataset: Yahoo Finance daily closes for a 10-ticker basket, 2010-2024.
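
A sketch of the three contenders, with the two baselines in NumPy and a minimal LSTM in PyTorch; the AR(1) fit is plain OLS on lag-one returns, and all names, window sizes, and the day-10 reading of "10-day-ahead" are illustrative assumptions, not a required interface.

```python
import numpy as np
import torch
import torch.nn as nn

def naive_forecast(r: np.ndarray) -> float:
    """Last-value baseline: the forecast is simply today's return."""
    return float(r[-1])

def ar1_forecast(r: np.ndarray, horizon: int = 10) -> float:
    """Fit r[t] = a + b * r[t-1] by OLS, then iterate the fitted map
    `horizon` steps for the day-10 forecast (a cumulative-return target
    would instead sum the iterated steps)."""
    x, y = r[:-1], r[1:]
    b = np.cov(x, y, bias=True)[0, 1] / np.var(x)  # OLS slope = cov / var
    a = y.mean() - b * x.mean()
    pred = r[-1]
    for _ in range(horizon):
        pred = a + b * pred
    return float(pred)

class ReturnLSTM(nn.Module):
    """Minimal LSTM: a window of past per-asset returns in, one
    10-day-ahead return per asset out."""
    def __init__(self, n_assets: int, hidden: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(n_assets, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_assets)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.lstm(x)           # x: (batch, window, n_assets)
        return self.head(out[:, -1])    # last hidden state -> prediction
```

Evaluate all three on the same walk-forward splits; the point of the lab is how small (or negative) the LSTM's edge over these two-line baselines turns out to be.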

Readings

Mandatory

Optional deepening
