Week 06 — Modern ML: ANN, CNN, RNN

Module 6: deep learning end-to-end. MLP from scratch in NumPy, CNNs with a dataset-bias audit, LSTM on financial time series.


Deep learning end-to-end, with enough theory to know when *not* to use it.

What you ship this week

Three small deliverables: a 2-layer MLP built from scratch in NumPy and then ported to PyTorch, a CNN trained on a medical-imaging dataset with a dataset-bias audit, and an LSTM for sequence prediction on a financial time series.

Due: Friday 18:00 Africa/Lagos (UTC+1).
Submission: Drop the repo URL into the week's cohort channel. Peer-review pairing is announced the following Monday.
Rubric: Pass / revise. A pass requires green CI, tests covering the public API, and a README a stranger can follow to install and run the code.

Live sessions and labs

Default weekly cadence below. Cohort-specific dates and Zoom links are filled in at intake.

Day  Time         Block                                    Recording
Mon  09:00-12:00  Live instruction + code-along            post-session
Mon  14:00-16:00  Independent lab work + TA office hours   post-session
Tue  09:00-12:00  Live instruction + code-along            post-session
Tue  14:00-16:00  Independent lab work + TA office hours   post-session
Wed  09:00-12:00  Live instruction + code-along            post-session
Wed  14:00-16:00  Independent lab work + TA office hours   post-session
Thu  09:00-12:00  Live instruction + code-along            post-session
Thu  14:00-16:00  Independent lab work + TA office hours   post-session
Fri  10:00-11:00  Industry speaker                         post-session
Fri  11:30-12:30  Lab review                               post-session
Fri  14:00-15:00  Cohort retrospective                     post-session

Learning outcomes

By the end of the week, every participant will:

  1. Train a feed-forward neural network from scratch in NumPy, then in PyTorch.
  2. Build, train, and evaluate CNNs on image classification tasks.
  3. Build, train, and evaluate RNNs/LSTMs on sequence tasks.
  4. Diagnose training pathologies: vanishing gradients, overfitting, dead neurons, distribution shift (see the diagnostic sketch after this list).
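
Much of outcome 4 comes down to instrumentation. Below is a minimal PyTorch sketch of two such checks, assuming a standard `nn.Module` and a training loop that has already called `loss.backward()`; the function names are illustrative, not part of any lab's required API.

```python
import torch

@torch.no_grad()
def gradient_report(model: torch.nn.Module) -> None:
    """Print per-parameter gradient norms; call right after loss.backward().

    Norms shrinking toward zero in early layers suggest vanishing
    gradients; norms blowing up suggest a learning-rate or
    initialization problem.
    """
    for name, p in model.named_parameters():
        if p.grad is not None:
            print(f"{name:40s} |grad| = {p.grad.norm().item():.3e}")

def dead_relu_fraction(activations: torch.Tensor) -> float:
    """Fraction of ReLU units that are zero for every example in the
    batch (shape: batch x features); a persistently high value points
    to dead neurons."""
    return (activations <= 0).all(dim=0).float().mean().item()
```

To feed `dead_relu_fraction`, capture the hidden-layer activations with a forward hook (`module.register_forward_hook`) or by returning them from `forward`.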

Topics covered

Feed-forward networks, backpropagation, optimization (SGD, Adam, AdamW) · regularization (dropout, batch norm, weight decay, early stopping) · CNNs (LeNet, AlexNet, ResNet, modern architectures) · RNNs, LSTMs, GRUs · attention as a primitive · representation learning · the bitter lessons of deep learning (compute scaling, what doesn’t transfer).

Labs

Lab 1 — MLP from scratch then PyTorch

Implement a 2-layer MLP with manual backpropagation in NumPy and get it training on MNIST. Then port it to PyTorch and verify that the loss curves match to within stochastic noise.

Dataset: MNIST (canonical).
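
A minimal sketch of the forward and backward pass, assuming a ReLU hidden layer, softmax cross-entropy loss, and plain SGD; the sizes and names are illustrative, and the smoke test uses random data in place of real MNIST batches.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 784-dim inputs (flattened 28x28 MNIST), 128 hidden
# units, 10 classes. He initialization for the ReLU layer.
D, H, C, lr = 784, 128, 10, 0.1
W1 = rng.normal(0.0, np.sqrt(2.0 / D), (D, H)); b1 = np.zeros(H)
W2 = rng.normal(0.0, np.sqrt(2.0 / H), (H, C)); b2 = np.zeros(C)

def step(X, y):
    """One SGD step on a mini-batch; returns the mean cross-entropy loss."""
    n = X.shape[0]
    # Forward pass: affine -> ReLU -> affine -> softmax.
    z1 = X @ W1 + b1
    a1 = np.maximum(z1, 0.0)
    z2 = a1 @ W2 + b2
    z2 = z2 - z2.max(axis=1, keepdims=True)        # numerical stability
    p = np.exp(z2); p /= p.sum(axis=1, keepdims=True)
    loss = -np.log(p[np.arange(n), y]).mean()

    # Backward pass: for softmax + cross-entropy, dL/dz2 = (p - onehot) / n.
    dz2 = p.copy(); dz2[np.arange(n), y] -= 1.0; dz2 /= n
    dW2 = a1.T @ dz2; db2 = dz2.sum(axis=0)
    da1 = dz2 @ W2.T
    dz1 = da1 * (z1 > 0)                           # ReLU gradient gate
    dW1 = X.T @ dz1; db1 = dz1.sum(axis=0)

    # Plain SGD update, in place.
    for param, grad in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        param -= lr * grad
    return loss

# Smoke test on random data; swap in real MNIST mini-batches for the lab.
X = rng.normal(size=(64, D)); y = rng.integers(0, C, size=64)
for i in range(5):
    print(f"step {i}: loss = {step(X, y):.4f}")
```

The PyTorch port should reproduce exactly this computation with `nn.Linear`, `F.cross_entropy`, and `optim.SGD`, which is what makes the loss-curve comparison meaningful.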

Lab 2 — Medical-imaging CNN with dataset audit

Train a ResNet18 on a public chest-X-ray dataset. Then conduct a dataset-bias audit: does the model exploit hospital-specific artifacts? Write a 400-word memo on what you find.

Dataset: NIH ChestX-ray14 (small subset) + a second hospital's images for out-of-distribution test.
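
One way to set up the model, sketched under two assumptions: an ImageNet-pretrained torchvision ResNet18 as the backbone, and ChestX-ray14's 14 findings treated as a multi-label problem (an image can carry several findings at once), so the head is 14 independent logits trained with BCE rather than a softmax. Data loading is omitted and hyperparameters are illustrative.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_FINDINGS = 14  # ChestX-ray14 label set

# Swap the ImageNet classification head for a 14-logit multi-label head.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, NUM_FINDINGS)

criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-2)

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One update; images are (N, 3, 224, 224), labels are (N, 14) in {0, 1}."""
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(images), labels.float())
    loss.backward()
    optimizer.step()
    return loss.item()
```

For the audit itself, a simple first check is per-site evaluation: score the trained model separately on held-out images from each hospital and compare AUROC. A large gap between the training hospital and the second hospital is evidence the model leans on site-specific artifacts rather than pathology.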

Lab 3 — LSTM for financial time series

Predict 10-day-ahead returns on a multi-asset basket. Compare LSTM against a baseline AR(1) and a naive last-value predictor. Report the embarrassing gap, then explain why time-series prediction is harder than it looks.

Dataset: Yahoo Finance daily closes for a 10-ticker basket, 2010-2024.
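
A sketch of the three contenders, with the two baselines in NumPy and a minimal LSTM in PyTorch; the AR(1) fit is plain OLS on lag-one returns, and all names, window sizes, and the day-10 reading of "10-day-ahead" are illustrative assumptions, not a required interface.

```python
import numpy as np
import torch
import torch.nn as nn

def naive_forecast(r: np.ndarray) -> float:
    """Last-value baseline: the forecast is simply today's return."""
    return float(r[-1])

def ar1_forecast(r: np.ndarray, horizon: int = 10) -> float:
    """Fit r[t] = a + b * r[t-1] by OLS, then iterate the fitted map
    `horizon` steps for the day-10 forecast (a cumulative-return target
    would instead sum the iterated steps)."""
    x, y = r[:-1], r[1:]
    b = np.cov(x, y, bias=True)[0, 1] / np.var(x)  # OLS slope = cov / var
    a = y.mean() - b * x.mean()
    pred = r[-1]
    for _ in range(horizon):
        pred = a + b * pred
    return float(pred)

class ReturnLSTM(nn.Module):
    """Minimal LSTM: a window of past per-asset returns in, one
    10-day-ahead return per asset out."""
    def __init__(self, n_assets: int, hidden: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(n_assets, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_assets)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.lstm(x)           # x: (batch, window, n_assets)
        return self.head(out[:, -1])    # last hidden state -> prediction
```

Evaluate all three on the same walk-forward splits; the point of the lab is how small (or negative) the LSTM's edge over these two-line baselines turns out to be.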

Readings

Mandatory

Optional deepening
