Week 13 · Part IV · Integration

Integration & Transfer Learning

Instructor lesson plan: lecture (3 h) and practice (2 h).

Learning objectives

Fine-tune a pretrained model end-to-end.
Run inference and assemble a full workflow.
Reason about when transfer learning helps.

🎓 Lecture · 3 hours

0:00–0:10	10 min	Recap & retrieval	Open with two quick questions on last week's material (retrieval practice), then state this week's objectives.
0:10–0:25	15 min	Motivation	Training from scratch is rare; standing on pretrained models is the bridge to the advanced courses.
0:25–1:10	45 min	Transfer learning	A model pretrained on a large dataset already knows general-purpose features. Feature extraction freezes the backbone and trains only a new head (good for small, similar data). Fine-tuning updates the whole network, usually with a smaller learning rate on the pretrained layers. Match the input preprocessing to what the pretrained model expects.
1:10–1:20	10 min	Break
1:20–2:05	45 min	The end-to-end workflow	Data, model, train, evaluate, infer: the full pipeline assembled in one place. Split and load the data, choose a loss and metric, train with validation, and touch the test set once. Inference uses model.eval() and no_grad(); save and load checkpoints. This foundation carries directly into the advanced language and vision courses.
2:05–2:35	30 min	Live demo (predict, then run)	Ask the class to predict the ranking of from-scratch, frozen-features, and fine-tuning on small data before showing the three results. Fine-tune a pretrained ResNet, compare from-scratch versus frozen versus fine-tuned, and run inference.
2:35–2:50	15 min	Wrap-up & practice preview	Revisit the misconception and concept checks below, recap the takeaways, and preview the practice lesson.
2:50–3:00	10 min	Buffer & questions
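
Workflow sketch for the 1:20 block. This is a minimal sketch of split, train with validation, and a single test evaluation, assuming PyTorch. The synthetic data, the linear model, the split sizes, and the `accuracy` helper are illustrative stand-ins, not the course notebook's code.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset, random_split

torch.manual_seed(0)

# Synthetic stand-in data (assumption): 200 samples, 10 features, 2 classes.
X = torch.randn(200, 10)
y = (X[:, 0] > 0).long()
dataset = TensorDataset(X, y)

# Split once into train / validation / test and wrap each part in a loader.
train_set, val_set, test_set = random_split(dataset, [140, 30, 30])
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
val_loader = DataLoader(val_set, batch_size=32)
test_loader = DataLoader(test_set, batch_size=32)

model = nn.Linear(10, 2)                       # placeholder model
loss_fn = nn.CrossEntropyLoss()                # loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)


def accuracy(model: nn.Module, loader: DataLoader) -> float:
    """Metric: fraction of correct predictions, computed in eval mode without gradients."""
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for xb, yb in loader:
            correct += (model(xb).argmax(dim=1) == yb).sum().item()
            total += yb.numel()
    return correct / total


# Train, checking the validation set after every epoch.
val_history = []
for epoch in range(5):
    model.train()
    for xb, yb in train_loader:
        optimizer.zero_grad()
        loss_fn(model(xb), yb).backward()
        optimizer.step()
    val_history.append(accuracy(model, val_loader))

# Touch the test set exactly once, after all modelling choices are final.
test_accuracy = accuracy(model, test_loader)
```

All decisions (epochs, learning rate, architecture) are made against `val_history`; `test_accuracy` is computed only at the end so that it stays an honest estimate.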

Common misconception to confront.

Students often think: To use a pretrained model on your data, retrain the whole network from scratch.
Set it straight: Usually you freeze the pretrained backbone and train only a new head, or fine-tune with a small learning rate. The pretrained features are the value, and retraining from scratch discards them.

Check for understanding (pose during the concept blocks; let students answer before revealing).

You have 500 labeled images similar to ImageNet. Feature-extract or fine-tune everything?

Feature-extract: freeze the backbone, train a new head. With a small dataset that resembles the pretraining data, fine-tuning all weights risks overfitting.
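
To make this answer concrete on screen, here is a minimal sketch assuming PyTorch. The small convolutional `make_backbone` is a hypothetical stand-in for a real pretrained network (in the demo this role is played by the pretrained ResNet), and `build_model` and `count_trainable` are illustrative helpers.

```python
import torch
from torch import nn

torch.manual_seed(0)


def make_backbone() -> nn.Sequential:
    # Stand-in for a pretrained backbone (assumption); outputs 8 features per image.
    return nn.Sequential(
        nn.Conv2d(3, 8, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
    )


def build_model(num_classes: int, freeze_backbone: bool) -> nn.Sequential:
    backbone = make_backbone()
    if freeze_backbone:
        # Feature extraction: the backbone receives no gradient updates.
        for p in backbone.parameters():
            p.requires_grad = False
    head = nn.Linear(8, num_classes)  # new task-specific head, always trainable
    return nn.Sequential(backbone, head)


def count_trainable(model: nn.Module) -> int:
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


feature_extractor = build_model(num_classes=5, freeze_backbone=True)
fine_tuned = build_model(num_classes=5, freeze_backbone=False)

# One backward pass: only the head of the feature extractor accumulates gradients.
batch = torch.randn(2, 3, 32, 32)
feature_extractor(batch).sum().backward()
backbone_has_grads = any(p.grad is not None for p in feature_extractor[0].parameters())
head_has_grads = all(p.grad is not None for p in feature_extractor[1].parameters())
```

Comparing `count_trainable(feature_extractor)` with `count_trainable(fine_tuned)` shows how few parameters the frozen variant has to fit from the 500 images.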

Why must you match the pretrained model’s input preprocessing?

The pretrained features were learned on inputs normalized in a specific way; a mismatch shifts the input distribution and degrades the features.
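
A minimal sketch of what "matching the preprocessing" can look like, assuming PyTorch and assuming the model was pretrained with the per-channel mean and standard deviation commonly used for ImageNet models. Check the documentation of the specific pretrained weights before relying on these values; `preprocess` is an illustrative helper.

```python
import torch

# Per-channel statistics commonly used with ImageNet-pretrained models (assumption:
# the actual values must come from the documentation of the weights you load).
IMAGENET_MEAN = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
IMAGENET_STD = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)


def preprocess(image_uint8: torch.Tensor) -> torch.Tensor:
    """Map a (3, H, W) uint8 image in [0, 255] to a normalized float tensor."""
    scaled = image_uint8.float() / 255.0            # [0, 255] -> [0, 1]
    return (scaled - IMAGENET_MEAN) / IMAGENET_STD  # channel-wise standardization


torch.manual_seed(0)
image = torch.randint(0, 256, (3, 224, 224), dtype=torch.uint8)

matched = preprocess(image)   # the range the pretrained features expect
mismatched = image.float()    # raw pixel values: a very different input distribution
```

Printing the value ranges of `matched` and `mismatched` gives students a quick picture of how far off the inputs are when the normalization step is skipped.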

Key takeaways.

Transfer learning usually beats training from scratch on small data.
Freeze first, then unfreeze gradually.
Match the pretrained model's input preprocessing.
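
"Freeze first, then unfreeze gradually" can be sketched as a staged schedule. This is a minimal illustration assuming PyTorch; the three linear `blocks` stand in for the stages of a pretrained backbone, and `set_trainable`, `unfreeze_last`, and `trainable_blocks` are hypothetical helper names.

```python
from torch import nn

# Stand-in backbone made of three blocks (assumption); block 2 is closest to the head.
blocks = nn.ModuleList([nn.Linear(16, 16) for _ in range(3)])
head = nn.Linear(16, 4)  # the new head stays trainable throughout


def set_trainable(module: nn.Module, trainable: bool) -> None:
    for p in module.parameters():
        p.requires_grad = trainable


def unfreeze_last(blocks: nn.ModuleList, n: int) -> None:
    """Freeze every block, then make only the last n blocks trainable."""
    for block in blocks:
        set_trainable(block, False)
    if n > 0:  # guard: blocks[-0:] would select every block
        for block in blocks[-n:]:
            set_trainable(block, True)


def trainable_blocks(blocks: nn.ModuleList) -> list:
    return [i for i, b in enumerate(blocks) if all(p.requires_grad for p in b.parameters())]


# Stage 0 trains the head only; each later stage unfreezes one more block from the top.
schedule = []
for stage in range(4):
    unfreeze_last(blocks, stage)
    schedule.append(trainable_blocks(blocks))
    # ... train for a few epochs at this stage ...
```

If the optimizer was built from only the trainable parameters, newly unfrozen blocks need to be added to it (for example with `optimizer.add_param_group`) or the optimizer rebuilt at each stage.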

💻 Practice · 2 hours

In the practice lesson the instructor demonstrates implementations, runs code, and works through examples, using the practice notebook linked below. The weekly lab is then set as homework, where students apply this themselves.

0:00–0:10	10 min	Setup & recap	Recap the lecture's key ideas and open the working notebook.
0:10–1:00	50 min	Instructor demonstrations	Load a pretrained model and fine-tune it live on a new task. Compare from-scratch, frozen-features, and fine-tuning side by side.
1:00–1:05	5 min	Break
1:05–1:45	40 min	Instructor demonstrations (continued)	Run inference on new inputs end to end.
1:45–2:00	15 min	Wrap-up & lab brief	Summarize the patterns shown and brief the weekly lab (homework), which students complete on their own.
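
For the inference demonstration, a minimal sketch of saving a checkpoint, restoring it, and predicting on new inputs, assuming PyTorch. The tiny model from `make_model`, the file name `model.pt`, and the `predict` helper are illustrative assumptions, not the notebook's code.

```python
import os
import tempfile

import torch
from torch import nn

torch.manual_seed(0)


def make_model() -> nn.Sequential:
    # Dropout makes train and eval modes behave differently, which is why eval() matters.
    return nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Dropout(p=0.5), nn.Linear(16, 3))


model = make_model()  # imagine this model has just been fine-tuned

with tempfile.TemporaryDirectory() as tmp_dir:
    checkpoint_path = os.path.join(tmp_dir, "model.pt")
    # Save only the weights (the state dict), not the whole Python object.
    torch.save(model.state_dict(), checkpoint_path)

    # Later, or in another process: rebuild the architecture, then load the weights.
    restored = make_model()
    restored.load_state_dict(torch.load(checkpoint_path))


def predict(model: nn.Module, inputs: torch.Tensor) -> torch.Tensor:
    model.eval()              # disable dropout, use running batch-norm statistics
    with torch.no_grad():     # no gradient bookkeeping during inference
        return model(inputs).argmax(dim=1)


new_inputs = torch.randn(4, 8)
original_preds = predict(model, new_inputs)
restored_preds = predict(restored, new_inputs)
```

A useful live check is that `original_preds` and `restored_preds` agree, which confirms the checkpoint round trip before moving on to real inputs.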

Common pitfalls to pre-empt.

Use a smaller learning rate for pretrained layers; unfreeze gradually.
Match input preprocessing to the pretrained model.
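
The smaller learning rate for pretrained layers can be set with optimizer parameter groups. This is a minimal sketch assuming PyTorch; the stand-in `backbone`, the `head`, and the specific learning rates are illustrative assumptions, not recommended values.

```python
import torch
from torch import nn

backbone = nn.Sequential(nn.Linear(32, 16), nn.ReLU())  # stand-in for pretrained layers
head = nn.Linear(16, 4)                                  # freshly initialized head
model = nn.Sequential(backbone, head)

# One parameter group per part of the network, each with its own learning rate.
optimizer = torch.optim.SGD(
    [
        {"params": backbone.parameters(), "lr": 1e-4},  # small steps: preserve pretrained features
        {"params": head.parameters(), "lr": 1e-2},      # larger steps: the head starts from scratch
    ],
    momentum=0.9,
)

group_lrs = [group["lr"] for group in optimizer.param_groups]
```

Inspecting `optimizer.param_groups` during the demo lets students confirm which learning rate applies to which part of the model.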

Open the practice notebook in Colab · Curated references · Lab (homework)