Week 1 · Part I · Foundations

Deep Learning Overview & ML-to-Network Framing

Instructor lesson plan: lecture (3 h) and practice (2 h).

Learning objectives

Set up the PyTorch toolchain and confirm it runs.
Frame any ML task as tensors in, a model, and a loss out.
Run a first end-to-end training loop and read the loss curve.
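
One possible way to confirm the toolchain runs is a short smoke test like the sketch below. It assumes PyTorch is already installed and importable as `torch`; the tensor values are arbitrary and chosen only so the results are easy to check by hand.

```python
import torch

# Pick the GPU if one is visible to PyTorch, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A tiny tensor operation: a (2, 3) matrix of ones times its transpose
# gives a (2, 2) matrix of threes, whose entries sum to 12.
x = torch.ones(2, 3, device=device)
total = (x @ x.T).sum().item()

# A tiny autograd check: d(w^2)/dw at w = 2 is 4.
w = torch.tensor(2.0, requires_grad=True)
(w ** 2).backward()
grad = w.grad.item()

print("torch version:", torch.__version__)
print("device:", device)
print("matmul sum:", total, "| gradient:", grad)
```

If both numbers match the hand calculation, tensor math and automatic differentiation are working on the selected device.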

🎓 Lecture · 3 hours

0:00–0:10	10 min	Recap & retrieval: Open with two quick retrieval-practice questions on prerequisite material, then state this week's objectives.
0:10–0:25	15 min	Motivation: Why deep learning now: representation learning, scale, and one framework that spans vision, language, and more.
0:25–1:10	45 min	What a neural network is: A neural network is a parametric function that maps an input tensor to an output tensor through learned weights. It is built from layers (linear transforms) interleaved with nonlinear activations; without the nonlinearity it collapses to a single linear map. Learning means adjusting the weights to reduce a loss that scores how wrong the outputs are. Board work: a single neuron computing w · x + b, then stacking neurons into a layer.
1:10–1:20	10 min	Break
1:20–2:05	45 min	Framing an ML task as a network: Decide the input representation and its tensor shape (a vector, an image, a sequence). Choose the output layer: one value for regression, or one logit per class for classification. Match the loss to the output: MSE for regression, cross-entropy for classification. Assemble the loop: forward pass, compute loss, backward pass, optimizer step, repeat over the data.
2:05–2:35	30 min	Live demo (predict, then run): Build a minimal training loop on a toy dataset and watch the loss fall. Then ask the class to predict what the loss curve does when the learning rate is set 10x too high, run it with that learning rate, and compare the divergence with their predictions.
2:35–2:50	15 min	Wrap-up & practice preview: Revisit the misconception and concept checks below, recap the takeaways, and preview the practice lesson.
2:50–3:00	10 min	Buffer & questions
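
The board work from the "What a neural network is" block could be mirrored in code roughly as follows. This is a minimal sketch with made-up weights and inputs (none of the numbers come from the course materials): one neuron computes w · x + b, and a layer is several neurons stacked as the rows of a weight matrix, followed by a nonlinearity.

```python
import torch

# Illustrative values only: an input with 3 features and one neuron's parameters.
x = torch.tensor([1.0, 2.0, 3.0])
w = torch.tensor([0.5, -1.0, 0.25])
b = torch.tensor(0.1)

# A single neuron: weighted sum of the inputs plus a bias.
neuron_out = torch.dot(w, x) + b

# A layer: stack neurons as rows of W, so W @ x + b_vec computes them all at once.
second_w = torch.tensor([1.0, 0.0, 1.0])
W = torch.stack([w, second_w])        # shape (2, 3): 2 neurons, 3 inputs each
b_vec = torch.tensor([0.1, 0.2])
layer_out = W @ x + b_vec             # shape (2,)

# The nonlinear activation applied elementwise to the layer output.
activated = torch.relu(layer_out)

print("neuron:", neuron_out.item())
print("layer:", layer_out.tolist())
print("after ReLU:", activated.tolist())
```

The first entry of the layer output is exactly the single neuron from before, which is the point of the stacking step on the board.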

Common misconception to confront.

Students often think: Stacking more linear layers makes a more powerful model.
Set it straight: Without a nonlinearity between them, any stack of linear layers is equivalent to a single linear layer W x + b; the activation is what gives depth its power.
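
The collapse can be checked numerically. The sketch below assumes arbitrary layer sizes (4, 8, 3) and a fixed random seed, both chosen only for illustration: it folds two stacked `nn.Linear` layers into a single weight matrix and bias, then shows that inserting a ReLU breaks the equivalence.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Two linear layers with no activation in between (sizes are arbitrary).
f1 = nn.Linear(4, 8)
f2 = nn.Linear(8, 3)
linear_stack = nn.Sequential(f1, f2)
relu_stack = nn.Sequential(f1, nn.ReLU(), f2)

x = torch.randn(5, 4)  # a batch of 5 inputs

with torch.no_grad():
    # f2(f1(x)) = W2 (W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2)
    W = f2.weight @ f1.weight          # shape (3, 4)
    b = f2.weight @ f1.bias + f2.bias  # shape (3,)
    single_layer = x @ W.T + b

    collapses = torch.allclose(linear_stack(x), single_layer, atol=1e-5)
    relu_matches = torch.allclose(relu_stack(x), single_layer, atol=1e-5)

print("linear stack equals one linear layer:", collapses)
print("stack with ReLU equals that layer:", relu_matches)
```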

Check for understanding (pose during the concept blocks; let students answer before revealing).

Remove every activation from a 5-layer MLP. What function class can it still represent?

Only linear functions: the whole stack collapses to one linear map.

You predict house price from features. What output layer and loss, and why?

One linear output unit and MSE (or MAE): the target is a continuous value, so no softmax and no cross-entropy.
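
As a rough illustration of pairing the output layer with the loss, the sketch below builds both kinds of head on the same features. The batch size, feature count, number of classes, and random data are all placeholders, not values from the course.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical batch: 8 examples, 5 features each.
features = torch.randn(8, 5)

# Regression (e.g. house price): one linear output unit, MSE loss.
reg_head = nn.Linear(5, 1)
prices = torch.randn(8, 1)                 # continuous targets, same shape as the output
reg_pred = reg_head(features)              # shape (8, 1)
reg_loss = nn.MSELoss()(reg_pred, prices)

# Classification (3 classes): one logit per class, cross-entropy loss.
clf_head = nn.Linear(5, 3)
labels = torch.randint(0, 3, (8,))         # integer class indices, shape (8,)
logits = clf_head(features)                # shape (8, 3), raw scores with no softmax
clf_loss = nn.CrossEntropyLoss()(logits, labels)

print("regression output:", tuple(reg_pred.shape), "loss:", reg_loss.item())
print("classification output:", tuple(logits.shape), "loss:", clf_loss.item())
```

`nn.CrossEntropyLoss` applies log-softmax internally, so the classification head outputs raw logits rather than probabilities.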

Key takeaways.

Every task reduces to data, a model, a loss, and optimization.
The output layer and the loss are chosen together.
PyTorch computes gradients automatically, so your effort goes into framing the problem.

💻 Practice · 2 hours

In the practice lesson the instructor demonstrates implementations, runs code, and works through examples, using the practice notebook linked below. The weekly lab is then set as homework, where students apply this themselves.

0:00–0:10	10 min	Setup & recap: Recap the lecture's key ideas and open the working notebook.
0:10–1:00	50 min	Instructor demonstrations: Set up PyTorch live and confirm the device (GPU or CPU). Walk through a minimal training loop on a toy dataset, run it, and read the loss curve.
1:00–1:05	5 min	Break
1:05–1:45	40 min	Instructor demonstrations (continued): Vary the learning rate live to show divergence versus convergence. Frame a classification and a regression example as tensors-in, loss-out, in code.
1:45–2:00	15 min	Wrap-up & lab brief: Summarize the patterns shown and brief the weekly lab (homework), which students complete on their own.
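
The training-loop and learning-rate demonstrations might look something like the sketch below. It is not the practice notebook's code: the toy dataset (y = 2x + 1 on 64 points), the zero initialization, the step count, and the two learning rates (0.15 and a 10x larger 1.5) are assumptions chosen so that one run converges and the other diverges on CPU in well under a second.

```python
import torch
from torch import nn


def train(lr, steps=30):
    """Fit y = 2x + 1 with one linear unit; return the loss at every step."""
    x = torch.linspace(-1, 1, 64).unsqueeze(1)   # inputs, shape (64, 1)
    y = 2 * x + 1                                # targets, shape (64, 1)

    model = nn.Linear(1, 1)
    nn.init.zeros_(model.weight)                 # fixed start so runs are repeatable
    nn.init.zeros_(model.bias)

    loss_fn = nn.MSELoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)

    losses = []
    for _ in range(steps):
        pred = model(x)                          # forward pass
        loss = loss_fn(pred, y)                  # compute loss
        optimizer.zero_grad()                    # clear old gradients
        loss.backward()                          # backward pass
        optimizer.step()                         # optimizer step
        losses.append(loss.item())
    return losses


good = train(lr=0.15)
bad = train(lr=1.5)   # 10x higher

print("lr=0.15 first/last loss:", good[0], good[-1])
print("lr=1.5  first/last loss:", bad[0], bad[-1])
```

Plotting `good` and `bad` on a log-scale y-axis makes the contrast easy to read: one curve falls steadily while the other grows at every step.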

Common pitfalls to pre-empt.

Start on CPU with a tiny dataset; correctness first, speed later.
If the loss is flat, check the learning rate and that the label format matches the loss.
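
One common form of label/loss mismatch is a shape mismatch. In the sketch below (illustrative numbers only), predictions of shape (4, 1) are compared against targets of shape (4,). `nn.MSELoss` warns and then broadcasts the two to (4, 4), so the reported loss is not the one intended. The last lines show the target format `nn.CrossEntropyLoss` expects for class-index labels.

```python
import warnings

import torch
from torch import nn

mse = nn.MSELoss()

# Predictions that match the targets exactly, but with different shapes.
pred = torch.tensor([[1.0], [2.0], [3.0], [4.0]])   # shape (4, 1)
target_flat = torch.tensor([1.0, 2.0, 3.0, 4.0])    # shape (4,)

# Mismatched shapes broadcast to (4, 4); PyTorch emits a UserWarning here,
# which is silenced only to keep the demo output short.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    wrong_loss = mse(pred, target_flat)

# Matching shapes give the intended elementwise comparison.
right_loss = mse(pred, target_flat.unsqueeze(1))     # target reshaped to (4, 1)

print("mismatched shapes:", wrong_loss.item())
print("matching shapes:", right_loss.item())

# Cross-entropy with class-index labels: logits (N, C), integer labels (N,).
logits = torch.zeros(4, 3)
labels = torch.tensor([0, 2, 1, 1])                  # dtype int64 by default
ce_loss = nn.CrossEntropyLoss()(logits, labels)
print("labels dtype:", labels.dtype, "| CE loss:", ce_loss.item())
```

Printing `pred.shape` and the target's shape and dtype before the loss call is a cheap check to add to any new training loop.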

Open the practice notebook in Colab · Curated references · Lab (homework)