Week 1 Part I · Foundations
Deep Learning Overview & ML-to-Network Framing
What deep learning is; framing a task as tensor inputs, model outputs, and a loss function.
Learning goals
- Set up the PyTorch toolchain and confirm it runs.
- Frame any ML task as input tensors, a model that produces outputs, and a loss computed on those outputs.
- Run a first end-to-end training loop and read the loss curve.
This is the weekly homework lab, completed independently after the lecture and the practice lesson. It follows the course's Build / Predict & probe / Explain & defend model: use an AI assistant freely for the Build; the graded learning is in Predict and Explain. See the AI-use policy and a fully worked sample submission.
⚙ Exercise
Part A · Build (AI assistant welcome)
- Set up Python, PyTorch, and Jupyter or Colab; confirm whether a GPU is visible and fall back to CPU if not.
- With an AI assistant's help, scaffold a minimal training loop on a tiny dataset (a 2-class toy set or a small MNIST subset): model, loss, optimizer, and one train step.
- Train for a few epochs and plot the loss going down.
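The steps in Part A can be sketched roughly as follows. This is a minimal illustration, not a reference solution: the two-blob toy dataset, the layer sizes, the learning rate of 0.1, and the helper name `train_step` are all assumptions chosen for the example.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Use a GPU if one is visible, otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical toy data: two Gaussian blobs in 2-D, 100 points per class.
n = 100
x = torch.cat([torch.randn(n, 2) + 2.0, torch.randn(n, 2) - 2.0]).to(device)
y = torch.cat([torch.zeros(n), torch.ones(n)]).long().to(device)  # class indices

# Model, loss, optimizer (sizes and learning rate are illustrative choices).
model = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2)).to(device)
loss_fn = nn.CrossEntropyLoss()  # takes raw logits and integer class labels
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)


def train_step(inputs, targets):
    """One optimization step: forward, loss, backward, parameter update."""
    optimizer.zero_grad()             # clear gradients from the previous step
    logits = model(inputs)            # forward pass
    loss = loss_fn(logits, targets)   # measure the error
    loss.backward()                   # compute gradients
    optimizer.step()                  # update parameters
    return loss.item()


# Full-batch training, so each step is one epoch over the toy set.
losses = [train_step(x, y) for _ in range(50)]
print(f"device={device}, first loss={losses[0]:.3f}, last loss={losses[-1]:.3f}")
```

The `losses` list is what gets plotted for the loss curve, for example with matplotlib's `plt.plot(losses)` if matplotlib is installed.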
Part B · Predict & probe (student reasoning)
- For three tasks (binary spam detection, house-price regression, 10-class digit recognition), write the input tensor shape, output-layer size, and loss, before any coding.
- Predict whether the loss should reach near zero on the toy task and roughly how fast.
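Once the framing table is written by hand, a quick shape check like the one below can help confirm it. It shows one plausible framing only: the feature counts (100 for spam, 8 for houses), the 28x28 image size, and the batch size are hypothetical, and other losses (for example L1 loss for regression) can also be defended.

```python
import torch
from torch import nn

batch = 4  # hypothetical batch size

# Binary spam detection: 100 hypothetical features -> 1 logit.
spam_x = torch.randn(batch, 100)
spam_y = torch.randint(0, 2, (batch, 1)).float()  # float 0/1, same shape as logits
spam_logits = nn.Linear(100, 1)(spam_x)
spam_loss = nn.BCEWithLogitsLoss()(spam_logits, spam_y)

# House-price regression: 8 hypothetical features -> 1 real-valued output.
house_x = torch.randn(batch, 8)
house_y = torch.randn(batch, 1)
house_pred = nn.Linear(8, 1)(house_x)
house_loss = nn.MSELoss()(house_pred, house_y)

# 10-class digit recognition: 1x28x28 images -> 10 logits.
digit_x = torch.randn(batch, 1, 28, 28)
digit_y = torch.randint(0, 10, (batch,))  # integer class indices, shape (batch,)
digit_model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
digit_logits = digit_model(digit_x)
digit_loss = nn.CrossEntropyLoss()(digit_logits, digit_y)

for name, out, loss in [
    ("spam", spam_logits, spam_loss),
    ("house", house_pred, house_loss),
    ("digits", digit_logits, digit_loss),
]:
    print(f"{name}: output shape {tuple(out.shape)}, loss is a scalar: {loss.dim() == 0}")
```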
Part C · Explain & defend (in plain language)
- Explain in a few sentences what the model, loss, and optimizer each do.
- Justify why the shape-and-loss framing is correct for each of the three tasks.
- Name one thing the AI assistant produced that needed correction or was not initially understood.
✓ Deliverables
- A notebook with the running loop and a loss curve.
- The three-task framing table (input shape, output size, loss).
- A short reflection (5 to 8 sentences).
Hints
- Start on CPU with a tiny dataset; correctness first, speed later.
- If the loss is flat, check the learning rate and that the label format matches the loss.
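As an illustration of the label-format hint, the sketch below contrasts what two common PyTorch losses expect. The tensors are made-up examples, and the variable names are hypothetical.

```python
import torch
from torch import nn

# CrossEntropyLoss: logits of shape (N, C), labels as int64 class indices of shape (N,).
logits_multi = torch.randn(4, 10)
labels_idx = torch.tensor([3, 0, 7, 1])
ce = nn.CrossEntropyLoss()(logits_multi, labels_idx)

# BCEWithLogitsLoss: float targets with the same shape as the logits.
logits_bin = torch.randn(4, 1)
labels_bin = torch.tensor([1, 0, 0, 1])
bce = nn.BCEWithLogitsLoss()(logits_bin, labels_bin.float().unsqueeze(1))

# Passing the labels in the wrong format (shape (4,) instead of (4, 1), integer dtype)
# is rejected here; other mismatches can fail silently, so print shapes and dtypes.
try:
    nn.BCEWithLogitsLoss()(logits_bin, labels_bin)
    mismatch_caught = False
except (ValueError, RuntimeError):
    mismatch_caught = True

print(labels_idx.dtype, tuple(labels_idx.shape), "| mismatch caught:", mismatch_caught)
```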
❓ Self-check
Answer each question before reading its answer. If one is unclear, revisit the lab and the references.
What four components does every ML task reduce to?
Data, a model, a loss (objective), and optimization.
For 10-class digit classification, what output-layer size and loss are appropriate?
Ten output logits with cross-entropy loss.
Why pass logits, not softmax probabilities, to CrossEntropyLoss?
It applies log-softmax internally, which is more numerically stable.
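A small check can make this concrete. The sketch below uses made-up logits to show that cross-entropy on raw logits matches log-softmax followed by negative log-likelihood, that applying softmax first changes the result, and that log-softmax stays finite where a separate softmax-then-log does not.

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])

# Cross-entropy on raw logits equals NLL on log-softmax outputs.
correct = F.cross_entropy(logits, targets)
manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)

# Feeding probabilities applies softmax twice and gives a different (wrong) loss.
double_softmax = F.cross_entropy(F.softmax(logits, dim=1), targets)

# Stability: with a large gap between logits, the small probability underflows to 0,
# so taking log afterwards gives -inf, while log_softmax stays finite.
big = torch.tensor([[1000.0, 0.0]])
naive = torch.log(torch.softmax(big, dim=1))
stable = F.log_softmax(big, dim=1)

print("logits vs manual match:", torch.allclose(correct, manual))
print("double softmax differs:", not torch.allclose(correct, double_softmax))
print("naive:", naive, "stable:", stable)
```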
What is the difference in role between the loss and the optimizer?
The loss measures error; the optimizer updates parameters to reduce it.
Training loss stays flat. Name two things to check.
The learning rate, and that the label format matches the chosen loss.
Instructor lesson plan (with references)