Introduction to Deep Learning · HIT

Week 9 · Part III · Architectures & Representation Learning

Convolutional Networks II

Batch and layer normalization; residual connections; modern CNN design.

Learning goals

This is the weekly homework lab, completed independently after the lecture and the practice lesson. It follows the course's Build / Predict & probe / Explain & defend model: use an AI assistant freely for the Build; the graded learning is in Predict and Explain. See the AI-use policy and a fully worked sample submission.

Exercise

Part A · Build (AI assistant welcome)

  1. Add batch normalization and residual blocks to the Week 8 CNN.
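
As a starting point for the Build step, here is a minimal sketch of a residual block with batch normalization. It assumes PyTorch and 2D image inputs; the class name `ResidualBlock` and the `use_bn` / `use_skip` switches are hypothetical, added here only so the same block can be reused in the ablation. The Week 8 CNN is not reproduced, so how the block slots into it is left open.

```python
import torch
from torch import nn
import torch.nn.functional as F


class ResidualBlock(nn.Module):
    """Two 3x3 conv layers with optional batch norm and an optional skip path.

    Hypothetical helper: the names and flags are illustrative, not course code.
    """

    def __init__(self, in_channels, out_channels, stride=1, use_bn=True, use_skip=True):
        super().__init__()
        self.use_skip = use_skip
        # Swap batch norm for a no-op when ablating normalization.
        norm = nn.BatchNorm2d if use_bn else (lambda channels: nn.Identity())
        # A conv bias is redundant when batch norm follows the conv.
        self.conv1 = nn.Conv2d(in_channels, out_channels, 3, stride=stride,
                               padding=1, bias=not use_bn)
        self.bn1 = norm(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, 3, padding=1,
                               bias=not use_bn)
        self.bn2 = norm(out_channels)
        # Project the skip path with a 1x1 conv when channels or resolution change.
        if use_skip and (stride != 1 or in_channels != out_channels):
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, 1, stride=stride, bias=False),
                norm(out_channels),
            )
        else:
            self.shortcut = nn.Identity()

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        if self.use_skip:
            out = out + self.shortcut(x)  # residual add
        return F.relu(out)


block = ResidualBlock(16, 32, stride=2)
x = torch.randn(4, 16, 32, 32)   # batch of 4, 16 channels, 32x32 pixels
y = block(x)
print(y.shape)  # torch.Size([4, 32, 16, 16])
```

Keeping the two switches in the constructor is one possible design: the ablation then changes a single argument instead of requiring a second model definition.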

Part B · Predict & probe (student reasoning)

  1. Predict how removing batch normalization and removing residual connections will each affect trainability and the maximum depth that still trains.
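
One way to probe a prediction before touching the real network is a scalar toy model. The sketch below is an illustration under strong simplifying assumptions (one unit per layer, a shared weight, tanh activation, no normalization) and is not the lab's CNN; the function name `input_gradient` is hypothetical. It applies the chain rule through a stack of layers with and without an identity skip.

```python
import math


def input_gradient(depth, weight, x0, residual):
    """Return d h_depth / d h_0 for a chain of scalar layers.

    Plain layer:    h_next = weight * tanh(h)
    Residual layer: h_next = h + weight * tanh(h)
    """
    h, grad = x0, 1.0
    for _ in range(depth):
        # Derivative of weight * tanh(h) at the current activation.
        local = weight * (1.0 - math.tanh(h) ** 2)
        if residual:
            grad *= 1.0 + local          # identity path contributes the 1
            h = h + weight * math.tanh(h)
        else:
            grad *= local                # only the learned path
            h = weight * math.tanh(h)
    return grad


plain_grad = input_gradient(depth=50, weight=0.5, x0=1.0, residual=False)
residual_grad = input_gradient(depth=50, weight=0.5, x0=1.0, residual=True)
print(f"plain: {plain_grad:.3e}  residual: {residual_grad:.3e}")
```

With `weight=0.5` every plain-layer factor is at most 0.5, so the plain gradient is bounded by 0.5^50 (about 9e-16), while every residual factor is at least 1. Whether the real CNN behaves the same way is exactly what the ablation is meant to test.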

Part C · Explain & defend (in plain language)

  1. Ablate batch normalization and residual connections one at a time, measure the effect on training, and explain why residual connections help gradient flow in deep networks.
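
The gradient-flow argument can be written as a short derivation. The notation is an assumption introduced here, not taken from the lab: x_l is the input to block l, F is the block's learned branch with weights W_l, L is the last block, and \mathcal{L} is the loss. The derivation also assumes identity skips with no nonlinearity applied after the add; with a ReLU after the add it holds only approximately.

```latex
\begin{align}
x_{l+1} &= x_l + F(x_l, W_l) \\
x_L &= x_l + \sum_{i=l}^{L-1} F(x_i, W_i) \\
\frac{\partial \mathcal{L}}{\partial x_l}
  &= \frac{\partial \mathcal{L}}{\partial x_L}
     \left( 1 + \frac{\partial}{\partial x_l} \sum_{i=l}^{L-1} F(x_i, W_i) \right)
\end{align}
```

The constant 1 in the last line is the identity path: part of the gradient at the output reaches block l unchanged, regardless of how small the derivatives of the learned branches are. In a plain stack that term is absent and the gradient is a pure product of per-layer Jacobians.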

Deliverables

Hints.

Self-check

Try to answer each question before reading the answer below it. If one is unclear, revisit the lab and the references.

What problem do residual connections address?
The degradation problem (deeper plain networks reaching higher training error than shallower ones) and vanishing gradients in very deep networks.
How does a skip connection help gradients?
It provides an identity path so gradients reach earlier layers directly.
What does batch normalization normalize?
Each channel's activations, using the mean and variance computed over the current mini-batch, followed by a learnable scale and shift.
Why compare training curves, not just final accuracy, in the ablation?
To see effects on trainability and convergence speed, not only the endpoint.
How are shapes matched for a residual add when channels change?
Use a 1x1 convolution on the skip path, with a matching stride if the spatial size also changes.
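
To make the batch-normalization answer above concrete, the following is a minimal sketch of the training-mode computation for a single feature, written in plain Python with no framework. The function name `batch_norm` and the sample values are illustrative assumptions. In a convolutional layer the same statistics would be taken per channel over the batch and both spatial dimensions.

```python
import math


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature over a mini-batch, then scale and shift."""
    n = len(values)
    mean = sum(values) / n
    # Biased (divide-by-n) variance, as used for the normalization itself.
    var = sum((v - mean) ** 2 for v in values) / n
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]


activations = [2.0, 4.0, 6.0, 8.0]  # one feature across a batch of four
normalized = batch_norm(activations)
print([round(v, 3) for v in normalized])
```

After normalization the batch has mean zero and variance close to one; `gamma` and `beta` stand in for the learnable scale and shift, fixed here at their usual initial values of 1 and 0.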

Instructor lesson plan (with references)
