Week 11 · Part III · Architectures & Representation Learning
Gated recurrent units; how gates restore gradient flow; LSTM versus GRU; sequence classification and sequence-to-sequence tasks.
Curated, free, canonical references for this week: a course or lecture, a book chapter, a video, and an authoritative blog post or official tutorial.
Full sections on LSTM, GRU, and sequence-to-sequence machine translation.
Builds an encoder-decoder seq2seq model with teacher forcing and masked loss.
Step-by-step visual explanation of LSTM gates and the cell state.
The definitive illustrated walkthrough of LSTM gates, with a GRU variant section.