Introduction to Deep Learning · HIT

Week 11 · Part III · Architectures & Representation Learning

LSTMs, GRUs & Sequence Tasks

Gated recurrent architectures (LSTM and GRU); how gates mitigate vanishing gradients and restore gradient flow over long sequences; LSTM versus GRU; sequence classification and sequence-to-sequence tasks.
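As a rough illustration of how gates keep gradients flowing, the sketch below runs one LSTM step at a time in plain Python. It is a minimal sketch, not the implementation used in any of the references: the helper names (`affine`, `init_params`, `lstm_step`, `run_demo`), the weight scale of 0.1, and the forget-gate bias of 1.0 are all assumptions chosen for illustration. The point to notice is the cell update `c = f * c_prev + i * g`: along the direct cell-state path, the derivative of `c` with respect to `c_prev` is just the forget gate `f`, so the gradient carried back over several steps scales with the product of the forget gates rather than with repeated multiplication by a weight matrix.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, x, U, h, b):
    """Return W x + U h + b for list-of-lists matrices and list vectors."""
    return [
        sum(w * xi for w, xi in zip(W_row, x))
        + sum(u * hi for u, hi in zip(U_row, h))
        + b_k
        for W_row, U_row, b_k in zip(W, U, b)
    ]


def init_params(input_size, hidden_size, rng, forget_bias=1.0):
    """Hypothetical initialisation: small Gaussian weights, forget bias of 1.0."""
    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

    params = {}
    for gate in ("i", "f", "o", "g"):
        bias = forget_bias if gate == "f" else 0.0
        params[gate] = (
            mat(hidden_size, input_size),   # input-to-hidden weights
            mat(hidden_size, hidden_size),  # hidden-to-hidden weights
            [bias] * hidden_size,
        )
    return params


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step; returns the new h, the new c, and the forget gate."""
    pre = {k: affine(W, x, U, h_prev, b) for k, (W, U, b) in params.items()}
    i = [sigmoid(z) for z in pre["i"]]    # input gate
    f = [sigmoid(z) for z in pre["f"]]    # forget gate
    o = [sigmoid(z) for z in pre["o"]]    # output gate
    g = [math.tanh(z) for z in pre["g"]]  # candidate cell values
    # Additive cell update: old memory scaled by f, new content scaled by i.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c, f


def run_demo(seed=0, input_size=3, hidden_size=4, steps=5):
    rng = random.Random(seed)
    params = init_params(input_size, hidden_size, rng)
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    # Product of forget gates: the gradient scale along the direct cell path.
    carry = [1.0] * hidden_size
    for _ in range(steps):
        x = [rng.gauss(0.0, 1.0) for _ in range(input_size)]
        h, c, f = lstm_step(x, h, c, params)
        carry = [a * fk for a, fk in zip(carry, f)]
    return h, c, carry
```

Calling `run_demo()` returns the final hidden state, the final cell state, and the per-unit product of forget gates over the sequence. Raising `forget_bias` pushes the forget gates toward 1 and keeps that product larger, which is one informal way to see why the gated cell state preserves gradient signal better than a plain RNN.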

Curated, free, canonical references for this week: a course or lecture, a book chapter, a video, and an authoritative blog post or official tutorial.

Course
Dive into Deep Learning, Chapter 10: Modern Recurrent Neural Networks (d2l.ai)

Full sections on LSTM, GRU, and sequence-to-sequence machine translation.

Book
Dive into Deep Learning, 10.7 Sequence-to-Sequence Learning (d2l.ai)

Builds an encoder-decoder seq2seq model with teacher forcing and masked loss.

Video
StatQuest: Long Short-Term Memory (LSTM), Clearly Explained (youtube.com)

Step-by-step visual explanation of LSTM gates and the cell state.

Blog / Docs
Christopher Olah: Understanding LSTM Networks (colah.github.io)

A widely cited illustrated walkthrough of LSTM gates, with a section on variants that covers the GRU.
