Introduction to Deep Learning · HIT

Prerequisite · Review & refresh

∑ Mathematics

Deep learning is applied linear algebra and calculus with a probabilistic flavor. A mathematician's depth is not required, but these ideas should feel familiar so the course can move quickly from notation to networks.

Linear algebra

Probability and statistics

Multivariable functions

Gradients and optimization

Readiness check

Self-check questions

Multiple-choice questions on the topic itself. Pick an answer, then reveal it. If several are unclear, work through the review above first.

1. If A is 3x4 and B is 4x2, the product AB has shape:

  A. 4x4
  B. 3x2
  C. 2x3
  D. undefined
Show answer
Correct: B. The inner dimensions (4) match, and the result takes the outer dimensions: 3x2.
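The shape rule can be checked with a small pure-Python sketch. The helpers `matmul` and `shape` are illustrative names, not part of the course material, and the matrices are filled with ones only to make the shapes visible.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    inner = len(B)
    if any(len(row) != inner for row in A):
        raise ValueError("inner dimensions do not match")
    # Entry (i, j) is the dot product of row i of A with column j of B.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def shape(M):
    """Return (rows, columns) of a non-empty matrix."""
    return (len(M), len(M[0]))


A = [[1] * 4 for _ in range(3)]  # 3x4
B = [[1] * 2 for _ in range(4)]  # 4x2
C = matmul(A, B)
print(shape(C))  # (3, 2)
```

Swapping the arguments, `matmul(B, A)`, raises `ValueError` here because the inner dimensions (2 and 3) no longer match.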

2. Matrix multiplication is:

  A. commutative (AB = BA)
  B. generally not commutative
  C. only defined for square matrices
  D. the same as the elementwise product
Show answer
Correct: B. In general AB does not equal BA; order matters, and the elementwise (Hadamard) product is a different operation.
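A minimal sketch with two hand-picked 2x2 matrices shows all three results side by side. The helper `matmul` and the example values are assumptions chosen for illustration.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]

AB = matmul(A, B)  # multiplying by B on the right swaps the columns of A
BA = matmul(B, A)  # multiplying by B on the left swaps the rows of A
hadamard = [[a * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(A, B)]  # elementwise product

print(AB)        # [[2, 1], [4, 3]]
print(BA)        # [[3, 4], [1, 2]]
print(hadamard)  # [[0, 2], [3, 0]]
```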

3. The transpose of a product, (AB)^T, equals:

  A. A^T B^T
  B. B^T A^T
  C. AB
  D. BA
Show answer
Correct: B. Transposing a product reverses the order: (AB)^T = B^T A^T.
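The identity can be verified on a non-square example, where the wrong order does not even give the right shape. This is a sketch with illustrative helpers (`matmul`, `transpose`) and arbitrary values.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(M):
    """Swap rows and columns."""
    return [list(col) for col in zip(*M)]


A = [[1, 2, 3],
     [4, 5, 6]]          # 2x3
B = [[1, 0],
     [2, 1],
     [0, 3]]             # 3x2

left = transpose(matmul(A, B))               # (AB)^T, 2x2
right = matmul(transpose(B), transpose(A))   # B^T A^T, 2x2
wrong = matmul(transpose(A), transpose(B))   # A^T B^T, 3x3

print(left == right)            # True
print(len(wrong), len(wrong[0]))  # 3 3
```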

4. The dot product of two non-zero vectors is zero when they are:

  A. parallel
  B. of unit length
  C. orthogonal (perpendicular)
  D. equal
Show answer
Correct: C. u . v = |u||v|cos(theta); it is zero when the angle is 90 degrees.
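As a quick numerical illustration, the sketch below computes the dot product and the cosine of the angle for a pair of perpendicular vectors. The vectors and helper names are assumptions picked for the example.

```python
import math


def dot(u, v):
    """Sum of componentwise products."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Euclidean length of u."""
    return math.sqrt(dot(u, u))


def cosine(u, v):
    """cos(theta) recovered from u . v = |u||v|cos(theta)."""
    return dot(u, v) / (norm(u) * norm(v))


u = (1.0, 2.0)
v = (-2.0, 1.0)  # u rotated by 90 degrees

print(dot(u, v))                                # 0.0
print(math.degrees(math.acos(cosine(u, v))))    # angle between u and v
```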

5. The L2 (Euclidean) norm of the vector (3, 4) is:

  A. 7
  B. 5
  C. 12
  D. 25
Show answer
Correct: B. sqrt(3^2 + 4^2) = sqrt(25) = 5.
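The same computation as a short sketch; `l2_norm` is an illustrative helper. The L1 norm is printed only for contrast, since it yields the distractor value 7.

```python
import math


def l2_norm(v):
    """Euclidean length: square root of the sum of squared components."""
    return math.sqrt(sum(x * x for x in v))


v = (3.0, 4.0)
l2 = l2_norm(v)
l1 = sum(abs(x) for x in v)  # L1 norm, shown only for comparison

print(l2)  # 5.0
print(l1)  # 7.0
```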

6. For a fair six-sided die, the expected value of one roll is:

  A. 3
  B. 3.5
  C. 6
  D. 1
Show answer
Correct: B. (1+2+3+4+5+6)/6 = 21/6 = 3.5.
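The expectation is a probability-weighted sum, which the sketch below computes exactly using fractions. The function name `expected_value` is an illustrative choice.

```python
from fractions import Fraction


def expected_value(distribution):
    """Sum of value * probability over a {value: probability} mapping."""
    return sum(value * p for value, p in distribution.items())


# A fair die: each face has probability 1/6.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

mean = expected_value(fair_die)
print(mean, float(mean))  # 7/2 3.5
```

Note that 3.5 is not a face of the die: an expected value is a long-run average, not necessarily a possible outcome.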

7. Variance measures:

  A. the average value
  B. the most frequent value
  C. the spread around the mean
  D. the maximum value
Show answer
Correct: C. Variance is the expected squared deviation from the mean.
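Applied to a fair six-sided die, the definition can be evaluated exactly. This is a sketch; the die is just a convenient example distribution, and the helper names are illustrative.

```python
from fractions import Fraction

# A fair die: each face has probability 1/6.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}


def mean(distribution):
    return sum(x * p for x, p in distribution.items())


def variance(distribution):
    """Expected squared deviation from the mean."""
    mu = mean(distribution)
    return sum((x - mu) ** 2 * p for x, p in distribution.items())


var = variance(fair_die)
print(var)  # 35/12
```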

8. Two events A and B are independent when:

  A. P(A and B) = P(A) + P(B)
  B. P(A and B) = P(A) P(B)
  C. they are mutually exclusive
  D. P(A given B) = 0
Show answer
Correct: B. Independence means the joint probability factorizes into the product of the marginals.
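To make the factorization concrete, the sketch below enumerates all 36 outcomes of two fair dice. The events (first die even, sum equals seven, first die odd) are assumptions chosen for illustration. It also shows why mutually exclusive events with non-zero probability are not independent.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # 36 equally likely pairs


def prob(event):
    """Probability of an event given as a predicate on an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


def first_even(o):
    return o[0] % 2 == 0


def sum_is_seven(o):
    return o[0] + o[1] == 7


def first_odd(o):
    return o[0] % 2 == 1


p_a = prob(first_even)
p_b = prob(sum_is_seven)
p_ab = prob(lambda o: first_even(o) and sum_is_seven(o))
print(p_ab == p_a * p_b)  # True: the joint factorizes, so A and B are independent

# Mutually exclusive events: the joint is 0, not the product of the marginals.
p_c = prob(first_odd)
p_ac = prob(lambda o: first_even(o) and first_odd(o))
print(p_ac, p_a * p_c)  # 0 1/4
```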

9. Bayes' rule writes P(A given B) in terms of:

  A. P(B given A), P(A), P(B)
  B. P(A) + P(B)
  C. P(A) P(B) only
  D. P(A - B)
Show answer
Correct: A. P(A|B) = P(B|A) P(A) / P(B).
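A worked example may help. The numbers below (a 1% prior, a 95% true-positive rate, a 5% false-positive rate) are hypothetical values invented for this sketch. The denominator P(B) is obtained from the law of total probability.

```python
from fractions import Fraction

# Hypothetical values, for illustration only.
p_a = Fraction(1, 100)             # P(A): prior probability of the condition
p_b_given_a = Fraction(95, 100)    # P(B|A): positive test given the condition
p_b_given_not_a = Fraction(5, 100) # P(B|not A): false-positive rate

# Law of total probability: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes' rule: P(A|B) = P(B|A) P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b

print(p_b)                  # 59/1000
print(p_a_given_b)          # 19/118
print(float(p_a_given_b))   # roughly 0.16
```

Even with a fairly accurate test, the posterior stays low under these assumed numbers because the prior P(A) is small.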

10. The partial derivative of f(x, y) = x^2 y with respect to x treats:

  A. both x and y as variables
  B. y as a constant
  C. x as a constant
  D. f as constant
Show answer
Correct: B. A partial derivative w.r.t. x holds the other variables (y) constant, giving 2xy.
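The result 2xy can be sanity-checked with a central finite difference that varies one variable while holding the other fixed. This is a numerical sketch; the evaluation point (3, 2) and the step size are arbitrary choices.

```python
import math


def f(x, y):
    return x ** 2 * y


def partial_x(fn, x, y, step=1e-5):
    """Central difference in x; y is held constant."""
    return (fn(x + step, y) - fn(x - step, y)) / (2 * step)


def partial_y(fn, x, y, step=1e-5):
    """Central difference in y; x is held constant."""
    return (fn(x, y + step) - fn(x, y - step)) / (2 * step)


x0, y0 = 3.0, 2.0
print(partial_x(f, x0, y0), 2 * x0 * y0)  # both close to 12
print(partial_y(f, x0, y0), x0 ** 2)      # both close to 9
```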

11. The gradient of a scalar function points in the direction of:

  A. steepest descent
  B. zero change
  C. steepest ascent
  D. the x-axis
Show answer
Correct: C. The gradient points toward the greatest rate of increase; its negative is the steepest-descent direction.
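One way to see this numerically is to take a small step in each of 360 directions and record which one increases the function most. The function f(x, y) = x^2 + 3y^2, the point (1, 1), and the one-degree grid are assumptions made for this sketch.

```python
import math


def f(x, y):
    return x ** 2 + 3 * y ** 2


def grad_f(x, y):
    """Analytic gradient of f."""
    return (2 * x, 6 * y)


px, py = 1.0, 1.0
eps = 1e-3  # small step length


def gain(deg):
    """Change in f after a step of length eps in direction deg (degrees)."""
    t = math.radians(deg)
    return f(px + eps * math.cos(t), py + eps * math.sin(t)) - f(px, py)


best_deg = max(range(360), key=gain)    # direction of largest increase
worst_deg = min(range(360), key=gain)   # direction of largest decrease

gx, gy = grad_f(px, py)
grad_deg = math.degrees(math.atan2(gy, gx))

print(best_deg, grad_deg)         # best direction lies within a degree of the gradient
print(worst_deg, grad_deg + 180)  # worst direction is roughly opposite
```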

12. The chain rule gives the derivative of f(g(x)) as:

  A. f'(x) g'(x)
  B. f'(g(x)) times g'(x)
  C. f'(g(x))
  D. g'(f(x))
Show answer
Correct: B. Differentiate the outer function at the inner value, times the derivative of the inner function.
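The rule can be compared against a numerical derivative. In this sketch the outer function is sin and the inner function is x^2, both chosen only for illustration.

```python
import math


def g(x):
    """Inner function."""
    return x ** 2


def f(u):
    """Outer function."""
    return math.sin(u)


def h(x):
    """Composition f(g(x))."""
    return f(g(x))


def h_prime(x):
    """Chain rule: f'(g(x)) * g'(x) = cos(x^2) * 2x."""
    return math.cos(g(x)) * 2 * x


def numeric_derivative(fn, x, step=1e-6):
    """Central finite difference."""
    return (fn(x + step) - fn(x - step)) / (2 * step)


x0 = 1.5
print(h_prime(x0), numeric_derivative(h, x0))  # the two values should agree closely
```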

13. To minimize a function, gradient descent moves a parameter:

  A. along the gradient
  B. opposite the gradient
  C. perpendicular to the gradient
  D. randomly
Show answer
Correct: B. It steps in the negative-gradient direction, scaled by the learning rate.
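A minimal sketch of the update rule on a one-dimensional quadratic. The loss (w - 3)^2, the starting point, the learning rate, and the step count are assumptions picked to keep the example small.

```python
def loss(w):
    return (w - 3.0) ** 2


def grad(w):
    """Derivative of the loss with respect to w."""
    return 2.0 * (w - 3.0)


def gradient_descent(w0, lr, steps):
    """Repeatedly step against the gradient, scaled by the learning rate."""
    w = w0
    for _ in range(steps):
        w = w - lr * grad(w)
    return w


w_final = gradient_descent(w0=0.0, lr=0.1, steps=100)
print(w_final, loss(w_final))  # w approaches the minimizer 3, loss approaches 0
```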

14. If the learning rate is far too large, gradient descent typically:

  A. converges faster with no downside
  B. diverges or oscillates
  C. stops immediately
  D. ignores the gradient
Show answer
Correct: B. Overshooting the minimum makes the loss oscillate or blow up.
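The effect shows up even on a simple quadratic. The sketch below runs gradient descent on the assumed loss (w - 3)^2 with three learning rates. The thresholds 1.0 and above are specific to this loss, whose second derivative is 2, and are not general constants.

```python
def run(lr, steps=20, w0=0.0):
    """Gradient descent on (w - 3)^2; returns the list of iterates."""
    w = w0
    history = [w]
    for _ in range(steps):
        w = w - lr * 2.0 * (w - 3.0)
        history.append(w)
    return history


small = run(0.1)  # error shrinks by a factor 0.8 per step
edge = run(1.0)   # error flips sign each step: bounces between 0 and 6
large = run(1.1)  # error grows by a factor 1.2 per step while flipping sign

print(abs(small[-1] - 3.0))  # small: converging
print(edge[:4])              # [0.0, 6.0, 0.0, 6.0]
print(abs(large[-1] - 3.0))  # large: diverging
```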

15. If a function is convex, then:

  A. it has many local minima
  B. any local minimum is also global
  C. its gradient is always zero
  D. it is always increasing
Show answer
Correct: B. For convex functions every local minimum is global, which makes optimization reliable.
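For reference, convexity itself is defined by the inequality f(t a + (1 - t) b) <= t f(a) + (1 - t) f(b) for all points a, b and all t in [0, 1]. The sketch below tests that inequality on a small grid. The grid, the tolerance, and the two example functions are assumptions, and passing the check on a grid is evidence of convexity, not a proof.

```python
def find_violation(f, points, weights, tol=1e-12):
    """Return (a, b, t) where the convexity inequality fails, else None."""
    for a in points:
        for b in points:
            for t in weights:
                mixed = f(t * a + (1 - t) * b)
                chord = t * f(a) + (1 - t) * f(b)
                if mixed > chord + tol:
                    return (a, b, t)
    return None


points = [x / 2 for x in range(-4, 5)]  # -2.0, -1.5, ..., 2.0
weights = [0.0, 0.25, 0.5, 0.75, 1.0]


def square(x):
    return x * x              # convex


def double_well(x):
    return x ** 4 - x ** 2    # not convex: it bulges upward around x = 0


print(find_violation(square, points, weights))       # None
print(find_violation(double_well, points, weights))  # some (a, b, t) triple
```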

📚 Refresher resources
