• recap
  • Application: video analysis, speech recognition, climate measurements (RNN擅长处理序列)
  • Learning over sequences
    • Transition function: $S_t = f(S_{t-1}, X_t)$ where $S_t$ is the state at time $t$ and $X_t$ is the observation at time $t$ (a minimal sketch follows this list)
    • Example: predict "Mountain" from "Mountai"
    • Essentially, it is context and memory that need to be modeled
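
A minimal sketch of the recurrence above, with a toy `transition` function standing in for the learned $f$ (all names here are illustrative, not from the lecture):

```python
# Minimal sketch of the state recurrence S_t = f(S_{t-1}, X_t).
def transition(prev_state, observation):
    # Placeholder for a learned update; here just a running sum.
    return prev_state + observation

def run_sequence(observations, init_state=0):
    state = init_state
    for x in observations:            # one observation per time step
        state = transition(state, x)  # the state carries context/memory forward
    return state

print(run_sequence([1, 2, 3]))  # final state after consuming the whole sequence
```
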
  • Example: predict sentiment (positive/negative) from customer comments
  • Input representation
    • In this lecture, inputs are either words or characters
    • Words can be encoded using a one-hot representation
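
One way the one-hot encoding could look in code (the vocabulary and helper below are illustrative assumptions, not lecture code):

```python
import numpy as np

vocab = ['h', 'e', 'l', 'o']                       # toy character vocabulary
char_to_idx = {c: i for i, c in enumerate(vocab)}

def one_hot(ch):
    """Vector with a 1 at the character's index and 0 elsewhere."""
    v = np.zeros(len(vocab))
    v[char_to_idx[ch]] = 1.0
    return v

print(one_hot('e'))  # [0. 1. 0. 0.]
```
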
  • Example: character RNN
    • Hidden state update: $h_t = f_W(h_{t-1}, x_t)$ where $h_t$ is the state and $x_t$ is the input character; the predicted character is read out from $h_t$ (e.g., $y_t = W_{hy} h_t$)
  • Vanilla RNN cell
  • RNN forward pass
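
A numpy sketch of a vanilla RNN cell and its forward pass, using the common $h_t = \tanh(W_{hh} h_{t-1} + W_{xh} x_t)$, $y_t = W_{hy} h_t$ formulation; sizes and initialization are assumptions, not the lecture's code:

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_size, input_size, output_size = 8, 4, 4

# Parameters of the vanilla RNN cell (randomly initialized for the sketch).
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_hy = rng.normal(scale=0.1, size=(output_size, hidden_size))

def rnn_step(h_prev, x_t):
    """One cell update: h_t = tanh(W_hh h_{t-1} + W_xh x_t)."""
    return np.tanh(W_hh @ h_prev + W_xh @ x_t)

def forward(xs):
    """Forward pass: the same weights are reused at every time step."""
    h = np.zeros(hidden_size)
    hs, ys = [], []
    for x_t in xs:
        h = rnn_step(h, x_t)
        hs.append(h)
        ys.append(W_hy @ h)   # output at each step (e.g., scores over characters)
    return hs, ys

xs = [rng.normal(size=input_size) for _ in range(3)]  # toy 3-step sequence
hs, ys = forward(xs)
print(ys[-1])
```
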
  • Backpropagation Through Time
  • Backpropagation of RNN
    • ......
    • gradient clipping: ......
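
The backprop details are elided above; as a hedged sketch, BPTT accumulates gradients across time steps, and gradient clipping then rescales them when their global norm is too large. The threshold and helper names below are assumptions:

```python
import numpy as np

def clip_gradients(grads, max_norm=5.0):
    """Rescale all gradients if their global norm exceeds max_norm."""
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        grads = [g * scale for g in grads]
    return grads

# Toy usage: pretend these are dW_hh, dW_xh, dW_hy produced by BPTT.
grads = [np.full((3, 3), 10.0), np.full((3, 2), -10.0)]
clipped = clip_gradients(grads)
print(np.sqrt(sum(np.sum(g ** 2) for g in clipped)))  # <= 5.0 after clipping
```
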
  • Long-term dependencies
  • Long short-term memory
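
A numpy sketch of one LSTM cell step with the standard input/forget/output gates and candidate update; gate names, sizes, and initialization are assumptions rather than the lecture's exact notation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

hidden_size, input_size = 8, 4
rng = np.random.default_rng(0)
# One weight matrix per gate, acting on the concatenation [h_{t-1}; x_t].
W_i, W_f, W_o, W_g = (rng.normal(scale=0.1, size=(hidden_size, hidden_size + input_size))
                      for _ in range(4))

def lstm_step(h_prev, c_prev, x_t):
    """One LSTM step: gates decide what to write, what to keep, and what to expose."""
    z = np.concatenate([h_prev, x_t])
    i = sigmoid(W_i @ z)        # input gate
    f = sigmoid(W_f @ z)        # forget gate
    o = sigmoid(W_o @ z)        # output gate
    g = np.tanh(W_g @ z)        # candidate cell update
    c = f * c_prev + i * g      # cell state carries long-term memory
    h = o * np.tanh(c)          # hidden state exposed to the next step
    return h, c

h, c = np.zeros(hidden_size), np.zeros(hidden_size)
h, c = lstm_step(h, c, rng.normal(size=input_size))
print(h.shape, c.shape)
```
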