\[ \mathbf{h}_{t+1} = \phi(W_h \mathbf{h}_t + W_x \mathbf{x}_t + \mathbf{b_h}) \]
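The recurrence above can be sketched in a few lines of plain Python. This is only an illustration: the helper names `matvec` and `rnn_step`, the choice of tanh for \(\phi\), and the toy weight values are assumptions, not something the slides specify.

```python
import math

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def rnn_step(h, x, W_h, W_x, b_h, phi=math.tanh):
    """One recurrent step: h_{t+1} = phi(W_h h_t + W_x x_t + b_h)."""
    pre = [a + b + c for a, b, c in zip(matvec(W_h, h), matvec(W_x, x), b_h)]
    return [phi(p) for p in pre]

# Toy sizes (assumed): 2 hidden units, 1 input feature, arbitrary weights.
W_h = [[0.5, 0.0], [0.0, 0.5]]
W_x = [[1.0], [-1.0]]
b_h = [0.0, 0.0]

# Unroll over a short input sequence, reusing the same weights at every step.
h = [0.0, 0.0]
for x in ([1.0], [0.0], [0.0]):
    h = rnn_step(h, x, W_h, W_x, b_h)
```

The same weights are applied at every time step; only the hidden state `h` changes as the sequence is consumed.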
\[\begin{align} \mathbf{h}_{t+1} &= \mathbf{z}_t \odot \mathbf{h}_t + (1 - \mathbf{z}_t) \odot \tilde{\mathbf{h}}_t \\ \tilde{\mathbf{h}}_t &= \phi\left(W\mathbf{x}_t + U(\mathbf{r}_t \odot \mathbf{h}_t)\right)\\ \mathbf{r}_t &= \sigma(W_r\mathbf{x}_t + U_r\mathbf{h}_t)\\ \mathbf{z}_t &= \sigma(W_z\mathbf{x}_t + U_z\mathbf{h}_t)\\ \end{align}\]
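As a rough sketch, the gated update above translates almost line by line into code. The function name `gru_step`, the parameter dictionary `p`, and the use of tanh for \(\phi\) are assumptions made for illustration; the equations themselves are taken as written, without bias terms.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def gru_step(h, x, p, phi=math.tanh):
    """One gated step; p maps the names W, U, W_r, U_r, W_z, U_z to matrices."""
    # Reset gate: r_t = sigma(W_r x_t + U_r h_t)
    r = [sigmoid(a + b) for a, b in zip(matvec(p["W_r"], x), matvec(p["U_r"], h))]
    # Update gate: z_t = sigma(W_z x_t + U_z h_t)
    z = [sigmoid(a + b) for a, b in zip(matvec(p["W_z"], x), matvec(p["U_z"], h))]
    # Candidate state: h~_t = phi(W x_t + U (r_t * h_t))
    rh = [ri * hi for ri, hi in zip(r, h)]
    h_tilde = [phi(a + b) for a, b in zip(matvec(p["W"], x), matvec(p["U"], rh))]
    # New state: elementwise mix of the old state and the candidate
    return [zi * hi + (1.0 - zi) * ci for zi, hi, ci in zip(z, h, h_tilde)]

# Toy parameters (assumed): 2 hidden units, 1 input feature.
params = {
    "W": [[1.0], [-1.0]], "U": [[0.5, 0.0], [0.0, 0.5]],
    "W_r": [[0.0], [0.0]], "U_r": [[0.0, 0.0], [0.0, 0.0]],
    "W_z": [[2.0], [2.0]], "U_z": [[0.0, 0.0], [0.0, 0.0]],
}
h_next = gru_step([0.3, -0.3], [1.0], params)
```

When \(\mathbf{z}_t\) is close to 1 the old state is copied through almost unchanged, which is what lets information survive over many steps.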
…in code:
if r:
    return 5
else:
    return 3
…in algebra:
return r*5 + (1-r)*3
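To make the equivalence concrete, here is a small runnable comparison. The function names `hard_select` and `soft_select` are made up for this sketch. For `r` equal to 0 or 1 the two forms agree; the algebraic form is also defined for fractional `r`, where it blends the two values, which is how the sigmoid gates in the GRU use it.

```python
def hard_select(r):
    """Branching version: picks exactly one of the two values."""
    if r:
        return 5
    else:
        return 3

def soft_select(r):
    """Algebraic version: a weighted mix controlled by r."""
    return r * 5 + (1 - r) * 3

# The two versions agree when r is a hard 0/1 switch.
for r in (0, 1):
    print(r, hard_select(r), soft_select(r))

# A fractional gate interpolates between the two branches.
print(soft_select(0.25))  # 3.5
```

Because `soft_select` is a smooth function of `r`, a gradient can flow through the gate, which is not the case for the `if` statement.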
from (Veit et al., 2016)
See this Distill post
figure from Andrej Karpathy’s blog post
travelling salesman
convex hull and triangulation
“Asking the network too much”
Attention weights:
\[ \alpha_{t,s} = \frac{e^{\mathbf{e}^T_t \mathbf{d}_s}}{\sum_u e^{\mathbf{e}^T_u \mathbf{d}_s}} \]
Context vector:
\[ \mathbf{c}_s = \sum_{t=1}^T \alpha_{t,s} \mathbf{e}_t \]
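A minimal sketch of these two formulas, assuming \(\mathbf{e}_t\) are encoder states and \(\mathbf{d}_s\) is the decoder state at output step \(s\) (the slides do not name them explicitly). The function names `dot` and `attention` are hypothetical. The weights are a softmax over encoder positions of the dot-product scores, and the context vector is the resulting weighted average of the encoder states.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention(encoder_states, d_s):
    """Return (alphas, context) for one decoder state d_s."""
    # Dot-product score e_t^T d_s for every encoder position t.
    scores = [dot(e_t, d_s) for e_t in encoder_states]
    # Subtracting the max leaves the softmax unchanged but avoids overflow.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alphas = [v / total for v in exps]
    # Context vector c_s = sum_t alpha_{t,s} e_t
    dim = len(encoder_states[0])
    context = [sum(a * e_t[i] for a, e_t in zip(alphas, encoder_states))
               for i in range(dim)]
    return alphas, context

# Toy example (assumed values): three encoder states of dimension 2.
E = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
alphas, context = attention(E, [2.0, 0.0])
```

In this toy example the first and third encoder states score equally against the decoder state and share most of the weight, while the second receives much less.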
Try the char-RNN Exercise from Udacity.