Training helper functions and classes

Adds new optimizers and learning rate (LR) schedules to DyNet.

dynn.training.inverse_sqrt_schedule(warmup, lr0)

Inverse square root learning rate schedule

At step \(t\), the learning rate has value

\[\texttt{lr0} \times \min\left(\frac{1}{\sqrt{t}}, \sqrt{\frac{t}{\texttt{warmup}^3}}\right)\]
Parameters:
  • warmup (int) – Number of warmup steps
  • lr0 (float) – Initial learning rate
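
To make the formula concrete, here is a minimal standalone sketch that evaluates it for a single step. It is not dynn's implementation: the helper name `inverse_sqrt_lr` is hypothetical, and the sketch assumes steps are counted from t = 1 and that the formula is applied exactly as documented above.

```python
import math


def inverse_sqrt_lr(t, warmup, lr0):
    """Evaluate lr0 * min(1/sqrt(t), sqrt(t / warmup^3)) for one step.

    Hypothetical helper for illustration only; assumes t >= 1.
    """
    if t < 1:
        raise ValueError("t must be >= 1")
    decay = 1.0 / math.sqrt(t)          # inverse square root decay term
    ramp = math.sqrt(t / warmup ** 3)   # increasing warmup term
    return lr0 * min(decay, ramp)


# Tabulate a few steps for warmup=4, lr0=1.0
for t in (1, 8, 16):
    print(t, inverse_sqrt_lr(t, warmup=4, lr0=1.0))
```

With the formula taken literally, the increasing term is the smaller one for early steps and the decaying term takes over later, so the learning rate rises first and then falls off as \(1/\sqrt{t}\).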