Training helper functions and classes

Adds new optimizers and learning rate (LR) schedules to DyNet.

dynn.training.inverse_sqrt_schedule(warmup, lr0)

Inverse square root learning rate schedule

At step \(t\), the learning rate has the value

\[\texttt{lr0}\times \min\left(\frac{1}{\sqrt{t}}, \sqrt{\frac{t}{\texttt{warmup}^3}}\right)\]

Parameters:
- warmup (int) – Number of warmup steps
- lr0 (float) – Initial learning rate
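
To make the schedule concrete, here is a minimal standalone sketch that evaluates the documented expression, lr0 times the minimum of 1/sqrt(t) and sqrt(t / warmup^3), at a few steps. It is not dynn's implementation: the helper name `inverse_sqrt_lr` is hypothetical, and the sketch assumes that steps are counted from t = 1.

```python
import math


def inverse_sqrt_lr(t, warmup, lr0):
    """Evaluate lr0 * min(1/sqrt(t), sqrt(t / warmup**3)) at step t.

    Hypothetical helper for illustration only; assumes t >= 1.
    """
    if t < 1:
        raise ValueError("t must be >= 1")
    decay = 1.0 / math.sqrt(t)             # inverse square root branch
    ramp = math.sqrt(t / warmup ** 3)      # warmup branch, grows with t
    return lr0 * min(decay, ramp)


if __name__ == "__main__":
    # Print the learning rate at a handful of steps.
    for step in (1, 2, 4, 8, 16, 64):
        print(step, inverse_sqrt_lr(step, warmup=4, lr0=1.0))
```

With the expression as written, the warmup branch is the smaller one for small \(t\) and the decay branch takes over later. The two branches are equal at \(t = \texttt{warmup}^{3/2}\), which is where the learning rate peaks.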