Penn TreeBank
Various functions for accessing the PTB dataset used by Mikolov et al., 2010.
dynn.data.ptb.download_ptb(path='.', force=False)
Downloads the PTB from “http://www.fit.vutbr.cz/~imikolov/rnnlm”
Parameters:
dynn.data.ptb.load_ptb(path, eos=None)
Loads the PTB dataset
Returns every split of the dataset in a single dictionary, keyed by split name. Each split is a list of strings.
Parameters:
Returns: dictionary mapping the split name to a list of strings
Return type:
dynn.data.ptb.read_ptb(split, path, eos=None)
Iterates over the PTB dataset
Example:
    for sent in read_ptb("train", "/path/to/ptb"):
        train(sent)
Parameters:
Returns: the sentences of the requested split, one at a time
Return type:
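To make the relationship between read_ptb and load_ptb more concrete, here is a minimal standalone sketch of the same pattern: a generator over one split, and a loader that collects every split into a dictionary. It is not the dynn implementation. The helper names iter_sentences and load_splits are hypothetical, and the sketch assumes an already extracted folder holding one plain-text file per split named ptb.<split>.txt, whitespace-separated tokens, and an eos token that is appended to each sentence when given.

```python
import os


def iter_sentences(split_file, eos=None):
    """Yield one tokenized sentence per non-empty line of split_file.

    Hypothetical helper: assumes one sentence per line, tokens separated
    by whitespace. If eos is given, it is appended to every sentence.
    """
    with open(split_file, encoding="utf-8") as f:
        for line in f:
            tokens = line.split()
            if not tokens:
                continue  # skip blank lines
            if eos is not None:
                tokens.append(eos)
            yield tokens


def load_splits(folder, names=("train", "valid", "test"), eos=None):
    """Return a dict mapping each split name to the list of its sentences.

    Hypothetical helper: assumes files named ptb.<name>.txt in folder.
    """
    return {
        name: list(iter_sentences(os.path.join(folder, f"ptb.{name}.txt"), eos))
        for name in names
    }
```

The generator keeps memory use flat when a split is only streamed once (for example during training), while the dictionary form is convenient when all splits are needed repeatedly, such as when building a vocabulary.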