class dynn.data.batching.bptt_batching.BPTTBatches(data, batch_size=32, seq_length=30)

Bases: object

Wraps a list of sequences as a contiguous batch iterator.

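This will iterate over batches of contiguous subsequences of length seq_length. Each batch is a pair x, next_x in which next_x is x shifted forward by one time step, as needed for truncated backpropagation through time (BPTT).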

Example:

import numpy as np
from dynn.data.batching.bptt_batching import BPTTBatches

# Sequence of length 1000
data = np.random.randint(10, size=1000)
# Iterator over subsequences of length 20 with batch size 5
batched_dataset = BPTTBatches(data, batch_size=5, seq_length=20)
# Training loop
for x, y in batched_dataset:
    # x and y have shape (seq_length, batch_size)
    # y[i] == x[i+1]
    # Do something with x
    pass
Parameters:
  • data (list) – Data to batch: a list of numpy arrays or, as in the example above, a single 1-D array
  • batch_size (int, optional) – Batch size
  • seq_length (int, optional) – BPTT length
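
The contiguous batching described above can be illustrated without the library. The following is a minimal NumPy sketch, not dynn's implementation: the helper name bptt_windows and the choice to drop trailing elements that do not fill a whole column are assumptions made for illustration. It splits one long sequence into batch_size parallel columns and then slides a window of seq_length positions down those columns, so the window count here need not match what the library reports.

```python
import numpy as np


def bptt_windows(data, batch_size, seq_length):
    """Yield (x, next_x) pairs of shape (<= seq_length, batch_size).

    Hypothetical helper, written only to illustrate the idea.
    """
    data = np.asarray(data)
    # Assumption: drop trailing elements that do not fill a whole column
    num_positions = len(data) // batch_size
    # Column b holds the contiguous chunk
    # data[b * num_positions:(b + 1) * num_positions]
    columns = data[:num_positions * batch_size]
    columns = columns.reshape(batch_size, num_positions).T
    # The last position can only serve as a target, hence the "- 1"
    for start in range(0, num_positions - 1, seq_length):
        stop = min(start + seq_length, num_positions - 1)
        yield columns[start:stop], columns[start + 1:stop + 1]


data = np.arange(1000)
batches = list(bptt_windows(data, batch_size=5, seq_length=20))
x, next_x = batches[0]
print(len(batches), x.shape)  # 10 (20, 5)
```

Because each column is a contiguous slice of the original sequence, a recurrent state carried from one window to the next stays aligned with the data, which is the point of batching this way.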
__getitem__(index)

Returns the index-th sample

The result is a tuple x, next_x of numpy arrays of shape seq_length x batch_size. Here seq_length is determined by the range specified by index, and next_x[t] = x[t+1] for all t.

Parameters: index (int, slice) – Index or slice
Returns: x, next_x
Return type: tuple
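
To make the relation between x and next_x concrete, here is a small sketch that assumes the data is stored as an array of shape (num_positions, batch_size). The function get_window is hypothetical and is not the library's __getitem__; it ignores slice steps and expects non-negative integer indices.

```python
import numpy as np


def get_window(columns, index):
    """Return (x, next_x) for an int or slice over time positions."""
    if isinstance(index, int):
        # Treat a single index as a window of length 1
        index = slice(index, index + 1)
    # Clip to the positions that still have a successor (step is ignored)
    start, stop, _ = index.indices(len(columns) - 1)
    x = columns[start:stop]
    next_x = columns[start + 1:stop + 1]  # same window, shifted by one step
    return x, next_x


columns = np.arange(12).reshape(6, 2)  # 6 time steps, batch_size 2
x, next_x = get_window(columns, slice(1, 4))
print(x.shape)                               # (3, 2)
print(np.array_equal(next_x[:-1], x[1:]))    # True
```

The length of the slice plays the role of seq_length: a slice covering three positions yields arrays of shape (3, batch_size).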
__len__()

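Returns the number of batches in the dataset (not the total number of samples).

Returns:
Number of batches in the dataset,
roughly ceil(len(data) / (batch_size * seq_length)), since each batch spans seq_length positions across batch_size parallel sequences
Return type: int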
just_passed_multiple(batch_number)

Checks whether the number of batches processed so far has just passed a multiple of batch_number.

For example, you can use this to report at regular intervals (e.g. every 10 batches).

Parameters: batch_number (int) – Reporting interval, in batches
Returns: True if the number of batches processed so far has just passed a multiple of batch_number
Return type: bool
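
The reporting pattern mentioned above might look like the following sketch. BatchCounter is a hypothetical stand-in that only tracks progress; its method bodies are assumptions about the intended behavior, not the library's code.

```python
class BatchCounter:
    """Hypothetical stand-in that only tracks progress through an epoch."""

    def __init__(self, num_batches):
        self.num_batches = num_batches
        self.batches_done = 0

    def step(self):
        # Call once per processed batch
        self.batches_done += 1

    def just_passed_multiple(self, batch_number):
        # Assumption: True right after batch_number, 2 * batch_number, ...
        # batches have been processed
        return (self.batches_done > 0
                and self.batches_done % batch_number == 0)

    def percentage_done(self):
        # Share of the epoch covered so far, in percent
        return 100.0 * self.batches_done / self.num_batches


counter = BatchCounter(num_batches=25)
reports = []
for _ in range(counter.num_batches):
    counter.step()  # a training step would go here
    if counter.just_passed_multiple(10):
        reports.append((counter.batches_done, counter.percentage_done()))
print(reports)  # [(10, 40.0), (20, 80.0)]
```

With the real iterator, the same check would sit inside the `for x, y in batched_dataset:` loop, typically paired with percentage_done() in the log message.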
percentage_done()

Returns the percentage of the data that has been covered so far in the current epoch

reset()

Reset the iterator and shuffle the dataset if applicable