Misleading example in the tf.keras.utils.timeseries_dataset_from_array documentation

The documentation for tf.keras.utils.timeseries_dataset_from_array provides three examples. The second one is misleading: the way it slices the input data truncates the series, so the resulting dataset contains fewer sequences than the data allows.

The original example is:

# Example 2: Temporal regression.

# Consider an array data of scalar values, of shape (steps,). 
# To generate a dataset that uses the past 10 timesteps to predict the next timestep, you would use:

input_data = data[:-10]
targets = data[10:]
dataset = tf.keras.preprocessing.timeseries_dataset_from_array(
    input_data, targets, sequence_length=10)
for batch in dataset:
  inputs, targets = batch
  assert np.array_equal(inputs[0], data[:10])  # First sequence: steps [0-9]
  assert np.array_equal(targets[0], data[10])  # Corresponding target: step 10
  break

It returns:

Input:[[0 1 2 3 4 5 6 7 8 9]], target:[10]
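The single window can be explained by counting start positions. Below is a minimal sketch, assuming a series of 20 steps and assuming the function yields one window per valid start position in the input, capped by the number of targets. That rule is a reading of the observed behaviour, not a quote of the Keras source, and count_windows is a hypothetical helper, not part of Keras.

```python
def count_windows(n_inputs, n_targets, sequence_length):
    """Hypothetical helper: expected number of (window, target) pairs.

    Assumes one window per valid start position in the input,
    capped by the number of available targets.
    """
    n_starts = max(n_inputs - sequence_length + 1, 0)
    return min(n_starts, n_targets)


steps = 20                            # assumed series length
sequence_length = 10
n_targets = steps - sequence_length   # len(data[10:])

# Number of windows when `trim` steps are cut from the end of the input,
# i.e. input_data = data[:-trim].
counts = {
    trim: count_windows(steps - trim, n_targets, sequence_length)
    for trim in (10, 5, 1)
}
print(counts)  # {10: 1, 5: 6, 1: 10}
```

Under these assumptions, cutting 10 steps from the input leaves room for exactly one window of length 10, while cutting fewer steps leaves more windows.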

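Suppose we set data = tf.range(20). The example then generates fewer sequences than it should, because input_data = data[:-10] discards the last 10 steps: only 10 steps remain, which is enough for a single window of length 10. To predict the next step, only the last step needs to be dropped from the input, so the example should be: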

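import numpy as np
import tensorflow as tf

data = tf.range(20)
# Drop only the last step, so every window still has a following step to predict.
input_data = data[:-1]
# The target of the window starting at step i is step i + 10.
targets = data[10:]
dataset = tf.keras.preprocessing.timeseries_dataset_from_array(
    input_data, targets, sequence_length=10)
for batch in dataset:
  inputs, targets = batch
  assert np.array_equal(inputs[0], data[:10])  # First sequence: steps [0-9]
  assert np.array_equal(targets[0], data[10])  # Corresponding target: step 10
  break

# Print every batch to inspect all generated windows.
for batch in dataset.as_numpy_iterator():
  inputs, labels = batch
  print(f"Input:{inputs}, target:{labels}")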

It returns:

Input:[[ 0  1  2  3  4  5  6  7  8  9]
 [ 1  2  3  4  5  6  7  8  9 10]
 [ 2  3  4  5  6  7  8  9 10 11]
 [ 3  4  5  6  7  8  9 10 11 12]
 [ 4  5  6  7  8  9 10 11 12 13]
 [ 5  6  7  8  9 10 11 12 13 14]
 [ 6  7  8  9 10 11 12 13 14 15]
 [ 7  8  9 10 11 12 13 14 15 16]
 [ 8  9 10 11 12 13 14 15 16 17]
 [ 9 10 11 12 13 14 15 16 17 18]], target:[10 11 12 13 14 15 16 17 18 19]
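
The alignment between windows and targets can also be checked without TensorFlow. The following is a rough pure-Python sketch of the proposed slicing for an arbitrary sequence_length. sliding_pairs is a hypothetical stand-in written for illustration. It is not the Keras implementation, and it assumes the number of pairs is limited by both the valid start positions and the number of targets.

```python
def sliding_pairs(data, sequence_length):
    """Hypothetical stand-in for the proposed slicing (not the Keras code)."""
    input_data = data[:-1]               # drop only the last step
    targets = data[sequence_length:]     # step that follows each window
    # Assumed rule: one pair per valid start, capped by the number of targets.
    n_pairs = min(len(input_data) - sequence_length + 1, len(targets))
    return [
        (input_data[i:i + sequence_length], targets[i])
        for i in range(n_pairs)
    ]


data = list(range(20))
pairs = sliding_pairs(data, sequence_length=10)

# Every target must be the step immediately after its window.
for start, (window, target) in enumerate(pairs):
    assert window == data[start:start + 10]
    assert target == data[start + 10]

print(len(pairs))   # 10
print(pairs[0])     # ([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 10)
print(pairs[-1])    # ([9, 10, 11, 12, 13, 14, 15, 16, 17, 18], 19)
```

In this sketch the ten pairs match the printed output above: windows starting at steps 0 through 9, each paired with the step that follows it.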