TFRecordDataset auto cache

I have an input pipeline whose input I need to update regularly, so I use TFRecordDataset and assumed I only needed to update the file to update the pipeline. However, the pipeline appears to cache the dataset automatically, even though I never call the cache() method. Can anyone help me figure out what is making my pipeline cache the dataset?
Below is my pipeline:

```python
ds = tf.data.TFRecordDataset(...)  # args truncated; they included file_name

ds = ...  # second transformation truncated

options = tf.data.Options()
options.experimental_distribute.auto_shard_policy = (...)  # policy value truncated

train_dataflow = ds.with_options(options)

train_ds = train_dataflow.repeat().batch(
    self.batch_size, drop_remainder=True
)

train_input_iterator = (...)  # iterator construction truncated
```
Did you resolve this issue? If so, how? I’m seeing a similar issue with my code…