[AutoKeras] Bug in using tf.Dataset for training

Hi everyone! :wave:
I was exporting a custom data generator to a tf.Dataset to use my dataset in a memory-efficient way. However, I am encountering this error:

ValueError: Expect x to be a non-empty array or dataset.
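For reference, here is a minimal sketch of the pattern I am using (the generator and shapes here are simplified placeholders, not my real pipeline):

```python
import numpy as np
import tensorflow as tf

def row_generator():
    # Placeholder for my real data generator: yields one (features, label)
    # sample at a time so the full dataset never has to sit in memory.
    for _ in range(4):
        yield np.zeros(2, dtype=np.float32), np.float32(1.0)

# Wrap the generator (note: a callable, not a generator object)
# in a tf.data.Dataset and batch it.
train_dataset = tf.data.Dataset.from_generator(
    row_generator,
    output_signature=(
        tf.TensorSpec(shape=(2,), dtype=tf.float32),
        tf.TensorSpec(shape=(), dtype=tf.float32),
    ),
).batch(2)

print(sum(1 for _ in train_dataset))  # 2 batches, so the dataset is not empty
```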

To keep things clean here, I have put all the heavy details in a GitHub issue thread.

If anyone requires additional information or help reproducing this, please do not hesitate to ping me! As I have mentioned in my issue, using torch.randn((...)) for a dummy dataset seems the fastest way :rocket:
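Something along these lines for the dummy data (the shapes here are arbitrary placeholders, not the real dataset's dimensions):

```python
import torch

# Arbitrary placeholder shapes: 16 samples, 8 features, 1 target each.
dummy_x = torch.randn((16, 8))
dummy_y = torch.randn((16, 1))

print(dummy_x.shape)  # torch.Size([16, 8])
```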

Dissecting the error message, it seems that in training.py (keras/engine) the Dataset it receives is empty (even though it's not), which yields no updates to the model parameters. Since no updates are made, no logs are created: logs keeps its original value of None and execution hits the raise statement.
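To illustrate what I mean, here is a simplified pure-Python sketch of that control flow (my own reconstruction, not the actual Keras source):

```python
def fit_sketch(dataset):
    """Simplified sketch of the fit() loop that produces this error."""
    logs = None
    for batch in dataset:
        logs = {"loss": 0.0}  # stand-in for a real training step
    if logs is None:
        # If the loop body never ran, logs is still None and we land here.
        raise ValueError("Expect x to be a non-empty array or dataset.")
    return logs

fit_sketch([1, 2, 3])  # fine: the loop runs at least once, logs gets set
# fit_sketch([])       # would raise the ValueError above
```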

If anyone has any idea, please do help me out! :hugs:

I made a little progress, but help is really appreciated as I'm kinda confused :sweat: :pray:

The issue seems to be directly at autokeras/auto_model.py at 78223dfc63056414e32e202b158dbc981be48dc9 · keras-team/autokeras · GitHub, just when I pass the tf.Dataset - it's simply empty. I have checked numerous times and can confirm I am not passing an empty Dataset. Strange, but I will surely double-check.

Again, if anyone can expedite this process for me - that would be really appreciated!

When I learned how to use generators with Keras, this thread helped me: python - How to make a generator callable? - Stack Overflow
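The gist of that thread, in a tiny sketch (plain Python, no TF needed to see the point):

```python
def sample_gen():
    yield from range(3)

# A generator *object* is single-use, but from_generator wants something
# it can call repeatedly to get a fresh generator, i.e. a callable.
gen_obj = sample_gen()
make_gen = lambda: sample_gen()  # or just pass sample_gen itself

assert list(gen_obj) == [0, 1, 2]
assert list(gen_obj) == []            # exhausted: second pass is empty
assert list(make_gen()) == [0, 1, 2]  # fresh generator every call
assert list(make_gen()) == [0, 1, 2]
```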


oh, it’s definitely callable - I’ve already put a lambda in the definition of train_dataset, but that doesn’t seem to be the source of the error at all.

The actual problem sounds simple, but I can’t seem to get my head around it at all:

Very simply, in my own script, train_dataset seems to change its value automatically after a few lines of code and comments, none of which actually reference it directly :thinking:
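One thing I am starting to suspect (just a hunch, illustrated with plain Python, not my actual script): if anything iterates the underlying generator in between, say a debug print or a quick length check, a one-shot iterator silently becomes empty, which would look exactly like the value "changing by itself":

```python
def data_gen():
    yield from ("a", "b", "c")

stream = data_gen()

# Some innocent-looking inspection a few lines earlier...
preview = list(stream)  # this silently consumes the generator

# ...and later the "same" variable looks empty, with no reassignment anywhere.
assert preview == ["a", "b", "c"]
assert list(stream) == []  # exhausted: now it looks like an empty dataset
```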