[AutoKeras] Why is this AutoKeras NAS script failing?

I have two directories, train_data_npy and valid_data_npy, containing 3013 and 1506 *.npy files respectively.

Each *.npy file has 12 float columns: the first nine are features, and the last three are one-hot-encoded labels for three classes.
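
To make that layout concrete, here is a minimal sketch built from synthetic values. The row count, the random features, and the class ids are made up purely for illustration; the real files hold far more rows.

```python
import numpy as np

N_FEATURES = 9  # first nine columns are features
N_CLASSES = 3   # last three columns are one-hot-encoded labels

# Synthetic stand-in for the contents of one .npy file (illustrative values only).
rng = np.random.default_rng(0)
features = rng.random((5, N_FEATURES), dtype=np.float32)
class_ids = np.array([0, 2, 1, 1, 0])
one_hot = np.eye(N_CLASSES, dtype=np.float32)[class_ids]
data = np.hstack([features, one_hot])  # 5 rows, 12 columns

# Split a loaded array back into features and integer class labels.
x = data[:, :N_FEATURES]
y = np.argmax(data[:, N_FEATURES:], axis=1)

print(data.shape, x.shape, y.tolist())
```

The argmax step is the same conversion the script below applies to turn the one-hot columns into integer labels.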

The following Python script is meant to load those *.npy files in chunks, so that memory does not overflow, while searching for a neural network model.

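However, the script fails. The search trial itself completes, but the final clf.evaluate(validation_dataset) call raises a FailedPreconditionError ("Table not initialized"); the full output is included below.

What exactly is wrong with the script?

Or is the problem not in the script at all, but in the installation of CUDA, TensorFlow, or AutoKeras?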

# File: cnn_search_by_chunk.py
import numpy as np
import tensorflow as tf
import os
import autokeras as ak

N_FEATURES = 9
BATCH_SIZE = 100

def get_data_generator(folder_path, batch_size, n_features):
    """Get a generator returning batches of data from .npy files in the specified folder.

    The shape of the features is (batch_size, n_features).
    """
    def data_generator():
        files = os.listdir(folder_path)
        npy_files = [f for f in files if f.endswith('.npy')]

        for npy_file in npy_files:
            data = np.load(os.path.join(folder_path, npy_file))
            x = data[:, :n_features]
            y = data[:, n_features:]
            y = np.argmax(y, axis=1)  # Convert one-hot-encoded labels back to integers

            for i in range(0, len(x), batch_size):
                yield x[i:i+batch_size], y[i:i+batch_size]

    return data_generator

train_data_folder = '/home/my_user_name/original_data/train_data_npy'
validation_data_folder = '/home/my_user_name/original_data/valid_data_npy'

train_dataset = tf.data.Dataset.from_generator(
    get_data_generator(train_data_folder, BATCH_SIZE, N_FEATURES),
    output_signature=(
        tf.TensorSpec(shape=(None, N_FEATURES), dtype=tf.float32),
        tf.TensorSpec(shape=(None,), dtype=tf.int32)  # Labels are now 1D integers
    )
)

validation_dataset = tf.data.Dataset.from_generator(
    get_data_generator(validation_data_folder, BATCH_SIZE, N_FEATURES),
    output_signature=(
        tf.TensorSpec(shape=(None, N_FEATURES), dtype=tf.float32),
        tf.TensorSpec(shape=(None,), dtype=tf.int32)  # Labels are now 1D integers
    )
)

clf = ak.StructuredDataClassifier(overwrite=True, max_trials=1, seed=5)
clf.fit(x=train_dataset, validation_data=validation_dataset, batch_size=BATCH_SIZE)
print(clf.evaluate(validation_dataset))

Running it produces the following output:

my_user_name@192:~/my_project_name_v2$ python3 cnn_search_by_chunk.py
2023-11-29 20:05:53.532005: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Using TensorFlow backend
2023-11-29 20:05:55.467804: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1960] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...

Search: Running Trial #1

Hyperparameter    |Value             |Best Value So Far
structured_data...|True              |?
structured_data...|2                 |?
structured_data...|False             |?
structured_data...|0                 |?
structured_data...|32                |?
structured_data...|32                |?
classification_...|0                 |?
optimizer         |adam              |?
learning_rate     |0.001             |?

Epoch 1/1000
33143/33143 [==============================] - 149s 4ms/step - loss: 0.0670 - accuracy: 0.9677 - val_loss: 0.0612 - val_accuracy: 0.9708
Epoch 2/1000
33143/33143 [==============================] - 146s 4ms/step - loss: 0.0625 - accuracy: 0.9697 - val_loss: 0.0598 - val_accuracy: 0.9715
Epoch 3/1000
33143/33143 [==============================] - 147s 4ms/step - loss: 0.0617 - accuracy: 0.9702 - val_loss: 0.0593 - val_accuracy: 0.9717
Epoch 4/1000
33143/33143 [==============================] - 146s 4ms/step - loss: 0.0614 - accuracy: 0.9703 - val_loss: 0.0591 - val_accuracy: 0.9718
Epoch 5/1000
33143/33143 [==============================] - 147s 4ms/step - loss: 0.0612 - accuracy: 0.9705 - val_loss: 0.0590 - val_accuracy: 0.9719
Epoch 6/1000
33143/33143 [==============================] - 145s 4ms/step - loss: 0.0610 - accuracy: 0.9707 - val_loss: 0.0588 - val_accuracy: 0.9721
Epoch 7/1000
33143/33143 [==============================] - 147s 4ms/step - loss: 0.0608 - accuracy: 0.9707 - val_loss: 0.0586 - val_accuracy: 0.9721
Epoch 8/1000
33143/33143 [==============================] - 147s 4ms/step - loss: 0.0607 - accuracy: 0.9709 - val_loss: 0.0585 - val_accuracy: 0.9723
Epoch 9/1000
33143/33143 [==============================] - 146s 4ms/step - loss: 0.0605 - accuracy: 0.9710 - val_loss: 0.0584 - val_accuracy: 0.9723
Epoch 10/1000
33143/33143 [==============================] - 146s 4ms/step - loss: 0.0604 - accuracy: 0.9710 - val_loss: 0.0583 - val_accuracy: 0.9724
Epoch 11/1000
33143/33143 [==============================] - 148s 4ms/step - loss: 0.0603 - accuracy: 0.9711 - val_loss: 0.0583 - val_accuracy: 0.9724
Epoch 12/1000
33143/33143 [==============================] - 146s 4ms/step - loss: 0.0602 - accuracy: 0.9712 - val_loss: 0.0582 - val_accuracy: 0.9724
Epoch 13/1000
33143/33143 [==============================] - 147s 4ms/step - loss: 0.0601 - accuracy: 0.9712 - val_loss: 0.0582 - val_accuracy: 0.9724
Epoch 14/1000
33143/33143 [==============================] - 148s 4ms/step - loss: 0.0601 - accuracy: 0.9712 - val_loss: 0.0582 - val_accuracy: 0.9724
Epoch 15/1000
33143/33143 [==============================] - 146s 4ms/step - loss: 0.0600 - accuracy: 0.9713 - val_loss: 0.0582 - val_accuracy: 0.9724
Epoch 16/1000
33143/33143 [==============================] - 146s 4ms/step - loss: 0.0600 - accuracy: 0.9713 - val_loss: 0.0581 - val_accuracy: 0.9725
Epoch 17/1000
33143/33143 [==============================] - 147s 4ms/step - loss: 0.0600 - accuracy: 0.9713 - val_loss: 0.0581 - val_accuracy: 0.9725
Epoch 18/1000
33143/33143 [==============================] - 147s 4ms/step - loss: 0.0599 - accuracy: 0.9713 - val_loss: 0.0582 - val_accuracy: 0.9724
Epoch 19/1000
33143/33143 [==============================] - 145s 4ms/step - loss: 0.0599 - accuracy: 0.9713 - val_loss: 0.0581 - val_accuracy: 0.9724
Epoch 20/1000
33143/33143 [==============================] - 144s 4ms/step - loss: 0.0599 - accuracy: 0.9713 - val_loss: 0.0582 - val_accuracy: 0.9724
Epoch 21/1000
33143/33143 [==============================] - 147s 4ms/step - loss: 0.0599 - accuracy: 0.9713 - val_loss: 0.0582 - val_accuracy: 0.9724
Epoch 22/1000
33143/33143 [==============================] - 144s 4ms/step - loss: 0.0599 - accuracy: 0.9713 - val_loss: 0.0581 - val_accuracy: 0.9724
Epoch 23/1000
33143/33143 [==============================] - 146s 4ms/step - loss: 0.0600 - accuracy: 0.9713 - val_loss: 0.0582 - val_accuracy: 0.9724
Epoch 24/1000
33143/33143 [==============================] - 145s 4ms/step - loss: 0.0599 - accuracy: 0.9714 - val_loss: 0.0581 - val_accuracy: 0.9725
Epoch 25/1000
33143/33143 [==============================] - 147s 4ms/step - loss: 0.0599 - accuracy: 0.9714 - val_loss: 0.0581 - val_accuracy: 0.9724
Epoch 26/1000
33143/33143 [==============================] - 147s 4ms/step - loss: 0.0599 - accuracy: 0.9713 - val_loss: 0.0581 - val_accuracy: 0.9724
Trial 1 Complete [01h 16m 38s]
val_accuracy: 0.9724819660186768

Best val_accuracy So Far: 0.9724819660186768
Total elapsed time: 01h 16m 38s
WARNING:tensorflow:Detecting that an object or model or tf.train.Checkpoint is being deleted with unrestored values. See the following logs for the specific values in question. To silence these warnings, use `status.expect_partial()`. See https://www.tensorflow.org/api_docs/python/tf/train/Checkpoint#restorefor details about the status object returned by the restore function.
WARNING:tensorflow:Detecting that an object or model or tf.train.Checkpoint is being deleted with unrestored values. See the following logs for the specific values in question. To silence these warnings, use `status.expect_partial()`. See https://www.tensorflow.org/api_docs/python/tf/train/Checkpoint#restorefor details about the status object returned by the restore function.
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer._variables.1
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer._variables.1
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer._variables.2
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer._variables.2
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer._variables.3
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer._variables.3
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer._variables.4
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer._variables.4
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer._variables.5
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer._variables.5
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer._variables.6
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer._variables.6
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer._variables.7
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer._variables.7
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer._variables.8
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer._variables.8
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer._variables.9
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer._variables.9
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer._variables.10
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer._variables.10
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer._variables.11
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer._variables.11
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer._variables.12
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).optimizer._variables.12
2023-11-29 21:23:57.450991: W tensorflow/core/framework/op_kernel.cc:1828] OP_REQUIRES failed at lookup_table_op.cc:929 : FAILED_PRECONDITION: Table not initialized.
2023-11-29 21:23:57.451029: W tensorflow/core/framework/op_kernel.cc:1828] OP_REQUIRES failed at lookup_table_op.cc:929 : FAILED_PRECONDITION: Table not initialized.
2023-11-29 21:23:57.451059: W tensorflow/core/framework/op_kernel.cc:1828] OP_REQUIRES failed at lookup_table_op.cc:929 : FAILED_PRECONDITION: Table not initialized.
2023-11-29 21:23:57.451091: W tensorflow/core/framework/op_kernel.cc:1828] OP_REQUIRES failed at lookup_table_op.cc:929 : FAILED_PRECONDITION: Table not initialized.
2023-11-29 21:23:57.451123: W tensorflow/core/framework/op_kernel.cc:1828] OP_REQUIRES failed at lookup_table_op.cc:929 : FAILED_PRECONDITION: Table not initialized.
2023-11-29 21:23:57.451157: W tensorflow/core/framework/op_kernel.cc:1828] OP_REQUIRES failed at lookup_table_op.cc:929 : FAILED_PRECONDITION: Table not initialized.
2023-11-29 21:23:57.451185: W tensorflow/core/framework/op_kernel.cc:1828] OP_REQUIRES failed at lookup_table_op.cc:929 : FAILED_PRECONDITION: Table not initialized.
2023-11-29 21:23:57.451213: W tensorflow/core/framework/op_kernel.cc:1828] OP_REQUIRES failed at lookup_table_op.cc:929 : FAILED_PRECONDITION: Table not initialized.
2023-11-29 21:23:57.451250: W tensorflow/core/framework/op_kernel.cc:1828] OP_REQUIRES failed at lookup_table_op.cc:929 : FAILED_PRECONDITION: Table not initialized.
Traceback (most recent call last):
  File "cnn_search_by_chunk.py", line 50, in <module>
    print(clf.evaluate(validation_dataset))
  File "/home/my_user_name/.local/lib/python3.8/site-packages/autokeras/tasks/structured_data.py", line 187, in evaluate
    return super().evaluate(x=x, y=y, **kwargs)
  File "/home/my_user_name/.local/lib/python3.8/site-packages/autokeras/auto_model.py", line 492, in evaluate
    return utils.evaluate_with_adaptive_batch_size(
  File "/home/my_user_name/.local/lib/python3.8/site-packages/autokeras/utils/utils.py", line 68, in evaluate_with_adaptive_batch_size
    return run_with_adaptive_batch_size(
  File "/home/my_user_name/.local/lib/python3.8/site-packages/autokeras/utils/utils.py", line 101, in run_with_adaptive_batch_size
    history = func(x=x, validation_data=validation_data, **fit_kwargs)
  File "/home/my_user_name/.local/lib/python3.8/site-packages/autokeras/utils/utils.py", line 70, in <lambda>
    lambda x, validation_data, **kwargs: model.evaluate(
  File "/home/my_user_name/.local/lib/python3.8/site-packages/keras/src/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/my_user_name/.local/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 53, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.FailedPreconditionError: Graph execution error:

Detected at node 'model/multi_category_encoding/string_lookup_15/None_Lookup/LookupTableFindV2' defined at (most recent call last):
    File "cnn_search_by_chunk.py", line 50, in <module>
      print(clf.evaluate(validation_dataset))
    File "/home/my_user_name/.local/lib/python3.8/site-packages/autokeras/tasks/structured_data.py", line 187, in evaluate
      return super().evaluate(x=x, y=y, **kwargs)
    File "/home/my_user_name/.local/lib/python3.8/site-packages/autokeras/auto_model.py", line 492, in evaluate
      return utils.evaluate_with_adaptive_batch_size(
    File "/home/my_user_name/.local/lib/python3.8/site-packages/autokeras/utils/utils.py", line 68, in evaluate_with_adaptive_batch_size
      return run_with_adaptive_batch_size(
    File "/home/my_user_name/.local/lib/python3.8/site-packages/autokeras/utils/utils.py", line 101, in run_with_adaptive_batch_size
      history = func(x=x, validation_data=validation_data, **fit_kwargs)
    File "/home/my_user_name/.local/lib/python3.8/site-packages/autokeras/utils/utils.py", line 70, in <lambda>
      lambda x, validation_data, **kwargs: model.evaluate(
    File "/home/my_user_name/.local/lib/python3.8/site-packages/keras/src/utils/traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "/home/my_user_name/.local/lib/python3.8/site-packages/keras/src/engine/training.py", line 2200, in evaluate
      logs = test_function_runner.run_step(
    File "/home/my_user_name/.local/lib/python3.8/site-packages/keras/src/engine/training.py", line 4000, in run_step
      tmp_logs = self._function(dataset_or_iterator)
    File "/home/my_user_name/.local/lib/python3.8/site-packages/keras/src/engine/training.py", line 1972, in test_function
      return step_function(self, iterator)
    File "/home/my_user_name/.local/lib/python3.8/site-packages/keras/src/engine/training.py", line 1956, in step_function
      outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/home/my_user_name/.local/lib/python3.8/site-packages/keras/src/engine/training.py", line 1944, in run_step
      outputs = model.test_step(data)
    File "/home/my_user_name/.local/lib/python3.8/site-packages/keras/src/engine/training.py", line 1850, in test_step
      y_pred = self(x, training=False)
    File "/home/my_user_name/.local/lib/python3.8/site-packages/keras/src/utils/traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "/home/my_user_name/.local/lib/python3.8/site-packages/keras/src/engine/training.py", line 569, in __call__
      return super().__call__(*args, **kwargs)
    File "/home/my_user_name/.local/lib/python3.8/site-packages/keras/src/utils/traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "/home/my_user_name/.local/lib/python3.8/site-packages/keras/src/engine/base_layer.py", line 1150, in __call__
      outputs = call_fn(inputs, *args, **kwargs)
    File "/home/my_user_name/.local/lib/python3.8/site-packages/keras/src/utils/traceback_utils.py", line 96, in error_handler
      return fn(*args, **kwargs)
    File "/home/my_user_name/.local/lib/python3.8/site-packages/keras/src/engine/functional.py", line 512, in call
      return self._run_internal_graph(inputs, training=training, mask=mask)
    File "/home/my_user_name/.local/lib/python3.8/site-packages/keras/src/engine/functional.py", line 669, in _run_internal_graph
      outputs = node.layer(*args, **kwargs)
    File "/home/my_user_name/.local/lib/python3.8/site-packages/keras/src/utils/traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "/home/my_user_name/.local/lib/python3.8/site-packages/keras/src/engine/base_layer.py", line 1150, in __call__
      outputs = call_fn(inputs, *args, **kwargs)
    File "/home/my_user_name/.local/lib/python3.8/site-packages/keras/src/utils/traceback_utils.py", line 96, in error_handler
      return fn(*args, **kwargs)
    File "/home/my_user_name/.local/lib/python3.8/site-packages/autokeras/keras_layers.py", line 91, in call
      for input_node, encoding_layer in zip(split_inputs, self.encoding_layers):
    File "/home/my_user_name/.local/lib/python3.8/site-packages/autokeras/keras_layers.py", line 92, in call
      if encoding_layer is None:
    File "/home/my_user_name/.local/lib/python3.8/site-packages/autokeras/keras_layers.py", line 100, in call
      output_nodes.append(
    File "/home/my_user_name/.local/lib/python3.8/site-packages/keras/src/utils/traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "/home/my_user_name/.local/lib/python3.8/site-packages/keras/src/engine/base_layer.py", line 1150, in __call__
      outputs = call_fn(inputs, *args, **kwargs)
    File "/home/my_user_name/.local/lib/python3.8/site-packages/keras/src/utils/traceback_utils.py", line 96, in error_handler
      return fn(*args, **kwargs)
    File "/home/my_user_name/.local/lib/python3.8/site-packages/keras/src/layers/preprocessing/index_lookup.py", line 756, in call
      lookups = self._lookup_dense(inputs)
    File "/home/my_user_name/.local/lib/python3.8/site-packages/keras/src/layers/preprocessing/index_lookup.py", line 792, in _lookup_dense
      lookups = self.lookup_table.lookup(inputs)
Node: 'model/multi_category_encoding/string_lookup_15/None_Lookup/LookupTableFindV2'
Table not initialized.
         [[{{node model/multi_category_encoding/string_lookup_15/None_Lookup/LookupTableFindV2}}]] [Op:__inference_test_function_5785123]
2023-11-29 21:23:57.618149: W tensorflow/core/kernels/data/generator_dataset_op.cc:108] Error occurred when finalizing GeneratorDataset iterator: FAILED_PRECONDITION: Python interpreter state is not initialized. The process may be terminated.
         [[{{node PyFunc}}]]
2023-11-29 21:23:57.618266: W tensorflow/core/kernels/data/generator_dataset_op.cc:108] Error occurred when finalizing GeneratorDataset iterator: FAILED_PRECONDITION: Python interpreter state is not initialized. The process may be terminated.
         [[{{node PyFunc}}]]
2023-11-29 21:23:57.618360: W tensorflow/core/kernels/data/generator_dataset_op.cc:108] Error occurred when finalizing GeneratorDataset iterator: FAILED_PRECONDITION: Python interpreter state is not initialized. The process may be terminated.
         [[{{node PyFunc}}]]
2023-11-29 21:23:57.618434: W tensorflow/core/kernels/data/generator_dataset_op.cc:108] Error occurred when finalizing GeneratorDataset iterator: FAILED_PRECONDITION: Python interpreter state is not initialized. The process may be terminated.
         [[{{node PyFunc}}]]
my_user_name@192:~/my_project_name_v2$

Hi @South_Asia

Welcome to the TensorFlow Forum!

Please share the installed TensorFlow, Python, and AutoKeras versions, as well as your operating system, so that we can replicate and understand the issue. Thank you.
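
One way to collect those details is a short script along these lines. It is only a sketch: the helper name collect_versions and the package list are assumptions chosen for illustration, and importlib.metadata requires Python 3.8 or newer.

```python
import platform
import sys
from importlib import metadata


def collect_versions(packages=("tensorflow", "autokeras", "keras", "numpy")):
    """Return a dict with the Python version, OS, and installed package versions."""
    info = {
        "python": sys.version.split()[0],
        "os": platform.platform(),
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in collect_versions().items():
        print(f"{key}: {value}")
```

Pasting the printed lines into the thread, together with the CUDA and cuDNN versions if a GPU is involved, should be enough to attempt a reproduction.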