Preprocessing Layers and KerasTuner Cooperation

Hi!

I’m having trouble making the preprocessing layers and KerasTuner cooperate.

I am referring to the Load CSV data | TensorFlow Core tutorial for the preprocessing part, and to the Getting started with KerasTuner documentation for the tuner part.

Briefly, here’s the code.
The tutorial loads the data:

import pandas as pd

titanic = pd.read_csv("https://storage.googleapis.com/tf-datasets/titanic/train.csv")
titanic_features = titanic.copy()
titanic_labels = titanic_features.pop('survived')
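
For context, a quick look at the column dtypes shows which features come in as strings and which as numbers, which is what the dtype check in the loop below relies on (my own check, not from the tutorial):

# object columns are strings (sex, class, deck, embark_town, alone); the rest are numeric
print(titanic_features.dtypes)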

It creates symbolic input tensors for the features in a dictionary:

import tensorflow as tf

inputs = {}

for name, column in titanic_features.items():
  dtype = column.dtype
  if dtype == object:
    dtype = tf.string
  else:
    dtype = tf.float32

  inputs[name] = tf.keras.Input(shape=(1,), name=name, dtype=dtype)

inputs

Then it applies normalization to the numeric features:

import numpy as np
from tensorflow.keras import layers

numeric_inputs = {name: input for name, input in inputs.items()
                  if input.dtype == tf.float32}

x = layers.Concatenate()(list(numeric_inputs.values()))
norm = layers.Normalization()
norm.adapt(np.array(titanic[numeric_inputs.keys()]))
all_numeric_inputs = norm(x)

all_numeric_inputs
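
Just to convince myself what the adapted Normalization layer does, I also ran it on a few concrete rows (my own check, not part of the tutorial):

# the adapted layer roughly zero-centers and scales the numeric columns
print(norm(np.array(titanic[numeric_inputs.keys()])[:3]))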

It collects the preprocessed outputs in a list:

preprocessed_inputs = [all_numeric_inputs]

It one-hot encodes the categorical features:

for name, input in inputs.items():
  if input.dtype == tf.float32:
    continue

  lookup = layers.StringLookup(vocabulary=np.unique(titanic_features[name]))
  one_hot = layers.CategoryEncoding(num_tokens=lookup.vocabulary_size())

  x = lookup(input)
  x = one_hot(x)
  preprocessed_inputs.append(x)
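
As a quick check (my own toy example, not from the tutorial), here is what the lookup plus encoding produces for a single 'sex' value:

demo_lookup = layers.StringLookup(vocabulary=np.unique(titanic_features['sex']))
demo_one_hot = layers.CategoryEncoding(num_tokens=demo_lookup.vocabulary_size())
print(demo_one_hot(demo_lookup(tf.constant([['male']]))))
# -> one row of length 3: the 2 vocabulary entries plus the OOV slot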

and then it concatenates the list:

preprocessed_inputs_cat = layers.Concatenate()(preprocessed_inputs)

titanic_preprocessing = tf.keras.Model(inputs, preprocessed_inputs_cat)
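
To sanity-check the preprocessing head (and to see where the 28 used later comes from), I call it on a dict of per-column arrays; titanic_features_dict is built exactly as in the tutorial:

# dict of per-column NumPy arrays, as in the "Load CSV data" tutorial
titanic_features_dict = {name: np.array(value)
                         for name, value in titanic_features.items()}

# run a single row through the preprocessing model; the last dimension
# (28 for this dataset) is the width the dense model below has to expect
features_dict = {name: values[:1] for name, values in titanic_features_dict.items()}
print(titanic_preprocessing(features_dict).shape)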

Now what I want to do is plug this preprocessing part into KerasTuner. I tried this:

def titanic_model(units, activation):

    model_inputs = tf.keras.Input(shape=(28,))

    dense_1 = layers.Dense(units=units, activation=activation)(model_inputs)
    dense_output = layers.Dense(1)(dense_1)
    body = tf.keras.Model(inputs=model_inputs, outputs=dense_output)

    return body

def build_model(hp,preprocessing_head, inputs):

    units = hp.Int("units", min_value=32, max_value=512, step=32)
    activation = hp.Choice("activation", ["relu", "tanh"])

    preprocessed_inputs = preprocessing_head(inputs)
    result = titanic_model(units,
                           activation)(preprocessed_inputs)

    model = tf.keras.Model(inputs, result)

    model.compile(loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
                  optimizer=tf.keras.optimizers.Adam(),
                  metrics = ['accuracy'])

    return model

import keras_tuner

model = build_model(keras_tuner.HyperParameters(), titanic_preprocessing, inputs)

but it gives me the following error:

Inputs to a layer should be tensors. Got: <keras_tuner.engine.hyperparameters.HyperParameters
object at 0x7ff52844da30>

I cannot tell whether I am close to the solution or whether this is not the right way to proceed at all.

However, the workaround I found was to use the preprocessing model (titanic_preprocessing) and the inputs dictionary directly inside the build_model function, without passing them in as arguments.

Hence:

def titanic_model(units, activation):

    model_inputs = tf.keras.Input(shape=(28,))

    dense_1 = layers.Dense(units=units, activation=activation)(model_inputs)
    dense_output = layers.Dense(1)(dense_1)
    body = tf.keras.Model(inputs=model_inputs, outputs=dense_output)

    return body

def build_model(hp):

    units = hp.Int("units", min_value=32, max_value=512, step=32)
    activation = hp.Choice("activation", ["relu", "tanh"])

    preprocessed_inputs = titanic_preprocessing(inputs)
    result = titanic_model(units,
                           activation)(preprocessed_inputs)

    model = tf.keras.Model(inputs, result)

    model.compile(loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
                  optimizer=tf.keras.optimizers.Adam(),
                  metrics = ['accuracy'])

    return model

model = build_model(keras_tuner.HyperParameters())

In this case it seems to work. I set up the tuner

tuner = keras_tuner.RandomSearch(
    hypermodel = build_model,
    objective=keras_tuner.Objective("accuracy", direction="max"),
    max_trials = 1,
    overwrite = True,
    directory = "tuner_dir",
    project_name = "regression_tuner")

and run the search:

tuner.search(x=titanic_features_dict, y=titanic_labels, epochs=10)

However, I have doubts about this solution and would appreciate your feedback on it.
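
For completeness, another direction I was wondering about (I have not verified it; the binding below is just a sketch, using the three-argument build_model from my first attempt rather than the workaround version) is to keep the extra arguments and bind them with functools.partial before handing the function to the tuner, so that KerasTuner still calls it with hp only:

import functools

# bind the preprocessing head and the raw inputs to the three-argument
# build_model(hp, preprocessing_head, inputs); the tuner then only supplies hp
bound_build_model = functools.partial(build_model,
                                      preprocessing_head=titanic_preprocessing,
                                      inputs=inputs)

tuner = keras_tuner.RandomSearch(
    hypermodel=bound_build_model,
    objective=keras_tuner.Objective("accuracy", direction="max"),
    max_trials=1,
    overwrite=True,
    directory="tuner_dir",
    project_name="regression_tuner")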

Thank you!

Did you receive any response to this query? If not, I will try to work through it…

Arindam

Hi Arindam!

Unfortunately I still do not have the solution to the question :slightly_frowning_face:

A possible solution I had in mind is to do the preprocessing inside the model-building function itself, but it does not seem convenient.
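
Roughly, the sketch below is what I had in mind (untested, and the function name is mine): the whole preprocessing graph is rebuilt inside the builder, which is exactly why it seems inconvenient, since the Normalization layer would be re-adapted on every trial:

def build_model_with_preprocessing(hp):
    # rebuild the whole preprocessing graph inside the builder
    prep_inputs = {}
    for name, column in titanic_features.items():
        dtype = tf.string if column.dtype == object else tf.float32
        prep_inputs[name] = tf.keras.Input(shape=(1,), name=name, dtype=dtype)

    numeric = {name: inp for name, inp in prep_inputs.items()
               if inp.dtype == tf.float32}
    norm = layers.Normalization()
    norm.adapt(np.array(titanic[list(numeric.keys())]))  # re-adapted on every trial
    pieces = [norm(layers.Concatenate()(list(numeric.values())))]

    for name, inp in prep_inputs.items():
        if inp.dtype == tf.float32:
            continue
        lookup = layers.StringLookup(vocabulary=np.unique(titanic_features[name]))
        one_hot = layers.CategoryEncoding(num_tokens=lookup.vocabulary_size())
        pieces.append(one_hot(lookup(inp)))

    preprocessed = layers.Concatenate()(pieces)

    # tunable dense head, same hyperparameters as before
    units = hp.Int("units", min_value=32, max_value=512, step=32)
    activation = hp.Choice("activation", ["relu", "tanh"])
    x = layers.Dense(units=units, activation=activation)(preprocessed)
    outputs = layers.Dense(1)(x)

    model = tf.keras.Model(prep_inputs, outputs)
    model.compile(loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
                  optimizer=tf.keras.optimizers.Adam(),
                  metrics=['accuracy'])
    return model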