How to convert tfdf.keras.GradientBoostedTreesModel saved model to Frozen graph format?

Hello Team,

I have a requirement where I need my saved model in frozen graph format. I cannot find any pointers on converting my tfdf.keras.GradientBoostedTreesModel to a frozen graph .pb file for our internal serving infrastructure.

I followed the tutorials here, but it didn't work out. I got this error: AttributeError: 'GradientBoostedTreesModel' object has no attribute 'graph'
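As far as I can tell, the freeze recipes in those tutorials break at the point where they read the model's graph attribute. A minimal reproduction (assuming model is the trained GradientBoostedTreesModel shown further below):

# TF2 Keras models (including TF-DF models) do not expose a `.graph`
# attribute, so the TF1-style freeze tutorials fail right at this step.
graph_def = model.graph.as_graph_def()
# AttributeError: 'GradientBoostedTreesModel' object has no attribute 'graph'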

I am in a time crunch and have been stuck on this for three days. Can someone help me out?

Here is how I am saving my model:

import tensorflow as tf
import tensorflow_decision_forests as tfdf
from tensorflow import keras

# model_config, feature_spec, utils and BATCH_SIZE come from our internal code.

# Convert the pandas DataFrames into TensorFlow datasets.
train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(train_df, label=feature_spec.get_label_col())
valid_ds = tfdf.keras.pd_dataframe_to_tf_dataset(valid_df, label=feature_spec.get_label_col())
test_ds = tfdf.keras.pd_dataframe_to_tf_dataset(test_df, label=feature_spec.get_label_col())

if not model_config.gradient_boosted_trees_config:
    exit(f"Boosted trees config not set, check the config file {model_config}")

model = tfdf.keras.GradientBoostedTreesModel(
    task=tfdf.keras.Task.CLASSIFICATION,
    **model_config.gradient_boosted_trees_config.model_dump(),
    features=[
        f for f in feature_spec.get_all_tffeature() if f.name not in feature_spec.get_blacklist_feature_names()
    ],
    exclude_non_specified_features=True,
    num_threads=12,
)

# Train with TensorBoard logging.
tensorboard_callback = keras.callbacks.TensorBoard(log_dir=model_config.tensorboard_log_dir + "/{}")
model.fit(train_ds, validation_data=valid_ds, callbacks=[tensorboard_callback], batch_size=BATCH_SIZE)

# Some information about the model.
print(model.make_inspector().variable_importances())
print(model.summary())

# Evaluate the model on the validation dataset.
model.compile(
    metrics=[
        tf.keras.metrics.Accuracy(),
        tf.keras.metrics.Precision(),
        tf.keras.metrics.Recall(),
    ]
)
evaluation = model.evaluate(valid_ds)
print(f"Binary cross-entropy loss: {evaluation[0]}")
print(f"Accuracy: {evaluation[1]}")

# Export both the native Keras format and a TensorFlow SavedModel.
saved_model_path = f"{model_config.saved_model_dir}/boosted_trees/my_saved_model_{utils.get_now()}"
model.save(saved_model_path + ".keras", save_format="keras")
model.save(saved_model_path, save_format="tf")
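The export itself seems fine: I can reload the SavedModel and get predictions back in Python (a quick sanity check, assuming tensorflow_decision_forests is installed in the environment that does the loading):

import tensorflow as tf
import tensorflow_decision_forests as tfdf  # noqa: F401 (importing TF-DF registers its custom inference ops)

# Reload the exported SavedModel and run it on the held-out dataset.
reloaded = tf.keras.models.load_model(saved_model_path)
print(reloaded.predict(test_ds))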

@lgusm @Mathieu Wondering if you folks have any suggestions?

Hi, unfortunately this is not possible, since Gradient Boosted Trees are not trained or represented the same way neural networks are :frowning: Is there another format you can serve on your infra (e.g. in-process C++, a tiny standalone binary, TensorFlow Serving, …)? I hope we can find a solution that works for you.

Thanks @rstz .

We are considering loading the model locally in the same service. Do you think the existing Java implementation for loading a model and running inference will work with GBDT?

A non-GBDT question: we are also thinking of putting a limited, time-boxed effort into moving to a neural-network-based model. Would I be able to get a frozen graph of Keras preprocessing layers or TF Transform layers, so we end up with an end-to-end frozen graph?
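For the neural-network route, the recipe I would hope to use is the standard TF2 convert_variables_to_constants_v2 flow. A rough sketch with a toy model (the Normalization layer, shapes and file names below are just placeholders):

import numpy as np
import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import (
    convert_variables_to_constants_v2,
)

# Toy end-to-end model: a preprocessing layer followed by dense layers.
normalizer = tf.keras.layers.Normalization()
normalizer.adapt(np.random.rand(100, 4).astype("float32"))
nn_model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,), dtype=tf.float32),
    normalizer,
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Wrap the model in a concrete function and fold its variables into constants.
full_model = tf.function(lambda x: nn_model(x))
concrete_func = full_model.get_concrete_function(tf.TensorSpec([None, 4], tf.float32))
frozen_func = convert_variables_to_constants_v2(concrete_func)

# Serialize the frozen GraphDef to a .pb file.
tf.io.write_graph(
    graph_or_graph_def=frozen_func.graph.as_graph_def(),
    logdir="frozen_models",
    name="frozen_graph.pb",
    as_text=False,
)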

Edit:

This implementation seems to be graph-based, so I am assuming it won't be compatible; let me know if my assumption is incorrect: SavedModelBundle | JVM | TensorFlow

I can't find the inference code for a Keras/TF SavedModel in Java: Install TensorFlow Java | JVM

Do you have any pointers?

Hi, we don't actively support TF-DF for Java, so I'm not sure whether TF-DF models work there (Issue with loading model from: tensorflow_decision_forests · Issue #492 · tensorflow/java · GitHub). The GBT inference ops are not implemented in vanilla TensorFlow, so one would probably need to link the TF-DF inference ops (packaged with the TF-DF pip package) into the Java runtime.

As one possible workaround, you can try to create a C++ library with just the inference code and call it from Java. To bootstrap the necessary code, install the Python package ydf and run:

import ydf
model = ydf.from_tensorflow_decision_forests("/tmp/my_tensorflow_saved_model")
# Print C++ code for inference with YDF
print(model.to_cpp(key="my_ydf_model"))
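Before wiring up the C++ side, it may also be worth sanity-checking the converted model from Python. Something like this (the CSV path is a placeholder; any pandas DataFrame with the training feature columns works):

import pandas as pd
import ydf

model = ydf.from_tensorflow_decision_forests("/tmp/my_tensorflow_saved_model")

# Quick check that the conversion preserved the model: inspect it and
# score a few rows of held-out data.
test_df = pd.read_csv("/tmp/test.csv")  # placeholder path
print(model.describe())
print(model.predict(test_df.head(10)))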