Try out ConvNeXt in Keras!

Hi folks,

I hope you are doing well. I wanted to tell y’all about the new ConvNeXt models [1] I have been converting for the past few days. Finally, they are available on TF-Hub [2]. The collection contains a total of 30 models that are categorised into two groups: classifier and feature extractor.

These models are NOT black-box SavedModels, i.e., they can be fully expanded into tf.keras.Model objects and one can call all the utility functions on them (example: .summary()). TF-Hub links to the models, conversion code, off-the-shelf classification, and fine-tuning code are available in the GitHub repository [3]. There are some points in the repository that call for contributions, so happy to welcome them if anyone’s interested :slight_smile:

A huge shoutout to the ML-GDE team for providing GCP credits that made the validation of these models [4] possible. Also, thanks to @vasudevgupta, @lgusm, and Willi Gierke for helping out. Happy to address any feedback and answer any questions.

References:

[1] https://arxiv.org/abs/2201.03545

[2] TensorFlow Hub

[3] GitHub - sayakpaul/ConvNeXt-TF: Includes PyTorch -> Keras model porting code for ConvNeXt family of models with fine-tuning and inference notebooks.

[4] ConvNeXt-TF/i1k_eval at main · sayakpaul/ConvNeXt-TF · GitHub

This is awesome @Sayak_Paul :clap: :clap: :clap: :clap: Thanks! And thanks @vasudevgupta and @lgusm and Willi :dark_sunglasses:

This is great work!!!

Thanks for sharing these models Sayak!

@Sayak_Paul @lgusm How can I reproduce ConvNeXt in Keras? Does your GitHub repository include an off-the-shelf classifier for that? What would the steps be? I want to train the model on ImageNet and run inference on a validation set.

The ImageNet-1k evaluation scripts are here: ConvNeXt-TF/i1k_eval at main · sayakpaul/ConvNeXt-TF · GitHub.

For training on ImageNet-1k, you’d need to follow the paper and implement the necessary utilities.

Hi Sayak,
Thanks for the nice work and for sharing this. I am new to using TF-Hub. I need to hook/add something to the model and fine-tune it on ImageNet.
How can I see all the layers of the model, in case I need to crop it before the last layers?
Thanks

Thank you!

Here’s an example:

import tensorflow as tf

model_gcs_path = "gs://tfhub-modules/sayakpaul/convnext_tiny_1k_224/1/uncompressed"
model = tf.keras.models.load_model(model_gcs_path)
# summary() prints the table itself and returns None, so no print() is needed.
model.summary(expand_nested=True)

Just so you are aware, some of the models that come with the TF-Hub collection are already fine-tuned on ImageNet-1k while some of them were pre-trained on ImageNet-21k. Be sure to refer to the documentation of each model before proceeding.
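To make the "crop it before the last layers" part concrete, here is a minimal sketch. It uses a small stand-in model rather than the actual ConvNeXt one, so the layer names ("stem", "pool", "head") are made up for illustration; with the real model you would take the names from the summary output. It also assumes the loaded model is a functional Keras model whose layers are reachable at the top level.

```python
import tensorflow as tf

# Small stand-in for a loaded classifier; these layer names are hypothetical.
inputs = tf.keras.Input(shape=(32, 32, 3))
x = tf.keras.layers.Conv2D(8, 3, name="stem")(inputs)
x = tf.keras.layers.GlobalAveragePooling2D(name="pool")(x)
outputs = tf.keras.layers.Dense(10, name="head")(x)
model = tf.keras.Model(inputs, outputs)

# List every layer name to find the point where the model should be cut.
layer_names = [layer.name for layer in model.layers]
print(layer_names)

# Build a new model that stops at the chosen intermediate layer,
# dropping everything after it (here, the classification head).
truncated = tf.keras.Model(model.input, model.get_layer("pool").output)

# Run a dummy batch through the truncated model to check the output shape.
features = truncated(tf.zeros((1, 32, 32, 3)))
print(features.shape)
```

The truncated model shares its weights with the original one, so anything attached on top of it can be trained together with the backbone.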

Thanks, great. I meant “finer”-tuning on imagenet with my attachment:))

You might also want to take a look at the feature extractors that Sayak also published; they might be a better option for fine-tuning directly.

Thanks for sharing! Testing it out with the feature extractor as a backbone to my model.

Please do and let us know about the results.

Hi @Sayak_Paul, the training goes well, but I am having trouble saving the model.
model.save_weights throws this error:
ValueError: Unable to create dataset (name already exists)
and model.save gives me an error about get_config().
It seems that I cannot call model.get_config(), which might be causing the error.

Do you have any suggestions?

Serializing to h5 weights might not be possible since the model has a custom layer that does not override get_config(). It’s an oversight on my part. Apologies.

Could you do model.save("directory-to-savedmodel") and report back?

If serializing to .h5 weights is a requirement for you then I would suggest directly using this repository and running the conversion:

One tweak you’d need, though. You’d need to implement get_config() for this layer:
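As a rough sketch of what overriding get_config() looks like, here is a hypothetical ScaleLayer. It is not the actual layer from the repository; the class name and its init_value argument are assumptions made only for illustration. The idea is that get_config() returns every constructor argument so that Keras can rebuild the layer from its config.

```python
import tensorflow as tf


class ScaleLayer(tf.keras.layers.Layer):
    """Hypothetical custom layer that scales its input per channel."""

    def __init__(self, init_value=1e-6, **kwargs):
        super().__init__(**kwargs)
        self.init_value = init_value

    def build(self, input_shape):
        # One learnable scale factor per channel.
        self.gamma = self.add_weight(
            name="gamma",
            shape=(input_shape[-1],),
            initializer=tf.keras.initializers.Constant(self.init_value),
            trainable=True,
        )

    def call(self, inputs):
        return inputs * self.gamma

    def get_config(self):
        # Start from the base config (name, dtype, ...) and add
        # every argument that __init__ needs.
        config = super().get_config()
        config.update({"init_value": self.init_value})
        return config


layer = ScaleLayer(init_value=0.5, name="scale")
config = layer.get_config()

# from_config() passes the config back to __init__ to recreate the layer.
restored = ScaleLayer.from_config(config)
print(config["init_value"], restored.name)
```

Weights are not part of the config; they are stored separately when the model is saved.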

Here’s an example showing a serialization recipe of the model:

For the model, it uses a feature extractor and then appends a single classification layer.
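A minimal sketch of that structure, assuming a small stand-in backbone in place of the actual ConvNeXt feature extractor (so that it runs without downloading anything). The backbone layers, num_classes, and the input size used here are illustrative assumptions, not values from the linked example.

```python
import tensorflow as tf

# Stand-in for the ConvNeXt feature extractor: it maps an image batch
# to one feature vector per image. The real backbone would go here.
backbone = tf.keras.Sequential(
    [
        tf.keras.layers.Conv2D(16, 4, strides=4),
        tf.keras.layers.GlobalAveragePooling2D(),
    ],
    name="stand_in_feature_extractor",
)

num_classes = 5  # hypothetical number of target classes

# Feature extractor followed by a single classification layer.
inputs = tf.keras.Input(shape=(224, 224, 3))
features = backbone(inputs)
outputs = tf.keras.layers.Dense(num_classes)(features)
custom_model = tf.keras.Model(inputs, outputs)

# Dummy forward pass: one row of logits per input image.
logits = custom_model(tf.zeros((2, 224, 224, 3)))
print(logits.shape)
```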

I hope that helps.

Hi @Sayak_Paul ,

model.save does not work either. I just need to save the model in any form, but both save and save_weights don’t work in my case…

Do you think it is expected for both methods not to work? Or do you think model.save should work and something might have been done wrong?

Hi @Sayak_Paul ,

Thank you so much for the example. Let me try and get back to you.

What are the ways to load a model from non-.h5 files? After saving the model according to the Colab you shared with me, with the name "custom_model", there are four files under the custom_model folder:

!ls custom_model
assets keras_metadata.pb saved_model.pb variables

If I use custom_model.h5 as a filename, it throws the exact same error I got from my model.

NotImplementedError                       Traceback (most recent call last)

<ipython-input-6-d82d1d889911> in <module>()
----> 1 custom_model.save("custom_model.h5")

/usr/local/lib/python3.7/dist-packages/keras/saving/saved_model/load.py in get_config(self)
   1061       return self._config
   1062     else:
-> 1063       raise NotImplementedError
   1064
   1065

NotImplementedError:

You should be able to load the serialized SavedModel by tf.keras.models.load_model("custom_model").

Oh, I see. Thank you very much for your help!

Hi @Sayak_Paul ,

model.save works before the training like this:

WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.

WARNING:absl:Found untraced functions such as conv2d_4_layer_call_fn, conv2d_4_layer_call_and_return_conditional_losses, layer_normalization_4_layer_call_fn, layer_normalization_4_layer_call_and_return_conditional_losses, dense_layer_call_fn while saving (showing 5 of 432). These functions will not be directly callable after loading.

INFO:tensorflow:Assets written to: custom_model2/assets

However, it doesn’t work for saving after the training, with the following errors:

WARNING:absl:Found untraced functions such as conv2d_4_layer_call_fn, conv2d_4_layer_call_and_return_conditional_losses, layer_normalization_4_layer_call_fn, layer_normalization_4_layer_call_and_return_conditional_losses, dense_layer_call_fn while saving (showing 5 of 432). These functions will not be directly callable after loading.

INFO:tensorflow:Assets written to: model_save_dir/assets

---------------------------------------------------------------------------

UnimplementedError                        Traceback (most recent call last)

<ipython-input-44-118ce546ce88> in <module>()
----> 1 model.save("model_save_dir")

1 frames

/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py in error_handler(*args, **kwargs)
     65     except Exception as e:  # pylint: disable=broad-except
     66       filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67       raise e.with_traceback(filtered_tb) from None
     68     finally:
     69       del filtered_tb

/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/context.py in sync_executors(self)
    692     """
    693     if self._context_handle:
--> 694       pywrap_tfe.TFE_ContextSyncExecutors(self._context_handle)
    695     else:
    696       raise ValueError("Context is not initialized.")

UnimplementedError: File system scheme '[local]' not implemented (file: 'model_save_dir/variables/variables_temp/part-00000-of-00001')
	Encountered when executing an operation using EagerExecutor. This error cancels all future operations and poisons their output tensors.

Any suggestions for workarounds?

Thank you.
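One thing that may be worth trying, under an assumption the thread does not confirm: the "File system scheme '[local]' not implemented" error typically shows up when training on a TPU runtime and then saving to the notebook's local disk. In that situation, passing tf.saved_model.SaveOptions with experimental_io_device set to "/job:localhost" routes the file I/O through the local host. The sketch below uses a tiny tf.Module (TinyModule is a made-up stand-in) so it runs anywhere; with the Keras version in the traceback above, the same options object can be passed as model.save("model_save_dir", options=save_options).

```python
import os
import tempfile

import tensorflow as tf

# Ask TensorFlow to perform the save/load file I/O on the local host.
save_options = tf.saved_model.SaveOptions(experimental_io_device="/job:localhost")
load_options = tf.saved_model.LoadOptions(experimental_io_device="/job:localhost")


class TinyModule(tf.Module):
    """Made-up stand-in for the trained model."""

    def __init__(self):
        super().__init__()
        self.w = tf.Variable(2.0)

    @tf.function(input_signature=[tf.TensorSpec([], tf.float32)])
    def __call__(self, x):
        return self.w * x


# Save into a fresh temporary directory using the local-host I/O device.
export_dir = os.path.join(tempfile.mkdtemp(), "model_save_dir")
tf.saved_model.save(TinyModule(), export_dir, options=save_options)

# Load it back the same way and run the restored function.
restored = tf.saved_model.load(export_dir, options=load_options)
result = float(restored(tf.constant(3.0)))
print(result)
```

If the runtime is not a TPU one, this is unlikely to be the cause and the options should make no difference.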