Generic way of retrieving input signature of tfhub models?

Hi,

I need to write a generic way of fetching and feeding TensorFlow Hub models with random inputs for testing purposes. My approach is to read the model's input signature (as shown below), generate appropriate random input tensors, and feed them to the model.
The problem is that the method I use to retrieve the inputs does not tell me whether the model expects them as a dict or as keyword/positional arguments.

An example:

model = hub.load('https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4')
input_sig = model.signatures['serving_default'].structured_input_signature
print(input_sig)
# Out:
# ((), {'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_mask'), 'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_type_ids'), 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_word_ids')})

From my understanding, the tuple stored as `.structured_input_signature` represents `(positional_args, keyword_args)`. So I can see that the model has 3 inputs; what I can't see is that the model actually expects a single argument that is a dict of those three tensors (see the `Usage` section of that particular model's page).
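For context, here is roughly how I generate the random tensors from the `TensorSpec` dict. This is a minimal sketch using a hypothetical `Spec` stand-in for `tf.TensorSpec` (so it runs without TensorFlow); unknown (`None`) dimensions are replaced by a fixed size:

```python
from collections import namedtuple
import numpy as np

# Hypothetical stand-in for tf.TensorSpec, just for this sketch.
Spec = namedtuple("Spec", ["shape", "dtype", "name"])

def make_random_inputs(structured_input_signature, fill_dim=8):
    """Build a random numpy array for every spec in the signature.

    `structured_input_signature` is the (positional_args, keyword_args)
    tuple; unknown (None) dimensions are replaced by `fill_dim`.
    """
    _, kwargs = structured_input_signature
    inputs = {}
    for name, spec in kwargs.items():
        shape = [fill_dim if d is None else d for d in spec.shape]
        inputs[name] = np.random.randint(0, 100, size=shape).astype(spec.dtype)
    return inputs

sig = ((), {
    "input_mask": Spec((None, None), "int32", "input_mask"),
    "input_word_ids": Spec((None, None), "int32", "input_word_ids"),
})
inputs = make_random_inputs(sig)
print({k: v.shape for k, v in inputs.items()})
# {'input_mask': (8, 8), 'input_word_ids': (8, 8)}
```

This part works fine; the open question is only how those generated tensors should then be passed to the model.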

Here is the actual stack trace when I try to feed the model with keyword arguments named as from the signature above:

Traceback (most recent call last):
  File "/Users/nicholasscottodiperto/work/Plugins/test/tfhub.py", line 322, in <module>
    run_model(
  File "/Users/nicholasscottodiperto/work/Plugins/test/tfhub.py", line 218, in run_model
    ppu_out = to_numpy(mod(**input_dict))
  File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.10/site-packages/tensorflow/python/saved_model/load.py", line 686, in _call_attribute
    return instance.__call__(*args, **kwargs)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.10/site-packages/tensorflow/python/eager/function_spec.py", line 410, in canonicalize_function_inputs
    raise TypeError(f"{self.signature_summary()} missing 1 required "
TypeError: f(inputs, training, mask) missing 1 required argument: inputs.

This is even more confusing: the names here don't even match the names from the signature.
If I pass the dictionary to the model, it works.
When I try to pass the values as positional arguments, I get the following:

  Positional arguments (3 total):
    * <tf.Tensor 'inputs:0' shape=(8, 8) dtype=int32>
    * <tf.Tensor 'training:0' shape=(8, 8) dtype=int32>
    * <tf.Tensor 'mask:0' shape=(8, 8) dtype=int32>
  Keyword arguments: {}

Expected these arguments to match one of the following 4 option(s):

Option 1:
  Positional arguments (3 total):
    * {'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_mask'),
 'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_type_ids'),
 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_word_ids')}
    * False
    * None
  Keyword arguments: {}

Option 2:
  Positional arguments (3 total):
    * {'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_mask'),
 'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_type_ids'),
 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_word_ids')}
    * False
    * None
  Keyword arguments: {}

Option 3:
  Positional arguments (3 total):
    * {'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_mask'),
 'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_type_ids'),
 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_word_ids')}
    * True
    * None
  Keyword arguments: {}

Option 4:
  Positional arguments (3 total):
    * {'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_mask'),
 'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_type_ids'),
 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_word_ids')}
    * True
    * None
  Keyword arguments: {}

This is the information I would like to retrieve: we can clearly see here that the model expects a dictionary of tensors as input.

How can I find out what to feed to the `__call__` method in a generic way? Or how can I feed the model without having to worry about whether to pass a dict or keyword inputs?

PS: same problem with the output signature…

Hi slai-nick,

Can you share more information about what you are trying to build?

This is a tricky problem. Since models are code, they can have any input signature; that's why we try to keep good documentation, so that publishers can explain how to use their models and make users' lives easier.

Do you have a specific list of models?
For example, you're using a BERT model in your example; TF Hub has a default API for such models, and while publishers aren't required to follow it, many end up doing so. The same goes for image-related models.

Given that, your process can be easier depending on the list of models you need.
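If the list is open-ended, one pragmatic (if inelegant) workaround for the dict-vs-kwargs ambiguity is to simply try both calling conventions. A minimal sketch, where `fn` stands for any callable, e.g. the loaded model or one of its signatures (the helper and the toy functions below are hypothetical names, for illustration only):

```python
def call_flexibly(fn, inputs):
    """Call `fn` with a dict of named tensors, trying the two common
    conventions: first a single positional dict, then keyword arguments."""
    try:
        return fn(inputs)
    except TypeError:
        return fn(**inputs)

# Toy stand-ins for the two conventions a model might expose.
def wants_dict(d):
    return sum(d.values())

def wants_kwargs(a, b):
    return a + b

print(call_flexibly(wants_dict, {"a": 1, "b": 2}))    # 3
print(call_flexibly(wants_kwargs, {"a": 1, "b": 2}))  # 3
```

The obvious caveat is that a `TypeError` raised from inside the model body would also trigger the fallback, so this is only a heuristic for testing pipelines, not something to rely on in production code.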

One tool that may help you is `saved_model_cli` (for example, `saved_model_cli show --dir /path/to/model --all` prints every signature of a SavedModel).

I hope it helps