Save a RandomForestModel?

Hello. I'm trying to learn TensorFlow, so apologies if I'm missing something obvious. I can't seem to make predictions on a loaded model. My original model works just fine and I can get a prediction without issues. However, saving and loading that same model doesn't seem to do what I would expect.

Predicting with the original model gives me appropriate values while the loaded model gives me an error:

model.save("Model")
s = testFrame.head(1).drop(['classification', 'cloudProb'], axis=1).loc[0].to_dict()
loadedModel = tf.saved_model.load('Model')
model(s)
loadedModel(s)

The above code works when passing ‘s’ into model but fails for the loadedModel.

ValueError: Could not find matching function to call loaded from the SavedModel. Got:
Positional arguments (2 total):
* {'band2': 346.0, 'band3': 625.0, 'band4': 443.0, 'band8': 2688.0}
* False
Keyword arguments: {}

Expected these arguments to match one of the following 4 option(s):

Option 1:
Positional arguments (2 total):
* {'band4': TensorSpec(shape=(None, 1), dtype=tf.float32, name='band4'), 'band8': TensorSpec(shape=(None, 1), dtype=tf.float32, name='band8'), 'band3': TensorSpec(shape=(None, 1), dtype=tf.float32, name='band3'), 'band2': TensorSpec(shape=(None, 1), dtype=tf.float32, name='band2')}
* False
Keyword arguments: {}

… it goes on to give me all the options. What am I missing here? Thanks for the help.


@Mathieu might be able to help

Hi Barrett,

Sorry for the late answer. Thanks @lgusm for the alert :).

Quick answer

Can you try applying expand_dims to the features before calling the model?

Full example:

model.save("/tmp/my_model")
loaded_model = tf.keras.models.load_model("/tmp/my_model")

for features,label in test_ds:

  # Add a batch dimension.
  features = tf.nest.map_structure(lambda v : tf.expand_dims(v,axis=0), features)

  # Make sure the model is fed with rank 2 features.
  features = tf.nest.map_structure(lambda v : tf.expand_dims(v,axis=1), features)

  print(loaded_model(features))

Explanations

There are two separate issues:

Part 1

In your example {'band2': 346.0, 'band3': 625.0, 'band4': 443.0, 'band8': 2688.0}, the values are of rank 0, i.e. they are single values. Instead, models should be fed with a batch of examples, so you need to make sure the examples are of rank >= 1. The result of tf.expand_dims(v, axis=0) will be: {'band2': [346.0], 'band3': [625.0], 'band4': [443.0], 'band8': [2688.0]}.
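A quick way to see the rank change, sketched with plain NumPy standing in for tf (np.expand_dims behaves like tf.expand_dims for this case):

```python
import numpy as np

# A rank-0 value: a bare scalar, like the 346.0 in the failing call.
v = np.array(346.0)
print(v.ndim)  # 0

# expand_dims(axis=0) adds a leading (batch) dimension -> rank 1.
batched = np.expand_dims(v, axis=0)
print(batched.ndim)   # 1
print(batched.shape)  # (1,)
```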

If you are using the tf.data.Dataset API, you can also use the batch method. The pd_dataframe_to_tf_dataset method also does that for you.

Part 2

Internally, Keras reshapes all the features to rank 2. However, when calling the model directly (i.e. model(s)), this logic is skipped, which creates a shape mismatch. In your error, you can see that tensors of rank 2 are expected, e.g. TensorSpec(shape=(None, 1)). If the model was just trained (model in your example), Keras is able to resolve the mismatch, but if the model was serialized and deserialized (loadedModel in your example), it is not.

The second call, tf.expand_dims(v, axis=1), will reshape your features as follows: {'band2': [[346.0]], 'band3': [[625.0]], 'band4': [[443.0]], 'band8': [[2688.0]]}.
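The two expands together, sketched on a plain dict with NumPy standing in for tf (the band names are just the ones from the example above):

```python
import numpy as np

features = {"band2": np.array(346.0), "band3": np.array(625.0)}

# First expand: rank 0 -> rank 1 (adds the batch dimension).
features = {k: np.expand_dims(v, axis=0) for k, v in features.items()}

# Second expand: rank 1 -> rank 2 (adds the per-feature dimension),
# matching the TensorSpec(shape=(None, 1)) in the error message.
features = {k: np.expand_dims(v, axis=1) for k, v in features.items()}

print(features["band2"].shape)  # (1, 1)
print(features["band2"])        # [[346.]]
```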

This is of course not user friendly :), and we are working with the Keras team and hoping to solve the problem soon. In the meantime, users have to make sure the features passed to the model's call are of rank 2.

Note: model.predict and model.evaluate do not suffer from this issue i.e. loaded_model.predict(test_ds) works fine in all cases.


Hey, thanks for the comprehensive answer! Okay so rank 2 is key - got it. I played around a bit:

print(testSet)

for features,label in testSet:
    features = tf.nest.map_structure(lambda v: tf.expand_dims(v, axis=0), features)
    features = tf.nest.map_structure(lambda v: tf.expand_dims(v, axis=1), features)

print(features)

Gives me

<BatchDataset shapes: ({band2: (None,), band3: (None,), band4: (None,), band8: (None,)}, (None,)), types: ({band2: tf.float64, band3: tf.float64, band4: tf.float64, band8: tf.float64}, tf.int64)>
{'band2': <tf.Tensor: shape=(1, 1, 64), dtype=float64, numpy=
array([[[260., 261., 237., 242., 227., 227., 207., 217., 227., 238.,
         225., 211., 226., 237., 245., 216., 218., 200., 247., 284.,
         223., 209., 199., 200., 231., 216., 222., 195., 213., 213.,
         197., 253., 261., 197., 200., 210., 208., 223., 244., 199.,
         208., 199., 161., 147., 124., 120., 135., 124., 144., 160.,
         141., 111., 145., 134., 122., 116., 155., 163., 164., 151.,
         180., 153., 123., 131.]]])>, 'band3': <tf.Tensor: shape=(1, 1, 64), dtype=float64, numpy=
array([[[569., 555., 502., 507., 514., 533., 512., 491., 524., 517.,
         513., 509., 505., 509., 542., 516., 527., 507., 518., 554.,
         527., 502., 504., 516., 538., 561., 568., 549., 577., 517.,
         531., 549., 575., 523., 472., 506., 509., 538., 548., 524.,
         527., 504., 359., 294., 343., 345., 328., 299., 339., 320.,
         285., 322., 338., 318., 286., 329., 336., 374., 379., 381.,
         397., 342., 339., 321.]]])>, 'band4': <tf.Tensor: shape=(1, 1, 64), dtype=float64, numpy=
array([[[449., 437., 409., 414., 422., 376., 343., 349., 364., 376.,
         367., 379., 383., 394., 400., 381., 342., 339., 386., 393.,
         370., 343., 302., 295., 301., 321., 314., 279., 311., 285.,
         321., 361., 344., 318., 268., 287., 307., 350., 377., 339.,
         337., 332., 239., 200., 224., 202., 198., 188., 219., 215.,
         219., 212., 229., 226., 200., 219., 240., 248., 244., 270.,
         266., 231., 233., 213.]]])>, 'band8': <tf.Tensor: shape=(1, 1, 64), dtype=float64, numpy=
array([[[2166., 2166., 2228., 2220., 2222., 2412., 2632., 2498., 2400.,
         2448., 2466., 2498., 2556., 2522., 2442., 2442., 2556., 2534.,
         2460., 2498., 2570., 2652., 2802., 3034., 3040., 2962., 3008.,
         3090., 3072., 2872., 2834., 2752., 2630., 2532., 2850., 3068.,
         2920., 2514., 2382., 2436., 2498., 2346., 2092., 2474., 2890.,
         2906., 2790., 2568., 2712., 2878., 2690., 2712., 2702., 2688.,
         2624., 2764., 2900., 2916., 2684., 2604., 2894., 2828., 2754.,
         2614.]]])>}

And the same error as before. BUT if I comment out the expand_dims call with axis=1, I actually get a result, but it's only 1 result for the entire set. I feel like I should be getting more. OR I've somehow trained my model with completely wrong data…

for features,label in testSet:
    features = tf.nest.map_structure(lambda v: tf.expand_dims(v, axis=0), features)
    # features = tf.nest.map_structure(lambda v: tf.expand_dims(v, axis=1), features)

print(features)
print(loadedModel.predict(features))

{'band2': <tf.Tensor: shape=(1, 64), dtype=float64, numpy=
array([[260., 261., 237., 242., 227., 227., 207., 217., 227., 238., 225.,
        211., 226., 237., 245., 216., 218., 200., 247., 284., 223., 209.,
        199., 200., 231., 216., 222., 195., 213., 213., 197., 253., 261.,
        197., 200., 210., 208., 223., 244., 199., 208., 199., 161., 147.,
        124., 120., 135., 124., 144., 160., 141., 111., 145., 134., 122.,
        116., 155., 163., 164., 151., 180., 153., 123., 131.]])>, 'band3': <tf.Tensor: shape=(1, 64), dtype=float64, numpy=
array([[569., 555., 502., 507., 514., 533., 512., 491., 524., 517., 513.,
        509., 505., 509., 542., 516., 527., 507., 518., 554., 527., 502.,
        504., 516., 538., 561., 568., 549., 577., 517., 531., 549., 575.,
        523., 472., 506., 509., 538., 548., 524., 527., 504., 359., 294.,
        343., 345., 328., 299., 339., 320., 285., 322., 338., 318., 286.,
        329., 336., 374., 379., 381., 397., 342., 339., 321.]])>, 'band4': <tf.Tensor: shape=(1, 64), dtype=float64, numpy=
array([[449., 437., 409., 414., 422., 376., 343., 349., 364., 376., 367.,
        379., 383., 394., 400., 381., 342., 339., 386., 393., 370., 343.,
        302., 295., 301., 321., 314., 279., 311., 285., 321., 361., 344.,
        318., 268., 287., 307., 350., 377., 339., 337., 332., 239., 200.,
        224., 202., 198., 188., 219., 215., 219., 212., 229., 226., 200.,
        219., 240., 248., 244., 270., 266., 231., 233., 213.]])>, 'band8': <tf.Tensor: shape=(1, 64), dtype=float64, numpy=
array([[2166., 2166., 2228., 2220., 2222., 2412., 2632., 2498., 2400.,
        2448., 2466., 2498., 2556., 2522., 2442., 2442., 2556., 2534.,
        2460., 2498., 2570., 2652., 2802., 3034., 3040., 2962., 3008.,
        3090., 3072., 2872., 2834., 2752., 2630., 2532., 2850., 3068.,
        2920., 2514., 2382., 2436., 2498., 2346., 2092., 2474., 2890.,
        2906., 2790., 2568., 2712., 2878., 2690., 2712., 2702., 2688.,
        2624., 2764., 2900., 2916., 2684., 2604., 2894., 2828., 2754.,
        2614.]])>}
[[0.         0.         0.         0.99666584 0.00333333]]

What's odd about this is that only calling expand_dims once seems to give a rank 2 result as I currently see it. At least that's what I assume from the double '[[' in each of those numpy arrays. Why am I not getting 64 results from that input structure? Thanks again
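Tracing the shapes in plain NumPy (assuming it mirrors tf.expand_dims here) to see where the 64 goes:

```python
import numpy as np

# The dataset is already batched: each feature arrives as 64 values.
band2 = np.arange(64.0)                        # shape (64,), rank 1

# A leading axis makes it look like ONE example with 64 features,
# which would explain getting a single prediction back:
one_example = np.expand_dims(band2, axis=0)
print(one_example.shape)                       # (1, 64)

# A trailing axis instead gives 64 examples of 1 feature each,
# matching TensorSpec(shape=(None, 1)) from the error message:
many_examples = np.expand_dims(band2, axis=1)
print(many_examples.shape)                     # (64, 1)
```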