Pruning pre-trained model with model-optimization

Hey! I have a model based on MobileNetV2 with an added classification head. I followed a tutorial on how to prune a model, but when I pruned my model, only the classification head was pruned, which is not what I need. Is it possible to prune base_model first and then add it to my model?

Base model:

base_model = tf.keras.applications.MobileNetV2(
  input_shape=INPUT_IMG_SHAPE,
  include_top=False,
  weights='imagenet',
  pooling='avg'
)

Actual model:

model = tf.keras.models.Sequential()
model.add(base_model)
model.add(tf.keras.layers.Dropout(0.5))
model.add(tf.keras.layers.Dense(
    units=NUM_CLASSES,
    activation=tf.keras.activations.softmax,
    kernel_regularizer=tf.keras.regularizers.l2(l=0.01)
))

Hi @stepan_zalis, once you have defined your base model

base_model = tf.keras.applications.MobileNetV2(
  input_shape=(220,220,3),
  include_top=False,
  weights='imagenet',
  pooling='avg'
)

you can prune this model using

import tensorflow_model_optimization as tfmot
prune_low_magnitude = tfmot.sparsity.keras.prune_low_magnitude
prune_base_model = prune_low_magnitude(base_model)

Now you can pass this pruned model to your model

model = tf.keras.models.Sequential()
model.add(prune_base_model)
model.add(tf.keras.layers.Dropout(0.5))
model.add(tf.keras.layers.Dense(
    units=10,
    activation=tf.keras.activations.softmax,
    kernel_regularizer=tf.keras.regularizers.l2(l=0.01)
))

Thank You!

Thank you, that is exactly what I tried, without any success. I have this now:

pruned_base_model = create_pruning_model(base_model)
compile_model(pruned_base_model)
model = create_model(pruned_base_model)

where create_model builds the model as shown in the second snippet, and create_pruning_model(_) is this:

def create_pruning_model(base_model):
    prune_low_magnitude = tfmot.sparsity.keras.prune_low_magnitude
    pruning_params = {
        'pruning_schedule': tfmot.sparsity.keras.ConstantSparsity(0.5, 0),
        'block_size': (1, 1),
        'block_pooling_type': 'AVG'
    }

    model_for_pruning = prune_low_magnitude(base_model, **pruning_params)
    return model_for_pruning

However, when I compile and train the model, the gzipped file is almost the same size as the non-pruned model, around 8 MB. I'm not sure what I'm doing wrong.

Hi @stepan_zalis, the pruning function wraps a tf.keras model or layer with pruning functionality, which sparsifies the layer’s weights during training. The model size will therefore only shrink after the pruned model has been trained. For example, I have a model

import tempfile

import numpy as np
import tensorflow as tf
from tensorflow import keras

# Load MNIST dataset
mnist = keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Normalize the input image so that each pixel value is between 0 and 1.
train_images = train_images / 255.0
test_images = test_images / 255.0

# Define the model architecture.
model = keras.Sequential([
  keras.layers.InputLayer(input_shape=(28, 28)),
  keras.layers.Reshape(target_shape=(28, 28, 1)),
  keras.layers.Conv2D(filters=12, kernel_size=(3, 3), activation='relu'),
  keras.layers.MaxPooling2D(pool_size=(2, 2)),
  keras.layers.Flatten(),
  keras.layers.Dense(10)
])

# Train the digit classification model
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(train_images, train_labels, epochs=4, validation_split=0.1)

and save the model

_, keras_file = tempfile.mkstemp('.h5')
tf.keras.models.save_model(model, keras_file, include_optimizer=False)
print('Saved baseline model to:', keras_file)

After training and saving, I wrapped the model for pruning

import tensorflow_model_optimization as tfmot

prune_low_magnitude = tfmot.sparsity.keras.prune_low_magnitude
batch_size = 128
epochs = 2
validation_split = 0.1
num_images = train_images.shape[0] * (1 - validation_split)
end_step = np.ceil(num_images / batch_size).astype(np.int32) * epochs


pruning_params = {
      'pruning_schedule': tfmot.sparsity.keras.PolynomialDecay(initial_sparsity=0.50,
                                                               final_sparsity=0.80,
                                                               begin_step=0,
                                                               end_step=end_step)
}

model_for_pruning = prune_low_magnitude(model, **pruning_params)

# `prune_low_magnitude` requires a recompile.
model_for_pruning.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
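To make the PolynomialDecay schedule above more concrete, here is a minimal sketch of how the target sparsity could evolve between begin_step and end_step. The helper name polynomial_decay_sparsity is hypothetical, and the exponent of 3 is my understanding of the library default, so treat both as assumptions. The real schedule also updates the masks only every few steps, which this sketch ignores.

```python
def polynomial_decay_sparsity(step, initial_sparsity, final_sparsity,
                              begin_step, end_step, power=3):
    """Target sparsity at `step` for a polynomial decay schedule (sketch)."""
    # Fraction of the pruning window completed so far, clamped to [0, 1].
    progress = (step - begin_step) / (end_step - begin_step)
    progress = min(1.0, max(0.0, progress))
    # Sparsity rises quickly at first and flattens out towards end_step.
    return final_sparsity + (initial_sparsity - final_sparsity) * (1.0 - progress) ** power


# Values from the example above: 54,000 training images, batch size 128, 2 epochs.
end_step = 422 * 2  # ceil(54000 / 128) = 422 steps per epoch
for step in (0, end_step // 2, end_step):
    sparsity = polynomial_decay_sparsity(step, 0.50, 0.80, 0, end_step)
    print(step, round(sparsity, 4))
```

The point is that the weights are not sparse right after wrapping the model: the target sparsity is only reached as training steps go by.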

and save the pruned model (before training it with pruning enabled)

model_for_export = tfmot.sparsity.keras.strip_pruning(model_for_pruning)

_, pruned_keras_file = tempfile.mkstemp('.h5')
tf.keras.models.save_model(model_for_export, pruned_keras_file, include_optimizer=False)
print('Saved pruned Keras model to:', pruned_keras_file)

Now let's compare the size of the trained baseline model and the pruned but not yet retrained model

Size of gzipped baseline Keras model: 78146.00 bytes
Size of gzipped pruned Keras model: 78146.00 bytes

You can see that the trained model and the pruned, not yet retrained model are the same size.
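The code that produced the size figures is not shown in this thread. A minimal sketch of a helper that could produce them, assuming the sizes are measured by compressing the saved .h5 file with DEFLATE (the function name get_gzipped_model_size is an assumption here):

```python
import os
import tempfile
import zipfile


def get_gzipped_model_size(file_path):
    """Return the size in bytes of `file_path` after DEFLATE compression."""
    # Reserve a temporary .zip path and close the handle so ZipFile can reopen it.
    fd, zipped_file = tempfile.mkstemp('.zip')
    os.close(fd)
    try:
        with zipfile.ZipFile(zipped_file, 'w', compression=zipfile.ZIP_DEFLATED) as f:
            f.write(file_path, arcname=os.path.basename(file_path))
        return os.path.getsize(zipped_file)
    finally:
        # Clean up the temporary archive whether or not compression succeeded.
        os.remove(zipped_file)


# Hypothetical usage with the files saved above:
# print('Size of gzipped baseline Keras model: %.2f bytes' % get_gzipped_model_size(keras_file))
# print('Size of gzipped pruned Keras model: %.2f bytes' % get_gzipped_model_size(pruned_keras_file))
```

Comparing compressed sizes matters because pruning only sets weights to zero. The uncompressed file keeps the same number of values, so the saving only shows up after compression.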

Now let's train the pruned model

logdir = tempfile.mkdtemp()

callbacks = [
  tfmot.sparsity.keras.UpdatePruningStep(),
  tfmot.sparsity.keras.PruningSummaries(log_dir=logdir),
]
  
model_for_pruning.fit(train_images, train_labels,
                  batch_size=batch_size, epochs=epochs, validation_split=validation_split,
                  callbacks=callbacks)

and save the trained pruned model

model_for_export = tfmot.sparsity.keras.strip_pruning(model_for_pruning)

_, pruned_keras_file = tempfile.mkstemp('.h5')
tf.keras.models.save_model(model_for_export, pruned_keras_file, include_optimizer=False)
print('Saved pruned Keras model to:', pruned_keras_file)

Now let's compare the size of the trained baseline model and the trained pruned model

Size of gzipped baseline Keras model: 78146.00 bytes
Size of gzipped pruned Keras model: 25820.00 bytes

You can see the size difference between the trained baseline model and the trained pruned model: the gzipped file is roughly three times smaller.
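As a toy illustration of why the gzipped size only drops once the weights have actually been zeroed, the sketch below prunes a list of random values by magnitude and compares compressed sizes. It uses only the standard library and is not how tfmot works internally; the helper names zero_smallest_weights and compressed_size are hypothetical.

```python
import random
import struct
import zlib


def zero_smallest_weights(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    num_to_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:num_to_prune]:
        pruned[i] = 0.0
    return pruned


def compressed_size(weights):
    """Size in bytes of the float32-packed weights after DEFLATE compression."""
    raw = struct.pack('%df' % len(weights), *weights)
    return len(zlib.compress(raw))


random.seed(0)
dense = [random.gauss(0.0, 1.0) for _ in range(10000)]
sparse = zero_smallest_weights(dense, sparsity=0.8)

dense_size = compressed_size(dense)
sparse_size = compressed_size(sparse)
print('dense:', dense_size, 'bytes; 80% sparse:', sparse_size, 'bytes')
```

Both lists hold the same number of values, but long runs of zeros compress far better than random floats. This is the same effect seen in the gzipped model sizes above.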

Please refer to this gist for a working code example. Thank You!