Manually modify the weights of TF Lite models

James_He · June 16, 2021, 4:55pm

After post-training quantization, is it possible to change the dense-layer weights in TF Lite models?
An example of what I would like to do:

  import numpy as np
  import tensorflow as tf

  interpreter = tf.lite.Interpreter(model_path=Flags.tfl_file_name)
  interpreter.allocate_tensors()
  tensor_details = interpreter.get_tensor_details()
  weight_idx = 0
  for tensor in tensor_details:
    if tensor['name'] == 'sequential/dense/MatMul':
      weight_shape = tensor['shape']
      weight_idx = tensor['index']
      # Replace the dense-layer weights with zeros of the same shape
      weight = np.zeros(weight_shape, dtype='int8')
      print(weight)
      interpreter.set_tensor(weight_idx, weight)

This feature is needed for my hardware-accelerated Fully_Connected kernel.

uint8_t · June 9, 2023, 3:19am

Hello,

I know that it has been two years since this was asked, but I have been researching the matter for quite a while now and can provide some help.

The answer to the question is probably no: you can’t directly modify weight tensors, at least not through the Interpreter interface. A workaround that I have come up with is to read the tflite file into a Python object with tensorflow.lite.tools.flatbuffer_utils.read_model_with_mutable_tensors, modify the data buffers through model.buffers, write the model back to a file with tensorflow.lite.tools.flatbuffer_utils.write_model, and then load that file into an Interpreter once again. You can probably initialize the Interpreter from the in-memory content instead, but I don’t know what the model_content argument expects.
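
On the in-memory route: as far as I can tell, model_content takes the serialized flatbuffer bytes rather than a path, and flatbuffer_utils has a convert_object_to_bytearray function that produces them from the model object. The following is only a rough sketch under that assumption; the helper name interpreter_from_model_object is made up for illustration, and I have not tried this across TensorFlow versions.

```python
def interpreter_from_model_object(model):
    """Build an Interpreter from a flatbuffer model object without writing a file.

    Hypothetical helper: `model` is assumed to be the object returned by
    flatbuffer_utils.read_model_with_mutable_tensors.
    """
    # Imported here so that defining the helper does not itself require TensorFlow.
    import tensorflow as tf
    from tensorflow.lite.tools import flatbuffer_utils

    # Serialize the (possibly modified) object back into flatbuffer bytes.
    model_bytes = flatbuffer_utils.convert_object_to_bytearray(model)

    # model_content receives the serialized model itself, not a file name.
    interpreter = tf.lite.Interpreter(model_content=bytes(model_bytes))
    interpreter.allocate_tensors()
    return interpreter
```

If this works in your setup, it would replace the write_model call and the second file read in the workaround.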

Here’s an example of the file-based workaround:

import tensorflow as tf
from tensorflow.lite.tools.flatbuffer_utils import read_model_with_mutable_tensors, write_model

model_path = "./model.tflite"
model = read_model_with_mutable_tensors(model_path)
for buf in model.buffers:
  if buf.data is not None:   # some buffers carry no data
    for i in range(len(buf.data)):
      buf.data[i] ^= (1 << 0)   # flip the least significant bit of each byte

# Write the modified model back once, then reload it
write_model(model, model_path)
interpreter = tf.lite.Interpreter(model_path=model_path)
interpreter.allocate_tensors()
...
# Here, you can run inference on the modified model
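
The loop above touches every buffer in the file. To change only one layer, as in the original question, you would look the tensor up by name and overwrite just the buffer it points to. Here is a minimal sketch of that idea. The helper name set_int8_weights is hypothetical, and it assumes the weights are stored as int8 and that the model object exposes subgraphs, tensors (with name and buffer fields) and buffers, as the flatbuffer object appears to.

```python
import numpy as np

def set_int8_weights(model, tensor_name, new_weights):
    """Overwrite the data buffer behind `tensor_name` with int8 values.

    Hypothetical helper. Returns the index of the buffer that was changed.
    """
    for subgraph in model.subgraphs:
        for tensor in subgraph.tensors:
            # Names may come back as bytes from the flatbuffer object.
            name = tensor.name.decode("utf-8") if isinstance(tensor.name, bytes) else tensor.name
            if name != tensor_name:
                continue
            buf = model.buffers[tensor.buffer]
            if buf.data is None:
                raise ValueError(f"tensor {tensor_name!r} has no constant data")
            # Buffers hold raw bytes, so reinterpret the int8 values as uint8.
            raw = np.array(new_weights, dtype=np.int8).reshape(-1).view(np.uint8)
            if len(raw) != len(buf.data):
                raise ValueError(
                    f"expected {len(buf.data)} int8 values, got {len(raw)}"
                )
            buf.data = raw
            return tensor.buffer
    raise KeyError(tensor_name)
```

You would call this on the object returned by read_model_with_mutable_tensors and then use write_model as before. Note that the weight tensor's name inside the flatbuffer may not be 'sequential/dense/MatMul'; printing the tensor names first is the safest way to find it.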