After post-training quantization, is it possible to change the dense-layer weights in TF Lite models?
An example of what I would like to do:
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path=Flags.tfl_file_name)
tensor_details = interpreter.get_tensor_details()
weight_idx = 0
for tensor in tensor_details:
    if tensor['name'] == 'sequential/dense/MatMul':
        weight_shape = tensor['shape']
        weight_idx = tensor['index']
# Read the current weights, then replace them with new values (here, zeros)
weight = interpreter.get_tensor(weight_idx)
weight = np.zeros(weight_shape, dtype='int8')
This feature is needed for my hardware-accelerated Fully_Connected kernel.
I know it has been two years since this was asked, but I have been researching the matter for quite a while now and can provide some help.
The answer to the question is probably no: you can't directly modify intermediate weight tensors, at least not through the Interpreter interface.
A workaround that I have come up with is to use the tensorflow.lite.tools.flatbuffer_utils.read_model_with_mutable_tensors function to read the tflite file into a Python object, access the data buffers through model.buffers, modify them, write the model back to a file using tensorflow.lite.tools.flatbuffer_utils.write_model, and then load the rewritten file through an Interpreter once again.
(You can probably initialize the Interpreter from the in-memory content instead, but I don't know what the model_content argument expects.)
Here’s an example:
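import tensorflow as tf
from tensorflow.lite.tools.flatbuffer_utils import read_model_with_mutable_tensors, write_model

model_path = "./model.tflite"
model = read_model_with_mutable_tensors(model_path)
for buf in model.buffers:
    # Some buffers carry no data, so skip them
    if buf.data is not None:
        for i in range(len(buf.data)):
            buf.data[i] ^= (1 << 0)  # flip the least significant bit of each byte
# Write the modified model back to disk (this overwrites the original file)
write_model(model, model_path)
interpreter = tf.lite.Interpreter(model_path=model_path)
# Here, you can run inference on the modified model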
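The bit flip above treats each buffer as raw bytes. To write specific int8 weights instead, as the question asks, each signed value has to be stored as its two's-complement byte. Below is a minimal sketch of that step, assuming the buffer holds one byte per weight in row-major order. The helper names int8_to_byte and write_int8_weights are hypothetical (they are not part of TensorFlow), and a plain bytearray stands in for one buf.data entry so the snippet runs without a model file.

```python
def int8_to_byte(value):
    """Encode a signed 8-bit integer as its two's-complement byte (0..255)."""
    if not -128 <= value <= 127:
        raise ValueError(f"{value} does not fit in int8")
    return value & 0xFF


def write_int8_weights(buffer_data, weights):
    """Overwrite a mutable byte buffer in place with a 2-D list of int8 weights.

    Assumption: one byte per weight, stored row by row (row-major).
    """
    flat = [w for row in weights for w in row]
    if len(flat) != len(buffer_data):
        raise ValueError("weight count does not match buffer size")
    for i, w in enumerate(flat):
        buffer_data[i] = int8_to_byte(w)


# Stand-in for one buf.data entry: a 2x3 weight matrix, initially all zeros.
data = bytearray(6)
write_int8_weights(data, [[1, -1, 0], [127, -128, 5]])
print(list(data))  # [1, 255, 0, 127, 128, 5]
```

The size check is there because writing a differently shaped matrix into the buffer would silently corrupt the model; which buffer belongs to which tensor still has to be looked up in the model object itself.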