TFLite model gives different outputs in Python and C++

Hello,
We are trying to port our TFLite model inference from Python to C++ and we observe different results between the two for the same inputs.
We have followed the examples from the documentation with our model.

Our code in Python is like this:

import tensorflow as tf
import numpy as np

tf_image = np.loadtxt('input_tensor_00.txt')
tf_image = tf_image.reshape(1, 3, 100, 100).astype(np.float32)

interpreter = tf.lite.Interpreter('emotions_resnet18.tflite')
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

interpreter.set_tensor(input_details[0]['index'], tf_image)
interpreter.invoke()

output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)

and with two different inputs we get the expected results of:

[[0.10489367 0.10680896 0.14752425 0.19949542 0.04391989 0.17851768
  0.21884018]]

and

[[0.09514795 0.11852694 0.12012428 0.21183679 0.06756472 0.17618668
  0.21061271]]

In C++ we have ported it like this:

#include <cassert>
#include <cstring>
#include <fstream>
#include <iostream>
#include <iterator>
#include <memory>
#include <vector>

#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"

int main(int argc, char* argv[]) 
{
    std::unique_ptr<tflite::FlatBufferModel> model = tflite::FlatBufferModel::BuildFromFile("test_resnet18.tflite");

    // Build the interpreter
    tflite::ops::builtin::BuiltinOpResolver resolver;
    std::unique_ptr<tflite::Interpreter> interpreter;
    tflite::InterpreterBuilder(*model, resolver)(&interpreter);

    // Read the raw input values (1 x 3 x 100 x 100 floats) from the text file
    std::ifstream is("input_tensor_00.txt");
    std::istream_iterator<float> start(is), end;
    std::vector<float> input_data(start, end);
    std::cout << "Read " << input_data.size() << " face data." << std::endl;
    assert(100*100*3 == input_data.size());
    
    interpreter->AllocateTensors();
    int input_tensor_idx = 0;
    int input = interpreter->inputs()[input_tensor_idx];
    float* input_data_ptr = interpreter->typed_tensor<float>(input);
    std::cout << input << " : " << interpreter->tensors_size() << " : " << input_data_ptr << std::endl;

    memcpy(input_data_ptr, input_data.data(), 100*100*3);
    
    interpreter->Invoke();

    int output_tensor_idx = 0;
    int output = interpreter->outputs()[output_tensor_idx];
    float* output_data_ptr = interpreter->typed_tensor<float>(output);
    std::cout << output << " : " << interpreter->tensors_size() << " : " << output_data_ptr << std::endl;

    // The model outputs 7 values (one per emotion class)
    std::vector<float> output_vec(output_data_ptr, output_data_ptr + 7);
    std::cout << "Face:" << std::endl;
    for(size_t i = 0; i < output_vec.size(); i++)
        std::cout << i << " : "  << output_vec[i] << std::endl;

    return 0;
}

And independently of the input tensor, we always get:

INFO: Initialized TensorFlow Lite runtime.
Read 30000 face data.
INFO: Applying 1 TensorFlow Lite delegate(s) lazily.
0 : 185 : 000001F07531C080
149 : 185 : 000001F075339540
Face:
0 : 0
1 : 0
2 : 0
3 : 0
4 : 0
5 : 0
6 : 1

Does anyone have any idea what we are doing wrong?
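
In case it helps with the comparison, here is a small helper we could add (just a sketch, the helper name is ours) that dumps the type, byte size, and shape of a tensor so they can be checked against the Python input_details / output_details:

void print_tensor_info(const tflite::Interpreter& interpreter, int tensor_idx)
{
    // Dump type, byte size and shape of one tensor for comparison with Python
    const TfLiteTensor* t = interpreter.tensor(tensor_idx);
    std::cout << "tensor " << tensor_idx
              << " type=" << TfLiteTypeGetName(t->type)
              << " bytes=" << t->bytes
              << " dims=[";
    for (int i = 0; i < t->dims->size; ++i)
        std::cout << t->dims->data[i] << (i + 1 < t->dims->size ? "," : "");
    std::cout << "]" << std::endl;
}

Called with the input and output indices from the code above, it should report a float32 type, 120000 bytes (30000 floats) and a [1, 3, 100, 100] shape for the input, and 28 bytes with a [1, 7] shape for the output, if everything matches the Python side.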

We have also tried an example with MobileNet, in which the input/output tensors were of type unsigned char, and in that case it worked fine.
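
The input copy in that test was essentially the following (a reconstructed sketch; the 224x224x3 shape is the standard quantized MobileNet input and is only an assumption here):

    // uint8 input copy from the MobileNet test (sketch). With unsigned char,
    // sizeof(unsigned char) == 1, so the element count and the byte count
    // passed to memcpy are the same number.
    std::vector<unsigned char> image(224*224*3);
    // ... image filled from file ...
    int input = interpreter->inputs()[0];
    unsigned char* input_ptr = interpreter->typed_tensor<unsigned char>(input);
    memcpy(input_ptr, image.data(), image.size() * sizeof(unsigned char));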

Thank you very much for your help.