Same tflite classification model gives different output when run in Python and Kotlin

I have a Keras sequential neural network classification model that I converted to a TFLite model so I can run inference in an Android app I am developing in Kotlin. The model takes 39 float features and returns a one-hot style 0/1 output across 22 classes. It is a human activity recognition model that uses accelerometer x, y, z data converted to features. When I run the TFLite model in Python, I get different results than when I run it in Kotlin, even with the exact same inputs.

Here is the Python code with some sample data. I get class 1 in Python and class 4 in Kotlin. Can anyone tell me why I would get different classes as output?

import numpy as np
import tensorflow as tf

# Manual debugging of the Android code: run the same 39 features through the
# original Keras model (activity_model) for comparison.
test_features = np.array([15.121523, 134.00166, 118.880135, 16.775457, 7.369391, 54.307922, 2.4611068, 0.7021117, 0.5910131, -6.358347, 131.46619, 137.82454, -5.3617373, 8.590351, 73.79413, 0.5156907, 3.3576744, 0.5754766, -24.99669, -6.4506474, 18.546043, -6.7344146, 1.1799959, 1.3923903, 0.36815187, 0.14375915, 0.1254036, -14.355868, 6.9462624, 21.30213, -13.576304, 1.4869289, 2.2109578, 1.4814577, 0.051293932, 0.19250752, -0.9708634, 0.855468, -0.93450075])
test_features_reshaped = test_features.reshape(1, 39)
this_activity = activity_model.predict(test_features_reshaped)

# Load TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path="MyNeuralNetModel.tflite")
interpreter.allocate_tensors()

# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Run the TFLite model on the same test features.
input_shape = input_details[0]['shape']
input_data = np.array(test_features_reshaped, dtype=np.float32)
interpreter.set_tensor(input_details[0]['index'], input_data)

interpreter.invoke()

# The function `get_tensor()` returns a copy of the tensor data.
# Use `tensor()` in order to get a pointer to the tensor.
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)

Here is the Kotlin code. I have tried running with and without the nativeOrder() call, and in both cases the results differ from Python.

var byteBuffer: ByteBuffer = ByteBuffer.allocateDirect(4 * 39)
byteBuffer.order(ByteOrder.nativeOrder())

for (obj in combinedFeatureObjects) {
    // Pack the 39 float features into the byte buffer.
    byteBuffer = assembleByteBuffer(byteBuffer, obj.accelNormFeatures)
    byteBuffer = assembleByteBuffer(byteBuffer, obj.accelXFeatures)
    byteBuffer = assembleByteBuffer(byteBuffer, obj.accelYFeatures)
    byteBuffer = assembleByteBuffer(byteBuffer, obj.accelZFeatures)
    byteBuffer.putFloat(obj.corrxy.toFloat())
    byteBuffer.putFloat(obj.corrxz.toFloat())
    byteBuffer.putFloat(obj.corryz.toFloat())

    // Run inference through the generated model wrapper.
    val model = MyNeuralNetModel.newInstance(context)
    val inputFeature0 = TensorBufferFloat.createFixedSize(intArrayOf(1, 39), DataType.FLOAT32)
    inputFeature0.loadBuffer(byteBuffer)
    val outputs = model.process(inputFeature0)
    val outputFeature0 = outputs.outputFeature0AsTensorBuffer.floatArray
    model.close()

    // Map the model output to an activity number.
    var activityNumber: Int? = 25
    for ((index, value) in outputFeature0.withIndex()) {
        if (value > .4999f) {
            activityNumber = index
            break
        } else if (value > 0) {
            activityNumber = index * 100
        }
    }
    if (activityNumber == 25) {
        var x = 0  // breakpoint marker: no class exceeded the threshold
    }
}

Hi @Cathy_B ,

The conversion to the ByteBuffer, and then from the output buffer to outputFeature0, is probably where things go wrong. Unfortunately I cannot debug it with only the information above.
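
As a quick sanity check (just a sketch; it assumes assembleByteBuffer only appends floats to the same buffer), you can verify that the buffer really holds 39 floats and is positioned at the start before handing it to loadBuffer:

// A (1, 39) FLOAT32 input needs exactly 4 * 39 = 156 bytes.
check(byteBuffer.position() == 4 * 39) {
    "Expected ${4 * 39} bytes in the input buffer, found ${byteBuffer.position()}"
}
byteBuffer.rewind()  // read from the beginning when loading into the TensorBuffer
inputFeature0.loadBuffer(byteBuffer)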

You can feed a TensorFlow Lite Interpreter directly with an array of floats instead of a ByteBuffer and get an array back as the output. The inference you are trying to do is not time-consuming, and I think that with arrays instead of byte buffers the whole procedure will take less than 10 milliseconds.
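
Roughly like this (a sketch only, not your exact setup; it assumes the .tflite file is in the assets folder, that you already have the 39 features in a FloatArray called features, and that the output shape is (1, 22)):

import org.tensorflow.lite.Interpreter
import org.tensorflow.lite.support.common.FileUtil

// Load the model from assets and create a plain Interpreter.
val modelBuffer = FileUtil.loadMappedFile(context, "MyNeuralNetModel.tflite")
val interpreter = Interpreter(modelBuffer)

// Input shape (1, 39), output shape (1, 22): plain nested float arrays.
val input = arrayOf(features)
val output = arrayOf(FloatArray(22))
interpreter.run(input, output)

// The predicted class is simply the index with the highest score.
val predictedClass = output[0].indices.maxByOrNull { output[0][it] }
interpreter.close()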

Check a detailed example with the TensorFlow Lite Interpreter, an input array, and an output array here.

Regards

Thank you for your answer! I implemented your approach using a float array as the input to the TensorFlow Lite model in Android and found that I had previously been feeding my features into the byte buffer incorrectly, passing only 38 features instead of 39. The byte buffer didn't throw an error, which is why I wasn't able to catch it, but adding the features to the float array made the issue obvious.
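
For anyone hitting the same thing, the float-array version looks roughly like this (a simplified sketch; it assumes each feature group is a FloatArray):

// Concatenate the feature groups into a single FloatArray.
val features = obj.accelNormFeatures + obj.accelXFeatures +
        obj.accelYFeatures + obj.accelZFeatures +
        floatArrayOf(obj.corrxy.toFloat(), obj.corrxz.toFloat(), obj.corryz.toFloat())

// Unlike the ByteBuffer, the array length makes a missing feature obvious.
require(features.size == 39) { "Expected 39 features, got ${features.size}" }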
