Creating float tensors from BufferedImage in Java/Kotlin

Continuing a thread started on Gitter:

Hello, I want to run a TensorFlow model I found with a Java app, but I am having difficulty getting the input just right. Below you can see the result from the layer analysis. I found a few examples for one-dimensional input (MNIST) and I got another model working that required integers, but creating a Tensor with dimensions {batch, height, width, channels} is a difficult task. I would like some help. The input is just a JPG, basically a BufferedImage, as I want to keep my options open.

TF Java users often look for a snippet showing how this can be done easily, so I’m sharing one here written in Kotlin (warning: I did not test it after modifying it, but the logic should be sound):

    fun preprocess(sourceImages: List<BufferedImage>, imageHeight: Int, imageWidth: Int, imageChannels: Int): TFloat32 {
        val imageShape = Shape.of(sourceImages.size.toLong(), imageHeight.toLong(), imageWidth.toLong(), imageChannels.toLong())
        
        return TFloat32.tensorOf(imageShape) { tensor ->
           
            // Copy all images to the tensor
            sourceImages.forEachIndexed { imageIdx, sourceImage ->
                
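                // Scale the image and/or convert it to 3-byte BGR if needed, so the data buffer
                // always holds one byte per channel in B, G, R order
                val image = if (sourceImage.width != imageWidth || sourceImage.height != imageHeight || sourceImage.type != BufferedImage.TYPE_3BYTE_BGR) {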
                    val scaledImage = BufferedImage(imageWidth, imageHeight, BufferedImage.TYPE_3BYTE_BGR)
                    scaledImage.createGraphics().apply {
                        setRenderingHint(RenderingHints.KEY_INTERPOLATION, RenderingHints.VALUE_INTERPOLATION_NEAREST_NEIGHBOR)
                        drawImage(sourceImage, 0, 0, imageWidth, imageHeight, null)
                        dispose()
                    }
                    scaledImage
                } else {
                    sourceImage
                }
                
                // Convert the image to floats and normalize by subtracting mean values
                // (the loop below assumes imageChannels == 3)
                val buffer = image.data.dataBuffer // image.data copies the raster, so fetch it only once
                var i = 0
                for (h in 0L until imageHeight) {
                    for (w in 0L until imageWidth) {
                        // "caffe"-style normalization, channels in B, G, R order
                        tensor.setFloat(buffer.getElemFloat(i++) - 103.939f, imageIdx.toLong(), h, w, 0)
                        tensor.setFloat(buffer.getElemFloat(i++) - 116.779f, imageIdx.toLong(), h, w, 1)
                        tensor.setFloat(buffer.getElemFloat(i++) - 123.68f, imageIdx.toLong(), h, w, 2)
                    }
                }
            }
        }
    }

So the idea is simply to resample your image if it is not already the right size and to normalize its pixel values while filling the tensor. The “caffe”-style normalization is the one Keras uses by default in Python, so the mean values to subtract were taken directly from the Keras sources. They are listed in B, G, R order, which matches the byte layout of TYPE_3BYTE_BGR.

UPDATE: here’s the Java version:

    TFloat32 preprocess(List<BufferedImage> sourceImages, int imageHeight, int imageWidth, int imageChannels) {
        Shape imageShape = Shape.of(sourceImages.size(), imageHeight, imageWidth, imageChannels);
        
        return TFloat32.tensorOf(imageShape, tensor -> {
            // Copy all images to the tensor
            int imageIdx = 0;
            for (BufferedImage sourceImage : sourceImages) {
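                // Scale the image and/or convert it to 3-byte BGR if needed, so the data buffer
                // always holds one byte per channel in B, G, R order
                BufferedImage image;
                if (sourceImage.getWidth() != imageWidth || sourceImage.getHeight() != imageHeight
                        || sourceImage.getType() != BufferedImage.TYPE_3BYTE_BGR) {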
                    image = new BufferedImage(imageWidth, imageHeight, BufferedImage.TYPE_3BYTE_BGR);
                    Graphics2D graphics = image.createGraphics();
                    graphics.setRenderingHint(RenderingHints.KEY_INTERPOLATION, RenderingHints.VALUE_INTERPOLATION_NEAREST_NEIGHBOR);
                    graphics.drawImage(sourceImage, 0, 0, imageWidth, imageHeight, null);
                    graphics.dispose();
                } else {
                    image = sourceImage;
                }

                // Convert the image to floats and normalize by subtracting mean values
                // (the loop below assumes imageChannels == 3)
                // getData() copies the raster, so fetch the java.awt.image.DataBuffer only once
                DataBuffer buffer = image.getData().getDataBuffer();
                int i = 0;
                for (long h = 0; h < imageHeight; ++h) {
                    for (long w = 0; w < imageWidth; ++w) {
                        // "caffe"-style normalization, channels in B, G, R order
                        tensor.setFloat(buffer.getElemFloat(i++) - 103.939f, imageIdx, h, w, 0);
                        tensor.setFloat(buffer.getElemFloat(i++) - 116.779f, imageIdx, h, w, 1);
                        tensor.setFloat(buffer.getElemFloat(i++) - 123.68f, imageIdx, h, w, 2);
                    }
                }
                ++imageIdx;
            }
        });
    }
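
The per-pixel conversion can also be checked on its own, without TensorFlow on the classpath. The following is only a minimal sketch using JDK classes: the class name `CaffePixels` and the method `toNormalizedBgr` are hypothetical names made up for this illustration, and it writes into a plain `float[]` in {height, width, channels} order instead of a tensor. It always redraws the source into a `TYPE_3BYTE_BGR` image, which is an assumption made here to keep the buffer layout predictable.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.awt.image.DataBuffer;

final class CaffePixels {
    // ImageNet channel means in B, G, R order (same values as in the snippets above)
    static final float[] BGR_MEANS = {103.939f, 116.779f, 123.68f};

    // Hypothetical helper: returns width * height * 3 floats in {height, width, channel} order
    static float[] toNormalizedBgr(BufferedImage source, int width, int height) {
        // Redrawing resizes the image if needed and forces a one-byte-per-channel BGR layout
        BufferedImage bgr = new BufferedImage(width, height, BufferedImage.TYPE_3BYTE_BGR);
        Graphics2D graphics = bgr.createGraphics();
        graphics.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_NEAREST_NEIGHBOR);
        graphics.drawImage(source, 0, 0, width, height, null);
        graphics.dispose();

        // The image was created just above, so its buffer is tightly packed: B, G, R, B, G, R, ...
        DataBuffer buffer = bgr.getRaster().getDataBuffer();
        float[] out = new float[width * height * 3];
        for (int i = 0; i < out.length; i++) {
            out[i] = buffer.getElemFloat(i) - BGR_MEANS[i % 3];
        }
        return out;
    }

    public static void main(String[] args) {
        // A 2x2 pure red image stored as packed RGB ints, i.e. not in BGR byte layout
        BufferedImage red = new BufferedImage(2, 2, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < 2; y++) {
            for (int x = 0; x < 2; x++) {
                red.setRGB(x, y, 0xFF0000);
            }
        }
        float[] values = toNormalizedBgr(red, 2, 2);
        System.out.println("B=" + values[0] + " G=" + values[1] + " R=" + values[2]);
    }
}
```

For a pure red pixel, the blue and green entries should come out as the negated means and the red entry as 255 minus its mean, which is a quick way to confirm that the channel order lines up with the mean values.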

Sorry, I can’t add links, but there’s also some example Java code in the tensorflow-java models GitHub repository. You need to drill down to the cnn FasterRcnnInception directory.

The example @Keith_Hall is referring to is here: java-models/tensorflow-examples/src/main/java/org/tensorflow/model/examples/cnn/fastrcnn (master branch of the tensorflow/java-models repository on GitHub)

Yes, this other example is also valid but takes a different approach: it uses TensorFlow to decode and resize the images. The goal of my previous example is to demonstrate how to do it with the image utilities that come with the JDK.
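
For the JDK-only route, the JPG from the original question first has to become a `BufferedImage`. A minimal sketch, assuming the file is readable by the JDK's built-in `ImageIO` plugins; the class name `JpegLoader` and its `load` method are hypothetical names used only for illustration:

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

final class JpegLoader {
    // Hypothetical helper: decodes an image file with the JDK's ImageIO
    static BufferedImage load(File file) throws IOException {
        BufferedImage image = ImageIO.read(file);
        if (image == null) {
            // ImageIO.read returns null when no registered reader can decode the input
            throw new IOException("No ImageIO reader could decode " + file);
        }
        return image;
    }
}
```

The resulting images can then be collected into a `List<BufferedImage>` and passed to `preprocess` from my initial post.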

I have no experience with Kotlin, but it does look like a step in the right direction. Karl, I would like to take you up on your offer to try and convert this to Java.

@James2026, please see my initial post above; I’ve added the same snippet in Java.
