Detect objects in a video with TensorFlow Lite

I’m trying to use TensorFlow Lite to detect objects in a video stored on an Android phone.
My approach is to extract all the frames of the video, convert them to a list of bitmaps, and run detection on each frame like a picture. I learned that MediaMetadataRetriever can get video frames, so I tried:

    val mmr = MediaMetadataRetriever()
    val file = // (file path omitted in the original post)
    if (file.exists()) {
        mmr.setDataSource(file.path)
        val bitmaps = mmr.getFramesAtIndex(0, 1000)
        thread {
            bitmaps.forEach { bm ->
                runOnUiThread { runObjectDetection(bm) }
            }
        }
    }

runObjectDetection is the same as in this example: Build and deploy a custom object detection model with TensorFlow Lite (Android)

However, it seems that getFramesAtIndex can only return a limited number of frames; the program stops working when numFrames is greater than 100.

It looks like this is not a good solution. What I actually want is not to run detection on every frame of the video, but to detect only every few frames, with the rectangle from the last detection staying on the video until the next detection.
Is there a better solution?

You’re using the right approach of reading frames from the video. However, you should use getFrameAtIndex (Frame without an ‘s’) to read only the frames that you’re interested in, one frame at a time, rather than loading many frames at once. Frames are loaded as Bitmap objects, which consume a lot of memory, so loading too many frames into memory at once may cause your app to go OOM (out of memory).
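To get a feel for why loading many frames at once runs out of memory, you can estimate the bitmap footprint: a decoded ARGB_8888 bitmap uses 4 bytes per pixel. This plain-Kotlin sketch uses a made-up 1080p resolution and the 100-frame count from the question:

```kotlin
// Rough memory estimate for decoded video frames held in memory at once.
// Each ARGB_8888 pixel takes 4 bytes.
fun bitmapBytes(width: Int, height: Int): Long = width.toLong() * height * 4

fun totalMegabytes(width: Int, height: Int, frames: Int): Long =
    bitmapBytes(width, height) * frames / (1024 * 1024)

fun main() {
    // 100 frames of 1920x1080 video is roughly 791 MB,
    // far beyond a typical Android app heap.
    println(totalMegabytes(1920, 1080, 100))
}
```

That is why reading one frame, processing it, and letting it be garbage-collected before reading the next is the safer pattern.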

You should also check the video’s frame rate to decide what interval you’d want to use when reading frames from the video file. See this answer on StackOverflow to learn how to get the fps value from a video file.
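Once you have the fps, the interval is simple arithmetic. A plain-Kotlin sketch (the target detections-per-second value is an example number you’d tune to your device’s inference speed):

```kotlin
// Given the video's frame rate and how many detections per second your
// device can afford, compute how many frames to skip between detections.
fun detectionInterval(videoFps: Double, targetDetectionsPerSecond: Double): Int =
    maxOf(1, (videoFps / targetDetectionsPerSecond).toInt())

fun main() {
    // A 30 fps video with ~7.5 detections/s -> run the detector every 4th frame.
    println(detectionInterval(30.0, 7.5))
}
```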

Hope this helps!


Thanks for your answer!

I have tried using getFrameAtIndex, and the program can detect one frame per run. But if I only detect one frame at a time, I have no idea how to connect the frames back together like a video. I tried to use a for loop, but it doesn’t work.

And if I just use runObjectDetection(), how do I make the last detected rectangle remain on the screen until the next detected frame?

In addition, I want to perform real-time detection using the camera. I have completed the “Implement a Preview use case” step with reference to Getting Started with CameraX  |  Android Developers, but I don’t understand how to use the “Implement ImageAnalysis use case” step to run runObjectDetection() on the camera frames.

Could you give me some advice? Thank you.

You can create a for loop to:

  • get the frames
  • run object detection only every X frames and store the result in a variable like latestDetectionResult
  • draw the bounding box on every frame using latestDetectionResult
  • then show the bitmap with the bounding box in an ImageView.
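The “detect every X frames, reuse the last result” logic in the steps above can be sketched without any Android types. Here a frame is just its index, and the function records which frame’s detection result would be drawn on each frame:

```kotlin
// For each frame index, record which frame's detection result gets drawn
// on it when the detector only runs every `interval` frames.
fun resultSourcePerFrame(totalFrames: Int, interval: Int): List<Int> {
    var latestDetectedFrame = 0          // frame 0 is detected up front
    val sources = mutableListOf<Int>()
    for (i in 0 until totalFrames) {
        if (i % interval == 0) latestDetectedFrame = i  // fresh detection
        sources.add(latestDetectedFrame)                // reuse latest result
    }
    return sources
}

fun main() {
    // Frames 0-3 draw frame 0's boxes, frames 4-7 draw frame 4's boxes, ...
    println(resultSourcePerFrame(10, 4))
}
```

In the real app, latestDetectedFrame becomes the stored detection result and the drawing step runs on every frame.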

I tried to detect every 4 frames. I modified runObjectDetection so that it does not display the result in the ImageView but instead returns resultToDisplay.
My code:

    val bitmap = mmr.getFrameAtIndex(0)
    var detectResult = bitmap?.let { runObjectDetection(it) }
    for (i in 1..f) { // f is the total number of frames
        val frame = mmr.getFrameAtIndex(i)
        if (i % 4 == 0) {
            detectResult = frame?.let { runObjectDetection(it) }
        }
        val latestDetectionResult = detectResult
        val imgWithResult =
            latestDetectionResult?.let { drawDetectionResult(frame!!, it) }
        runOnUiThread {
            imageView.setImageBitmap(imgWithResult)
        }
    }
However, the for loop doesn’t update the ImageView. Is it correct to use runOnUiThread?

And how can I keep the ImageView playback close to the original video’s frame rate?
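One common way to stay near the source frame rate is to give each frame a fixed time budget (1000 ms / fps) and sleep away whatever the detection and drawing didn’t use. This is a plain-Kotlin sketch of that arithmetic; in an Android app you’d run it on a worker thread (or use a coroutine delay) rather than blocking the UI thread:

```kotlin
// Time budget per frame, in milliseconds, for a given frame rate.
fun frameBudgetMillis(fps: Double): Long = (1000.0 / fps).toLong()

// Sleep for the remainder of the frame's budget after processing it.
fun sleepRemainder(fps: Double, elapsedMillis: Long) {
    val remaining = frameBudgetMillis(fps) - elapsedMillis
    if (remaining > 0) Thread.sleep(remaining)
}

fun main() {
    // At 30 fps each frame has a 33 ms budget.
    println(frameBudgetMillis(30.0))
}
```

If detection regularly takes longer than the budget, the every-X-frames interval needs to grow; the loop can’t display frames faster than it can produce them.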

Thank you!