Benchmarking: Jetson Nano vs. Coral Dev Board

Hello Community,

I’m very new to TensorFlow and to single-board computers such as the Jetson Nano or the Coral Dev Board. I am currently doing my master’s, and for one of my projects I am tasked with comparing the Nano to the Coral Dev Board. Before I start: please feel free to correct my vocabulary/terminology if I am talking nonsense or making no sense.

I have set up both boards but am somewhat stuck. I have plans but keep hitting a wall, and I’m hoping to get a few clues from experienced members.

I had to ditch my first approach, which was to write an algorithm (e.g. a large matrix multiplication) that runs at the “driver level”: via CUDA on the GPU of the NVIDIA board and via the TPU runtime on the Google board. The idea was to get the same algorithm running as natively as possible on both platforms, but I’m also a C#/C++ newbie, and I figured that writing CUDA kernels, let alone low-level tensor functions, is probably out of scope for me.

The next idea is to find ‘common’ applications and run them on both devices. There are some ImageNet neural-network architectures that are available for both the Jetson and the Coral Dev Board, but I wonder: are those even comparable? The Edge TPU architecture limits the Coral Dev Board to 8-bit integer precision models, whereas the Jetson Nano has no such restriction. Moreover, the Coral Dev Board can only run TensorFlow Lite models that have been quantized, whereas the Jetson Nano will not reach its full potential with plain TensorFlow (Lite or not does not matter) unless the model is optimized with TensorRT. At least that’s what my research is suggesting. Is this a valid benchmarking approach nevertheless?
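For what it’s worth, both boards can run the same .tflite file through the TensorFlow Lite interpreter; the Coral just adds an Edge TPU delegate on top. Below is a minimal, hedged sketch of the interpreter workflow — the in-memory toy model is only there to make the example self-contained, and the delegate path and `model_edgetpu.tflite` filename in the comment are assumptions based on Coral’s usual setup, not something from this thread:

```python
import numpy as np
import tensorflow as tf

# Build a trivial model in memory purely so this sketch is self-contained;
# in practice you would load your benchmark model from disk instead.
@tf.function(input_signature=[tf.TensorSpec(shape=[1, 4], dtype=tf.float32)])
def double(x):
    return x * 2.0

converter = tf.lite.TFLiteConverter.from_concrete_functions(
    [double.get_concrete_function()])
model_bytes = converter.convert()

interpreter = tf.lite.Interpreter(model_content=model_bytes)
# On the Coral you would instead load the Edge-TPU-compiled model and
# attach the delegate, roughly (paths/names are assumptions):
#   tf.lite.Interpreter(
#       model_path="model_edgetpu.tflite",
#       experimental_delegates=[
#           tf.lite.experimental.load_delegate("libedgetpu.so.1")])
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
interpreter.set_tensor(inp["index"],
                       np.array([[1.0, 2.0, 3.0, 4.0]], dtype=np.float32))
interpreter.invoke()  # one forward pass; this is what you would time
result = interpreter.get_tensor(out["index"])
print(result)  # -> [[2. 4. 6. 8.]]
```

The point is that the benchmarking loop itself can be identical on both devices; only the interpreter construction differs.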

The last idea is to just go for the “Hello World” of deep learning: take the MNIST dataset, train a specific TensorFlow architecture, and then convert/optimize it to work with TensorRT and TensorFlow Lite. Does this even make sense?
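If you go the MNIST route, the Coral side needs full-integer (int8) quantization before the Edge TPU compiler will accept the model. A hedged sketch of that conversion step using the standard TFLiteConverter API — the tiny stand-in network, the random representative data, and the output filename are all placeholders, not part of any real pipeline:

```python
import numpy as np
import tensorflow as tf

def convert_to_int8_tflite(model, rep_images, n_samples=100):
    """Convert a Keras model to a fully int8-quantized .tflite model.

    Full-integer quantization (weights *and* activations) is what the
    Edge TPU compiler requires as input.
    """
    def representative_dataset():
        for img in rep_images[:n_samples]:
            # One sample per yield, batch dim added; the converter expects
            # float32 inputs and derives the quantization ranges from them.
            yield [img[np.newaxis, ...].astype(np.float32)]

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    # Fail conversion if any op cannot be expressed in int8.
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.uint8
    converter.inference_output_type = tf.uint8
    return converter.convert()

# Tiny stand-in model; replace with your trained MNIST network.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10),
])
rep = np.random.rand(100, 28, 28, 1).astype(np.float32)
tflite_bytes = convert_to_int8_tflite(model, rep)
```

The resulting bytes would then be written to a file and passed through the Edge TPU compiler; the TensorRT path on the Jetson starts from the same trained Keras model instead.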

Any hints are appreciated. I don’t really have anybody helping me in this regard at the moment. I’m trying my best to come up with ideas, but right now I’m failing to make any proper progress. Maybe there is something I haven’t considered at all, or someone can tell me that some of these ideas are not going to work out. Anything is appreciated!

Big thanks for your time,



Don’t worry, you are asking the right questions here.

It may not be possible to do an apples-to-apples comparison directly, but there are a few avenues where you can compare the deep-learning performance of the hardware. I would keep this benchmarking simple by just comparing a fixed set of models.

  1. Comparison of int8 .tflite models
    There are quite a few models that are supported on both boards (MobileNet, U-Net, ResNet). You can find the Coral versions of the models here ( ). So you can compare the inference time of the CPU int8 .tflite model on the Jetson against the edgetpu .tflite model on the Coral.

  2. Comparison of FP16 vs. edgetpu .tflite models
    You can use the same models as above, but compare the inference time of FP16 models on the Jetson against the edgetpu .tflite models on the Coral. Since FP16 models are 2x the size of int8 models, to keep the compute workload fair you could export the Coral models with 2x the input shape.
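For either comparison, the measurement itself can stay device-agnostic: wrap whatever triggers one forward pass on each board in a callable and time it identically on both. A minimal sketch — the warmup and iteration counts are arbitrary choices, and `run_once` is a placeholder for e.g. a closure around the TFLite interpreter’s `invoke()` on the Coral or a TensorRT execution on the Jetson:

```python
import statistics
import time

def benchmark(run_once, warmup=10, iters=100):
    """Time a single-inference callable and report latency stats in ms."""
    for _ in range(warmup):  # let caches/clocks settle before measuring
        run_once()
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - t0) * 1000.0)
    return {
        "mean_ms": statistics.mean(samples),
        "median_ms": statistics.median(samples),
        "stdev_ms": statistics.stdev(samples),
    }

stats = benchmark(lambda: None, warmup=2, iters=20)  # dummy workload
```

Reporting the median alongside the mean helps, since single-board computers often show occasional latency spikes from thermal throttling or background tasks.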

I assume your project is limited to comparing inference times.

Hope this helps,

Good luck