Calculating throughput (images/second) of a model

I would like to know how I can calculate the throughput of CNN models.

When you are serving a model, throughput depends on several factors:

  • The hardware that you’re running on
  • What else is running on that hardware
  • Whether you’re doing online or batch inference
  • The configuration of your serving framework (such as TF Serving or the BulkInferrer component)

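If you fix a single set of hardware and a single configuration, you can compare different models fairly: run the same images through each model, time the run, and divide the number of images processed by the elapsed seconds. This is not unlike other types of throughput estimation.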
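
As a rough illustration, here is a minimal sketch of timing a model in images/second. The names `measure_throughput` and `predict_fn` are hypothetical, not part of any serving framework; `predict_fn` stands in for whatever call runs your model on one batch (a local model call, a request to a serving endpoint, and so on). It assumes `predict_fn` blocks until the results are actually ready; if your framework executes asynchronously, wrap the call so it waits for the output before returning.

```python
import time


def measure_throughput(predict_fn, batch, batch_size, warmup=3, iters=20):
    """Estimate throughput in images/second for a (hypothetical) predict_fn.

    predict_fn: callable that runs inference on one batch and blocks
                until the result is ready.
    batch:      the input passed to predict_fn on every call.
    batch_size: number of images contained in `batch`.
    """
    # Warm-up calls are excluded from timing, since the first few calls
    # often include one-time costs (graph building, caching, allocation).
    for _ in range(warmup):
        predict_fn(batch)

    start = time.perf_counter()
    for _ in range(iters):
        predict_fn(batch)
    elapsed = time.perf_counter() - start

    # Total images processed divided by wall-clock seconds.
    return (iters * batch_size) / elapsed


if __name__ == "__main__":
    # Stand-in "model" that just sleeps; replace with a real inference call.
    def fake_model(batch):
        time.sleep(0.01)
        return [0] * len(batch)

    images = [None] * 32
    ips = measure_throughput(fake_model, images, batch_size=len(images))
    print(f"{ips:.1f} images/second")
```

To compare models, keep the hardware, batch size, and everything else running on the machine the same, and swap only `predict_fn`. Running it once with `batch_size=1` and once with a larger batch gives a rough feel for the online versus batch difference mentioned above.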