Edge TPU inferencing overview

All inferencing with the Edge TPU is based on the TensorFlow Lite APIs (Python or C/C++). So if you already know how to run an inference using TensorFlow Lite, then running your model on the Edge TPU requires only a few new lines of code.

Note: The Edge TPU is compatible with TensorFlow Lite models only. For details about how to create a model that's compatible with the Edge TPU, read TensorFlow models on the Edge TPU.

With a compatible model in hand, you can perform inferencing on the Edge TPU using one of three API options. Each option and the software it requires are described below and illustrated in figure 1.

Figure 1. The three options for inferencing and the corresponding software dependencies

  1. Use the TensorFlow Lite Python API:

    This is the standard Python API for running TensorFlow Lite models. With just a few lines of code, you can make existing TensorFlow Lite code run on the Edge TPU.

    All you need is the TensorFlow Lite Python API and the Edge TPU runtime.

    For details, read Run inference with TensorFlow Lite in Python. (A minimal sketch of this option appears after this list.)

  2. Use the Edge TPU Python API:

    This is a Python library we built on top of the TensorFlow Lite C++ API. It's helpful if you don't have experience with the TensorFlow Lite API and you just want to perform image classification or object detection, because it abstracts away the code required to prepare the input tensors and parse the results. It also provides unique APIs to perform accelerated transfer learning for classification models on the Edge TPU.

    All you need is the Edge TPU Python API and the Edge TPU runtime.

    For details, read the Edge TPU Python API overview. (A sketch of this option also appears after this list.)

  3. Use C or C++ APIs:

    This uses the standard C/C++ API for running TensorFlow Lite models. With just a few lines of code, you can make existing TensorFlow Lite code run on the Edge TPU.

    For this, you need the Edge TPU runtime API (edgetpu_c.h or edgetpu.h), the Edge TPU runtime, and the compiled TensorFlow Lite C++ library.

    For details, read Run inference with TensorFlow Lite in C++. (A C++ sketch appears after this list.)

    Optionally, you can also use the Coral C++ API, which adds features such as pipelining a model with multiple Edge TPUs.
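
To make option 1 concrete, here's a minimal sketch using the TensorFlow Lite Python API with the Edge TPU delegate. The model file name and the zero-filled input are placeholders for illustration; in real code you'd fill the input tensor with your image data. Note that the delegate library is named libedgetpu.so.1 on Linux, libedgetpu.1.dylib on macOS, and edgetpu.dll on Windows.

```python
import numpy as np
from tflite_runtime.interpreter import Interpreter, load_delegate

# Load the model and attach the Edge TPU delegate from the Edge TPU runtime.
interpreter = Interpreter(
    model_path='model_edgetpu.tflite',  # placeholder: your compiled model
    experimental_delegates=[load_delegate('libedgetpu.so.1')])
interpreter.allocate_tensors()

# Prepare an input tensor matching the model's expected shape and type.
input_details = interpreter.get_input_details()[0]
input_data = np.zeros(input_details['shape'], dtype=input_details['dtype'])
interpreter.set_tensor(input_details['index'], input_data)

# Run the inference and read the raw output tensor.
# How you parse the output depends on your model.
interpreter.invoke()
output_details = interpreter.get_output_details()[0]
output = interpreter.get_tensor(output_details['index'])
```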
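
For option 2, image classification with the Edge TPU Python API can be sketched as follows. The model and image file names are placeholders; the ClassificationEngine handles resizing the image, filling the input tensor, and parsing the output for you.

```python
from PIL import Image
from edgetpu.classification.engine import ClassificationEngine

# Load a classification model compiled for the Edge TPU (placeholder name).
engine = ClassificationEngine('model_edgetpu.tflite')

# The engine prepares the input tensor from the image internally.
image = Image.open('image.jpg')  # placeholder image path

# classify_with_image() returns (label_id, confidence) pairs
# for the top results.
for label_id, score in engine.classify_with_image(image, top_k=3):
    print(label_id, score)
```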
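
And for option 3, here's a minimal C++ sketch using edgetpu.h, assuming you've already built and linked the TensorFlow Lite C++ library and the Edge TPU runtime (the model file name is again a placeholder). The two Edge TPU-specific steps are registering the Edge TPU custom op with the op resolver and binding the interpreter to an open Edge TPU device.

```cpp
#include <memory>

#include "edgetpu.h"
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"

int main() {
  // Load a model compiled for the Edge TPU (placeholder file name).
  auto model =
      tflite::FlatBufferModel::BuildFromFile("model_edgetpu.tflite");

  // Register the Edge TPU custom op and open an Edge TPU device.
  tflite::ops::builtin::BuiltinOpResolver resolver;
  resolver.AddCustom(edgetpu::kCustomOp, edgetpu::RegisterCustomOp());
  std::shared_ptr<edgetpu::EdgeTpuContext> context =
      edgetpu::EdgeTpuManager::GetSingleton()->OpenDevice();

  // Build the interpreter and bind it to the Edge TPU context.
  std::unique_ptr<tflite::Interpreter> interpreter;
  tflite::InterpreterBuilder(*model, resolver)(&interpreter);
  interpreter->SetExternalContext(kTfLiteEdgeTpuContext, context.get());
  interpreter->AllocateTensors();

  // Fill interpreter->typed_input_tensor<uint8_t>(0) with your input data,
  // then run the inference.
  interpreter->Invoke();
  return 0;
}
```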