Edge TPU inferencing overview

The Edge TPU is compatible with TensorFlow Lite models only. So you must train a TensorFlow model, convert it to TensorFlow Lite, and compile it for the Edge TPU. Then you can execute the model on the Edge TPU using one of the options described on this page.

For details about creating a model that's compatible with the Edge TPU, read TensorFlow models on the Edge TPU.
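
As a rough sketch of that pipeline, the following assumes a TensorFlow 2.x SavedModel in a hypothetical my_saved_model directory with a 224x224 RGB input. The Edge TPU requires full integer quantization, and the exact steps depend on your model, so treat this as an outline rather than a recipe:

```python
import tensorflow as tf

def representative_data_gen():
    # Yield a small number of real, preprocessed samples here; random data
    # is only a placeholder to keep the sketch self-contained.
    for _ in range(100):
        yield [tf.random.uniform([1, 224, 224, 3])]

converter = tf.lite.TFLiteConverter.from_saved_model('my_saved_model')
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

with open('model.tflite', 'wb') as f:
    f.write(converter.convert())

# Then compile the quantized model for the Edge TPU (in a shell):
#   edgetpu_compiler model.tflite    # produces model_edgetpu.tflite
```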

Run inference with Python

If you're using Python to run inference, you have two options:

  • Use the TensorFlow Lite API:

    This is the traditional approach to running TensorFlow Lite models. It gives you complete control of the data input and output, allowing you to perform inference with a wide variety of model architectures.

    If you've used TensorFlow Lite before, then your Interpreter code requires only a small modification so it can execute your model on the Edge TPU.

    For details, read Run inference with TensorFlow Lite in Python; a minimal sketch also follows this list.

  • Use the Edge TPU API:

    This is a Python library we built on top of the TensorFlow Lite C++ API so you can more easily run inference with image classification and object detection models.

    This API is helpful if you don't have experience with the TensorFlow Lite API and you just want to perform image classification or object detection, because it abstracts away the code required to prepare the input tensors and parse the results. It also provides unique APIs to perform accelerated transfer learning for classification models on the Edge TPU.

    For details, read the Edge TPU Python API overview; a second sketch follows this list.
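
For the first option, here's a minimal sketch using the TensorFlow Lite Python API with the Edge TPU delegate. It assumes the tflite_runtime package is installed, a compiled model named model_edgetpu.tflite, and the Linux delegate library name; the input here is only a placeholder:

```python
import numpy as np
import tflite_runtime.interpreter as tflite

# Load the Edge TPU delegate along with the compiled model.
# The delegate filename differs by OS (libedgetpu.so.1 on Linux,
# libedgetpu.1.dylib on macOS, edgetpu.dll on Windows).
interpreter = tflite.Interpreter(
    model_path='model_edgetpu.tflite',
    experimental_delegates=[tflite.load_delegate('libedgetpu.so.1')])
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Placeholder input; in practice, fill this with a preprocessed image that
# matches the model's expected shape and dtype.
input_data = np.zeros(input_details[0]['shape'], dtype=input_details[0]['dtype'])
interpreter.set_tensor(input_details[0]['index'], input_data)

interpreter.invoke()
print(interpreter.get_tensor(output_details[0]['index']))
```

For the second option, here's a comparable sketch with the Edge TPU Python API for an image classification model. The ClassificationEngine and classify_with_image names come from that library, but the model and image filenames are hypothetical; check the Edge TPU Python API overview for the exact signatures:

```python
from edgetpu.classification.engine import ClassificationEngine
from PIL import Image

# The engine loads the compiled model and handles input/output preparation.
engine = ClassificationEngine('model_edgetpu.tflite')

# classify_with_image resizes the image, runs inference on the Edge TPU,
# and returns (label_id, score) pairs for the top results.
image = Image.open('bird.jpg')
for label_id, score in engine.classify_with_image(image, top_k=3):
    print(label_id, score)
```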

Run inference with C++

If you want to write your code in C++, then you need to use the TensorFlow Lite C++ API, just like you would to run TensorFlow Lite on any other platform. However, you need to make a few changes to your code using APIs from our edgetpu.h or edgetpu_c.h files. Essentially, you just need to register the Edge TPU device as an external context for your Interpreter object.
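
For example, a minimal sketch of that registration using edgetpu.h might look like the following (the model filename is hypothetical and error checking is omitted; see the linked page for the complete, authoritative code):

```cpp
#include <memory>

#include "edgetpu.h"
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"

int main() {
  // Load the Edge TPU-compiled model.
  auto model = tflite::FlatBufferModel::BuildFromFile("model_edgetpu.tflite");

  // Open the Edge TPU device and register the custom op that the
  // Edge TPU Compiler inserts into the model.
  std::shared_ptr<edgetpu::EdgeTpuContext> context =
      edgetpu::EdgeTpuManager::GetSingleton()->OpenDevice();
  tflite::ops::builtin::BuiltinOpResolver resolver;
  resolver.AddCustom(edgetpu::kCustomOp, edgetpu::RegisterCustomOp());

  // Build the interpreter and hand it the Edge TPU context.
  std::unique_ptr<tflite::Interpreter> interpreter;
  tflite::InterpreterBuilder(*model, resolver)(&interpreter);
  interpreter->SetExternalContext(kTfLiteEdgeTpuContext, context.get());
  interpreter->AllocateTensors();

  // From here, fill the input tensors, call interpreter->Invoke(), and read
  // the outputs exactly as you would for any other TensorFlow Lite model.
  return 0;
}
```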

For details, read Run inference with TensorFlow Lite in C++.