Retrain an object detection model

This tutorial shows you how to retrain an object detection model to recognize a new set of classes. You'll use a technique called transfer learning to retrain an existing model and then compile it to run on any device with an Edge TPU, such as the Coral Dev Board or USB Accelerator.

Specifically, this tutorial shows you how to retrain a MobileNet V1 SSD model (originally trained to detect 90 objects from the COCO dataset) so that it detects two pets: Abyssinian cats and American Bulldogs (from the Oxford-IIIT Pets Dataset). But you can reuse these procedures with your own image dataset, and with a different pre-trained model.

Note that this tutorial runs the training scripts on your computer using a Docker virtual environment, so the training time (and even the ability to complete the training) depends on your system specs. As an alternative, we also offer retraining tutorials that run in the cloud, using Google Colab.

What is transfer learning?

Ordinarily, training an object detection model can take several days on a CPU, but transfer learning is a technique that takes a model already trained for a related task and uses it as the starting point to create a new model. Depending on your system and training parameters, this instead takes a few hours or less. (This process is sometimes also called "fine-tuning" the model.)

Transfer learning can be done in two ways:

  • Last-layers-only retraining: This approach retrains only the last few layers of the model, where the final classification occurs. It's fast, and it can be done with a small dataset.
  • Full-model retraining: This approach retrains every layer of the neural network using the new dataset. It can produce a more accurate model, but it takes more time, and you must train with a significantly larger dataset to avoid overfitting the model.

Transfer learning is most effective when the features learned in the pre-trained model are general rather than highly specialized. For example, a pre-trained model that can recognize household objects might be retrained to recognize new office supplies, but a model pre-trained to recognize only dog breeds probably will not transfer as well.
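
To make these two strategies concrete, here's a minimal sketch in Keras. This is not the mechanism this tutorial's scripts use (they drive the TensorFlow Object Detection API through a pipeline.config file, described later); it merely illustrates the idea, using a classification model for simplicity:

# Minimal sketch of the two transfer-learning strategies (illustrative
# only; this tutorial's scripts use the Object Detection API instead).
import tensorflow as tf

NUM_CLASSES = 2  # e.g., Abyssinian cats and American Bulldogs

# Start from a model pre-trained on ImageNet, without its original head.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights='imagenet')

# Last-layers-only retraining: freeze all pre-trained layers and train
# only the new classification head. Fast, and works with a small dataset.
base.trainable = False
# Full-model retraining: set base.trainable = True instead, so every
# layer is updated. Potentially more accurate, but slower and data-hungry.

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_CLASSES, activation='softmax'),  # new head
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
# model.fit(your_new_dataset, epochs=...) then performs the retraining.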

The steps below show you how to perform transfer learning using either last-layers-only or full-model retraining. Most of the steps are the same; just keep an eye out for the commands that differ depending on the technique you choose.

Note: These instructions do not require deep experience with TensorFlow or convolutional neural networks (CNNs), but such experience will definitely help you build a more accurate model. This tutorial also does not teach you how to design and organize a dataset, or tune the hyperparameters to converge your model to the highest possible accuracy. For any of that, refer to other literature about deep learning.

Requirements

You need the following for this tutorial:

  • Any computer supported by Docker (such as one running Linux, macOS, or Windows).
  • At least 10 GB of RAM.
  • A device with an Edge TPU, such as the Coral Dev Board or USB Accelerator (these each have their own list of requirements).

Note: This tutorial is designed to run training on a desktop CPU—not on a GPU or in the cloud, which requires changes beyond the scope of this tutorial. You also should not try this on the Coral Dev Board due to its CPU and memory constraints—this training cannot be accelerated by the Edge TPU.

Set up the Docker container

Docker is a virtualization platform that makes it easy to set up an isolated environment for this tutorial. Using our Docker container, you can easily set up the required environment, which includes TensorFlow, Python, the Object Detection API, and the pre-trained checkpoints for MobileNet V1 and V2.

To set up your container, follow these steps:

  1. First install Docker on your desktop machine (this link is for Ubuntu; select your appropriate platform from the Docker left navigation).

  2. Open a command line and create a directory for all the files in this project. You will clone the Coral tutorials repo into it, so name it accordingly. For example:

    CORAL_DIR=${HOME}/google-coral && mkdir -p ${CORAL_DIR}
    
  3. Move into that directory and clone our tutorials repo, which has all the training scripts:

    cd ${CORAL_DIR}
    
    git clone https://github.com/google-coral/tutorials.git
  4. Move into the directory for this tutorial and build the Docker image:

    cd tutorials/docker/object_detection
    
    docker build . -t detect-tutorial-tf1
  5. Specify the location for the training output files. For example:

    DETECT_DIR=${PWD}/out && mkdir -p $DETECT_DIR
    

    You'll use this as the mount location for a directory in the Docker container, thus saving the training files and final model to your file system (instead of leaving them inside the Docker container).

  6. Start the Docker container:

    docker run --name edgetpu-detect \
    --rm -it --privileged -p 6006:6006 \
    --mount type=bind,src=${DETECT_DIR},dst=/tensorflow/models/research/learn_pet \
    detect-tutorial-tf1
    

When that's finished, your command prompt should be inside the Docker container and in the path /tensorflow/models/research/.

You're ready to start training your model.

Download and configure the training data

Now you'll download the images, labels, and the model checkpoints used in the retraining. We've prepared the following script (in the research/ directory) to take care of that for you.

This script also updates the training configuration file at /tensorflow/models/research/learn_pet/ckpt/pipeline.config to match the new dataset in several ways, such as the number of classes, the path to your checkpoint file, and whether to retrain the last few layers or the whole model. Accordingly, the script accepts two arguments: network_type specifies the model type, and train_whole_model specifies whether to retrain the whole model or only the last few layers. (Beware that setting train_whole_model to true requires a lot more time for training—over 10 hours.)

# Run this from within the Docker container (at /tensorflow/models/research/):
./prepare_checkpoint_and_dataset.sh --network_type mobilenet_v1_ssd --train_whole_model false

The network_type can be either mobilenet_v1_ssd or mobilenet_v2_ssd. This example and those below use MobileNet V1; if you decide to use V2, be sure to update the model name in the other commands below, as appropriate.

You can ignore the warning about the missing Abyssinian_104.xml file.

Note: The prepare_checkpoint_and_dataset.sh script handles all the training data setup and configuration for the pet detector model trained in this tutorial. If you want to know more about what the script does and how to create your own dataset, see the section below about how to configure your own training data.

Start training

Now you can begin the transfer-learning process as follows:

  1. Set some training variables, based on your training strategy:

    • If you're retraining just the last few layers, we suggest the following numbers:

      NUM_TRAINING_STEPS=500 && \
      NUM_EVAL_STEPS=100
      
    • If you're retraining the whole model, we suggest the following numbers:

      NUM_TRAINING_STEPS=50000 && \
      NUM_EVAL_STEPS=2000
      
  2. Start the training job:

    # From the /tensorflow/models/research/ directory
    ./retrain_detection_model.sh \
    --num_training_steps ${NUM_TRAINING_STEPS} \
    --num_eval_steps ${NUM_EVAL_STEPS}
    
  3. To monitor training progress, start TensorBoard in a new terminal:

    1. In a separate terminal, start a bash shell that joins the same Docker container:

      sudo docker exec -it edgetpu-detect /bin/bash
      
    2. In the new Docker terminal, run the following command from the /tensorflow/models/research/ directory to start TensorBoard. Once it's running, you can watch the model's accuracy throughout training by opening http://localhost:6006/ in your local machine's browser.

      tensorboard --logdir=./learn_pet/train/
      

Training takes a long time. Depending on your machine, it can take 1 to 4 hours to retrain the last few layers, or up to 10 hours to retrain the whole model (based on a 6-core CPU with 64 GB of memory).

As training progresses, you can see the transfer-learned checkpoint files begin to appear in the /tensorflow/models/research/learn_pet/train directory, which is mounted to the local $DETECT_DIR location you created when starting the Docker container.

Compile the model for the Edge TPU

To run your retrained model on the Edge TPU, you need to convert your checkpoint file to a frozen graph, convert that graph to a TensorFlow Lite flatbuffer file, and then compile the model for the Edge TPU. The following steps guide you through it all, and a rough sketch of what the conversion script automates appears after these steps.

  1. To freeze the graph and convert it to TensorFlow Lite, use the following script and specify the checkpoint number you want to use (this example uses checkpoint 500):

    # From the Docker /tensorflow/models/research directory
    ./convert_checkpoint_to_edgetpu_tflite.sh --checkpoint_num 500
    

    Your converted TensorFlow Lite model is named output_tflite_graph.tflite and is written in the Docker container to /tensorflow/models/research/learn_pet/models/, which is the mounted directory available on your host filesystem at $DETECT_DIR.

  2. Open a new terminal (outside the Docker container) and install the Edge TPU Compiler:

    curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
    
    echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list
    sudo apt update
    sudo apt-get install edgetpu-compiler
  3. Make sure your user has ownership of the out directory:

    sudo chown -R $USER ${HOME}/google-coral/tutorials/docker/object_detection/out
    
  4. Now change directories to where the trained model is and compile it:

    cd ${HOME}/google-coral/tutorials/docker/object_detection/out/models
    
    edgetpu_compiler output_tflite_graph.tflite

    The compiled file is named output_tflite_graph_edgetpu.tflite and saved in the current directory.

  5. Finally, rename the compiled model to something more specific:

    mv output_tflite_graph_edgetpu.tflite ssd_mobilenet_v1_catsdogs_quant_edgetpu.tflite
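
For reference, here's a rough sketch of what convert_checkpoint_to_edgetpu_tflite.sh automates (the script itself is the source of truth, and all paths and values below are illustrative). It first exports the checkpoint to a frozen graph using the Object Detection API's export_tflite_ssd_graph.py, and then converts that graph to a quantized TensorFlow Lite flatbuffer, roughly like this:

# Rough, illustrative sketch of the conversion stage the script performs:
# turning the exported frozen graph into a quantized .tflite file with
# the TF1 converter. Paths here are assumptions, not the script's exact values.
import tensorflow.compat.v1 as tf

converter = tf.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file='learn_pet/models/tflite_graph.pb',  # exported frozen graph
    input_arrays=['normalized_input_image_tensor'],
    output_arrays=[
        'TFLite_Detection_PostProcess',    # box coordinates
        'TFLite_Detection_PostProcess:1',  # class IDs
        'TFLite_Detection_PostProcess:2',  # scores
        'TFLite_Detection_PostProcess:3',  # number of detections
    ],
    input_shapes={'normalized_input_image_tensor': [1, 300, 300, 3]})
converter.allow_custom_ops = True  # the detection post-process op is custom
converter.inference_type = tf.lite.constants.QUANTIZED_UINT8
# (mean, std_dev) that map uint8 pixel values into the quantized range:
converter.quantized_input_stats = {'normalized_input_image_tensor': (128, 128)}

with open('learn_pet/models/output_tflite_graph.tflite', 'wb') as f:
    f.write(converter.convert())

The edgetpu_compiler step you ran above then maps the quantized ops in this file onto the Edge TPU.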
    

Run the model

You can now use the retrained model to run an inference on the Edge TPU. Below, you can see how to use this model with the detect_image.py example, which performs object detection using the TensorFlow Lite Python API (a condensed sketch of what that script does follows the download commands below).

Remember that you've trained this model to recognize just two classes: Abyssinian cats and American Bulldogs. So here are a couple of images that should provide results (provided by the Open Images Dataset):

wget https://c3.staticflickr.com/8/7028/6595489185_60fb5cd274_z.jpg -O dog.jpg && \
wget https://c6.staticflickr.com/9/8534/8652503705_687d957a29_z.jpg -O cat.jpg
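
Before walking through the device-specific steps, here's a condensed sketch of what detect_image.py does with the TensorFlow Lite Python API, assuming the tflite_runtime package and the Edge TPU runtime library (libedgetpu.so.1) are installed. See the real script in the google-coral/tflite repo for the complete version:

# Condensed, illustrative sketch of detect_image.py's core logic.
import numpy as np
from PIL import Image
from tflite_runtime.interpreter import Interpreter, load_delegate

# Load the model onto the Edge TPU via the Edge TPU runtime delegate.
interpreter = Interpreter(
    model_path='ssd_mobilenet_v1_catsdogs_quant_edgetpu.tflite',
    experimental_delegates=[load_delegate('libedgetpu.so.1')])
interpreter.allocate_tensors()

# Resize the image to the model's expected input size (300x300 here).
_, height, width, _ = interpreter.get_input_details()[0]['shape']
image = Image.open('dog.jpg').convert('RGB').resize((width, height))
interpreter.set_tensor(interpreter.get_input_details()[0]['index'],
                       np.expand_dims(image, axis=0))
interpreter.invoke()

# SSD models output four tensors: boxes, class IDs, scores, and a count.
output = interpreter.get_output_details()
boxes = interpreter.get_tensor(output[0]['index'])[0]
classes = interpreter.get_tensor(output[1]['index'])[0]
scores = interpreter.get_tensor(output[2]['index'])[0]
for i in range(len(scores)):
    if scores[i] > 0.5:
        print('class %d, score %.2f, box %s' % (classes[i], scores[i], boxes[i]))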

Using the Coral Dev Board

  1. First, be sure your Dev Board software is up to date.

  2. Use MDT to push the files to the Dev Board and switch to the Dev Board shell:

    mdt push ssd_mobilenet_v1_catsdogs_quant_edgetpu.tflite labels.txt dog.jpg
    
    mdt shell
  3. Now from the Dev Board shell, download the detect_image.py code from GitHub:

    mkdir google-coral && cd google-coral
    
    git clone https://github.com/google-coral/tflite --depth 1
  4. Install the example's requirements:

    cd tflite/python/examples/detection
    
    ./install_requirements.sh
  5. Run the example using the files you pushed in step 2:

    python3 detect_image.py \
      --model ${HOME}/ssd_mobilenet_v1_catsdogs_quant_edgetpu.tflite \
      --labels ${HOME}/labels.txt \
      --input ${HOME}/dog.jpg \
      --output dog_result.jpg
    

Using the Coral USB Accelerator

  1. First, be sure your USB Accelerator is set up.

  2. Although this is also part of the device setup, here's how to get the detect_image.py code from GitHub:

    mkdir google-coral && cd google-coral
    
    git clone https://github.com/google-coral/tflite --depth 1
  3. Install the project requirements:

    cd tflite/python/examples/detection
    
    ./install_requirements.sh
  4. Run the example using the retrained model:

    python3 detect_image.py \
      --model ${HOME}/google-coral/tutorials/docker/object_detection/out/models/ssd_mobilenet_v1_catsdogs_quant_edgetpu.tflite \
      --labels ${HOME}/google-coral/tutorials/docker/object_detection/out/models/labels.txt \
      --input ${HOME}/google-coral/tutorials/docker/object_detection/out/models/dog.jpg \
      --output dog_result.jpg
    

Configure your own training data

If you finished all the previous steps, then you've completed transfer-learning to create a model that detects cats and dogs. But chances are, you'd prefer that your model detect other things. So this section describes how to prepare your own training data to retrain an object detection model.

If you look back at what you've done, you'll see that the bulk of the work is done for you through the script prepare_checkpoint_and_dataset.sh. This script does three important things:

  • Organizes the dataset photos, annotations, and label map (the training data), and then converts it all into TFRecord format.

    The images and annotations used above come from the Oxford-IIIT Pets Dataset; the label map is pet_label_map.pbtxt (a sample of this format appears after this list); and the script that converts it all to TFRecord is create_pet_tf_record.py.

  • Downloads the model checkpoint files (the neural network graph to retrain).

    The MobileNet files used above (and more) are available from our Models download page.

  • Configures the pipeline.config file.

    This file is included with the model checkpoint files. It's required by the TensorFlow Object Detection API, and you need to modify various properties in it to customize the training pipeline for your dataset and training strategy.

So to create your own dataset, you need to prepare these pieces yourself.
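
For example, the label map for the two-class pet model trained above might look like the following (the names here are illustrative and must exactly match the class names in your annotations; IDs start at 1 because 0 is reserved for the background class):

item {
  id: 1
  name: 'Abyssinian'
}
item {
  id: 2
  name: 'american_bulldog'
}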

Organize your dataset

The first of the three items above is the most time-consuming and the most important: you need to gather and organize all the photos, annotations, and labels to use for training.

This process is also the least documented here; it requires a fair amount of experience with ML data preparation and some experience with TensorFlow APIs. We recommend you follow this TensorFlow guide to preparing inputs.

Also take a look at this tutorial on using TFRecords, as well as the code that converts the pets dataset in create_pet_tf_record.py.
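
To give you a sense of the format, here's a minimal, illustrative sketch of writing a single object detection example to a TFRecord file, using the standard feature keys the Object Detection API expects (all file names, dimensions, and box values here are made up; see create_pet_tf_record.py for a complete implementation):

# Minimal sketch of one TFRecord example in the format the Object
# Detection API expects. All values below are illustrative.
import tensorflow.compat.v1 as tf

def bytes_feature(values):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=values))

def float_feature(values):
    return tf.train.Feature(float_list=tf.train.FloatList(value=values))

def int64_feature(values):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=values))

with open('images/Abyssinian_1.jpg', 'rb') as f:
    encoded_jpg = f.read()

# Box coordinates are normalized to [0, 1] relative to the image size.
example = tf.train.Example(features=tf.train.Features(feature={
    'image/encoded': bytes_feature([encoded_jpg]),
    'image/format': bytes_feature([b'jpeg']),
    'image/height': int64_feature([400]),
    'image/width': int64_feature([600]),
    'image/object/bbox/xmin': float_feature([0.1]),
    'image/object/bbox/xmax': float_feature([0.9]),
    'image/object/bbox/ymin': float_feature([0.2]),
    'image/object/bbox/ymax': float_feature([0.8]),
    'image/object/class/text': bytes_feature([b'Abyssinian']),
    'image/object/class/label': int64_feature([1]),  # matches the label map
}))

writer = tf.python_io.TFRecordWriter('pet_faces_train.record-00000-of-00001')
writer.write(example.SerializeToString())
writer.close()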

Select your model

Once you have your dataset, you need the checkpoint files for the quantized TensorFlow Lite (object detection) model you want to retrain. (You must use either quantization-aware training, which is recommended, or full integer post-training quantization; a generic sketch of the latter follows below.)

We have some Edge TPU compatible models available on our Models download page that you can retrain, but you can use any other object detection model that's compatible with the Edge TPU.
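
If you go the post-training route, here's a generic sketch of full integer post-training quantization with the TensorFlow Lite converter. Note that this is not part of this tutorial's flow (the checkpoints above come from quantization-aware training), and the saved model path and calibration data are placeholders you must supply:

# Generic sketch of full integer post-training quantization (an
# alternative to the quantization-aware training used in this tutorial).
import tensorflow as tf

def representative_dataset():
    # Yield a few hundred samples that resemble your real input data,
    # so the converter can calibrate the quantization ranges.
    for image in calibration_images:  # placeholder iterable of float32 arrays
        yield [image]

converter = tf.lite.TFLiteConverter.from_saved_model('saved_model_dir')  # placeholder path
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Force full integer quantization; the Edge TPU requires integer-only ops.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

with open('model_quant.tflite', 'wb') as f:
    f.write(converter.convert())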

Configure your training pipeline

Now reconfigure the existing pipeline.config file that came with the model, as appropriate. The changes you need to make depend entirely on your model and your training strategy. You should read more about the config file here.

For demonstration purposes, the following shows the pipeline.config changes required for the retraining performed above (when using the MobileNet V1 SSD model to retrain only the last few layers):

  1. At the top of the file, change num_classes to the number of classes in your dataset.

    For example, change num_classes: 90 to num_classes: 2 for a dataset with 2 classes.

  2. Specify your checkpoint file with fine_tune_checkpoint and enable a couple of other properties.

    For example, change this line:

    fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
    

    To this:

    fine_tune_checkpoint: "/tensorflow/models/research/learn_pet/ckpt/model.ckpt"
    from_detection_checkpoint: true
    load_all_detection_checkpoint_vars: true
    
  3. Specify your training data location.

    For example, change this:

    train_input_reader {
      label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
      tf_record_input_reader {
        input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record-00000-of-00100"
      }
    }
    

    To this:

    train_input_reader {
      label_map_path: "/tensorflow/models/research/learn_pet/pet_label_map.pbtxt"
      tf_record_input_reader {
        input_path: "/tensorflow/models/research/learn_pet/pet_faces_train.record-?????-of-00010"
      }
    }
    
  4. Specify the evaluation data location.

    For example, change this:

    eval_input_reader {
      label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
      shuffle: false
      num_readers: 1
      tf_record_input_reader {
        input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record-00000-of-00010"
      }
    }
    

    To this:

    eval_input_reader {
      label_map_path: "/tensorflow/models/research/learn_pet/pet_label_map.pbtxt"
      shuffle: false
      num_readers: 1
      tf_record_input_reader {
        input_path: "/tensorflow/models/research/learn_pet/pet_faces_val.record-?????-of-00010"
      }
    }
    
  5. Specify the layers you want to freeze in the model.

    For example (when using the MobileNet V1 SSD model to retrain only the last few layers), change this:

    max_number_of_boxes: 100
    unpad_groundtruth_tensors: false
    

    To this:

    max_number_of_boxes: 100
    unpad_groundtruth_tensors: false
    freeze_variables:
            ['Conv2d_0',
              'Conv2d_1_pointwise',
              'Conv2d_1_depthwise',
              'Conv2d_2_pointwise',
              'Conv2d_2_depthwise',
              'Conv2d_3_pointwise',
              'Conv2d_3_depthwise',
              'Conv2d_4_pointwise',
              'Conv2d_4_depthwise',
              'Conv2d_5_pointwise',
              'Conv2d_5_depthwise',
              'Conv2d_6_pointwise',
              'Conv2d_6_depthwise',
              'Conv2d_7_pointwise',
              'Conv2d_7_depthwise',
              'Conv2d_8_pointwise',
              'Conv2d_8_depthwise',
              'Conv2d_9_pointwise',
              'Conv2d_9_depthwise']
    

That should be it. But again, you should read more about the config file.

Initiate retraining

So far, everything in this section about configuring your own training data has merely shown how to replicate the steps performed by the prepare_checkpoint_and_dataset.sh script used above, which prepares the training data for a pet detector.

So now that you've prepared your own training data, all that's left is to run the retraining. And for that, you can use the retrain_detection_model.sh script as shown above in Start training.

For more information about creating object detection models with TensorFlow, read the TensorFlow Object Detection documentation.