Techno Blender
Digitally Yours.

Object Detection with TensorFlow 2 Object Detection API | by Derrick Mwiti



Object detection with Mask R-CNN in TensorFlow

Photo by Joanna Kosinska on Unsplash

Building object detection and image segmentation models is slightly different from building other models, mainly because you have to use specialized architectures and prepare the data in a particular way. This article shows how to perform object detection and image segmentation on a custom dataset using the TensorFlow 2 Object Detection API.

Let’s dive right in!

In this article, we’ll use the Coco Car Damage Detection Dataset available on Kaggle. It contains images of damaged cars and can be used to train a model to detect damage on cars and car parts. The dataset has already been annotated, and the corresponding COCO files are provided.

If you have a custom dataset you’d like to use, you have to do the labeling and annotation yourself. There are many tools and online platforms that can help you achieve this. If you would like to stick to open source, Labelme is an excellent option.

The video below shows how to create polygons on the car dataset. After completing an annotation, you will have to save it. Once you save it, Labelme will store the resulting JSON file in the same folder as the data.

Image by author

If you are looking for an online tool, here are some platforms that I have interacted with:

  • Roboflow Universe provides numerous object detection and image segmentation datasets. You can search the platform and swap the car images dataset for one of them. If you choose that route, download the dataset in TFRecord format from the platform. If you have a custom dataset, you can also perform the annotation on Roboflow.
  • Ango AI provides some public datasets to kickstart your classification and object detection projects. They also offer a platform that you can use to label and annotate the images.
  • Segments AI lists some object detection and image segmentation datasets that you can clone into your projects. You can also perform annotation on their platform.

The TensorFlow Object Detection API is an open-source computer vision framework for building object detection and image segmentation models that can localize multiple objects in the same image. The framework works for both TensorFlow 1 and 2. Users are, however, encouraged to use the TF 2 version because it contains new architectures.

Some of the architectures and models that the TensorFlow 2 Object Detection API supports include CenterNet, EfficientDet, SSD MobileNet, SSD ResNet (RetinaNet), Faster R-CNN, and Mask R-CNN.

The models can be downloaded from the TensorFlow 2 Detection Model Zoo. To train one of them, you also need its corresponding config file. In this project, we’ll use the Mask R-CNN model, but you can also try the other models.

At this point, you have an object detection dataset: either the car images and the corresponding COCO JSON files, or a dataset you have created yourself or downloaded elsewhere.

We will run this project on Google Colab to utilize free GPU resources for training the model. Let’s install the TensorFlow 2 Object Detection API on Colab. The first step is to clone the TF 2 Object Detection GitHub repo:

!git clone https://github.com/tensorflow/models.git

Next, run these commands to install TF 2 Object Detection API on Colab:

%%bash
cd models/research
# Compile protos.
protoc object_detection/protos/*.proto --python_out=.
# Install TensorFlow Object Detection API.
cp object_detection/packages/tf2/setup.py .
python -m pip install --use-feature=2020-resolver .

If you’d like to use the API locally, the developers recommend that you install it using Docker:

# From the root of the git repository
docker build -f research/object_detection/dockerfiles/tf2/Dockerfile -t od .
docker run -it od

Next, import the Object Detection API plus a couple of other common data science packages. If you are able to import the Object Detection package, it means that the installation ran successfully.
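The import cell is not reproduced here. As a minimal sketch, the check below uses only the standard library to confirm that the packages can be found before you import them. The helper name missing_packages and the list of required packages are assumptions made for illustration; object_detection is the module name used when importing the API.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found on sys.path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Packages this walkthrough relies on (an assumed list; adjust as needed).
required = ["object_detection", "tensorflow", "numpy", "matplotlib"]
missing = missing_packages(required)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All packages found; the imports should succeed.")
```

If object_detection shows up as missing, the install step above most likely did not complete.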

The dataset and the config file for the model we’ll be training can be downloaded from this GitHub repo. You have to make some changes after you download the config from the object detection repo. We’ll discuss those changes in a moment.

!git clone https://github.com/mlnuggets/maskrcnn.git

The next step is to download the Mask R-CNN model that we’ll fine-tune. Extract the file to get the trained model checkpoint.

The compressed file also contains the model’s configuration file. You will need to edit this file for every model you download.
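One way to script the download and extraction is sketched below with the standard library only. The archive URL is an assumption based on the TF2 Detection Model Zoo listing for the Mask R-CNN Inception ResNet V2 1024x1024 model, so check it against the zoo page before relying on it. The function names are hypothetical.

```python
import tarfile
import urllib.request
from pathlib import Path

# Assumed archive location; verify it on the TF2 Detection Model Zoo page.
MODEL_URL = (
    "http://download.tensorflow.org/models/object_detection/tf2/20200711/"
    "mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8.tar.gz"
)


def extract_archive(archive_path, dest_dir):
    """Unpack a .tar.gz archive into dest_dir and return its top-level entries."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        # Use the safer "data" extraction filter where the interpreter has it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(dest, **kwargs)
    return sorted(path.name for path in dest.iterdir())


def download_and_extract(url=MODEL_URL, dest_dir="pretrained"):
    """Download the archive into dest_dir, then unpack it in place."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive_path = dest / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, str(archive_path))
    return extract_archive(archive_path, dest)
```

Calling download_and_extract() should leave a folder holding the checkpoint files and the model's config file, which is the file edited in the next step.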

Image by author

Let’s look at the items in the configuration file that you need to update.

The config file you’ll get after cloning this repo has been edited to run smoothly on Google Colab. If you are running this project elsewhere, you’ll need to update the file paths. In a nutshell, here are the items you need to update after downloading the Mask R-CNN config file from the TensorFlow 2 Object Detection API repo:

  • num_classes: set to 5 because the dataset has 5 classes: headlamp, front_bumper, hood, door, and rear_bumper.
  • image_resizer: set to 512 from 1024. Smaller images reduce the training time.
  • num_steps: set to 1000 from 200000 to reduce the training time. The more steps, the longer it takes to train the model. You can increase the steps if the loss is still decreasing and the validation metrics are still improving.
  • batch_size: set to 1. This is the number of images fed into memory at each training step.
  • fine_tune_checkpoint: point it to the Mask R-CNN checkpoint downloaded above. This ensures that we are not training the model from scratch.
  • fine_tune_checkpoint_type: change from classification to detection since we are training an object detection model.
  • train_input_reader: point it to the label_map_path and the path to the training TFRecords. More on TFRecords later.
  • eval_input_reader: the same as train_input_reader, but for the test data.
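The scalar edits in the list above can also be applied programmatically. The sketch below treats the config as plain text and rewrites `field: value` pairs with a regular expression. It is an illustration, not the API's own config tooling: the helper names are hypothetical, the checkpoint path is a placeholder, and it assumes image_resizer is expressed through height and width fields. The label_map_path and TFRecord paths are left out because they appear in both input readers with different values and are easier to edit by hand.

```python
import re

# Values from the list above; the checkpoint path is a placeholder.
OVERRIDES = {
    "num_classes": 5,
    "height": 512,
    "width": 512,
    "num_steps": 1000,
    "batch_size": 1,
    "fine_tune_checkpoint": "path/to/checkpoint/ckpt-0",
    "fine_tune_checkpoint_type": "detection",
}


def set_field(config_text, field, value):
    """Rewrite every `field: old_value` pair found in the config text."""
    rendered = f'"{value}"' if isinstance(value, str) else str(value)
    pattern = re.compile(
        r"(\b" + re.escape(field) + r'\s*:\s*)("[^"]*"|[^\s}]+)'
    )
    new_text, count = pattern.subn(lambda m: m.group(1) + rendered, config_text)
    if count == 0:
        raise KeyError(f"{field} not found in config")
    return new_text


def apply_overrides(config_text, overrides):
    """Apply each override in turn and return the edited config text."""
    for field, value in overrides.items():
        config_text = set_field(config_text, field, value)
    return config_text
```

Note that the sketch rewrites every occurrence of a field, so a batch_size in the evaluation section would be changed as well. Read the edited file back before training to confirm that it matches your intent.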

The object detection models expect the images to be in TFRecord format. Fortunately, the TensorFlow 2 Object Detection API repo provides a script for performing the conversion. The script takes the following arguments:

  • Directory of the training images.
  • Folder containing the test images.
  • File containing training image annotations.
  • File containing test image annotations.
  • Directory where the generated TFRecords should be stored.
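The five arguments above can be assembled into a command as sketched below. The script path and flag names are modeled on create_coco_tf_record.py from the API's dataset_tools folder and are assumptions here, so confirm them against the script you actually run. The stock script may also expect validation-set flags that this walkthrough does not list. Because Mask R-CNN needs masks, the sketch switches on include_masks by default.

```python
import shlex

# Assumed script location inside the cloned `models` repo.
SCRIPT = "models/research/object_detection/dataset_tools/create_coco_tf_record.py"


def build_tfrecord_command(train_images, test_images, train_annotations,
                           test_annotations, output_dir, include_masks=True):
    """Assemble the TFRecord conversion command as an argument list."""
    return [
        "python", SCRIPT,
        "--logtostderr",
        f"--include_masks={include_masks}",
        f"--train_image_dir={train_images}",
        f"--test_image_dir={test_images}",
        f"--train_annotations_file={train_annotations}",
        f"--testdev_annotations_file={test_annotations}",
        f"--output_dir={output_dir}",
    ]


# Placeholder paths; print the command so it can be pasted after `!` in Colab.
command = build_tfrecord_command(
    "images/train", "images/test", "train.json", "test.json", "tfrecords"
)
print(shlex.join(command))
```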

You now have everything you need to train this Mask R-CNN object detection model. The next step is to run the training script. The model training script takes the following arguments:

  • pipeline_config_path: the path to the updated model configuration file.
  • model_dir: the directory where the trained model will be saved.

Image by author
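For reference, a training run can be launched from Python as sketched below. The script path assumes model_main_tf2.py sits in the cloned models repo, and the helper names are hypothetical. The two flags are the ones described above.

```python
import subprocess
import sys

# Assumed location of the training script inside the cloned repo.
TRAIN_SCRIPT = "models/research/object_detection/model_main_tf2.py"


def training_command(pipeline_config_path, model_dir):
    """Build the argument list for a training run."""
    return [
        sys.executable, TRAIN_SCRIPT,
        f"--pipeline_config_path={pipeline_config_path}",
        f"--model_dir={model_dir}",
        "--alsologtostderr",
    ]


def run_training(pipeline_config_path, model_dir):
    """Start training; raises CalledProcessError if the script exits non-zero."""
    subprocess.run(training_command(pipeline_config_path, model_dir), check=True)
```

Using sys.executable keeps the run inside the same Python environment where the API was installed.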

You might get an OpenCV error on Colab. This error can be fixed by installing the right version of OpenCV.

pip uninstall opencv-python-headless==4.5.5.62
pip install opencv-python-headless==4.5.2.52

If you get a cuDNN error, you can fix it by installing the right version of cuDNN.

!apt install --allow-change-held-packages libcudnn8=8.1.0.77-1+cuda11.2

When training is complete, you can run TensorBoard to show the visualization of the training and testing metrics such as the localization loss.

Image by author

Read more: TensorBoard tutorial (Deep dive with examples and notebook)

The next step is to export the model for inference. The conversion script expects:

  • trained_checkpoint_dir: the directory holding the checkpoints of the trained model. The latest checkpoint in it is the one that gets exported.
  • output_directory: where the exported model will be saved.
  • pipeline_config_path: the path to the pipeline configuration file.
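The export step takes the three arguments above. The sketch below renders them as a single shell-safe string, assuming the script is exporter_main_v2.py in the cloned repo and that image_tensor is the desired input type. Both are assumptions to check against your setup, and the function name is hypothetical.

```python
import shlex

# Assumed location of the export script inside the cloned repo.
EXPORT_SCRIPT = "models/research/object_detection/exporter_main_v2.py"


def export_command(trained_checkpoint_dir, output_directory,
                   pipeline_config_path, input_type="image_tensor"):
    """Return the export command as one shell-safe string."""
    args = [
        "python", EXPORT_SCRIPT,
        f"--input_type={input_type}",
        f"--trained_checkpoint_dir={trained_checkpoint_dir}",
        f"--output_directory={output_directory}",
        f"--pipeline_config_path={pipeline_config_path}",
    ]
    # shlex.join quotes any argument that contains spaces or shell characters.
    return shlex.join(args)
```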

The conversion script will output checkpoint files, a SavedModel, and the model config file.

Image by author

You may want to download the converted model or trained model. That can be done by zipping the files and using Colab utilities to download the compressed file.
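A minimal sketch of that step is shown below. shutil.make_archive does the zipping, and the download call relies on the google.colab package, which is only available inside a Colab notebook. The function names are hypothetical.

```python
import shutil
from pathlib import Path


def zip_directory(source_dir, archive_stem):
    """Zip source_dir into <archive_stem>.zip and return the archive path."""
    source = Path(source_dir)
    if not source.is_dir():
        raise NotADirectoryError(str(source))
    return shutil.make_archive(str(archive_stem), "zip", root_dir=str(source))


def download_in_colab(path):
    """Trigger a browser download; works only inside a Colab notebook."""
    from google.colab import files  # imported lazily: not available elsewhere
    files.download(path)
```

Write the archive outside the folder being zipped so that it does not end up inside itself.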

It’s now time to use the trained Mask R-CNN model to perform object detection on test car images. Luckily, the TensorFlow 2 Object Detection API provides all the utilities needed to do this. The first one is a function that loads an image and converts it into a NumPy array.

The function expects a path to an image file and returns a NumPy array.

The next utility is a function for plotting the detections using Matplotlib.

Let’s now create a detection model from the last saved model checkpoint.

Next, we declare the variables needed to decode the model output, such as the categories and the path to the label map file that lists the training categories.
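To illustrate what these variables hold, the sketch below derives both the label map text and a category lookup from the categories section of a COCO annotation file. This avoids guessing class ids, because the ids must match the ones used in the annotations. The function names are hypothetical, and the lookup mirrors the id-to-record shape that is commonly used when decoding detections.

```python
import json


def load_coco(path):
    """Read a COCO annotation file into a dict."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def label_map_from_coco(coco):
    """Render the COCO `categories` list as label map text, ordered by id."""
    items = []
    for category in sorted(coco["categories"], key=lambda c: c["id"]):
        items.append(
            "item {\n"
            f"  id: {category['id']}\n"
            f"  name: '{category['name']}'\n"
            "}\n"
        )
    return "\n".join(items)


def category_index_from_coco(coco):
    """Map each class id to its {'id', 'name'} record."""
    return {c["id"]: {"id": c["id"], "name": c["name"]} for c in coco["categories"]}
```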

The next step is to run the Mask R-CNN object detection model on some test images.
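The model returns boxes, scores, and class ids for each image. As a rough sketch of the post-processing, the function below keeps detections above a score threshold and converts the API's normalized [ymin, xmin, ymax, xmax] boxes to pixel coordinates. It works on plain Python lists standing in for the model's tensors, and both the function name and the 0.5 threshold are illustrative choices.

```python
def filter_detections(boxes, scores, classes, image_width, image_height,
                      min_score=0.5):
    """Keep confident detections and convert their boxes to pixels.

    Each box is a normalized [ymin, xmin, ymax, xmax] row; the returned
    pixel box is ordered (left, top, right, bottom).
    """
    kept = []
    for box, score, class_id in zip(boxes, scores, classes):
        if score < min_score:
            continue
        ymin, xmin, ymax, xmax = box
        kept.append({
            "class_id": int(class_id),
            "score": float(score),
            "box_pixels": (
                round(xmin * image_width), round(ymin * image_height),
                round(xmax * image_width), round(ymax * image_height),
            ),
        })
    return kept
```

Lowering min_score shows more candidate detections at the cost of more false positives.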

Image by author

The Mask R-CNN object detection model can be used for both object detection and image segmentation. Let’s start by loading the fine-tuned model.

The model also needs labels for decoding the output.

The next step is to define the path to the test images. In this case, we’ll use all the test images because there aren’t many of them.

The segmentation utility is also provided in the TensorFlow 2 Object Detection API repo.
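Collecting the test images can be as simple as the sketch below, which lists the image files in a folder by extension. The set of extensions and the function name are assumptions; adjust them to your data.

```python
from pathlib import Path

# Assumed image extensions; extend the set if your data uses others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_images(directory):
    """Return the image files directly inside directory, sorted by name."""
    return sorted(
        path for path in Path(directory).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )
```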

The next step is to load an image as a NumPy array and use the above function to start detecting objects.

Image by author


