YOLOv5 on a Custom Teapot Dataset

Recently, I have been working on edge-side computer vision models (🚗,📱), and YOLOv5, with its outstanding computation speed and accuracy, quickly made it onto my list. I have been working with the Open Images dataset for a while, but sometimes what you really want is just a small subset of it, rather than the entire multi-hundred-gigabyte archive with thousands of classes. In this project, we look at only one class: teapot 🍵, quickly fetching just that class from the Open Images database and transfer learning on our favorite model: YOLOv5.

This project is in four parts: Data Fetching, Data Formatting & Processing, Model Training, and Model Prediction.


Data Fetching

Normally, I would write a scraper with Beautiful Soup, but I found this amazing open-source data-fetching tool called OIDv4_ToolKit, which gathers exactly the data we need for specific classes in Open Images Dataset V4.

However, as the name states, this tool only supports V4. Let me know if you find a more general tool that works for all versions of Open Images.

The code that follows runs on Google Colab, which gives everyone the same operating environment; an equivalent Ubuntu setup works too.

First, we need to clone the tool and install all of its dependencies:

!git clone https://github.com/EscVM/OIDv4_ToolKit.git
!pip install -r OIDv4_ToolKit/requirements.txt

Then you can go to its docs page and scrape as desired. Here, since we are only demonstrating on a teapot dataset, we run this command:

!python OIDv4_ToolKit/main.py downloader --classes Teapot --type_csv all --multiclasses 1

The Teapot after --classes is the class of data we want; the name must align exactly with the labels on the Open Images website.

The --multiclasses 1 flag makes sure that all the data collected ends up in the same folder. It is redundant here, but convenient when you have more than one class.

After you run this command, a prompt will show up like this:

OIDv4 Prompt

You need to input Y three times to keep the program moving forward, once each for training, testing, and validation.
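If you'd rather not babysit the prompts, piping affirmative answers into the downloader should work too. This is a hedged one-liner, not an official toolkit flag; it simply feeds Y to every stdin prompt:

!yes Y | python OIDv4_ToolKit/main.py downloader --classes Teapot --type_csv all --multiclasses 1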

After that, the data will be downloaded to the OID/Dataset/ folder. Run this command to zip it and download the whole dataset from Colab to your local machine:

from google.colab import files
!zip -r /content/file.zip /content/OID/Dataset/
files.download("/content/file.zip")

Then, you will have a file.zip downloaded to your local setup. Inside, you will find a folder each for train, test, and validation, containing the images and the bounding-box information (.txt files).
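Before moving on, it's worth sanity-checking what was actually downloaded. A minimal sketch, assuming the toolkit's single-class layout where images sit in OID/Dataset/<split>/Teapot/ with labels in a Label/ subfolder:

import glob

# count images and label files per split to confirm the download worked
for split in ("train", "test", "validation"):
    imgs = glob.glob(f"/content/OID/Dataset/{split}/Teapot/*.jpg")
    labels = glob.glob(f"/content/OID/Dataset/{split}/Teapot/Label/*.txt")
    print(f"{split}: {len(imgs)} images, {len(labels)} labels")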

Take one .txt file for example:

Teapot 361.516032 2.175744 770.8211200000001 358.4448

If you followed my previous post, you will recognize the structure: the first field Teapot is the class name, 361.516032 2.175744 is the bounding box's top-left corner, and 770.8211200000001 358.4448 is its bottom-right corner, both in image pixel coordinates (origin at the top left). Note that this is not yet the normalized format YOLOv5 expects; the conversion is handled in the next section.
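If you'd rather do the conversion yourself instead of using RoboFlow (next section), here is a minimal sketch. The image path and the class-to-ID mapping are my own assumptions for illustration:

from PIL import Image

def oid_to_yolo(line, img_w, img_h, class_ids={"Teapot": 0}):
    """Convert one OIDv4 label line ("Class x_min y_min x_max y_max", in pixels)
    to YOLOv5 format ("class_id x_center y_center width height", normalized)."""
    parts = line.split()
    # class names may contain spaces, so the coordinates are the last four fields
    name = " ".join(parts[:-4])
    x_min, y_min, x_max, y_max = map(float, parts[-4:])
    x_c = (x_min + x_max) / 2 / img_w   # normalized box center x
    y_c = (y_min + y_max) / 2 / img_h   # normalized box center y
    w = (x_max - x_min) / img_w         # normalized box width
    h = (y_max - y_min) / img_h         # normalized box height
    return f"{class_ids[name]} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"

# hypothetical image path; Pillow gives us the pixel dimensions
img_w, img_h = Image.open("OID/Dataset/train/Teapot/0a1b2c.jpg").size
print(oid_to_yolo("Teapot 361.516032 2.175744 770.8211200000001 358.4448", img_w, img_h))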

Data Formatting & Processing

Since YOLOv5 needs to recognize the data before it can train, we have to convert our data and labels into the specific format PyTorch YOLOv5 expects. YOLOv5 takes a data.yaml file to locate the train, test, and validation data and labels, which must also be arranged in a specific layout. Luckily, we don’t have to worry about the nuances, thanks to RoboFlow.
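For reference, the data.yaml that a one-class RoboFlow export generates looks roughly like this (the exact paths depend on where you unzip the archive):

train: ../train/images
val: ../valid/images

nc: 1
names: ['Teapot']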

  • Go to RoboFlow
  • Create a free account
  • Create a dataset following the instructions
  • Drag and drop all the images from all three folders, then the labels; you will find that RoboFlow applies the bounding-box labels to the images automatically for you 🤟

RoboFlow Interface
  • After uploading and labeling, press “Finish Upload” in the top right corner, then customize your train/test split; usually we leave it at the default 70-20-10.
  • Continue, then hit “Generate” in the top right corner with a version name.
  • After the site does all its work, hit “show download code”.
  • Copy the command there. That command is what all this struggle is actually for. It will look like this: !curl -L "SOME-CHARACTERS" > roboflow.zip; unzip roboflow.zip; rm roboflow.zip

  • Paste that into our Colab notebook and run it; you will find the train, test, and valid folders downloaded for you, with data.yaml of course

Now, we can train our models! Yeaaaaaa! 🍕

Model Training

As usual, clone YOLOv5 to Colab and test our GPU. Don’t forget to set the Colab runtime type to GPU, or training will be painful ☠.

!git clone https://github.com/ultralytics/yolov5.git
!pip install -r yolov5/requirements.txt

import torch
from IPython.display import Image  # for displaying images in the notebook

print('torch %s %s' % (torch.__version__, torch.cuda.get_device_properties(0) if torch.cuda.is_available() else 'CPU'))

Then, navigate into the yolov5 folder and download all of its pretrained models: ‘yolov5s.pt’, ‘yolov5m.pt’, ‘yolov5l.pt’, and ‘yolov5x.pt’. Besides their difference in size, they also differ in structure and performance; we have already talked about this in a previous post.

%cd yolov5/
!bash weights/download_weights.sh

Now we TRAIN!

!python train.py --img 640 --batch 8 --epochs 30 --data ../data.yaml --weights yolov5s.pt --device 0 --cfg ./models/yolov5s.yaml

This command specifies that we will train at image size 640 px, with batch size 8, for 30 epochs, using the dataset described in data.yaml, pretrained with yolov5s.pt, and with the model structure defined in yolov5s.yaml. You can change the structure in yolov5s.yaml. Here, we highly recommend pretraining from one of the released models to achieve good results even with limited data and resources. You can set --weights '', which randomizes the weight initialization, but the results will be really poor. Trust me on this.

When training, it will look like this:

Training Process

First you will see the structure of the model you are training, then the epoch-by-epoch training progress.
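You can also watch the metrics live in TensorBoard; a quick sketch, assuming YOLOv5's default runs/ log directory:

%load_ext tensorboard
%tensorboard --logdir runs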

After training, you will find your model saved at the path shown at the bottom of the output. We will use the best.pt checkpoint to test our model.

The model will be saved at runs/train/exp/weights/best.pt; the second time you train, it will be saved at runs/train/exp2/weights/best.pt. Each additional run increments the exp folder name.
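If you lose track of which run is the latest, a minimal sketch to find the most recent checkpoint:

import glob, os

# pick the best.pt that was written most recently
checkpoints = glob.glob("runs/train/exp*/weights/best.pt")
latest = max(checkpoints, key=os.path.getmtime)
print(latest)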

Model Prediction

At last, run this:

!python detect.py --weights runs/train/exp/weights/best.pt --conf 0.4 --source ../test/images/

This command runs prediction with the model runs/train/exp/weights/best.pt, taking images from the source folder ../test/images/ and drawing bounding boxes for every prediction with confidence greater than or equal to 40%.

Prediction will be super fast, as expected from YOLOv5.
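Recent versions of detect.py save the annotated images under runs/detect/exp (incrementing just like the training runs; check the path printed at the end of the detect output). A minimal sketch to peek at a few results right in Colab:

import glob
from IPython.display import Image, display

# show the first three annotated predictions
for path in glob.glob("runs/detect/exp/*.jpg")[:3]:
    display(Image(filename=path))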

Finally

Well, we are done; take a rest and have a cup of tea 🍵. You know where to find one 😋

If you are interested in my projects or have any new ideas you wanna talk about, feel free to contact me!


A BEER would be perfect, but remember NO CORONA! 🍻

Buy me a Beer