As an amateur baseball player, I always want to analyze my pitching and swing to quantify my skills during practice. There are commercial tools that use various technologies to do this, but most of them are quite expensive and require extra equipment.

I wondered whether I could analyze my motion from a simple video shot on a phone. To do that, I need to extract information from the video. First I want to detect the baseball and identify its location in each frame; then I can do further analysis, such as estimating launch velocity and angle.

In this post, I am going to show how I created a custom baseball dataset, trained an object detection model, and applied it to baseball video using Detectron2.

The baseball detector was built in three steps, which are discussed in detail in this post:

1. Create a custom baseball dataset in COCO format

2. Play around with Detectron2 and train the model in Colab

3. Load a video or image and apply the trained model to detect the baseball

Create a custom baseball dataset in COCO format

The baseball in a real video clip is usually not clear and perfect. It might be a bit blurry and distorted, as shown below. Therefore, instead of using the pre-trained model provided by Detectron2, I decided to create a dataset of real baseball images taken from video for better detection.

Screenshot from a video of a swing at a baseball (image by author)

To start, I used some baseball game videos on YouTube and collected 120 images containing a baseball for this first attempt.

Training images captured from video clips.
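The post does not say how the frames were captured, so the following is only a minimal sketch of one possible approach: sampling evenly spaced frames from a clip with OpenCV. The function names, file paths, and sample count are hypothetical, and the sampled frames would still need to be filtered by hand to keep only those where the ball is visible.

```python
import os


def sample_frame_indices(total_frames, n_samples):
    """Return up to n_samples evenly spaced frame indices in [0, total_frames)."""
    if total_frames <= 0 or n_samples <= 0:
        return []
    n = min(n_samples, total_frames)
    step = total_frames / n
    return [int(i * step) for i in range(n)]


def save_sampled_frames(video_path, out_dir, n_samples=120):
    """Write evenly spaced frames of a video to out_dir as JPEG files.

    Assumes OpenCV (opencv-python) is installed; it is imported here so that
    the index helper above can be used without it.
    """
    import cv2

    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        raise IOError(f"Could not open video: {video_path}")

    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    saved = []
    for idx in sample_frame_indices(total, n_samples):
        # Jump to the requested frame, then decode it.
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
        ok, frame = cap.read()
        if not ok:
            continue  # skip frames that fail to decode
        path = os.path.join(out_dir, f"frame_{idx:06d}.jpg")
        cv2.imwrite(path, frame)
        saved.append(path)
    cap.release()
    return saved


# Hypothetical usage:
# save_sampled_frames("game_clip.mp4", "images", n_samples=120)
```

Even sampling is just a convenient starting point; for a small, fast-moving object like a baseball, it may be more efficient to sample densely only around the pitch or the swing.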

Then I used LabelImg to hand-label the baseball. LabelImg is a convenient tool for labeling objects, and it is easy to install and use by following the instructions in its repo.

Labeling the baseball using LabelImg

  1. Select “PascalVOC”, which is the default annotation format
  2. Open the folder that contains the images
  3. Set the folder in which to save the annotations
  4. Label the ball and save the annotation

It took me around 20 minutes to manually label the 120 images. The annotations were then saved as XML files in the folder I had set.

Then I used this script from Tony607’s GitHub repo to convert the PascalVOC XML files into a single COCO-formatted JSON file.
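To illustrate what such a conversion involves, here is a minimal sketch using only the Python standard library. It is not the script from Tony607’s repo, and its output may differ from that script in details. The function name `voc_to_coco`, the sample annotation, and the single `baseball` class are assumptions made for the example. The key step is turning the PascalVOC corner coordinates (xmin, ymin, xmax, ymax) into the COCO box format [x, y, width, height].

```python
import json
import xml.etree.ElementTree as ET

# A small annotation in the PascalVOC layout that LabelImg writes
# (the file name and coordinates are made up for illustration).
SAMPLE_XML = """
<annotation>
  <filename>frame_000001.jpg</filename>
  <size><width>1280</width><height>720</height><depth>3</depth></size>
  <object>
    <name>baseball</name>
    <bndbox><xmin>100</xmin><ymin>200</ymin><xmax>120</xmax><ymax>222</ymax></bndbox>
  </object>
</annotation>
"""


def voc_to_coco(xml_texts, class_names=("baseball",)):
    """Convert PascalVOC XML strings into one COCO-style dictionary."""
    categories = [{"id": i + 1, "name": n} for i, n in enumerate(class_names)]
    name_to_id = {c["name"]: c["id"] for c in categories}
    images, annotations = [], []

    for image_id, text in enumerate(xml_texts, start=1):
        root = ET.fromstring(text)
        size = root.find("size")
        images.append({
            "id": image_id,
            "file_name": root.findtext("filename"),
            "width": int(size.findtext("width")),
            "height": int(size.findtext("height")),
        })
        for obj in root.findall("object"):
            box = obj.find("bndbox")
            xmin, ymin, xmax, ymax = (
                float(box.findtext(k)) for k in ("xmin", "ymin", "xmax", "ymax")
            )
            w, h = xmax - xmin, ymax - ymin
            annotations.append({
                "id": len(annotations) + 1,
                "image_id": image_id,
                "category_id": name_to_id[obj.findtext("name")],
                "bbox": [xmin, ymin, w, h],  # COCO uses [x, y, width, height]
                "area": w * h,
                "iscrowd": 0,
            })

    return {"images": images, "annotations": annotations, "categories": categories}


coco = voc_to_coco([SAMPLE_XML])
print(json.dumps(coco, indent=2))
```

In practice the XML strings would be read from the annotation folder, and the resulting dictionary would be written to a single file with `json.dump`.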

Now I have a custom baseball dataset with annotations in COCO format, ready for training.
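Before training, it can be worth checking that the converted file is internally consistent. The sketch below is an optional sanity check and not part of the original workflow; the function name `check_coco` and the tiny inline dataset are hypothetical. It verifies that every annotation points to a known image and category and that each box has a positive size and lies inside its image.

```python
def check_coco(coco):
    """Return a list of problems found in a COCO-style dict (empty if none)."""
    problems = []
    images = {img["id"]: img for img in coco["images"]}
    category_ids = {c["id"] for c in coco["categories"]}

    for ann in coco["annotations"]:
        img = images.get(ann["image_id"])
        if img is None:
            problems.append(f"annotation {ann['id']}: unknown image_id {ann['image_id']}")
            continue
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id {ann['category_id']}")
        x, y, w, h = ann["bbox"]  # COCO boxes are [x, y, width, height]
        if w <= 0 or h <= 0:
            problems.append(f"annotation {ann['id']}: non-positive box size")
        elif x < 0 or y < 0 or x + w > img["width"] or y + h > img["height"]:
            problems.append(f"annotation {ann['id']}: box outside image bounds")
    return problems


# Tiny made-up dataset; a real one would be loaded with json.load().
example = {
    "images": [{"id": 1, "file_name": "frame_000001.jpg", "width": 1280, "height": 720}],
    "categories": [{"id": 1, "name": "baseball"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [100, 200, 20, 22]},
    ],
}
print(check_coco(example))
```

An empty list means none of these particular checks failed; it does not guarantee that the labels themselves are accurate.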