As an amateur baseball player, I have always wanted to analyze my pitching and swing to quantify my skills during practice. Several commercial tools do this using various technologies, but most of them are quite expensive and require extra equipment.
I wondered whether I could analyze my motion from a simple video shot on a phone. To do that, I need to extract information from the video. First I want to detect the baseball and identify its location in each frame; then I can do further analysis, such as estimating launch velocity and angle.
In this post, I show how I created a custom baseball dataset, trained an object detection model with Detectron2 (https://github.com/facebookresearch/detectron2), and applied it to baseball video.
The baseball detector was built in three steps, each of which is discussed in detail in this post:
1. Create a custom baseball dataset in COCO format
2. Play around with Detectron2 and train the model in Colab
3. Load the video/image and apply the trained model to make a detection
The baseball in a real video clip is usually not clear and perfect. It may be somewhat blurred and distorted, as shown below. Therefore, instead of using the pre-trained model provided by Detectron2, I decided to create a dataset of real baseball images taken from video for better detection.
Screenshot from a video of a baseball swing (image by author)
To start, I took some baseball game videos from YouTube and captured 120 images containing a baseball for this first attempt.
Training images captured from video clips.
Then I used LabelImg to hand-label the baseballs. LabelImg is a convenient tool for labeling objects, and it is easy to install and use by following the instructions in its repo.
Labeling a baseball using LabelImg
It took me around 20 minutes to manually label the 120 images. The annotations were then saved as XML files in the output folder I had set.
Then I used the voc2coco.py script from Tony607’s GitHub repo to convert the Pascal VOC XML files into a single COCO-formatted JSON file.
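To give a feel for what this conversion involves, here is a minimal sketch using only the Python standard library. It is not the voc2coco.py script mentioned above, just an illustration of the idea. It assumes each XML file follows the usual Pascal VOC layout (a `filename`, a `size` with `width` and `height`, and one `bndbox` per `object`), and the function name `voc_to_coco` and the single `baseball` category are my own choices for the example.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def voc_to_coco(xml_dir, categories=("baseball",)):
    """Collect every Pascal VOC .xml file in xml_dir into one COCO-style dict.

    Illustrative sketch only: category ids start at 1 and follow the order
    of `categories`; a label missing from `categories` raises KeyError.
    """
    cat_ids = {name: i + 1 for i, name in enumerate(categories)}
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": i, "name": n} for n, i in cat_ids.items()],
    }
    ann_id = 1
    xml_files = sorted(Path(xml_dir).glob("*.xml"))
    for img_id, xml_path in enumerate(xml_files, start=1):
        root = ET.parse(xml_path).getroot()
        size = root.find("size")
        coco["images"].append({
            "id": img_id,
            "file_name": root.findtext("filename"),
            "width": int(size.findtext("width")),
            "height": int(size.findtext("height")),
        })
        for obj in root.iter("object"):
            box = obj.find("bndbox")
            # Pascal VOC stores the two corners of the box.
            xmin, ymin, xmax, ymax = (
                int(float(box.findtext(tag)))
                for tag in ("xmin", "ymin", "xmax", "ymax")
            )
            w, h = xmax - xmin, ymax - ymin
            coco["annotations"].append({
                "id": ann_id,
                "image_id": img_id,
                "category_id": cat_ids[obj.findtext("name")],
                "bbox": [xmin, ymin, w, h],  # COCO uses [x, y, width, height]
                "area": w * h,
                "iscrowd": 0,
            })
            ann_id += 1
    return coco
```

Calling `voc_to_coco` on the annotation folder and writing the result with `json.dump` gives a single JSON file. The main difference between the two formats is visible in the `bbox` line: Pascal VOC stores two corners, while COCO stores the top-left corner plus width and height.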
Now I have a custom baseball dataset with annotations in COCO format, ready for training.