Mobility is a priority theme for the European Union in the context of urban development. At the same time, hundreds of people, including cyclists and pedestrians, lose their lives on the roads. It is therefore urgent to plan and organise cities with appropriate infrastructure, alongside a safe and efficient transport network geared towards active mobility, both on foot and by bicycle.
The object detection model presented here was trained to identify whether cyclists are wearing a helmet and, potentially, to study how prevalent helmet use is.
YOLOv5
YOLOv5 is the most recent version of YOLO, which was originally developed by Joseph Redmon. The first version runs in a framework called Darknet, which was purpose-built to execute YOLO. Version 5 is the second model not developed by the original author (after version 4), and the first to run in a state-of-the-art machine learning framework, PyTorch. The YOLOv5 GitHub repository contains a model pre-trained on the MS COCO dataset, together with benchmark tests on the same dataset (Figure 1) and detailed documentation on how to run it or retrain it on different data.
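As a rough illustration of how the pre-trained model might be loaded, the sketch below goes through PyTorch Hub. It assumes the repository in question is `ultralytics/yolov5` (the text does not name it) and that the four weight sets are called `yolov5s`, `yolov5m`, `yolov5l` and `yolov5x`. The helper name `load_pretrained_yolov5` and the image path in the comments are hypothetical.

```python
# The four weight sets assumed to ship with YOLOv5 (small to extra large).
YOLOV5_VARIANTS = ("yolov5s", "yolov5m", "yolov5l", "yolov5x")


def load_pretrained_yolov5(variant="yolov5s"):
    """Load COCO-pretrained YOLOv5 weights through PyTorch Hub.

    Hypothetical helper: it needs PyTorch installed and network access on
    the first call, because the repository and weights are downloaded and
    then cached locally.
    """
    if variant not in YOLOV5_VARIANTS:
        raise ValueError(
            f"unknown variant {variant!r}; expected one of {YOLOV5_VARIANTS}"
        )
    # Imported lazily so that this module can be loaded without PyTorch.
    import torch

    # Assumption: the repository is published as "ultralytics/yolov5".
    return torch.hub.load("ultralytics/yolov5", variant, pretrained=True)


# Example usage (not executed here; "cyclist.jpg" is a hypothetical path):
#   model = load_pretrained_yolov5("yolov5s")
#   results = model("cyclist.jpg")
#   results.print()
```

Smaller variants trade accuracy for speed and storage, so starting with `yolov5s` and moving up only if precision is insufficient is one reasonable approach.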
Figure 1. The most up-to-date YOLO model is version 5 (July 2020). It was released with four sets of weights that vary in accuracy and storage requirements. Including EfficientDet (the most accurate object detection model) in the comparison highlights the detection speed of YOLOv5 at a similarly high accuracy.
Table 1 compares the precision, speed and storage requirements of each available set of YOLOv5 weights.
Table 1. Specifications for all sets of weights released with YOLOv5. Generally, the higher the average precision, the more GPU processing power the model needs to run.
The architecture of YOLOv5 consists of three main parts, as in any single-stage object detector: the model backbone, the neck and the head. The backbone extracts the main features from a given input image. Version 5 uses Cross Stage Partial Networks, which have shown significant improvements in processing time with deeper networks. The neck, PANet, builds feature pyramids, which help the model generalize across object scales. The final detection step is performed by the head of the model (the same as in YOLOv3 and YOLOv4). It applies anchor boxes to the features and generates output vectors containing class probabilities and bounding boxes.
Each potential detection has an associated confidence score. It indicates how certain the model is that an object is present inside the bounding box and, at the same time, how well the box captures it.
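To make the confidence score more concrete, here is a minimal sketch in plain Python. It shows intersection over union (IoU), a common measure of how well a predicted box matches a reference box, and a simple filter that multiplies an objectness value by the best class probability and keeps detections above a threshold. The dictionary layout, the `helmet` / `no_helmet` labels, the example numbers and the threshold value are illustrative assumptions, not the output format of the actual model.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(detections, conf_threshold=0.25):
    """Keep detections whose objectness * best class probability
    reaches conf_threshold (an illustrative default)."""
    kept = []
    for det in detections:
        label, class_prob = max(det["class_probs"].items(), key=lambda kv: kv[1])
        score = det["objectness"] * class_prob
        if score >= conf_threshold:
            kept.append({"box": det["box"], "label": label, "score": score})
    return kept


# Hypothetical raw detections for a two-class helmet detector.
example_detections = [
    {"box": (10, 10, 50, 50), "objectness": 0.9,
     "class_probs": {"helmet": 0.8, "no_helmet": 0.2}},
    {"box": (60, 60, 90, 90), "objectness": 0.3,
     "class_probs": {"helmet": 0.6, "no_helmet": 0.4}},
]

confident = filter_detections(example_detections, conf_threshold=0.5)
```

Raising the threshold discards uncertain boxes at the cost of missing some true detections, so the value is usually tuned on validation data.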