This post shows how to train object detection models based on the YOLO architecture (links to research articles on the topic are in the «References» section below), obtain mAP and average loss statistics in Google Colab, and test the trained models using custom Python scripts.

Repository preparation

**1.** Clone the repository (DarkNet framework):

!git clone https://github.com/AlexeyAB/darknet.git

**2.** Create the «build-release» folder:

cd /content/drive/My\ Drive/darknet/
!mkdir build-release
cd /content/drive/My\ Drive/darknet/build-release/

**Note:** Optionally, before compiling, the user may comment out the lines in the source code that print to the terminal. By default the training process produces far too much output. Merely displaying it puts an additional load on RAM, and at some point it causes freezing and even sudden termination (the author experienced this several times). The excerpt below contains the output for only two iterations:

truncated
7: 679.634888, 679.691040 avg loss, 0.000000 rate, 0.385118 seconds, 224 images, 419.996108 hours left
Loaded: 2.210867 seconds - performance bottleneck on CPU or Disk HDD/SSD
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.605842, GIOU: 0.574251), Class: 0.500359, Obj: 0.501755, No Obj: 0.500504, .5R: 1.000000, .75R: 0.000000, count: 5, class_loss = 271.598236, iou_loss = 0.801147, total_loss = 272.399384 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500954, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.291870, iou_loss = 0.000000, total_loss = 1087.291870 
 total_bbox = 248, rewritten_bbox = 0.000000 % 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.614312, GIOU: 0.605125), Class: 0.500208, Obj: 0.501565, No Obj: 0.500505, .5R: 0.750000, .75R: 0.000000, count: 4, class_loss = 271.537018, iou_loss = 0.536957, total_loss = 272.073975 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500934, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.293945, iou_loss = 0.000000, total_loss = 1087.293945 
 total_bbox = 252, rewritten_bbox = 0.000000 % 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.640582, GIOU: 0.634447), Class: 0.499148, Obj: 0.500193, No Obj: 0.500503, .5R: 0.600000, .75R: 0.200000, count: 5, class_loss = 271.477295, iou_loss = 0.716827, total_loss = 272.194122 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500943, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.298706, iou_loss = 0.000000, total_loss = 1087.298706 
 total_bbox = 257, rewritten_bbox = 0.000000 % 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.578494, GIOU: 0.560010), Class: 0.500962, Obj: 0.502538, No Obj: 0.500505, .5R: 0.800000, .75R: 0.000000, count: 5, class_loss = 271.405792, iou_loss = 0.654083, total_loss = 272.059875 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500935, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.299683, iou_loss = 0.000000, total_loss = 1087.299683 
 total_bbox = 262, rewritten_bbox = 0.000000 % 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.635607, GIOU: 0.608577), Class: 0.500960, Obj: 0.502537, No Obj: 0.500506, .5R: 0.750000, .75R: 0.250000, count: 4, class_loss = 271.092285, iou_loss = 0.535187, total_loss = 271.627472 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500919, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.300903, iou_loss = 0.000000, total_loss = 1087.300903 
 total_bbox = 266, rewritten_bbox = 0.000000 % 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.571984, GIOU: 0.536152), Class: 0.500218, Obj: 0.501519, No Obj: 0.500505, .5R: 0.500000, .75R: 0.000000, count: 4, class_loss = 271.347656, iou_loss = 0.689331, total_loss = 272.036987 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500935, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.295410, iou_loss = 0.000000, total_loss = 1087.295410 
 total_bbox = 270, rewritten_bbox = 0.000000 % 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.674173, GIOU: 0.652403), Class: 0.500960, Obj: 0.502531, No Obj: 0.500503, .5R: 1.000000, .75R: 0.250000, count: 4, class_loss = 271.155334, iou_loss = 0.497925, total_loss = 271.653259 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500935, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.292358, iou_loss = 0.000000, total_loss = 1087.292358 
 total_bbox = 274, rewritten_bbox = 0.000000 % 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.639989, GIOU: 0.600127), Class: 0.500965, Obj: 0.502533, No Obj: 0.500506, .5R: 0.750000, .75R: 0.500000, count: 4, class_loss = 270.776886, iou_loss = 0.559204, total_loss = 271.336090 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500933, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.296143, iou_loss = 0.000000, total_loss = 1087.296143 
 total_bbox = 278, rewritten_bbox = 0.000000 % 

 8: 679.609314, 679.682861 avg loss, 0.000000 rate, 0.393729 seconds, 256 images, 418.879547 hours left
Loaded: 1.570684 seconds - performance bottleneck on CPU or Disk HDD/SSD
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.471209, GIOU: 0.434264), Class: 0.500210, Obj: 0.501556, No Obj: 0.500503, .5R: 0.500000, .75R: 0.000000, count: 4, class_loss = 271.537018, iou_loss = 0.976410, total_loss = 272.513428 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500933, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.296387, iou_loss = 0.000000, total_loss = 1087.296387 
 total_bbox = 282, rewritten_bbox = 0.000000 % 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.616591, GIOU: 0.601557), Class: 0.500208, Obj: 0.501559, No Obj: 0.500503, .5R: 1.000000, .75R: 0.250000, count: 4, class_loss = 271.410980, iou_loss = 0.328857, total_loss = 271.739838 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500925, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.294678, iou_loss = 0.000000, total_loss = 1087.294678 
 total_bbox = 286, rewritten_bbox = 0.000000 % 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.560825, GIOU: 0.559239), Class: 0.499954, Obj: 0.501222, No Obj: 0.500504, .5R: 0.833333, .75R: 0.000000, count: 6, class_loss = 271.661560, iou_loss = 0.975983, total_loss = 272.637543 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500941, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.299683, iou_loss = 0.000000, total_loss = 1087.299683 
 total_bbox = 292, rewritten_bbox = 0.000000 % 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.594659, GIOU: 0.592429), Class: 0.499454, Obj: 0.500582, No Obj: 0.500505, .5R: 0.750000, .75R: 0.250000, count: 4, class_loss = 271.161102, iou_loss = 0.735657, total_loss = 271.896759 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500947, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.291870, iou_loss = 0.000000, total_loss = 1087.291870 
 total_bbox = 296, rewritten_bbox = 0.000000 % 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.546971, GIOU: 0.527006), Class: 0.500188, Obj: 0.501606, No Obj: 0.500506, .5R: 0.500000, .75R: 0.000000, count: 4, class_loss = 271.537018, iou_loss = 0.620148, total_loss = 272.157166 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500932, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.293457, iou_loss = 0.000000, total_loss = 1087.293457 
 total_bbox = 300, rewritten_bbox = 0.000000 % 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.703408, GIOU: 0.697468), Class: 0.500990, Obj: 0.502102, No Obj: 0.500505, .5R: 1.000000, .75R: 0.200000, count: 5, class_loss = 271.091461, iou_loss = 0.492645, total_loss = 271.584106 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.326191, GIOU: 0.083068), Class: 0.502488, Obj: 0.502020, No Obj: 0.500926, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.353516, iou_loss = 0.583740, total_loss = 1087.937256 
 total_bbox = 306, rewritten_bbox = 0.000000 % 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.652279, GIOU: 0.637695), Class: 0.500962, Obj: 0.502520, No Obj: 0.500505, .5R: 0.750000, .75R: 0.250000, count: 4, class_loss = 270.902740, iou_loss = 0.349670, total_loss = 271.252411 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500928, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.301758, iou_loss = 0.000000, total_loss = 1087.301758 
 total_bbox = 310, rewritten_bbox = 0.000000 % 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.673402, GIOU: 0.651781), Class: 0.500969, Obj: 0.502514, No Obj: 0.500506, .5R: 0.750000, .75R: 0.500000, count: 4, class_loss = 271.029327, iou_loss = 0.494690, total_loss = 271.524017 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500919, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.297485, iou_loss = 0.000000, total_loss = 1087.297485 
 total_bbox = 314, rewritten_bbox = 0.000000 % 
truncated
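
Whether or not the detailed lines are silenced, the only line needed for average loss statistics is the per-iteration summary (the one ending in "hours left"). Below is a minimal sketch that extracts it, assuming the log has been saved as text in the format shown above. The regular expression and the `parse_summaries` name are illustrative and not part of DarkNet, and the pattern would need extending for values such as `nan`.

```python
import re

# Matches DarkNet iteration summary lines such as
# " 7: 679.634888, 679.691040 avg loss, 0.000000 rate, 0.385118 seconds, 224 images, ..."
SUMMARY_RE = re.compile(
    r"^\s*(\d+):\s*([\d.]+),\s*([\d.]+) avg loss,\s*([\d.]+) rate,"
    r"\s*([\d.]+) seconds,\s*(\d+) images"
)


def parse_summaries(lines):
    """Return one dict per iteration summary line; every other line is skipped."""
    rows = []
    for line in lines:
        m = SUMMARY_RE.match(line)
        if m:
            rows.append({
                "iteration": int(m.group(1)),
                "loss": float(m.group(2)),
                "avg_loss": float(m.group(3)),
                "rate": float(m.group(4)),
                "seconds": float(m.group(5)),
                "images": int(m.group(6)),
            })
    return rows


# Lines copied from the excerpt above (summary lines plus two that are skipped).
sample = [
    " 7: 679.634888, 679.691040 avg loss, 0.000000 rate, 0.385118 seconds, 224 images, 419.996108 hours left",
    "Loaded: 2.210867 seconds - performance bottleneck on CPU or Disk HDD/SSD",
    " total_bbox = 248, rewritten_bbox = 0.000000 % ",
    " 8: 679.609314, 679.682861 avg loss, 0.000000 rate, 0.393729 seconds, 256 images, 418.879547 hours left",
]

for row in parse_summaries(sample):
    print(row["iteration"], row["avg_loss"], row["images"])
```

The resulting list of dicts can be passed to any plotting or table library to chart the average loss against the iteration number.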

The length of the output per iteration depends on the number of subdivisions and on the convnet (DarkNet19, DarkNet53, DenseNet, ResNet, etc.). Subdivisions affect only memory utilization: the fewer the subdivisions, the higher the load on memory, since more images must be processed together at the same time.
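
To make this arithmetic concrete, here is a small sketch. The numbers are inferred from the excerpt above rather than stated in the post: the image counter grows from 224 to 256 between iterations (so `batch` is assumed to be 32), each iteration shows eight `total_bbox` lines (so `subdivisions` is assumed to be 8), and two detection layers report per group (Region 16 and Region 23). The function names are hypothetical.

```python
def images_per_step(batch, subdivisions):
    """Images processed together in memory: DarkNet divides batch by subdivisions."""
    return batch // subdivisions


def detail_lines_per_iteration(subdivisions, detection_layers):
    """Detailed lines printed per iteration in the excerpt's format:
    one line per detection layer plus one total_bbox line, for each subdivision."""
    return subdivisions * (detection_layers + 1)


# Values assumed from the log excerpt, not taken from a .cfg file.
batch, subdivisions, detection_layers = 32, 8, 2

print(images_per_step(batch, subdivisions))                    # 4 images at once
print(images_per_step(batch, 4))                               # 8 images at once
print(detail_lines_per_iteration(subdivisions, detection_layers))  # 24 lines
```

Halving the subdivisions doubles the number of images held in memory at once and halves the number of detailed lines printed per iteration.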

Open «darknet/src/region_layer.c» and comment out the following line:

printf("Region Avg IOU: %f, Class: %f, Obj: %f, No Obj: %f, Avg Recall: %f, count: %d\n", avg_iou/count, avg_cat/class_count, avg_obj/count, avg_anyobj/(l.w*l.h*l.n*l.batch), recall/count, count);

Open «darknet/src/yolo_layer.c» and comment out the following line:

fprintf(stderr, "v3 (%s loss, Normalizer: (iou: %.2f, cls: %.2f) Region %d Avg (IOU: %f, GIOU: %f), Class: %f, Obj: %f, No Obj: %f, .5R: %f, .75R: %f, count: %d, class_loss = %f, iou_loss = %f, total_loss = %f \n",
 (l.iou_loss == MSE ? "mse" : (l.iou_loss == GIOU ? "giou" : "iou")), l.iou_normalizer, l.cls_normalizer, state.index, tot_iou / count, tot_giou / count, avg_cat / class_count, avg_obj / count, avg_anyobj / (l.w*l.h*l.n*l.batch), recall / count, recall75 / count, count,
 classification_loss, iou_loss, loss);

In some cases the network ends with a «region» layer instead of a «yolo» layer (for example, with DenseNet), so the lines in both files should be commented out.
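
Instead of editing the two files by hand, the same change can be scripted from a notebook cell. The following is a rough sketch, not part of DarkNet: `comment_out_statement` and `patch_file` are hypothetical helpers, and the approach assumes the statement starts on the line containing the marker text and ends at the first line that finishes with `;`. The exact wording of the print statements may differ between DarkNet versions, so the markers may need adjusting.

```python
from pathlib import Path


def comment_out_statement(source, marker):
    """Prefix '// ' to the C statement that starts on the line containing `marker`.

    The statement is assumed to end at the first line whose stripped text
    ends with ';'. Lines that are already commented out are left alone.
    """
    out = []
    inside = False
    for line in source.splitlines():
        stripped = line.strip()
        if not inside and marker in line and not stripped.startswith("//"):
            inside = True
        if inside:
            out.append("// " + line)
            if stripped.endswith(";"):
                inside = False
        else:
            out.append(line)
    return "\n".join(out) + ("\n" if source.endswith("\n") else "")


def patch_file(path, marker):
    """Rewrite the file at `path` in place with the marked statement commented out."""
    p = Path(path)
    p.write_text(comment_out_statement(p.read_text(), marker))


# Small self-contained demonstration on a shortened, made-up snippet.
demo = (
    'int count = 0;\n'
    'fprintf(stderr, "v3 (%s loss, count: %d \\n",\n'
    '    name, count);\n'
    'return;\n'
)
print(comment_out_statement(demo, 'fprintf(stderr, "v3 ('))
```

On the real sources this could be called as `patch_file("darknet/src/yolo_layer.c", 'fprintf(stderr, "v3 (')` and `patch_file("darknet/src/region_layer.c", 'printf("Region Avg IOU')`. It is worth reviewing the changed lines before compiling, since commenting out a statement that is the only body of an `if` or a loop would change the control flow.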

