The following post shows how to train object detection models based on YOLO-architecture (links to research articles on this topic in the «References» down below), get mAP, average loss statistics in Google Colab and test trained models using custom Python scripts.
**1. **Clone the repository (DarkNet framework):
!git clone https://github.com/AlexeyAB/darknet.git
**2. **Create folder «build-release»:
cd /content/drive/My\ Drive/darknet/
!mkdir build-release
cd /content/drive/My\ Drive/darknet/build-release/
**Note: **Optionally, before compilation user might comment out the lines in the source code that produce printed output in the terminal, since by default training process produces output which is too redundant, additionally loads operating memory even to display the printed output (!), at some point causes freezing and even sudden termination (experienced several times by the author). The excerpt below contains output only for two iterations:
truncated
7: 679.634888, 679.691040 avg loss, 0.000000 rate, 0.385118 seconds, 224 images, 419.996108 hours left
Loaded: 2.210867 seconds - performance bottleneck on CPU or Disk HDD/SSD
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.605842, GIOU: 0.574251), Class: 0.500359, Obj: 0.501755, No Obj: 0.500504, .5R: 1.000000, .75R: 0.000000, count: 5, class_loss = 271.598236, iou_loss = 0.801147, total_loss = 272.399384
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500954, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.291870, iou_loss = 0.000000, total_loss = 1087.291870
total_bbox = 248, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.614312, GIOU: 0.605125), Class: 0.500208, Obj: 0.501565, No Obj: 0.500505, .5R: 0.750000, .75R: 0.000000, count: 4, class_loss = 271.537018, iou_loss = 0.536957, total_loss = 272.073975
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500934, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.293945, iou_loss = 0.000000, total_loss = 1087.293945
total_bbox = 252, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.640582, GIOU: 0.634447), Class: 0.499148, Obj: 0.500193, No Obj: 0.500503, .5R: 0.600000, .75R: 0.200000, count: 5, class_loss = 271.477295, iou_loss = 0.716827, total_loss = 272.194122
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500943, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.298706, iou_loss = 0.000000, total_loss = 1087.298706
total_bbox = 257, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.578494, GIOU: 0.560010), Class: 0.500962, Obj: 0.502538, No Obj: 0.500505, .5R: 0.800000, .75R: 0.000000, count: 5, class_loss = 271.405792, iou_loss = 0.654083, total_loss = 272.059875
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500935, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.299683, iou_loss = 0.000000, total_loss = 1087.299683
total_bbox = 262, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.635607, GIOU: 0.608577), Class: 0.500960, Obj: 0.502537, No Obj: 0.500506, .5R: 0.750000, .75R: 0.250000, count: 4, class_loss = 271.092285, iou_loss = 0.535187, total_loss = 271.627472
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500919, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.300903, iou_loss = 0.000000, total_loss = 1087.300903
total_bbox = 266, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.571984, GIOU: 0.536152), Class: 0.500218, Obj: 0.501519, No Obj: 0.500505, .5R: 0.500000, .75R: 0.000000, count: 4, class_loss = 271.347656, iou_loss = 0.689331, total_loss = 272.036987
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500935, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.295410, iou_loss = 0.000000, total_loss = 1087.295410
total_bbox = 270, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.674173, GIOU: 0.652403), Class: 0.500960, Obj: 0.502531, No Obj: 0.500503, .5R: 1.000000, .75R: 0.250000, count: 4, class_loss = 271.155334, iou_loss = 0.497925, total_loss = 271.653259
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500935, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.292358, iou_loss = 0.000000, total_loss = 1087.292358
total_bbox = 274, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.639989, GIOU: 0.600127), Class: 0.500965, Obj: 0.502533, No Obj: 0.500506, .5R: 0.750000, .75R: 0.500000, count: 4, class_loss = 270.776886, iou_loss = 0.559204, total_loss = 271.336090
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500933, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.296143, iou_loss = 0.000000, total_loss = 1087.296143
total_bbox = 278, rewritten_bbox = 0.000000 %
8: 679.609314, 679.682861 avg loss, 0.000000 rate, 0.393729 seconds, 256 images, 418.879547 hours left
Loaded: 1.570684 seconds - performance bottleneck on CPU or Disk HDD/SSD
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.471209, GIOU: 0.434264), Class: 0.500210, Obj: 0.501556, No Obj: 0.500503, .5R: 0.500000, .75R: 0.000000, count: 4, class_loss = 271.537018, iou_loss = 0.976410, total_loss = 272.513428
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500933, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.296387, iou_loss = 0.000000, total_loss = 1087.296387
total_bbox = 282, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.616591, GIOU: 0.601557), Class: 0.500208, Obj: 0.501559, No Obj: 0.500503, .5R: 1.000000, .75R: 0.250000, count: 4, class_loss = 271.410980, iou_loss = 0.328857, total_loss = 271.739838
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500925, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.294678, iou_loss = 0.000000, total_loss = 1087.294678
total_bbox = 286, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.560825, GIOU: 0.559239), Class: 0.499954, Obj: 0.501222, No Obj: 0.500504, .5R: 0.833333, .75R: 0.000000, count: 6, class_loss = 271.661560, iou_loss = 0.975983, total_loss = 272.637543
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500941, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.299683, iou_loss = 0.000000, total_loss = 1087.299683
total_bbox = 292, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.594659, GIOU: 0.592429), Class: 0.499454, Obj: 0.500582, No Obj: 0.500505, .5R: 0.750000, .75R: 0.250000, count: 4, class_loss = 271.161102, iou_loss = 0.735657, total_loss = 271.896759
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500947, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.291870, iou_loss = 0.000000, total_loss = 1087.291870
total_bbox = 296, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.546971, GIOU: 0.527006), Class: 0.500188, Obj: 0.501606, No Obj: 0.500506, .5R: 0.500000, .75R: 0.000000, count: 4, class_loss = 271.537018, iou_loss = 0.620148, total_loss = 272.157166
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500932, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.293457, iou_loss = 0.000000, total_loss = 1087.293457
total_bbox = 300, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.703408, GIOU: 0.697468), Class: 0.500990, Obj: 0.502102, No Obj: 0.500505, .5R: 1.000000, .75R: 0.200000, count: 5, class_loss = 271.091461, iou_loss = 0.492645, total_loss = 271.584106
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.326191, GIOU: 0.083068), Class: 0.502488, Obj: 0.502020, No Obj: 0.500926, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.353516, iou_loss = 0.583740, total_loss = 1087.937256
total_bbox = 306, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.652279, GIOU: 0.637695), Class: 0.500962, Obj: 0.502520, No Obj: 0.500505, .5R: 0.750000, .75R: 0.250000, count: 4, class_loss = 270.902740, iou_loss = 0.349670, total_loss = 271.252411
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500928, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.301758, iou_loss = 0.000000, total_loss = 1087.301758
total_bbox = 310, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.673402, GIOU: 0.651781), Class: 0.500969, Obj: 0.502514, No Obj: 0.500506, .5R: 0.750000, .75R: 0.500000, count: 4, class_loss = 271.029327, iou_loss = 0.494690, total_loss = 271.524017
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500919, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.297485, iou_loss = 0.000000, total_loss = 1087.297485
total_bbox = 314, rewritten_bbox = 0.000000 %
truncated
The length per iteration depends on the amount of subdivisions (affects only memory utilization — the less subdivisions, the higher workload on memory since it must process together more images at the same time) and convnet (DarkNet19, DarkNet53, DenseNet, ResNet etc.)
Open «darknet/src/region_layer.c» and comment out the following line:
printf(“Region Avg IOU: %f, Class: %f, Obj: %f, No Obj: %f, Avg Recall: %f, count: %d\n”, avg_iou/count, avg_cat/class_count, avg_obj/count, avg_anyobj/(l.w*l.h*l.n*l.batch), recall/count, count);
Open «darknet/src/yolo_layer.c» and comment out the following line:
fprintf(stderr, “v3 (%s loss, Normalizer: (iou: %.2f, cls: %.2f) Region %d Avg (IOU: %f, GIOU: %f), Class: %f, Obj: %f, No Obj: %f, .5R: %f, .75R: %f, count: %d, class_loss = %f, iou_loss = %f, total_loss = %f \n”,
(l.iou_loss == MSE ? “mse” : (l.iou_loss == GIOU ? “giou” : “iou”)), l.iou_normalizer, l.cls_normalizer, state.index, tot_iou / count, tot_giou / count, avg_cat / class_count, avg_obj / count, avg_anyobj / (l.w*l.h*l.n*l.batch), recall / count, recall75 / count, count,
classification_loss, iou_loss, loss);
In some cases in the end is used «region» layer instead of «yolo» (for ex., for DenseNet) therefore one should comment out lines in both files.
#convolutional-neural-net #python #object-detection #yolo #machine-learning