@wanghuijiao
2021-09-22T08:40:36.000000Z
Word count: 14802
Reads: 4756
Technical documentation
Project path: /ssd01/wanghuijiao/pose_detector02

Environment Setup (Linux is assumed)

Compiling with make
Run the `make` command. Useful build flags:

- GPU=1 — build with CUDA (CUDA is expected under /usr/local/cuda).
- CUDNN=1 — build with cuDNN v5-v7 (cuDNN is expected under /usr/local/cudnn).
- CUDNN_HALF=1 — build with half-precision (Tensor Core) support; on Titan V / Tesla V100 / DGX-2 or newer hardware this gives roughly 3x faster testing and 2x faster training.
- OPENCV=1 — build with OpenCV 4.x/3.x/2.4.x, so that videos and webcams can be used during testing.
- DEBUG=1 — build a debug version of YOLO.
- OPENMP=1 — use OpenMP acceleration on multi-core CPUs.
- LIBSO=1 — build the darknet.so shared library along with an executable that uses it.

Downloading the Pretrained Model
```
# Image test
./darknet detector test ./cfg/coco.data ./cfg/yolov4.cfg ./yolov4.weights data/dog.jpg -i 0 -thresh 0.25
# Video test
./darknet detector demo ./cfg/coco.data ./cfg/yolov4.cfg ./yolov4.weights test50.mp4 -i 0 -thresh 0.25
# RTSP stream test
./darknet detector demo ./cfg/coco.data ./cfg/yolov4.cfg ./yolov4.weights rtsp://admin:admin12345@192.168.0.228:554 -i 0 -thresh 0.25
```

Dataset location: 10.0.10.56:/ssd01/wanghuijiao/dataset/manfu_human/v1.0

Label Format
Each line of a label file has the format `<object-class> <x_center> <y_center> <width> <height>`; one line describes one bbox.

- `<object-class>` is the class index, ranging from 0 to classes-1.
- `<x_center> <y_center> <width> <height>` are the bbox center coordinates and its width and height, all expressed as floating-point fractions of the image width/height, e.g. `<x> = <absolute_x> / <image_width>` and `<height> = <absolute_height> / <image_height>`. For example, close_2m_stand_coldlight_longsleeve_backshot_withoutshelter_CJ67_P0964.mp4#t=1.966667.txt contains:
```
0 0.41476345840130513 0.4560756579689358 0.20309951060358902 0.9121513159378716
0 0.4942903752039152 0.3217391304347827 0.07014681892332793 0.6434782608695654
```
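The normalization rule above can be sketched in a few lines of Python (a hypothetical helper, not part of the darknet repo):

```python
def to_yolo(abs_box, image_width, image_height):
    """Convert an absolute-pixel box (x_min, y_min, width, height) into the
    normalized YOLO label format (x_center, y_center, width, height)."""
    x_min, y_min, w, h = abs_box
    return ((x_min + w / 2) / image_width,   # <x_center>
            (y_min + h / 2) / image_height,  # <y_center>
            w / image_width,                 # <width>
            h / image_height)                # <height>

# A 100x200 px box with top-left corner at (50, 40) in a 500x400 image:
print(to_yolo((50, 40, 100, 200), 500, 400))  # (0.2, 0.35, 0.2, 0.5)
```

The inverse (multiplying back by the image size) recovers pixel coordinates when drawing predictions.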
The image is from the CrowdHuman official website.

Configuration Files
Model config file: darknet/cfg/yolov4_pose.cfg. YOLOv4 defines the network structure in a .cfg file; such files normally live under darknet/cfg. Data config file: darknet/cfg/pose.data. YOLOv4 records dataset information in a .data file.
Example contents of darknet/cfg/pose.data:
```
classes = 6
train = /ssd01/wanghuijiao/dataset/manfu_human/v1.0/train_v1.0.txt
valid = /ssd01/wanghuijiao/dataset/manfu_human/v1.0/test_v1.0.txt
names = /ssd01/wanghuijiao/pose_detector02/data/pose.names
backup = ./backup_manfu_human_v1.0_512
eval = coco
```
darknet/data/pose.names records the label names of the dataset. For this experiment, darknet/data/pose.names contains:
```
stand
sit
squat
lie
half_lie
other_pose
```
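Since both the .data and .names files are plain text with a simple layout, they are easy to read programmatically; a minimal parser sketch (a hypothetical helper using only the standard library, not part of darknet):

```python
def parse_data_cfg(text):
    """Parse darknet .data contents ("key = value" per line) into a dict."""
    cfg = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        cfg[key.strip()] = value.strip()
    return cfg

example = "classes = 6\nnames = data/pose.names\neval = coco"
print(parse_data_cfg(example))  # {'classes': '6', 'names': 'data/pose.names', 'eval': 'coco'}
```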
Config files used in this experiment:

```
/ssd01/wanghuijiao/pose_detector02/cfg/yolov4-tiny_human_416.cfg
/ssd01/wanghuijiao/pose_detector02/cfg/crowdhuman-416.data
/ssd01/wanghuijiao/pose_detector02/data/crowdhuman.names
/ssd01/wanghuijiao/pose_detector02/cfg/yolov4-tiny_human_pose_416.cfg
/ssd01/wanghuijiao/pose_detector02/cfg/pose.data
/ssd01/wanghuijiao/pose_detector02/data/pose.names
```

Training Commands
Download csdarknet53-omega.conv.105; on this server it is at 10.0.10.56:/ssd01/wanghuijiao/pose_detector02/csdarknet53-omega.conv.105.
```
# data config:  cfg/pose_b.data
# model config: cfg/pose_yolov4.cfg
# ImageNet-pretrained weights: ./csdarknet53-omega.conv.105
./darknet detector train \
    cfg/pose_b.data \
    cfg/pose_yolov4.cfg \
    ./csdarknet53-omega.conv.105 \
    -dont_show \
    -map \
    -gpus 4,5,6,7
```
Tuning Tips
- For multi-GPU training, lower the learning rate to 0.00065 (roughly the default 0.00261 divided by the number of GPUs); with a single GPU, keep the default 0.00261.
- To enable mosaic augmentation, add mosaic=1 to the parameter block of the .cfg (around line 24); comment the line out to disable it. CutMix can be enabled the same way with cutmix=1.
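The multi-GPU rule above amounts to dividing the single-GPU rate by the number of GPUs; a quick check, assuming the default rate 0.00261:

```python
def multi_gpu_lr(base_lr, num_gpus):
    """Scale the learning rate for multi-GPU training: base rate / number of GPUs."""
    return base_lr / num_gpus

print(round(multi_gpu_lr(0.00261, 4), 5))  # 0.00065
```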
```
# 1. First train 6000 batches on the COCO person class, starting from the
#    official yolov4-tiny.conv.29 weights (trained on the 80 COCO classes).
./darknet detector train \
    cfg/yolov4-tiny_cocohuman_416.data \
    cfg/yolov4-tiny_human_416.cfg \
    yolov4-tiny.conv.29 \
    -dont_show \
    -map \
    -gpus 4,5,6,7

# 2. Then continue training for 14000 batches on the CrowdHuman person class,
#    starting from the batch-6000 weights trained on COCO person.
#    Adding -clear restarts from batch 0; without it, training resumes from batch 6000.
./darknet detector train \
    cfg/crowdhuman-416.data \
    cfg/yolov4-tiny_human_416.cfg \
    backup/yolov4-tiny_cocohuman_416/yolov4-tiny_human_416_best.weights \
    -dont_show \
    -map \
    -gpus 4,5,6,7
```
```
./darknet detector train \
    cfg/pose.data \
    cfg/yolov4-tiny_human_pose_416.cfg \
    backup/yolov4-tiny_manfu_humanPose_416/yolov4-tiny_human_pose_416_best.weights \
    -dont_show \
    -map \
    -gpus 4,5,6,7 \
    -clear
```

Performance Metrics

| Experiment / branch | Parameters | Accuracy |
|---|---|---|
| 0817 yolov4_tiny_crowdhuman_416 | IOU_th=0.25; w=416, h=416; pretrained model: yolov4-tiny.conv.29 | person ap = 59.12% |
Weights: /ssd01/wanghuijiao/pose_detector02/backup/yolov4_tiny_crowdhuman_416/yolov4-tiny_human_416_best.weights
Model Performance Analysis & Demo

Training curves of mAP, loss, and other metrics (only the last 20000 of the 40000 training batches were saved).
Performance Metrics

| Experiment / branch | Parameters | Accuracy |
|---|---|---|
| 0820 yolov4-tiny_manfu_humanPose_416 | IOU_th=0.25; w=416, h=416; test set: test_v1.0.txt; pretrained model: v4-tiny human detector | stand ap = 68.94%; sit ap = 84.11%; squat ap = 49.3%; lie ap = 32.5%; half_lie ap = 34.15%; other_pose ap = 14.10%; mAP = 47.19% |
Weights: /ssd01/wanghuijiao/pose_detector02/backup/yolov4-tiny_manfu_humanPose_416/yolov4-tiny_human_pose_416_best.weights
```
# test_v1.0.txt
# weights: ./backup_manfu_human_v1.0_512/pose_yolov4_best.weights
class_id = 0, name = stand, ap = 64.18% (TP = 2450, FP = 1119)
class_id = 1, name = sit, ap = 72.67% (TP = 2426, FP = 924)
class_id = 2, name = squat, ap = 33.98% (TP = 84, FP = 96)
class_id = 3, name = lie, ap = 23.06% (TP = 10, FP = 5)
class_id = 4, name = half_lie, ap = 33.38% (TP = 2, FP = 2)
class_id = 5, name = other_pose, ap = 1.21% (TP = 0, FP = 0)
for conf_thresh = 0.25, precision = 0.70, recall = 0.58, F1-score = 0.63
for conf_thresh = 0.25, TP = 4972, FP = 2146, FN = 3591, average IoU = 48.36 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.380806, or 38.08 %
```
```
# test_v1.0.txt
# weights: ./backup_manfu_human_v1.0_b_512/pose_yolov4_b_best.weights
class_id = 0, name = stand, ap = 68.38% (TP = 536, FP = 238)
class_id = 1, name = sit, ap = 79.06% (TP = 429, FP = 143)
class_id = 2, name = squat, ap = 63.94% (TP = 125, FP = 36)
class_id = 3, name = lie, ap = 55.34% (TP = 132, FP = 78)
class_id = 4, name = half_lie, ap = 30.82% (TP = 58, FP = 47)
class_id = 5, name = other_pose, ap = 34.12% (TP = 15, FP = 6)
for conf_thresh = 0.25, precision = 0.70, recall = 0.58, F1-score = 0.64
for conf_thresh = 0.25, TP = 1295, FP = 548, FN = 919, average IoU = 53.73 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.552780, or 55.28 %
```
```
calculation mAP (mean average precision)...
Detection layer: 139 - type = 27
Detection layer: 150 - type = 27
Detection layer: 161 - type = 27
4708
detections_count = 13112, unique_truth_count = 8563
class_id = 0, name = stand, ap = 73.59% (TP = 2901, FP = 856)
class_id = 1, name = sit, ap = 86.05% (TP = 2810, FP = 462)
class_id = 2, name = squat, ap = 57.60% (TP = 138, FP = 96)
class_id = 3, name = lie, ap = 38.77% (TP = 126, FP = 105)
class_id = 4, name = half_lie, ap = 44.31% (TP = 513, FP = 330)
class_id = 5, name = other_pose, ap = 39.18% (TP = 25, FP = 17)
for conf_thresh = 0.25, precision = 0.78, recall = 0.76, F1-score = 0.77
for conf_thresh = 0.25, TP = 6513, FP = 1866, FN = 2050, average IoU = 63.42 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.565842, or 56.58 %
Total Detection Time: 90 Seconds
Set -points flag:
 `-points 101` for MS COCO
 `-points 11` for PascalVOC 2007 (uncomment `difficult` in voc.data)
 `-points 0` (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset
mean_average_precision (mAP@0.5) = 0.565842
Saving weights to ./backup_manfu_human_v1.0_512/pose_yolov4_120000.weights
```
```
1036
detections_count = 2797, unique_truth_count = 2214
class_id = 0, name = stand, ap = 68.25% (TP = 530, FP = 140)
class_id = 1, name = sit, ap = 80.36% (TP = 457, FP = 104)
class_id = 2, name = squat, ap = 63.12% (TP = 148, FP = 38)
class_id = 3, name = lie, ap = 61.53% (TP = 164, FP = 50)
class_id = 4, name = half_lie, ap = 33.34% (TP = 133, FP = 114)
class_id = 5, name = other_pose, ap = 54.09% (TP = 35, FP = 16)
for conf_thresh = 0.25, precision = 0.76, recall = 0.66, F1-score = 0.71
for conf_thresh = 0.25, TP = 1467, FP = 462, FN = 747, average IoU = 61.19 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.601145, or 60.11 %
Total Detection Time: 19 Seconds
Set -points flag:
 `-points 101` for MS COCO
 `-points 11` for PascalVOC 2007 (uncomment `difficult` in voc.data)
 `-points 0` (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset
mean_average_precision (mAP@0.5) = 0.601145
Saving weights to ./backup_manfu_human_v1.0_b_512/pose_yolov4_b_120000.weights
```
```
calculation mAP (mean average precision)...
Detection layer: 30 - type = 27
Detection layer: 37 - type = 27
2696
detections_count = 57081, unique_truth_count = 11004
class_id = 0, name = person, ap = 59.12% (TP = 5692, FP = 1653)
for conf_thresh = 0.25, precision = 0.77, recall = 0.52, F1-score = 0.62
for conf_thresh = 0.25, TP = 5692, FP = 1653, FN = 5312, average IoU = 58.60 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.591248, or 59.12 %
Total Detection Time: 20 Seconds
Set -points flag:
 `-points 101` for MS COCO
 `-points 11` for PascalVOC 2007 (uncomment `difficult` in voc.data)
 `-points 0` (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset
mean_average_precision (mAP@0.5) = 0.591248
```
```
calculation mAP (mean average precision)...
Detection layer: 30 - type = 27
Detection layer: 37 - type = 27
2696
detections_count = 63890, unique_truth_count = 11004
class_id = 0, name = person, ap = 60.66% (TP = 5477, FP = 1682)
for conf_thresh = 0.25, precision = 0.77, recall = 0.50, F1-score = 0.60
for conf_thresh = 0.25, TP = 5477, FP = 1682, FN = 5527, average IoU = 58.19 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.606624, or 60.66 %
Total Detection Time: 22 Seconds
Set -points flag:
 `-points 101` for MS COCO
 `-points 11` for PascalVOC 2007 (uncomment `difficult` in voc.data)
 `-points 0` (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset
mean_average_precision (mAP@0.5) = 0.606624
```
Weights: /ssd01/wanghuijiao/pose_detector02/backup/yolov4-tiny_manfu_humanPose_416/yolov4-tiny_human_pose_416_best.weights
```
calculation mAP (mean average precision)...
Detection layer: 30 - type = 27
Detection layer: 37 - type = 27
4708
detections_count = 49418, unique_truth_count = 8563
class_id = 0, name = stand, ap = 68.94% (TP = 2448, FP = 969)
class_id = 1, name = sit, ap = 84.11% (TP = 2684, FP = 656)
class_id = 2, name = squat, ap = 49.37% (TP = 107, FP = 68)
class_id = 3, name = lie, ap = 32.50% (TP = 60, FP = 55)
class_id = 4, name = half_lie, ap = 34.15% (TP = 182, FP = 144)
class_id = 5, name = other_pose, ap = 14.10% (TP = 5, FP = 4)
for conf_thresh = 0.25, precision = 0.74, recall = 0.64, F1-score = 0.69
for conf_thresh = 0.25, TP = 5486, FP = 1896, FN = 3077, average IoU = 58.11 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.471944, or 47.19 %
Total Detection Time: 75 Seconds
```
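The per-class `ap` lines in the logs above all share one shape, so they can be collected with a short script (a hypothetical helper, not part of darknet):

```python
import re

AP_LINE = re.compile(r"name = (\w+), ap = ([\d.]+)%")

def parse_class_aps(log_text):
    """Extract {class_name: ap_percent} pairs from a darknet `detector map` log."""
    return {m.group(1): float(m.group(2)) for m in AP_LINE.finditer(log_text)}

sample = (
    "class_id = 0, name = stand, ap = 68.94% (TP = 2448, FP = 969)\n"
    "class_id = 1, name = sit, ap = 84.11% (TP = 2684, FP = 656)\n"
)
print(parse_class_aps(sample))  # {'stand': 68.94, 'sit': 84.11}
```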
Before Training

- Set random=1 in the .cfg; this helps precision across different input resolutions.
- Increase the model resolution in the .cfg, e.g. height=608, width=608.
- Check the data with -show_imgs: the correct bboxes should appear on the objects (in the display windows or in the aug_...jpg images).
- If images contain many objects, add max=200 or higher to the last [yolo] or [region] layer of the .cfg (YOLOv3 can detect at most 0.0615234375*(width*height) objects globally per image).
- For small objects, change line 895 of the .cfg to layers = 23, line 892 to stride=4, and line 989 to stride=4 (why these values work was left unexplained in the original notes).
- Set flip=0 in the .cfg to disable flip augmentation.
- As a rule of thumb: train_network_width * train_obj_width / train_image_width ~= detection_network_width * detection_obj_width / detection_image_width.
- To speed up training (at some cost in accuracy), set stopbackward=1 at layer 136 of the .cfg.

After Training — For Detection

- Increase the image resolution in the .cfg; this improves precision and helps detect small objects. There is no need to retrain: the same .weights file can be reused.
- If you hit Out of memory, raise subdivisions in the .cfg to 16, 32, or 64.
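The detection-capacity bound quoted in the pre-training tips, 0.0615234375*(width*height), matches the total number of prediction slots across YOLOv3's three output scales (strides 32/16/8, 3 anchors per grid cell, an assumption of the standard config); a quick check:

```python
def max_detections(width, height, strides=(32, 16, 8), anchors_per_cell=3):
    """Total number of prediction slots across all YOLO output scales."""
    return sum((width // s) * (height // s) * anchors_per_cell for s in strides)

print(max_detections(416, 416))  # 10647
print(0.0615234375 * 416 * 416)  # 10647.0
```

Indeed 3*(1/32^2 + 1/16^2 + 1/8^2) = 63/1024 = 0.0615234375, which is where the constant comes from.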