Article updated in 2020
http://m.itdecent.cn/p/2206db894b28
Tools
Darknet-YOLO:https://pjreddie.com/darknet/yolo/
labelImg:https://github.com/tzutalin/labelImg
Create the directories
Create the following layout under the darknet/scripts directory:
├── VOCdevkit
│   └── VOC2007
│       ├── Annotations
│       │   ├── 0a0a0b1a-7c39d841.xml
│       │   └── lena.xml
│       ├── ImageSets
│       │   ├── Layout
│       │   ├── Main
│       │   │   ├── test.txt
│       │   │   ├── train.txt
│       │   │   └── val.txt
│       │   └── Segmentation
│       ├── JPEGImages
│       │   ├── 0a0a0b1a-7c39d841.jpg
│       │   └── lena.jpg
│       └── labels
│           └── 0a0a0b1a-7c39d841.txt
└── voc_label.py
Where:
JPEGImages holds the training and test images.

Annotations holds the XML annotations in VOC format,
for example:
<annotation>
<folder>JPEGImages</folder>
<filename>0a0a0b1a-7c39d841.jpg</filename>
<path>/home/dew/CV2018/yolo/darknet/scripts/VOCdevkit/VOC2007/JPEGImages/0a0a0b1a-7c39d841.jpg</path>
<source>
<database>Unknown</database>
</source>
<size>
<width>1280</width>
<height>720</height>
<depth>3</depth>
</size>
<segmented>0</segmented>
<object>
<name>car</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>557</xmin>
<ymin>275</ymin>
<xmax>688</xmax>
<ymax>398</ymax>
</bndbox>
</object>
<object>
<name>car</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>160</xmin>
<ymin>297</ymin>
<xmax>252</xmax>
<ymax>373</ymax>
</bndbox>
</object>
<object>
<name>car</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>392</xmin>
<ymin>298</ymin>
<xmax>459</xmax>
<ymax>353</ymax>
</bndbox>
</object>
<object>
<name>car</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>492</xmin>
<ymin>304</ymin>
<xmax>523</xmax>
<ymax>345</ymax>
</bndbox>
</object>
</annotation>
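To make the structure of the annotation above concrete, here is a minimal sketch that reads the image size and the bounding boxes from a VOC XML string using Python's standard xml.etree.ElementTree. The function name parse_voc and the shortened sample string are assumptions made for illustration; they are not part of voc_label.py.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the annotation above (first object only).
SAMPLE = """<annotation>
<size><width>1280</width><height>720</height><depth>3</depth></size>
<object>
<name>car</name>
<difficult>0</difficult>
<bndbox><xmin>557</xmin><ymin>275</ymin><xmax>688</xmax><ymax>398</ymax></bndbox>
</object>
</annotation>"""


def parse_voc(xml_text):
    """Return ((width, height), [(name, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)
    boxes = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = tuple(int(box.find(k).text) for k in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((obj.find("name").text,) + coords)
    return (width, height), boxes


print(parse_voc(SAMPLE))
```

For a file on disk, the same function could be called with the text read from Annotations/0a0a0b1a-7c39d841.xml.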
The txt files under Main list the names (without extension) of the files used for testing and training,
for example:
0a0a0b1a-7c39d841
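The list files under Main can be written by hand, but with more than a few images a script helps. The sketch below collects the image names from JPEGImages and writes train.txt, val.txt and test.txt. The function name write_splits, the 80/10/10 split and the fixed seed are assumptions chosen for illustration; the original setup does not prescribe them.

```python
import random
from pathlib import Path


def write_splits(voc_root, train_frac=0.8, val_frac=0.1, seed=0):
    """Write ImageSets/Main/{train,val,test}.txt under voc_root.

    Each line is an image file name without its extension.
    The fractions and the seed are illustrative defaults.
    """
    voc_root = Path(voc_root)
    stems = sorted(p.stem for p in (voc_root / "JPEGImages").glob("*.jpg"))
    random.Random(seed).shuffle(stems)

    n_train = int(len(stems) * train_frac)
    n_val = int(len(stems) * val_frac)
    splits = {
        "train": stems[:n_train],
        "val": stems[n_train:n_train + n_val],
        "test": stems[n_train + n_val:],  # whatever remains
    }

    main_dir = voc_root / "ImageSets" / "Main"
    main_dir.mkdir(parents=True, exist_ok=True)
    for name, items in splits.items():
        (main_dir / f"{name}.txt").write_text("".join(s + "\n" for s in items))
    return splits
```

Run from darknet/scripts, a call such as write_splits("VOCdevkit/VOC2007") would fill the three files in ImageSets/Main.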
Convert the annotation format
Edit voc_label.py. For example, with only one class, car:
sets=[('2007', 'train'), ('2007', 'val'), ('2007', 'test')]
classes = ["car"]
'''
classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]
'''
Run scripts/voc_label.py
python ./voc_label.py
This generates a set of files in the directory and converts the VOC annotations into YOLO-format txt annotations (normalized); see /darknet/scripts/VOCdevkit/VOC2007/labels/0a0a0b1a-7c39d841.txt:
0 0.485546875 0.465972222222 0.10234375 0.170833333333
0 0.16015625 0.463888888889 0.071875 0.105555555556
0 0.331640625 0.450694444444 0.05234375 0.0763888888889
0 0.395703125 0.449305555556 0.02421875 0.0569444444444
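The normalized values can be checked by hand. Each label line is `class x_center y_center width height`, with every value divided by the image width or height. The sketch below converts one VOC box this way. The function name voc_to_yolo is hypothetical, and the 1-pixel offset on the center is inferred from the label values listed above, so treat it as an assumption if your copy of voc_label.py differs.

```python
def voc_to_yolo(img_w, img_h, xmin, ymin, xmax, ymax):
    """Convert a VOC pixel box to normalized YOLO (x_center, y_center, w, h)."""
    # The "- 1" reproduces the label values listed above.
    x_center = ((xmin + xmax) / 2.0 - 1) / img_w
    y_center = ((ymin + ymax) / 2.0 - 1) / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


# First <object> of the sample annotation: 1280x720 image, class index 0 ("car").
box = voc_to_yolo(1280, 720, 557, 275, 688, 398)
print(0, *box)
```

Up to rounding in the last digits, the printed values agree with the first label line above.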
Edit cfg/voc.data
classes= 1
train = /home/dew/Desktop/CV2018/yolo/darknet/scripts/2007_train.txt
valid = /home/dew/Desktop/CV2018/yolo/darknet/scripts/2007_val.txt
names = data/voc.names
backup = backup
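Edit cfg/yolov3-voc.cfg

Find each [yolo] section and the [convolutional] section immediately before it (3 places in total)
and change the following (filters belongs to the [convolutional] section; classes and random belong to the [yolo] section):
classes = number of annotated classes
filters=3*(classes+1+4)
random=0 // 1 if GPU memory is sufficient, 0 if not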
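As a quick check of the filters formula: each of the 3 anchor boxes per scale predicts 4 box coordinates, 1 objectness score and one score per class. The helper name yolo_filters is hypothetical and exists only for this illustration.

```python
def yolo_filters(num_classes, anchors_per_scale=3):
    """Filters for the [convolutional] section feeding a [yolo] section."""
    # Per anchor: 4 box coordinates + 1 objectness score + one score per class.
    return anchors_per_scale * (num_classes + 1 + 4)


print(yolo_filters(1))   # single class "car" -> 18
print(yolo_filters(20))  # the 20 VOC classes -> 75
```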
Edit data/voc.names
Back up the file, then replace its contents with the class names of the training set.
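Since classes in cfg/voc.data should agree with the number of names in data/voc.names, a small consistency check can catch mismatches before training. This is only a sketch: parse_data_cfg and check_class_count are hypothetical helpers, and the parser assumes simple `key = value` lines as shown in the voc.data listing.

```python
def parse_data_cfg(text):
    """Parse 'key = value' lines, as used in cfg/voc.data, into a dict."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        options[key.strip()] = value.strip()
    return options


def check_class_count(data_text, names_text):
    """Return True if 'classes' equals the number of non-empty name lines."""
    options = parse_data_cfg(data_text)
    names = [n for n in names_text.splitlines() if n.strip()]
    return int(options["classes"]) == len(names)


data_text = """classes= 1
names = data/voc.names
backup = backup"""
print(check_class_count(data_text, "car\n"))
```

In practice the two strings would be read from cfg/voc.data and data/voc.names.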
Download the pretrained weights file (convolutional layers only) and train
wget https://pjreddie.com/media/files/darknet53.conv.74
./darknet detector train cfg/voc.data cfg/yolov3-voc.cfg darknet53.conv.74
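Log explanation

Region xx: index of the yolo layer in the cfg file;
Avg IOU: average intersection over union (IoU) between the predicted boxes and the annotated boxes in the current iteration; higher is better, ideal value 1;
Class: classification accuracy for the annotated objects; higher is better, ideal value 1;
Obj: average objectness confidence at locations that contain an annotated object; higher is better, ideal value 1;
No Obj: average objectness confidence over all locations; lower is better;
.5R: recall at an IoU threshold of 0.5; recall = detected positives / actual positives;
.75R: recall at an IoU threshold of 0.75;
count: number of positive samples.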
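To make the IoU and recall figures concrete, here is a minimal sketch of both quantities. These are illustrative definitions, not Darknet's internal code; the function names are hypothetical, and the sketch assumes each annotated box has already been paired with one predicted box.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at(ious, threshold):
    """Fraction of annotated boxes whose paired prediction exceeds the IoU threshold."""
    if not ious:
        return 0.0
    return sum(1 for v in ious if v > threshold) / len(ious)


# Three annotated boxes with made-up IoU values for their paired predictions.
ious = [0.8, 0.6, 0.3]
print(recall_at(ious, 0.5), recall_at(ious, 0.75))
```

With these made-up values, two of the three boxes pass the 0.5 threshold and one passes 0.75, which is why the recall at 0.75 is never higher than the recall at 0.5.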