PyTorch Object Detection: The COCO API Explained and Dataset Generation

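Abstract

Before an object detector can be trained, the data has to be prepared, and there are several formats it can come in. This post walks through the common ones. The COCO format is still the one most often used for training.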

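VOC and COCO datasets

When people say VOC they usually mean the dataset that has been around since 2007: images plus XML files that hold the annotations. XML is the raw file that an annotation tool such as labelImg writes after labeling. A JSON file instead collects everything into one dictionary-style structure, which is much faster to read than one XML file per image. In the standard VOC layout, JPEGImages holds the images, Annotations holds the XML files, and ImageSets holds the text files that list which images belong to each split. The remaining Segmentation folders hold the semantic-segmentation labels and can be skipped for now.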
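To make the XML side concrete, here is a minimal sketch of reading one VOC-style annotation with the standard library. The XML string is hand-written for illustration (the file name, size and boxes are hypothetical), and `parse_voc` is a helper name introduced here, not part of any library.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the usual VOC layout, written by hand for this example.
VOC_XML = """
<annotation>
  <filename>img_10.jpg</filename>
  <size><width>750</width><height>690</height><depth>3</depth></size>
  <object>
    <name>starfish</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>scallop</name>
    <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>498</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc(xml_text):
    """Return (filename, width, height, [(name, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text.strip())
    size = root.find('size')
    width = int(size.findtext('width'))
    height = int(size.findtext('height'))
    objects = []
    for obj in root.iter('object'):
        box = obj.find('bndbox')
        coords = tuple(int(box.findtext(tag))
                       for tag in ('xmin', 'ymin', 'xmax', 'ymax'))
        objects.append((obj.findtext('name'),) + coords)
    return root.findtext('filename'), width, height, objects


filename, width, height, objects = parse_voc(VOC_XML)
print(filename, width, height)
for obj in objects:
    print(obj)
```

Every image needs its own XML file parsed like this, which is why collecting everything into a single JSON file is more convenient for training.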

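Most recent object detection papers report mAP on the COCO dataset to show how good an algorithm is, so it is worth looking at the COCO layout first. In my prepared copy, object detection only needs val2017 and train2017, which both hold images, and annotations, which holds the JSON files.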

The contents of train2017: nothing but image files.

The images under val2017.

Only two JSON files are needed here, one for training and one for validation. Let's open val2017.json and look at it in detail.

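It looks messy, and that is because each image is recorded in great detail: not only what object detection needs, but also annotations for several other tasks. When we generate our own COCO-style dataset we don't need all of that. Three keys are enough: images, annotations and categories. images records the images, annotations records the box information, and categories records the classes. Let's get a feel for it using data I made myself.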

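Each entry under images records an image's file name, height and width, plus a unique id. It feels a bit like a database table, and the ids double as a way of counting the images.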

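annotations records the box information. Each box has to know which image it belongs to, so it stores that image's id (image_id), together with the class (category_id), the box itself (bbox, as [x, y, w, h]) and the area of the box.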
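Because bbox is stored as [x, y, w, h] rather than as two corners, it helps to have a pair of conversion helpers at hand. The sketch below uses the box values from the sample annotation in this post; the function names are made up for illustration, and it assumes the simple convention w = x2 - x1 (the generation script later in this post uses inclusive pixel corners and adds 1 instead).

```python
def xywh_to_xyxy(x, y, w, h):
    """COCO format [x, y, w, h] -> corner format [x1, y1, x2, y2]."""
    return [x, y, x + w, y + h]


def xyxy_to_xywh(x1, y1, x2, y2):
    """Corner format [x1, y1, x2, y2] -> COCO format [x, y, w, h]."""
    return [x1, y1, x2 - x1, y2 - y1]


bbox = [933, 88, 2178, 2559]        # x, y, w, h
corners = xywh_to_xyxy(*bbox)
area = bbox[2] * bbox[3]            # the 'area' field is w * h for a plain box

print(corners)
print(xyxy_to_xywh(*corners) == bbox)
print(area)
```

Many detection models expect corner coordinates, so this conversion usually happens once when the data is loaded.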

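categories records the different classes. The name can be written in Chinese, or in the compressed encoded form that appears in the file. At this point you should have a thorough picture of COCO-format data. Next we learn to use the API to process the data quickly.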
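Putting the three keys together, a complete COCO-style file can be very small. The following is a sketch only: `make_coco` is a hypothetical helper, and the image and box values are invented for the example. The class names are the ones used in the generation script later in this post.

```python
import json


def make_coco(images, boxes, category_names):
    """Build a minimal COCO-style dict.

    images: list of (file_name, width, height)
    boxes:  list of (image_index, category_name, x, y, w, h)
    """
    categories = [{'id': i + 1, 'name': name}
                  for i, name in enumerate(category_names)]
    cat_ids = {cat['name']: cat['id'] for cat in categories}
    coco = {'images': [], 'annotations': [], 'categories': categories}
    for i, (file_name, width, height) in enumerate(images):
        coco['images'].append({'file_name': file_name, 'id': i + 1,
                               'height': height, 'width': width})
    for j, (image_index, name, x, y, w, h) in enumerate(boxes):
        coco['annotations'].append({
            'id': j + 1,                  # unique per annotation
            'image_id': image_index + 1,  # points back into 'images'
            'bbox': [x, y, w, h],
            'category_id': cat_ids[name],
            'area': w * h,
            'iscrowd': 0,
        })
    return coco


dataset = make_coco(
    images=[('img_10.jpg', 750, 690)],
    boxes=[(0, 'starfish', 47, 239, 148, 132)],
    category_names=['holothurian', 'echinus', 'scallop', 'starfish'],
)
print(json.dumps(dataset, indent=2))
```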

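The COCO API

The COCO API is built specifically for working with these JSON files. The JSON stores the image information as dictionaries, and without the API we would have to write the reading code ourselves, which is tedious. With the COCO API, a few simple calls are enough to pull the data out and load it for training.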

from pycocotools.coco import COCO

coco = COCO('./test/annotations.json')
ids = list(coco.imgs.keys())   # reads the internal imgs dict directly, a somewhat unusual way to load
value = list(coco.imgs.values())
print(ids)
print(value)

Output:

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
[{'file_name': '20190816_095611.jpg', 'id': 1, 'height': 4032, 'width': 3024}, {'file_name': '20190816_095633.jpg', 'id': 2, 'height': 4032, 'width': 3024}, {'file_name': '81872f020e8ac5489c0c51cad67c435.jpg', 'id': 3, 'height': 1440, 'width': 1080}, {'file_name': '9d0ac700d8dafd5c568fd3d78224ffb.jpg', 'id': 4, 'height': 1440, 'width': 1080}, {'file_name': 'eed7d90379acd8c427b5b73f0a229e6.jpg', 'id': 5, 'height': 1440, 'width': 1080}, {'file_name': 'img_10.jpg', 'id': 6, 'height': 690, 'width': 750}, {'file_name': 'img_100.jpg', 'id': 7, 'height': 357, 'width': 688}, {'file_name': 'img_111.jpg', 'id': 8, 'height': 227, 'width': 287}, {'file_name': 'img_18.jpg', 'id': 9, 'height': 500, 'width': 375}, {'file_name': 'img_22.jpg', 'id': 10, 'height': 145, 'width': 210}, {'file_name': 'img_35.jpg', 'id': 11, 'height': 800, 'width': 800}, {'file_name': 'img_36.jpg', 'id': 12, 'height': 220, 'width': 293}, {'file_name': 'img_44.jpg', 'id': 13, 'height': 415, 'width': 475}, {'file_name': 'img_54.jpg', 'id': 14, 'height': 369, 'width': 429}, {'file_name': 'img_65.jpg', 'id': 15, 'height': 768, 'width': 1024}, {'file_name': 'img_78.jpg', 'id': 16, 'height': 645, 'width': 700}, {'file_name': 'img_83.jpg', 'id': 17, 'height': 736, 'width': 800}, {'file_name': 'img_92.jpg', 'id': 18, 'height': 210, 'width': 295}, {'file_name': 'img_97.jpg', 'id': 19, 'height': 768, 'width': 1024}]

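This is the simplest way to load: take the image ids and values and inspect each image's record from images. Everything was already read when the COCO object was constructed, so here we only need to pull out the information we want. Next, a few commonly used functions.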
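To see why everything is "already read", it helps to know that pycocotools builds lookup tables such as imgs, anns, cats and imgToAnns when the file is loaded. The sketch below imitates that idea in plain Python. It is a simplified illustration, not the library's code, and `build_index` is a name invented here; the records are copied from the outputs shown in this post.

```python
from collections import defaultdict


def build_index(dataset):
    """Build id -> record lookup tables from a COCO-style dict."""
    imgs = {img['id']: img for img in dataset['images']}
    anns = {ann['id']: ann for ann in dataset['annotations']}
    cats = {cat['id']: cat for cat in dataset['categories']}
    img_to_anns = defaultdict(list)   # image id -> all annotations on that image
    for ann in dataset['annotations']:
        img_to_anns[ann['image_id']].append(ann)
    return imgs, anns, cats, img_to_anns


dataset = {
    'images': [
        {'file_name': '20190816_095611.jpg', 'id': 1, 'height': 4032, 'width': 3024},
        {'file_name': '20190816_095633.jpg', 'id': 2, 'height': 4032, 'width': 3024},
    ],
    'annotations': [
        {'id': 2, 'image_id': 2, 'bbox': [933, 88, 2178, 2559],
         'category_id': 1, 'area': 5573502, 'iscrowd': 0},
    ],
    'categories': [{'id': 1, 'name': '書籍紙張'}],
}

imgs, anns, cats, img_to_anns = build_index(dataset)
print(list(imgs.keys()))
print(imgs[2]['file_name'])
print(len(img_to_anns[2]))
```

With tables like these in memory, looking up an image or all boxes on an image is a dictionary access rather than a scan of the file.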

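The three get methods

getAnnIds, getCatIds and getImgIds do what their names say (the function names in COCO are not picked at random): they return the ids of the box annotations, the ids of the categories, and the ids of the images, ready for the next step.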

from pycocotools.coco import COCO

coco = COCO('./test/annotations.json')
ids1 = coco.getAnnIds()
print(ids1)
ids2 = coco.getImgIds()
print(ids2)
ids3 = coco.getCatIds()
print(ids3)

Output:

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43]

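As you can see, each call returns the ids that were recorded in the file. You need these ids first, because the load functions in the next step take ids as their input.

The three load methods

loadAnns, loadCats and loadImgs are the calls that actually fetch the data.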

from pycocotools.coco import COCO

coco = COCO('./test/annotations.json')
ids1 = coco.getAnnIds()
ids2 = coco.getImgIds()
ids3 = coco.getCatIds()
data1 = coco.loadAnns(ids1[1])
print(data1)
data2 = coco.loadImgs(ids2[1])   # image ids come from getImgIds, not getCatIds
print(data2)
data3 = coco.loadCats(ids3[1])
print(data3)

Output:

[{'id': 2, 'image_id': 2, 'bbox': [933, 88, 2178, 2559], 'category_id': 1, 'area': 5573502, 'iscrowd': 0}]
[{'file_name': '20190816_095633.jpg', 'id': 2, 'height': 4032, 'width': 3024}]
[{'id': 1, 'name': '書籍紙張'}]
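In practice the get and load calls are chained: pick an image, ask for the annotation ids that belong to it, then load them. The helper below is a sketch of that pattern. `boxes_for_image` is a name invented for this example, and it assumes `coco` is a pycocotools COCO object like the one created above.

```python
def boxes_for_image(coco, img_id):
    """Collect the file name and (category name, bbox) pairs for one image.

    `coco` is assumed to be a pycocotools.coco.COCO instance.
    """
    img = coco.loadImgs(img_id)[0]
    ann_ids = coco.getAnnIds(imgIds=img_id)   # only annotations on this image
    anns = coco.loadAnns(ann_ids)
    boxes = []
    for ann in anns:
        cat = coco.loadCats(ann['category_id'])[0]
        boxes.append((cat['name'], ann['bbox']))
    return img['file_name'], boxes


# Usage, assuming pycocotools is installed and the annotation file exists:
#   from pycocotools.coco import COCO
#   coco = COCO('./test/annotations.json')
#   print(boxes_for_image(coco, coco.getImgIds()[0]))
```

This is essentially what a training dataset class does for every sample: one image id in, one image record and its boxes out.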

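With these few simple operations you can already analyze the data and compute all kinds of statistics over the JSON.

Generating the COCO-format JSON file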
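As one example of such statistics, the sketch below counts boxes per class and per image and lists images that have no boxes at all. It works on the plain dictionary loaded from the JSON, so it needs only the standard library. The toy data and the `summarize` helper are hypothetical.

```python
from collections import Counter


def summarize(dataset):
    """Return (boxes per class name, boxes per image id, unlabeled file names)."""
    cat_names = {cat['id']: cat['name'] for cat in dataset['categories']}
    per_class = Counter(cat_names[ann['category_id']]
                        for ann in dataset['annotations'])
    boxes_per_image = Counter(ann['image_id'] for ann in dataset['annotations'])
    unlabeled = [img['file_name'] for img in dataset['images']
                 if boxes_per_image[img['id']] == 0]
    return per_class, boxes_per_image, unlabeled


# Hypothetical toy data; a real file would be read with json.load instead.
dataset = {
    'images': [
        {'file_name': 'a.jpg', 'id': 1, 'height': 100, 'width': 100},
        {'file_name': 'b.jpg', 'id': 2, 'height': 100, 'width': 100},
        {'file_name': 'c.jpg', 'id': 3, 'height': 100, 'width': 100},
    ],
    'annotations': [
        {'id': 1, 'image_id': 1, 'bbox': [0, 0, 10, 10], 'category_id': 4, 'area': 100, 'iscrowd': 0},
        {'id': 2, 'image_id': 1, 'bbox': [5, 5, 10, 10], 'category_id': 3, 'area': 100, 'iscrowd': 0},
        {'id': 3, 'image_id': 2, 'bbox': [0, 0, 20, 20], 'category_id': 4, 'area': 400, 'iscrowd': 0},
    ],
    'categories': [{'id': 3, 'name': 'scallop'}, {'id': 4, 'name': 'starfish'}],
}

per_class, boxes_per_image, unlabeled = summarize(dataset)
print(dict(per_class))
print(dict(boxes_per_image))
print(unlabeled)
```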

import os
import cv2
import json
import xml.dom.minidom

# Root directory; it must contain an 'image' folder and a 'box' folder (change this path to match your setup)
data_dir = './data'

image_file_dir = os.path.join(data_dir, 'image')
xml_file_dir = os.path.join(data_dir, 'box')

annotations_info = {'images': [], 'annotations': [], 'categories': []}

categories_map = {'holothurian': 1, 'echinus': 2, 'scallop': 3, 'starfish': 4}

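for key in categories_map:
    category_info = {"id": categories_map[key], "name": key}
    annotations_info['categories'].append(category_info)

# splitext keeps file names that contain extra dots intact; every image is
# assumed to be a .jpg with an .xml file of the same base name.
# Sorting makes the assigned image ids reproducible between runs.
file_names = [os.path.splitext(image_file_name)[0]
              for image_file_name in sorted(os.listdir(image_file_dir))]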
ann_id = 1
for i, file_name in enumerate(file_names):
    print(i)
    image_file_name = file_name + '.jpg'
    xml_file_name = file_name + '.xml'
    image_file_path = os.path.join(image_file_dir, image_file_name)
    xml_file_path = os.path.join(xml_file_dir, xml_file_name)

    # Only the image size is needed, so no color conversion is required
    image = cv2.imread(image_file_path)
    if image is None:
        raise FileNotFoundError('cannot read image: ' + image_file_path)
    height, width = image.shape[:2]
    image_info = {'file_name': image_file_name, 'id': i+1,
                  'height': height, 'width': width}
    annotations_info['images'].append(image_info)

    DOMTree = xml.dom.minidom.parse(xml_file_path)
    collection = DOMTree.documentElement

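    # Collects every <name> tag in the file, so this assumes each <object>
    # contains exactly one <name> and no other element uses that tag
    names = collection.getElementsByTagName('name')
    names = [name.firstChild.data for name in names]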

    xmins = collection.getElementsByTagName('xmin')
    xmins = [xmin.firstChild.data for xmin in xmins]
    ymins = collection.getElementsByTagName('ymin')
    ymins = [ymin.firstChild.data for ymin in ymins]
    xmaxs = collection.getElementsByTagName('xmax')
    xmaxs = [xmax.firstChild.data for xmax in xmaxs]
    ymaxs = collection.getElementsByTagName('ymax')
    ymaxs = [ymax.firstChild.data for ymax in ymaxs]

    object_num = len(names)

    for j in range(object_num):
        # Skip objects whose class is not in categories_map
        if names[j] in categories_map:
            image_id = i + 1
            x1,y1,x2,y2 = int(xmins[j]),int(ymins[j]),int(xmaxs[j]),int(ymaxs[j])
            # The XML coordinates are treated as 1-based; shift them to 0-based
            x1,y1,x2,y2 = x1 - 1,y1 - 1,x2 - 1,y2 - 1

            # Keep the far corner inside the image
            x2 = min(x2, width - 1)
            y2 = min(y2, height - 1)

            # COCO boxes are [x, y, w, h]; the corners are inclusive pixels, hence the + 1
            x,y = x1,y1
            w,h = x2 - x1 + 1,y2 - y1 + 1
            category_id = categories_map[names[j]]
            area = w * h
            annotation_info = {"id": ann_id, "image_id": image_id, "bbox": [x, y, w, h],
                               "category_id": category_id, "area": area, "iscrowd": 0}
            annotations_info['annotations'].append(annotation_info)
            ann_id += 1

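# Write the result next to the data so that data_dir is the only path to change
with open(os.path.join(data_dir, 'annotations.json'), 'w') as f:
    json.dump(annotations_info, f, indent=4)

print('--- Annotation file summary ---')
print('Number of images:', len(annotations_info['images']))
print('Number of annotations:', len(annotations_info['annotations']))
print('Number of categories:', len(annotations_info['categories']))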

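This is the code that generates the JSON file; you only need to change the file locations and the classes. It converts the information stored in the XML files into a single JSON file, and it is easy to follow if you read through the code carefully.

Summary

After this step you should understand the data preparation that goes into training an object detector. Once you have it down, you can convert whatever format a competition hands you. Combined with the data-loading material covered earlier on my blog, that is enough to get working with object detection.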
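After generating the file, a quick sanity check can catch conversion mistakes before training starts. The sketch below is one possible set of checks, not an official validator: `check_coco` and the two small datasets are made up for illustration.

```python
def check_coco(dataset):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    images = {img['id']: img for img in dataset['images']}
    if len(images) != len(dataset['images']):
        problems.append('duplicate image id')
    cat_ids = {cat['id'] for cat in dataset['categories']}
    seen_ann_ids = set()
    for ann in dataset['annotations']:
        if ann['id'] in seen_ann_ids:
            problems.append('duplicate annotation id %d' % ann['id'])
        seen_ann_ids.add(ann['id'])
        if ann['category_id'] not in cat_ids:
            problems.append('annotation %d: unknown category_id' % ann['id'])
        img = images.get(ann['image_id'])
        if img is None:
            problems.append('annotation %d: unknown image_id' % ann['id'])
            continue
        x, y, w, h = ann['bbox']
        if w <= 0 or h <= 0:
            problems.append('annotation %d: empty box' % ann['id'])
        elif x < 0 or y < 0 or x + w > img['width'] or y + h > img['height']:
            problems.append('annotation %d: box outside image' % ann['id'])
    return problems


good = {
    'images': [{'file_name': 'a.jpg', 'id': 1, 'height': 80, 'width': 100}],
    'annotations': [{'id': 1, 'image_id': 1, 'bbox': [10, 10, 20, 30],
                     'category_id': 1, 'area': 600, 'iscrowd': 0}],
    'categories': [{'id': 1, 'name': 'starfish'}],
}
bad = {
    'images': [{'file_name': 'a.jpg', 'id': 1, 'height': 80, 'width': 100}],
    'annotations': [
        {'id': 1, 'image_id': 1, 'bbox': [90, 10, 20, 30],
         'category_id': 1, 'area': 600, 'iscrowd': 0},
        {'id': 2, 'image_id': 7, 'bbox': [0, 0, 5, 5],
         'category_id': 9, 'area': 25, 'iscrowd': 0},
    ],
    'categories': [{'id': 1, 'name': 'starfish'}],
}

print(check_coco(good))
for problem in check_coco(bad):
    print(problem)
```

To run it on the generated file, load the file with json.load and pass the resulting dictionary to the function.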
