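Source: a translation of the official Caffe documentation, "R-CNN detection".
R-CNN is a state-of-the-art detector that classifies region proposals with a fine-tuned Caffe model. For the full details of the R-CNN system and model, refer to its project site and the paper:
Rich feature hierarchies for accurate object detection and semantic segmentation. Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik. CVPR 2014. Arxiv 2013.
Note: the paper is available on arXiv.
In this example, we do detection with a pure Caffe edition of the R-CNN model for ImageNet. The R-CNN detector outputs class scores for the 200 detection classes of ILSVRC13. Keep in mind that these are raw one-vs-all SVM scores, so they are neither probabilistically calibrated nor exactly comparable across classes. Note that this off-the-shelf model is provided for convenience only and is not the full R-CNN model.
Let's run detection on an image of a person riding a fish bike in the desert (from the ImageNet challenge, no joke). First, we need region proposals and the Caffe R-CNN ImageNet model:
Selective Search is the region proposer used by R-CNN. The selective_search_ijcv_with_python Python module takes care of extracting proposals through the selective search MATLAB implementation. To install it, download the module and name its directory selective_search_ijcv_with_python, run the demo in MATLAB to compile the necessary functions, then add the directory to your Python path so it can be imported. (If you have your own region proposals prepared, or would rather not bother with this step, detect.py also accepts a list of images and bounding boxes in CSV format.)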
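For the CSV route mentioned above, a window list might be prepared roughly as in the sketch below. The column names (filename, ymin, xmin, ymax, xmax) are an assumption based on the fields of the output DataFrame shown later in this walkthrough; the image path, box coordinates, and CSV file name are made up for illustration. Check ./detect.py --help for the exact format and the matching crop mode in your version.

```python
import csv
import os
import tempfile

# Hypothetical windows: (filename, ymin, xmin, ymax, xmax).
# The column names are assumed, not taken from detect.py's documentation.
windows = [
    ("images/fish-bike.jpg", 0, 0, 100, 150),
    ("images/fish-bike.jpg", 50, 40, 240, 330),
]

# Write one header row followed by one row per window.
csv_path = os.path.join(tempfile.mkdtemp(), "det_windows.csv")
with open(csv_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["filename", "ymin", "xmin", "ymax", "xmax"])
    writer.writerows(windows)

# Read the file back to check what was written.
with open(csv_path) as f:
    rows = list(csv.reader(f))
print(rows[0])
print(len(rows) - 1, "windows written")
```

A file like this would stand in for the plain image list when you already have boxes and want to skip selective search.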
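Run the following command to get the Caffe R-CNN ImageNet model:
./scripts/download_model_binary.py models/bvlc_reference_rcnn_ilsvrc13
With that done, we'll call the bundled detect.py to generate the region proposals and run the network. For an explanation of the arguments, run ./detect.py --help.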
In[1]:
!mkdir -p _temp
!echo `pwd`/images/fish-bike.jpg > _temp/det_input.txt
!../python/detect.py --crop_mode=selective_search --pretrained_model=../models/bvlc_reference_rcnn_ilsvrc13/bvlc_reference_rcnn_ilsvrc13.caffemodel --model_def=../models/bvlc_reference_rcnn_ilsvrc13/deploy.prototxt --gpu --raw_scale=255 _temp/det_input.txt _temp/det_output.h5
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0218 20:43:25.383932 2099749632 net.cpp:42] Initializing net from parameters:
name: "R-CNN-ilsvrc13"
input: "data"
input_dim: 10
input_dim: 3
input_dim: 227
input_dim: 227
state {
phase: TEST
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "norm1"
type: "LRN"
bottom: "pool1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "norm1"
top: "conv2"
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "norm2"
type: "LRN"
bottom: "pool2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "pool5"
top: "fc6"
inner_product_param {
num_output: 4096
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc6"
top: "fc6"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc6"
top: "fc6"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6"
top: "fc7"
inner_product_param {
num_output: 4096
}
}
layer {
name: "relu7"
type: "ReLU"
bottom: "fc7"
top: "fc7"
}
layer {
name: "drop7"
type: "Dropout"
bottom: "fc7"
top: "fc7"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc-rcnn"
type: "InnerProduct"
bottom: "fc7"
top: "fc-rcnn"
inner_product_param {
num_output: 200
}
}
I0218 20:43:25.385720 2099749632 net.cpp:336] Input 0 -> data
I0218 20:43:25.385769 2099749632 layer_factory.hpp:74] Creating layer conv1
I0218 20:43:25.385783 2099749632 net.cpp:76] Creating Layer conv1
I0218 20:43:25.385790 2099749632 net.cpp:372] conv1 <- data
I0218 20:43:25.385802 2099749632 net.cpp:334] conv1 -> conv1
I0218 20:43:25.385815 2099749632 net.cpp:105] Setting up conv1
I0218 20:43:25.386574 2099749632 net.cpp:112] Top shape: 10 96 55 55 (2904000)
I0218 20:43:25.386610 2099749632 layer_factory.hpp:74] Creating layer relu1
I0218 20:43:25.386625 2099749632 net.cpp:76] Creating Layer relu1
I0218 20:43:25.386631 2099749632 net.cpp:372] relu1 <- conv1
I0218 20:43:25.386641 2099749632 net.cpp:323] relu1 -> conv1 (in-place)
I0218 20:43:25.386649 2099749632 net.cpp:105] Setting up relu1
I0218 20:43:25.386656 2099749632 net.cpp:112] Top shape: 10 96 55 55 (2904000)
I0218 20:43:25.386663 2099749632 layer_factory.hpp:74] Creating layer pool1
I0218 20:43:25.386675 2099749632 net.cpp:76] Creating Layer pool1
I0218 20:43:25.386682 2099749632 net.cpp:372] pool1 <- conv1
I0218 20:43:25.386690 2099749632 net.cpp:334] pool1 -> pool1
I0218 20:43:25.386699 2099749632 net.cpp:105] Setting up pool1
I0218 20:43:25.386716 2099749632 net.cpp:112] Top shape: 10 96 27 27 (699840)
I0218 20:43:25.386725 2099749632 layer_factory.hpp:74] Creating layer norm1
I0218 20:43:25.386736 2099749632 net.cpp:76] Creating Layer norm1
I0218 20:43:25.386744 2099749632 net.cpp:372] norm1 <- pool1
I0218 20:43:25.386803 2099749632 net.cpp:334] norm1 -> norm1
I0218 20:43:25.386819 2099749632 net.cpp:105] Setting up norm1
I0218 20:43:25.386832 2099749632 net.cpp:112] Top shape: 10 96 27 27 (699840)
I0218 20:43:25.386842 2099749632 layer_factory.hpp:74] Creating layer conv2
I0218 20:43:25.386852 2099749632 net.cpp:76] Creating Layer conv2
I0218 20:43:25.386865 2099749632 net.cpp:372] conv2 <- norm1
I0218 20:43:25.386878 2099749632 net.cpp:334] conv2 -> conv2
I0218 20:43:25.386899 2099749632 net.cpp:105] Setting up conv2
I0218 20:43:25.387024 2099749632 net.cpp:112] Top shape: 10 256 27 27 (1866240)
I0218 20:43:25.387042 2099749632 layer_factory.hpp:74] Creating layer relu2
I0218 20:43:25.387050 2099749632 net.cpp:76] Creating Layer relu2
I0218 20:43:25.387058 2099749632 net.cpp:372] relu2 <- conv2
I0218 20:43:25.387066 2099749632 net.cpp:323] relu2 -> conv2 (in-place)
I0218 20:43:25.387075 2099749632 net.cpp:105] Setting up relu2
I0218 20:43:25.387081 2099749632 net.cpp:112] Top shape: 10 256 27 27 (1866240)
I0218 20:43:25.387089 2099749632 layer_factory.hpp:74] Creating layer pool2
I0218 20:43:25.387097 2099749632 net.cpp:76] Creating Layer pool2
I0218 20:43:25.387104 2099749632 net.cpp:372] pool2 <- conv2
I0218 20:43:25.387112 2099749632 net.cpp:334] pool2 -> pool2
I0218 20:43:25.387121 2099749632 net.cpp:105] Setting up pool2
I0218 20:43:25.387130 2099749632 net.cpp:112] Top shape: 10 256 13 13 (432640)
I0218 20:43:25.387137 2099749632 layer_factory.hpp:74] Creating layer norm2
I0218 20:43:25.387145 2099749632 net.cpp:76] Creating Layer norm2
I0218 20:43:25.387152 2099749632 net.cpp:372] norm2 <- pool2
I0218 20:43:25.387161 2099749632 net.cpp:334] norm2 -> norm2
I0218 20:43:25.387168 2099749632 net.cpp:105] Setting up norm2
I0218 20:43:25.387176 2099749632 net.cpp:112] Top shape: 10 256 13 13 (432640)
I0218 20:43:25.387228 2099749632 layer_factory.hpp:74] Creating layer conv3
I0218 20:43:25.387249 2099749632 net.cpp:76] Creating Layer conv3
I0218 20:43:25.387258 2099749632 net.cpp:372] conv3 <- norm2
I0218 20:43:25.387266 2099749632 net.cpp:334] conv3 -> conv3
I0218 20:43:25.387276 2099749632 net.cpp:105] Setting up conv3
I0218 20:43:25.389375 2099749632 net.cpp:112] Top shape: 10 384 13 13 (648960)
I0218 20:43:25.389408 2099749632 layer_factory.hpp:74] Creating layer relu3
I0218 20:43:25.389421 2099749632 net.cpp:76] Creating Layer relu3
I0218 20:43:25.389430 2099749632 net.cpp:372] relu3 <- conv3
I0218 20:43:25.389438 2099749632 net.cpp:323] relu3 -> conv3 (in-place)
I0218 20:43:25.389447 2099749632 net.cpp:105] Setting up relu3
I0218 20:43:25.389456 2099749632 net.cpp:112] Top shape: 10 384 13 13 (648960)
I0218 20:43:25.389462 2099749632 layer_factory.hpp:74] Creating layer conv4
I0218 20:43:25.389472 2099749632 net.cpp:76] Creating Layer conv4
I0218 20:43:25.389478 2099749632 net.cpp:372] conv4 <- conv3
I0218 20:43:25.389487 2099749632 net.cpp:334] conv4 -> conv4
I0218 20:43:25.389497 2099749632 net.cpp:105] Setting up conv4
I0218 20:43:25.391810 2099749632 net.cpp:112] Top shape: 10 384 13 13 (648960)
I0218 20:43:25.391856 2099749632 layer_factory.hpp:74] Creating layer relu4
I0218 20:43:25.391871 2099749632 net.cpp:76] Creating Layer relu4
I0218 20:43:25.391880 2099749632 net.cpp:372] relu4 <- conv4
I0218 20:43:25.391888 2099749632 net.cpp:323] relu4 -> conv4 (in-place)
I0218 20:43:25.391898 2099749632 net.cpp:105] Setting up relu4
I0218 20:43:25.391906 2099749632 net.cpp:112] Top shape: 10 384 13 13 (648960)
I0218 20:43:25.391913 2099749632 layer_factory.hpp:74] Creating layer conv5
I0218 20:43:25.391923 2099749632 net.cpp:76] Creating Layer conv5
I0218 20:43:25.391929 2099749632 net.cpp:372] conv5 <- conv4
I0218 20:43:25.391937 2099749632 net.cpp:334] conv5 -> conv5
I0218 20:43:25.391947 2099749632 net.cpp:105] Setting up conv5
I0218 20:43:25.393072 2099749632 net.cpp:112] Top shape: 10 256 13 13 (432640)
I0218 20:43:25.393108 2099749632 layer_factory.hpp:74] Creating layer relu5
I0218 20:43:25.393122 2099749632 net.cpp:76] Creating Layer relu5
I0218 20:43:25.393129 2099749632 net.cpp:372] relu5 <- conv5
I0218 20:43:25.393138 2099749632 net.cpp:323] relu5 -> conv5 (in-place)
I0218 20:43:25.393148 2099749632 net.cpp:105] Setting up relu5
I0218 20:43:25.393157 2099749632 net.cpp:112] Top shape: 10 256 13 13 (432640)
I0218 20:43:25.393167 2099749632 layer_factory.hpp:74] Creating layer pool5
I0218 20:43:25.393175 2099749632 net.cpp:76] Creating Layer pool5
I0218 20:43:25.393182 2099749632 net.cpp:372] pool5 <- conv5
I0218 20:43:25.393190 2099749632 net.cpp:334] pool5 -> pool5
I0218 20:43:25.393199 2099749632 net.cpp:105] Setting up pool5
I0218 20:43:25.393209 2099749632 net.cpp:112] Top shape: 10 256 6 6 (92160)
I0218 20:43:25.393218 2099749632 layer_factory.hpp:74] Creating layer fc6
I0218 20:43:25.393226 2099749632 net.cpp:76] Creating Layer fc6
I0218 20:43:25.393232 2099749632 net.cpp:372] fc6 <- pool5
I0218 20:43:25.393240 2099749632 net.cpp:334] fc6 -> fc6
I0218 20:43:25.393249 2099749632 net.cpp:105] Setting up fc6
I0218 20:43:25.516396 2099749632 net.cpp:112] Top shape: 10 4096 1 1 (40960)
I0218 20:43:25.516445 2099749632 layer_factory.hpp:74] Creating layer relu6
I0218 20:43:25.516463 2099749632 net.cpp:76] Creating Layer relu6
I0218 20:43:25.516470 2099749632 net.cpp:372] relu6 <- fc6
I0218 20:43:25.516480 2099749632 net.cpp:323] relu6 -> fc6 (in-place)
I0218 20:43:25.516490 2099749632 net.cpp:105] Setting up relu6
I0218 20:43:25.516497 2099749632 net.cpp:112] Top shape: 10 4096 1 1 (40960)
I0218 20:43:25.516505 2099749632 layer_factory.hpp:74] Creating layer drop6
I0218 20:43:25.516515 2099749632 net.cpp:76] Creating Layer drop6
I0218 20:43:25.516521 2099749632 net.cpp:372] drop6 <- fc6
I0218 20:43:25.516530 2099749632 net.cpp:323] drop6 -> fc6 (in-place)
I0218 20:43:25.516538 2099749632 net.cpp:105] Setting up drop6
I0218 20:43:25.516557 2099749632 net.cpp:112] Top shape: 10 4096 1 1 (40960)
I0218 20:43:25.516566 2099749632 layer_factory.hpp:74] Creating layer fc7
I0218 20:43:25.516576 2099749632 net.cpp:76] Creating Layer fc7
I0218 20:43:25.516582 2099749632 net.cpp:372] fc7 <- fc6
I0218 20:43:25.516589 2099749632 net.cpp:334] fc7 -> fc7
I0218 20:43:25.516599 2099749632 net.cpp:105] Setting up fc7
I0218 20:43:25.604786 2099749632 net.cpp:112] Top shape: 10 4096 1 1 (40960)
I0218 20:43:25.604838 2099749632 layer_factory.hpp:74] Creating layer relu7
I0218 20:43:25.604852 2099749632 net.cpp:76] Creating Layer relu7
I0218 20:43:25.604859 2099749632 net.cpp:372] relu7 <- fc7
I0218 20:43:25.604868 2099749632 net.cpp:323] relu7 -> fc7 (in-place)
I0218 20:43:25.604878 2099749632 net.cpp:105] Setting up relu7
I0218 20:43:25.604885 2099749632 net.cpp:112] Top shape: 10 4096 1 1 (40960)
I0218 20:43:25.604893 2099749632 layer_factory.hpp:74] Creating layer drop7
I0218 20:43:25.604902 2099749632 net.cpp:76] Creating Layer drop7
I0218 20:43:25.604908 2099749632 net.cpp:372] drop7 <- fc7
I0218 20:43:25.604917 2099749632 net.cpp:323] drop7 -> fc7 (in-place)
I0218 20:43:25.604924 2099749632 net.cpp:105] Setting up drop7
I0218 20:43:25.604933 2099749632 net.cpp:112] Top shape: 10 4096 1 1 (40960)
I0218 20:43:25.604939 2099749632 layer_factory.hpp:74] Creating layer fc-rcnn
I0218 20:43:25.604948 2099749632 net.cpp:76] Creating Layer fc-rcnn
I0218 20:43:25.604954 2099749632 net.cpp:372] fc-rcnn <- fc7
I0218 20:43:25.604962 2099749632 net.cpp:334] fc-rcnn -> fc-rcnn
I0218 20:43:25.604971 2099749632 net.cpp:105] Setting up fc-rcnn
I0218 20:43:25.606878 2099749632 net.cpp:112] Top shape: 10 200 1 1 (2000)
I0218 20:43:25.606904 2099749632 net.cpp:165] fc-rcnn does not need backward computation.
I0218 20:43:25.606909 2099749632 net.cpp:165] drop7 does not need backward computation.
I0218 20:43:25.606916 2099749632 net.cpp:165] relu7 does not need backward computation.
I0218 20:43:25.606922 2099749632 net.cpp:165] fc7 does not need backward computation.
I0218 20:43:25.606928 2099749632 net.cpp:165] drop6 does not need backward computation.
I0218 20:43:25.606935 2099749632 net.cpp:165] relu6 does not need backward computation.
I0218 20:43:25.606940 2099749632 net.cpp:165] fc6 does not need backward computation.
I0218 20:43:25.606946 2099749632 net.cpp:165] pool5 does not need backward computation.
I0218 20:43:25.606952 2099749632 net.cpp:165] relu5 does not need backward computation.
I0218 20:43:25.606958 2099749632 net.cpp:165] conv5 does not need backward computation.
I0218 20:43:25.606964 2099749632 net.cpp:165] relu4 does not need backward computation.
I0218 20:43:25.606971 2099749632 net.cpp:165] conv4 does not need backward computation.
I0218 20:43:25.606976 2099749632 net.cpp:165] relu3 does not need backward computation.
I0218 20:43:25.606982 2099749632 net.cpp:165] conv3 does not need backward computation.
I0218 20:43:25.606988 2099749632 net.cpp:165] norm2 does not need backward computation.
I0218 20:43:25.606995 2099749632 net.cpp:165] pool2 does not need backward computation.
I0218 20:43:25.607002 2099749632 net.cpp:165] relu2 does not need backward computation.
I0218 20:43:25.607007 2099749632 net.cpp:165] conv2 does not need backward computation.
I0218 20:43:25.607013 2099749632 net.cpp:165] norm1 does not need backward computation.
I0218 20:43:25.607199 2099749632 net.cpp:165] pool1 does not need backward computation.
I0218 20:43:25.607213 2099749632 net.cpp:165] relu1 does not need backward computation.
I0218 20:43:25.607219 2099749632 net.cpp:165] conv1 does not need backward computation.
I0218 20:43:25.607225 2099749632 net.cpp:201] This network produces output fc-rcnn
I0218 20:43:25.607239 2099749632 net.cpp:446] Collecting Learning Rate and Weight Decay.
I0218 20:43:25.607255 2099749632 net.cpp:213] Network initialization done.
I0218 20:43:25.607262 2099749632 net.cpp:214] Memory required for data: 62425920
E0218 20:43:26.388214 2099749632 upgrade_proto.cpp:618] Attempting to upgrade input file specified using deprecated V1LayerParameter: ../models/bvlc_reference_rcnn_ilsvrc13/bvlc_reference_rcnn_ilsvrc13.caffemodel
I0218 20:43:27.089423 2099749632 upgrade_proto.cpp:626] Successfully upgraded file specified using deprecated V1LayerParameter
GPU mode
Loading input...
selective_search_rcnn({'/Users/shelhamer/h/desk/caffe/caffe-dev/examples/images/fish-bike.jpg'}, '/var/folders/bk/dtkn5qjd11bd17b2j36zplyw0000gp/T/tmpakaRLL.mat')
Processed 1570 windows in 102.895 s.
/Users/shelhamer/anaconda/lib/python2.7/site-packages/pandas/io/pytables.py:2453: PerformanceWarning:
your performance may suffer as PyTables will pickle object types that it cannot
map directly to c-types [inferred_type->mixed,key->block1_values] [items->['prediction']]
warnings.warn(ws, PerformanceWarning)
Saved to _temp/det_output.h5 in 0.298 s.
This run was in GPU mode. For CPU mode detection, call detect.py without the --gpu argument.
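If you prefer to launch the detector from Python, the argument list can be assembled as in the sketch below, which makes the GPU switch explicit. The flags and paths are copied from the command in cell In[1]; build_detect_args is a hypothetical helper written for this illustration and is not part of Caffe.

```python
def build_detect_args(input_file, output_file, use_gpu=True):
    """Assemble the detect.py argument list used in this walkthrough.

    Hypothetical helper, not part of Caffe.
    """
    model_dir = "../models/bvlc_reference_rcnn_ilsvrc13"
    args = [
        "../python/detect.py",
        "--crop_mode=selective_search",
        "--pretrained_model=" + model_dir + "/bvlc_reference_rcnn_ilsvrc13.caffemodel",
        "--model_def=" + model_dir + "/deploy.prototxt",
        "--raw_scale=255",
    ]
    # Leaving out --gpu selects CPU mode.
    if use_gpu:
        args.append("--gpu")
    args += [input_file, output_file]
    return args


cpu_args = build_detect_args("_temp/det_input.txt", "_temp/det_output.h5", use_gpu=False)
gpu_args = build_detect_args("_temp/det_input.txt", "_temp/det_output.h5", use_gpu=True)
print(" ".join(cpu_args))
```

The resulting list could then be handed to subprocess.check_call to run the detector.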
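Running this outputs a DataFrame with the filenames, selected windows, and their detection scores to an HDF5 file. (We only ran on one image, so the filenames will all be the same.)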
In[2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
df = pd.read_hdf('_temp/det_output.h5', 'df')
print(df.shape)
print(df.iloc[0])
(1570, 5)
prediction [-2.62247, -2.84579, -2.85122, -3.20838, -1.94...
ymin 79.846
xmin 9.62
ymax 246.31
xmax 339.624
Name: /Users/shelhamer/h/desk/caffe/caffe-dev/examples/images/fish-bike.jpg, dtype: object
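1570 regions were proposed with the R-CNN configuration of selective search. The number of proposals will vary from image to image based on its contents and size; selective search isn't scale invariant.
In general, detect.py is most efficient when running on a lot of images: it first extracts window proposals for all of them, batches the windows for efficient GPU processing, and then outputs the results. Simply list one image per line in the input file, and it will process all of them.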
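When several images are processed at once, the output holds the windows of every image in a single DataFrame indexed by filename. The sketch below uses a small synthetic DataFrame with that layout (the file names and coordinates are made up) to count the windows per image and to pull out the windows of one image.

```python
import pandas as pd

# Synthetic stand-in for the detect.py output: one row per window,
# indexed by image filename (file names and numbers are made up).
toy = pd.DataFrame(
    {
        "ymin": [0.0, 10.0, 5.0],
        "xmin": [0.0, 20.0, 5.0],
        "ymax": [50.0, 80.0, 60.0],
        "xmax": [60.0, 90.0, 70.0],
    },
    index=["a.jpg", "a.jpg", "b.jpg"],
)

# Count how many windows were proposed for each image.
windows_per_image = toy.groupby(level=0).size()
print(windows_per_image)

# Select all windows that belong to a single image.
a_windows = toy.loc[["a.jpg"]]
print(a_windows.shape)
```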
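Although this guide gives an example of R-CNN ImageNet detection, detect.py is clever enough to adapt to different Caffe models' input dimensions, batch size, and output categories. You can switch the model definition and pretrained model as desired. Refer to python detect.py --help for the parameters that describe your data set. There's no need for hardcoding.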
Anyway, let's now load the ILSVRC13 detection class names and make a DataFrame of the predictions. Note you'll need the auxiliary ilsvrc2012 data fetched by data/ilsvrc12/get_ilsvrc12_aux.sh.
In[3]:
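with open('../data/ilsvrc12/det_synset_words.txt') as f:
    labels_df = pd.DataFrame([
        {
            'synset_id': l.strip().split(' ')[0],
            'name': ' '.join(l.strip().split(' ')[1:]).split(',')[0]
        }
        for l in f.readlines()
    ])
# sort_values returns a sorted copy; labels_df itself stays in file order,
# which is the order used for the column names below.
labels_df.sort_values('synset_id')
# Stack the per-window score arrays into a (windows x classes) table.
predictions_df = pd.DataFrame(np.vstack(df.prediction.values), columns=labels_df['name'])
print(predictions_df.iloc[0])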
name
accordion -2.622471
airplane -2.845788
ant -2.851219
antelope -3.208377
apple -1.949950
armadillo -2.472935
artichoke -2.201684
axe -2.327404
baby bed -2.737925
backpack -2.176763
bagel -2.681061
balance beam -2.722538
banana -2.390628
band aid -1.598909
banjo -2.298197
...
trombone -2.582361
trumpet -2.352853
turtle -2.360859
tv or monitor -2.761043
unicycle -2.218467
vacuum -1.907717
violin -2.757079
volleyball -2.723689
waffle iron -2.418540
washer -2.408994
water bottle -2.174899
watercraft -2.837425
whale -3.120338
wine bottle -2.772960
zebra -2.742913
Name: 0, Length: 200, dtype: float32
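The np.vstack step in the cell above turns a column whose cells each hold one array of class scores into a regular windows-by-classes matrix. Here is a minimal sketch of that step on made-up data; the scores and the two class names are illustrative only.

```python
import numpy as np
import pandas as pd

# Made-up scores: 3 windows, 2 classes. Each cell of 'prediction'
# holds one array of per-class scores, as in the detect.py output.
toy = pd.DataFrame({
    "prediction": [
        np.array([-2.6, 0.1], dtype=np.float32),
        np.array([-1.9, -0.7], dtype=np.float32),
        np.array([0.9, -3.2], dtype=np.float32),
    ]
})
class_names = ["person", "bicycle"]  # illustrative subset of class names

# Stack the per-window arrays into a (windows x classes) matrix.
scores = np.vstack(toy["prediction"].values)
toy_predictions = pd.DataFrame(scores, columns=class_names)
print(toy_predictions.shape)
```

Once the scores are in this shape, per-class operations such as a column-wise maximum become one-liners.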
Let's look at the activations.
In[4]:
plt.gray()
plt.matshow(predictions_df.values)
plt.xlabel('Classes')
plt.ylabel('Windows')
out[4]:
<matplotlib.text.Text at 0x114f15f90>
<matplotlib.figure.Figure at 0x114254b50>
Now let's take the max across all windows and print the top classes.
In[5]:
max_s = predictions_df.max(0)
max_s = max_s.sort_values(ascending=False)
print(max_s[:10])
name
person 1.835771
bicycle 0.866110
unicycle 0.057080
motorcycle -0.006122
banjo -0.028209
turtle -0.189831
electric fan -0.206788
cart -0.214235
lizard -0.393519
helmet -0.477942
dtype: float32
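The top detections are in fact a person and a bicycle. Picking good localizations is a work in progress; here we simply pick the top-scoring person and bicycle detections.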
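The two steps (best score per class, then the window that produced it) can be combined into one small table. The sketch below does so on made-up scores for four windows and three classes; none of the numbers come from the actual run.

```python
import pandas as pd

# Made-up scores for 4 windows (rows) and 3 classes (columns).
toy_predictions = pd.DataFrame(
    {
        "person": [-2.1, 1.8, -0.5, 0.2],
        "bicycle": [0.9, -1.2, -0.4, -2.0],
        "unicycle": [0.1, -0.4, -1.5, -1.0],
    }
)

# Best score per class over all windows.
max_scores = toy_predictions.max(axis=0)

# Row label of the best window for each class
# (with the default index this equals the row position).
best_window = toy_predictions.idxmax(axis=0)

# One row per class, highest-scoring class first.
summary = pd.DataFrame({"score": max_scores, "window": best_window})
summary = summary.sort_values("score", ascending=False)
print(summary)
```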
In[6]:
# Find, print, and display the top detections: person and bicycle.
i = predictions_df['person'].argmax()
j = predictions_df['bicycle'].argmax()
# Show top predictions for top detection.
f = pd.Series(df['prediction'].iloc[i], index=labels_df['name'])
print('Top detection:')
print(f.sort_values(ascending=False)[:5])
print('')
# Show top predictions for second-best detection.
f = pd.Series(df['prediction'].iloc[j], index=labels_df['name'])
print('Second-best detection:')
print(f.sort_values(ascending=False)[:5])
# Show top detection in red, second-best top detection in blue.
im = plt.imread('images/fish-bike.jpg')
plt.imshow(im)
currentAxis = plt.gca()
det = df.iloc[i]
coords = (det['xmin'], det['ymin']), det['xmax'] - det['xmin'], det['ymax'] - det['ymin']
currentAxis.add_patch(plt.Rectangle(*coords, fill=False, edgecolor='r', linewidth=5))
det = df.iloc[j]
coords = (det['xmin'], det['ymin']), det['xmax'] - det['xmin'], det['ymax'] - det['ymin']
currentAxis.add_patch(plt.Rectangle(*coords, fill=False, edgecolor='b', linewidth=5))
Top detection:
name
person 1.835771
swimming trunks -1.150371
rubber eraser -1.231106
turtle -1.266037
plastic bag -1.303265
dtype: float32
Second-best detection:
name
bicycle 0.866110
unicycle -0.359139
scorpion -0.811621
lobster -0.982891
lamp -1.096808
dtype: float32
out[6]:
<matplotlib.patches.Rectangle at 0x118576a90>
That's cool. Let's take all 'bicycle' detections and NMS them to get rid of overlapping windows.
In[7]:
def nms_detections(dets, overlap=0.3):
    """
    Non-maximum suppression: Greedily select high-scoring detections and
    skip detections that are significantly covered by a previously
    selected detection.

    This version is translated from Matlab code by Tomasz Malisiewicz,
    who sped up Pedro Felzenszwalb's code.

    Parameters
    ----------
    dets: ndarray
        each row is ['xmin', 'ymin', 'xmax', 'ymax', 'score']
    overlap: float
        overlap ratio above which a detection is suppressed (0.3 default)

    Output
    ------
    dets: ndarray
        remaining after suppression.
    """
    x1 = dets[:, 0]
    y1 = dets[:, 1]
    x2 = dets[:, 2]
    y2 = dets[:, 3]
    # Indices sorted by ascending score; the best detection is last.
    ind = np.argsort(dets[:, 4])

    w = x2 - x1
    h = y2 - y1
    area = (w * h).astype(float)

    pick = []
    while len(ind) > 0:
        # Take the highest-scoring remaining detection.
        i = ind[-1]
        pick.append(i)
        ind = ind[:-1]

        # Intersection of the picked box with every remaining box.
        xx1 = np.maximum(x1[i], x1[ind])
        yy1 = np.maximum(y1[i], y1[ind])
        xx2 = np.minimum(x2[i], x2[ind])
        yy2 = np.minimum(y2[i], y2[ind])

        w = np.maximum(0., xx2 - xx1)
        h = np.maximum(0., yy2 - yy1)

        # Intersection over union; keep only boxes at or below the threshold.
        wh = w * h
        o = wh / (area[i] + area[ind] - wh)

        ind = ind[np.nonzero(o <= overlap)[0]]

    return dets[pick, :]
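The overlap ratio o computed inside nms_detections is the intersection over union (IoU) of two boxes. As a small worked example, the sketch below computes it for hand-picked boxes; the iou helper and the box coordinates are illustrative and not part of the original notebook.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    # Width and height of the intersection, clipped at zero when disjoint.
    ix = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = ix * iy
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


top = (0, 0, 10, 10)      # highest-scoring box (made-up)
shifted = (5, 0, 15, 10)  # covers half of the top box
far = (20, 20, 30, 30)    # does not touch the top box

print(iou(top, shifted))  # intersection 50, union 150
print(iou(top, far))
```

With the default threshold of 0.3, the shifted box (IoU of one third) would be suppressed by the top box, while the far box would be kept.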
In[8]:
scores = predictions_df['bicycle']
windows = df[['xmin', 'ymin', 'xmax', 'ymax']].values
dets = np.hstack((windows, scores.values[:, np.newaxis]))
nms_dets = nms_detections(dets)
Show the top 3 NMS'd detections for 'bicycle' in the image and note the gap between the top-scoring box (red) and the remaining boxes.
In[9]:
plt.imshow(im)
currentAxis = plt.gca()
colors = ['r', 'b', 'y']
# Draw the three best remaining boxes in red, blue, and yellow.
for c, det in zip(colors, nms_dets[:3]):
    currentAxis.add_patch(
        plt.Rectangle((det[0], det[1]), det[2]-det[0], det[3]-det[1],
                      fill=False, edgecolor=c, linewidth=5)
    )
print('scores:', nms_dets[:3, 4])
scores: [ 0.86610985 -0.70051557 -1.34796357]
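This was an easy instance for bicycle, as the image was in that class's training set. However, the person result is a true detection, since the image was not in the training set for that class. You should try out detection on an image of your own next! (Remove the temp directory to clean up, and we're done.)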
P.S. Yet another boring draft...