
Nvidia-Docker: Building a GPU-Enabled Container with Python3 + Tensorflow-GPU + Opencv + Dlib

An earlier document, "Docker Images: Centos7 + Python3.6 + Tensorflow + Opencv + Dlib", built a Docker image with a CPU-based environment for common image-processing development. With the rapid development of GPUs, which provide high internal memory bandwidth and compute throughput and can effectively take over part of the CPU's work, GPU-based development environments are also commonly needed. This document describes how to build a container that can use GPUs, covering CentOS 7 + CUDA 9.0 + cuDNN 7.0.5 + Python 3.6 + Tensorflow-GPU 1.5.0 + Opencv-Python + Dlib, and records the problems encountered along the way.

Nvidia-Docker: building a GPU-enabled container with cuda9.0 + cudnn7.0.5 + Tensorflow-GPU + Opencv-Python + Dlib

    • Choosing the base image
    • Base image and author information
    • Setting the time zone and installing Chinese language support
    • Installing cuDNN 7.0.5
    • Installing Python 3.6
    • Installing tensorflow-gpu
    • Installing opencv-python
    • Installing dlib
    • Installing other Python dependencies
    • Other settings
    • The complete Dockerfile
    • Building and testing the image

Choosing the base image

Before using Nvidia-Docker to build a GPU-enabled container, first work out the version compatibility of the environment to be built; here this mainly means how the Tensorflow-GPU version maps to the CUDA and cuDNN versions.

For example, Tensorflow-GPU 1.5.0 is used here, which pairs with CUDA 9.0 and cuDNN 7.0.x. If the versions do not match, errors like the following occur:

  • Tensorflow-GPU 1.5.0 requires CUDA 9.0; otherwise the error is ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory ;
  • it also requires cuDNN 7.0.x; otherwise running Tensorflow-GPU code reports a cuDNN version mismatch (cuDNN 7.3.1 was installed at first and later downgraded to cuDNN 7.0.5, which passed the tests).

The mappings for other versions are listed under 經過測試的建構配置 on the TensorFlow Chinese site, or Tested build configurations on the TensorFlow English site.
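If you want to double-check the pairing at runtime once the image is built, a minimal sanity check is the sketch below (standard TF 1.x test APIs; it only confirms that the wheel was built with CUDA and that a GPU is visible):

# -*- coding: utf-8 -*-
# Quick sanity check of the Tensorflow-GPU / CUDA pairing (TF 1.x APIs)
import tensorflow as tf

print(tf.__version__)                  # e.g. 1.5.0
print(tf.test.is_built_with_cuda())    # True for the tensorflow-gpu wheel
print(tf.test.gpu_device_name())       # e.g. /device:GPU:0; empty string if no GPU is visible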


Therefore, nvidia/cuda:9.0-devel-centos7 is used as the base image here. It does not ship with cuDNN, so the matching cuDNN version has to be installed manually. nvidia/cuda also provides base images that bundle cuDNN, such as nvidia/cuda:9.0-cudnn7-devel-centos7, which at the time carried cuDNN 7.3.1.

Base image and author information

nvidia/cuda:9.0-devel-centos7 is used as the base image, and the author is ELN; an email address can also be added.

FROM nvidia/cuda:9.0-devel-centos7
MAINTAINER ELN
           
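Note that MAINTAINER is deprecated in newer Docker releases; if preferred, the same information can be recorded with a LABEL instead (an optional, equivalent sketch):

FROM nvidia/cuda:9.0-devel-centos7
LABEL maintainer="ELN"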

Setting the time zone and installing Chinese language support

After installing python3.6 directly in the base image, printing Chinese characters in the python3.6 interpreter raised the error below. python3.6's default encoding was checked and is utf-8; it turned out that the Chinese garbling came from the Docker base image itself.

[[email protected] /]# python3.6
Python 3.6.5 (default, Apr 10 2018, 17:08:37)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-16)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> print(u"������")  # print(u"中文")
  File "<stdin>", line 0

    ^
SyntaxError: 'ascii' codec can't decode byte 0xe4 in position 8: ordinal not in range(128)
>>> import sys
>>> print(sys.getdefaultencoding())
utf-8
>>> exit()
[[email protected] /]# echo "中文"
中文
           

Checking /etc/localtime gives

lrwxrwxrwx. 1 root root 25 Oct 6 19:15 localtime -> ../usr/share/zoneinfo/UTC

so the time zone needs to be changed, and Chinese language support needs to be installed and configured.

Change the time zone, install Chinese language support, and configure the Chinese locale in the Dockerfile:

# Change the time zone, install Chinese language support, and configure the Chinese locale
RUN rm -rf /etc/localtime  && \
    ln -s /usr/share/zoneinfo/Asia/Shanghai /etc/localtime  && \
    yum -y install kde-l10n-Chinese  && \
    yum -y reinstall glibc-common  && \
    localedef -c -f UTF-8 -i zh_CN zh_CN.utf8  && \
    yum clean all  &&  rm -rf /var/cache/yum

ENV LC_ALL zh_CN.utf8
# Or run in the terminal: export LC_ALL=zh_CN.utf8
           
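Once the image is built, the locale and time zone settings can be spot-checked inside a running container, for example:

# Verify the locale and time zone inside the container
locale          # LC_ALL should report zh_CN.utf8
date            # should print the time in CST (Asia/Shanghai)
echo "中文"     # Chinese characters should display correctly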

Installing cuDNN 7.0.5

To install cuDNN 7.0.5, first download the package for the matching version; the archive is fairly large (268 MB) and can be slow to download. Alternatively, download it on the host first, copy the archive into the Docker build context, and install from there. Note that the archive name encodes the CUDA version it was built against, so with a CUDA 9.0 base image double-check which cuDNN 7.0.5 build you download (in the same v7.0.5 directory the CUDA 9.0 build would be cudnn-9.0-linux-x64-v7.tgz).

# Install cudnn7.0.5
RUN curl -fsSL https://developer.download.nvidia.com/compute/redist/cudnn/v7.0.5/cudnn-8.0-linux-x64-v7.tgz -O && \
    tar --no-same-owner -xzf cudnn-8.0-linux-x64-v7.tgz -C /usr/local  && \
    rm -rf cudnn-8.0-linux-x64-v7.tgz  && \
    ldconfig
           
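A quick way to confirm that the cuDNN libraries landed in the expected place and are visible to the dynamic linker (assuming the archive extracted under /usr/local/cuda) is:

# Check that the cuDNN shared libraries are present and registered
ls /usr/local/cuda/lib64/ | grep libcudnn
ldconfig -p | grep libcudnn    # should list libcudnn.so.7 if the linker cache picked it up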

Installing Python 3.6

In the Dockerfile, install a clean Python 3.6 environment together with some basic Python packages:

# Install Python 3.6
RUN yum -y install https://centos7.iuscommunity.org/ius-release.rpm && \
    yum -y install python36 && \
    yum -y install python36-pip && \
    yum -y install vim && \
    yum clean all  &&  rm -rf /var/cache/yum  && \
    # ln /usr/bin/python3.6 /usr/bin/python3  && \
    # ln /usr/bin/pip3.6 /usr/bin/pip3  && \
    mkdir ~/.pip/  && \
    echo -e "[global]\nindex-url = http://mirrors.aliyun.com/pypi/simple\n\n[install]\ntrusted-host=mirrors.aliyun.com" > ~/.pip/pip.conf

RUN pip3.6 --no-cache-dir install \
        Pillow \
        h5py \
        ipykernel \
        jupyter \
        matplotlib==2.1.1 \
        numpy==1.15.4 \
        pandas \
        scipy==1.1.0 \
        sklearn \
        && \
    python3.6 -m ipykernel.kernelspec
           
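Once this layer is built, the Python environment can be spot-checked inside the container, for example:

# Confirm the interpreter, pip, and the pinned package versions
python3.6 -V
pip3.6 -V
python3.6 -c "import numpy, scipy, matplotlib; print(numpy.__version__, scipy.__version__, matplotlib.__version__)"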

Installing tensorflow-gpu

Here it can simply be installed with pip:

# Install TensorFlow GPU version from central repo
RUN pip3.6 --no-cache-dir install tensorflow-gpu==1.5.0
           

Note: to avoid the errors below when running import tensorflow with numpy 1.17.0+, pin numpy==1.15.4 (any version below 1.17.0):

  • FutureWarning: Deprecated numpy API calls in tf.python.framework.dtypes #30427
  • Fix numpy warning with numpy 1.17.0+ #30559
[[email protected] docker]$ sudo docker run -it --rm --runtime=nvidia 1e3fc1854e8c /bin/bash
[[email protected] test]# python3
Python 3.6.8 (default, Apr 25 2019, 21:02:35) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-36)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:493: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:494: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:495: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:496: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:497: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:502: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
>>> exit()
           

Checking the CUDA and cuDNN versions:

[[email protected] docker]$ sudo docker run -it --rm --runtime=nvidia 1e3fc1854e8c /bin/bash

# The CUDA version installed inside the container is 9.0; the CUDA version installed on the host is 10.1
[[email protected] /]# nvidia-smi
Mon Jul 29 17:26:42 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.56       Driver Version: 418.56       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro P6000        Off  | 00000000:01:00.0  On |                  Off |
| 26%   29C    P8    11W / 250W |    472MiB / 24447MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

# Check the CUDA version
[[email protected] /]# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176

[[email protected] /]# cat /usr/local/cuda/version.txt
CUDA Version 9.0.176

# Check the cuDNN version
[[email protected] /]# cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 0
#define CUDNN_PATCHLEVEL 5
--
#define CUDNN_VERSION    (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)

#include "driver_types.h"
           
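Besides checking the library versions, a quick way to confirm that this TensorFlow build actually places operations on the GPU is to enable device placement logging; the following is a minimal sketch (a longer timing test appears at the end of this document):

# -*- coding: utf-8 -*-
# Log which device (CPU or GPU) each op is assigned to
import tensorflow as tf

a = tf.constant([1.0, 2.0, 3.0], name="a")
b = tf.constant([4.0, 5.0, 6.0], name="b")
c = a + b

with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    print(sess.run(c))    # the placement log should show ops assigned to /device:GPU:0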

Installing opencv-python

The problems first encountered when installing opencv-python in a Docker container were summarized earlier, so the opencv-python installation commands from the previous document "Docker Images: Centos7 + Python3.6 + Tensorflow + Opencv + Dlib" have been simplified here.

# Install opencv-python
RUN yum -y install libSM.x86_64 \
        libXrender.x86_64 \
        libXext.x86_64  && \
    yum clean all  &&  rm -rf /var/cache/yum

RUN pip3.6 --no-cache-dir install opencv-python==3.4.1.15
           
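A quick check that the opencv-python wheel and the system libraries it needs (libSM, libXrender, libXext) load correctly:

python3.6 -c "import cv2; print(cv2.__version__)"    # expected: 3.4.1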

Installing dlib

Problems encountered while installing dlib, and how they were resolved:

  • Installing dlib directly fails with:

    CMake must be installed to build the following extensions: dlib

  • yum -y install cmake

    After installing cmake, installing dlib fails again with:

    subprocess.CalledProcessError: Command '['cmake', '/tmp/pip-build-g_ptsyo_/dlib/tools/python', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/tmp/pip-build-g_ptsyo_/dlib/build/lib.linux-x86_64-3.6', '-DPYTHON_EXECUTABLE=/usr/bin/python3.6', '-DCMAKE_BUILD_TYPE=Release']' returned non-zero exit status 1.

  • yum install -y python36u-devel.x86_64

    This still produces the same error.

  • yum -y groupinstall "Development tools"

    After this, dlib installs successfully.
# Install dlib
RUN yum -y groupinstall "Development tools"  && \
    yum -y install cmake && \
    yum clean all  # &&  rm -rf /var/cache/yum

RUN yum install -y python36-devel.x86_64  && \
    yum clean all  # &&  rm -rf /var/cache/yum
# yum search python3 | grep devel

RUN pip3.6 --no-cache-dir install dlib
           
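After the build finishes, dlib can be given a quick smoke test, for example the small sketch below (no image files required):

# -*- coding: utf-8 -*-
# Import dlib and construct its built-in HOG face detector
import dlib

print(dlib.__version__)
detector = dlib.get_frontal_face_detector()
print(detector)    # a dlib.fhog_object_detector instance if the build works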

Installing other Python dependencies

# Install keras ...
RUN pip3.6 --no-cache-dir install Cython

RUN pip3.6 --no-cache-dir install \
        keras \
        flask \
        flask_cors \
        flask_socketio \
        scikit-image \
        mrcnn \
        imgaug \
        pycocotools
           
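These packages can also be spot-checked with a one-line import test (Keras should report the TensorFlow backend):

python3.6 -c "import keras, flask, skimage, imgaug; print(keras.__version__)"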

Other settings

RUN mkdir /test

WORKDIR /test

CMD ["/bin/bash"]
           

The complete Dockerfile

The Dockerfile below simply follows the build steps above; the layering of the image can be adjusted as needed.

FROM nvidia/cuda:9.0-devel-centos7
MAINTAINER ELN

# Change the time zone, install Chinese language support, and configure the Chinese locale
RUN rm -rf /etc/localtime  && \
    ln -s /usr/share/zoneinfo/Asia/Shanghai /etc/localtime  && \
    yum -y install kde-l10n-Chinese  && \
    yum -y reinstall glibc-common  && \
    localedef -c -f UTF-8 -i zh_CN zh_CN.utf8  && \
    yum clean all  &&  rm -rf /var/cache/yum

ENV LC_ALL zh_CN.utf8
# Or run in the terminal: export LC_ALL=zh_CN.utf8

# Install cudnn7.0.5
#RUN curl -fsSL https://developer.download.nvidia.com/compute/redist/cudnn/v7.0.5/cudnn-8.0-linux-x64-v7.tgz -O && \
#    tar --no-same-owner -xzf cudnn-8.0-linux-x64-v7.tgz -C /usr/local  && \
#    rm -rf cudnn-8.0-linux-x64-v7.tgz  && \
#    ldconfig

ADD cudnn-8.0-linux-x64-v7.tgz /usr/local/
RUN ldconfig

# Install Python 3.6
RUN yum -y install https://centos7.iuscommunity.org/ius-release.rpm && \
    yum -y install python36 && \
    yum -y install python36-pip && \
    yum -y install vim && \
    yum clean all  &&  rm -rf /var/cache/yum  && \
    # ln /usr/bin/python3.6 /usr/bin/python3  && \
    # ln /usr/bin/pip3.6 /usr/bin/pip3  && \
    mkdir ~/.pip/  && \
    echo -e "[global]\nindex-url = http://mirrors.aliyun.com/pypi/simple\n\n[install]\ntrusted-host=mirrors.aliyun.com" > ~/.pip/pip.conf

RUN pip3.6 --no-cache-dir install \
        Pillow \
        h5py \
        ipykernel \
        jupyter \
        matplotlib==2.1.1 \
        numpy==1.15.4 \
        pandas \
        scipy==1.1.0 \
        sklearn \
        && \
    python3.6 -m ipykernel.kernelspec

# Install TensorFlow GPU version from central repo
RUN pip3.6 --no-cache-dir install tensorflow-gpu==1.5.0

# Install opencv-python
RUN yum -y install libSM.x86_64 \
        libXrender.x86_64 \
        libXext.x86_64  && \
    yum clean all  &&  rm -rf /var/cache/yum

RUN pip3.6 --no-cache-dir install opencv-python==3.4.1.15

# Install dlib
RUN yum -y groupinstall "Development tools"  && \
    yum -y install cmake && \
    yum clean all  # &&  rm -rf /var/cache/yum

RUN yum install -y python36-devel.x86_64  && \
    yum clean all  # &&  rm -rf /var/cache/yum
# yum search python3 | grep devel

RUN pip3.6 --no-cache-dir install dlib

# Install keras ...
RUN pip3.6 --no-cache-dir install Cython

RUN pip3.6 --no-cache-dir install \
        keras \
        flask \
        flask_cors \
        flask_socketio \
        scikit-image \
        mrcnn \
        imgaug \
        pycocotools

RUN mkdir /test

WORKDIR /test

CMD ["/bin/bash"]
           

Building and testing the image

Write the above into a Dockerfile, then build the image and test it:

[[email protected] docker]$ vim dockerfile
[[email protected] docker]$ sudo docker build -t="test" .
[[email protected] docker]$ sudo docker images
REPOSITORY                                     TAG                                  IMAGE ID            CREATED             SIZE
test                                           latest                               eb7674684afa        2 seconds ago       4.18GB
nvidia/cuda                                    9.0-devel-centos7                    ff358ea56625        3 months ago        1.9GB
[[email protected] docker]$ sudo docker run -it --rm --runtime=nvidia test
           
Add --runtime=nvidia to the docker run command. If the machine has multiple graphics cards, a specific one can be selected with -e, for example -e NVIDIA_VISIBLE_DEVICES=0.
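On newer Docker versions (19.03 and later) with the NVIDIA Container Toolkit installed, the --gpus flag can be used instead of --runtime=nvidia, for example:

# Equivalent run commands on Docker 19.03+ with the NVIDIA Container Toolkit
sudo docker run -it --rm --gpus all test
sudo docker run -it --rm --gpus '"device=0"' test    # restrict the container to GPU 0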

[[email protected] test]# echo "中文"
中文
[[email protected] test]# pip3 --version
pip 8.1.2 from /usr/lib/python3.6/site-packages (python 3.6)
[[email protected] test]# python3
Python 3.6.8 (default, Apr 25 2019, 21:02:35) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-36)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
>>> import cv2
>>> import dlib
>>> print("中文")
中文
>>> exit()
           

Test the GPU's compute capability and verify that the tensorflow-gpu build is installed correctly:

Run the tensorflow-gpu test code inside the container:

[[email protected] test]# vim testgpu.py
[[email protected] test]# cat testgpu.py 
# -*- coding: utf-8 -*-
"""
Test the GPU's compute capability and verify that the tensorflow-GPU build is installed correctly.
"""

import tensorflow as tf
import numpy as np
import time

value = np.random.randn(5000, 1000)
a = tf.constant(value)

b = a * a

c = 0
tic = time.time()
with tf.Session() as sess:
    for i in range(1000):
        sess.run(b)

        c += 1
        if c % 100 == 0:
            d = c / 10
            # print(d)
            print("Progress: %s%%" % d)

toc = time.time()
t_cost = toc - tic

print("Elapsed time: %s" % t_cost)

[[email protected] test]# python3 testgpu.py 
2019-07-29 20:18:54.026579: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-07-29 20:18:54.290955: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0 with properties: 
name: Quadro P6000 major: 6 minor: 1 memoryClockRate(GHz): 1.645
pciBusID: 0000:01:00.0
totalMemory: 23.87GiB freeMemory: 23.26GiB
2019-07-29 20:18:54.291030: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: Quadro P6000, pci bus id: 0000:01:00.0, compute capability: 6.1)
Progress: 10.0%
Progress: 20.0%
Progress: 30.0%
Progress: 40.0%
Progress: 50.0%
Progress: 60.0%
Progress: 70.0%
Progress: 80.0%
Progress: 90.0%
Progress: 100.0%
Elapsed time: 14.024679899215698
           

While the test code is running, open two more terminals and check GPU usage from inside the container and on the host, respectively:

[[email protected] docker]$ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
7da5e0966487        test                "/bin/bash"         2 minutes ago       Up 2 minutes                            quirky_kapitsa
[[email protected] docker]$ docker exec -it 7da5e0966487 /bin/bash
[[email protected] test]# nvidia-smi
Mon Jul 29 20:19:00 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.56       Driver Version: 418.56       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro P6000        Off  | 00000000:01:00.0  On |                  Off |
| 26%   26C    P8    19W / 250W |  23260MiB / 24447MiB |     95%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
[[email protected] test]# exit
exit
           
[[email protected] docker]$ nvidia-smi
Mon Jul 29 20:19:05 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.56       Driver Version: 418.56       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro P6000        Off  | 00000000:01:00.0  On |                  Off |
| 26%   27C    P8    19W / 250W |  23260MiB / 24447MiB |     95%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1099      G   /usr/lib/xorg/Xorg                            36MiB |
|    0      1734      C   python3                                    22779MiB |
|    0      2569      G   fcitx-qimpanel                                36MiB |
|    0      3667      G   /usr/lib/xorg/Xorg                            39MiB |
|    0      4497      G   fcitx-qimpanel                                36MiB |
|    0      4521      G   unity-control-center                           4MiB |
|    0      5564      G   /usr/lib/xorg/Xorg                            51MiB |
|    0      6430      G   /usr/lib/xorg/Xorg                           107MiB |
+-----------------------------------------------------------------------------+

# Monitor GPU usage in real time
[[email protected] docker]$ watch -n 0.1 -d nvidia-smi