Deep Learning Notes 16: Reading the Classic CNN Paper on AlexNet, with a TensorFlow Implementation

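In the decade or so after Yann LeCun proposed LeNet-5, neural networks developed slowly and spent a long stretch in a trough, held back by their poor interpretability and by limited computing power. In 2012, Alex Krizhevsky, a student of Geoffrey Hinton (one of the "big three" of deep learning, often called the father of neural networks), proposed AlexNet and won that year's ILSVRC (ImageNet Large Scale Visual Recognition Challenge) by a wide margin. The top-5 error rate dropped to 16.4%, a large improvement over the runner-up's 26.2%. The result drew a great deal of attention from academia and industry, and computer vision gradually entered an era dominated by deep learning.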

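AlexNet inherits the ideas of LeCun's LeNet-5 and scales the convolutional neural network up into a much wider and deeper network. Compared with the roughly 60,000 parameters of LeNet-5, AlexNet contains about 630 million connections, 60 million parameters and 650,000 neurons. Its structure consists of 5 convolutional layers, with max-pooling layers after the first, second and fifth convolutional layers, followed by 3 fully connected layers. AlexNet's innovations are: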

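● First successful use of ReLU as the activation function. In deeper networks it outperforms the traditional sigmoid activation and greatly alleviates the vanishing-gradient problem.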

● First to make dropout work in practice: dropout is added to the fully connected layers to prevent overfitting.

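● Where LeNet-5 used average pooling, AlexNet is the first to use overlapping max pooling, which avoids the blurring effect of average pooling.

● Proposes the LRN (local response normalization) layer, which sets up a competition mechanism among the activities of neighboring neurons.

● Uses multiple GPUs for parallel computation.

● Applies data augmentation, which also alleviates overfitting to some degree.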
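
To make the vanishing-gradient point in the first bullet concrete, here is a minimal NumPy sketch (not part of the original article) that compares the local derivatives of sigmoid and ReLU when they are multiplied across a stack of layers. The depth of 10 and the pre-activation value of 2.0 are illustrative assumptions, and weights are ignored, so this only shows the contribution of the activation function itself.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)              # never exceeds 0.25 (reached at z == 0)

def relu_grad(z):
    return np.where(z > 0, 1.0, 0.0)  # 1 for positive inputs, 0 otherwise

depth = 10   # assumed number of stacked layers
z = 2.0      # assumed pre-activation value at every layer

# Backpropagation multiplies one local derivative per layer.
sigmoid_chain = sigmoid_grad(z) ** depth
relu_chain = relu_grad(z) ** depth
print(sigmoid_chain, relu_chain)
```

Under these assumptions the sigmoid product shrinks by many orders of magnitude while the ReLU product stays at 1 for active units, which is the effect the bullet describes.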

AlexNet network structure

The above covers the basics and innovations of AlexNet. Next, let's look at its network architecture.


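Not counting the pooling layers, AlexNet has 8 layers in total. The first 5 are convolutional layers, of which the first, second and fifth are each followed by a max-pooling layer; the last 3 are fully connected layers. The structure of AlexNet in brief is therefore:

Input > Conv > Pool > Conv > Pool > Conv > Conv > Conv > Pool > FC > FC > FC > Output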

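The structure and parameters of each layer are as follows:

C1 is a convolutional layer:

Input: 227 x 227 x 3  Filter size: 11 x 11 x 3  Number of filters: 96  Stride: 4

Output: 55 x 55 x 96

P1 is the pooling layer after C1:

Input: 55 x 55 x 96  Pooling window: 3 x 3  Stride: 2  Channels: 96

Output: 27 x 27 x 96

C2 is a convolutional layer:

Input: 27 x 27 x 96  Filter size: 5 x 5 x 96  Number of filters: 256

Output: 27 x 27 x 256

P2 is the pooling layer after C2:

Input: 27 x 27 x 256  Pooling window: 3 x 3  Stride: 2  Channels: 256

Output: 13 x 13 x 256

C3 is a convolutional layer:

Input: 13 x 13 x 256  Filter size: 3 x 3 x 256  Number of filters: 384

Output: 13 x 13 x 384

C4 is a convolutional layer:

Input: 13 x 13 x 384  Filter size: 3 x 3 x 384  Number of filters: 384

Output: 13 x 13 x 384

C5 is a convolutional layer:

Input: 13 x 13 x 384  Filter size: 3 x 3 x 384  Number of filters: 256

Output: 13 x 13 x 256

P5 is the pooling layer after C5:

Input: 13 x 13 x 256  Pooling window: 3 x 3  Stride: 2  Channels: 256

Output: 6 x 6 x 256

F6 is a fully connected layer:

Input: 6 x 6 x 256

Output: 4096

F7 is a fully connected layer:

Input: 4096

Output: 4096

F8 is also a fully connected layer and serves as the output layer:

Input: 4096

Output: 1000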
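
The spatial sizes in the list above can be checked with the standard output-size formula, floor((n - k + 2p) / s) + 1. The sketch below is an illustration rather than code from the article: the strides follow the implementation later in this note, and the padding values for C2 to C5 (2 for the 5 x 5 filters, 1 for the 3 x 3 filters) are assumptions chosen to match 'SAME' padding at stride 1.

```python
def out_size(n, k, stride=1, pad=0):
    """Spatial output size of a conv/pool layer: floor((n - k + 2*pad) / stride) + 1."""
    return (n - k + 2 * pad) // stride + 1

# (name, kernel size, stride, padding); padding for C2..C5 is an assumption
layers = [
    ("C1", 11, 4, 0),
    ("P1", 3, 2, 0),
    ("C2", 5, 1, 2),
    ("P2", 3, 2, 0),
    ("C3", 3, 1, 1),
    ("C4", 3, 1, 1),
    ("C5", 3, 1, 1),
    ("P5", 3, 2, 0),
]

size = 227
sizes = {}
for name, k, s, p in layers:
    size = out_size(size, k, s, p)
    sizes[name] = size
    print(name, size)
```

With a 224 x 224 input, (224 - 11) / 4 is not an integer, so the 55 x 55 output of C1 only works out cleanly with a 227 x 227 input.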

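The paper states the input image size as 224 x 224 x 3, but in practice it is 227 x 227 x 3. The outputs of all layers except the final output layer are activated with ReLU. Although the first five convolutional layers account for most of the computation, they hold far fewer parameters than the last three fully connected layers, yet they matter much more than the fully connected layers do.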
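
The claim about parameter counts can be verified with a short calculation. This is a sketch under stated assumptions, not a figure taken from the paper: it counts weights plus biases, and it assumes that C2, C4 and C5 are split into 2 groups (as in the implementation in this note), so each filter in those layers sees only half of the input channels.

```python
def conv_params(k, c_in, c_out, groups=1):
    """Weights plus biases of a k x k convolution; each filter sees c_in // groups channels."""
    return k * k * (c_in // groups) * c_out + c_out

def fc_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

conv_layers = {
    "C1": conv_params(11, 3, 96),
    "C2": conv_params(5, 96, 256, groups=2),   # groups=2 is an assumption
    "C3": conv_params(3, 256, 384),
    "C4": conv_params(3, 384, 384, groups=2),  # groups=2 is an assumption
    "C5": conv_params(3, 384, 256, groups=2),  # groups=2 is an assumption
}
fc_layers = {
    "F6": fc_params(6 * 6 * 256, 4096),
    "F7": fc_params(4096, 4096),
    "F8": fc_params(4096, 1000),
}

conv_total = sum(conv_layers.values())
fc_total = sum(fc_layers.values())
print(conv_total, fc_total, conv_total + fc_total)
```

Under these assumptions the convolutional layers hold about 2.3 million parameters and the fully connected layers about 58.6 million, for a total of roughly 61 million, in line with the 60 million quoted earlier.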

[Figure: AlexNet's classification error rates on the validation and test sets]

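TensorFlow implementation of AlexNet

We keep following the basic steps and methods from the earlier notes for building a convolutional neural network with TensorFlow: define a module that creates the input and output placeholder variables, a module that initializes the parameters of each layer, a module for forward propagation, and the optimization and iteration model, and finally set up the input data.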

● Define the convolution operation

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.variable_scope, tf.get_variable)

def conv(x, filter_height, filter_width, num_filters, stride_y, stride_x, name,
         padding='SAME', groups=1):
    # Get number of input channels
    input_channels = int(x.get_shape()[-1])

    # Create lambda function for the convolution
    convolve = lambda i, k: tf.nn.conv2d(i, k,
                                         strides=[1, stride_y, stride_x, 1],
                                         padding=padding)

    with tf.variable_scope(name) as scope:
        # Create tf variables for the weights and biases of the conv layer
        # (integer division keeps the shape entries integral in Python 3)
        weights = tf.get_variable('weights', shape=[filter_height,
                                                    filter_width,
                                                    input_channels // groups,
                                                    num_filters])
        biases = tf.get_variable('biases', shape=[num_filters])

    if groups == 1:
        conv = convolve(x, weights)

    # In the case of multiple groups, split inputs & weights
    else:
        # Split input and weights and convolve them separately
        input_groups = tf.split(axis=3, num_or_size_splits=groups, value=x)
        weight_groups = tf.split(axis=3, num_or_size_splits=groups,
                                 value=weights)
        output_groups = [convolve(i, k) for i, k in zip(input_groups, weight_groups)]

        # Concat the convolved output together again
        conv = tf.concat(axis=3, values=output_groups)

    # Add biases
    bias = tf.reshape(tf.nn.bias_add(conv, biases), tf.shape(conv))

    # Apply relu function
    relu_result = tf.nn.relu(bias, name=scope.name)

    return relu_result
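
The `groups` branch of `conv` splits the input channels and the filters into independent halves, convolves each half separately, and concatenates the results. The NumPy sketch below illustrates that idea on a single image; it is not the article's code. The helper names `conv2d_valid` and `grouped_conv2d`, the toy shapes, and the choice of stride 1 with no padding are all assumptions made to keep the example small.

```python
import numpy as np

def conv2d_valid(x, w):
    """Naive stride-1 convolution without padding (cross-correlation).
    x: (H, W, C_in), w: (kh, kw, C_in, C_out) -> (H-kh+1, W-kw+1, C_out)."""
    kh, kw, _, c_out = w.shape
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w, c_out))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i:i + kh, j:j + kw, :]               # (kh, kw, C_in)
            out[i, j, :] = np.tensordot(patch, w, axes=3)  # sum over kh, kw, C_in
    return out

def grouped_conv2d(x, w, groups):
    """Split channels and filters into `groups` parts, convolve each, concatenate."""
    x_groups = np.split(x, groups, axis=2)  # each part: (H, W, C_in / groups)
    w_groups = np.split(w, groups, axis=3)  # each part: (kh, kw, C_in / groups, C_out / groups)
    outputs = [conv2d_valid(xg, wg) for xg, wg in zip(x_groups, w_groups)]
    return np.concatenate(outputs, axis=2)

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 6, 4))     # toy input with 4 channels
w = rng.normal(size=(3, 3, 2, 8))  # 8 filters, each seeing 4 / 2 = 2 channels
y = grouped_conv2d(x, w, groups=2)
print(y.shape)
```

The first half of the output channels depends only on the first half of the input channels, which is what allowed the original network to place each group on its own GPU.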

● Define the fully connected layer

def fc(x, num_in, num_out, name, relu=True):
    with tf.variable_scope(name) as scope:
        # Create tf variables for the weights and biases
        weights = tf.get_variable('weights', shape=[num_in, num_out],
                                  trainable=True)
        biases = tf.get_variable('biases', [num_out], trainable=True)

        # Matrix multiply weights and inputs and add bias
        act = tf.nn.xw_plus_b(x, weights, biases, name=scope.name)

    if relu:
        # Apply ReLU non-linearity
        return tf.nn.relu(act)

    return act

● Define the max-pooling operation

def max_pool(x, filter_height, filter_width, stride_y, stride_x, name,
             padding='SAME'):
    return tf.nn.max_pool(x, ksize=[1, filter_height, filter_width, 1],
                          strides=[1, stride_y, stride_x, 1],
                          padding=padding, name=name)
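
AlexNet calls this wrapper with a 3 x 3 window and a stride of 2, so neighboring windows overlap. As a rough illustration of what that means, the NumPy sketch below pools a single 2-D feature map without padding; the helper name `max_pool2d` and the 5 x 5 toy input are assumptions for the example.

```python
import numpy as np

def max_pool2d(x, k=3, stride=2):
    """Naive max pooling without padding over a 2-D feature map."""
    out_h = (x.shape[0] - k) // stride + 1
    out_w = (x.shape[1] - k) // stride + 1
    out = np.empty((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[i, j] = x[r:r + k, c:c + k].max()
    return out

x = np.arange(25).reshape(5, 5)
pooled = max_pool2d(x)  # windows overlap because stride (2) < window size (3)
print(pooled)           # [[12 14]
                        #  [22 24]]
```

Row and column 2 of the input are shared by adjacent windows, which is the overlap that distinguishes this scheme from pooling with stride equal to the window size.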

● Define LRN

def lrn(x, radius, alpha, beta, name, bias=1.0):
    return tf.nn.local_response_normalization(x, depth_radius=radius,
                                              alpha=alpha, beta=beta,
                                              bias=bias, name=name)
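
For intuition, the normalization that `tf.nn.local_response_normalization` is documented to perform can be written out in NumPy: each value is divided by (bias + alpha * sum of squares over nearby channels) ** beta. The function name `lrn_numpy` and the all-ones toy input are assumptions for this sketch, which is meant as a readable reference rather than an efficient implementation.

```python
import numpy as np

def lrn_numpy(x, depth_radius, alpha, beta, bias=1.0):
    """x: (N, H, W, C). Normalize each channel by the squared activity of
    the channels within depth_radius of it (clipped at the channel edges)."""
    channels = x.shape[-1]
    out = np.empty_like(x, dtype=float)
    for d in range(channels):
        lo = max(0, d - depth_radius)
        hi = min(channels, d + depth_radius + 1)
        sqr_sum = np.sum(x[..., lo:hi] ** 2, axis=-1)
        out[..., d] = x[..., d] / (bias + alpha * sqr_sum) ** beta
    return out

x = np.ones((1, 1, 1, 5))  # toy input: one pixel with 5 channels
y = lrn_numpy(x, depth_radius=2, alpha=1.0, beta=1.0)
print(y[0, 0, 0])          # middle channel has the most neighbors, so it is damped most
```

A channel surrounded by strongly active neighbors is scaled down more, which is the competition mechanism mentioned in the list of innovations.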

● Define the dropout operation

def dropout(x, keep_prob):
    return tf.nn.dropout(x, keep_prob)
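
What this wrapper does at training time can be sketched in NumPy as inverted dropout: each unit is kept with probability `keep_prob`, and the survivors are scaled by 1 / keep_prob so that the expected activation is unchanged. The function name `dropout_numpy`, the fixed seed and the toy input below are assumptions for illustration only.

```python
import numpy as np

def dropout_numpy(x, keep_prob, rng):
    """Inverted dropout: zero each unit with probability 1 - keep_prob and
    scale the kept units by 1 / keep_prob."""
    mask = rng.random(x.shape) < keep_prob
    return np.where(mask, x / keep_prob, 0.0)

rng = np.random.default_rng(0)  # fixed seed only to make the sketch reproducible
x = np.ones((4, 8))
train_out = dropout_numpy(x, keep_prob=0.5, rng=rng)  # entries are 0.0 or 2.0
eval_out = dropout_numpy(x, keep_prob=1.0, rng=rng)   # nothing is dropped
print(train_out)
```

Passing `keep_prob=1.0` at evaluation time turns the operation into the identity, which is why the network takes `keep_prob` as an input rather than hard-coding it.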

With all the components for building AlexNet in place, we now use them to create an AlexNet class that implements the network.

class AlexNet(object):
    def __init__(self, x, keep_prob, num_classes, skip_layer,
                 weights_path='DEFAULT'):
        # Parse input arguments into class variables
        self.X = x
        self.NUM_CLASSES = num_classes
        self.KEEP_PROB = keep_prob
        self.SKIP_LAYER = skip_layer

        if weights_path == 'DEFAULT':
            self.WEIGHTS_PATH = 'bvlc_alexnet.npy'
        else:
            self.WEIGHTS_PATH = weights_path

        # Call the create function to build the computational graph of AlexNet
        self.create()

    def create(self):
        # 1st Layer: Conv (w ReLu) -> Lrn -> Pool
        conv1 = conv(self.X, 11, 11, 96, 4, 4, padding='VALID', name='conv1')
        norm1 = lrn(conv1, 2, 1e-04, 0.75, name='norm1')
        pool1 = max_pool(norm1, 3, 3, 2, 2, padding='VALID', name='pool1')

        # 2nd Layer: Conv (w ReLu) -> Lrn -> Pool with 2 groups
        conv2 = conv(pool1, 5, 5, 256, 1, 1, groups=2, name='conv2')
        norm2 = lrn(conv2, 2, 1e-04, 0.75, name='norm2')
        pool2 = max_pool(norm2, 3, 3, 2, 2, padding='VALID', name='pool2')

        # 3rd Layer: Conv (w ReLu)
        conv3 = conv(pool2, 3, 3, 384, 1, 1, name='conv3')

        # 4th Layer: Conv (w ReLu) split into two groups
        conv4 = conv(conv3, 3, 3, 384, 1, 1, groups=2, name='conv4')

        # 5th Layer: Conv (w ReLu) -> Pool, split into two groups
        conv5 = conv(conv4, 3, 3, 256, 1, 1, groups=2, name='conv5')
        pool5 = max_pool(conv5, 3, 3, 2, 2, padding='VALID', name='pool5')

        # 6th Layer: Flatten -> FC (w ReLu) -> Dropout
        flattened = tf.reshape(pool5, [-1, 6*6*256])
        fc6 = fc(flattened, 6*6*256, 4096, name='fc6')
        dropout6 = dropout(fc6, self.KEEP_PROB)

        # 7th Layer: FC (w ReLu) -> Dropout
        fc7 = fc(dropout6, 4096, 4096, name='fc7')
        dropout7 = dropout(fc7, self.KEEP_PROB)

        # 8th Layer: FC and return unscaled activations
        self.fc8 = fc(dropout7, 4096, self.NUM_CLASSES, relu=False, name='fc8')

    def load_initial_weights(self, session):
        # Load the weights into memory (allow_pickle=True is required by
        # newer NumPy versions to load a pickled dict)
        weights_dict = np.load(self.WEIGHTS_PATH, encoding='bytes',
                               allow_pickle=True).item()

        # Loop over all layer names stored in the weights dict
        for op_name in weights_dict:

            # Check if layer should be trained from scratch
            if op_name not in self.SKIP_LAYER:

                with tf.variable_scope(op_name, reuse=True):

                    # Assign weights/biases to their corresponding tf variable
                    for data in weights_dict[op_name]:

                        # Biases
                        if len(data.shape) == 1:
                            var = tf.get_variable('biases', trainable=False)
                            session.run(var.assign(data))

                        # Weights
                        else:
                            var = tf.get_variable('weights', trainable=False)
                            session.run(var.assign(data))

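In the code above, we used the previously defined components to wrap up the forward computation, and we load pretrained model weights from http://www.cs.toronto.edu/~guerzhoy/tf_alexnet/. With that, the basic AlexNet is in place.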
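
The weight-loading loop relies on the file holding a dictionary that maps each layer name to its arrays, with 1-D arrays treated as biases and everything else as weights. The sketch below mimics that logic without TensorFlow or the real download. The dictionary contents, the tiny shapes, the file name `toy_weights.npy` and the choice of `fc8` as the skipped layer are all made up for illustration, and the layout of the real file is an assumption inferred from the loop.

```python
import os
import tempfile
import numpy as np

# Toy stand-in for the pretrained file: {layer_name: [weights, biases]}
toy_weights = {
    "conv1": [np.zeros((3, 3, 3, 4)), np.zeros(4)],
    "fc8": [np.zeros((8, 2)), np.zeros(2)],
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_weights.npy")
    np.save(path, toy_weights)  # the dict is pickled into a 0-d object array
    loaded = np.load(path, allow_pickle=True).item()

skip_layer = ["fc8"]  # hypothetical: layers to be trained from scratch
assigned = {}
for op_name, params in loaded.items():
    if op_name in skip_layer:
        continue
    for data in params:
        kind = "biases" if data.ndim == 1 else "weights"
        assigned[(op_name, kind)] = data.shape  # stands in for session.run(var.assign(data))
print(assigned)
```

Skipping a layer this way leaves its variables at their fresh initialization, which is what makes it possible to replace the 1000-way output layer when fine-tuning on a different number of classes.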

Originally published: 2018-09-28

Author: louwill

This article comes from "Python愛好者社群" (Python Enthusiasts Community), a partner of the Yunqi Community (雲栖社群).
