
Implementing Convolution, Transposed Convolution, and Dilated Convolution in TensorFlow

    TensorFlow already provides convolution (tf.nn.conv2d), transposed convolution / deconvolution (tf.nn.conv2d_transpose), and dilated convolution (tf.nn.atrous_conv2d). The meaning of these functions' parameters is well covered elsewhere; the harder part is working out the output dimensions, so this article wraps convolution, transposed convolution, and dilated convolution in three helper functions that are convenient to call:

1. Convolution

  • Input image size: W×W
  • Filter size: F×F
  • Stride: S
  • Padding: P pixels

   From these we get

N = ⌊(W − F + 2P) / S⌋ + 1

    The output image size is N×N; a short worked example follows.
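
    For example, a 100×100 input convolved with a 3×3 filter at stride 2 and padding 1 gives N = ⌊(100 − 3 + 2×1) / 2⌋ + 1 = ⌊99 / 2⌋ + 1 = 50, i.e. a 50×50 output.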

    You can use TensorFlow's higher-level API slim.conv2d:

net = slim.conv2d(inputs=inputs,
                  num_outputs=num_outputs,
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=reg,
                  kernel_size=[kernel, kernel],
                  activation_fn=activation_fn,
                  stride=stride,
                  padding=padding,
                  trainable=True,
                  scope=scope)

     For some special cases you can pad the feature map yourself:

def slim_conv2d(inputs,num_outputs,stride,padding,kernel,activation_fn,reg,scope):
    if padding=="VALID":
        # manually pre-pad the feature map with REFLECT so the subsequent VALID conv behaves like SAME
        padding_size=int(kernel /2)
        inputs = tf.pad(inputs, paddings=[[0, 0], [padding_size, padding_size], [padding_size, padding_size], [0, 0]],
                     mode='REFLECT')
        print("pad.inputs.shape:{}".format(inputs.get_shape()))
    net = slim.conv2d(inputs=inputs,
                      num_outputs=num_outputs,
                      weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                      weights_regularizer=reg,
                      kernel_size=[kernel, kernel],
                      activation_fn=activation_fn,
                      stride=stride,
                      padding=padding,
                      trainable=True,
                      scope=scope)
    print("net.{}.shape:{}".format(scope,net.get_shape()))
    return net      
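
    As a quick sanity check, here is a minimal usage sketch of slim_conv2d (assuming TensorFlow 1.x with tf.contrib.slim; the input shape and regularizer are illustrative):

import tensorflow as tf
import tensorflow.contrib.slim as slim

inputs = tf.ones(shape=[4, 100, 100, 3])
net = slim_conv2d(inputs,
                  num_outputs=32,
                  stride=2,
                  padding="VALID",
                  kernel=3,
                  activation_fn=tf.nn.relu,
                  reg=slim.l2_regularizer(scale=0.01),
                  scope="conv1")
# the REFLECT pre-padding makes the VALID conv behave like SAME, so stride 2 halves the size:
# (100 + 2 - 3)//2 + 1 = 50
print(net.get_shape())  # expected: (4, 50, 50, 32)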

    Below is a convolution layer wrapped directly with TensorFlow ops, providing functionality similar to the built-in slim.conv2d high-level API:

def conv2D_layer(inputs,num_outputs,kernel_size,activation_fn,stride,padding,scope,weights_regularizer):
    '''
    A convolution layer wrapped in the style of the tensorflow slim module: convolution plus activation, without pooling
    :param inputs:
    :param num_outputs:
    :param kernel_size: kernel size, typically [1,1], [3,3] or [5,5]
    :param activation_fn: activation function
    :param stride: e.g. [2,2]
    :param padding: SAME or VALID
    :param scope: scope name
    :param weights_regularizer: regularizer, e.g. weights_regularizer = slim.l2_regularizer(scale=0.01)
    :return:
    '''
    with  tf.variable_scope(name_or_scope=scope):
        in_channels = inputs.get_shape().as_list()[3]
        # kernel=[height, width, in_channels, output_channels]
        kernel=[kernel_size[0],kernel_size[1],in_channels,num_outputs]
        strides=[1,stride[0],stride[1],1]
        # filter_weight=tf.Variable(initial_value=tf.truncated_normal(shape,stddev=0.1))
        filter_weight = slim.variable(name='weights',
                                      shape=kernel,
                                      initializer=tf.truncated_normal_initializer(stddev=0.1),
                                      regularizer=weights_regularizer)
        bias = tf.Variable(tf.constant(0.01, shape=[num_outputs]))
        inputs = tf.nn.conv2d(inputs, filter_weight, strides, padding=padding) + bias
        if activation_fn is not None:
            inputs = activation_fn(inputs)
        return inputs      
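
    A minimal usage sketch of conv2D_layer (same assumptions as above; shapes are illustrative):

inputs = tf.ones(shape=[4, 100, 100, 3])
net = conv2D_layer(inputs,
                   num_outputs=32,
                   kernel_size=[3, 3],
                   activation_fn=tf.nn.relu,
                   stride=[2, 2],
                   padding="SAME",
                   scope="my_conv",
                   weights_regularizer=None)
# SAME padding with stride 2: ceil(100 / 2) = 50
print(net.get_shape())  # expected: (4, 50, 50, 32)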

2. Transposed Convolution (Deconvolution)

     TensorFlow's high-level APIs already wrap transposed convolution as tf.layers.conv2d_transpose and slim.conv2d_transpose, which are used in essentially the same way. If you want to implement the same functionality with tf.nn.conv2d_transpose, you have to compute the output dimensions yourself for padding='VALID' or 'SAME'. The deconv_output_length function below computes the output dimension automatically from the input dimension, filter_size, padding, and stride.

# -*-coding: utf-8 -*-
"""
    @Project: YNet-python
    @File   : myTest.py
    @Author : panjq
    @E-mail : [email protected]
    @Date   : 2019-01-10 15:51:23
"""
import tensorflow as tf
import tensorflow.contrib.slim as slim

def deconv_output_length(input_length, filter_size, padding, stride):
  """Determines output length of a transposed convolution given input length.
  Arguments:
      input_length: integer.
      filter_size: integer.
      padding: one of 'SAME', 'VALID' or 'FULL'.
      stride: integer.
  Returns:
      The output length (integer).
  """
  if input_length is None:
    return None
  # default case: 'SAME'
  input_length *= stride
  if padding == 'VALID':
    input_length += max(filter_size - stride, 0)
  elif padding == 'FULL':
    input_length -= (stride + filter_size - 2)
  return input_length
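
    For example, for an input dimension of 100 with filter_size=10 and stride=2, the function gives:

print(deconv_output_length(100, 10, padding='SAME', stride=2))   # 100 * 2 = 200
print(deconv_output_length(100, 10, padding='VALID', stride=2))  # 200 + max(10 - 2, 0) = 208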



def conv2D_transpose_layer(inputs,num_outputs,kernel_size,activation_fn,stride,padding,scope,weights_regularizer):
    '''
    A transposed-convolution (deconvolution) layer API: transposed convolution plus activation, without pooling
    :param inputs: input Tensor=[batch, in_height, in_width, in_channels]
    :param num_outputs:
    :param kernel_size: kernel size, typically [1,1], [3,3] or [5,5]
    :param activation_fn: activation function
    :param stride: e.g. [2,2]
    :param padding: SAME or VALID
    :param scope: scope name
    :param weights_regularizer: regularizer, e.g. weights_regularizer = slim.l2_regularizer(scale=0.01)
    :return:
    '''
    with  tf.variable_scope(name_or_scope=scope):
        # shape = [batch_size, height, width, channel]
        in_shape = inputs.get_shape().as_list()
        # compute the output dimensions of the transposed convolution
        output_height=deconv_output_length(in_shape[1], kernel_size[0], padding=padding, stride=stride[0])
        output_width =deconv_output_length(in_shape[2], kernel_size[1], padding=padding, stride=stride[1])
        output_shape=[in_shape[0],output_height,output_width,num_outputs]

        strides=[1,stride[0],stride[1],1]

        # kernel=[kernel_size, kernel_size, output_channel, input_channel ]
        kernel=[kernel_size[0],kernel_size[1],num_outputs,in_shape[3]]
        filter_weight = slim.variable(name='weights',
                                      shape=kernel,
                                      initializer=tf.truncated_normal_initializer(stddev=0.1),
                                      regularizer=weights_regularizer)
        bias = tf.Variable(tf.constant(0.01, shape=[num_outputs]))
        inputs = tf.nn.conv2d_transpose(value=inputs, filter=filter_weight,output_shape=output_shape, strides=strides, padding=padding) + bias
        if activation_fn is not None:
            inputs = activation_fn(inputs)
        return inputs

if __name__ == "__main__":
    inputs = tf.ones(shape=[4, 100, 100, 3])
    stride=2
    kernel_size=10
    padding="SAME"
    net1 = tf.layers.conv2d_transpose(inputs=inputs,
                                      filters=32,
                                      kernel_size=kernel_size,
                                      strides=stride,
                                      padding=padding)
    net2 = slim.conv2d_transpose(inputs=inputs,
                                 num_outputs=32,
                                 kernel_size=[kernel_size, kernel_size],
                                 stride=[stride, stride],
                                 padding=padding)
    net3 = conv2D_transpose_layer(inputs=inputs,
                                  num_outputs=32,
                                  kernel_size=[kernel_size, kernel_size],
                                  activation_fn=tf.nn.relu,
                                  stride=[stride, stride],
                                  padding=padding,
                                  scope="conv2D_transpose_layer",
                                  weights_regularizer=None)
    print("net1.shape:{}".format(net1.get_shape()))
    print("net2.shape:{}".format(net2.get_shape()))
    print("net3.shape:{}".format(net3.get_shape()))

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())      
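
    With padding="SAME" and stride=2 on the 100×100 input above, all three layers should report the same shape, (4, 200, 200, 32); with padding="VALID" they should all report (4, 208, 208, 32), matching deconv_output_length.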

3. Dilated Convolution: Enlarging the Receptive Field

    Dilated/atrous convolution (also called dilated convolutions) introduces a new parameter to the convolution layer, the "dilation rate", which defines the spacing between the kernel values as the kernel processes the data. The goal is to obtain a larger receptive field without pooling (pooling loses information) and with a comparable amount of computation.

    An important parameter of dilated convolution is the rate, which determines the size of the holes. The concept, and how the operation works, can be understood from two angles:

1) From the perspective of the original image, the "holes" correspond to sampling the original image. The sampling frequency is set by the rate parameter: when rate is 1, the image is sampled without losing any information and the operation reduces to a standard convolution; when rate > 1, say 2, the image is sampled with a gap of (rate − 1) pixels, as in figure (b). Think of the red points as the sampling positions on the original image; the sampled image is then convolved with the kernel, which effectively enlarges the receptive field.

2) From the kernel's perspective, dilation enlarges the kernel: rate − 1 zeros are inserted between adjacent kernel values, and the enlarged kernel is convolved with the original image, which again enlarges the receptive field.

3) To enlarge the receptive field, a standard convolutional network can use pooling to downsample, reducing the image scale while growing the receptive field, but pooling itself is not learnable and throws away a lot of detail. A dilated convolution achieves a large receptive field, and therefore sees more context, without pooling.

4) Increasing the size of the convolution kernel also enlarges the receptive field, but that increases computation.
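
    Concretely, a k×k kernel with dilation rate r has an effective size of k + (k − 1)(r − 1): a 3×3 kernel with rate 2 covers a 5×5 region, and with rate 4 a 9×9 region, while still using only 9 weights.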

Standard convolution: (figure omitted)

Dilated convolution: (figure omitted)

    The dilated convolution layer can be wrapped as follows, using tf.nn.atrous_conv2d:

def dilated_conv2D_layer(inputs,num_outputs,kernel_size,activation_fn,rate,padding,scope,weights_regularizer):
    '''
    A dilated (atrous) convolution layer wrapped with TensorFlow ops: dilated convolution plus activation, without pooling
    :param inputs:
    :param num_outputs:
    :param kernel_size: kernel size, typically [1,1], [3,3] or [5,5]
    :param activation_fn: activation function
    :param rate: dilation rate
    :param padding: SAME or VALID
    :param scope: scope name
    :param weights_regularizer: regularizer, e.g. weights_regularizer = slim.l2_regularizer(scale=0.01)
    :return:
    '''
    with  tf.variable_scope(name_or_scope=scope):
        in_channels = inputs.get_shape().as_list()[3]
        kernel=[kernel_size[0],kernel_size[1],in_channels,num_outputs]
        # filter_weight=tf.Variable(initial_value=tf.truncated_normal(shape,stddev=0.1))
        filter_weight = slim.variable(name='weights',
                                      shape=kernel,
                                      initializer=tf.truncated_normal_initializer(stddev=0.1),
                                      regularizer=weights_regularizer)
        bias = tf.Variable(tf.constant(0.01, shape=[num_outputs]))
        # inputs = tf.nn.conv2d(inputs,filter_weight, strides, padding=padding) + bias
        inputs = tf.nn.atrous_conv2d(inputs, filter_weight, rate=rate, padding=padding) + bias
        if activation_fn is not None:
            inputs = activation_fn(inputs)
        return inputs
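
    A minimal usage sketch of dilated_conv2D_layer (same assumptions as the earlier examples; shapes are illustrative):

inputs = tf.ones(shape=[4, 100, 100, 3])
net = dilated_conv2D_layer(inputs,
                           num_outputs=32,
                           kernel_size=[3, 3],
                           activation_fn=tf.nn.relu,
                           rate=2,
                           padding="SAME",
                           scope="dilated_conv",
                           weights_regularizer=None)
# tf.nn.atrous_conv2d uses an implicit stride of 1, so SAME padding preserves the spatial size
print(net.get_shape())  # expected: (4, 100, 100, 32)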