
Fixing precision mismatches between NumPy and TensorFlow data preprocessing


The cause may be related to tf.reduce_mean; for details see https://github.com/tensorflow/tensorflow/issues/12387 and https://github.com/tensorflow/tensorflow/issues/5527

I did not have time to dig deeper. The temporary workaround is to run the TensorFlow preprocessing in tf.float64 and the NumPy preprocessing in np.float64 as well.
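
To give a sense of how large the dtype effect is, here is a minimal NumPy-only sketch that standardizes the same image once in float32 and once in float64 and reports the largest element-wise difference. The random 32x32 RGB image and the helper name `standardize` are assumptions made for illustration; the sketch does not reproduce TensorFlow's reduction kernels, only the effect of the working precision.

```python
import numpy as np

rng = np.random.RandomState(0)
# Hypothetical 32x32 RGB image with 8-bit pixel values (not from the original post).
image = rng.randint(0, 256, size=(32, 32, 3)).astype(np.uint8)


def standardize(im, dtype):
    """Subtract the mean and divide by the standard deviation, all in `dtype`."""
    im = im.astype(dtype)
    return (im - im.mean()) / im.std()


out32 = standardize(image, np.float32)
out64 = standardize(image, np.float64)

# Largest element-wise disagreement between the two precisions.
max_diff = np.max(np.abs(out32.astype(np.float64) - out64))
print("float32 result dtype:", out32.dtype)
print("max |float32 - float64| =", max_diff)
```

The difference is small, but it is enough to make exact comparisons between two pipelines fail when they do not share a dtype, which is why both sides are moved to float64 here.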

The TensorFlow preprocessing code is as follows:

  # image = tf.to_float(image)
  image = tf.to_double(image)
  # Resize and crop if needed.
  # Note: the signature is (image, target_height, target_width).
  resized_image = tf.image.resize_image_with_crop_or_pad(image,
                                                         output_height,
                                                         output_width)
  # Subtract off the mean and divide by the adjusted standard deviation of the pixels.
  norm_image = tf.image.per_image_standardization(resized_image)

  # Cast back to float32 because conv2d does not support float64 for now
  # (see https://github.com/tensorflow/tensorflow/issues/12941).
  return tf.to_float(norm_image)
           

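The discrepancy comes mainly from the normalization step, so I modified the source of per_image_standardization (float32 -> float64, and printed some intermediate values to make comparison easier).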

The source file is under /usr/local/lib/python2.7/dist-packages/tensorflow/python/ops:

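# Assumption: `Print` below is the op behind tf.Print, imported here for debugging.
from tensorflow.python.ops.logging_ops import Print

def per_image_standardization(image):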
  """Linearly scales `image` to have zero mean and unit norm.

  This op computes `(x - mean) / adjusted_stddev`, where `mean` is the average
  of all values in image, and
  `adjusted_stddev = max(stddev, 1.0/sqrt(image.NumElements()))`.

  `stddev` is the standard deviation of all values in `image`. It is capped
  away from zero to protect against division by 0 when handling uniform images.

  Args:
    image: 3-D tensor of shape `[height, width, channels]`.

  Returns:
    The standardized image with same shape as `image`.

  Raises:
    ValueError: if the shape of 'image' is incompatible with this function.
  """
  image = ops.convert_to_tensor(image, name='image')
  image = control_flow_ops.with_dependencies(
      _Check3DImage(image, require_static=False), image)
  num_pixels = math_ops.reduce_prod(array_ops.shape(image))

  #image = math_ops.cast(image, dtype=dtypes.float32)
  image = math_ops.cast(image, dtype=dtypes.float64)
  image_mean = math_ops.reduce_mean(image)
  image_mean = Print(image_mean, [image_mean], message='####image_mean is')

  variance = (math_ops.reduce_mean(math_ops.square(image)) -
              math_ops.square(image_mean))
  variance = Print(variance, [variance], message='####variance raw is')

  variance = gen_nn_ops.relu(variance)
  variance = Print(variance, [variance], message='####variance relu is')

  stddev = math_ops.sqrt(variance)
  stddev = Print(stddev, [stddev], message='####stddev is')

  # Apply a minimum normalization that protects us against uniform images.
  #min_stddev = math_ops.rsqrt(math_ops.cast(num_pixels, dtypes.float32))
  min_stddev = math_ops.rsqrt(math_ops.cast(num_pixels, dtypes.float64))
  min_stddev = Print(min_stddev, [min_stddev], message='####min_stddev is')

  pixel_value_scale = math_ops.maximum(stddev, min_stddev)
  pixel_value_scale = Print(pixel_value_scale, [pixel_value_scale], message='####pixel_value_scale is')

  pixel_value_offset = image_mean

  image = math_ops.subtract(image, pixel_value_offset)
  image = Print(image, [image], message='####image subtract is')


  image = math_ops.div(image, pixel_value_scale)
  image = Print(image, [image], message='####image div is')
  return image
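
The function above computes the variance in one pass as mean(x^2) - mean(x)^2. The following NumPy sketch mirrors only that formula to show why the working dtype matters: the two terms are large and nearly cancel, so rounding in float32 carries over into the result. The random test image and the helper name `one_pass_variance` are illustrative assumptions, and NumPy's summation order differs from TensorFlow's, so the printed numbers are indicative only.

```python
import numpy as np

rng = np.random.RandomState(0)
# Hypothetical 32x32 RGB image with pixel values in [0, 255].
image = rng.randint(0, 256, size=(32, 32, 3))


def one_pass_variance(im, dtype):
    """variance = mean(x^2) - mean(x)^2, evaluated entirely in `dtype`."""
    x = im.astype(dtype)
    return np.mean(np.square(x)) - np.square(np.mean(x))


# Two-pass variance in float64, used as the reference value.
reference = np.var(image.astype(np.float64))

err32 = abs(float(one_pass_variance(image, np.float32)) - reference)
err64 = abs(float(one_pass_variance(image, np.float64)) - reference)
print("float32 one-pass error:", err32)
print("float64 one-pass error:", err64)
```

On typical inputs the float32 error is several orders of magnitude larger than the float64 error, which is consistent with the normalization step being where the two pipelines drift apart.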
           

The NumPy code is:

# crop
im = crop(im, random_crop=is_train, image_size=crop_size)
# random flip left or right
im = flip(im, random_flip=is_train)

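# normalization
im = im.astype(np.float64, copy=False)
mean = np.mean(im)
stddev = np.std(im)
# Lower bound on stddev, matching max(stddev, 1.0/sqrt(num_elements)) in per_image_standardization.
adjusted_stddev = max(stddev, 1.0 / np.sqrt(np.array(im.size, dtype=np.float64)))
im = (im - mean) / adjusted_stddev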
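
For reference, the normalization step can be wrapped in a self-contained function and checked on its own. This is a sketch under stated assumptions: the function name `standardize_float64` and the two test images are hypothetical, and the crop and flip helpers from the snippet above are left out because their implementations are not shown.

```python
import numpy as np


def standardize_float64(im):
    """Hypothetical helper: (x - mean) / max(stddev, 1/sqrt(num_elements)) in float64."""
    im = np.asarray(im).astype(np.float64)
    mean = np.mean(im)
    stddev = np.std(im)
    # Lower bound that protects against division by zero on uniform images.
    min_stddev = 1.0 / np.sqrt(np.float64(im.size))
    adjusted_stddev = max(stddev, min_stddev)
    return (im - mean) / adjusted_stddev


rng = np.random.RandomState(0)
noisy = rng.randint(0, 256, size=(32, 32, 3)).astype(np.uint8)
uniform = np.full((32, 32, 3), 128, dtype=np.uint8)

out_noisy = standardize_float64(noisy)
out_uniform = standardize_float64(uniform)

print("noisy mean/std:", out_noisy.mean(), out_noisy.std())
print("uniform image stays finite:", bool(np.all(np.isfinite(out_uniform))))
```

If the model expects float32 input, the result can be cast back at the very end with `.astype(np.float32)`, mirroring the final `tf.to_float` on the TensorFlow side.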
           
