Paper Notes - Learning Compact Binary Descriptors with Unsupervised Deep Neural Networks

- abstract
- Introduction
- Related Work
  - Binary Descriptors
- Approach
  - Overall Learning Objectives
  - Learning Discriminative Binary Descriptors
  - Learning Efficient Binary Descriptors
  - Learning Rotation Invariant Binary Descriptors
  - Overall Algorithm
- Experimental Results
  - Datasets
  - Results on Image Matching
  - Results on Image Retrieval
  - Results on Object Recognition
- Conclusions

abstract

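This paper proposes an unsupervised deep neural network for learning binary descriptors, which performs better than earlier supervised and unsupervised binary descriptors. The design rests on three criteria: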

Minimal quantization loss (minimal loss quantization)
Binary codes distributed as evenly as possible (evenly distributed codes)
Uncorrelated bits [I do not fully understand this one]

The code is written for the Caffe environment.

Introduction

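A good feature descriptor should offer a high-quality representation at low computational cost, capture the discriminative information in an image, and be robust to image rotation. On mobile devices, real-time computation also has to be considered.

Descriptors such as CNN features and SIFT capture deeper, more discriminative information and narrow the gap between the pixel level and the semantic level, but they are often high-dimensional and expensive to compute.

Several binary descriptors were proposed to reduce the computational cost: BRIEF, ORB, BRISK, and FREAK. With these descriptors, the similarity between images can be computed using the Hamming distance. However, these early binary descriptors are computed from simple intensity comparisons and are sensitive to scale, rotation, and noise. Some later methods improve on them, but they all rely on pairwise similarity labels, which means the training data must be labeled.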
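As a small illustration of why binary descriptors are cheap to compare, the sketch below computes the Hamming distance between two made-up 8-bit descriptors. The function name and the sample vectors are hypothetical and are not taken from the paper.

```python
import numpy as np

def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    """Count the positions where two equal-length 0/1 vectors differ."""
    if a.shape != b.shape:
        raise ValueError("descriptors must have the same shape")
    return int(np.count_nonzero(a != b))

# Two made-up 8-bit descriptors (illustrative values only).
d1 = np.array([1, 0, 1, 1, 0, 0, 1, 0], dtype=np.uint8)
d2 = np.array([1, 1, 1, 0, 0, 0, 1, 1], dtype=np.uint8)
distance = hamming_distance(d1, d2)  # they differ at indices 1, 3, and 7
```

A smaller distance means more similar descriptors. Real implementations usually pack the bits into machine words and use XOR plus popcount, which this sketch does not do.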

This paper therefore proposes an unsupervised method, DeepBit.

Related Work

Binary Descriptors

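Early descriptors: BRIEF, ORB, BRISK, and FREAK, based on hand-crafted sampling patterns and a set of pairwise intensity comparisons.

Improvements:

D-BRIEF: encodes the desired similarity relationships and learns a projection matrix to compute discriminative binary features.

Local Difference Binary (LDB): uses AdaBoost to select the optimal sampling pairs.

Linear Discriminant Analysis (LDA)

BinBoost: uses boosting to learn a set of projection matrices.

These improvements all rely on pairwise labels, and the resulting binary descriptors cannot be transferred to other tasks.

Unsupervised methods:

Locality Sensitive Hashing (LSH): maps the original data into a low-dimensional feature space by random projection, then binarizes the result.

Semantic hashing (SH): builds a multi-layer stack of Restricted Boltzmann Machines (RBM) to learn compact binary codes for text and documents.

Spectral hashing (SpeH): generates binary codes with a spectral partitioning method.

Iterative quantization (ITQ): uses an iterative optimization strategy to find the projection with the smallest binarization loss.

The binary codes produced by these unsupervised methods still fall short of real-valued descriptors in accuracy.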
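The LSH entry above (random projection followed by binarization) can be sketched in a few lines. This is a minimal version that assumes Gaussian random directions and a sign threshold at zero; the note does not specify either choice, and the function name and sizes are hypothetical.

```python
import numpy as np

def lsh_codes(data: np.ndarray, n_bits: int, seed: int = 0) -> np.ndarray:
    """Project each row of `data` onto random directions and keep the sign as a bit."""
    rng = np.random.default_rng(seed)
    # Assumption: directions drawn from a standard normal distribution.
    projection = rng.standard_normal((data.shape[1], n_bits))
    return (data @ projection > 0).astype(np.uint8)

# Made-up 128-dimensional features for 5 samples.
features = np.random.default_rng(1).standard_normal((5, 128))
codes = lsh_codes(features, n_bits=16)  # shape (5, 16), entries in {0, 1}
```

Because the projection is random and never trained, it needs no labels, which is what makes this family of methods unsupervised.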

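Deep learning: it has been very successful, and many methods achieve good results by using mid-level image representations. Pre-trained CNNs and deep transfer learning have improved tasks such as image detection, image segmentation, and image retrieval.

SSDH: achieves good results by constructing the hash function as a hidden layer.

Deep Hashing (DH): builds a three-layer hierarchical neural network to learn a discriminative projection matrix, but it does not use transfer learning, so the binary codes are less effective; that is, they do not transfer well to other tasks.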

Approach

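The proposed method takes mid-level image representations pre-trained on ImageNet and learns binary descriptors from them without supervision.

Earlier approaches optimize the projection function using hand-crafted features and pairwise similarity information (what form does the projection function take?). DeepBit computes binary descriptors with a series of nonlinear projection functions and uses three important objectives (losses), optimized with SGD. The method needs no labeled training data.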

[Figure: the overall framework (image not preserved)]

Overall Learning Objectives

DeepBit computes a binary descriptor by first projecting the input image and then binarizing the result:

$b = 0.5 \times (\mathrm{sign}(F(x; W)) + 1)$

where $F(x; W) = f_k(\cdots f_2(f_1(x; w_1); w_2) \cdots; w_k)$.

The goal of DeepBit is to learn $W = (w_1, w_2, \ldots, w_k)$ and then obtain the binary code $b$.
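The binarization step can be written directly from the formula. In this sketch the array `activations` is a made-up stand-in for the network output $F(x; W)$; the network itself is not modeled.

```python
import numpy as np

def binarize(projection: np.ndarray) -> np.ndarray:
    """Apply b = 0.5 * (sign(F) + 1) elementwise."""
    # Note: np.sign(0) is 0, so an exact zero maps to 0.5 rather than to a bit.
    return 0.5 * (np.sign(projection) + 1.0)

# Hypothetical projected values standing in for F(x; W).
activations = np.array([-1.3, 0.7, 2.1, -0.2])
bits = binarize(activations)  # negative entries become 0, positive entries become 1
```

The sign function has zero gradient almost everywhere, so training needs the losses to act on the real-valued projection rather than on the bits alone.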

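$W$ should have the following properties:

It should preserve the local data structure of the last layer (I do not fully understand this); after projection, the quantization loss should be as small as possible
The binary descriptors should be distributed as evenly as possible
The descriptors should be robust to image rotation and noise (so that they capture more uncorrelated information? "tend to capture more uncorrelated information from input image")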

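[Equation: the overall learning objective (image not preserved)]

$N$ is the number of training samples in a batch

$M$ is the length of the binary code

$R$ is the image rotation angle

$b_{n,\theta}$ is the binary code of $x_n$ after rotation by $\theta$ and projection

$C(\theta)$ is a penalty function applied to the training data; its contribution to the loss depends on the rotation angle.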

Learning Discriminative Binary Descriptors

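DeepBit aims to find a projection function that maps the input image to a binary code while preserving the discriminative information of the original image. The smaller the quantization loss, the better the binary descriptor preserves the information in the original image, that is, the closer it stays to the original projected value.

[Equation: the quantization loss (image not preserved)]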
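Since the equation itself is not preserved in these notes, the following is only a sketch of one plausible quantization loss. It assumes the loss is the squared distance between the centered code $b - 0.5$ and the projected value; that exact form, the function name, and the sample values are assumptions, not something the notes confirm.

```python
import numpy as np

def quantization_loss(projection: np.ndarray) -> float:
    """Sum of squared gaps between centered bits (b - 0.5) and the projection.

    Assumed form: the notes only say the loss is small when the binary
    code stays close to the projected value.
    """
    bits = 0.5 * (np.sign(projection) + 1.0)
    return float(np.sum(((bits - 0.5) - projection) ** 2))

# Made-up projections: `near` already sits on the centered code values.
near = np.array([[0.5, -0.5], [-0.5, 0.5]])
far = np.array([[0.05, -0.9], [2.0, -0.1]])
loss_near = quantization_loss(near)
loss_far = quantization_loss(far)  # larger, since the values are far from +/-0.5
```

Under this assumed form, minimizing the loss pushes each projected value toward one of the two centered code values, so little information is lost when the sign is taken.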

Learning Efficient Binary Descriptors

The binary codes should be distributed as evenly as possible: the higher the entropy, the more information the code can express, so each bit is split at the 50% mark.

[Equation: the even-distribution objective, followed by the definition of a term it uses (images not preserved)]
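The 50% criterion can be illustrated with a small sketch. It assumes the objective measures how far the average value of each bit over a batch is from 0.5; the equation is not preserved here, so this form and the names used are assumptions.

```python
import numpy as np

def even_distribution_loss(bits: np.ndarray) -> float:
    """Sum over bit positions of (mean activation - 0.5)^2 for an (N, M) batch."""
    bit_means = bits.mean(axis=0)  # fraction of samples with each bit set
    return float(np.sum((bit_means - 0.5) ** 2))

# Made-up batches of 4 samples with 2 bits each.
balanced = np.array([[1, 0], [0, 1], [1, 0], [0, 1]], dtype=float)
skewed = np.array([[1, 0], [1, 0], [1, 0], [1, 1]], dtype=float)
loss_balanced = even_distribution_loss(balanced)  # every bit is on half the time
loss_skewed = even_distribution_loss(skewed)      # bit 0 is always on
```

A bit that is on for every sample carries no information, which is why a per-bit mean near 0.5 corresponds to the high-entropy case described above.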

Learning Rotation Invariant Binary Descriptors

We want the descriptor to be rotation invariant.

The estimation error may grow large as the rotation angle increases, so a penalty term $C(\theta)$ is added:

[Equation: definition of $C(\theta)$ (image not preserved)]

with $\mu = 0$, $\sigma = 1$.

The function to minimize is therefore:

[Equation: the rotation-invariance loss (image not preserved)]
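The two equations are not preserved in these notes, so the sketch below is a guess at their shape rather than the paper's definition. It assumes $C(\theta)$ is a Gaussian-shaped weight (suggested only by the parameters $\mu$ and $\sigma$) and that the loss sums the weighted squared differences between the code of an image and the codes of its rotated copies. All names and sample values are hypothetical.

```python
import numpy as np

def rotation_penalty(theta: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Assumed Gaussian-shaped weight; largest when theta equals mu."""
    return float(np.exp(-((theta - mu) ** 2) / (2.0 * sigma ** 2)))

def rotation_loss(reference_bits: np.ndarray, rotated_bits_by_angle: dict) -> float:
    """Weighted squared difference between a reference code and rotated-copy codes."""
    total = 0.0
    for theta, rotated in rotated_bits_by_angle.items():
        diff = float(np.sum((rotated - reference_bits) ** 2))
        total += rotation_penalty(theta) * diff
    return total

# Made-up 4-bit codes for one image and its rotated copies (angles in degrees).
reference = np.array([1.0, 0.0, 1.0, 1.0])
rotated = {
    0: np.array([1.0, 0.0, 1.0, 1.0]),
    5: np.array([1.0, 1.0, 1.0, 1.0]),   # one bit flipped after rotation
    -5: np.array([1.0, 0.0, 1.0, 1.0]),
}
loss = rotation_loss(reference, rotated)
```

Under this assumption, larger rotation angles get smaller weights, which matches the remark that the estimation error can grow with the angle.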

Overall Algorithm

[Figure: the overall algorithm (image not preserved)]

The algorithm has two main parts. The first initializes the network: the paper uses the weights of a pre-trained 16-layer VGGNet and replaces the last layer of VGGNet with a new fully connected layer, whose outputs are used to obtain the binary codes. The network is then trained with SGD and backpropagation.

Settings used: $\alpha = 1.0$, $\beta = 1.0$, $\gamma = 0.01$, $\theta \in \{10, 5, 0, -5, -10\}$, mini-batch = 32, bit-length = 256

image_size = $224 \times 224$

Experimental Results

DeepBit is evaluated on three different tasks: image matching, image retrieval, and image classification.

Datasets

Brown Dataset: contains three subsets, Liberty, Notredame, and Yosemite, each with 400,000 gray-scale patches
CIFAR-10 Dataset
The Oxford 17 Category Flower Dataset: contains 17 categories with 80 images each

Results on Image Matching

Methods compared:

unsupervised (BRIEF, ORB, BRISK, and Boosted SSC),
supervised methods (D-BRIEF, LDAHash).

Results on Image Retrieval

LSH, ITQ, PCAH, Semantic Hashing (SH), Spectral hashing (SpeH), Spherical hashing (SphH), KMH, and Deep Hashing (DH)
the CIFAR-10 dataset
16, 32, and 64 hash bits

The experiments show that the longer the hash code, the better DeepBit performs.

Results on Object Recognition

This experiment mainly illustrates that DeepBit is an unsupervised method.

Compared against:

real-valued descriptors such as HOG, and SIFT
on the flower recognition
train the multi-class SVM classifier with the proposed binary descriptor

The results show that the method is effective to learn discriminative and compact binary codes.

Conclusions

Proposes an unsupervised deep learning framework for learning binary descriptors
The method does not need labeled training data
More practical to real-world applications compared to supervised binary descriptors

refer:

Learning Compact Binary Descriptors with Unsupervised Deep Neural Networks

code
