【EfficientNet】《EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks》
ICML-2019
Contents
1 Background and Motivation
2 Related Work
3 Advantages / Contributions
4 Compound Model Scaling
4.1 Problem Formulation
4.2 Scaling Dimensions
4.3 Compound Scaling
5 EfficientNet Architecture
6 Experiments
6.1 Datasets
6.2 Experiments on ImageNet
6.3 Transfer Learning Results for EfficientNet
7 Conclusion(own)
1 Background and Motivation
Scaling up ConvNets is widely used to achieve better accuracy.
Common approaches include
1) scaling up network depth (e.g., ResNet-50 to ResNet-101),
2) scaling up network width (e.g., ResNet-50 to Wide ResNet),
and, less commonly,
3) scaling up the input resolution.
Are these three dimensions intrinsically related? And how should they be tuned jointly to maximize the accuracy gain?
The authors are the first to empirically quantify the relationship among all three dimensions of network width, depth, and resolution, and use it to improve model accuracy efficiently.
2 Related Work
ConvNet Accuracy
ConvNet Efficiency: lightweight networks
Model Scaling: width, depth, and resolution
3 Advantages / Contributions
Following MNASNet, the authors use AutoML (neural architecture search) to obtain the baseline EfficientNet-B0, then compound-scale it along width, depth, and resolution to form the EfficientNet-Bx family of different sizes. It achieves SOTA on ImageNet with very few parameters, and its generalization holds up in cross-dataset validation as well (SOTA on 5 of 8 transfer datasets).
4 Compound Model Scaling
Core idea: instead of scaling one dimension at a time, scale network width, depth, and input resolution jointly with a fixed-ratio compound coefficient.
4.1 Problem Formulation
A ConvNet $N$ can be represented as a composition of stacked layers $F(X)$:
$$N = \bigodot_{i=1 \ldots s} F_i^{L_i}\left(X_{\langle H_i, W_i, C_i \rangle}\right)$$
where $\langle H_i, W_i, C_i \rangle$ denotes the shape of the input tensor $X$ at stage $i$.
$F_i^{L_i}$ indicates that layer $F_i$ is repeated $L_i$ times in stage $i$.
Scaling the network to improve accuracy can then be written as the optimization problem
$$\max_{d, w, r} \; \mathrm{Accuracy}\big(N(d, w, r)\big) \quad \text{s.t.} \quad \mathrm{Memory}(N) \le \text{target\_memory}, \;\; \mathrm{FLOPS}(N) \le \text{target\_flops}$$
where $d$, $w$, and $r$ are coefficients scaling the baseline network's depth, width, and input resolution.
4.2 Scaling Dimensions
Scaling up any dimension of network width, depth, or resolution improves accuracy, but the accuracy gain diminishes for bigger models.
1) Scaling depth
Pros: captures richer and more complex features
Cons: more difficult to train due to the vanishing gradient problem, with diminishing accuracy returns for very deep ConvNets (accuracy plateaus beyond a certain depth)
2) Scaling width
Increases the number of channels.
Pros: wider networks tend to capture more fine-grained features and are easier to train
Cons: difficulty capturing higher-level features
3) Scaling resolution
Pros: can potentially capture more fine-grained patterns
4.3 Compound Scaling
In order to pursue better accuracy and efficiency, it is critical to balance all dimensions of network width, depth, and resolution during ConvNet scaling
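Concretely, the paper's compound scaling rule uses a single compound coefficient $\phi$ to scale all three dimensions together:

```latex
% Compound scaling rule from the paper
\begin{aligned}
\text{depth:}      \quad & d = \alpha^{\phi} \\
\text{width:}      \quad & w = \beta^{\phi} \\
\text{resolution:} \quad & r = \gamma^{\phi} \\
\text{s.t.}        \quad & \alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2,
                           \quad \alpha \ge 1,\ \beta \ge 1,\ \gamma \ge 1
\end{aligned}
```

Here $\alpha$, $\beta$, $\gamma$ are constants found by a small grid search on the baseline network, and $\phi$ is a user-chosen coefficient controlling how many more resources are available.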
doubling network depth will double FLOPS, but doubling network width or resolution will increase FLOPS by four times
Under the authors' compound scaling, total FLOPS grow by a factor of $(\alpha \cdot \beta^2 \cdot \gamma^2)^{\phi}$; since the paper constrains $\alpha \cdot \beta^2 \cdot \gamma^2 \approx 2$, FLOPS increase by roughly $2^{\phi}$.
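As a minimal sketch of how this works out numerically: the values $\alpha = 1.2$, $\beta = 1.1$, $\gamma = 1.15$ are the grid-search results reported in the paper for EfficientNet-B0; the helper function name is my own, not from any official implementation.

```python
# Sketch: compound scaling multipliers for a given compound coefficient phi.
# alpha/beta/gamma are the constants reported in the EfficientNet paper.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution bases

def compound_multipliers(phi):
    """Return (depth, width, resolution, FLOPS) multipliers for coefficient phi."""
    d = ALPHA ** phi   # depth multiplier
    w = BETA ** phi    # width (channel) multiplier
    r = GAMMA ** phi   # input-resolution multiplier
    # alpha * beta^2 * gamma^2 ~= 2, so FLOPS grow by roughly 2**phi
    flops = (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi
    return d, w, r, flops

d, w, r, f = compound_multipliers(2)
print(f"depth x{d:.2f}, width x{w:.2f}, resolution x{r:.2f}, FLOPS x{f:.2f}")
```

Note that FLOPS scale quadratically with width and resolution but only linearly with depth, which is why the constraint squares $\beta$ and $\gamma$.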
5 EfficientNet Architecture
The baseline EfficientNet-B0 is obtained with AutoML (neural architecture search) in the style of MNASNet; the authors optimize FLOPS rather than latency since they are not targeting any specific hardware device.
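To show how the compound coefficients turn B0 into larger variants, here is a sketch that applies depth/width multipliers to B0's per-stage configuration. The (channels, repeats) pairs follow the MBConv stages of the paper's Table 1; the rounding scheme (ceil for repeats, multiple-of-8 for channels) mirrors common implementations and is an assumption here.

```python
import math

# MBConv stages of EfficientNet-B0: (output channels, number of repeats)
B0_STAGES = [
    (16, 1), (24, 2), (40, 2), (80, 3), (112, 3), (192, 4), (320, 1),
]

def round_channels(c, divisor=8):
    """Round a channel count to the nearest multiple of `divisor`,
    never going more than 10% below the unrounded value."""
    new_c = max(divisor, int(c + divisor / 2) // divisor * divisor)
    if new_c < 0.9 * c:
        new_c += divisor
    return new_c

def scale_stages(depth_mult, width_mult):
    """Apply depth/width multipliers stage-by-stage to the B0 config."""
    return [(round_channels(c * width_mult), math.ceil(n * depth_mult))
            for c, n in B0_STAGES]

# Example: phi = 1 with the paper's alpha = 1.2 (depth) and beta = 1.1 (width)
for channels, repeats in scale_stages(1.2, 1.1):
    print(channels, repeats)
```

Repeats are rounded up so that every scaled stage keeps at least its original depth, while channel rounding keeps tensor shapes hardware-friendly.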
6 Experiments
6.1 Datasets
ImageNet
CIFAR10
CIFAR100
Birdsnap
Stanford Cars
Flowers
FGVC Aircraft
Oxford-IIIT Pets
Food-101
6.2 Experiments on ImageNet
1)Scaling Up MobileNets and ResNets
Compound scaling clearly beats single-dimension scaling.
2)ImageNet Results for EfficientNet
On speed: the paper also reports inference latency comparisons, where EfficientNet comes out both fast and accurate.
6.3 Transfer Learning Results for EfficientNet
Fast and accurate again.
The plot would be even more striking if each network family were drawn with compound scaling applied (turning each point into a curve).