Official CVPR 2021 website:
![](https://img.laitimes.com/img/__Qf2AjLwojIjJCLyojI0JCLiAzNfRHLGZkRGZkRfJ3bs92YsYTMfVmepNHL5FEVNBzYE90MNpHW3BjMMBjVtJWd0ckW65UbM5WOHJWa5kHT20ESjBjUIF2X0hXZ0xCMx81dvRWYoNHLrdEZwZ1Rh5WNXp1bwNjW1ZUba9VZwlHdssmch1mclRXY39CXldWYtlWPzNXZj9mcw1ycz9WL49zZuBnLwYDO5UzMxETM5AzMwEjMwIzLc52YucWbp5GZzNmLn9Gbi1yZtl2Lc9CX6MHc0RHaiojIsJye.png)
This summary is continuously updated on GitHub: https://github.com/Sophia-11/Awesome-CVPR-Paper
Image-to-image Translation via Hierarchical Style Disentanglement Xinyang Li, Shengchuan Zhang, Jie Hu, Liujuan Cao, Xiaopeng Hong, Xudong Mao, Feiyue Huang, Yongjian Wu, Rongrong Ji https://arxiv.org/abs/2103.01456 https://github.com/imlixinyang/HiSD
FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation https://arxiv.org/pdf/2012.08512.pdf https://tarun005.github.io/FLAVR/Code https://tarun005.github.io/FLAVR/
Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition Stephen Hausler, Sourav Garg, Ming Xu, Michael Milford, Tobias Fischer https://arxiv.org/abs/2103.01486
Depth from Camera Motion and Object Detection Brent A. Griffin, Jason J. Corso https://arxiv.org/abs/2103.01468
UP-DETR: Unsupervised Pre-training for Object Detection with Transformers https://arxiv.org/pdf/2011.09094.pdf
Multi-Stage Progressive Image Restoration https://arxiv.org/abs/2102.02808 https://github.com/swz30/MPRNet
Weakly Supervised Learning of Rigid 3D Scene Flow https://arxiv.org/pdf/2102.08945.pdf https://3dsceneflow.github.io/
Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning Mamshad Nayeem Rizve, Salman Khan, Fahad Shahbaz Khan, Mubarak Shah https://arxiv.org/abs/2103.01315
Re-labeling ImageNet: from Single to Multi-Labels, from Global to Localized Labels https://arxiv.org/abs/2101.05022 https://github.com/naver-ai/relabel_imagenet
Rethinking Channel Dimensions for Efficient Model Design https://arxiv.org/abs/2007.00992 https://github.com/clovaai/rexnet
Coarse-Fine Networks for Temporal Activity Detection in Videos Kumara Kahatapitiya, Michael S. Ryoo https://arxiv.org/abs/2103.01302
A Deep Emulator for Secondary Motion of 3D Characters Mianlun Zheng, Yi Zhou, Duygu Ceylan, Jernej Barbic https://arxiv.org/abs/2103.01261
Fair Attribute Classification through Latent Space De-biasing https://arxiv.org/abs/2012.01469 https://github.com/princetonvisualai/gan-debiasing https://princetonvisualai.github.io/gan-debiasing/
Auto-Exposure Fusion for Single-Image Shadow Removal Lan Fu, Changqing Zhou, Qing Guo, Felix Juefei-Xu, Hongkai Yu, Wei Feng, Yang Liu, Song Wang https://arxiv.org/abs/2103.01255
Less is More: CLIPBERT for Video-and-Language Learning via Sparse Sampling https://arxiv.org/pdf/2102.06183.pdf https://github.com/jayleicn/ClipBERT
MetaSCI: Scalable and Adaptive Reconstruction for Video Compressive Sensing Zhengjue Wang, Hao Zhang, Ziheng Cheng, Bo Chen, Xin Yuan https://arxiv.org/abs/2103.01786
AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling https://arxiv.org/pdf/2011.09011.pdf
Diffusion Probabilistic Models for 3D Point Cloud Generation Shitong Luo, Wei Hu https://arxiv.org/abs/2103.01458
There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge Francisco Rivera Valverde, Juana Valeria Hurtado, Abhinav Valada https://arxiv.org/abs/2103.01353 http://rl.uni-freiburg.de/research/multimodal-distill
Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation https://arxiv.org/abs/2008.00951 https://github.com/eladrich/pixel2style2pixel https://eladrich.github.io/pixel2style2pixel/
Hierarchical and Partially Observable Goal-driven Policy Learning with Goals Relational Graph Xin Ye, Yezhou Yang https://arxiv.org/abs/2103.01350
RepVGG: Making VGG-style ConvNets Great Again https://arxiv.org/abs/2101.03697 https://github.com/megvii-model/RepVGG
Transformer Interpretability Beyond Attention Visualization https://arxiv.org/pdf/2012.09838.pdf https://github.com/hila-chefer/Transformer-Explainability
PREDATOR: Registration of 3D Point Clouds with Low Overlap https://arxiv.org/pdf/2011.13005.pdf https://github.com/ShengyuH/OverlapPredator https://overlappredator.github.io/
A look back at CVPR 2020 papers
Object detection
Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection Paper: https://arxiv.org/abs/1912.02424
Code: https://github.com/sfzhang15/ATSS
Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector Paper: https://arxiv.org/abs/1908.01998
Image segmentation
Semi-Supervised Semantic Image Segmentation with Self-correcting Networks Paper: https://arxiv.org/abs/1811.07073
Deep Snake for Real-Time Instance Segmentation Paper: https://arxiv.org/abs/2001.01629
CenterMask : Real-Time Anchor-Free Instance Segmentation Paper: https://arxiv.org/abs/1911.06667 Code: https://github.com/youngwanLEE/CenterMask
SketchGCN: Semantic Sketch Segmentation with Graph Convolutional Networks Paper: https://arxiv.org/abs/2003.00678
PolarMask: Single Shot Instance Segmentation with Polar Representation Paper: https://arxiv.org/abs/1909.13226 Code: https://github.com/xieenze/PolarMask
xMUDA: Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation Paper: https://arxiv.org/abs/1911.12676
BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation Paper: https://arxiv.org/abs/2001.00309
Face recognition
Towards Universal Representation Learning for Deep Face Recognition Paper: https://arxiv.org/abs/2002.11841
Suppressing Uncertainties for Large-Scale Facial Expression Recognition
Paper: https://arxiv.org/abs/2002.10392 Code: https://github.com/kaiwang960112/Self-Cure-Network
Face X-ray for More General Face Forgery Detection Paper: https://arxiv.org/pdf/1912.13458.pdf
Object tracking
ROAM: Recurrently Optimizing Tracking Model Paper: https://arxiv.org/abs/1907.12006
3D point clouds & reconstruction
PF-Net: Point Fractal Network for 3D Point Cloud Completion Paper: https://arxiv.org/abs/2003.00410
PointAugment: an Auto-Augmentation Framework for Point Cloud Classification Paper: https://arxiv.org/abs/2002.10876 Code: https://github.com/liruihui/PointAugment/
Learning multiview 3D point cloud registration Paper: https://arxiv.org/abs/2001.05119
C-Flow: Conditional Generative Flow Models for Images and 3D Point Clouds Paper: https://arxiv.org/abs/1912.07009
RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds Paper: https://arxiv.org/abs/1911.11236
Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image Paper: https://arxiv.org/abs/2002.12212
Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion Paper: https://arxiv.org/abs/2003.01456
In Perfect Shape: Certifiably Optimal 3D Shape Reconstruction from 2D Landmarks Paper: https://arxiv.org/pdf/1911.11924.pdf
Pose estimation
VIBE: Video Inference for Human Body Pose and Shape Estimation Paper: https://arxiv.org/abs/1912.05656
Code: https://github.com/mkocabas/VIBE
Distribution-Aware Coordinate Representation for Human Pose Estimation Paper: https://arxiv.org/abs/1910.06278
Code: https://github.com/ilovepose/DarkPose
4D Association Graph for Realtime Multi-person Motion Capture Using Multiple Video Cameras Paper: https://arxiv.org/abs/2002.12625
Optimal least-squares solution to the hand-eye calibration problem Paper: https://arxiv.org/abs/2002.10838
D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry Paper: https://arxiv.org/abs/2003.01060
Multi-Modal Domain Adaptation for Fine-Grained Action Recognition Paper: https://arxiv.org/abs/2001.09691
The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation Paper: https://arxiv.org/abs/1911.07524
PVN3D: A Deep Point-wise 3D Keypoints Voting Network for 6DoF Pose Estimation Paper: https://arxiv.org/abs/1911.04231
GAN
Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models Paper: https://arxiv.org/abs/1911.12287 Code: https://github.com/giannisdaras/ylg
MSG-GAN: Multi-Scale Gradient GAN for Stable Image Synthesis Paper: https://arxiv.org/abs/1903.06048
Robust Design of Deep Neural Networks against Adversarial Attacks based on Lyapunov Theory Paper: https://arxiv.org/abs/1911.04636
Few-shot & zero-shot
Improved Few-Shot Visual Classification Paper: https://arxiv.org/pdf/1912.03432.pdf
Meta-Transfer Learning for Zero-Shot Super-Resolution Paper: https://arxiv.org/abs/2002.12213
Weakly supervised & unsupervised
Rethinking the Route Towards Weakly Supervised Object Localization Paper: https://arxiv.org/abs/2002.11359
NestedVAE: Isolating Common Factors via Weak Supervision Paper: https://arxiv.org/abs/2002.11576
Unsupervised Reinforcement Learning of Transferable Meta-Skills for Embodied Navigation Paper: https://arxiv.org/abs/1911.07450
Disentangling Physical Dynamics from Unknown Factors for Unsupervised Video Prediction Paper: https://arxiv.org/abs/2003.01460
Neural networks
Visual Commonsense R-CNN Paper: https://arxiv.org/abs/2002.12204
GhostNet: More Features from Cheap Operations Paper: https://arxiv.org/abs/1911.11907 Code: https://github.com/iamhankai/ghostnet
Watch your Up-Convolution: CNN Based Generative Deep Neural Networks are Failing to Reproduce Spectral Distributions Paper: https://arxiv.org/abs/2003.01826
Model acceleration
GPU-Accelerated Mobile Multi-view Style Transfer Paper: https://arxiv.org/abs/2003.00706
Visual commonsense
What it Thinks is Important is Important: Robustness Transfers through Input Gradients Paper: https://arxiv.org/abs/1912.05699
Attentive Context Normalization for Robust Permutation-Equivariant Learning Paper: https://arxiv.org/abs/1907.02545
Bundle Adjustment on a Graph Processor Paper: https://arxiv.org/abs/2003.03134 https://github.com/joeaortiz/gbp
Transferring Dense Pose to Proximal Animal Classes Paper: https://arxiv.org/abs/2003.00080
Representations, Metrics and Statistics For Shape Analysis of Elastic Graphs Paper: https://arxiv.org/abs/2003.00287
Learning in the Frequency Domain Paper: https://arxiv.org/abs/2002.12416
Filter Grafting for Deep Neural Networks Paper: https://arxiv.org/pdf/2001.05868.pdf
ClusterFit: Improving Generalization of Visual Representations Paper: https://arxiv.org/abs/1912.03330
Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction Paper: https://arxiv.org/abs/2002.11927
Auto-Encoding Twin-Bottleneck Hashing Paper: https://arxiv.org/abs/2002.11930
Learning Representations by Predicting Bags of Visual Words Paper: https://arxiv.org/abs/2002.12247
Holistically-Attracted Wireframe Parsing Paper: https://arxiv.org/abs/2003.01663
A General and Adaptive Robust Loss Function Paper: https://arxiv.org/abs/1701.03077
A Characteristic Function Approach to Deep Implicit Generative Modeling Paper: https://arxiv.org/abs/1909.07425
AdderNet: Do We Really Need Multiplications in Deep Learning? Paper: https://arxiv.org/pdf/1912.13200
12-in-1: Multi-Task Vision and Language Representation Learning Paper: https://arxiv.org/abs/1912.02315
Making Better Mistakes: Leveraging Class Hierarchies with Deep Networks Paper: https://arxiv.org/abs/1912.09393
CARS: Continuous Evolution for Efficient Neural Architecture Search Paper: https://arxiv.org/pdf/1909.04977.pdf Code: https://github.com/huawei-noah/CARS
Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training Paper: https://arxiv.org/abs/2002.10638 Code: https://github.com/weituo12321/PREVALENT
GhostNet: More Features from Cheap Operations (an architecture that surpasses MobileNetV3) Paper: https://arxiv.org/pdf/1911.11907 Model (impressive performance on ARM CPUs): https://github.com/iamhankai/ghostnet
We beat other SOTA lightweight CNNs such as MobileNetV3 and FBNet.
AdderNet: Do We Really Need Multiplications in Deep Learning? (adder neural networks) Achieves very good performance on large-scale neural networks and datasets. Paper: https://arxiv.org/pdf/1912.13200
Frequency Domain Compact 3D Convolutional Neural Networks (3D CNN compression) Paper: https://arxiv.org/pdf/1909.04977 Open-source code: https://github.com/huawei-noah/CARS
A Semi-Supervised Assessor of Neural Architectures (neural network accuracy predictor, NAS)
Hit-Detector: Hierarchical Trinity Architecture Search for Object Detection (NAS for detection) Searches the backbone, neck, and head jointly, as a trinity.
CARS: Continuous Evolution for Efficient Neural Architecture Search (continuously evolving NAS) Efficient, combines the advantages of differentiable and evolutionary search, and can output a Pareto front.
On Positive-Unlabeled Classification in GAN (PU+GAN)
Learning multiview 3D point cloud registration (3D point clouds) Paper: arxiv.org/abs/2001.05119
Multi-Modal Domain Adaptation for Fine-Grained Action Recognition (fine-grained action recognition) Paper: arxiv.org/abs/2001.09691
Action Modifiers: Learning from Adverbs in Instructional Video Paper: arxiv.org/abs/1912.06617
PolarMask: Single Shot Instance Segmentation with Polar Representation (instance segmentation modeling) Paper: arxiv.org/abs/1909.13226 Paper walkthrough: https://zhuanlan.zhihu.com/p/84890413 Open-source code: https://github.com/xieenze/PolarMask
Rethinking Performance Estimation in Neural Architecture Search (NAS) Because performance estimation is the truly time-consuming part of block-wise neural architecture search, this paper finds the optimal parameters for block-wise NAS, making it faster and giving a higher correlation.
Distribution Aware Coordinate Representation for Human Pose Estimation (human pose estimation) Paper: arxiv.org/abs/1910.06278 GitHub: https://github.com/ilovepose/DarkPose Authors' team homepage: https://ilovepose.github.io/coco/
OCR
ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network Paper: https://arxiv.org/abs/2002.10200 Code: https://github.com/Yuliang-Liu/bezier_curve_text_spotting, https://github.com/aim-uofa/adet
Image classification
Self-training with Noisy Student improves ImageNet classification Paper: https://arxiv.org/abs/1911.04252
Image Matching across Wide Baselines: From Paper to Practice Paper: https://arxiv.org/abs/2003.01587
Towards Robust Image Classification Using Sequential Attention Models Paper: https://arxiv.org/abs/1912.02184
Video analysis
Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications Paper: https://arxiv.org/abs/2003.01455
Code: https://github.com/bbrattoli/ZeroShotVideoClassification
Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs Paper: https://arxiv.org/abs/2003.00387
Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning Paper: https://arxiv.org/abs/2003.00392
Object Relational Graph with Teacher-Recommended Learning for Video Captioning Paper: https://arxiv.org/abs/2002.11566
Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution Paper: https://arxiv.org/abs/2002.11616
Blurry Video Frame Interpolation Paper: https://arxiv.org/abs/2002.12259
Hierarchical Conditional Relation Networks for Video Question Answering Paper: https://arxiv.org/abs/2002.10698
Action Modifiers: Learning from Adverbs in Instructional Video Paper: https://arxiv.org/abs/1912.06617
Image processing
Learning to Shade Hand-drawn Sketches Paper: https://arxiv.org/abs/2002.11812
Single Image Reflection Removal through Cascaded Refinement Paper: https://arxiv.org/abs/1911.06634
Generalized ODIN: Detecting Out-of-distribution Image without Learning from Out-of-distribution Data Paper: https://arxiv.org/abs/2002.11297
Deep Image Harmonization via Domain Verification Paper: https://arxiv.org/abs/1911.13239 Code: https://github.com/bcmi/Image_Harmonization_Datasets
RoutedFusion: Learning Real-time Depth Map Fusion Paper: https://arxiv.org/pdf/2001.04388.pdf
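Updates
Visual Commonsense R-CNN
https://arxiv.org/abs/2002.12204
Out-of-distribution image detection
https://arxiv.org/abs/2002.11297
Blurry Video Frame Interpolation
https://arxiv.org/abs/2002.12259
Meta-transfer learning for zero-shot super-resolution
https://arxiv.org/abs/2002.12213
3D indoor scene understanding
https://arxiv.org/abs/2002.12212
Unbiased scene graph generation from biased training
https://arxiv.org/abs/2002.11949
Auto-encoding twin-bottleneck hashing
https://arxiv.org/abs/2002.11930
A social spatio-temporal graph convolutional neural network for human trajectory prediction
https://arxiv.org/abs/2002.11927
Towards universal representation learning for deep face recognition
https://arxiv.org/abs/2002.11841
Generalization of visual representations
https://arxiv.org/abs/1912.03330
Mitigating context bias
https://arxiv.org/abs/2002.11812
Unsupervised reinforcement learning of transferable meta-skills
https://arxiv.org/abs/1911.07450
Fast and accurate space-time video super-resolution
https://arxiv.org/abs/2002.11616
Video captioning with an object relational graph and teacher-recommended learning
https://arxiv.org/abs/2002.11566
Rethinking the route towards weakly supervised object localization
https://arxiv.org/abs/2002.11359
Learning a generic agent for vision-and-language navigation via pre-training
https://arxiv.org/pdf/2002.10638.pdf
GhostNet: a lightweight neural network
https://arxiv.org/pdf/1911.11907.pdf
AdderNet: Do we really need multiplications in deep learning?
https://arxiv.org/pdf/1912.13200.pdf
CARS: Continuous evolution for efficient neural architecture search
https://arxiv.org/abs/1909.04977
Single-image reflection removal through collaborative iterative cascaded refinement
https://arxiv.org/abs/1911.06634
Filter grafting for deep neural networks
https://arxiv.org/pdf/2001.05868.pdf
PolarMask: unifying instance segmentation into FCN
https://arxiv.org/pdf/1909.13226.pdf
Semi-supervised semantic image segmentation
https://arxiv.org/pdf/1811.07073.pdf
Defending against universal attacks through selective feature regeneration
https://arxiv.org/pdf/1906.03444.pdf
Real-time fine-grained sketch-based image retrieval
https://arxiv.org/abs/2002.10310
Probing VQA models with sub-questions
https://arxiv.org/abs/1906.03444
Learning a neural 3D texture space from 2D exemplars
https://geometry.cs.ucl.ac.uk/projects/2020/neuraltexture/
NestedVAE: Isolating common factors via weak supervision
https://arxiv.org/abs/2002.11576
Towards multi-future trajectory prediction
https://arxiv.org/pdf/1912.06445.pdf
Robust image classification using sequential attention models
https://arxiv.org/pdf/1912.02184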
Original article: https://blog.csdn.net/qq_15698613/article/details/112469087