
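A collection of the latest CVPR 2021 papers, mainly covering: Transformer, NAS, model compression, model evaluation, image classification, detection, segmentation, tracking, GAN, super-resolution, image restoration, deraining, dehazing, deblurring, denoising, reconstruction, and more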

Home Page: https://github.com/murufeng/CVPR_2021_Papers


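A collection of CVPR 2021 papers, including paper links, code repositories, and paper walkthroughs

Follow the WeChat official account 深度学习技术前沿 and reply "CVPR2021" in the chat to get the Baidu Cloud download link



CVPR 2021 accepted papers and code, organized by topic (continuously updated)

Categories:

Low-Level-Vision (mainly super-resolution, image restoration, deraining, dehazing, deblurring, denoising, reconstruction, and related topics)

High-Level-Vision (mainly image classification, detection, segmentation, tracking, GAN, and related topics)

Model architecture and data processing (mainly Transformer, NAS, model compression, model evaluation)


Other topics


Collected CVPR 2021 paper walkthroughs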



1. Super-Resolution

Unsupervised Degradation Representation Learning for Blind Super-Resolution

Data-Free Knowledge Distillation For Image Super-Resolution

Learning Continuous Image Representation with Local Implicit Image Function

AdderSR: Towards Energy Efficient Image Super-Resolution

Exploring Sparsity in Image Super-Resolution for Efficient Inference

ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic

Cross-MPI: Cross-scale Stereo for Image Super-Resolution using Multiplane Images

2. Image Deraining

Semi-Supervised Video Deraining with Dynamic Rain Generator

3. Image Dehazing

4. Deblurring

DeFMO: Deblurring and Shape Recovery of Fast Moving Objects

ARVo: Learning All-Range Volumetric Correspondence for Video Deblurring

5. Denoising

6. Image Restoration

Multi-Stage Progressive Image Restoration

CT Film Recovery via Disentangling Geometric Deformation and Illumination Variation: Simulated Datasets and Deep Models

Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE

PISE: Person Image Synthesis and Editing with Decoupled GAN

DeFLOCNet: Deep Image Editing via Flexible Low level Controls

PD-GAN: Probabilistic Diverse GAN for Image Inpainting

Anycost GANs for Interactive Image Synthesis and Editing

Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing

7. Image Enhancement

Auto-Exposure Fusion for Single-Image Shadow Removal

Learning Multi-Scale Photo Exposure Correction

DeFMO: Deblurring and Shape Recovery of Fast Moving Objects

8. Image Demoireing

9. Image Shadow Removal

Auto-Exposure Fusion for Single-Image Shadow Removal

10. Image Translation

Image-to-image Translation via Hierarchical Style Disentanglement

Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation

CoMoGAN: continuous model-guided image-to-image translation

Spatially-Adaptive Pixelwise Networks for Fast Image Translation

11. Frame Interpolation

FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation

CDFI: Compression-driven Network Design for Frame Interpolation

DeFMO: Deblurring and Shape Recovery of Fast Moving Objects

12. Video Compression

MetaSCI: Scalable and Adaptive Reconstruction for Video Compressive Sensing

13. Image Editing

Anycost GANs for Interactive Image Synthesis and Editing

Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing


[1] Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection

[2] UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

[3] Positive-Unlabeled Data Purification in the Wild for Object Detection

[4] General Instance Distillation for Object Detection

[5] Instance Localization for Self-supervised Detection Pretraining

[6] Multiple Instance Active Learning for Object Detection

[7] Towards Open World Object Detection

[8] You Only Look One-level Feature

[9] End-to-End Object Detection with Fully Convolutional Network

[10] FSCE: Few-Shot Object Detection via Contrastive Proposal Encoding

[11] Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection

[12] MeGA-CDA: Memory Guided Attention for Category-Aware Unsupervised Domain Adaptive Object Detection

[13] OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection

[1] Depth from Camera Motion and Object Detection

[2] There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge

[3] Dogfight: Detecting Drones from Drone Videos

[1] 3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection

[2] Categorical Depth Distribution Network for Monocular 3D Object Detection

[3] ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection

[4] Center-based 3D Object Detection and Tracking

[1] Coarse-Fine Networks for Temporal Activity Detection in Videos

[2] Detecting Human-Object Interaction via Fabricated Compositional Learning

[3] Reformulating HOI Detection as Adaptive Set Prediction

[4] QPIC: Query-Based Pairwise Human-Object Interaction Detection with Image-Wide Contextual Information

[5] End-to-End Human Object Interaction Detection with HOI Transformer

[1] Multiresolution Knowledge Distillation for Anomaly Detection

[2] ReDet: A Rotation-equivariant Detector for Aerial Object Detection

[3] Dense Label Encoding for Boundary Discontinuity Free Rotation Detection

[4] Skeleton Merger: an Unsupervised Aligned Keypoint Detector


[1] Few-Shot Segmentation Without Meta-Learning: A Good Transductive Inference Is All You Need?

[2] PointFlow: Flowing Semantics Through Points for Aerial Image Segmentation

[3] FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space

[1] Cross-View Regularization for Domain Adaptive Panoptic Segmentation

[2] 4D Panoptic LiDAR Segmentation

[1] Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges

[2] PLOP: Learning without Forgetting for Continual Semantic Segmentation

[3] Cross-Dataset Collaborative Learning for Semantic Segmentation

[4] BBAM: Bounding Box Attribution Map for Weakly Supervised Semantic and Instance Segmentation

[5] Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations

[6] Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion

[7] Capturing Omni-Range Context for Omnidirectional Segmentation

[8] MetaCorrection: Domain-aware Meta Loss Correction for Unsupervised Domain Adaptation in Semantic Segmentation

[9] Learning Statistical Texture for Semantic Segmentation

[10] Semi-supervised Domain Adaptation based on Dual-level Domain Mixing for Semantic Segmentation

[11] Multi-Source Domain Adaptation with Collaborative Learning for Semantic Segmentation

### Instance Segmentation

[1] End-to-End Video Instance Segmentation with Transformers

[2] BBAM: Bounding Box Attribution Map for Weakly Supervised Semantic and Instance Segmentation

## Superpixel

[1] Learning the Superpixel in a Non-iterative and Lifelong Manner

[1] Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild

[2] Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion

[1] Real-Time High Resolution Background Matting

[1] CanonPose: Self-supervised Monocular 3D Human Pose Estimation in the Wild

[2] PCLs: Geometry-aware Neural Reconstruction of 3D Pose with Perspective Crop Layers

[3] DCPose: Deep Dual Consecutive Network for Human Pose Estimation

[4] Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing

[1] Camera-Space Hand Mesh Recovery via Semantic Aggregation and Adaptive 2D-1D Registration

[2] Skeleton Based Sign Language Recognition Using Whole-body Keypoints

[1] GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation

[2] Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments

[3] MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization

[1] Cross Modal Focal Loss for RGBD Face Anti-Spoofing

[2] When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework

[3] Multi-attentional Deepfake Detection

[4] Image-to-image Translation via Hierarchical Style Disentanglement

[5] A 3D GAN for Improved Large-pose Facial Recognition


[1] HPS: localizing and tracking people in large 3D scenes from wearable sensors

[2] Track to Detect and Segment: An Online Multi-Object Tracker

[3] Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking

[4] Rotation Equivariant Siamese Networks for Tracking

[5] Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking

[6] Learning a Proposal Classifier for Multiple Object Tracking

[7] Center-based 3D Object Detection and Tracking


[1] Frequency-aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection

[2] 3DCaricShop: A Dataset and A Baseline Method for Single-view 3D Caricature Face Reconstruction

[3] ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis

[4] Image-to-image Translation via Hierarchical Style Disentanglement

[5] When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework

[6] PISE: Person Image Synthesis and Editing with Decoupled GAN

[7] Soft-IntroVAE: Analyzing and Improving Introspective Variational Autoencoders

[1] Cross Modal Focal Loss for RGBD Face Anti-Spoofing

[2] Multi-attentional Deepfake Detection


## Re-Identification

[1] Meta Batch-Instance Normalization for Generalizable Person Re-Identification

[2] On Semantic Similarity in Video Retrieval

[1] QAIR: Practical Query-efficient Black-Box Attacks for Image Retrieval

[1] Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation

[2] Coarse-Fine Networks for Temporal Activity Detection in Videos

[3] Learning Discriminative Prototypes with Dynamic Time Warping

[4] Temporal Action Segmentation from Timestamp Supervision

[5] ACTION-Net: Multipath Excitation for Action Recognition

[6] BASAR: Black-box Attack on Skeletal Action Recognition

[7] Understanding the Robustness of Skeleton-based Action Recognition under Adversarial Attack

[8] Temporal Difference Networks for Efficient Action Recognition

[9] Behavior-Driven Synthesis of Human Dynamics

[1] DeepTag: An Unsupervised Deep Learning Method for Motion Tracking on Cardiac Tagging Magnetic Resonance Images

[2] Multi-institutional Collaborations for Improving Deep Learning-based Magnetic Resonance Image Reconstruction Using Federated Learning

[3] 3D Graph Anatomy Geometry-Integrated Network for Pancreatic Mass Segmentation, Diagnosis, and Quantitative Patient Management

[4] Deep Lesion Tracker: Monitoring Lesions in 4D Longitudinal Imaging Studies

[5] Automatic Vertebra Localization and Identification in CT by Spine Rectification and Anatomically-constrained Optimization

[6] Brain Image Synthesis with Unsupervised Multivariate Canonical CSCℓ4Net

[7] XProtoNet: Diagnosis in Chest Radiography with Global and Local Explanations

[8] FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space

[9] Multiple Instance Captioning: Learning Representations from Histopathology Textbooks and Articles

[10] Discovering Hidden Physics Behind Transport Dynamics


[1] AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling

[2] ReNAS: Relativistic Evaluation of Neural Architecture Search (the importance of ranking loss in NAS predictors)

[3] HourNAS: Extremely Fast Neural Architecture Search Through an Hourglass Lens (reducing the cost of NAS)

[4] Prioritized Architecture Sampling with Monto-Carlo Tree Search

[5] Searching by Generating: Flexible and Efficient One-Shot NAS with Architecture Generator

[6] Contrastive Neural Architecture Search with Neural Architecture Comparators

[7] OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection


[1] Anycost GANs for Interactive Image Synthesis and Editing

[2] Efficient Conditional GAN Transfer with Knowledge Propagation across Classes

[3] Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing

[4] Hijack-GAN: Unintended-Use of Pretrained, Black-Box GANs

[5] Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation

[6] A 3D GAN for Improved Large-pose Facial Recognition

[7] DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network

[8] Diverse Semantic Image Synthesis via Probability Distribution Modeling

[9] HumanGAN: A Generative Model of Humans Images

[10] MetaSimulator: Simulating Unknown Target Models for Query-Efficient Black-box Attacks

[11] Soft-IntroVAE: Analyzing and Improving Introspective Variational Autoencoders

[12] LOHO: Latent Optimization of Hairstyles via Orthogonalization

[13] PISE: Person Image Synthesis and Editing with Decoupled GAN

[14] Closed-Form Factorization of Latent Semantics in GANs

[15] PD-GAN: Probabilistic Diverse GAN for Image Inpainting

[2] A Deep Emulator for Secondary Motion of 3D Characters

[1] 3D CNNs with Adaptive Temporal Feature Resolutions

[14] Skeleton Merger: an Unsupervised Aligned Keypoint Detector

[13] Cycle4Completion: Unpaired Point Cloud Completion using Cycle Transformation with Missing Region Coding

[12] Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion

[11] How Privacy-Preserving are Line Clouds? Recovering Scene Details from 3D Lines

[10] PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency

[9] Robust Point Cloud Registration Framework Based on Deep Graph Matching

[8] TPCN: Temporal Point Cloud Networks for Motion Forecasting

[7] PointGuard: Provably Robust 3D Point Cloud Classification

[6] Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges

[5] SpinNet: Learning a General Surface Descriptor for 3D Point Cloud Registration

[4] MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization

[3] Diffusion Probabilistic Models for 3D Point Cloud Generation

[2] Style-based Point Generator with Adversarial Rendering for Point Cloud Completion

[1] PREDATOR: Registration of 3D Point Clouds with Low Overlap

[1] PCLs: Geometry-aware Neural Reconstruction of 3D Pose with Perspective Crop Layers


[1] Manifold Regularized Dynamic Network Pruning (accounts for constraints between sample complexity and network complexity during dynamic pruning)

[2] Learning Student Networks in the Wild (a model compression and acceleration technique that does not need the original training data)

[1] Refine Myself by Teaching Myself: Feature Refinement via Self-Knowledge Distillation

[2] Knowledge Evolution in Neural Networks

[3] Semantic-aware Knowledge Distillation for Few-Shot Class-Incremental Learning

[4] Teachers Do More Than Teach: Compressing Image-to-Image Models (https://arxiv.org/abs/2103.03467)

[5] General Instance Distillation for Object Detection

[6] Multiresolution Knowledge Distillation for Anomaly Detection

[7] Distilling Object Detectors via Decoupled Features (distillation that decouples foreground and background)


[1] Coordinate Attention for Efficient Mobile Network Design

[2] Inception Convolution with Efficient Dilation Search

[3] Rethinking Channel Dimensions for Efficient Model Design

[4] Involution: Inverting the Inherence of Convolution for Visual Recognition

[5] RepVGG: Making VGG-style ConvNets Great Again

[6] Fast and Accurate Model Scaling


[1] Transformer Interpretability Beyond Attention Visualization

[2] UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

[3] Pre-Trained Image Processing Transformer (a pre-trained model for low-level vision)

[2] Quantifying Explainers of Graph Neural Networks in Computational Pathology

[1] Sequential Graph Convolutional Network for Active Learning


[1] KeepAugment: A Simple Information-Preserving Data Augmentation

[3] Adaptive Consistency Regularization for Semi-Supervised Transfer Learning

[2] Meta Batch-Instance Normalization for Generalizable Person Re-Identification

[1] Representative Batch Normalization with Feature Calibration

[2] Improving Unsupervised Image Clustering With Robust Learning

[1] Reconsidering Representation Alignment for Multi-view Clustering

[1] Are Labels Necessary for Classifier Accuracy Evaluation? (can an unlabeled test set be used to evaluate a model?)

[2] Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges

[1] Re-labeling ImageNet: from Single to Multi-Labels, from Global to Localized Labels

[3] VaB-AL: Incorporating Class Imbalance and Difficulty with Variational Bayes for Active Learning

[2] Multiple Instance Active Learning for Object Detection

[1] Sequential Graph Convolutional Network for Active Learning


[6] Goal-Oriented Gaze Estimation for Zero-Shot Learning

[5] Few-Shot Segmentation Without Meta-Learning: A Good Transductive Inference Is All You Need?

[4] Counterfactual Zero-Shot and Open-Set Visual Recognition

[3] Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection

[2] Few-shot Open-set Recognition by Transformation Consistency

[1] Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning

[2] Rainbow Memory: Continual Learning with a Memory of Diverse Samples

[1] Learning the Superpixel in a Non-iterative and Lifelong Manner

[1] Transformation Driven Visual Reasoning

[4] Continual Adaptation of Visual Representations via Domain Randomization and Meta-learning

[3] Domain Generalization via Inference-time Label-Preserving Target Projections

[2] MetaSCI: Scalable and Adaptive Reconstruction for Video Compressive Sensing

[1] FSDR: Frequency Space Domain Randomization for Domain Generalization

[1] Fine-grained Angular Contrastive Learning with Coarse Labels

[1] QAIR: Practical Query-efficient Black-Box Attacks for Image Retrieval

Learning Asynchronous and Sparse Human-Object Interaction in Videos

Self-supervised Geometric Perception

Quantifying Explainers of Graph Neural Networks in Computational Pathology

Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts

Data-Free Model Extraction

Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition

Right for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting with their Explanations

Multi-Objective Interpolation Training for Robustness to Label Noise

VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs

Scan2Cap: Context-aware Dense Captioning in RGB-D Scans

Hierarchical and Partially Observable Goal-driven Policy Learning with Goals Relational Graph

ID-Unet: Iterative Soft and Hard Deformation for View Synthesis

PML: Progressive Margin Loss for Long-tailed Age Classification

Diversifying Sample Generation for Data-Free Quantization

Domain Generalization via Inference-time Label-Preserving Target Projections

DeRF: Decomposed Radiance Fields

Densely connected multidilated convolutional networks for dense prediction tasks

VirTex: Learning Visual Representations from Textual Annotations

Weakly-supervised Grounded Visual Question Answering using Capsules

FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation

Probabilistic Embeddings for Cross-Modal Retrieval

Self-supervised Simultaneous Multi-Step Prediction of Road Dynamics and Cost Map

IIRC: Incremental Implicitly-Refined Classification

Fair Attribute Classification through Latent Space De-biasing

Information-Theoretic Segmentation by Inpainting Error Maximization

UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pretraining

Less is More: CLIPBERT for Video-and-Language Learning via Sparse Sampling

D-NeRF: Neural Radiance Fields for Dynamic Scenes

Weakly Supervised Learning of Rigid 3D Scene Flow


[23] Self-supervised Geometric Perception

[22] DeepTag: An Unsupervised Deep Learning Method for Motion Tracking on Cardiac Tagging Magnetic Resonance Images

[21] Modeling Multi-Label Action Dependencies for Temporal Action Localization

[20] HPS: localizing and tracking people in large 3D scenes from wearable sensors

[19] Real-Time High Resolution Background Matting

[18] Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts

[17] Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments

[16] MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization

[15] Categorical Depth Distribution Network for Monocular 3D Object Detection

[14] PatchmatchNet: Learned Multi-View Patchmatch Stereo

[13] Continual Adaptation of Visual Representations via Domain Randomization and Meta-learning

[12] Single-Stage Instance Shadow Detection with Bidirectional Relation Learning

[11] Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Surfaces

[9] PREDATOR: Registration of 3D Point Clouds with Low Overlap

[8] Domain Generalization via Inference-time Label-Preserving Target Projections

[7] Neural Deformation Graphs for Globally-consistent Non-rigid Reconstruction

[6] Fine-grained Angular Contrastive Learning with Coarse Labels

[5] Less is More: CLIPBERT for Video-and-Language Learning via Sparse Sampling

[4] Cross-View Regularization for Domain Adaptive Panoptic Segmentation

[3] Image-to-image Translation via Hierarchical Style Disentanglement

[2] Towards Open World Object Detection

[1] End-to-End Video Instance Segmentation with Transformers


