Vision-Papers

This is a non-exhaustive repo of papers I've read as a student at UC San Diego and Carnegie Mellon, and professionally as a computer vision engineer / intern, kept mostly for personal reference. It probably won't be very useful to the general public beyond giving a sense of what the average computer vision engineer reads.

Many papers that are standard deep learning / computer vision course material (optimizers, learning rate schedules, special activations or convolutions) are omitted, as are papers I simply didn't remember at the time of writing.

If you're a recruiter, questions about papers from this list are fair game :)

Classical

(2004) SIFT: Distinctive Image Features from Scale-Invariant Keypoints

(2005) HOG: Histograms of Oriented Gradients for Human Detection

(2011) Line2D / LineMOD: Gradient Response Maps for Real-Time Detection of Texture-Less Objects

(2012) ORB: an efficient alternative to SIFT or SURF

(2013) BOLD: features to detect texture-less objects

(2013) R-CNN: Rich feature hierarchies for accurate object detection and semantic segmentation

2015

ResNet: Deep Residual Learning for Image Recognition

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

SSD: Single Shot MultiBox Detector

YOLOv1: You Only Look Once: Unified, Real-Time Object Detection

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

U-Net: Convolutional Networks for Biomedical Image Segmentation

2016

SiamFC: Fully-Convolutional Siamese Networks for Object Tracking

BiGAN: Adversarial Feature Learning

ResNeXt: Aggregated Residual Transformations for Deep Neural Networks

DenseNet: Densely Connected Convolutional Networks

Layer Norm: Layer Normalization

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

Learning without Forgetting

2017

CycleGAN: Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks

RetinaNet: Focal Loss for Dense Object Detection

LARS: Large Batch Training of Convolutional Networks

MixUp: Beyond Empirical Risk Minimization

NIMA: Neural Image Assessment

Swish: Searching for Activation Functions

SENet: Squeeze-and-Excitation Networks

Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection

Mask R-CNN: Mask R-CNN

DeepLabv3: Rethinking Atrous Convolution for Semantic Image Segmentation

Soft-NMS: Improving Object Detection With One Line of Code

Transformer: Attention Is All You Need

2018

SiamRPN: High Performance Visual Tracking with Siamese Region Proposal Network

CBAM: Convolutional Block Attention Module

MnasNet: Platform-Aware Neural Architecture Search for Mobile

MobileNetV2: Inverted Residuals and Linear Bottlenecks

YOLOv3: An Incremental Improvement

GANomaly: Semi-Supervised Anomaly Detection via Adversarial Training

DeepSVDD: Deep One-Class Classification

PVNet: PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation

SWA: Averaging Weights Leads to Wider Optima and Better Generalization

DropBlock: DropBlock: A regularization method for convolutional networks

2019

D2-Net: A Trainable CNN for Joint Detection and Description of Local Features

CORAL: Rank consistent ordinal regression for neural networks with application to age estimation

SiamRPN++: Evolution of Siamese Visual Tracking with Very Deep Networks

QATM: Quality-Aware Template Matching For Deep Learning

MobileNetV3: Searching for MobileNetV3

SKNet: Selective Kernel Networks

MS R-CNN: Mask Scoring R-CNN

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

Mish: A Self Regularized Non-Monotonic Activation Function

MoCo: Momentum Contrast for Unsupervised Visual Representation Learning

CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features

GIoU: Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression

2020

FCDD: Explainable Deep One-Class Classification

MoCo v2: Improved Baselines with Momentum Contrastive Learning

SCAN: Learning to Classify Images without Labels

SimCLR: A Simple Framework for Contrastive Learning of Visual Representations

SimCLRv2: Big Self-Supervised Models are Strong Semi-Supervised Learners

SimSiam: Exploring Simple Siamese Representation Learning

SwAV: Unsupervised Learning of Visual Features by Contrasting Cluster Assignments

YOLOv4: Optimal Speed and Accuracy of Object Detection

ViT: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

DETR: End-to-End Object Detection with Transformers

PaDiM: a Patch Distribution Modeling Framework for Anomaly Detection and Localization

FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking

2021

Barlow Twins: Self-Supervised Learning via Redundancy Reduction

DINO: Emerging Properties in Self-Supervised Vision Transformers
