yassouali / awesome-semi-supervised-learning

😎 An up-to-date & curated list of awesome semi-supervised learning papers, methods & resources.

License: MIT License

deep-learning semi-supervised-learning machine-learning computer-vision natural-language-processing graph-neural-networks generative-model

Awesome Semi-Supervised Learning

A curated list of awesome Semi-Supervised Learning resources. Inspired by awesome-deep-vision, awesome-deep-learning-papers, and awesome-self-supervised-learning.

Background

What is Semi-Supervised Learning?

It is a special form of classification. Traditional classifiers use only labeled data (feature/label pairs) to train. Labeled instances, however, are often difficult, expensive, or time-consuming to obtain, as they require the efforts of experienced human annotators. Meanwhile, unlabeled data may be relatively easy to collect, but there have been few ways to use it. Semi-supervised learning addresses this problem by using large amounts of unlabeled data, together with the labeled data, to build better classifiers. Because semi-supervised learning requires less human effort and can give higher accuracy, it is of great interest both in theory and in practice.

How many semi-supervised learning methods are there?

Many. Some often-used methods include: EM with generative mixture models, self-training, consistency regularization, co-training, transductive support vector machines, and graph-based methods. With the advent of deep learning, the majority of these methods were adapted and integrated into existing deep learning frameworks to take advantage of unlabeled data.
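To make one of the methods above concrete, here is a minimal sketch of self-training (pseudo-labeling). It is an illustration under stated assumptions, not code from any listed paper: the nearest-centroid classifier, the softmax-over-distances confidence score, the function names (`fit_centroids`, `predict`, `self_train`), the `threshold` parameter, and the toy data are all hypothetical choices made for brevity.

```python
import math

def fit_centroids(points, labels):
    """Return {label: centroid} for labeled 2-D points."""
    sums = {}
    for (x, y), lab in zip(points, labels):
        sx, sy, n = sums.get(lab, (0.0, 0.0, 0))
        sums[lab] = (sx + x, sy + y, n + 1)
    return {lab: (sx / n, sy / n) for lab, (sx, sy, n) in sums.items()}

def predict(centroids, point):
    """Return (label, confidence). Confidence is a softmax over negative
    distances to the centroids (an assumed, illustrative score)."""
    weights = {lab: math.exp(-math.dist(point, c)) for lab, c in centroids.items()}
    best = max(weights, key=weights.get)
    return best, weights[best] / sum(weights.values())

def self_train(labeled, labels, unlabeled, threshold=0.8, max_rounds=10):
    """Repeatedly fit on the labeled set, pseudo-label the confident
    unlabeled points, and move them into the labeled set."""
    labeled, labels, pool = list(labeled), list(labels), list(unlabeled)
    for _ in range(max_rounds):
        centroids = fit_centroids(labeled, labels)
        confident, remaining = [], []
        for p in pool:
            lab, conf = predict(centroids, p)
            if conf >= threshold:
                confident.append((p, lab))
            else:
                remaining.append(p)
        if not confident:  # nothing new to learn from, so stop early
            break
        for p, lab in confident:
            labeled.append(p)
            labels.append(lab)
        pool = remaining
    return fit_centroids(labeled, labels), pool

# Toy data: two labeled points and four unlabeled ones (hypothetical).
centroids, leftover = self_train(
    labeled=[(0.0, 0.0), (4.0, 4.0)],
    labels=["a", "b"],
    unlabeled=[(0.5, 0.2), (3.8, 4.1), (0.1, 0.6), (4.2, 3.7)],
)
```

The confidence threshold is the main design choice: a higher value admits fewer pseudo-labels per round but lowers the risk of reinforcing early mistakes.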

How do semi-supervised learning methods use unlabeled data?

Semi-supervised learning methods use unlabeled data to either modify or reprioritize hypotheses obtained from labeled data alone. Although not all methods are probabilistic, it is easier to look at methods that represent hypotheses by p(y|x), and unlabeled data by p(x). Generative models have common parameters for the joint distribution p(x,y). It is easy to see that p(x) influences p(y|x). Mixture models with EM fall into this category, and to some extent so does self-training. Many other methods are discriminative, including transductive SVM, Gaussian processes, information regularization, graph-based methods, and the majority of deep learning based methods. Original discriminative training cannot be used for semi-supervised learning, since p(y|x) is estimated ignoring p(x). To solve the problem, p(x)-dependent terms are often brought into the objective function, which amounts to assuming p(y|x) and p(x) share parameters.

(source: SSL Literature Survey.)
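As one illustration of bringing a p(x)-dependent term into a discriminative objective, consider entropy minimization. This is an example chosen here, not one taken from the survey quoted above, and the notation is assumed: l labeled pairs (x_i, y_i), u unlabeled points x_j drawn from p(x), model parameters θ, and a weight λ on the unlabeled term.

```latex
\mathcal{L}(\theta)
  = -\sum_{i=1}^{l} \log p_\theta(y_i \mid x_i)
  \;+\; \lambda \sum_{j=l+1}^{l+u} H\!\left(p_\theta(\cdot \mid x_j)\right),
\qquad
H\!\left(p_\theta(\cdot \mid x)\right)
  = -\sum_{y} p_\theta(y \mid x) \log p_\theta(y \mid x)
```

The first sum is the usual supervised log-loss. The second sum is evaluated only on unlabeled samples, so its value depends on where p(x) places mass: minimizing it pushes the model toward confident predictions on the unlabeled points. With λ = 0 the objective reduces to purely supervised training.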

An example of the influence of unlabeled data in semi-supervised learning. (Image source: Wikipedia)

Contributing

If you find any errors, or you wish to add some papers, please feel free to contribute to this list by contacting me or by creating a pull request using the following Markdown format:

- Paper Name. 
  [[pdf]](link) 
  [[code]](link)
  - Author 1, Author 2, and Author 3. *Conference Year*

and adding them to the corresponding markdown file in files/.

Books

Codebase

Surveys & Overview

Computer Vision

Note that for image and object segmentation tasks, we also include weakly-supervised learning methods that use weak labels (e.g., image classes) for detection and segmentation.

NLP

Generative Models & Tasks

Graph Based SSL

Theory

Reinforcement Learning, Meta-Learning & Robotics

Regression

Other

Talks

Thesis

Blogs
