enricivi / semi-falkon

Falkon is one of the most efficient algorithms for supervised large-scale learning. It combines three simple principles: sub-sampling, preconditioning, and iterative solvers. To extend FALKON's usability, we have designed an extension that works in a semi-supervised scenario.

License: MIT License

Python 100.00%
large-scale-learning kernel-methods semi-supervised-learning supervised-learning

semi-falkon's Introduction

Work in progress...

An efficient implementation of the FALKON algorithm for Large Scale kernel methods and an extension to the semi-supervised scenario

Starting from one of the simplest kernel methods (Kernel Ridge Regression), Rudi et al. designed FALKON [1]. Falkon is one of the most efficient algorithms, from both the computational and the statistical point of view, for supervised large-scale learning. The method combines three simple principles: sub-sampling, preconditioning, and iterative solvers. By exploiting these ideas, Falkon reaches sub-quadratic time complexity and linear memory requirements. Its only weak spot is the high cost of data labelling, especially when the datasets involved are large. To overcome this problem, we have designed an extension of Falkon that works in a semi-supervised scenario (that is, on a dataset made up of a few labelled examples and many unlabelled ones) [2]. As the next sections show, our extension handles large semi-supervised datasets efficiently in terms of both accuracy and running time.
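To make the three principles concrete, here is a minimal NumPy sketch of the supervised case. It is not the code in this repository: the names `gaussian_kernel`, `conjugate_gradient`, `falkon_sketch` and the `jitter` parameter are hypothetical, and the preconditioner is only modeled on the one described in [1]. For brevity it materializes the full n x M kernel matrix and uses `np.linalg.solve` where triangular solves would normally be used, so it does not achieve the memory and time bounds mentioned above.

```python
import numpy as np


def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))


def conjugate_gradient(matvec, b, n_iter, tol=1e-10):
    """Textbook CG for a symmetric positive definite operator `matvec`."""
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rs = r @ r
    stop = (tol ** 2) * (b @ b)
    for _ in range(n_iter):
        if rs <= stop:  # converged (also covers an all-zero right-hand side)
            break
        Ap = matvec(p)
        step = rs / (p @ Ap)
        x += step * p
        r -= step * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x


def falkon_sketch(X, y, M, lam, sigma, n_iter, seed=0, jitter=1e-8):
    """Nystrom kernel ridge regression solved by preconditioned CG."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # 1) Sub-sampling: pick M centers uniformly at random from the data.
    centers = X[rng.choice(n, size=M, replace=False)]
    KnM = gaussian_kernel(X, centers, sigma)
    KMM = gaussian_kernel(centers, centers, sigma)

    # 2) Preconditioning: two Cholesky factors built from K_MM only.
    #    T and A are upper triangular (T.T @ T ~ K_MM).
    T = np.linalg.cholesky(KMM + jitter * M * np.eye(M)).T
    A = np.linalg.cholesky(T @ T.T / M + lam * np.eye(M)).T
    solve = np.linalg.solve  # assumption: stands in for triangular solves

    def preconditioned_matvec(u):
        v = solve(A, u)
        w = KnM.T @ (KnM @ solve(T, v)) / n
        return solve(A.T, solve(T.T, w) + lam * v)

    rhs = solve(A.T, solve(T.T, KnM.T @ y / n))

    # 3) Iterative solver: a few CG iterations on the preconditioned system.
    beta = conjugate_gradient(preconditioned_matvec, rhs, n_iter)
    alpha = solve(T, solve(A, beta))
    return centers, alpha
```

Predictions on new points `X_new` are then `gaussian_kernel(X_new, centers, sigma) @ alpha`. The point of the preconditioner is that it depends only on the M x M matrix, so its cost does not grow with the number of training points.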

Semi-Supervised extension

Requirements

pip install -r ./requirements.txt

Usage

The required dataset can be downloaded from this link: https://drive.google.com/drive/folders/1ZjAZUafi6NfjQb_TuGuvThQv_r5HXRhC?usp=sharing

python moons.py dataset/moons_3m_s04.npy --n_labeled 10 --gpu True

Some results

(Main) Reference papers

  1. FALKON: An Optimal Large Scale Kernel Method - Alessandro Rudi, Luigi Carratino and Lorenzo Rosasco - https://arxiv.org/abs/1705.10958

  2. Lagrangean-Based Combinatorial Optimization for Large Scale S3VMs - Francesco Bagattini, Paola Cappanera and Fabio Schoen - https://ieeexplore.ieee.org/abstract/document/8113555

semi-falkon's Issues

Residual equal to zero leads to NaN in output of trained model.

Hi, I tried running your implementation of Falkon, but I get an error during training for a binary classification problem.

Specifically, when computing the conjugate gradient at line 65, the residual b is a vector of zeros, which causes a division by zero here.
However, the first residual b is computed at line 61 and shouldn't be zero.
This results in a trained model that outputs NaN values.

Am I missing something, such as some kind of preprocessing on the data or a flag? I tried both the GPU and CPU versions and got the same error.
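
For illustration, the failure mode described above can be avoided in a generic conjugate gradient loop by stopping as soon as the residual is numerically zero. The sketch below assumes a textbook CG formulation, not the repository's actual code; `cg_guarded` and `tol_sq` are hypothetical names. It does not explain why the residual is zero in the first place, which would still need to be checked (for example, whether the labels or the right-hand side passed to the solver are all zeros).

```python
import numpy as np


def cg_guarded(matvec, b, n_iter=50, tol_sq=1e-12):
    """Textbook CG that exits early instead of dividing by a zero residual."""
    x = np.zeros_like(b, dtype=float)
    r = np.array(b, dtype=float)  # initial residual for the start point x = 0
    p = r.copy()
    rs = r @ r
    for _ in range(n_iter):
        # Without this check, a zero residual makes the step size below
        # 0 / 0, which NumPy evaluates to NaN and propagates into x.
        if rs <= tol_sq:
            break
        Ap = matvec(p)
        step = rs / (p @ Ap)
        x += step * p
        r -= step * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x
```

With an all-zero right-hand side this returns a zero vector rather than NaN, which makes the underlying problem easier to spot.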
