mvdata

Data sets for multi-view learning in MATLAB ".mat" file format.

##Introduction

This repository contains the data used in the following paper:

Li, Yeqing, Feiping Nie, Heng Huang, and Junzhou Huang. "Large-Scale Multi-View Spectral Clustering via Bipartite Graph." In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence. 2015.

If you use the data, please cite our paper:

@inproceedings{li2015large,
  title={Large-Scale Multi-View Spectral Clustering via Bipartite Graph},
  author={Li, Yeqing and Nie, Feiping and Huang, Heng and Huang, Junzhou},
  booktitle={Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence},
  year={2015}
}

###How to use

Download the data from Google Drive using a web browser.

Note: The whole dataset is roughly 300 MB to 400 MB.

###About the data format

All the data is stored in MATLAB ".mat" file format. Each ".mat" file contains at least two variables: X and Y. "X" is a cell array of feature matrices, one per view, each of dimension N by d. "Y" is the label vector (dimension N by 1). Here N is the number of data points and d is the feature dimension of the given view. The specification of the data is listed below:

| No. | Handwritten | Caltech-7/20 | Reuters | NUS-WIDE | AWA |
|---|---|---|---|---|---|
| 1 | Pix(240) | Gabor(48) | English(21531) | CH(65) | CQ(2688) |
| 2 | Fou(76) | WM(40) | French(24892) | CM(226) | LSS(2000) |
| 3 | Fac(216) | CENTRIST(254) | German(34251) | CORR(145) | PHOG(252) |
| 4 | ZER(47) | HOG(1984) | Italian(15506) | EDH(74) | SIFT(2000) |
| 5 | KAR(64) | GIST(512) | Spanish(11547) | WT(129) | RGSIFT(2000) |
| 6 | MOR(6) | LBP(928) | - | - | SURF(2000) |
| num of data | 2000 | 1474/2386 | 18758 | 26315 | 4000 |
| num of classes | 10 | 7/20 | 6 | 31 | 50 |

Note: Caltech-7 and Caltech-20 are two subsets of Caltech-101 that contain only 7 and 20 classes respectively. These subsets were created because the number of data points per class in Caltech-101 is unbalanced.
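As a rough illustration of the format described above, the following Python sketch loads one file and checks the shape of each view against the table. It is not part of the original repository. It assumes SciPy is installed, that the files are saved in a pre-v7.3 ".mat" format that `scipy.io.loadmat` can read, and that each view is stored as N by d as stated above. The names `load_multiview`, `check_views`, and `EXPECTED_DIMS`, and the file name in the usage comment, are hypothetical.

```python
# Per-view feature dimensions copied from the table above.
EXPECTED_DIMS = {
    "Handwritten": [240, 76, 216, 47, 64, 6],
    "Caltech": [48, 40, 254, 1984, 512, 928],
    "Reuters": [21531, 24892, 34251, 15506, 11547],
    "NUS-WIDE": [65, 226, 145, 74, 129],
    "AWA": [2688, 2000, 252, 2000, 2000, 2000],
}


def load_multiview(path):
    """Return (views, labels) from a .mat file holding variables X and Y."""
    from scipy.io import loadmat  # third-party dependency, imported lazily

    mat = loadmat(path)
    # A MATLAB cell array is loaded as a NumPy object array; flatten it to a list.
    views = list(mat["X"].ravel())
    labels = mat["Y"].ravel()
    return views, labels


def check_views(view_shapes, n_labels, expected_dims):
    """Check each view's (N, d) shape against the label count and the table."""
    if len(view_shapes) != len(expected_dims):
        raise ValueError(
            f"expected {len(expected_dims)} views, got {len(view_shapes)}"
        )
    for i, ((n, d), d_expected) in enumerate(zip(view_shapes, expected_dims), 1):
        if n != n_labels:
            raise ValueError(f"view {i}: {n} rows but {n_labels} labels")
        if d != d_expected:
            raise ValueError(f"view {i}: dimension {d}, expected {d_expected}")
    return True


# Usage (hypothetical file name, adjust to the downloaded file):
# views, labels = load_multiview("handwritten.mat")
# check_views([v.shape for v in views], len(labels), EXPECTED_DIMS["Handwritten"])
```

If a file turns out to store its views transposed (d by N) or as sparse matrices, the shape check will raise and the loader would need to be adapted accordingly.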

List of features:

  • Fourier coefficients of the character shapes (FOU)
  • Profile correlations (FAC)
  • Pixel averages in 2 × 3 windows (Pix)
  • Zernike moments (ZER)
  • Morphological features (MOR)
  • Gabor feature
  • Wavelet moments (WM)
  • CENTRIST feature
  • Histogram of oriented gradients (HOG) feature
  • GIST feature
  • Local binary patterns (LBP) feature
  • Color Histogram (CH)
  • Color moments (CM)
  • Color correlation (CORR)
  • Edge distribution (EDH)
  • Wavelet texture (WT)

##Source

The original data were downloaded from the following links:
