This project forked from maxhalford/prince

👑 Multivariate exploratory data analysis in Python: PCA, CA, MCA, MFA, FAMD, GPA

Home Page: https://maxhalford.github.io/prince

License: MIT License

Python 99.60% Makefile 0.40%

prince's Introduction


Prince is a Python library for multivariate exploratory data analysis. It includes a variety of methods for summarizing tabular data, including principal component analysis (PCA) and correspondence analysis (CA). Prince provides efficient implementations that follow the scikit-learn API.

Example usage

>>> import prince

>>> dataset = prince.datasets.load_decathlon()
>>> decastar = dataset.query('competition == "Decastar"')

>>> pca = prince.PCA(n_components=5)
>>> pca = pca.fit(decastar, supplementary_columns=['rank', 'points'])
>>> pca.eigenvalues_summary
          eigenvalue % of variance % of variance (cumulative)
component
0              3.114        31.14%                     31.14%
1              2.027        20.27%                     51.41%
2              1.390        13.90%                     65.31%
3              1.321        13.21%                     78.52%
4              0.861         8.61%                     87.13%

>>> pca.transform(dataset).tail()
component                       0         1         2         3         4
competition athlete
OlympicG    Lorenzo      2.070933  1.545461 -1.272104 -0.215067 -0.515746
            Karlivans    1.321239  1.318348  0.138303 -0.175566 -1.484658
            Korkizoglou -0.756226 -1.975769  0.701975 -0.642077 -2.621566
            Uldal        1.905276 -0.062984 -0.370408 -0.007944 -2.040579
            Casarsa      2.282575 -2.150282  2.601953  1.196523 -3.571794
>>> chart = pca.plot(dataset)
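
The percentage columns in the summary above can be checked by hand. The sketch below is not part of Prince: `variance_summary` is a hypothetical helper, and the total inertia of 10 is an assumption (it is what one would expect if the ten decathlon events are standardized, and it agrees with the printed figures).

```python
# Eigenvalues copied from the eigenvalues_summary output above.
eigenvalues = [3.114, 2.027, 1.390, 1.321, 0.861]

# Assumption: ten standardized active variables, so total inertia is 10.
TOTAL_INERTIA = 10


def variance_summary(eigenvalues, total_inertia):
    """Return (percent, cumulative percent) pairs, one per component."""
    rows = []
    running = 0.0
    for eigenvalue in eigenvalues:
        running += eigenvalue
        percent = round(100 * eigenvalue / total_inertia, 2)
        cumulative = round(100 * running / total_inertia, 2)
        rows.append((percent, cumulative))
    return rows


summary = variance_summary(eigenvalues, TOTAL_INERTIA)
for component, (percent, cumulative) in enumerate(summary):
    print(f"{component}  {percent:.2f}%  {cumulative:.2f}%")
```

Under that assumption, the five retained components account for roughly 87% of the total variance, which matches the last row of the table.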

Installation

pip install prince

🎨 Prince uses Altair for making charts.

Methods

flowchart TD
    cat?(Categorical data?) --> |"✅"| num_too?(Numerical data too?)
    num_too? --> |"✅"| FAMD
    num_too? --> |"❌"| multiple_cat?(More than two columns?)
    multiple_cat? --> |"✅"| MCA
    multiple_cat? --> |"❌"| CA
    cat? --> |"❌"| groups?(Groups of columns?)
    groups? --> |"✅"| MFA
    groups? --> |"❌"| shapes?(Analysing shapes?)
    shapes? --> |"✅"| GPA
    shapes? --> |"❌"| PCA
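
The same decision tree can be written as a small function. This is only an illustrative sketch: `suggest_method` and its parameter names are hypothetical and are not part of the Prince API. It returns the name of the method the flowchart points to.

```python
def suggest_method(
    has_categorical,
    has_numerical=False,
    more_than_two_columns=False,
    has_column_groups=False,
    analysing_shapes=False,
):
    """Walk the flowchart above and return the suggested method name."""
    if has_categorical:
        # Categorical data mixed with numerical data.
        if has_numerical:
            return "FAMD"
        # Purely categorical: MCA for more than two columns, CA otherwise.
        return "MCA" if more_than_two_columns else "CA"
    # No categorical data: check for groups of columns first.
    if has_column_groups:
        return "MFA"
    # Otherwise GPA for shape analysis, PCA as the default.
    return "GPA" if analysing_shapes else "PCA"


print(suggest_method(has_categorical=True, has_numerical=True))
print(suggest_method(has_categorical=False))
```

The first call follows the "categorical and numerical" branch, the second falls through to the default at the bottom of the chart.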

Correctness

Prince is tested against scikit-learn and FactoMineR. For the latter, rpy2 is used to run code in R, and convert the results to Python, which allows running automated tests. See more in the tests directory.
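
When comparing factor coordinates across implementations, one detail to keep in mind is that each component is only defined up to its sign, so two correct implementations may return mirrored columns. The sketch below is not Prince's actual test code: `columns_match_up_to_sign` is a hypothetical helper that uses only the standard library to show one way such a comparison could be made tolerant of sign flips.

```python
import math


def columns_match_up_to_sign(ours, reference, rel_tol=1e-6, abs_tol=1e-9):
    """Return True if every column of `ours` equals the matching column
    of `reference`, or its negation, within the given tolerances.

    Both arguments are lists of rows (lists of floats) of the same shape.
    """
    if len(ours) != len(reference):
        return False
    n_cols = len(ours[0]) if ours else 0
    if any(len(row) != n_cols for row in ours):
        return False
    if any(len(row) != n_cols for row in reference):
        return False

    def close(a, b):
        return math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)

    for j in range(n_cols):
        ours_col = [row[j] for row in ours]
        ref_col = [row[j] for row in reference]
        same = all(close(a, b) for a, b in zip(ours_col, ref_col))
        flipped = all(close(a, -b) for a, b in zip(ours_col, ref_col))
        if not (same or flipped):
            return False
    return True


ours = [[1.0, 2.0], [3.0, -4.0]]
mirrored = [[1.0, -2.0], [3.0, 4.0]]  # second column has the opposite sign
print(columns_match_up_to_sign(ours, mirrored))
```

A whole column must be flipped for the match to hold; flipping a single entry is treated as a real difference.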

Citation

Please use this citation if you use this software as part of a scientific publication.

@software{Halford_Prince,
    author = {Halford, Max},
    license = {MIT},
    title = {{Prince}},
    url = {https://github.com/MaxHalford/prince}
}

Support

I made Prince when I was at university, back in 2016. I've had very little time over the years to maintain this package. I spent a significant amount of time in 2022 revamping the entire package. Prince has now been downloaded over 1 million times. I would be grateful to anyone willing to sponsor me. Sponsorships allow me to spend more time working on open source software, including Prince.

License

The MIT License (MIT). Please see the license file for more information.
