This project takes a dataset file and produces a scree plot for analyzing the principal components, as well as a plot of the dataset projected onto the first two principal components.
Some file descriptions:
- DataGenerator.py: Generates valid .csv files for random testing (some are already generated)
- Nutrition.csv is a non-random dataset that works well with PCA
- PCA.py: Performs PCA dimensionality reduction on a given dataset
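As a rough illustration of what the scree plot and the 2D projection are built from, here is a minimal NumPy sketch. It is not taken from PCA.py, which may well use sklearn's PCA class instead; the function name `pca_sketch` and the random demo data are assumptions made for this example only.

```python
import numpy as np

def pca_sketch(X, n_components=2):
    """Return (explained_variance_ratio, projected_data) for a 2-D array X."""
    X = np.asarray(X, dtype=float)
    # Center each feature (column) at zero mean.
    Xc = X - X.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the principal directions.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Squared singular values are proportional to the variance per component.
    var = S ** 2
    ratio = var / var.sum()
    # Project the centered data onto the first n_components directions.
    projected = Xc @ Vt[:n_components].T
    return ratio, projected

# Demo on random data (hypothetical, just to show the shapes involved).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
ratio, projected = pca_sketch(X)
```

Here `ratio` holds the share of variance explained by each component, which is what a scree plot displays, and `projected` holds the coordinates of each sample on the first two components.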
To use with an sklearn dataset (IRIS or BREASTCANCER):
- python3 ./pca.py
- enter either "iris" or "breastcancer" to use the respective sklearn datasets
To use with a data file:
- python3 ./pca.py
- enter the name of the file ex. "Nutrition.csv"
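The two input modes above could be handled along the following lines. This is only a sketch under stated assumptions: `load_dataset` is a hypothetical helper, not a function from this project, and the CSV branch assumes one header row followed by purely numeric, comma-separated columns, which may not match the layout the script actually expects.

```python
import numpy as np

def load_dataset(name):
    """Hypothetical helper: return a 2-D float array for a dataset name or CSV path."""
    key = name.strip().lower()
    if key in ("iris", "breastcancer"):
        # Imported lazily so CSV-only use does not require scikit-learn.
        from sklearn.datasets import load_iris, load_breast_cancer
        loader = load_iris if key == "iris" else load_breast_cancer
        return loader().data
    # Assumed layout: one header row, then comma-separated numeric columns.
    return np.genfromtxt(name, delimiter=",", skip_header=1)
```

For example, `load_dataset("iris")` would return the 150 x 4 iris feature matrix, while any other input is treated as a path to a CSV file.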
Notes:
The script currently accepts only input data files laid out in a specific format. For the expected layout, refer to any of the .csv files in this project.
Make sure the sklearn (installed as scikit-learn), matplotlib, and numpy libraries are installed.
Pictures: