Library for ML feature type inference: https://github.com/pvn25/ML-Data-Prep-Zoo/tree/master/MLFeatureTypeInference
Original repo: https://github.com/pvn25/SortingHatLib.git
- Install the package with pip:
git clone https://github.com/fabulousdj/SortingHatLib.git
pip install SortingHatLib/
- Import the library:
import sortinghat.pylib as pl
- Read in the CSV file using pandas:
import pandas as pd
# Model abbreviations: rf = Random Forest, neural = neural model, knn = k-NN, logreg = logistic regression, svm = RBF SVM
dataDownstream = pd.read_csv('adult.csv')
- Perform base featurization of the raw CSV file:
dataFeaturized = pl.FeaturizeFile(dataDownstream)
- Extract bigram features for the Random Forest model:
dataFeaturized1 = pl.FeatureExtraction(dataFeaturized)
- Finally, load the Random Forest model for prediction:
y_RF = pl.Load_RF(dataFeaturized1)
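
The steps above leave the predictions in y_RF, separate from the column names of the input file. A minimal sketch of pairing the two for easier reading follows. It assumes, which this README does not state, that y_RF holds one entry per column of dataDownstream, in column order. The helper name pair_columns_with_predictions is hypothetical and is not part of the library, and the labels in the demo are placeholders rather than real output.

```python
from typing import Dict, Iterable, Sequence


def pair_columns_with_predictions(
    columns: Iterable[str], predictions: Sequence
) -> Dict[str, object]:
    """Map each column name to the prediction at the same position.

    Hypothetical helper, not part of SortingHatLib.
    Assumption: `predictions` has one entry per column, in column order.
    Duplicate column names would overwrite each other in the result.
    """
    names = list(columns)
    if len(names) != len(predictions):
        raise ValueError(
            f"expected {len(names)} predictions, got {len(predictions)}"
        )
    return dict(zip(names, predictions))


if __name__ == "__main__":
    # Placeholder column names and labels, for illustration only.
    demo = pair_columns_with_predictions(
        ["age", "workclass"], ["label_a", "label_b"]
    )
    for name, label in demo.items():
        print(f"{name}: {label}")
```

With the variables from the steps above, the call would look like pair_columns_with_predictions(dataDownstream.columns, y_RF), provided the ordering assumption holds. The length check is there so that a mismatch fails loudly instead of silently pairing the wrong columns.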