Text classification and clustering. It can be applied to tasks such as sentiment polarity analysis and text risk classification, and it supports multiple classification algorithms.
text-classifier is an open-source Python toolkit for text classification and text clustering. Its goal is to implement text analysis algorithms that can be used in production environments. text-classifier features clear algorithms, high performance, and a customizable corpus.
text-classifier provides the following functions:
- Classifier
  - LogisticRegression
  - MultinomialNB
  - KNN
  - SVM
  - RandomForest
  - DecisionTreeClassifier
  - Xgboost
  - Neural Network
- Evaluate
  - Precision
  - Recall
  - F1
- Test
  - Chi-square test
- Cluster
  - MiniBatchKmeans
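As a rough illustration of the Evaluate and Test items, the sketch below computes a chi-square score for one term against one class and the precision, recall, and F1 of a set of predictions. It is a minimal pure-Python sketch, not the toolkit's own code; the function names `chi_square` and `precision_recall_f1` and their parameters are hypothetical.

```python
def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 term/class contingency table.

    n11: documents in the class that contain the term
    n10: documents outside the class that contain the term
    n01: documents in the class that lack the term
    n00: documents outside the class that lack the term
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n11 + n10) * (n10 + n00) * (n01 + n00)
    if denom == 0:
        return 0.0  # a row or column is empty, so the term carries no signal
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for the label treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

A higher chi-square score suggests the term is more strongly associated with the class, which is why such scores are commonly used to rank candidate features before training.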
While providing rich functionality, text-classifier keeps its internal modules loosely coupled, loads models lazily, publishes its dictionaries, and is easy to use.
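Lazy loading here means that a model is only read into memory the first time it is needed. The sketch below shows one common way to do this in Python. The `LazyModel` class and the toy loader are hypothetical and only illustrate the idea; they are not taken from the text-classifier source.

```python
class LazyModel:
    """Defers an expensive load until the model is first used."""

    def __init__(self, loader):
        self._loader = loader  # zero-argument callable that builds the model
        self._model = None
        self.load_count = 0    # how many times the loader has actually run

    @property
    def model(self):
        if self._model is None:
            self._model = self._loader()
            self.load_count += 1
        return self._model

    def predict(self, text):
        return self.model(text)


def load_toy_model():
    """Stand-in for reading a trained model from disk."""
    positive_words = {"good", "great", "love"}

    def predict(text):
        return "pos" if positive_words & set(text.lower().split()) else "neg"

    return predict


lazy = LazyModel(load_toy_model)  # nothing is loaded yet
label = lazy.predict("I love it")  # first use triggers the load
```

Constructing the wrapper is cheap, so importing a module that holds several such wrappers does not pay the loading cost for models that are never used.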
https://www.borntowin.cn/product/sentiment_classify
git clone https://github.com/shibing624/text-classifier.git
pip3 install -r requirements.txt
- Preprocess the corpus with word segmentation
python3 preprocess.py
- Train model
You can change the model by editing config.py, and then train the model.
python3 train.py
- Predict on the test data
python3 infer.py
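The three steps above (preprocess, train, infer) can be pictured with a small self-contained sketch. It uses a hand-written multinomial naive Bayes model with add-alpha smoothing and whitespace tokenization as a stand-in for real word segmentation. The function names mirror the script names for readability only; their signatures, the model dictionary, and the sample data are assumptions and do not reflect what preprocess.py, train.py, or infer.py actually contain.

```python
import math
from collections import Counter, defaultdict


def preprocess(text):
    """Lowercase and split on whitespace (stand-in for word segmentation)."""
    return text.lower().split()


def train(samples, alpha=1.0):
    """Fit multinomial naive Bayes on (text, label) pairs."""
    class_docs = Counter()
    term_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        tokens = preprocess(text)
        class_docs[label] += 1
        term_counts[label].update(tokens)
        vocab.update(tokens)
    total_docs = sum(class_docs.values())
    return {
        "alpha": alpha,
        "vocab": vocab,
        "term_counts": term_counts,
        "log_prior": {c: math.log(n / total_docs) for c, n in class_docs.items()},
        "class_totals": {c: sum(term_counts[c].values()) for c in class_docs},
    }


def infer(model, text):
    """Return the label with the highest log posterior score."""
    alpha = model["alpha"]
    vocab_size = len(model["vocab"])
    tokens = [t for t in preprocess(text) if t in model["vocab"]]  # skip unseen words
    best_label, best_score = None, -math.inf
    for label, score in model["log_prior"].items():
        denom = model["class_totals"][label] + alpha * vocab_size
        for token in tokens:
            score += math.log((model["term_counts"][label][token] + alpha) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up sample data, for illustration only.
train_data = [
    ("great movie loved it", "pos"),
    ("what a wonderful film", "pos"),
    ("terrible plot boring acting", "neg"),
    ("awful film hated it", "neg"),
]
model = train(train_data)
prediction = infer(model, "loved this wonderful movie")
```

In the toolkit itself, the choice of model is made in config.py rather than in code like this.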
- LogisticRegression
- Random Forest
- Decision Tree
- K-Nearest Neighbours
- Naive Bayes
- Xgboost
- Support Vector Machine (SVM)
- MLP
- Ensemble
- Stack
- Xgboost_lr
- TextCNN
- TextRNN
- fastText
- HAN
- Kmeans
- SentimentPolarityAnalysis
- Apache License 2.0