In the modern era of online communication, managing and moderating user-generated content is crucial for maintaining healthy and safe online communities. The "Toxic Comment Scoring" project is a deep-learning solution designed to automatically identify and score toxic comments in online discussions, forums, social media platforms, and more.
- Clone the repo:

```shell
git clone https://github.com/chiragdeep01/Toxic-Comment-Scoring.git
```

If you only want to use the API, you can also run it with Docker:

```shell
docker compose build
docker compose up
```
For Python:

- Create the environment and install the dependencies:

```shell
conda create -n toxic python=3.11.5
conda activate toxic
cd Toxic-Comment-Scoring
pip install -r requirements.txt
```
I have trained an LSTM-based deep learning model that outputs a score for each of 6 toxicity classes:
- toxic
- severe_toxic
- obscene
- threat
- insult
- identity_hate
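The six classes are scored independently (multi-label), so a single comment can score high on several classes at once. A minimal sketch of what that means, assuming one sigmoid output per class (the logit values below are made up for illustration):

```python
import math

def sigmoid(x: float) -> float:
    """Squash a raw logit into a per-class probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical raw logits, one per class (illustrative values only).
logits = {"toxic": 1.47, "obscene": -0.05, "threat": -4.0}

# Each class gets its own sigmoid score; unlike softmax, the scores
# do not have to sum to 1, so multiple classes can be "high" at once.
scores = {label: sigmoid(z) for label, z in logits.items()}
print(scores)
```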
The model implementation and inference code are under `./toxicModel/`.
I have created an API using FastAPI to generate predictions; to start it, just run the run.py file:

```shell
python run.py
```
API DESCRIPTION:

POST localhost:8080

| Key | Value | Type |
|---|---|---|
| text | You are an idiot | string |

JSON body:

```json
{
    "text": "You are an idiot"
}
```
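A request like the one above can be built from Python's standard library. This is a sketch that only prepares the request, assuming the server started by run.py is listening on localhost:8080; uncomment the last lines to actually send it:

```python
import json
import urllib.request

# Build the JSON payload the endpoint expects.
payload = json.dumps({"text": "You are an idiot"}).encode("utf-8")

# Prepare a POST request against the local API (assumed at localhost:8080).
req = urllib.request.Request(
    "http://localhost:8080",
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(req.full_url, req.get_method())

# With the server running, send it like this:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read()))
```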
RESPONSE:

```json
{
    "toxic": "0.81315887",
    "severe_toxic": "0.050926227",
    "obscene": "0.48636666",
    "threat": "0.017364092",
    "insult": "0.43801567",
    "identity_hate": "0.052807126"
}
```
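Note that the scores come back as strings, so a caller will typically convert them to floats before using them. A small sketch, reusing the sample response above:

```python
# Sample response from the API (scores are returned as strings).
response = {
    "toxic": "0.81315887",
    "severe_toxic": "0.050926227",
    "obscene": "0.48636666",
    "threat": "0.017364092",
    "insult": "0.43801567",
    "identity_hate": "0.052807126",
}

# Convert the string scores to floats and pick the highest-scoring class.
scores = {label: float(value) for label, value in response.items()}
top_label = max(scores, key=scores.get)
print(top_label, scores[top_label])  # → toxic 0.81315887
```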
From Python, you can import toxicModel and run inference directly. Example:

```python
import toxicModel

toxic_model = toxicModel.Model(device="gpu:0")
result = toxic_model.predict("hello how are you")
print(result)
```
By default the device is set to auto: the model will automatically find the GPU id if a GPU is present and load onto it, otherwise it falls back to the CPU. To force CPU inference, just pass "cpu" as the device.
I have trained the model on a dataset I found on Kaggle - LINK