Authors:
- Dario Galimberti
- Paola Impiccichè
- Giacomo Pracucci
This project presents a study on text classification and topic modeling using an e-commerce dataset containing product descriptions.
The first part covers text classification, in which product descriptions are assigned to predefined classes using three machine learning algorithms: Multinomial Naive Bayes, Logistic Regression, and Random Forest.
The second part covers topic modeling with Latent Dirichlet Allocation (LDA) and BERTopic, used to discover latent topics in the descriptions of different product lines.
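
To make the classification step more concrete, the following is a minimal sketch of one of the three classifiers named above, Multinomial Naive Bayes, written in plain Python with Laplace smoothing. It is not the project's actual code: the function names (`train_multinomial_nb`, `predict`), the whitespace tokenizer, the toy descriptions, and the category labels are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model on whitespace-tokenized text.

    Hypothetical helper for illustration; alpha is the Laplace smoothing term.
    """
    class_docs = Counter(labels)          # number of documents per class
    word_counts = defaultdict(Counter)    # per-class token frequencies
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {
        "class_docs": class_docs,
        "word_counts": word_counts,
        "vocab": vocab,
        "alpha": alpha,
    }


def predict(model, text):
    """Return the class with the highest log posterior for a description."""
    # Tokens never seen in training are ignored.
    tokens = [t for t in text.lower().split() if t in model["vocab"]]
    total_docs = sum(model["class_docs"].values())
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]

    best_label, best_score = None, -math.inf
    for label, n_docs in model["class_docs"].items():
        counts = model["word_counts"][label]
        total_tokens = sum(counts.values())
        # log prior + sum of smoothed log likelihoods
        score = math.log(n_docs / total_docs)
        for token in tokens:
            score += math.log(
                (counts[token] + alpha) / (total_tokens + alpha * vocab_size)
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy product descriptions and labels (made up, not taken from the dataset).
docs = [
    "wireless bluetooth headphones with microphone",
    "cotton t-shirt for men",
    "hardcover novel by bestselling author",
    "usb charging cable for phone",
    "slim fit denim jeans",
    "paperback cookbook with recipes",
]
labels = ["Electronics", "Clothing", "Books", "Electronics", "Clothing", "Books"]

model = train_multinomial_nb(docs, labels)
print(predict(model, "bluetooth phone cable"))
print(predict(model, "denim jeans"))
```

In practice, a library implementation combined with TF-IDF or count features would likely replace this hand-written version, and the same features could be passed to Logistic Regression and Random Forest for comparison.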