This is a Natural Language Processing project. The objective is to build a model that learns from a labeled training set of news articles (ID|Title|Text|Label) and predicts whether unseen articles are fake or not.
This project was created as part of the Master of Business Analytics and Big Data program at IE University by Breogán Pardo and Bryan Sebastián Vásquez.
So far, the project consists of the following elements:
- Python notebook: the core of the project where the development, analysis and results are presented and discussed.
- utils.py: helper functions for preprocessing and data visualization, kept outside the notebook so its code stays readable.
- clf.py: functions for the machine learning classifiers used in this project, together with the confusion matrix function.
- words_dictionary.json: a JSON dictionary file containing English vocabulary words.
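
As a rough illustration of how two of these pieces might be used, the sketch below filters a text against an English vocabulary and tallies a confusion matrix. It is not the project's actual code: the function names (`load_vocabulary`, `keep_known_words`, `confusion_counts`), the `FAKE`/`REAL` label values, and the word-to-placeholder layout of the JSON are all assumptions made for the example, and a small inline sample stands in for words_dictionary.json.

```python
import json
import re
from collections import Counter

# Assumption: the dictionary file is a JSON object keyed by word,
# e.g. {"apple": 1, "banana": 1}. This inline sample replaces the real file.
SAMPLE_VOCAB_JSON = '{"the": 1, "president": 1, "signed": 1, "a": 1, "new": 1, "law": 1}'


def load_vocabulary(raw_json):
    """Return the set of known words from a JSON object keyed by word."""
    return set(json.loads(raw_json))


def keep_known_words(text, vocabulary):
    """Lowercase the text, extract alphabetic tokens, and drop unknown ones."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [token for token in tokens if token in vocabulary]


def confusion_counts(y_true, y_pred):
    """Count (true label, predicted label) pairs from a classifier's output."""
    return Counter(zip(y_true, y_pred))


vocabulary = load_vocabulary(SAMPLE_VOCAB_JSON)
cleaned = keep_known_words("The President signed a NEW law!!! xqzt", vocabulary)

# Hypothetical labels and predictions, only to show the tally.
counts = confusion_counts(
    ["FAKE", "REAL", "FAKE", "REAL"],
    ["FAKE", "FAKE", "FAKE", "REAL"],
)
print(cleaned)
print(dict(counts))
```

Here `counts[("REAL", "FAKE")]` is the number of real articles wrongly flagged as fake, which is the kind of cell a confusion matrix plot would display.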