niharikabalachandra / language-detection-multinomiallogisticregression
Language detection using the European Parliament Proceedings Parallel Corpus. The European Parliament Proceedings Parallel Corpus is a text dataset used for evaluating language detection engines. The 1.5 GB corpus covers 21 languages spoken in the EU. This project aims to build a machine learning model, trained on this dataset, that predicts the language of new, unseen text.
License: MIT License
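
The repository name points to multinomial logistic regression as the model. As a rough illustration of that idea, here is a minimal sketch in plain Python. It is not the project's code. It assumes character-bigram features, omits the bias term and regularisation, and trains on a handful of toy sentences invented for this example rather than on the corpus. The names `bigram_features` and `SoftmaxLanguageClassifier` are hypothetical.

```python
import math
from collections import Counter

# Toy sentences invented for this sketch; they are NOT taken from the corpus.
TEXTS = [
    "the members voted on the budget this morning",
    "we would like to thank you for your work",
    "die abgeordneten haben heute den haushalt beschlossen",
    "wir danken ihnen sehr herzlich",
    "le parlement examine le budget ce matin",
    "nous vous remercions pour votre travail",
]
LABELS = ["en", "en", "de", "de", "fr", "fr"]


def bigram_features(text):
    """Return L2-normalised character-bigram counts as a sparse dict."""
    padded = f" {text.lower()} "
    counts = Counter(padded[i:i + 2] for i in range(len(padded) - 1))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {gram: c / norm for gram, c in counts.items()}


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class SoftmaxLanguageClassifier:
    """Multinomial logistic regression trained with plain SGD (no bias term)."""

    def __init__(self, learning_rate=0.5, epochs=200):
        self.learning_rate = learning_rate
        self.epochs = epochs

    def fit(self, texts, labels):
        self.classes_ = sorted(set(labels))
        self.weights_ = [{} for _ in self.classes_]  # one sparse vector per class
        examples = [
            (bigram_features(t), self.classes_.index(y))
            for t, y in zip(texts, labels)
        ]
        for _ in range(self.epochs):
            for features, target in examples:
                probs = self._probabilities(features)
                for k, weights in enumerate(self.weights_):
                    # Cross-entropy gradient w.r.t. the class score: p_k - 1[k == target]
                    error = probs[k] - (1.0 if k == target else 0.0)
                    for gram, value in features.items():
                        step = self.learning_rate * error * value
                        weights[gram] = weights.get(gram, 0.0) - step
        return self

    def _probabilities(self, features):
        scores = [
            sum(weights.get(g, 0.0) * v for g, v in features.items())
            for weights in self.weights_
        ]
        return softmax(scores)

    def predict_proba(self, text):
        """Map each language label to its predicted probability."""
        return dict(zip(self.classes_, self._probabilities(bigram_features(text))))

    def predict(self, text):
        probs = self.predict_proba(text)
        return max(probs, key=probs.get)


model = SoftmaxLanguageClassifier().fit(TEXTS, LABELS)
print(model.predict("thank you for the budget"))
```

Character n-grams are a common choice for language identification because they capture spelling patterns without needing a tokenizer for each language. On the full 21-language corpus, a library implementation with regularisation and a held-out test split would be the more practical route.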