This Kaggle challenge is a great way to get started with Natural Language Processing (NLP). It covers the following:
- Data exploration
- Data cleaning - remove URLs, punctuation, emojis
- Tokenization and vectorization - count vectorization, TF-IDF
- Stopword removal
- Stemming and lemmatization
- Model evaluation
- Hyperparameter tuning
- Gain a deeper understanding of the important features in NLP
- Experiment with the BERT (Bidirectional Encoder Representations from Transformers) model
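
As a rough illustration of the cleaning, stopword removal, and vectorization steps listed above, here is a minimal sketch that uses only the Python standard library. Everything in it is an assumption made for illustration: the function names (`clean_text`, `tokenize`, `count_vectors`, `tfidf_vectors`), the tiny `STOPWORDS` set, the approximate emoji ranges, and the plain `log(N / df)` weighting are not taken from the challenge itself.

```python
import math
import re
import string
from collections import Counter

# Hypothetical, deliberately tiny stopword list (assumption for this sketch).
STOPWORDS = {"a", "an", "the", "is", "in", "on", "of", "and", "to", "at"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Approximate emoji ranges (assumption: common emoji blocks, not every symbol).
EMOJI_RE = re.compile(
    "[\U0001F300-\U0001FAFF\U00002600-\U000027BF\U0001F1E6-\U0001F1FF]+"
)
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_text(text):
    """Strip URLs, emojis, and ASCII punctuation, then lowercase."""
    text = URL_RE.sub(" ", text)       # URLs first, before punctuation is removed
    text = EMOJI_RE.sub(" ", text)
    text = text.translate(PUNCT_TABLE)
    return " ".join(text.lower().split())  # collapse repeated whitespace


def tokenize(text):
    """Split cleaned text on whitespace and drop stopwords."""
    return [tok for tok in clean_text(text).split() if tok not in STOPWORDS]


def count_vectors(docs):
    """Return one term-count mapping per document."""
    return [Counter(tokenize(doc)) for doc in docs]


def tfidf_vectors(docs):
    """Weight raw counts by idf = log(N / df), an unsmoothed textbook variant."""
    counts = count_vectors(docs)
    n_docs = len(counts)
    # Iterating a Counter yields each term once, so this counts documents per term.
    doc_freq = Counter(term for c in counts for term in c)
    return [
        {term: tf * math.log(n_docs / doc_freq[term]) for term, tf in c.items()}
        for c in counts
    ]


if __name__ == "__main__":
    docs = ["Forest fire near the town", "The town is calm"]
    print(clean_text("Fire in the forest!!! http://t.co/abc"))
    print(count_vectors(docs))
    print(tfidf_vectors(docs))
```

Note that a term appearing in every document gets a weight of zero under this unsmoothed formula. Library implementations such as scikit-learn's `TfidfVectorizer` apply idf smoothing and vector normalization by default, so their numbers will differ from this sketch.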