nltk_bookreview_sentimentanalysis.ipynb
- positive/negative sentiment analysis on Amazon book reviews
nltk_spam_classifier.ipynb
- binary (spam / not spam) prediction on SMS text messages
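
As a rough illustration of the kind of binary text classifier the two notebooks above describe, here is a minimal sketch using NLTK's `NaiveBayesClassifier`. The toy messages, the `bag_of_words` helper and the `spam`/`ham` label names are assumptions made for this example; the notebooks themselves may use different features, data and models.

```python
from nltk.classify import NaiveBayesClassifier


def bag_of_words(text):
    """Hypothetical helper: map a message to a {word: True} feature dict."""
    return {word: True for word in text.lower().split()}


# Tiny made-up training set, only for illustration (not the real SMS data).
train = [
    ("win a free prize now", "spam"),
    ("free cash claim your prize", "spam"),
    ("urgent call now to claim free reward", "spam"),
    ("are we still meeting for lunch", "ham"),
    ("see you at home tonight", "ham"),
    ("can you call me after the meeting", "ham"),
]

# NLTK expects a list of (feature_dict, label) pairs.
train_set = [(bag_of_words(text), label) for text, label in train]
classifier = NaiveBayesClassifier.train(train_set)

# Classify two unseen messages.
for message in ["claim your free prize", "see you at lunch"]:
    print(message, "->", classifier.classify(bag_of_words(message)))
```

The same pattern would apply to sentiment analysis by swapping the labels for positive/negative and the messages for review texts.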
nltk_bbcnews_topicmodelling.ipynb
- topic modelling of 2000 BBC news articles into categories
- preprocessing with CountVectorizer and tf-idf
- PCA and k-means to estimate the optimal n_topics
- clustering on t-SNE and UMAP reductions
- LDA topic modelling with gensim vs sklearn
- word cloud visualisation of topic keywords
- BERTopic topic modelling, with visualisation of topic distances and hierarchical linkage
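
The preprocessing and n_topics estimation steps listed above can be sketched roughly as follows with scikit-learn. This is a minimal sketch, not the notebook's actual code: the nine toy documents, the choice of 5 PCA components, the candidate range of k, and the use of the silhouette score as the selection criterion are all assumptions made for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.metrics import silhouette_score

# Made-up stand-in corpus (the real notebook uses 2000 BBC articles).
docs = [
    "the striker scored a late goal in the cup final",
    "the team won the league after a penalty shootout",
    "the coach praised the goalkeeper after the match",
    "shares fell as the bank reported lower profits",
    "the company raised prices as inflation hit the market",
    "investors sold stocks after the interest rate rise",
    "the new phone has a faster chip and better camera",
    "the software update fixes a security bug in the browser",
    "researchers built a robot using a new machine learning model",
]

# Raw term counts, then tf-idf weighting.
counts = CountVectorizer(stop_words="english").fit_transform(docs)
tfidf = TfidfTransformer().fit_transform(counts)

# PCA needs a dense array; acceptable for a toy corpus of this size.
reduced = PCA(n_components=5, random_state=0).fit_transform(tfidf.toarray())

# Score each candidate number of clusters (assumed criterion: silhouette).
scores = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(reduced)
    scores[k] = silhouette_score(reduced, labels)

best_k = max(scores, key=scores.get)
print("silhouette by k:", scores)
print("suggested n_topics:", best_k)
```

The suggested value could then be passed as the number of topics to an LDA model. On a corpus this small the scores are noisy, so they only show the mechanics.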
gammaraysky / nltk_exercises
Small compilation of classification and clustering problems.