Text analysis project on Reddit text posts and comments. What started out as a school project is now being taken further!
What was done for the basic school project:
- Preprocessing
- TF-IDF, SVD
- KNN
- Deep neural network taking the SVD features as input
- Deep neural network with an embedding layer
- Analysis of how the number of latent factors in SVD influenced the performance of KNN and the neural network
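
As a rough illustration of the TF-IDF and KNN steps listed above, here is a minimal sketch in plain Python. It is not the project's actual code: the toy posts, the label names, and the helpers (`tokenize`, `fit_idf`, `tfidf_vector`, `cosine`, `knn_predict`) are all hypothetical, the smoothed IDF formula is one common variant chosen as an assumption, and the SVD step is left out for brevity.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on whitespace; real preprocessing would do more.
    return text.lower().split()


def fit_idf(docs):
    # Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1.
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}


def tfidf_vector(text, idf):
    # Term frequency times IDF, scaled to unit length; unseen terms are dropped.
    tf = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}


def cosine(a, b):
    # Both vectors are unit length, so the dot product is the cosine similarity.
    return sum(v * b.get(t, 0.0) for t, v in a.items())


def knn_predict(text, train_vecs, train_labels, idf, k=3):
    # Rank training posts by similarity to the query and take a majority vote.
    query = tfidf_vector(text, idf)
    ranked = sorted(
        zip(train_vecs, train_labels),
        key=lambda pair: cosine(query, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data standing in for Reddit posts and their labels.
train_texts = [
    "my gpu overheats while gaming",
    "new graphics card benchmark results",
    "best pasta sauce recipe",
    "how to bake sourdough bread",
]
train_labels = ["hardware", "hardware", "cooking", "cooking"]

idf = fit_idf(train_texts)
train_vecs = [tfidf_vector(t, idf) for t in train_texts]

print(knn_predict("gpu benchmark for gaming", train_vecs, train_labels, idf, k=3))
```

In this sketch an SVD step would sit between `tfidf_vector` and `knn_predict`, projecting the sparse vectors onto a chosen number of latent factors before the neighbour search.
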
Goals:
- Explore additional methods: BERT, LSTM layers, convolutional NNs
- Use GPU acceleration
- Add a dataset of images and perform the same kind of classification
- Use metrics beyond accuracy to measure model performance
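
For the last goal, one possible starting point is per-class precision, recall, and F1, plus a macro-averaged F1. The sketch below is plain Python and only an assumption about which metrics might be useful here. The function names (`per_class_metrics`, `macro_f1`) and the example labels are hypothetical.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    # Count true positives, false positives, and false negatives per label.
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


def macro_f1(report):
    # Unweighted mean of per-class F1, so small classes count as much as large ones.
    return sum(m["f1"] for m in report.values()) / len(report)


# Hypothetical predictions on an imbalanced toy set.
y_true = ["hardware", "hardware", "hardware", "cooking"]
y_pred = ["hardware", "hardware", "cooking", "cooking"]

report = per_class_metrics(y_true, y_pred)
for label, scores in report.items():
    print(label, scores)
print("macro F1:", macro_f1(report))
```

Accuracy on this toy set is 0.75, yet precision for the smaller class is only 0.5. That kind of gap is what per-class metrics are meant to expose.
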