bra1ndump / google-wrapped

Spotify Wrapped, but for your Google searches. Uses SparkNLP.

License: GNU Affero General Public License v3.0

Jupyter Notebook 7.42% Dockerfile 1.70% Python 28.11% HTML 5.22% Stata 0.12% TypeScript 44.63% CSS 5.39% JavaScript 7.41%

google-wrapped's Introduction

Start web server

cd web
npm install
npm run watch

Re-create datasets-client

Important: run Spark in Codespaces. It has not been tested on macOS, where installation issues were previously experienced.

Necessary for the client to run on-device inference. The datasets contain words for different classes, for example politics, sex and health.

To get the bags of words for the categories defined in spark_pipeline.py, run create_bags_of_words.ipynb and split the output into separate files. The notebook runs the SparkNLP classifier on all English words from http://www.mieliestronk.com/wordlist.html and splits them into themes of interest.

The classifier is reused from the earlier server-side classification, which ran users' searches through it. This is why much of the naming is confusing.

Place the files in datasets-client, named in the format bag_.txt. These will be fetched by the client.
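
To illustrate how word bags can drive on-device inference, here is a minimal TypeScript sketch. It is not the repository's client code. The names parseBag, countMatches and classifyQuery are hypothetical, and the one-word-per-line file format and the "most matches wins" rule are assumptions made for the example.

```typescript
// Hypothetical sketch: the function names, the one-word-per-line file
// format, and the scoring rule are assumptions, not the repo's code.
type Bags = Map<string, Set<string>>;

// Parse the text of one bag file into a set of lowercase words.
function parseBag(text: string): Set<string> {
  return new Set(
    text
      .split(/\r?\n/)
      .map((line) => line.trim().toLowerCase())
      .filter((word) => word.length > 0),
  );
}

// Count, per category, how many tokens of the query appear in that bag.
function countMatches(query: string, bags: Bags): Map<string, number> {
  const tokens = query
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0);
  const counts = new Map<string, number>();
  bags.forEach((words, category) => {
    let hits = 0;
    tokens.forEach((token) => {
      if (words.has(token)) hits += 1;
    });
    counts.set(category, hits);
  });
  return counts;
}

// Return the category with the most matches, or null when nothing matches.
// On a tie, the category that appears first in the map is kept.
function classifyQuery(query: string, bags: Bags): string | null {
  let best: string | null = null;
  let bestHits = 0;
  countMatches(query, bags).forEach((hits, category) => {
    if (hits > bestHits) {
      best = category;
      bestHits = hits;
    }
  });
  return best;
}

// Usage with two tiny in-memory bags (made-up words, for illustration only).
const exampleBags: Bags = new Map([
  ["politics", parseBag("election\nsenate\nvote\n")],
  ["health", parseBag("doctor\nvaccine\nsymptom\n")],
]);
const exampleCategory = classifyQuery("How to vote in the election", exampleBags);
```

In a browser, the text passed to parseBag would come from fetching each file in datasets-client. Keeping the parsing and matching separate from the fetch makes the logic easy to test without a server.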

Web server (old Spark setup; processing now happens on the JavaScript side)

Start with gunicorn --reload 'server:app'
