sandy4321 / emotion_analysis_on_twitter

This project forked from nidhikargathra/emotion_analysis_on_twitter

Trained and evaluated the performance of various classifiers (SVM, Naïve Bayes, Logistic Regression, Random Forest) for Natural Language Processing (NLP) in order to mine real-time data from Twitter and rate each tweet on its primary emotional content. Applied the Synthetic Minority Oversampling Technique (SMOTE) to an unbalanced dataset and achieved a classification accuracy of 95%.

Jupyter Notebook 100.00%

emotion_analysis_on_twitter's Introduction

*********************************** CONFIGURATION ***********************************

== API Accounts ==

To use the GeoTwitter application, a user must have registered accounts for the Twitter API and the Google Geocoding API.  Registering with these services provides the authentication keys and tokens required to make the API calls in the GeoTwitter code.


Information pertaining to the PythonTwitter API can be found here:

Getting Started with the PythonTwitter API: https://developer.twitter.com/en/docs/ads/general/guides/getting-started


Similarly, the information for Google's Geocoding API can be found here:

Get Started with the Geocoding API: https://developers.google.com/maps/documentation/geocoding/start
 

== Authentication File ==

Once accounts are set up with the two API services, the keys and tokens need to be stored in JSON format in a file called OAuth_Keys.json.  This file should be placed in the same directory as the GeoTwitter Jupyter Notebook.  Alternatively, the parameter CREDFILE in the code can be changed to reference a different location and filename for the JSON credential file.  A sample file containing dummy keys is included in the GeoTwitter zip file.

The file must contain the following key-value pairs, all at the top level of the JSON file.

    KEY         VALUE
    Token       Twitter API token
    SToken      Twitter API secret token
    Key         Twitter API key
    SKey        Twitter API secret key
    GoogleKey   Google Maps API key
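
As an illustration of the layout described above, the following sketch writes a placeholder credential file and reads it back, checking that the five required keys are present at the top level.  The helper names load_credentials and write_dummy_credentials are hypothetical and are not taken from the notebook, which may read the file differently; only the file name, the CREDFILE parameter name, and the five key names come from this document.

```python
import json
from pathlib import Path

# Keys that must appear at the top level of the credential file.
REQUIRED_KEYS = ("Token", "SToken", "Key", "SKey", "GoogleKey")

# Default location: the same directory as the notebook.
CREDFILE = "OAuth_Keys.json"


def load_credentials(path=CREDFILE):
    """Read the JSON credential file and verify the required keys exist.

    Hypothetical helper for illustration only.
    """
    with Path(path).open(encoding="utf-8") as handle:
        creds = json.load(handle)
    if not isinstance(creds, dict):
        raise ValueError(f"{path}: expected a JSON object at the top level")
    missing = [key for key in REQUIRED_KEYS if key not in creds]
    if missing:
        raise KeyError(f"{path} is missing required keys: {', '.join(missing)}")
    return creds


def write_dummy_credentials(path=CREDFILE):
    """Write a placeholder file with the expected layout (values are not real keys)."""
    dummy = {key: f"<your {key} here>" for key in REQUIRED_KEYS}
    Path(path).write_text(json.dumps(dummy, indent=4), encoding="utf-8")
```

Calling load_credentials() before running the rest of the notebook gives an early, readable error if a key was left out or misspelled, instead of a failed API call later on.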
	
== Data Subdirectory ==

Lastly, the data subdirectory must be included with the Jupyter Notebook.  This folder, included in the submission, contains data for the nationally sampled sentiment in JSON format.  The file is called sentiment_sample.json.

******************************** Running the Code ********************************

After all of the configuration steps have been completed, the Jupyter Notebook is ready to be run.  To start the GeoTwitter application, all modules in the Jupyter Notebook must be run in order.  The final module starts the Bokeh application, which the user then interacts with.

IMPORTANT: Make sure that the port number in the final line of code, shown below, matches the port number in the URL of the Jupyter Notebook in the browser where it is running.

    show(app, notebook_url="localhost:8888", notebook_handle=True)
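
One way to avoid a mismatch is to derive the host and port from the address shown in the browser rather than typing them by hand.  The sketch below uses only the Python standard library; the helper name notebook_url_from and the example address are made up for illustration and are not part of the notebook.

```python
from urllib.parse import urlparse


def notebook_url_from(browser_url):
    """Return the 'host:port' part of the address shown in the browser.

    Hypothetical helper for illustration only.
    """
    netloc = urlparse(browser_url).netloc
    if not netloc:
        raise ValueError(f"no host:port found in {browser_url!r}")
    return netloc


# Example address is invented; copy the real one from the browser's address bar.
notebook_url = notebook_url_from("http://localhost:8889/notebooks/GeoTwitter.ipynb")
print(notebook_url)  # localhost:8889
```

The resulting string can then be passed as the notebook_url argument in the final line of the notebook in place of the hard-coded value.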

Once the application is running, the user can supply input using the menu on the left and then submit their query by clicking on the "Get Tweets" or "Apply Filter" button.

*********************************** Limitations ***********************************

== Bokeh Server ==

Due to limitations with Bokeh Server, the same geographic location cannot be searched multiple times without restarting the Jupyter kernel.  This is a result of how Bokeh loads specific tiles from a geographic tile provider.  Multiple searches can be done, but a different location must be used each time.


== Python Twitter API ==

The PythonTwitter API is subject to the rate limits specified on the Twitter API website.  As of the writing of this document, the limit for retrieving tweets is 450 requests per 15-minute window.  Given that the GeoTwitter application issues up to 50 batched requests to the Twitter API for a given search, the application is theoretically limited to 9 searches every 15 minutes.  If the limit is hit, the application will return empty results.
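
The figure of 9 searches per window follows from dividing the request limit by the worst-case number of requests per search.  The short sketch below reproduces that arithmetic with the numbers quoted above; the function and constant names are illustrative and do not come from the notebook.

```python
RATE_LIMIT = 450           # requests allowed per window, as quoted above
WINDOW_MINUTES = 15        # length of one rate-limit window
REQUESTS_PER_SEARCH = 50   # worst case: up to 50 batched requests per search


def searches_per_window(rate_limit=RATE_LIMIT,
                        requests_per_search=REQUESTS_PER_SEARCH):
    """Number of complete worst-case searches that fit in one window."""
    return rate_limit // requests_per_search


print(searches_per_window())  # 9
```

Because 50 is an upper bound, a search that needs fewer batched requests uses less of the window, so 9 is a conservative estimate.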

== Google Geocoding API ==

The Google Geocoding API is also rate-limited, but given that the GeoTwitter application issues only one API request for each search, the user is much more likely to be constrained by the limits of the Twitter API.  A user would have to submit 2500 requests in a given 24-hour period to hit Google's rate limit.
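
To see why the Twitter limit is normally the binding one, the two limits can be compared over a full day.  The sketch below uses the figures stated in this document (450 requests per 15 minutes, up to 50 requests per search, 2500 geocoding requests per 24 hours) and assumes every search uses the worst-case 50 Twitter requests; the names are illustrative.

```python
MINUTES_PER_DAY = 24 * 60

TWITTER_LIMIT_PER_WINDOW = 450   # requests per 15-minute window
WINDOW_MINUTES = 15
TWITTER_REQUESTS_PER_SEARCH = 50  # assumed worst case for every search

GOOGLE_LIMIT_PER_DAY = 2500      # geocoding requests per 24 hours
GOOGLE_REQUESTS_PER_SEARCH = 1   # one geocoding call per search


def max_searches_per_day_twitter():
    """Daily search ceiling imposed by the Twitter limit."""
    windows_per_day = MINUTES_PER_DAY // WINDOW_MINUTES
    per_window = TWITTER_LIMIT_PER_WINDOW // TWITTER_REQUESTS_PER_SEARCH
    return windows_per_day * per_window


def max_searches_per_day_google():
    """Daily search ceiling imposed by the Google limit."""
    return GOOGLE_LIMIT_PER_DAY // GOOGLE_REQUESTS_PER_SEARCH


print(max_searches_per_day_twitter())  # 864
print(max_searches_per_day_google())   # 2500
```

Under the worst-case assumption, the Twitter limit caps usage at 864 searches per day, well below the 2500 searches that would exhaust the Google limit.  Searches that need fewer Twitter requests raise the first ceiling, so the gap narrows for light queries.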
