ppraykov / emoji_trends

This project forked from enric1994/emoji_trends

How emojis are used on Twitter 😎

Home Page: https://emoji.enricmor.eu

Python 29.40% HTML 42.84% JavaScript 20.83% CSS 6.92%

Emoji Trends: How emojis are used on Twitter

Webpage: emoji.enricmor.eu

Scraping

The tweets were collected using the GetOldTweets-python fork that includes emoji support. The Python script bypasses some limitations of the official Twitter API, such as restricted access to old tweets and request limits.

The tweets were downloaded by running the script with the following parameters:

    python3 Exporter.py --lang "en" --querysearch "๐ŸŽ" --since 2014-02-03 --until 2014-02-04 &

  • lang "en": filters the tweets by language. English was selected for this project.
  • querysearch "๐ŸŽ": The text or emoji to be collected.
  • since/until: the date range of the tweets. A one-day range was used for this project.

The scraping speed is around 3.7 million tweets per hour when running the script in parallel. Specifically, one instance of the script was run for each day and each emoji.
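
The one-instance-per-day-and-emoji scheme can be sketched as follows. This is a minimal illustration, not the project's actual launcher: the function name build_commands and the two example emojis are assumptions, and only the Exporter.py flags shown above are used. The sketch only builds the argument lists; each one could then be started in the background, for instance with subprocess.Popen.

```python
from datetime import date, timedelta


def build_commands(emojis, start, end, lang="en"):
    """Return one Exporter.py argument list per (day, emoji) pair.

    `start` is inclusive and `end` is exclusive, so every command
    covers exactly one day, matching the one-day range described above.
    """
    commands = []
    day = start
    while day < end:
        next_day = day + timedelta(days=1)
        for emoji in emojis:
            commands.append([
                "python3", "Exporter.py",
                "--lang", lang,
                "--querysearch", emoji,
                "--since", day.isoformat(),
                "--until", next_day.isoformat(),
            ])
        day = next_day
    return commands


# Hypothetical input: two arbitrary emojis over two days -> four commands.
example_emojis = ["\U0001F60E", "\U0001F389"]
commands = build_commands(example_emojis, date(2014, 2, 3), date(2014, 2, 5))
```

Keeping each job to a single day and a single emoji makes the jobs independent, so a failed job can be rerun without touching the others.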

In terms of accuracy, the scraper misses some tweets and misclassifies some tweets written in other languages as English. However, the extracted data still provides good insight into emoji frequency.

Data Structure

The data obtained has the following structure:

    "username","date","retweets","favorites","text","geo","mentions","hashtags","id","permalink","emoji"

However, only the date and emoji columns are used for this project.
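
As an illustration of how the two columns could be turned into daily counts, here is a minimal sketch using Python's csv module. It assumes a comma-separated file with the header shown above and a date column that starts with YYYY-MM-DD; the function name daily_emoji_counts and the sample rows are made up for the example.

```python
import csv
import io
from collections import Counter


def daily_emoji_counts(csv_file):
    """Count tweets per (emoji, day), ignoring every other column."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        day = row["date"][:10]  # assumption: the date starts with YYYY-MM-DD
        counts[(row["emoji"], day)] += 1
    return counts


# Made-up sample rows that follow the header from the section above.
sample = io.StringIO(
    '"username","date","retweets","favorites","text","geo",'
    '"mentions","hashtags","id","permalink","emoji"\n'
    '"alice","2014-02-03 10:15","0","1","hi","","","","1","p1","\U0001F60E"\n'
    '"bob","2014-02-03 18:40","2","0","yo","","","","2","p2","\U0001F60E"\n'
    '"carol","2014-02-04 09:05","0","0","ok","","","","3","p3","\U0001F60E"\n'
)
counts = daily_emoji_counts(sample)
```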

Downsampling

The processed data consists of 2405 values holding the daily usage of each emoji over the years. To render the data smoothly in the browser, the Largest-Triangle-Three-Buckets (LTTB) downsampling algorithm is applied to reduce it to 50 data points. The downsampled data keeps the maxima and minima while the points remain reasonably spaced.
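
The idea behind LTTB can be sketched in a few lines of plain Python. This is a generic version of the algorithm written for illustration, not necessarily the code used by the project; the function name lttb and the synthetic series with a single spike are assumptions. The first and last points are always kept, the points in between are split into equal buckets, and from each bucket the point forming the largest triangle with the previously selected point and the average of the next bucket is kept.

```python
def lttb(points, threshold):
    """Downsample a list of (x, y) pairs to `threshold` points with LTTB."""
    n = len(points)
    if threshold >= n or threshold < 3:
        return list(points)  # nothing to reduce

    bucket_size = (n - 2) / (threshold - 2)  # first/last points are fixed
    sampled = [points[0]]
    a = 0  # index of the most recently selected point

    for i in range(threshold - 2):
        # Average of the next bucket (for the last bucket: the final point).
        nxt_start = int((i + 1) * bucket_size) + 1
        nxt_end = min(int((i + 2) * bucket_size) + 1, n)
        nxt = points[nxt_start:nxt_end]
        avg_x = sum(p[0] for p in nxt) / len(nxt)
        avg_y = sum(p[1] for p in nxt) / len(nxt)

        # Pick the point of the current bucket with the largest triangle area.
        start = int(i * bucket_size) + 1
        end = int((i + 1) * bucket_size) + 1
        ax, ay = points[a]
        best_area, best_index = -1.0, start
        for j in range(start, end):
            x, y = points[j]
            area = abs((ax - avg_x) * (y - ay) - (ax - x) * (avg_y - ay)) / 2
            if area > best_area:
                best_area, best_index = area, j
        sampled.append(points[best_index])
        a = best_index

    sampled.append(points[-1])
    return sampled


# Synthetic series: 2405 flat values with one spike at x = 1000.
series = [(i, 0.0) for i in range(2405)]
series[1000] = (1000, 100.0)
reduced = lttb(series, 50)
```

In this synthetic case the spike survives the reduction, which is the property that makes LTTB suitable for charts where peaks matter.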

Chart.js

Chart.js is used to visualize the data. Chart.js is a JavaScript library for creating highly customizable, interactive charts in the browser.
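
As a rough sketch of how downsampled points could be handed to Chart.js, the snippet below builds a line chart configuration in Python and serializes it to JSON. The type, data.labels, data.datasets and options.responsive keys are standard Chart.js configuration fields; the function name line_chart_config and the sample values are hypothetical, and the project's real configuration may look different.

```python
import json


def line_chart_config(label, points):
    """Build a Chart.js line chart configuration from (date, count) pairs."""
    return {
        "type": "line",
        "data": {
            "labels": [day for day, _ in points],
            "datasets": [
                {"label": label, "data": [count for _, count in points]},
            ],
        },
        "options": {"responsive": True},
    }


# Made-up values for illustration only.
config = line_chart_config(
    "\U0001F60E", [("2014-02-03", 120), ("2014-02-04", 95)]
)
config_json = json.dumps(config, ensure_ascii=False)
```

The resulting JSON object can be passed as the second argument to the Chart constructor on the page.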

The following plugins have been used to customize the charts:

  • Rough: adds a cartoon-like style to the charts.
  • Deferred: adds a delay when loading the charts.

Other sources:

The following examples are used on the website:

Some stats:

  • Total tweets: 3,015,922,953
  • Dataset size: 798 GB
  • Tweets scraped per hour: 3.7 million (approx.)
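
A quick back-of-the-envelope check relates these figures to each other. The sketch below assumes decimal gigabytes and a constant scraping rate, neither of which is stated in the source; under those assumptions the totals correspond to roughly 815 hours (about 34 days) of scraping and about 265 bytes stored per tweet.

```python
TOTAL_TWEETS = 3_015_922_953
TWEETS_PER_HOUR = 3_700_000       # approximate rate quoted above
DATASET_BYTES = 798 * 10**9       # assumption: 1 GB = 10^9 bytes

# Time needed to scrape everything at the quoted (parallel) rate.
scraping_hours = TOTAL_TWEETS / TWEETS_PER_HOUR
scraping_days = scraping_hours / 24

# Average storage used per tweet row.
bytes_per_tweet = DATASET_BYTES / TOTAL_TWEETS
```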

Similar projects

There are some similar projects involving tweets and emojis that I used as a source of inspiration, especially Ribbonline, by my friend Dani Balcells.

Other related projects are Emojitracker and Twitter Emoji Race.
