
Comments (4)

slavaspirin commented on August 10, 2024

At the end of today I compared both APIs (REST and streaming). Only 1-4% of tweets have geolocations.
Also, even when I look up every user from the search results, open their page and scroll through the feed, geolocation is most likely either always on or always off.

Taking all accounts into consideration, about 70% have city-level coordinates, ~3% have polygon/point coordinates and the rest have nothing at all.

Another way of getting the geodata would be to go through a user's followers and extract any geolocations present, hoping that somebody in that followers list has posted an exact geolocation. This approach is hit-or-miss, and I am sure I'll hit the rate limit shortly after starting.

Still working on it.

from twitter_sentiment.

winstonll commented on August 10, 2024

OK, let me clarify. So you are saying that only 1-4% of the tweets have a location, right? Does that depend on the person, or is it purely random? For example, consider the following scenarios:

  1. A person either turns location on or off, and so if they turn it on then every tweet by this person will have a location, or
  2. Location is randomly shown and in one's feed there is a 1-4% chance that a location exists?

The former means that we can only get locations for certain people (the ones with location turned on), while the latter means we can get locations for everyone so long as we query enough historical tweets from them.
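
One rough way to tell the two scenarios apart is to look at the per-user share of tweets that carry a location. Under scenario 1 the shares should pile up near 0 and 1; under scenario 2 they should sit close to the overall rate for everyone. The sketch below assumes the tweets have already been reduced to `(user_id, has_location)` pairs; the function name and the sample data are made up for illustration.

```python
from collections import defaultdict


def per_user_geo_share(tweets):
    """Return {user_id: fraction of that user's tweets that carry a location}."""
    # user_id -> [tweets with a location, total tweets]
    counts = defaultdict(lambda: [0, 0])
    for user_id, has_location in tweets:
        counts[user_id][0] += int(has_location)
        counts[user_id][1] += 1
    return {uid: with_loc / total for uid, (with_loc, total) in counts.items()}


# Hypothetical sample: (user_id, has_location) pairs, not real data.
sample = [
    ("a", True), ("a", True), ("a", True), ("a", True),
    ("b", False), ("b", False), ("b", False), ("b", False),
    ("c", False), ("c", True), ("c", False), ("c", False),
]
shares = per_user_geo_share(sample)
for user_id, share in sorted(shares.items()):
    print(user_id, share)
```

With enough tweets per user, a histogram of these shares would show which scenario is closer to reality.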


slavaspirin commented on August 10, 2024
  • On average, people who share their polygon geolocation are 40% likely to share it again in the next post.
  • People who shared their Point location share it again in the next post with a 60% chance.
  • People who generally don't share a location included one in only 1.6% of their posts.

So yes, if a user has user_geolocation enabled, he/she is likely to share a location again, but there is no certainty as to when. Also, my sample sizes were only about 50-100 users, so these numbers can vary.
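
For reference, a "likely to share it again in the next post" figure can be estimated roughly like this. This is only a sketch, not necessarily how the numbers above were produced: it assumes each user's timeline is already in chronological order and reduced to True/False per post (location attached or not), and the function name and sample timelines are hypothetical.

```python
def repeat_share_rate(timelines):
    """Estimate P(next post has a location | current post has a location).

    `timelines` maps a user id to that user's posts in chronological order,
    each post reduced to True (location attached) or False.
    Returns None when no post with a location is followed by another post.
    """
    followed = 0  # posts with a location that have a successor
    repeated = 0  # ...whose successor also has a location
    for posts in timelines.values():
        for current, following in zip(posts, posts[1:]):
            if current:
                followed += 1
                repeated += int(following)
    return repeated / followed if followed else None


# Hypothetical timelines, not real data.
timelines = {
    "a": [True, True, False, True],
    "b": [False, False, True, False],
}
rate = repeat_share_rate(timelines)
print(rate)
```

Running it separately on users who share polygons and users who share points would give one rate per group.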

Do you think we can cluster tweets to the ones that have exact geolocation?


slavaspirin commented on August 10, 2024

Stats for the 120k tweets I have collected so far:

geo_enabled: 40.68% of the tweets

city_level: 68.91%
box_level: 2.43%
point_level: 0.16%
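
A minimal sketch of how a tally like this could be computed from collected tweets. It assumes tweet dicts shaped like Twitter API v1.1 objects (`user.geo_enabled`, `place.place_type`, `place.bounding_box`, `coordinates`). The mapping of those fields to city/box/point levels, and the use of all tweets as the denominator for every figure, are assumptions for illustration and may not match how the numbers above were derived.

```python
from collections import Counter


def geo_level(tweet):
    """Classify one tweet dict. The mapping is an assumption, not the thread's definition."""
    if tweet.get("coordinates"):
        return "point_level"   # exact GeoJSON point attached to the tweet
    place = tweet.get("place") or {}
    if place.get("place_type") == "city":
        return "city_level"    # tagged place is a whole city
    if place.get("bounding_box"):
        return "box_level"     # some other place described by a polygon
    return "none"


def geo_stats(tweets):
    """Return percentages of all tweets for each category."""
    total = len(tweets)
    if total == 0:
        return {}
    counts = Counter(geo_level(t) for t in tweets)
    counts["geo_enabled"] = sum(
        1 for t in tweets if (t.get("user") or {}).get("geo_enabled")
    )
    keys = ("geo_enabled", "city_level", "box_level", "point_level")
    return {k: 100.0 * counts[k] / total for k in keys}


# Hypothetical tweets, not real data.
sample = [
    {"user": {"geo_enabled": True},
     "coordinates": {"type": "Point", "coordinates": [-79.38, 43.65]}},
    {"user": {"geo_enabled": True},
     "place": {"place_type": "city", "bounding_box": {"type": "Polygon"}}},
    {"user": {"geo_enabled": True},
     "place": {"place_type": "poi", "bounding_box": {"type": "Polygon"}}},
    {"user": {"geo_enabled": False}, "place": None, "coordinates": None},
]
stats = geo_stats(sample)
for name, pct in stats.items():
    print(f"{name}: {pct:.2f}%")
```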

