dataquestio / twitter-scrape

Download streaming tweets that match specific keywords, and dump the results to a file.

Python 100.00%

twitter-scrape's People

dennis-leeyinghui stephon94 nachocarracedo annamalainagappan 0xskl felixmichel lfliu ubajakacj-zz jgabriellima adityagogoi enterstudio arnebab ghostlyman jwilber olivierbrouw edu-glez akhilesh1194 venkateshchanda parithy86 deegeorgie krupeshd xiaojieliu7 ankitvelani prafull1249 kevinjyee shaivya-rastogi franciscocobo himanshurepo psyguy danbraunai yongduek pankajmehar winson-li elilienstein sharonteo khwilo danielfenghk salvadharani smsaladi renkasiyas axdliu rahulpargi gouper rohit-yadav jianliu91 guoqi228 osgoodbl kangkang-li benjamin-l-bc siddanth-pai shubham1304 jacobklim dom1984 dapperauteur codingdinocat dsantial waeschinger a12o gonzaloulla ro9ueadmin bladeg0d mark-maker nakosy arslan1979 sudarshanperuru rakeshmadivi dnpthree stephaniewangxy ylehilds abhijitsk beauof fmzingler jersk41 acse-wy619 sohelbaba lorentepol-ironhack-labs arielmichelle cjsackey thoreplass idowuilekura azizulwahid thimontenegro ashwin1910 bulenttokuzlu sauravp10 smoke-pgf thesins ahmadhirthani willcsw leenajenifer daljit3 nimritakoul callysthenes ashrafur22 swariara amankrah ritazheng8blocks cloudbigdatainnovation

twitter-scrape's Issues

dataset.freeze changed to datafreeze.freeze

The authors of the package have put the export functionality from dataset into a new package:

https://github.com/pudo/datafreeze/tree/master/datafreeze

I'm having trouble setting both up at the same time. With the latest dataset version, dump.py no longer works. Which version did you use?
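One way to sidestep the version mismatch is to skip both packages for the export step and read the SQLite file directly. The sketch below uses only the standard library; it is not the repository's dump.py, and the file names and the table name passed in are assumptions to adapt to your own settings.

```python
import csv
import sqlite3


def dump_table_to_csv(db_path, table, csv_path):
    """Write every row of `table` in the SQLite file `db_path` to `csv_path`.

    Returns the number of data rows written (the header is not counted).
    """
    conn = sqlite3.connect(db_path)
    try:
        # Table names cannot be bound as SQL parameters, so quote the identifier.
        quoted = '"' + table.replace('"', '""') + '"'
        cursor = conn.execute("SELECT * FROM " + quoted)
        header = [col[0] for col in cursor.description]
        count = 0
        # UTF-8 output keeps characters such as the ellipsis in tweet text intact.
        with open(csv_path, "w", newline="", encoding="utf-8") as fh:
            writer = csv.writer(fh)
            writer.writerow(header)
            for row in cursor:
                writer.writerow(row)
                count += 1
    finally:
        conn.close()
    return count
```

For example, `dump_table_to_csv("tweets.db", "tweets", "tweets.csv")` would export a table called `tweets`, assuming that is the name your settings use.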

doesn't pull any Twitter data

Using Python 3.5.3 on the most recent Raspbian Stretch (Jan 2018).

I had to change some code, including installing the new datafreeze module (see issue #2), so it now runs without throwing any errors. But it still doesn't seem to connect to Twitter at all. Any ideas?

regarding private.py

I have used my credentials. Can anyone tell me the syntax of the connection string, please?
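The traceback in the "Could not parse rfc1738 URL" issue shows the value being handed to SQLAlchemy, so the connection string is most likely expected to be a SQLAlchemy-style database URL. A minimal sketch of a private.py follows. Only `CONNECTION_STRING` is confirmed by that traceback; the four credential names are assumptions and should match whatever your copy of settings.py and scraper.py actually reads.

```python
# private.py: a sketch, every value below is a placeholder.

# Hypothetical credential names; check scraper.py for the names it expects.
TWITTER_APP_KEY = "your-consumer-key"
TWITTER_APP_SECRET = "your-consumer-secret"
TWITTER_KEY = "your-access-token"
TWITTER_SECRET = "your-access-token-secret"

# SQLAlchemy-style URL of the form dialect://user:password@host/database.
# For SQLite, three slashes mean a file path relative to the working directory.
CONNECTION_STRING = "sqlite:///tweets.db"

# Other shapes the same URL scheme allows (host, user, and database are made up):
#   "postgresql://user:password@localhost:5432/tweets"
#   "mysql+pymysql://user:password@localhost/tweets"
```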

Troubles on DB upload

Hi, I'm working with the project on Python 3.7 and the following message appears:
'latin-1' codec can't encode character '\u2026' in position 139: ordinal not in range(256)
Does anyone have the same problem?
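The message says that something between the script and the database is encoding text as Latin-1, which has no slot for U+2026 (the ellipsis that often ends truncated tweets). The snippet below only reproduces the failure and shows that UTF-8 handles the same string; it does not touch the project's code.

```python
# U+2026 HORIZONTAL ELLIPSIS, the character named in the error message.
text = "Breaking news\u2026"

try:
    text.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    # Latin-1 only covers code points 0-255, so this branch is taken.
    latin1_ok = False

# UTF-8 can represent every Unicode code point, including the ellipsis.
utf8_bytes = text.encode("utf-8")
```

If the database is MySQL (an assumption; the post does not say which backend is in use), a likely remedy is to request UTF-8 on the connection, for example by appending `?charset=utf8mb4` to the connection string, and to make sure the table itself uses a UTF-8 character set.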

I am unable to pull Twitter data...

Can someone help?

private.py

Hi, I am a newbie at this. Where exactly is one supposed to create the private.py file? I created another Python file with the code as given, but when I run the main file, it says "no private module".
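Python resolves `import private` against the directory of the script being run, so the file normally has to sit next to the main script and be named exactly private.py. A small check along those lines, where the helper name is made up for illustration:

```python
import os


def private_module_status(project_dir):
    """Report whether a file named private.py exists directly in project_dir."""
    path = os.path.join(project_dir, "private.py")
    if os.path.isfile(path):
        return "found: " + path
    return "missing: expected " + path
```

Calling `private_module_status(os.getcwd())` from the folder that holds the main script shows whether the file is where the import will look for it.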

Syntax error

A syntax error appears when running scraper.py:

File "C:\Users\Hp\Aswin\lib\site-packages\tweepy\utils.py", line 91
    raise ImportError, "Can't load a json library"
                     ^
SyntaxError: invalid syntax
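The line flagged in the traceback uses the Python 2 form of `raise`, which Python 3 rejects. That suggests the installed tweepy predates Python 3 support, so upgrading tweepy is the likely fix rather than editing the library by hand. The sketch below just demonstrates the difference between the two forms; `compiles` is a helper invented for the demonstration.

```python
# The statement from the traceback (Python 2 only) and its Python 3 equivalent.
py2_source = 'raise ImportError, "Can\'t load a json library"'
py3_source = 'raise ImportError("Can\'t load a json library")'


def compiles(source):
    """Return True if `source` is valid syntax for the running interpreter."""
    try:
        # compile() only parses the code; nothing is executed or raised from it.
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False
```

Under Python 3, `compiles(py2_source)` is false and `compiles(py3_source)` is true.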

Not saving any data to database file

Is there a way to confirm that the code is connecting to Twitter correctly, and not just saving to the .db file? That is my current issue. I have freshly generated authentication keys for the Twitter dev account, and I am trying to figure out where I can get metrics from.
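One rough signal is whether the row count in the database grows while the scraper runs. The following sketch assumes a SQLite file and takes the table name as an argument, since the post does not state either; it reads the file with the standard library only.

```python
import sqlite3


def count_rows(db_path, table):
    """Return the number of rows currently stored in `table`."""
    conn = sqlite3.connect(db_path)
    try:
        # Table names cannot be bound as SQL parameters, so quote the identifier.
        quoted = '"' + table.replace('"', '""') + '"'
        return conn.execute("SELECT COUNT(*) FROM " + quoted).fetchone()[0]
    finally:
        conn.close()
```

Running it twice a minute or so apart and comparing the results shows whether tweets are arriving. If the table does not exist yet, sqlite3 raises `OperationalError`, which itself indicates that nothing has been written.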

scraper.py is also pulling retweets

Consider changing the part of the code where retweets are supposed to be filtered out. According to the Twitter documentation, the 'retweeted_status' object is present only when the tweet is a retweet - https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/intro-to-tweet-json .

The 'retweeted' attribute that you use in your scraper.py script does not exclude retweets (assuming I have understood the logic of your script correctly: you want to filter them out at the beginning of the script). The 'retweeted' attribute "indicates whether this Tweet has been Retweeted by the authenticating user" - https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/tweet-object .

To remove retweets, you can simply check whether the 'status' argument in the on_status() method has the 'retweeted_status' attribute.

I ran the script, and the output currently contains retweets.
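The check described above can be sketched as follows. The `SimpleNamespace` objects are stand-ins for the status objects tweepy passes to on_status(), built here only so the example runs without a Twitter connection; `is_retweet` is a name chosen for illustration.

```python
from types import SimpleNamespace


def is_retweet(status):
    """True when the status carries a `retweeted_status` attribute."""
    return hasattr(status, "retweeted_status")


# Stand-in objects for illustration. Note that `retweeted` is False on both,
# which is why testing that flag does not filter retweets out.
original = SimpleNamespace(text="hello", retweeted=False)
retweet = SimpleNamespace(
    text="RT @someone: hello",
    retweeted=False,
    retweeted_status=SimpleNamespace(text="hello"),
)
```

Inside on_status(), an early `if is_retweet(status): return` would then skip retweets before anything is written to the database.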

Could not parse rfc1738 URL from string

I have all the requirements installed and am running Python 3.5.3 in virtualenvwrapper on the most recent Raspbian Stretch (Jan 2018), but I always get this. Any hints?

Traceback (most recent call last):
  File "scraper.py", line 8, in <module>
    db = dataset.connect(settings.CONNECTION_STRING)
  File "/home/pi/.virtualenvs/testtwitter_scrape/lib/python3.5/site-packages/dataset/__init__.py", line 41, in connect
    ensure_schema=ensure_schema, row_type=row_type)
  File "/home/pi/.virtualenvs/testtwitter_scrape/lib/python3.5/site-packages/dataset/database.py", line 53, in __init__
    self.engine = create_engine(url, **engine_kwargs)
  File "/home/pi/.virtualenvs/testtwitter_scrape/lib/python3.5/site-packages/sqlalchemy/engine/__init__.py", line 419, in create_engine
    return strategy.create(*args, **kwargs)
  File "/home/pi/.virtualenvs/testtwitter_scrape/lib/python3.5/site-packages/sqlalchemy/engine/strategies.py", line 50, in create
    u = url.make_url(name_or_url)
  File "/home/pi/.virtualenvs/testtwitter_scrape/lib/python3.5/site-packages/sqlalchemy/engine/url.py", line 205, in make_url
    return _parse_rfc1738_args(name_or_url)
  File "/home/pi/.virtualenvs/testtwitter_scrape/lib/python3.5/site-packages/sqlalchemy/engine/url.py", line 254, in _parse_rfc1738_args
    "Could not parse rfc1738 URL from string '%s'" % name)
sqlalchemy.exc.ArgumentError: Could not parse rfc1738 URL from string ''
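The last line of the traceback shows that the string being parsed is empty (`''`), so `settings.CONNECTION_STRING` was never given a value. A small guard like the one below, run before `dataset.connect`, would turn that into a clearer message. The function name and the example URL in the message are illustrative, not part of the project.

```python
def check_connection_string(value):
    """Return `value` unchanged, or raise ValueError if it cannot be a database URL."""
    if not value or not value.strip():
        raise ValueError(
            "CONNECTION_STRING is empty; set it to a database URL "
            "such as 'sqlite:///tweets.db'"
        )
    # SQLAlchemy-style URLs have the form dialect://..., so require the separator.
    if "://" not in value:
        raise ValueError(
            "CONNECTION_STRING %r does not look like dialect://... " % value
        )
    return value
```

With this in place, `check_connection_string("")` raises a `ValueError` that names the setting, instead of the rfc1738 parse error from deep inside SQLAlchemy.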

Requirements of libicu-dev

Hi!

Good job with twitter-scrape. I am opening this issue to suggest adding a section on system requirements. For example, PyICU needs libicu-dev to be installed, and without it pip install -r requirements fails.

Just a suggestion.

Regards!
