taspinar / twitterscraper

Scrape Twitter for Tweets

License: MIT License

Python 99.54% Dockerfile 0.46%

twitterscraper's People

masdevid sils errakeshpd crishernandezmaps rainfireliang gayathrisampath1 schollz wangchen1117 sahwar wangjunbo571 gustavoaires bskrishna77 matthewstidham rsesha ashgreat ricky-wilson samanthaklee bluegoo192 gridl dariolourenco gnanam336 abhinavsohani gyuhwung ningweiii darthbhyrava allurivijay polinas123 shokesu teb5240 alegomes katronai l3r1nmax indrajitharidas mstei4176 damellp adupuis2 aneliram89 blogedwin adhiravishankar tacnayn chubbymaggie infinitiii geapoch ikario404 halprez lisandro11 tonkpo mvdwaeter milosmladenovic5 dkennedy778 may215 calclavia d0tn3t shubhamdipt citronicgearon kolliparap weeshlow jhchiu1 kimmorsha cpl tumtumbear wonseokch plightt moodlezoup youjin-c acesounderglass arp12 pavanjuturu grevutiu-gabriel vikaskodag2 blue2161 abdzrahim oattie taniajacob atuljha23 jellyjr bayupaoh alioh rajat-np bradmonk christinazxy daewonseo ewertonsantiago fo0nikens jicksonp hinsencamp hieuqtran lapp0 etemiz a-moss dkoguciuk mittalakshay6 ecmyhre dfirgeek julianmack nileshjorwar beliscime selvakarthik21 mirving9 jacklaurencegaray

twitterscraper's Issues

Extract tweets using user handler and the problem with number of retweets and likes?

Sir,
I am getting the data, but the number of retweets and likes is always shown as zero. Why is that? Also, is there a way to extract the tweets of a specific person by username?

More attributes

Is it possible to get more attributes like number of retweets, replies, and favorites? This is a feature request I guess

Python 3 support

Would be nice to be able to use this in python 3.

(pythonclock.org :))

ImportError: No module named 'tweet'

I get the following error when trying to use this.
Installed in a venv via pip

Traceback (most recent call last):
  File "collector.py", line 1, in <module>
    import twitterscraper
  File "/home/m0hawk/Documents/dev/TUHH/testvenv/lib/python3.5/site-packages/twitterscraper/__init__.py", line 13, in <module>
    from twitterscraper.query import query_tweets
  File "/home/m0hawk/Documents/dev/TUHH/testvenv/lib/python3.5/site-packages/twitterscraper/query.py", line 14, in <module>
    from tweet import Tweet
ImportError: No module named 'tweet'

Error: message "Error occurred during getting browser"

I successfully installed twitterscraper on my notebook, using Linux. But, when I tried to run it, I got the following error message:

"Error occurred during getting browser"

What should I do?

Thanks.

Use proper dates

We're currently extracting human-readable timestamps. However, the span inside the a element has a data-time-ms attribute that contains the exact time: <span class="_timestamp js-short-timestamp " data-aria-label-part="last" data-time="1476057559" data-time-ms="1476057559000" data-long-form="true">Oct 9</span>. Parsing the human-readable string into proper date objects is almost impossible: it sometimes contains AM/PM and sometimes not, sometimes has dots here and there, and occasionally I get localized month names.
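A minimal sketch of the idea in this issue, using only the standard library: it pulls data-time-ms out of the span quoted above and converts it to a timezone-aware datetime. The helper name parse_tweet_time is hypothetical and is not part of twitterscraper. A regular expression is used only to keep the example self-contained; a real HTML parser would be the more robust choice.

```python
import re
from datetime import datetime, timezone

# The span quoted in the issue, reproduced verbatim.
SPAN = ('<span class="_timestamp js-short-timestamp " data-aria-label-part="last" '
        'data-time="1476057559" data-time-ms="1476057559000" '
        'data-long-form="true">Oct 9</span>')


def parse_tweet_time(span_html):
    """Return the tweet time as a UTC datetime (hypothetical helper)."""
    match = re.search(r'data-time-ms="(\d+)"', span_html)
    if match is None:
        raise ValueError("no data-time-ms attribute found")
    millis = int(match.group(1))
    # data-time-ms is milliseconds since the Unix epoch.
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


print(parse_tweet_time(SPAN).isoformat())  # 2016-10-09T23:59:19+00:00
```

The result agrees with the "Oct 9" label in the span, without any locale-dependent string parsing.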

How to scrape users' tweets

I'm trying to extract a specified user's tweets, using this command line: twitterscraper Trump --limit 100 --output=tweets.json

It just extracts all tweets in which the person's name is mentioned, instead of the user's own tweets.

My question: how can I extract all of a specified user's tweets?
Thank you...

Advanced query

The documentation says:
"You can use any advanced query twitter supports. Simply compile your query at https://twitter.com/search-advanced."

Let's say I try to get all tweets from user 'username'.
I get the URL https://twitter.com/search?f=tweets&q=from%3Ausername&src=typd
Which part (if not the whole URL) is the query?
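A small sketch that may help with this question. It assumes the query is the value of the q parameter in the search URL, which is consistent with the README example quoted elsewhere on this page (twitterscraper Trump%20since%3A2017-01-03%20until%3A2017-01-04). The function name query_from_search_url is made up for illustration.

```python
from urllib.parse import parse_qs, quote, urlparse

url = "https://twitter.com/search?f=tweets&q=from%3Ausername&src=typd"


def query_from_search_url(search_url):
    """Return the decoded value of the q parameter (hypothetical helper)."""
    params = parse_qs(urlparse(search_url).query)
    return params["q"][0]


query = query_from_search_url(url)
print(query)         # from:username
print(quote(query))  # from%3Ausername
```

Under that assumption, the percent-encoded form on the last line is what would be passed on the command line in place of a plain keyword.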

Can't get data earlier than 12 days ago

Hi Taspinar and Sils,

I was collecting last year's movie data today, and it seems the date issue is occurring again: I cannot get data from earlier than 12 days ago :( and I have tried many times. It's as if some sort of notification let Twitter know I was trying to go back further than 12 days. How can I solve this problem?

Thank you so much!

Parallel scraping doesn't seem to work

I made a few logging modifications. If you check out https://github.com/sils/twitterscraper/tree/sils/parallel and scrape for test or something like that, you'll sometimes get only 60ish tweets for some parts of months, which seems rather impossible (and doesn't check out if you put the advanced query into the search UI).

@taspinar if you have any idea that'd help a lot :/

Add CI and at least like one integration test

So we see everything works on the versions we want to support, should be easy to do.

JSONify tweets properly

The namedtuple just jsonifies tweets as tuples. It would be better to be more dict-like and have the member names as keys in the output JSON.
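To illustrate the difference, here is a minimal standard-library sketch. The Tweet namedtuple and its three fields are stand-ins chosen for the example, not the project's actual class.

```python
import json
from collections import namedtuple

# Hypothetical stand-in for the scraper's tweet type.
Tweet = namedtuple("Tweet", ["user", "id", "text"])

tweet = Tweet(user="example_user", id="1", text="hello")

# A namedtuple is serialized like any tuple: a JSON array without keys.
as_list = json.dumps(tweet)

# _asdict() keeps the field names, so they become keys in the JSON object.
as_dict = json.dumps(tweet._asdict())

print(as_list)  # ["example_user", "1", "hello"]
print(as_dict)  # {"user": "example_user", "id": "1", "text": "hello"}
```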

Warning: this package expressly violates Twitter's TOS

https://twitter.com/en/tos

"scraping the Services without the prior consent of Twitter is expressly prohibited"

except urllib2.HTTPError, e: (invalid syntax)

Hello again!

I've just tried my previous script in Python 3 and immediately got this error:

  File "myscript.py", line 4, in <module>
    from twitterscraper import TwitterScraper
  File "/usr/local/lib/python3.4/dist-packages/twitterscraper/TwitterScraper.py", line 109
    except urllib2.HTTPError, e:
                            ^
SyntaxError: invalid syntax

Maybe it's a naive alternative, but I recently discovered requests and found this module more powerful than urllib. Here are some scraping examples with requests!
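For reference, the syntax error comes from the Python 2 form "except X, e:", which Python 3 rejects. The "except X as e:" form works on Python 2.6+ and Python 3. The sketch below uses urllib.error.HTTPError from the Python 3 standard library; describe_failure and failing_fetch are made-up names, and the error is raised by hand so that no network access is needed.

```python
import io
from urllib.error import HTTPError


def failing_fetch():
    # Stand-in for a real request: raise the error a server might cause.
    raise HTTPError("https://example.invalid", 503, "Service Unavailable",
                    {}, io.BytesIO())


def describe_failure(fetch):
    try:
        fetch()
    except HTTPError as e:  # Python 2 only: "except HTTPError, e:"
        return "HTTP error %d" % e.code
    return "ok"


print(describe_failure(failing_fetch))  # HTTP error 503
```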

Zero Result

Hello @taspinar

I recently ran twitterscraper from my command line.

C:\Python27\Scripts\twitterscraper Telkomsel -o tweets.json

Unfortunately, this returns zero results. But if I add another keyword, like Telkomsel mengecewakan, it returns the tweets related to those keywords.

On the other hand, if I write

C:\Python27\Scripts\twitterscraper Trump -o tweets.json

it runs very well.

Why does this happen?

This is weird. I checked Telkomsel on Twitter: sometimes it reloads and sometimes it gets stuck completely. Is this a Twitter bug?

If there's too many tweets for one day on non parallel scraping it scrapes the same day forever again and again

And there's no way we can get all the tweets for that day I presume.

AttributeError: Tweet instance has no attribute 'encode'

I'm simply running the code from the readme file and I keep getting errors. Can you help with this error?

And also, this:
twitterscraper Trump -l 100 -o tweets.json
Produces Error:

Allow getting lots of tweets

Apparently after 100000 tweets or so twitter stops serving new pages.

Inconsistent results among multiple runs

I am using twitterscraper to get the replies to some twitter accounts.

I am running the following queries as a test:

to%3Amatteorenzi%20since%3A2017-08-21%20until%3A2017-08-27
to%3Amatteosalvinimi%20since%3A2017-08-21%20until%3A2017-08-27

When performing multiple runs I get a different amount of results each time as below, with left number being the result of first query and right one for the second. Each line is a different run.

544, 4216
386, 4121
295, 4180

Why does this happen? Any way I can prevent it?

Encoding issue when applying the script to non-English language

Dear author,
Thanks very much for your kind work! I am a beginner in Python programming and hope this will not trouble you too much.
My problem is that I want to apply the script to mining non-English text (from the Twitter advanced search page), such as "戦う", but non-English text in the output file is always displayed as escaped bytes like "'\xe7\x8e\xb2\xe9\ ... ..."
Even when I use the command "print(tweet.text.encode('utf-8'))" (or another encoding), the output is still the same.
Is there a specific way to display the non-English text correctly?
Thanks!
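A short sketch of what is probably going on, assuming Python 3: calling encode() turns the text into a bytes object, and printing bytes always shows \x escapes. Keeping the value as str, and passing ensure_ascii=False when writing JSON, preserves the readable characters. The sample text is the one from the issue; everything else is illustrative.

```python
import json

text = "戦う"

# encode() produces bytes; their repr is the escaped form seen in the issue.
encoded = text.encode("utf-8")
escaped_repr = repr(encoded)

# Decoding the bytes gives the original text back.
round_trip = encoded.decode("utf-8")

# By default json.dumps escapes non-ASCII; ensure_ascii=False keeps it readable.
json_line = json.dumps({"text": text}, ensure_ascii=False)
```

On a terminal that uses UTF-8, print(text) and print(json_line) then show the Japanese characters directly, while print(encoded) shows the escapes.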

Make script name lower case

Casing only introduces problems and confusion, also the package name on pypi is lowercase.

How can I get the location (from the user profile) and geolocation of a tweet

I'm new to Python.
This script works fine, but I also want the user's location as given in their profile, and the tweet's geolocation.
How can I get this information?

Issues with since and until in commandline

twitterscraper "%24PEP"%20since%3A2017-10-05 -o pep.out

This works, but when I run

twitterscraper "%24PEP"%20since%3A2017-10-05%20until%3A2017-10-05 -o pep.out

it doesn't work.

I.e., I want to limit the results to a single day, but that won't work.
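One possible explanation, stated here as an assumption rather than confirmed behaviour: if the until: operator excludes the given date, then a query with the same since and until date covers an empty range, and a single day needs until set to the following day. The sketch below builds such a query; single_day_query is a hypothetical helper.

```python
from datetime import date, timedelta
from urllib.parse import quote


def single_day_query(term, day):
    """Build a query for one calendar day, assuming until: is exclusive."""
    next_day = day + timedelta(days=1)
    return "%s since:%s until:%s" % (term, day.isoformat(), next_day.isoformat())


query = single_day_query("$PEP", date(2017, 10, 5))
print(query)         # $PEP since:2017-10-05 until:2017-10-06
print(quote(query))  # %24PEP%20since%3A2017-10-05%20until%3A2017-10-06
```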

Python 3 Rewrite

Hi,

I did a python 3 rewrite. It's a bit shorter, only takes about 90 LOC, and has a cleaner API IMO. It supports arbitrary queries and I basically got rid of the IO and a lot of other stateful stuff: https://github.com/sils/twitterscraper/blob/sils/auth/scrape_from_author.py

install error

It installed correctly via pip and from source, but when I try to use the CLI or the Python shell I get this:

from twitterscraper import query_tweets
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "twitterscraper/__init__.py", line 13, in <module>
    from twitterscraper.query import query_tweets
  File "twitterscraper/query.py", line 10, in <module>
    from twitterscraper.tweet import Tweet
  File "twitterscraper/tweet.py", line 3, in <module>
    from bs4 import BeautifulSoup
  File "/usr/local/lib/python2.7/dist-packages/bs4/__init__.py", line 30, in <module>
    from .builder import builder_registry, ParserRejectedMarkup
  File "/usr/local/lib/python2.7/dist-packages/bs4/builder/__init__.py", line 314, in <module>
    from . import _html5lib
  File "/usr/local/lib/python2.7/dist-packages/bs4/builder/_html5lib.py", line 70, in <module>
    class TreeBuilderForHtml5lib(html5lib.treebuilders._base.TreeBuilder):
AttributeError: 'module' object has no attribute '_base'

I also have the Twitter API installed.

Begin and end dates are no longer working after latest commit

Returns 503 error when trying to specify dates

FakeUserAgentError: Error occurred during getting browser

Running twitterscraper, I ran into this error using the example given in the readme twitterscraper Trump%20since%3A2017-01-03%20until%3A2017-01-04 -o tweets.json

I was running a version from March and then upgraded to the latest master.zip but I still got the same error... Any ideas on how to resolve this? I'm running Ubuntu 16.04...

Traceback (most recent call last):
  File "/usr/local/bin/twitterscraper", line 9, in <module>
    load_entry_point('twitterscraper==0.3.1', 'console_scripts', 'twitterscraper')()
  File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 542, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 2569, in load_entry_point
    return ep.load()
  File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 2229, in load
    return self.resolve()
  File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 2235, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "build/bdist.linux-x86_64/egg/twitterscraper/__init__.py", line 13, in <module>
  File "build/bdist.linux-x86_64/egg/twitterscraper/query.py", line 14, in <module>
  File "/usr/local/lib/python2.7/dist-packages/fake_useragent/fake.py", line 139, in __getattr__
    raise FakeUserAgentError('Error occurred during getting browser')  # noqa
fake_useragent.errors.FakeUserAgentError: Error occurred during getting browser

Geo data?

Is there any way to scrape geo data without using the API? This isn't an issue, it's more of a question. I've been searching for a while and I can't seem to find anything.

Add ability to output to stdout rather than output to file

Reading the stdout of a command is much more efficient when handling a lot of requests, rather than taxing the server memory by creating many output json files. I believe that having an option to output to the console as stdout rather than outputting to a file would be a great feature that would expand the way that people can use this project.
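A rough sketch of what such an option could do, not the project's implementation: serialize each tweet as one JSON object per line and write it to standard output, so that another process can read the stream. The function name and the sample record are made up for illustration.

```python
import json
import sys


def tweets_to_json_lines(tweets):
    """Return one JSON object per line for the given tweet dicts."""
    return "".join(json.dumps(t, sort_keys=True) + "\n" for t in tweets)


# Hypothetical sample record; real tweets would come from the scraper.
sample = [{"user": "example_user", "text": "hello"}]

# Writing to stdout lets the caller pipe the output instead of reading a file.
sys.stdout.write(tweets_to_json_lines(sample))
```

One object per line is chosen here so a consumer can process tweets as they arrive instead of waiting for a closing bracket.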

Source parameter is not passed accurately to the script

When running twitterscraper from the command line, the source parameter is not passed accurately to the script if the source name is wrapped in quotation marks.
Example:
#news AND source:"Twitter for Android"
twitterscraper %23news%20AND%20source%3A"Twitter%20for%20Android" --output=tweets_new_Android.json

tweets_new_Android.json is empty, but https://twitter.com/search?q=%23news%20AND%20source%3A%22Twitter%20for%20Android%22&src=typd shows results.
It works for a source name without spaces:
#news AND source:"Tweetdeck"
twitterscraper %23news%20AND%20source%3A"Tweetdeck" --output=tweets_new_Tweetdeck.json
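A likely cause, assuming a POSIX-style shell (Windows cmd behaves differently): the shell removes unescaped double quotes before the program sees its arguments. The sketch below uses shlex.split from the standard library, which follows POSIX quoting rules, to show what the script would receive. Encoding the quotes as %22, as in the working search URL above, keeps them in the argument.

```python
import shlex

command = ('twitterscraper %23news%20AND%20source%3A"Twitter%20for%20Android" '
           '--output=tweets_new_Android.json')

# shlex.split mimics how a POSIX shell would split and unquote the line.
received = shlex.split(command)[1]
print(received)  # %23news%20AND%20source%3ATwitter%20for%20Android

# With the quotes percent-encoded, nothing is left for the shell to strip.
escaped = ('twitterscraper %23news%20AND%20source%3A%22Twitter%20for%20Android%22 '
           '--output=tweets_new_Android.json')
print(shlex.split(escaped)[1])
```

This would also explain why the Tweetdeck example works: the quotes are stripped there too, but a single-word source does not need them.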

Syntax error: invalid character in identifier line 9

Successfully installed twitterscraper in Python36 but I get the above message from the CMD prompt indicating a problem with the filename.
I think that it is because there is no data to store in the file as there is nothing onscreen from the "print(tweet)" in line 6

Please help (python novice)

John

AttributeError on module requests

Hello @taspinar

I just found a bug while scraping tweets. When my connection is unstable, I get the following error message:

ERROR:root:An unknown error occurred! Returning tweets gathered so far.
Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\twitterscraper\query.py", line 93, in query_tweets_once
    pos is None
  File "C:\Python27\lib\site-packages\twitterscraper\query.py", line 53, in query_single_page
    except requests.exception.ConnectionError as e:
AttributeError: 'module' object has no attribute 'exception'

Solved, I just needed to upgrade.

Number of tweets in final JSON file much smaller than reported during run

So I ran the scraper for a tweeting period of around a year, with the limit of 40.000, so:

twitterscraper "%23bitcoin AND %23bubble since%3A2016-09-01 until%3A2017-10-10&src=typd" -l 40000 -o bitcoinbubble.json

While running it counted all the way up to 40 thousand:
INFO: Got 39953 tweets (18 new).
INFO: Got 39971 tweets (19 new).
INFO: Got 39990 tweets (17 new).
INFO: Got tweets ranging from 2017-09-08 to 2017-10-09

But when I load the JSON file, it only contains 1528 tweets. What explains this?

Advance Query

Hello @taspinar

I'm new at programming.

Could you please give an example of an advanced query, in particular scraping by location and a specific time?

Thank you

twitterscraper: Command not found

So whenever I am trying to run this command on my server, it's saying "Command not found". I have installed it in my home directory. Please help. Any help would be appreciated.

control-C Does not seem to stop Parser execution or save results

If I Control-C out of the command-line execution, the program does not seem to save its results anywhere. The program also continues its execution on a second iteration, which is not always desired. I ran a large search last night on separate machines and neither of them saved their search data when Control-C was used.
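One common pattern for the behaviour requested here, sketched with the standard library only: catch KeyboardInterrupt around the collection loop, stop, and write out whatever has been gathered. This is not the project's code; collect_and_save and interrupted_source are hypothetical, and the interrupt is simulated so the example runs unattended.

```python
import json
import os
import tempfile


def collect_and_save(source, output_path):
    """Collect items until done or interrupted, then always save them."""
    collected = []
    try:
        for item in source:
            collected.append(item)
    except KeyboardInterrupt:
        # Ctrl-C: stop collecting, but fall through to the save below.
        pass
    with open(output_path, "w", encoding="utf-8") as fh:
        json.dump(collected, fh)
    return collected


def interrupted_source():
    yield {"text": "first"}
    yield {"text": "second"}
    raise KeyboardInterrupt  # simulates the user pressing Ctrl-C


path = os.path.join(tempfile.mkdtemp(), "partial.json")
saved = collect_and_save(interrupted_source(), path)
print(len(saved), "tweets saved to", path)
```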

Date issue occurring again - Cannot pass in a date earlier than one week ago

returning usernames, not tweets

import twitterscraper as ts
usr = 'kingjames'
for tweet in ts.query_tweets(usr,10)[:10]:
    print(tweet.user.encode('utf-8'))

#out:
b'Rypuur'
b'Powperezdiez'
b'joey_a_george'
b'mikey_rakkar'
b'yarapgv'
b'V_Nasty10'
b'downtownbrownxx'
b'DeclanJoyce'
b'atnissaa'
b'WestifiedMJ'

Add option to query by language

Given that this is just parameter in the Twitter API, it should be easy to do, and frustrating that it isn't already available

Scrape tweet url

Seems to be nestled in data-permalink-path, should be an easy scrape

Likes, Retweets, Replies not being parsed if (> 999)

Tweet data for these fields is not being properly parsed if the values exceed 999.

I suspect that it relates to the fact that Twitter displays those values with letters in them. eg, "1.1k" instead of 1100.

In any case, Twitterscraper returns those values as 0.
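If the suspicion above is right, a parser along these lines could handle abbreviated counts. It is only a sketch: the issue mentions "1.1k", and the handling of "m", of thousands separators, and of empty labels is an assumption about what the page might show. parse_count is a hypothetical name.

```python
from decimal import Decimal

# Assumed suffixes; only "k" is mentioned in the issue.
_SUFFIXES = {"k": 1000, "m": 1000000}


def parse_count(label):
    """Convert a displayed count such as '1.1k' to an integer."""
    cleaned = label.strip().lower().replace(",", "")
    if not cleaned:
        return 0
    multiplier = 1
    if cleaned[-1] in _SUFFIXES:
        multiplier = _SUFFIXES[cleaned[-1]]
        cleaned = cleaned[:-1]
    # Decimal avoids float rounding surprises such as 1.1 * 1000.
    return int(Decimal(cleaned) * multiplier)


print(parse_count("1.1k"))  # 1100
print(parse_count("999"))   # 999
```

Note that an abbreviated label only gives an approximate value; the exact count would have to come from another attribute, if the page provides one.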

Missing the Output JSON file ...

This question might sound silly, but I am able to use TwitterScraper successfully with the command twitterscraper "" --output=tweets.json, yet I am unable to find my JSON file. Logging shows that data is being collected, for example:
INFO: Got 137 tweets (20 new).
INFO: Got 157 tweets (20 new).
INFO: Got 177 tweets (19 new).
INFO: Got 196 tweets (19 new).
INFO: Got 215 tweets (17 new).
INFO: Got 232 tweets (20 new).
INFO: Got 252 tweets (19 new).
Specifying the exact path /Users/blahblah/tweets.JSON did not make a difference.
What am I missing? Thanks in advance for your help.

Limit is inconsistent with -l flag

I have been running the following command:
twitterscraper trump -l 3 -o tweets.json, which I figured would limit the amount of tweets to 3, according to the documentation.

Why is it that -l is not limiting the tweet download to just 3? I'm assuming this is not intended behavior. I have also tested this with -l at a higher integer, and when set to -l 30, it always downloads 40 tweets.

I'm thinking that this behavior is caused by new tweets being tweeted as the scraper is running? Twitter briefly explains this in this article: https://developer.twitter.com/en/docs/tweets/timelines/guides/working-with-timelines

The output of tweets.json is the following when using --limit 3 (contains 20 tweets):

[{"timestamp": "2017-11-02T18:26:36", "text": "trump owns it now since he gutted the subsidies.", "user": "MoOkonski", "retweets": "0", "replies": "0", "fullname": "Maureenski", "id": "926153585397780480", "likes": "0"}, {"timestamp": "2017-11-02T18:26:36", "text": "Congress, impeach Trump or resign \u2026http://makeamericagreatagainreally.blogspot.com/2017/10/the-workings-of-donald-j-trumps-mind.html\u00a0\u2026 #Congress #impeachmentpic.twitter.com/lQz5q6ZW5Z", "user": "THIRDSTONE56", "retweets": "0", "replies": "0", "fullname": "THIRD STONE", "id": "926153585750085632", "likes": "0"}, {"timestamp": "2017-11-02T18:26:36", "text": "#trump ahora es un asesino tambi\u00e9n.", "user": "rikrdotc", "retweets": "0", "replies": "0", "fullname": "Ricardo C", "id": "926153585800482817", "likes": "0"}, {"timestamp": "2017-11-02T18:26:36", "text": "Donna Brazile: I found 'proof' the DNC rigged the nomination for Hillary Clinton #DrainTheSwamp #Trump POTUS http://www.foxnews.com/politics/2017/11/02/donna-brazile-found-proof-dnc-rigged-nomination-for-hillary-clinton.html\u00a0\u2026", "user": "DavidDoright", "retweets": "0", "replies": "0", "fullname": "D.W.Trump\u00a0\ud83c\uddfa\ud83c\uddf8", "id": "926153586098294785", "likes": "0"}, {"timestamp": "2017-11-02T18:26:36", "text": "Trump to press for end to North Korea nuclear program on Asia trip: White House http://ift.tt/2z9xKoh\u00a0", "user": "BreakingNewss3", "retweets": "0", "replies": "0", "fullname": "Breaking News", "id": "926153586958053376", "likes": "0"}, {"timestamp": "2017-11-02T18:26:36", "text": "Nixon used his China trip as distraction to investigations of him. Trump going to Asia; echoes of the same or misdirect to a deeper issue.", "user": "TalkinToU", "retweets": "0", "replies": "0", "fullname": "TalkinToU", "id": "926153587268263936", "likes": "0"}, {"timestamp": "2017-11-02T18:26:38", "text": "George Papadopoulos was much more than what Trump says he was. 
https://twitter.com/SethAbramson/status/925923595079045120\u00a0\u2026", "user": "Resistacat", "retweets": "0", "replies": "0", "fullname": "Dee Ramee", "id": "926153592427466753", "likes": "0"}, {"timestamp": "2017-11-02T18:26:38", "text": "Mysterious Trump backer Mercer stepping down at fund, selling Breitbart stake. #Trump #Breibarthttps://www.cnbc.com/2017/11/02/billionaire-trump-backer-robert-mercer-to-step-down-from-hedge-fund.html\u00a0\u2026", "user": "PSuiteNetwork", "retweets": "0", "replies": "0", "fullname": "John Cutler", "id": "926153593635459072", "likes": "0"}, {"timestamp": "2017-11-02T18:26:38", "text": "This is far from over. Wait for it. And the collusion won't be over the election it will be over Trump's shady business dealings in Russia", "user": "HarryJoachim", "retweets": "0", "replies": "0", "fullname": "Harry Joachim", "id": "926153594939871234", "likes": "0"}, {"timestamp": "2017-11-02T18:26:38", "text": "Donna Brazil confession: Trump & Bernie were right, the DNC rigged the nomination for Hillary, big league!!\n\nhttps://townhall.com/tipsheet/guybenson/2017/11/02/donna-brazile-trump-and-bernie-were-right-the-dnc-rigged-it-for-hillary-big-league-n2403847\u00a0\u2026", "user": "LovToRideMyTrek", "retweets": "0", "replies": "0", "fullname": "BOYCOTT HOLLYWOOD\u00a0\ud83c\udf83", "id": "926153595346616327", "likes": "0"}, {"timestamp": "2017-11-02T18:26:39", "text": "Why Harry Belafonte's Warning About Trump Is Important Now More Than Ever. 
Read here: http://allthat.tv/posts/why-harry-belafonte-s-warning-about-trump-is-important-now-more-than-ever\u00a0\u2026", "user": "ArmChairPundt", "retweets": "0", "replies": "0", "fullname": "Lachelle", "id": "926153596340649984", "likes": "0"}, {"timestamp": "2017-11-02T18:26:39", "text": "Thank Trump for that", "user": "DennisG_Shea", "retweets": "0", "replies": "0", "fullname": "Dennis Shea", "id": "926153596567203841", "likes": "0"}, {"timestamp": "2017-11-02T18:26:39", "text": "GREAT AGAIN: POTUS Trump Announces $100 Billion Company\u2019s Return To USA (VIDEO)\nhttps://goo.gl/SaF4Us\u00a0\n\nNovember 2, 2017\nby Joshua ...pic.twitter.com/dL0nG1oOT8", "user": "warfarenews", "retweets": "0", "replies": "0", "fullname": "Warfare Web", "id": "926153596629950464", "likes": "0"}, {"timestamp": "2017-11-02T18:26:39", "text": "Time To Turn The Channel. I Can Only Handle So Much In One Day Of Trump & The Counterfeit Assholes Surrounding Him! LIES-LIES-LIES!!", "user": "Brokenknee1Jim", "retweets": "0", "replies": "0", "fullname": "James", "id": "926153597427113984", "likes": "0"}, {"timestamp": "2017-11-02T18:26:39", "text": "Trump doesn\u2019t really want you to know Obamacare enrollment just started -- By @svdate https://www.huffingtonpost.com/entry/trump-obamacare-enrollment_us_59fa3adfe4b01b47404810d0?ncid=engmodushpmg00000004\u00a0\u2026 via @HuffPostPol", "user": "michaellamperd", "retweets": "0", "replies": "0", "fullname": "Mick", "id": "926153597489881088", "likes": "0"}, {"timestamp": "2017-11-02T18:26:39", "text": "The Trump-Russia dossier cost $168,000, not $12 million, like president claimed http://www.newsweek.com/trump-dossier-cost-millions-699816\u00a0\u2026", "user": "XtyMiller", "retweets": "0", "replies": "0", "fullname": "Kilikina", "id": "926153598140014592", "likes": "0"}, {"timestamp": "2017-11-02T18:26:39", "text": "MOMENTS AGO: Pres. 
Trump: \"Congress must end chain migration so that we can have a system that is security based, not the way it is now.\"...", "user": "The_News_Corner", "retweets": "0", "replies": "0", "fullname": "Ok", "id": "926153598202912769", "likes": "0"}, {"timestamp": "2017-11-02T18:26:39", "text": "Trump Is Quietly Deregulating All the Things | Brittany Hunter https://fee.org/articles/trump-is-quietly-deregulating-all-the-things/\u00a0\u2026 via @feeonline", "user": "badcraigsnews", "retweets": "0", "replies": "0", "fullname": "Badcraigsnews", "id": "926153598433660928", "likes": "0"}, {"timestamp": "2017-11-02T18:26:39", "text": "Trump to press for end to North Korea nuclear program on Asia trip: White House http://ift.tt/2z7GXgZ\u00a0", "user": "_politic_us_", "retweets": "0", "replies": "0", "fullname": "Audrey", "id": "926153598437818368", "likes": "0"}, {"timestamp": "2017-11-02T18:26:39", "text": "House Democrats file lawsuit over access to Trump hotel documents - Politico https://www.politico.com/story/2017/11/02/trump-hotel-documents-lawsuit-244455\u00a0\u2026", "user": "PS641600", "retweets": "0", "replies": "0", "fullname": "PeterS", "id": "926153598446301184", "likes": "0"}]

Unknown Error

while running TwitterScraper "test" --output tweets.json --all for ~10 minutes

ERROR: An unknown error occurred! Returning tweets gathered so far.
Traceback (most recent call last):
  File "/home/lasse/prog/tie/twitterscraper/twitterscraper/query.py", line 96, in query_tweets_once
    pos is None
  File "/home/lasse/prog/tie/twitterscraper/twitterscraper/query.py", line 46, in query_single_page
    tweets = list(Tweet.from_html(html))
  File "/home/lasse/prog/tie/twitterscraper/twitterscraper/tweet.py", line 34, in from_html
    yield cls.from_soup(tweet)
  File "/home/lasse/prog/tie/twitterscraper/twitterscraper/tweet.py", line 19, in from_soup
    user=tweet.find('span', 'username').text[1:],
AttributeError: 'NoneType' object has no attribute 'text'

Output file created even if no tweets are found

This results in a bunch of empty files and incrementally increasing filenames, can be annoying for testing.

UTF-8 self.writer.writerow(post) issue

Hello !

I'm trying to scrape every tweet from an account. My script is quite simple:

#!/usr/bin/env python
# encoding: utf-8

from twitterscraper import TwitterScraper

topic = ""
cible = "username"
filename = 'username_tweets.csv'
scraper = TwitterScraper.Scraper(topic, 21000, authors=cible, filename=filename)
scraper.scrape()

It works for hundreds of tweets, but then I get this error:

Traceback (most recent call last):
  File "myscript.py", line 10, in <module>
    scraper.scrape()
  File "/usr/local/lib/python2.7/dist-packages/twitterscraper/TwitterScraper.py", line 148, in scrape
    self.write(post)
  File "/usr/local/lib/python2.7/dist-packages/twitterscraper/TwitterScraper.py", line 136, in write
    self.writer.writerow(post)

(Yes, I'm using python 2.7, don't know if the problem came from here or not)

Thanks in advance

set a query for a specific time
save the data on the hard drive

Probably this is just me being new to Python, but a general documentation with a brief description for each functionality would also be nice.

Thanks in advance!

Accessing JSON file with stored tweets

After carrying out the CLI search, how do I access the JSON file in which the tweets are stored?
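A minimal sketch of reading such a file with the standard library. It assumes the output is a JSON array of objects with keys like "user", "text" and "likes", as in the sample output shown in the "Limit is inconsistent with -l flag" issue. To keep the example runnable, it first writes a small sample file to a temporary directory; with real data, path would be the file given to --output. A relative file name is resolved against the directory the command was run from.

```python
import json
import os
import tempfile

# Stand-in for the file the CLI would have written.
sample = [{"user": "example_user", "text": "hello", "likes": "0"}]
path = os.path.join(tempfile.mkdtemp(), "tweets.json")
with open(path, "w", encoding="utf-8") as fh:
    json.dump(sample, fh)

# Load the stored tweets back into a list of dicts.
with open(path, encoding="utf-8") as fh:
    tweets = json.load(fh)

for tweet in tweets:
    print(tweet["user"], tweet["text"])
```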
