mattlisiv / newsapi-python
A Python Client for News API
License: MIT License
Describe the bug
There's a qintitle param, but the API supports a searchIn param, so the current functionality is limited.
It could be useful to enable the use of proxies by leveraging the proxies parameter from the standard requests library.
I followed this document, https://newsapi.org/docs/client-libraries/python, to test the Python API. The following code produces the error pasted below.
sources = newsapi.get_sources()
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "crawler.py", line 93, in <module>
crawler.crawl()
File "crawler.py", line 84, in crawl
sources = newsapi.get_sources()
File "/Users/mabodx/anaconda/lib/python3.6/site-packages/newsapi/newsapi_client.py", line 311, in get_sources
r = requests.get(const.SOURCES_URL, auth=self.auth, timeout=30, params=payload)
File "/Users/mabodx/anaconda/lib/python3.6/site-packages/requests/api.py", line 72, in get
return request('get', url, params=params, **kwargs)
File "/Users/mabodx/anaconda/lib/python3.6/site-packages/requests/api.py", line 58, in request
return session.request(method=method, url=url, **kwargs)
File "/Users/mabodx/anaconda/lib/python3.6/site-packages/requests/sessions.py", line 513, in request
resp = self.send(prep, **send_kwargs)
File "/Users/mabodx/anaconda/lib/python3.6/site-packages/requests/sessions.py", line 623, in send
r = adapter.send(request, **kwargs)
File "/Users/mabodx/anaconda/lib/python3.6/site-packages/requests/adapters.py", line 514, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: ("bad handshake: SysCallError(-1, 'Unexpected EOF')",)
pip install newsapi-python
from newsapi import NewsApiClient
It's supposed to allow me to import NewsApiClient into other scripts, but instead it says:
line 4, in <module>
from newsapi import NewsApiClient
ImportError: cannot import name 'NewsApiClient' from 'newsapi' (C:\Users\"my name"\AppData\Local\Programs\Python\Python311\Lib\site-packages\newsapi\__init__.py)
Describe the bug
The API is able to handle multiple search terms or phrases if a list of strings is passed as the q arg.
e.g. if q = ["google", "google ai"] is passed and parsed into the payload dict, it returns a successful result. However, the input check currently won't allow this, limiting functionality.
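If the input check were relaxed, a small normalizer could map such a list onto the single q string the API expects. A minimal sketch (normalize_q is a hypothetical helper, not part of the library, and it assumes the API's support for quoted phrases and OR):

```python
def normalize_q(q):
    """Hypothetical helper: accept either a string or a list of terms/phrases
    and produce a single q string, quoting any multi-word phrases."""
    if isinstance(q, str):
        return q
    if isinstance(q, (list, tuple)):
        return " OR ".join('"{}"'.format(t) if " " in t else t for t in q)
    raise TypeError("q must be a string or a list of strings")
```

For example, normalize_q(["google", "google ai"]) yields 'google OR "google ai"'.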
Hi, and thank you very much for this API!
In the method get_everything in the newsapi_client class, there is currently no parameter for pageSize. Since this is possible via cURL, implementing it in the Python library as well would be of great value, so you would not need to look through so many pages.
Best regards
When I type this code:
btc_headlines = newsapi.get_everything(q="bitcoin", language="en", page_size=100, sort_by="relevancy")
btc_articles = btc_headlines["articles"]
btc_articles[0]
I get error:
TypeError: expected string or bytes-like object
Could you please advise what could be the issue? Thanks.
Note: This happens on Macbook Pro M1. On Intel Mac it works fine.
Describe the bug
A clear and concise description of what the bug is.
To Reproduce
Steps to reproduce the behavior. For instance:
conda activate venv
pip install newsapi_python
from newsapi import NewsApiClient
# Init
api = NewsApiClient(api_key='xxxxxxxxxxxx')
# /v2/everything
all_articles = api.get_everything(q='mars')
print(all_articles)
(venv) C:\Users\abc\anaconda3\envs\venv\bin>C:/Users/abc/anaconda3/envs/venv/python.exe
Expected behavior
News Data
Screenshots
d:/Projects/advisely/advisely/API/newsapi.py
Traceback (most recent call last):
File "d:/Projects/newsoftoday/API/newsapi.py", line 3, in <module>
from newsapi import NewsApiClient
File "d:\Projects\newsoftoday\API\newsapi.py", line 3, in <module>
from newsapi import NewsApiClient
ImportError: cannot import name 'NewsApiClient' from partially initialized module 'newsapi' (most likely due to a circular import) (d:\Projects\newsoftoday\API\newsapi.py)
Additional context
(venv) C:\Users\abc\anaconda3\envs\venv\bin>pip install newsapi_python
Collecting newsapi_python
Using cached newsapi_python-0.2.6-py2.py3-none-any.whl (7.9 kB)
Requirement already satisfied: requests<3.0.0 in c:\users\abc\anaconda3\envs\venv\lib\site-packages (from newsapi_python) (2.25.1)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in c:\users\abc\anaconda3\envs\venv\lib\site-packages (from requests<3.0.0->newsapi_python) (1.26.5)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\abc\anaconda3\envs\venv\lib\site-packages (from requests<3.0.0->newsapi_python) (2021.5.30)
Requirement already satisfied: idna<3,>=2.5 in c:\users\abc\anaconda3\envs\venv\lib\site-packages (from requests<3.0.0->newsapi_python) (2.10)
Requirement already satisfied: chardet<5,>=3.0.2 in c:\users\abc\anaconda3\envs\venv\lib\site-packages (from requests<3.0.0->newsapi_python) (4.0.0)
Installing collected packages: newsapi-python
Successfully installed newsapi-python-0.2.6
Hi there,
I tried using multiple keywords, but no matter how I join them, I'm not receiving results anywhere close to what I get from a classic Google News search.
My code:
from newsapi import NewsApiClient
api = NewsApiClient(api_key='xxx')
keywords = ['Neubau', 'Weiße', 'Stadt']
all_articles = api.get_everything(q=','.join(keywords),
                                  sort_by='publishedAt',
                                  language='de')
besides ',' I tried joining the keywords by:
Is there something I'm missing?
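For what it's worth, q is a single query string, so a comma join is treated as literal text. News API's query syntax supports quoted phrases and AND/OR operators, which usually get closer to a web-search result; a sketch of the alternatives:

```python
keywords = ['Neubau', 'Weiße', 'Stadt']

q_any = ' OR '.join(keywords)               # articles matching any of the terms
q_all = ' AND '.join(keywords)              # articles matching every term
q_phrase = '"' + ' '.join(keywords) + '"'   # the exact phrase

# e.g.: all_articles = api.get_everything(q=q_any, language='de')
```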
Custom session for proxies should be added for enhancement.
Will make a PR.
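A possible shape for that enhancement, sketched with a stand-in session object so it runs offline (the session parameter, FakeSession, and ProxyAwareClient are illustrations of the proposal, not the released API):

```python
class FakeSession:
    """Stand-in for requests.Session; records the proxies it would use."""
    def __init__(self, proxies=None):
        self.proxies = proxies or {}

    def get(self, url, **kwargs):
        # A real session would perform the HTTP request through self.proxies.
        return {"url": url, "proxies": self.proxies}


class ProxyAwareClient:
    """Hypothetical client that accepts an injected session."""
    def __init__(self, api_key, session=None):
        self.api_key = api_key
        self.session = session or FakeSession()

    def get_sources(self):
        return self.session.get("https://newsapi.org/v2/sources")


proxied = FakeSession(proxies={"https": "http://10.10.1.10:3128"})
client = ProxyAwareClient(api_key="xxx", session=proxied)
```

With the real library, the injected object would be a requests.Session whose .proxies mapping is already configured.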
Unresolved reference 'NewsApiClient' when attempting to import the client
Hello,
just installed the package and got the following error:
File "<stdin>", line 1, in <module>
File "/anaconda3/envs/ff/lib/python3.6/site-packages/newsapi/__init__.py", line 1, in <module>
from newsapi.newsapi_client import NewsApiClient
File "/anaconda3/envs/ff/lib/python3.6/site-packages/newsapi/newsapi_client.py", line 12
def get_top_headlines(self, q=None: str, sources=None: str, language='en': str, country=None: str, category=None: str, page_size=20: int,
^
SyntaxError: invalid syntax
I can see the change coming in from this PR.
Do I need a specific version of something or am I missing something else?
Thanks in advance.
EDIT:
Just tested it with version 0.2.3 and I can confirm that it works. So presumably the whole thing blew up 2 hours ago, when the version was bumped. 👼
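For context, the broken release put the annotation after the default value (q=None: str), which is a SyntaxError; Python's annotation syntax (PEP 3107) puts the annotation before the default. A corrected sketch of that signature, shown standalone here without the class body:

```python
def get_top_headlines(self, q: str = None, sources: str = None,
                      language: str = 'en', country: str = None,
                      category: str = None, page_size: int = 20):
    """Annotation before default: `q: str = None`, not `q=None: str`."""
```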
I installed newsapi with the pip command, but it gives an ImportError.
What is the solution for this?
One more: I tried to install newsapi from the Anaconda prompt with the command "conda install -c kvedala newsapi-python", and it also gives an error.
TypeError                                 Traceback (most recent call last)
----> 1 all_bitcoin_articles = newsapi.get_everything(q='bitcoin')
TypeError: expected string or bytes-like object
raise ValueError("cannot mix country/category param with sources param.")
I have been trying to get all news articles related to, say, money laundering using the get_everything method, but after a certain number of results have been accessed, it throws this error.
Note: I have a paid account, not a developers account. I can access a max of 9900 articles only when let's say the number of articles related to money laundering is 50K. How can I access the remaining articles?
In order to work properly with parameters passed as unicode strings, the correct way to check whether a parameter is a string in Python 2.x in newsapi_client.py is:
if isinstance(q, basestring)
instead of:
if type(q) == str
basestring includes both str and unicode. See the Python docs for reference.
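A common 2/3-compatible variant of that check, for reference (string_types and is_string are local names for this sketch, not something the library defines):

```python
try:
    string_types = basestring  # Python 2: covers both str and unicode
except NameError:
    string_types = str         # Python 3: str is already unicode

def is_string(q):
    """True for any text type on the running interpreter."""
    return isinstance(q, string_types)
```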
How do I get all the results as opposed to just 20? When I retrieve results it says I have 1404, but I only get 20 of the articles in the articles section of the dictionary. I know it says to use the page parameter but that does not seem to rectify the issue.
The code below errors out:
try:
    newsapi = NewsApiClient(api_key='my key')
    everything = newsapi.get_everything(q='bitcoin', from_parameter='2018-08-01', sources='us', language='en', page_size=100)
except Exception as e:
    print(e)
This is what the exception prints:
get_everything() got an unexpected keyword argument 'from_parameter'
I'm using Python3.6 on Ubuntu 16.04.
The package on PyPI is 0.2.5, but 0.2.6 when installing from github.
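The client's keyword is from_param (the API's field is named from, which is a reserved word in Python, so the client has to rename it); from_parameter appears only in the description. A corrected call, run here against a stub so it works offline (StubClient merely echoes its keyword arguments; with the real library it would be NewsApiClient(api_key=...)):

```python
class StubClient:
    """Echoes keyword arguments; stands in for NewsApiClient offline."""
    def get_everything(self, **params):
        return params

newsapi = StubClient()
everything = newsapi.get_everything(q='bitcoin',
                                    from_param='2018-08-01',  # not from_parameter
                                    sources='us',
                                    language='en',
                                    page_size=100)
```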
The order of the comments about page_size and page should be swapped.
Describe the bug
When I used only the sources parameter, I was able to retrieve data. However, if I used q and country or other parameters, I received 0 results. For example:
This works:
top_headlines = newsapi.get_top_headlines(sources='bbc-news, the-verge')
{'status': 'ok', 'totalResults': 26, 'articles': .......}
But this doesn't work:
top_headlines = newsapi.get_top_headlines()
top_headlines = newsapi.get_top_headlines(q='tesla', country='us')
top_headlines = newsapi.get_top_headlines(category='business')
{'status': 'ok', 'totalResults': 0, 'articles': []}
I also tried:
top_headlines = newsapi.get_top_headlines(language='en-US')
but got this error:
top_headlines = newsapi.get_top_headlines(language='en-US')
File "myproject/venv/lib/python3.8/site-packages/newsapi/newsapi_client.py", line 120, in get_top_headlines
raise ValueError("invalid language")
ValueError: invalid language
I replaced the en in newsapi.const.language with en-US, and now I can receive data, but the only source is google-news.
Also, if I used other countries, for example:
top_headlines = newsapi.get_top_headlines(country='jp')
I received 0 results.
The default language of get_top_headlines seems to be "en". When you need country-specific news, the article count is 0 unless the language is changed.
Changing the language to get the news could be an option.
BUT
The problem occurs when the language is not supported by the API
top_headlines = newsapi.get_top_headlines(country="tr",language = "tr")
print(top_headlines)
The above code gives an "invalid language" error.
But in the API docs, country-specific news is supported for Turkey.
Since in the client the language has to be changed to get the news, and no language is defined for Turkey in the API, it raises an error.
The issue also exists when getting category-specific news for countries.
Hello, I have set the API key and am using the right code, but the demo returns "{'status': 'ok', 'totalResults': 0, 'articles': []}". No news data comes back. Why?
I am trying to use qInTitle when using get_everything and am getting the following error:
get_everything() got an unexpected keyword argument 'qInTitle'
Is the required argument slightly different?
newsapi-python/newsapi/newsapi_client.py
Line 148 in 6da1bd7
Valid values are: 'relevancy','popularity','publishedAt','relevancy'. Note that 'relevancy' is listed twice.
from newsapi import NewsApiClient
api = NewsApiClient(api_key='XXXXXXXXXXXXXXXXXXXXXXXX')
api.get_top_headlines(sources='bbc-news')
Traceback (most recent call last):
File "", line 1, in
api.get_top_headlines(sources='bbc-news')
File "C:\Users\NIRANJAN\Anaconda3\lib\site-packages\newsapi\newsapi_client.py", line 115, in get_top_headlines
r = requests.get(const.TOP_HEADLINES_URL, auth=self.auth, timeout=30, params=payload)
File "C:\Users\NIRANJAN\Anaconda3\lib\site-packages\requests\api.py", line 70, in get
return request('get', url, params=params, **kwargs)
File "C:\Users\NIRANJAN\Anaconda3\lib\site-packages\requests\api.py", line 56, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Users\NIRANJAN\Anaconda3\lib\site-packages\requests\sessions.py", line 488, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\NIRANJAN\Anaconda3\lib\site-packages\requests\sessions.py", line 609, in send
r = adapter.send(request, **kwargs)
File "C:\Users\NIRANJAN\Anaconda3\lib\site-packages\requests\adapters.py", line 423, in send
timeout=timeout
File "C:\Users\NIRANJAN\Anaconda3\lib\site-packages\requests\packages\urllib3\connectionpool.py", line 587, in urlopen
timeout_obj = self._get_timeout(timeout)
File "C:\Users\NIRANJAN\Anaconda3\lib\site-packages\requests\packages\urllib3\connectionpool.py", line 302, in _get_timeout
return Timeout.from_float(timeout)
File "C:\Users\NIRANJAN\Anaconda3\lib\site-packages\requests\packages\urllib3\util\timeout.py", line 154, in from_float
return Timeout(read=timeout, connect=timeout)
File "C:\Users\NIRANJAN\Anaconda3\lib\site-packages\requests\packages\urllib3\util\timeout.py", line 94, in __init__
self._connect = self._validate_timeout(connect, 'connect')
File "C:\Users\NIRANJAN\Anaconda3\lib\site-packages\requests\packages\urllib3\util\timeout.py", line 127, in _validate_timeout
"int, float or None." % (name, value))
ValueError: Timeout value connect was Timeout(connect=30, read=30, total=None), but it must be an int, float or None.
After working flawlessly for months, I am no longer getting any news when setting the source to 'spiegel-online' or 'die-zeit'. cnn and reuters are still working fine.
When get_top_headlines() is called in my code, no articles are returned; I simply get: {'status': 'ok', 'totalResults': 0, 'articles': []}.
Can we get a version bump on the PyPi version so we have a version out there with exclude_domains?
Thanks in advance.
Describe the bug
The issue is that I'm unable to get sources for countries like: ae, at, be, bg, ch, cn, co, cu, ua, ru.
To Reproduce
from newsapi import NewsApiClient
newsapi = NewsApiClient(api_key="api_key")
print(newsapi.get_sources(country="ua"))
{'status': 'ok', 'sources': []}
Expected behavior
Should be able to get all sources for all countries
There's no problem with the code; I just don't know much programming.
So I have a list of terms that I'd like to create timeseries of, by days for about a 5-6 month period.
Hopefully, I can just change the keyword, run the script, and end up with a .csv file with two or three columns: the date and the number of results on that day.
I'm using the Dev version, which is limited to 500 requests a day. I don't think I can get the actual dates for individual articles if I use a long period of time, right? So I guess I'll have to do each day separately.
Can this be done with loops to print the total number of articles per day? And would anyone be willing to help me out (please)? It would be very much appreciated :)
Thank you!
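The per-day loop could be sketched like this (count_articles_per_day and write_counts_csv are illustrative helpers; newsapi is assumed to be an initialized NewsApiClient, and each day costs one request against the 500/day quota):

```python
import csv
from datetime import date, timedelta

def day_range(start, end):
    """Yield each date from start to end, inclusive."""
    d = start
    while d <= end:
        yield d
        d += timedelta(days=1)

def count_articles_per_day(newsapi, keyword, start, end):
    """Return (iso_date, totalResults) rows, one API request per day."""
    rows = []
    for d in day_range(start, end):
        resp = newsapi.get_everything(q=keyword,
                                      from_param=d.isoformat(),
                                      to=d.isoformat(),
                                      page_size=1)
        rows.append((d.isoformat(), resp["totalResults"]))
    return rows

def write_counts_csv(path, rows):
    """Write the per-day counts as a two-column CSV."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["date", "total_results"])
        writer.writerows(rows)
```

Only totalResults is read from each response, so page_size=1 keeps the payloads small.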
I'm trying to use the pageSize parameter and getting the following error:
get_everything() got an unexpected keyword argument 'pageSize'
Does this have something to do with the API?
Is your feature request related to a problem? Please describe.
Please consider adding the search_in parameter.
Describe the solution you'd like
I would like to use the search_in parameter for niche lookups.
Describe alternatives you've considered
I could just filter the results myself after I get them.
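That client-side fallback can be sketched as a small filter over the returned articles (filter_by_title is a hypothetical helper, and it only approximates what a server-side searchIn=title would do):

```python
def filter_by_title(articles, term):
    """Keep only articles whose title contains the term (case-insensitive)."""
    term = term.lower()
    return [a for a in articles if term in (a.get("title") or "").lower()]

# e.g.: resp = newsapi.get_everything(q='bitcoin')
#       titled = filter_by_title(resp['articles'], 'bitcoin')
```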
In client.get_everything(), the from parameter is from_param, but it is referred to as from_parameter in the description.
I noticed that const.countries array is missing code 'es', while the API does return news sources for Spain.
While trying to run the following code snippet, which is actually part of the documentation:
all_articles = newsapi.get_everything(q='bitcoin',
                                      sources='bbc-news,the-verge',
                                      domains='bbc.co.uk,techcrunch.com',
                                      from_param='2017-12-01',
                                      to='2017-12-12',
                                      language='en',
                                      sort_by='relevancy',
                                      page=2)
I am getting this error again and again:
Traceback (most recent call last):
File "news.py", line 16, in <module>
page=2)
File "C:\Python27\lib\site-packages\newsapi\newsapi_client.py", line 261, in get_everything
raise NewsAPIException(r.json())
newsapi.newsapi_exception.NewsAPIException
Any suggestions on why this might be happening ?
Hey guys, I'm using the module to create a headline display using some dot displays. Thanks for your efforts on this code, I liked it a lot!
Now, regarding my suggestion: what about setting the default language to en? That way we could use newsapi.get_top_headlines() without any arguments. Currently, doing so raises an exception.
I can send a pull request to implement it if you want.
The content is returned like this:
"content": "The company operating the National Broadband Network has claimed competition from wireless services including Elon Musks Starlink is threatening the viability of its business, as retail internet prov\u2026 [+2829 chars]"
which has "[+2829 chars]" at the end.
To Reproduce
sample: headlines = newsapi.get_top_headlines(q='', category='business')
Expected behavior
A response is returned okay, but the content of each article is not fully returned.
from newsapi import NewsApiClient

newsapi = NewsApiClient(api_key='e6702efb133e48418f78ea26f4620e20')
top_headlines = newsapi.get_top_headlines(q='bitcoin',
                                          sources='bbc-news,the-verge',
                                          category='business',
                                          language='en',
                                          country='us')
I'm getting this error in Python 3.x:
ImportError: cannot import name 'NewsApiClient'
from newsapi import NewsAPIClient
ImportError: cannot import name 'NewsAPIClient'
I have newsapi installed and I have an API key, yet it still won't work.
Some of the fields have a limited set of strings that are acceptable, such as category, country, or language. It would probably be more user friendly if these were checked for validity before being passed to the API.
Enums would also be possible, though that could make for more work on the library user depending on how they are writing their code, and could also be a breaking change.
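A minimal sketch of the enum-plus-check idea (the whitelist here is a truncated sample for illustration; the real values live in newsapi.const):

```python
from enum import Enum

class Category(str, Enum):
    """Sample values only; the library's actual set is in newsapi.const."""
    BUSINESS = "business"
    SCIENCE = "science"
    TECHNOLOGY = "technology"

def validate_category(value):
    """Raise early, client-side, instead of sending a bad value to the API."""
    valid = {c.value for c in Category}
    if value not in valid:
        raise ValueError("invalid category: {!r} (expected one of {})"
                         .format(value, sorted(valid)))
    return value
```

Because Category subclasses str, existing callers passing plain strings keep working, which limits the breaking-change concern.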
Is your feature request related to a problem? Please describe.
I'd like to be able to request the full text of the article in a secondary api call.
Describe the solution you'd like
see above
Describe alternatives you've considered
n/a
Additional context
n/a
The problem here is that we can't fetch articles using those two parameters (which would be very useful).
https://newsapi.org/docs/client-libraries/python
Following example:
# /v2/everything
all_articles = newsapi.get_everything(q='bitcoin',
                                      sources='bbc-news,the-verge',
                                      domains='bbc.co.uk,techcrunch.com',
                                      from_parameter='2017-12-01',
                                      to='2017-12-12',
                                      language='en',
                                      sortBy='relevancy',
                                      page=2)
sortBy should be sort_by.
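A corrected version of that documentation snippet, run here against a stub so it works offline (EchoClient just echoes its keyword arguments; with the real library it would be NewsApiClient(api_key=...)). Note both renames: sort_by instead of sortBy, and from_param rather than from_parameter, per the report above about client.get_everything():

```python
class EchoClient:
    """Echoes keyword arguments in place of a live NewsApiClient."""
    def get_everything(self, **params):
        return params

newsapi = EchoClient()
all_articles = newsapi.get_everything(q='bitcoin',
                                      sources='bbc-news,the-verge',
                                      domains='bbc.co.uk,techcrunch.com',
                                      from_param='2017-12-01',
                                      to='2017-12-12',
                                      language='en',
                                      sort_by='relevancy',  # snake_case, not sortBy
                                      page=2)
```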
Describe the bug
The language list in newsapi/const.py contains an invalid language, and 2 of the codes don't follow the ISO 639-1 standard.
To Reproduce
Try to use the 'se' code with the /everything endpoint; it won't give results, and it isn't listed in the NewsAPI documentation.
'cn' and 'en-US' aren't ISO 639-1 codes.
When I run:
api.get_everything(q='bitcoin')
I get
api.get_everything(q='bitcoin')
Traceback (most recent call last):
File "", line 1, in
api.get_everything(q='bitcoin')
File "C:\Users\LapLap\Anaconda3\envs\fluffy\lib\site-packages\newsapi\newsapi_client.py", line 248, in get_everything
r = requests.get(const.EVERYTHING_URL, auth=self.auth, timeout=30, params=payload)
File "C:\Users\LapLap\Anaconda3\envs\fluffy\lib\site-packages\requests\api.py", line 72, in get
return request('get', url, params=params, **kwargs)
File "C:\Users\LapLap\Anaconda3\envs\fluffy\lib\site-packages\requests\api.py", line 57, in request
with sessions.Session() as session:
File "C:\Users\LapLap\Anaconda3\envs\fluffy\lib\site-packages\requests\sessions.py", line 386, in __init__
self.mount('https://', HTTPAdapter())
File "C:\Users\LapLap\Anaconda3\envs\fluffy\lib\site-packages\requests\adapters.py", line 120, in __init__
super(HTTPAdapter, self).__init__()
TypeError: super(type, obj): obj must be an instance or subtype of type