
pydatastream's People

Contributors

alastairbs, ceaza, vfilimonov


pydatastream's Issues

Compatible with pandas

Last month pandas was upgraded from 0.19 to 0.20 and now issues the warning "Panel is deprecated and will be removed in a future version". However, pydatastream still returns a pandas Panel when the output is 3-dimensional, so I am wondering whether you plan to follow this change in pandas.

Thank you so much.
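For context, a minimal sketch of how a 3-dimensional result (tickers by fields over dates) can be represented without pd.Panel, using a DataFrame with MultiIndex columns instead; the tickers, fields and values below are placeholders, not actual pydatastream output:

# Illustration only: a 3-D result as a MultiIndex DataFrame instead of a pd.Panel.
import pandas as pd

columns = pd.MultiIndex.from_tuples(
    [('@AAPL', 'P'), ('@AAPL', 'MV'), ('U:MMM', 'P'), ('U:MMM', 'MV')],
    names=['ticker', 'field'])
df = pd.DataFrame([[100.0, 500.0, 170.0, 90.0],
                   [101.5, 505.0, 171.0, 90.5],
                   [102.0, 507.0, 169.5, 89.8]],
                  index=pd.date_range('2017-01-02', periods=3),
                  columns=columns)

df['@AAPL']                        # all fields for one ticker
df.xs('P', axis=1, level='field')  # one field across all tickers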

get_constituents returns error on historical dates

This seems to work at line 473:

res = pd.DataFrame({fld: [record[fld_name(fld, ind)] if fld_name(fld, ind) in record else 'N/A' for ind in range(num)] for fld in fields})

Tested with the call DWE.get_constituents('JSEOVER', '28-Dec-2009').

Query an instrument using its SEDOL / RIC

Is it possible to fetch data for an instrument using its SEDOL or RIC? ISIN and mnemonic seem to work with DS.fetch(), but querying by other fields (like SEDOL and RIC) does not. Is this a limitation of the library or of the API?

Unable to install this package

I am unable to install this package, and I also cannot install the dependency suds-py3. I am currently using Python 3.6 (PyCharm as IDE and Miniconda for Python). Could you help me figure out what I am missing in order to get this package installed?

Issues with Pulling Historical Constituent Lists

I'm having trouble pulling historical constituent lists for S&PCOMP. Regardless of the date requested, it always returns the current constituent list.

For example:

DS.get_constituents('S&PCOMP', '1-9-1980')

returns a dataframe including Netflix.

I am very new to Python/GitHub, so I may just be doing it wrong. If so, I would very much appreciate it if you could point me to some documentation.

INVALID CODE OR EXPRESSION ENTERED

Hi,

I am not sure if this is a bug or just a mistake in my query (I may be misusing it; I am not familiar with Datastream).

When I try:

raw1 = DWE.fetch(['POEURSP'], freq='D')

raw2 = DWE.fetch(['POLZLSF'], freq='D')

Both requests work.

But

raw_m = DWE.fetch(['POEURSP', 'POLZLSF'], freq='D')

raises an exception:

pydatastream.pydatastream.DatastreamException: Failure (error 2): $$"ER", E100, INVALID CODE OR EXPRESSION ENTERED, POLZLSF(P) --> "POEURSP,POLZLSF~D"

Same result for

raw_m = DWE.request('POEURSP,POLZLSF~D')

In R, the code below (I am not sure if it is exactly equivalent) returns the data without a problem:

dat <- ds(user, c("POEURSP", "POLZLSF"), period = "D")
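As a temporary workaround, and assuming the single-ticker requests keep working as reported above (and that a single-ticker DWE.fetch returns a date-indexed DataFrame), the series can be fetched one at a time and joined on the date index:

# Workaround sketch: fetch each mnemonic separately and join the columns.
import pandas as pd

tickers = ['POEURSP', 'POLZLSF']
frames = []
for t in tickers:
    df = DWE.fetch([t], freq='D')                              # single-ticker request works
    df.columns = ['{}_{}'.format(t, c) for c in df.columns]    # prefix columns with the ticker
    frames.append(df)

combined = pd.concat(frames, axis=1)                           # align on the date index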

How to request several tickers at once

I want to request several tickers at once, for instance US and Spanish GDP.
I have no problem requesting each series one by one, but when I pass a list of tickers, I get an error message. I cannot add a "field" because these series have no available field in Datastream, as far as I know.

from pydatastream import Datastream
DWE = Datastream(username='DS:XXXXX', password='XXXXX')
DWE.fetch(['ESXGDPR.D','USGDP...D'], date_from='1980-01-01')

invalid value encountered in absolute
inside = ((abs(dx0 + dx1) + abs(dy0 + dy1)) == 0)
Traceback (most recent call last):

File "", line 1, in
gdp = DWE.fetch(['ESXGDPR.D','USGDP...D'], date_from='1980-01-01')

File "C:\Python27\lib\site-packages\pydatastream\pydatastream.py", line 513, in fetch
dat, meta = self.parse_record(raw, indx)

File "C:\Python27\lib\site-packages\pydatastream\pydatastream.py", line 295, in parse_record
status['StatusMessage'], status['Request']))

DatastreamException: Failure (error 2): $$"ER", E100, INVALID CODE OR EXPRESSION ENTERED, USGDP...D(P) --> "ESXGDPR.D,USGDP...D1980-01-01D"

Thank you

Economics data

Hi,

I'm trying
DWE.fetch(["BHWDA347A","BHWDM8P5A"], date_from='2009', date_to='2016')

I get
DatastreamException: Failure (error 2): $$"ER", E100, INVALID CODE OR EXPRESSION ENTERED, BHWDM8P5A(P) --> "BHWDA347A,BHWDM8P5A2009-01-01:2016-01-01~D"

But I know that individually they work well:
DWE.fetch(["BHWDA347A"], date_from='2009', date_to='2016')
DWE.fetch(["BHWDM8P5A"], date_from='2009', date_to='2016')

What am I doing wrong?

My goal:
I would like to make one request with a list of 10 series instead of looping over the 10 series one by one with a for loop. I'm sure it will be faster with a single request.

Use behind a proxy

Could you add an argument to the Datastream constructor allowing use behind a proxy? You could then pass "proxy={'http': 'proxyLocation:portNumber'}" to the suds.client call when the Datastream object is instantiated.
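A minimal sketch of that suggestion (assuming suds' proxy option; the WSDL URL argument and the helper function here are placeholders, not pydatastream's actual code):

# Sketch: pass an optional proxy dict through to the suds client.
from suds.client import Client

def make_client(wsdl_url, proxy=None):
    kwargs = {}
    if proxy is not None:
        # e.g. proxy={'http': 'proxyLocation:portNumber', 'https': 'proxyLocation:portNumber'}
        kwargs['proxy'] = proxy
    return Client(wsdl_url, **kwargs)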

DS.fetch() output

I would like to extract data with the pydatastream module using Datastream ISIN codes, as follows:

data = DS.fetch(['US91835J2078','KR7114630007'], ['X(UP)~U$','X(P)~U$','VO'], date_from='2021-09-27')
I was able to obtain 'data', but the result comes back sorted in the order KRxxxx, USxxxx. I don't know why it is sorted, but I would like the data in the original order, ['USxxxx', 'KRxxxx'].

I have an extensive ISIN list, so it is important to get the data in the requested order. There must be a simple switch for this, but I cannot find it. Can someone help me get the result in the intended order?

Thanks.
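One possible workaround, assuming the result is a DataFrame with a (ticker, date) MultiIndex as in recent pydatastream versions: select the outer level in the requested order after the call, which .loc preserves:

# Workaround sketch: restore the requested ISIN order on a (ticker, date) MultiIndex.
tickers = ['US91835J2078', 'KR7114630007']
data = DS.fetch(tickers, ['X(UP)~U$', 'X(P)~U$', 'VO'], date_from='2021-09-27')
data = data.loc[tickers]   # .loc with a list of labels keeps the order of the list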

ImportError: cannot import name 'Datastream'

Hi,
I'm trying to set up pydatastream under Anaconda. I have successfully upgraded pandas and installed suds-py3 and pydatastream 0.4.0 using pip. When I enter "from pydatastream import Datastream" at the Python command line or in a Jupyter notebook, I get the error in the subject line. Is this a bug or am I missing a step in the setup?

use 'Base Date' as date_from

In the Excel add-in you have the option to set the starting date to "Base Date", in which case the starting date becomes the date of the oldest available data point among all series in the request.

Any idea how the request string should be in this case?

Request fails for more than 10 tickers at a time

Code to reproduce:

mnem = ['U:DY','@MSFT','U:NAV','U:C','U:JPM','U:IR','U:NEE','U:FE','U:BAC','U:JNJ','U:WMT','U:PH','U:F','U:DIS','U:OI','U:GE','U:NR','U:AXP','U:AOS','U:MRO','U:NKE','U:L']

conn.fetch(mnem,['RI'],'2015-10-01')
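A possible workaround while this limit applies, assuming conn.fetch returns rows indexed by (ticker, date) for multi-ticker requests so the chunks can simply be stacked:

# Workaround sketch: request at most 10 mnemonics per call and stack the results.
import pandas as pd

def fetch_chunked(conn, mnemonics, fields, date_from, chunk_size=10):
    parts = []
    for i in range(0, len(mnemonics), chunk_size):
        parts.append(conn.fetch(mnemonics[i:i + chunk_size], fields, date_from))
    return pd.concat(parts)

res = fetch_chunked(conn, mnem, ['RI'], '2015-10-01')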

Deprecated pandas.np module

Since upgrading to pandas 1.0.3 the following warning appears when calling DS.fetch:

"/site-packages/pydatastream/pydatastream.py:335: FutureWarning: The pandas.np module is deprecated and will be removed from pandas in a future version. Import numpy directly instead"

It should be an easy case of replacing pd.np.NaN with np.NaN (importing numpy directly) where appropriate.

@vfilimonov Happy to do the PR for this if you like?

can't download several tickers with fetch

I can't seem to fetch several tickers at once with fetch. For instance,

DWE.fetch(['EUDOLLR','USDOLLR'],date_from='2000',freq='D')

does not work. However, I can do it one by one as

DWE.fetch('EUDOLLR',date_from='2000',freq='D')
DWE.fetch('USDOLLR',date_from='2000',freq='D')

and both give the correct 'P' field. I tried

DWE.fetch(['EUDOLLR(P)','USDOLLR(P)'],date_from='2000',freq='D')
DWE.fetch(['EUDOLLR','USDOLLR'],fields=['P'],date_from='2000',freq='D')

but these do not work either.

Am I missing something?

Hiccup on single-day request

If I run get_trading_days() for a SINGLE DAY and one of the countries happens to have a holiday, the code hiccups around line 340. I was going to use the function for current-day processing only. I know you intended it as a mask for historical data, but it would be great if this case could be handled...

You can reproduce the error with the following code:

ds.get_trading_days(['US','UK','FR','SW','IT','NL','BD','BG'], date_from='2022-06-06', date_to='2022-06-06')

Here is the trace:

---------------------------------------------------------------------------
DatastreamException                       Traceback (most recent call last)
Input In [38], in <module>
----> 1 trading_days2 = ds.get_trading_days(['US','UK','FR','SW','IT','NL','BD','BG'], date_from='2022-06-06', date_to='2022-06-06')
      2 trading_days2

File ~\Miniconda3\envs\Eikon\lib\site-packages\pydatastream\pydatastream.py:729, in Datastream.get_trading_days(self, countries, date_from, date_to)
    727     raise DatastreamException(f'Unknowns ISO codes: {", ".join(missing_isos)}')
    728 # By default 0 and NaN are returned, so we add 1
--> 729 res = self.fetch(mnems.MNEM, date_from=date_from, date_to=date_to) + 1
    731 if len(countries) == 1:
    732     return res.iloc[:, 0].to_frame(name=countries[0])

File ~\Miniconda3\envs\Eikon\lib\site-packages\pydatastream\pydatastream.py:455, in Datastream.fetch(self, tickers, fields, date_from, date_to, freq, static, IsExpression, return_metadata, always_multiindex)
    453 raw = self.request(req)
    454 self._last_response_raw = raw
--> 455 data, meta = self.parse_response(raw, return_metadata=True)
    457 if static:
    458     # Static request - drop date from MultiIndex
    459     data = data.reset_index(level=1, drop=True)

File ~\Miniconda3\envs\Eikon\lib\site-packages\pydatastream\pydatastream.py:369, in Datastream.parse_response(self, response, return_metadata)
    358 """ Parse raw JSON response
    359 
    360     If return_metadata is True, then result is tuple (dataframe, metadata),
   (...)
    366     (dataframe, metadata), otherwise each element is a dataframe.
    367 """
    368 if 'DataResponse' in response:  # Single request
--> 369     res, meta = self._parse_one(response['DataResponse'])
    370     self.last_metadata = meta
    371     return (res, meta) if return_metadata else res

File ~\Miniconda3\envs\Eikon\lib\site-packages\pydatastream\pydatastream.py:340, in Datastream._parse_one(self, res)
    338 if v['Type'] == 0:  # Error
    339     if self.raise_on_error:
--> 340         raise DatastreamException(f'"{v["Symbol"]}"("{data_type}"): {value}')
    341     res[data_type][v['Symbol']] = math.nan
    342 elif v['Type'] == 4:  # Date

DatastreamException: "S:VACS"(""): $$ER: 0904,NO DATA AVAILABLE

Access problem

Hello, when trying to access my account, I receive the following error.

DWE = Datastream(username="DS:XXXXXX", password="XXXXXXXX")

b"Server raised fault: 'User 'DS:XXXXXX' logon failed: Delegate Source 'DSPermissionsPassword' reports: No response from source DSPermissionsPassword'"

It was working just last week. I don't know if it's a problem with Datastream's server.

Do you know what it might be?

Static request

Hello,

First of all, thank you for this package.

I have a problem when trying to do a static request. For example, running this request
DWE.fetch(['D:BAS','D:BASX','HN:BAS','I:BAF','BFA','@BFFAF','S:BAS'], ['ISIN', 'ISINID', 'NAME'], static=True)
raises the following error:
TypeError: fetch() got an unexpected keyword argument 'static'

I tried upgrading with pip install pydatastream --upgrade but it does not change the outcome.

Do you have an idea what is causing the error?

Thank you.

Error in get next release

This:

DS.get_next_release_dates(['USCONPRCE'])

or just the example in the README, gives:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-11-6f693796e1c2> in <module>
----> 1 DS.get_next_release_dates(['USCONPRCE'])

/opt/conda/lib/python3.7/site-packages/pydatastream/pydatastream.py in get_next_release_dates(self, mnemonics, n_releases)
    664         reqs = [self.construct_request(mnemonics, f'DS.NDOR{i+1}', static=True)
    665                 for i in range(n_releases)]
--> 666         res_parsed = self.parse_response(self.request_many(reqs))
    667 
    668         # Rearrange the output

/opt/conda/lib/python3.7/site-packages/pydatastream/pydatastream.py in parse_response(self, response, return_metadata)
    365             return (res, meta) if return_metadata else res
    366         if 'DataResponses' in response:  # Multiple requests
--> 367             results = [self._parse_one(r) for r in response['DataResponses']]
    368             self.last_metadata = [_[1] for _ in results]
    369             return results if return_metadata else [_[0] for _ in results]

/opt/conda/lib/python3.7/site-packages/pydatastream/pydatastream.py in <listcomp>(.0)
    365             return (res, meta) if return_metadata else res
    366         if 'DataResponses' in response:  # Multiple requests
--> 367             results = [self._parse_one(r) for r in response['DataResponses']]
    368             self.last_metadata = [_[1] for _ in results]
    369             return results if return_metadata else [_[0] for _ in results]

/opt/conda/lib/python3.7/site-packages/pydatastream/pydatastream.py in _parse_one(self, res)
    347         res = pd.concat(res).unstack(level=1).T.sort_index()
    348         res_meta['Currencies'] = meta
--> 349         return res, self._parse_meta(res_meta)
    350 
    351     def parse_response(self, response, return_metadata=False):

/opt/conda/lib/python3.7/site-packages/pydatastream/pydatastream.py in _parse_meta(meta)
    302                     res[key] = None
    303                 else:
--> 304                     names = pd.DataFrame(meta[key]).set_index('Key')['Value']
    305                     names.index.name = key.replace('Names', '')
    306                     names.name = 'Name'

/opt/conda/lib/python3.7/site-packages/pandas/core/frame.py in set_index(self, keys, drop, append, inplace, verify_integrity)
   4301 
   4302         if missing:
-> 4303             raise KeyError(f"None of {missing} are in the columns")
   4304 
   4305         if inplace:

KeyError: "None of ['Key'] are in the columns"

Looking at the code, I think maybe this construct in _parse_one is incorrect:

meta[data_type][v['Symbol']] = {_: v[_] for _ in v if _ != 'Value'}

?

This is the data returned from the API:

{'DataResponses': [{'AdditionalResponses': None,
   'DataTypeNames': [],
   'DataTypeValues': [{'DataType': 'DS.NDOR1_DATE',
     'SymbolValues': [{'Currency': None,
       'Symbol': 'USCONPRCE',
       'Type': 4,
       'Value': '/Date(1586476800000+0000)/'}]},
    {'DataType': 'DS.NDOR1_DATE_LATEST',
     'SymbolValues': [{'Currency': None,
       'Symbol': 'USCONPRCE',
       'Type': 4,
       'Value': '/Date(1586476800000+0000)/'}]},
    {'DataType': 'DS.NDOR1_TIME_GMT',
     'SymbolValues': [{'Currency': None,
       'Symbol': 'USCONPRCE',
       'Type': 6,
       'Value': '12:30'}]},
    {'DataType': 'DS.NDOR1_DATE_FLAG',
     'SymbolValues': [{'Currency': None,
       'Symbol': 'USCONPRCE',
       'Type': 6,
       'Value': 'Official'}]},
    {'DataType': 'DS.NDOR1_REF_PERIOD',
     'SymbolValues': [{'Currency': None,
       'Symbol': 'USCONPRCE',
       'Type': 4,
       'Value': '/Date(1584230400000+0000)/'}]},
    {'DataType': 'DS.NDOR1_TYPE',
     'SymbolValues': [{'Currency': None,
       'Symbol': 'USCONPRCE',
       'Type': 6,
       'Value': 'NewValue'}]}],
   'Dates': None,
   'SymbolNames': [{'Key': 'USCONPRCE', 'Value': 'USCONPRCE'}],
   'Tag': None}],
 'Properties': None}

Regards,
Mikael

string encoding error

When importing a data item which has a currency symbol in the title I get the following error:

"UnicodeEncodeError: 'ascii' codec can't encode character u'\xa3' in position 0: ordinal not in range(128)"

This was when importing symbol "USDOLLR" with field "ER" (i.e. DWE.fetch('USDOLLR', 'ER', date_from='2015')). The title of the data series contains both the £ and $ characters.

DSPermissionPassword

I'm getting the following error (though things were working as recently as a few days ago). Do you know why this might be happening?

suds.WebFault: b"Server raised fault: 'User 'DS:XXXXXX' logon failed: Delegate Source 'DSPermissionsPassword' reports: No response from source DSPermissionsPassword'"

Get an error when proxy is enabled

File "C:\Users\abanerj1\AppData\Roaming\Python\Python35\site-packages\pydatastream\pydatastream.py", line 477, in fetch
query = self.construct_request(tickers, fields, date, date_from, date_to, freq)
File "C:\Users\abanerj1\AppData\Roaming\Python\Python35\site-packages\pydatastream\pydatastream.py", line 425, in construct_request
if isinstance(fields, basestring):
NameError: name 'basestring' is not defined
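The NameError comes from running Python 2 code (basestring) under Python 3; a common compatibility shim, shown here only as a sketch of one possible fix rather than how the package actually resolved it:

# Python 2/3 compatibility shim for the missing basestring name.
try:
    string_types = basestring   # Python 2
except NameError:
    string_types = str          # Python 3

# construct_request could then test string inputs with:
# if isinstance(fields, string_types):
#     fields = [fields]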

How to pull Euro denominated data for multiple firms?

Just wondering how to obtain euro-denominated data when I want to retrieve data from multiple firms and for multiple items, for example for the request below:

res = DWE.fetch(['@AAPL','U:MMM'], fields=['P','MV','VO','PH'], date_from='2000-05-03')
print res['MV'].unstack(level=0)

I tried to shoehorn in ~~EUR, but to no avail...
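One thing worth trying, by analogy with the ~U$ suffix used in the DS.fetch() example in another issue above (the ~EUR suffix on datatype expressions is an assumption about Datastream syntax, not something confirmed by the package documentation):

# Assumed syntax: a currency suffix per datatype expression, analogous to X(P)~U$.
res = DWE.fetch(['@AAPL', 'U:MMM'],
                fields=['X(P)~EUR', 'X(MV)~EUR'],
                date_from='2000-05-03')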

0.6.2 missing on PyPI

Hi.

I really appreciate this convenient library. Is it possible to get 0.6.2 on PyPI?

I can just use the source code, of course, but it would be nice to have it available from pip.

Regards,
Mikael

"Static" throws an errors

Hi,

I am trying to make a static request; however, it throws an error:

I also tried the example from the README; same error.

Martien

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-f2e5df257a85> in <module>()
      1 #anz_perp = DWE.fetch(['A:ANZB'], ['P'], date_from='2013-01-01')
----> 2 anz_perp = DWE.fetch(['A:ANZB'],['NAME'], static=True)

TypeError: fetch() got an unexpected keyword argument 'static'

Bug: InvalidToken exception raised from request_many

We're observing occurrences of InvalidToken exceptions when calling the request_many function (a screenshot is attached to the original issue).

There are two underlying issues that need to be addressed in the _token_is_expired property:

  1. Datastream's TokenExpiry is a UTC datetime, but is compared with the local datetime pd.Timestamp('now'). Suggestion for line 195: if self._token['TokenExpiry'] < pd.Timestamp.utcnow() - pd.Timedelta(minutes=15):.
  2. Datastream returns tokens that are supposedly valid for 24 hours, but in practice does not seem to honour that commitment. Therefore, I think it would be safer to also say that a token expires after 1 hour, in addition to (1). (The reason we believe this is necessary is that we're in UTC+2, which means pydatastream should refresh the token 2 hours and 15 minutes before the 24-hour period is over, and even so we encounter InvalidTokens.) A sketch combining both points is included below.

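A minimal sketch combining both points (an illustration only, not the package's actual property; 'requested_at' is a hypothetical extra field recorded when the token is obtained, used for the 1-hour cap):

# Sketch of a safer expiry check: compare in UTC and cap token age at 1 hour.
import pandas as pd

def token_is_expired(token):
    if token is None:
        return True
    now = pd.Timestamp.utcnow()
    # point (1): TokenExpiry is tz-aware UTC; refresh with a 15-minute safety margin
    if now > token['TokenExpiry'] - pd.Timedelta(minutes=15):
        return True
    # point (2): additionally treat any token older than 1 hour as expired
    if now > token['requested_at'] + pd.Timedelta(hours=1):
        return True
    return False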
