
db's People

Contributors

epogrebnyak, eskarimov, lynxrv21, mwangikinuthia, perevedko, sppps, varnie, vasmitr, wintercomes, zarak


db's Issues

can we prevent CSV output to be rendered as html in browser?

When looking at the source of the CSV output I can see the original CSV file easily:

view-source:https://minikep-db.herokuapp.com/api/datapoints?name=USDRUR_CB&freq=d&start_date=2017-08-01&end_date=2017-10-01

...but in the browser it gets rendered as HTML. Can we prevent HTML rendering in the browser? What controls it? Some headers?
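Browser rendering is controlled by the Content-Type response header. A minimal Flask sketch (the route name is hypothetical): serving the body as text/csv instead of text/html stops the browser from interpreting it as a page.

```python
from flask import Flask, Response

app = Flask(__name__)

# Hypothetical route for illustration only.
@app.route('/api/datapoints.csv')
def datapoints_csv():
    csv_text = ",GDP_yoy\n2013-12-31,101.3\n"
    # text/csv (or text/plain) prevents the browser from rendering HTML
    return Response(csv_text, mimetype='text/csv')

client = app.test_client()
resp = client.get('/api/datapoints.csv')
```

With `mimetype='text/csv'` most browsers either show the raw text or offer a download, depending on their CSV handling.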

commented tests must relate to different functions in views.py

db/tests/test_views.py

Lines 189 to 217 in 3cec7d0

#TODO: these tests should relate to something else not covered in query.py
#class TestGetResponseDatapoints(TestCaseBase):
#
#    data_dicts = [{"date": "1999-01-31", "freq": "m", "name": "CPI_ALCOHOL_rog", "value": 109.7},
#                  {"date": "1999-01-31", "freq": "m", "name": "CPI_FOOD_rog", "value": 110.4},
#                  {"date": "1999-01-31", "freq": "m", "name": "CPI_NONFOOD_rog", "value": 106.2}]
#
#    def _make_sample_datapoints_list(self):
#        return [Datapoint(**params) for params in self.data_dicts]
#
#    def test_json_serialising_is_valid(self):
#        data = self._make_sample_datapoints_list()
#        response = get_datapoints_response(data, 'json')
#        parsed_json = json.loads(response.data)
#        self.assertEqual(self.data_dicts, parsed_json)
#
#    def test_csv_serialising_is_valid(self):
#        data = self._make_sample_datapoints_list()
#        response = get_datapoints_response(data, 'csv')
#        csv_string = str(response.data, 'utf-8')
#        self.assertEqual(
#            ',CPI_ALCOHOL_rog\n1999-01-31,109.7\n1999-01-31,110.4\n1999-01-31,106.2\n', csv_string)
#
#    def test_invalid_output_format_should_fail(self):
#        data = self._make_sample_datapoints_list()
#        with self.assertRaises(CustomError400):
#            get_datapoints_response(data, 'html')

buggy TokenHelper.__as_date

Hello everybody.

This link http://mini-kep.herokuapp.com/ru/series/GDP/a/yoy/12311111111111111111
produces:

Internal Server Error

The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.

because TokenHelper.__as_date method is buggy.

Here's the relevant stacktrace:

127.0.0.1 - - [24/Oct/2017 16:26:24] "GET /ru/series/GDP/a/yoy/12311111111111111111 HTTP/1.1" 500 -
Traceback (most recent call last):
  File "/home/varnie/thrash/my-virt-environments/lib/python3.6/site-packages/flask/app.py", line 1997, in __call__
    return self.wsgi_app(environ, start_response)
  File "/home/varnie/thrash/my-virt-environments/lib/python3.6/site-packages/flask/app.py", line 1985, in wsgi_app
    response = self.handle_exception(e)
  File "/home/varnie/thrash/my-virt-environments/lib/python3.6/site-packages/flask/app.py", line 1540, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/home/varnie/thrash/my-virt-environments/lib/python3.6/site-packages/flask/_compat.py", line 33, in reraise
    raise value
  File "/home/varnie/thrash/my-virt-environments/lib/python3.6/site-packages/flask/app.py", line 1982, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/varnie/thrash/my-virt-environments/lib/python3.6/site-packages/flask/app.py", line 1614, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/varnie/thrash/my-virt-environments/lib/python3.6/site-packages/flask/app.py", line 1517, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/varnie/thrash/my-virt-environments/lib/python3.6/site-packages/flask/_compat.py", line 33, in reraise
    raise value
  File "/home/varnie/thrash/my-virt-environments/lib/python3.6/site-packages/flask/app.py", line 1612, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/varnie/thrash/my-virt-environments/lib/python3.6/site-packages/flask/app.py", line 1598, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/varnie/thrash/projects/db/db/custom_api/views.py", line 22, in time_series_api_interface
    return custom_api.CustomGET(domain, varname, freq, inner_path).get_csv_response()
  File "/home/varnie/thrash/projects/db/db/custom_api/custom_api.py", line 180, in __init__
    ip = InnerPath(inner_path)
  File "/home/varnie/thrash/projects/db/db/custom_api/custom_api.py", line 141, in __init__
    self.dates = helper.get_dates_dict()
  File "/home/varnie/thrash/projects/db/db/custom_api/custom_api.py", line 73, in get_dates_dict
    result['start_date'] = self._as_date(start_year, month=1, day=1)
  File "/home/varnie/thrash/projects/db/db/custom_api/custom_api.py", line 107, in _as_date
    day=day).strftime('%Y-%m-%d')
OverflowError: Python int too large to convert to C long
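One way to fix this is to validate the year before handing it to datetime, which cannot represent years above datetime.MAXYEAR. A sketch of a guarded replacement (the function name is hypothetical; the view layer would map the ValueError to a 400 response instead of a 500):

```python
import datetime

def as_date_safe(year, month, day):
    # Guarded sketch of TokenHelper.__as_date: reject years that
    # datetime cannot represent instead of letting them overflow.
    if not (datetime.MINYEAR <= year <= datetime.MAXYEAR):
        raise ValueError('year out of range: {}'.format(year))
    return datetime.date(year, month, day).strftime('%Y-%m-%d')
```

With this guard, the URL above would raise a catchable ValueError for year 12311111111111111111 rather than an OverflowError deep inside strftime.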

Also, not sure whether it is relevant or not, but the following link:
https://minikep-db.herokuapp.com/api/datapoints?name=BRENT&freq=d&start_date=2017-01-01
also produces:

Internal Server Error

The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.

(I've taken this link from the README.MD: http://joxi.ru/J2b57NMiXJWWLm )

Thank you for your time.

pipeline discussion

This goes from parser 1 to database
====================================
[
    {
        "date": "2014-01-31",
        "freq": "m",
        "varname": "CPI_rog",
        "value": 100.6
    },

    ....

    {
        "date": "2015-12-31",
        "freq": "m",
        "varname": "CPI_rog",
        "value": 100.8
    },
    {
        "date": "2015-12-31",
        "freq": "m",
        "varname": "RUR_EUR_eop",
        "value": 79.7
    }
]

This goes from parser 2 to database
===================================
[
    {
        "date": "2014-03-31",
        "freq": "q",
        "varname": "CPI_rog",
        "value": 102.3
    }
]

User query
==========
{'end': '2015-12',
 'freq': 'm',
 'start': '2014-01',
 'varnames': ['CPI_rog', 'RUR_EUR_eop']}

App response to query: json with epoch timestamps
=================================================
This format is the default input to the user-side reader function pd.read_json():
{'CPI_rog': {'1391126400000': 100.6,
             '1393545600000': 100.7,
             ...
             '1448841600000': 100.8,
             '1451520000000': 100.8},
 'RUR_EUR_eop': {'1391126400000': 48.1,
                 '1393545600000': 49.35,
                 ...
                 '1448841600000': 70.39,
                 '1451520000000': 79.7}}

User's local dataframe
======================
            CPI_rog  RUR_EUR_eop
2014-01-31    100.6        48.10
2014-02-28    100.7        49.35
2014-03-31    101.0        49.05
2014-04-30    100.9        49.51
2014-05-31    100.9        47.27
2014-06-30    100.6        45.83
2014-07-31    100.5        47.90
2014-08-31    100.2        48.63
2014-09-30    100.7        49.95
2014-10-31    100.8        54.64
2014-11-30    101.3        61.41
2014-12-31    102.6        68.34
2015-01-31    103.9        78.11
2015-02-28    102.2        68.69
2015-03-31    101.2        63.37
2015-04-30    100.5        56.81
2015-05-31    100.4        58.01
2015-06-30    100.2        61.52
2015-07-31    100.8        64.65
2015-08-31    100.4        75.05
2015-09-30    100.6        74.58
2015-10-31    100.7        70.75
2015-11-30    100.8        70.39
2015-12-31    100.8        79.70
Identical to source data: True
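The round trip above can be sketched on the user side. This is a two-row slice of the response payload with made-up truncation, not the full dataset; pd.read_json parses the epoch-millisecond keys back into a date index by default:

```python
import io
import pandas as pd

# Hypothetical two-row slice of the app response above (epoch-ms keys).
payload = """{"CPI_rog": {"1391126400000": 100.6, "1448841600000": 100.8},
 "RUR_EUR_eop": {"1391126400000": 48.1, "1448841600000": 70.39}}"""

# read_json with the default orient='columns' rebuilds the dataframe;
# convert_axes (default True) turns the epoch keys into datetimes
df = pd.read_json(io.StringIO(payload))
```

The resulting frame has one column per variable name, matching the "User's local dataframe" listing above.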

unclear test - maybe parametrise?

db/tests/test_views.py

Lines 184 to 197 in 9108da4

# FIXME: ------------------------------------------------------------------
def test_get_names_on_random_freq_returns_sorted_list_of_names_for_given_random_freq(self):
    random_freq = self.query_random_freq_from_test_data()
    response = self.query_names_for_freq(freq=random_freq)
    result = json.loads(response.get_data().decode('utf-8'))
    # expected result
    expected_result = []
    for row in read_test_data():
        if row['freq'] == random_freq and row['name'] not in expected_result:
            expected_result.append(row['name'])
    expected_result = sorted(expected_result)
    # check
    assert result == expected_result
# ------------------------------------------------------------------------
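The expected-result loop could be factored into a helper, after which the test can be parametrized over concrete freq values with pytest.mark.parametrize instead of picking a random one. A sketch of the helper (sample rows are made up):

```python
def expected_names(rows, freq):
    # unique names for a frequency, sorted: equivalent to the loop in the test
    return sorted({row['name'] for row in rows if row['freq'] == freq})

sample = [dict(freq='m', name='CPI_rog'), dict(freq='a', name='GDP_yoy'),
          dict(freq='m', name='CPI_rog'), dict(freq='m', name='BRENT')]
```

Parametrizing over each frequency in the test data makes the test deterministic and its name shorter.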

change tests for 'custom_api' to 'decomposer'

_________________ ERROR collecting tests/test_custom_api.py __________________
ImportError while importing test module 'C:\Users\Евгений\Documents\GitHub\mini-kep-db\tests\test_custom_api.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
tests\test_custom_api.py:2: in <module>
    import db.custom_api.custom_api as custom_api
E   ModuleNotFoundError: No module named 'db.custom_api.custom_api'

new api/dataframe endpoint

Based on the functionality from #37 we can develop the following endpoints:

api/dataframe?freq=a
api/dataframe?freq=q
api/dataframe?freq=m
api/dataframe?freq=d

Each endpoint should provide a pandas-readable CSV with all variables at that frequency.

It should also accept name, start_date and end_date parameters.
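"Pandas-readable" can be pinned down with a client-side sketch. The CSV layout below (date index column plus one column per variable) is assumed, mirroring the api/frame example later in this page:

```python
import io
import pandas as pd

# Hypothetical payload in the shape api/dataframe?freq=a should return.
csv_text = ",GDP_yoy,CPI_rog\n2015-12-31,97.2,112.9\n2016-12-31,99.8,105.4\n"

# index_col=0 takes the unnamed first column as the index;
# parse_dates=True turns it into a DatetimeIndex
df = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)
```

In production the io.StringIO would be replaced by the endpoint URL passed straight to pd.read_csv.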

API extensions

  • api/datapoints/varnames/{freq} lists all variable names available at frequency {freq} (assumed to be one of aqmwd)
  • variable descriptions, e.g. GDP -> Gross domestic product
  • use unit descriptions
  • the varname splitter goes to a helper function; we will need to use it locally

unclear test

db/tests/test_queries.py

Lines 54 to 60 in 346ae2c

def test_upsert_updates_value_for_existing_row(self):
    upsert(self.dp1_dict)
    dp1_updated_value = self.dp1_dict['value'] + 4.56
    dp1_dict_with_new_value = {k: v if k != "value" else dp1_updated_value
                               for k, v in self.dp1_dict.items()}
    upsert(dp1_dict_with_new_value)
    datapoint = select_datapoints(**self.dp1_search_param).first()
    self.assertEqual(datapoint.value, dp1_updated_value)

arrange variables in sections

For the purposes of listing variables in groups and easier browsing we need a grouping dictionary that lists variable nameheads.

Example:

{
    'GDP components': ['GDP', 'INVESTMENT'],
    'Prices': ['CPI', 'CPI_FOOD', 'CPI_NONFOOD', 'CPI_ALCOHOL'],
    'Foreign trade': ['EXPORT_GOODS', 'IMPORT_GOODS'],
    'Exchange rates': ['USDRUR_CB'],
    'Interest rates': []
}

Based on the nameheads and the variable name splitting method one can derive the list of actual variable names like GDP_yoy.
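The derivation step can be sketched as suffix expansion. The unit lists below are illustrative; in the real code they would come from the variable name splitting method:

```python
SECTIONS = {'GDP components': ['GDP', 'INVESTMENT']}
# assumed unit suffixes per namehead, for illustration only
UNITS = {'GDP': ['yoy'], 'INVESTMENT': ['yoy', 'rog']}

def expand(namehead):
    # derive actual variable names like GDP_yoy from a namehead
    return ['{}_{}'.format(namehead, unit) for unit in UNITS.get(namehead, [])]

names = [name for head in SECTIONS['GDP components'] for name in expand(head)]
```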

change datapoints parameter handler class

Original class:

db/db/api/utils.py

Lines 51 to 143 in 6536d1b

class DatapointParameters:
    """Parameter handler for api/datapoints endpoint."""

    def __init__(self, args):
        self.args = args
        self.name = self.get_name()
        if not self.name:
            raise CustomError400("<name> parameter is required")
        self.freq = self.get_freq()
        if not self.freq:
            raise CustomError400("<freq> parameter is required")

    def get_freq(self):
        freq = self.args.get('freq')
        self.validate_freq_exist(freq)
        return freq

    def get_name(self):
        freq = self.get_freq()
        name = self.args.get('name')
        self.validate_name_exist_for_given_freq(freq, name)
        return name

    def get_start(self):
        start_dt = self.get_dt('start_date')
        if start_dt:
            self.validate_start_is_not_in_future(start_dt)
        return start_dt

    def get_end(self):
        end_dt = self.get_dt('end_date')
        start_dt = self.get_start()
        if start_dt and end_dt:
            self.validate_end_date_after_start_date(start_dt, end_dt)
        return end_dt

    def get_dt(self, key: str):
        dt = None
        date_str = self.args.get(key)
        if date_str:
            dt = to_date(date_str)
        return dt

    def _get_boundary(self, direction):
        query = queries.get_boundary_date(self.freq, self.name, direction)
        return date_as_str(query)

    def get_min_date(self):
        return self._get_boundary(direction='start')

    def get_max_date(self):
        return self._get_boundary(direction='end')

    def get(self):
        """Return query parameters as dictionary."""
        return dict(name=self.name,
                    freq=self.freq,
                    start_date=self.get_start(),
                    end_date=self.get_end())

    @staticmethod
    def validate_freq_exist(freq):
        allowed = list(queries.select_unique_frequencies())
        if freq in allowed:
            return True
        else:
            raise CustomError400(message=f'Invalid frequency <{freq}>',
                                 payload={'allowed': allowed})

    @staticmethod
    def validate_name_exist_for_given_freq(freq, name):
        possible_names = queries.possible_names_values(freq)
        if name in possible_names:
            return True
        else:
            msg = f'No such name <{name}> for <{freq}> frequency.'
            raise CustomError400(message=msg,
                                 payload={"allowed": possible_names})

    @staticmethod
    def validate_start_is_not_in_future(start_date):
        current_date = datetime.date(datetime.utcnow())
        # TODO: test on date = today must pass
        if start_date > current_date:
            raise CustomError400('Start date cannot be in future')
        else:
            return True

    @staticmethod
    def validate_end_date_after_start_date(start_date, end_date):
        if end_date < start_date:
            raise CustomError400('End date must be after start date')
        else:
            return True

duplicate code for base testcase

db/tests/test_views.py

Lines 49 to 87 in 9108da4

class TestCaseBase(unittest.TestCase):
    """Base class for testing flask application.

    Use to compose setUp method:
        self._prepare_app()
        self._mount_blueprint()
        self._prepare_db()
        self._start_client()
    """

    def _prepare_db(self):
        data = read_test_data()
        for datapoint in data:
            datapoint['date'] = utils.to_date(datapoint['date'])
        fsa_db.session.bulk_insert_mappings(Datapoint, data)

    def _prepare_app(self):
        self.app = make_app()
        self.app_context = self.app.app_context()
        self.app_context.push()
        fsa_db.init_app(app=self.app)
        fsa_db.create_all()

    def _mount_blueprint(self):
        self.app.register_blueprint(api_module)

    def _start_client(self):
        self.client = self.app.test_client()

    def setUp(self):
        self._prepare_app()

    def tearDown(self):
        fsa_db.session.remove()
        fsa_db.drop_all()
        self.app_context.pop()

    def test_app_exists(self):
        self.assertTrue(current_app is not None)

db/tests/test_basic.py

Lines 54 to 85 in 9108da4

class TestCaseBase(unittest.TestCase):
    def prepare_db(self):
        data = read_test_data()
        for datapoint in data:
            datapoint['date'] = utils.to_date(datapoint['date'])
        fsa_db.session.bulk_insert_mappings(Datapoint, data)

    def prepare_app(self):
        self.app = make_app()
        self.app_context = self.app.app_context()
        self.app_context.push()
        self.client = self.app.test_client()
        fsa_db.init_app(app=self.app)
        fsa_db.create_all()

    def mount_blueprint(self):
        self.app.register_blueprint(api_module)
        self.app.register_blueprint(custom_api_bp)

    def start_client(self):
        self.client = self.app.test_client()

    def setUp(self):
        self.prepare_app()

    def tearDown(self):
        fsa_db.session.remove()
        fsa_db.drop_all()
        self.app_context.pop()

    def test_app_exists(self):
        self.assertTrue(current_app is not None)

select error handling function for views.py

This handler suppresses validation error messages and hurts testing (note it is registered for 400 but responds with 422):

# Return validation errors as JSON
@api.errorhandler(400)
def handle_validation_error(err):
    exc = err.exc
    return jsonify({'errors': exc.messages}), 422

#@api.errorhandler(CustomError400)
#def handle_invalid_usage(error):
#    """
#    Generate a json object of a custom error
#    """
#    response = jsonify(error.to_dict())
#    response.status_code = error.status_code
#    return response

'api/frame' serves csv with many variables

http://minikep-db.herokuapp.com/api/datapoints?freq=a&name=GDP_yoy&start_date=2013-12-31:

,GDP_yoy
2013-12-31,101.3
2014-12-31,100.7
2015-12-31,97.2
2016-12-31,99.8

http://minikep-db.herokuapp.com/api/datapoints?freq=a&name=CPI_rog&start_date=2013-12-31:

,CPI_rog
2013-12-31,106.5
2014-12-31,111.4
2015-12-31,112.9
2016-12-31,105.4

Need to implement:

http://minikep-db.herokuapp.com/api/frame?freq=a&names=GDP_yoy,CPI_rog&start_date=2013-12-31:

,GDP_yoy,CPI_rog
2013-12-31,101.3,106.5
2014-12-31,100.7,111.4
2015-12-31,97.2,112.9
2016-12-31,99.8,105.4

This concatenation is easier when the time index is the same; for daily data this may not be the case.

Todo:

  • pandas code to merge dataframes
  • example for daily data with different dates in time index
  • test based on this pandas code
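A minimal sketch of the first two todo items: two daily series with different time indexes, joined with pd.concat. Names and values are made up for illustration:

```python
import pandas as pd

# Hypothetical daily series as two separate api/datapoints responses
# might yield them; note the indexes only partly overlap.
brent = pd.Series([50.1, 50.5],
                  index=pd.to_datetime(['2017-01-02', '2017-01-03']),
                  name='BRENT')
usd = pd.Series([60.2, 60.4],
                index=pd.to_datetime(['2017-01-03', '2017-01-04']),
                name='USDRUR_CB')

# axis=1 with the default outer join keeps every date from both
# indexes; missing observations become NaN
frame = pd.concat([brent, usd], axis=1)
```

The same frame, written out with frame.to_csv(), is one candidate server-side output format.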

To discuss:

  1. ideas for server-side implementation

    • may use same pandas code, but it may be rather slow
    • may construct csv file
  2. how should the json output behave: provide a data structure readable by pandas into a dataframe,
    or provide a similar simple listing

variable text descriptions

The parsers should be able to emit variable descriptions in the form of a list of dictionaries:

[
    {'BRENT': dict(ru='Цена нефти Brent', en='Brent oil price')},
# ...
]

We need an api/title endpoint with POST/GET methods to store/retrieve this information from the database, plus a new model to store it.

api/title?name=BRENT should return dict(ru='Цена нефти Brent', en='Brent oil price')
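The parser output above can be flattened into a lookup table that a hypothetical api/title handler would query by name. A sketch:

```python
# Parser payload as specified above: a list of one-key dictionaries.
incoming = [
    {'BRENT': dict(ru='Цена нефти Brent', en='Brent oil price')},
]

# flatten to {name: descriptions} for O(1) lookup in the GET handler
titles = {name: text for entry in incoming for name, text in entry.items()}
```

In the real endpoint the flattened mapping would live in the new database model rather than in memory.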

POST on large datasets hangs application

2017-11-08T20:45:23.097873+00:00 heroku[router]: at=error code=H12 desc="Request timeout" method=POST path="/api/incoming" host=minikep-db.herokuapp.com request_id=b554c78d-d8b8-4461-af76-f2ee4e1c68dc fwd="5.227.7.58" dyno=web.1 connect=0ms service=30490ms status=503 bytes=0 protocol=https
2017-11-08T20:45:25.292033+00:00 heroku[router]: at=error code=H12 desc="Request timeout" method=GET path="/api/info?freq=d&name=UST_30YEARDISPLAY" host=minikep-db.herokuapp.com request_id=ebbdfee2-0164-43d6-84a6-bb93aa7e3e9a fwd="54.74.71.204" dyno=web.1 connect=0ms service=30000ms status=503 bytes=0 protocol=http
2017-11-08T20:45:25.285737+00:00 heroku[router]: at=error code=H12 desc="Request timeout" method=GET path="/api/datapoints?freq=d&name=UST_30YEARDISPLAY&format=json" host=minikep-db.herokuapp.com request_id=f908e353-4c58-4afa-af3b-f1df9f4f52e0 fwd="54.74.71.204" dyno=web.1 connect=0ms service=30001ms status=503 bytes=0 protocol=http

upload tests fail

db/tests/test_views.py

Lines 27 to 54 in 3cec7d0

# FIXME ----------------------------------------------
@pytest.mark.xfail
def test_on_no_auth_token_returns_forbidden_status_error_code(self):
    response = self.client.post('/api/datapoints')
    assert response.status_code in ERROR_CODES

@pytest.mark.xfail
def test_on_new_data_upload_successfull_with_code_200(self):
    _token_dict = dict(API_TOKEN=self.app.config['API_TOKEN'])
    _data = json.dumps(self.test_data)
    response = self.get_response(data=_data, headers=_token_dict)
    assert response.status_code == 200

@pytest.mark.xfail
def test_on_existing_data_upload_successfull_with_code_200(self):
    _token_dict = dict(API_TOKEN=self.app.config['API_TOKEN'])
    _data = json.dumps(self.test_data[0:10])
    response = self.get_response(data=_data, headers=_token_dict)
    assert response.status_code == 200

@pytest.mark.xfail
def test_on_broken_data_upload_returns_error_code(self):
    _token_dict = dict(API_TOKEN=self.app.config['API_TOKEN'])
    response = self.get_response(data="___broken_json_data__", headers=_token_dict)
    assert response.status_code in ERROR_CODES
# -----------------------------------------------------

specification for datapoint db

In this pipeline, we have the following situation:

  • the parser can deliver a list of dicts, where each dict is a datapoint
  • the database should have a POST method at api/incoming to write incoming json to the db
  • the POST operation should have some authentication
  • we simplify the rules for now: all data is upserted (newer data overwrites older)
  • the database has a GET method styled around the datapoint key (variable name, frequency, start date and end date)
  • the GET operation is a public API

@SuperVasya: please extend questions/descriptions.

In this issue we want:

  • a description of db methods (based on the above)
  • a description of incoming/outgoing data
  • a list of options to implement (e.g. flask + sqlalchemy + heroku + postgres)
  • ideally, how to test this (in words/pseudocode)
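The upsert rule above can be pinned down in pseudocode form. This is a toy in-memory model, not a proposed implementation; the datapoint key is (name, freq, date) and newer data overwrites older:

```python
storage = {}  # stand-in for the datapoints table

def upsert(datapoint):
    # the key identifies a unique observation; value is overwritten on conflict
    key = (datapoint['name'], datapoint['freq'], datapoint['date'])
    storage[key] = datapoint['value']

upsert({'name': 'CPI_rog', 'freq': 'm', 'date': '2015-12-31', 'value': 100.6})
upsert({'name': 'CPI_rog', 'freq': 'm', 'date': '2015-12-31', 'value': 100.8})
```

After the second call only one row exists, holding the newer value 100.8.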

tests in database

@SuperVasya - do we need to extend the tests in database, and what is the direction for this?

error: cannot use context manager

def _find_by(session_factory, condition=None):
    # FIXME: why not working with context manager?
    # DetachedInstanceError: Instance <Datapoint at 0x9074ef0> 
    # is not bound to a Session; attribute refresh operation cannot proceed
    with Session(session_factory) as session:
        query = session.query(Datapoint)
        if condition is not None:
            return query.filter_by(**condition).all()
        else:
            return query.all()    
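One common cause of DetachedInstanceError is that commit expires loaded attributes, so the instances cannot refresh them once the session context closes. Constructing the Session with expire_on_commit=False is one possible fix. A self-contained sketch against in-memory SQLite, with a minimal stand-in model (not the project's real Datapoint):

```python
from sqlalchemy import Column, Float, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Datapoint(Base):
    # minimal stand-in for the real model, illustration only
    __tablename__ = 'datapoint'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    value = Column(Float)

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Datapoint(name='BRENT', value=50.1))
    session.commit()

# expire_on_commit=False keeps attribute values loaded, so the
# returned instances stay readable after the session context exits
with Session(engine, expire_on_commit=False) as session:
    rows = session.query(Datapoint).filter_by(name='BRENT').all()
    session.commit()
```

Without expire_on_commit=False, the commit inside the block would expire the attributes and reading rows[0].value afterwards would raise the DetachedInstanceError from the FIXME.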

text descriptions of units of measurement

{
    'rog': dict(ru='% к пред. периоду', en='% change to previous period'),
    'yoy': dict(ru='% год к году', en='% change to 12 months earlier')
}

The dictionary below can be refactored into the one above.

UNIT_NAMES = {'bln_rub': 'млрд.руб.',
              'bln_usd': 'млрд.долл.',
              'gdp_percent': '% ВВП',
              'mln_rub': 'млн.руб.',
              'rub': 'руб.',
              'rog': '% к пред. периоду',
              'yoy': '% год к году',
              'ytd': 'период с начала года',
              'pct': '%',
              'bln_tkm': 'млрд. тонно-км'}
