pypgtable's People

Contributors

shapedsundew9

pypgtable's Issues

A memory leak test

The test should cycle repeatedly through the set of operations.
It needs to monitor process memory over time and look for growth.
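
A minimal sketch of such a check, using only the standard library. The helper name `memory_growth` and its parameters are illustrative assumptions, not part of pypgtable. Note that `tracemalloc` only sees memory allocated through Python's allocators, so growth inside a C library would need a process-level measurement instead.

```python
import gc
import tracemalloc


def memory_growth(operation, cycles=100, warmup=10):
    """Return the net bytes of traced memory gained over `cycles` calls.

    `operation` is any zero-argument callable, e.g. one pass through the
    set of table operations (hypothetical; supplied by the test).
    """
    # Warm up so one-off caches and lazy imports are not counted as growth.
    for _ in range(warmup):
        operation()
    gc.collect()
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    for _ in range(cycles):
        operation()
    gc.collect()
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before
```

A test could then assert that the growth stays under a tolerance chosen for the workload, since a small amount of noise is normal.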

Look at the handling of empty lists in WHERE clauses

DEBUG pypgtable.row_iterators:row_iterators.py:43 Closing held DB cursor.
DEBUG pypgtable.raw_table:raw_table.py:254 SELECT "population_hash" FROM "gene_pool_populations" WHERE "population_hash" = ANY('{}')
================================================================================== short test summary info ===================================================================================
FAILED tests/test_population_config.py::test_configure_populations_empty - psycopg2.errors.UndefinedObject: could not find array type for data type bytea[]
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
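
One way to sidestep the failing query above is to skip the round trip when the list is empty, since no row can match an empty set. A minimal sketch, assuming a hypothetical `run_query` callable that executes SQL with parameters and returns rows; neither it nor `select_existing_hashes` is part of pypgtable:

```python
def select_existing_hashes(run_query, population_hashes):
    """Return the stored rows whose hash is in population_hashes.

    run_query is a hypothetical callable(sql, params) -> list of rows.
    """
    hashes = list(population_hashes)
    if not hashes:
        # An empty list matches nothing, so never send ANY('{}') to the server.
        return []
    sql = ('SELECT "population_hash" FROM "gene_pool_populations" '
           'WHERE "population_hash" = ANY(%s)')
    # psycopg2 adapts a Python list to a PostgreSQL array.
    return run_query(sql, (hashes,))
```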

Add execute_on_create() for arbitrary SQL.

If a process creates a table it may want to add triggers or functions.
SQL CREATE OR REPLACE FUNCTION can be used, but then the function is redefined every time a process spins up, which seems somewhat wasteful. Since CREATE FUNCTION has no IF NOT EXISTS form, we can add something that is only run if the process creates the table.
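
The idea can be sketched as follows. This is an illustration only: it uses the standard library's `sqlite3` as a stand-in for PostgreSQL so that it runs anywhere, and `create_table` with its `execute_on_create` argument is a hypothetical signature, not the pypgtable API.

```python
import sqlite3


def create_table(conn, name, ddl, execute_on_create=()):
    """Create the table if it is missing.

    The statements in execute_on_create run only when this call actually
    created the table. Returns True if the table was created.
    """
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone() is not None
    if exists:
        # Another process (or an earlier run) already did the one-off setup.
        return False
    conn.execute(ddl)
    for statement in execute_on_create:
        conn.execute(statement)
    conn.commit()
    return True
```

With several processes starting at once, the existence check and the creation would need to be made atomic, which this sketch does not attempt.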

Fix import

Should be:

from pypgtable import table, raw_table

Add iterable of column names to register_conversion() in table

It is annoying to have to register the same conversion once per column, e.g.
mission_table.register_conversion('timestamp', str_to_date, date_to_str)
mission_table.register_conversion('execution_start', str_to_date, date_to_str)
mission_table.register_conversion('execution_end', str_to_date, date_to_str)
mission_table.register_conversion('created', str_to_date, date_to_str)

-->

mission_table.register_conversion(('timestamp', 'execution_start', 'execution_end', 'created'), str_to_date, date_to_str)
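
Until the method accepts an iterable, a small wrapper gives the same effect. A sketch, assuming only the three-argument `register_conversion()` call shown above; the helper name `register_conversions` is hypothetical:

```python
def register_conversions(table, columns, encode, decode):
    """Register the same conversion pair for one or many columns.

    columns may be a single column name or any iterable of names.
    """
    if isinstance(columns, str):
        # A bare string is one column, not an iterable of characters.
        columns = (columns,)
    for column in columns:
        table.register_conversion(column, encode, decode)
```

The `isinstance` check is there so that existing single-column calls keep working unchanged.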

Make start up a state machine

The interaction of create_, delete_, wait_* is not clear and is prone to error.
Create a state matrix to show how state transitions occur.
Update the construction code to follow the matrix and be more readable and reliable.
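
A state matrix of this kind can be expressed as a lookup table. The sketch below is purely illustrative: the state and event names are invented for the example and are not taken from pypgtable's configuration or code.

```python
# Hypothetical states and events, for illustration only.
TRANSITIONS = {
    ("START", "exists"): "READY",
    ("START", "missing_create"): "CREATING",
    ("START", "missing_wait"): "WAITING",
    ("START", "missing"): "FAILED",
    ("WAITING", "exists"): "READY",
    ("WAITING", "timeout"): "FAILED",
    ("CREATING", "created"): "READY",
    ("CREATING", "error"): "FAILED",
}


def step(state, event):
    """Return the next state, rejecting any transition not in the matrix."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"invalid transition: {state} on {event}") from None


def run(events, state="START"):
    """Apply a sequence of events and return the final state."""
    for event in events:
        state = step(state, event)
    return state
```

Keeping every legal transition in one table makes the unsupported combinations explicit, because anything absent from the table raises instead of silently proceeding.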

Remove data validation

table.validate() & raw_table.validate() make little sense.
They are isolated functions that do not have any interaction with the table.
There is no validation on initial population because it is reasonable to assume the application has already vetted the data file (or produced it at some point).

Validation of the data beyond what psycopg2 & PostgreSQL do is the application's responsibility.
Remove it.

Add typing!

Consider TypedDict class extensions for inheritance too.
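
A sketch of what TypedDict inheritance could look like here. The keys `table`, `schema` and `data_files` appear in the configuration errors reported elsewhere on this page; the value types, the class names and the `conversions` key are assumptions made for the example.

```python
from typing import Any, Dict, List, TypedDict


class RawTableConfig(TypedDict, total=False):
    """Keys assumed to be understood by raw_table (types are guesses)."""
    table: str
    schema: Dict[str, Dict[str, Any]]
    data_files: List[str]


class TableConfig(RawTableConfig, total=False):
    """Extends the raw config; 'conversions' is a hypothetical extra key."""
    conversions: List[str]


# At runtime a TypedDict instance is a plain dict; the benefit is static checking.
config: TableConfig = {"table": "missions", "data_files": []}
```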

Error creating pkdict with all columns

missions = mission_table.select(container='pkdict')

resulted in

Traceback (most recent call last):
  File "history.py", line 35, in <module>
    missions = mission_table.select(container='pkdict')
  File "/usr/local/lib/python3.8/dist-packages/pypgtable/table.py", line 260, in select
    _columns.append(self.raw._primary_key)
AttributeError: 'tuple' object has no attribute 'append'

The issue is that self._columns is a tuple if columns = '*' in the select() call. NB: There is no need to add the PK if all columns are selected.
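
A possible shape for the fix, written as a standalone function so it can be tested in isolation. The name `columns_with_primary_key` is hypothetical and the real select() may organise this differently.

```python
def columns_with_primary_key(columns, primary_key):
    """Return the columns to select so that the primary key is included."""
    if columns == '*':
        # All columns already include the primary key; nothing to add.
        return '*'
    # Copy into a list: this accepts tuples and never mutates the caller's value.
    selected = list(columns)
    if primary_key not in selected:
        selected.append(primary_key)
    return selected
```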

Better present Cerberus validation errors

e.g.

def _validate_config(self):
    """Validate the table configuration."""
    if not raw_table_config_validator.validate(self.config):
        error_str = '\n'
        for field, error in raw_table_config_validator.errors.items():
            tt = text_token({'E05000': {'error': field + ': ' + str(error)}})
            _logger.error(tt)
            error_str += str(tt) + '\n'

        raise ValueError(error_str)

E ValueError:
E E05000: Configuration error: data_files: ['Data file /home/shapedsundew9/Projects/egp-genomic-library/genomic_library/data/mutations.json empty or invalid.']
E E05000: Configuration error: dbname: ['unknown field']
E E05000: Configuration error: schema: [{'count': ["value does not match regex '[a-zA-Z][a-zA-Z0-9-]'"], 'evolvability': ["value does not match regex '[a-zA-Z][a-zA-Z0-9-]'"], 'reference_count': ["value does not match regex '[a-zA-Z][a-zA-Z0-9-]*'"], 'alpha_class': ['Value: True, Rule: None, Constraint: None'], 'ancestor': ['Value: True, Rule: None, Constraint: None'], 'beta_class': ['Value: True, Rule: None, Constraint: None'], 'gca': ['Value: True, Rule: None, Constraint: None'], 'gcb': ['Value: True, Rule: None, Constraint: None'], 'input_types': ['Value: True, Rule: None, Constraint: None', 'Value: True, Rule: None, Constraint: None'], 'meta_data': ['Value: True, Rule: None, Constraint: None'], 'output_types': ['Value: True, Rule: None, Constraint: None', 'Value: True, Rule: None, Constraint: None']}]
E E05000: Configuration error: table: ['required field']

../../.local/lib/python3.8/site-packages/pypgtable/raw_table.py:140: ValueError

Nested lists and dictionaries of errors need to be presented readably, one error per line.
This function should be moved to base_validator.py in utils.
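
One possible way to flatten the nested error structure into one line per problem. This is a sketch that works on plain dicts and lists shaped like the output above; `format_errors` is a hypothetical name and the function does not import Cerberus.

```python
def format_errors(errors, prefix=""):
    """Flatten a nested {field: [problem, ...]} mapping into readable lines.

    A problem is either a message string or a nested mapping of the same shape.
    """
    lines = []
    for field, problems in errors.items():
        path = f"{prefix}.{field}" if prefix else str(field)
        for problem in problems:
            if isinstance(problem, dict):
                # Nested schema errors: recurse, extending the dotted path.
                lines.extend(format_errors(problem, path))
            else:
                lines.append(f"{path}: {problem}")
    return lines
```

Each returned line could then be wrapped in its own text_token and logged separately, rather than logging the whole nested structure as one string.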

x IN () is invalid SQL

Passing an empty tuple into a query such as
"WHERE x IN {my_tuple}" where {"my_tuple": tuple()} will result in an SQL syntax error:
ERROR: syntax error at or near ")"

To avoid this you can either:

  1. Design out the possibility of an empty tuple() occurring.
  2. Conditionally choose your SQL string to add or remove the expression depending on the tuple contents.
  3. Set 'expand_null_tuples' to True in the table config which will expand an empty tuple to be (None,) and wrap expressions thus:
    WHERE (x IN {my_tuple}) IS TRUE
    or for the intended case of "WHERE x NOT IN {my_tuple}"
    WHERE (x NOT IN {my_tuple}) IS NOT FALSE
    This will ensure the correct match when a NULL result is returned from the expression.
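
Option 2 above can be sketched as a helper that picks the SQL fragment from the tuple contents. The function name `in_clause` and the psycopg2-style placeholder in the usage are assumptions for illustration; they are not pypgtable's interface.

```python
def in_clause(column, placeholder, values, negate=False):
    """Build an IN / NOT IN fragment that stays valid for an empty tuple.

    placeholder is whatever text the driver substitutes, e.g. '%(my_tuple)s'.
    """
    if not values:
        # x IN () can match nothing; x NOT IN () matches every row.
        return "TRUE" if negate else "FALSE"
    operator = "NOT IN" if negate else "IN"
    return f"{column} {operator} {placeholder}"
```

For example, `in_clause("x", "%(my_tuple)s", ())` yields a constant condition, so the empty tuple never reaches the SQL parser.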
