
penquins's People

Contributors

bfhealy, dmitryduev, kmshin1397, lpsinger, mcoughlin, theodlz, virajkaram

penquins's Issues

Unexpected behavior when filtering `find` queries of ZTF_exposures tables

When I run a find query on any ZTF_exposures table on kowalski, gloria, or melman, the query fails whenever any filtering criteria are applied. As an example, to query the ZTF_exposures_20210401 table on gloria for all exposures acquired in the g-band in field 562, ccd 4, quad 3, I compose and execute the following query:

from penquins import Kowalski

exp_query = {
    'query_type': 'find',
    'query' : {
        'catalog': 'ZTF_exposures_20210401',
        'filter': {
            'field': int(562),
            'ccd': int(4),
            'quad': int(3),
            'filter': int(1)
        },
        'projection': {
            '_id': 0
        }
    }
}

gloria = Kowalski(
    token=my_token,
    protocol='https',
    host='gloria.caltech.edu',
    port=443
)
response = gloria.query(query=exp_query)

The resulting error message is quite long, but it boils down to a read timeout:

ReadTimeoutError: HTTPSConnectionPool(host='gloria.caltech.edu', port=443): Read timed out. (read timeout=5)

If I add a limit on the number of returned entries that is smaller than the total number of entries matching the filtering criteria, the query succeeds. For example:

exp_query = {
    'query_type': 'find',
    'query' : {
        'catalog': 'ZTF_exposures_20210401',
        'filter': {
            'field': int(562),
            'ccd': int(4),
            'quad': int(3),
            'filter': int(1)
        },
        'projection': {
            '_id': 0
        }
    },
    'kwargs': {
        'limit': 10
    }
}

For this particular field+ccd+quad+filter, I can increase the limit up to 243 without any errors, but at 244 I get the same timeout error as before. Presumably, 243 is the total number of entries that match the filtering criteria. If I knew the total number beforehand, I could work with this, but the count_documents query is also failing in the same way as the queries above. For example, the following query also results in a timeout error:

doc_query = {
    'query_type': 'count_documents',
    'query' : {
        'catalog': 'ZTF_exposures_20210401',
        'filter': {
            'field': int(562),
            'ccd': int(4),
            'quad': int(3),
            'filter': int(1),
        },
    }
}
response = gloria.query(query=doc_query)
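
Until the count query works, the largest usable limit (243 in the case above) could be located by bisection rather than by hand. The following is a minimal sketch under stated assumptions: `largest_working_limit` and `fake_query` are hypothetical names, `fake_query` only simulates the reported behavior, and success is assumed to be monotonic in the limit. In real use, `run_query` would wrap a call to `gloria.query` with `'kwargs': {'limit': limit}`, and each failed probe would cost one read timeout.

```python
from typing import Callable


def largest_working_limit(run_query: Callable[[int], object], upper: int = 100_000) -> int:
    """Return the largest limit in [1, upper] for which run_query(limit) succeeds.

    Assumes every limit up to some threshold works and every limit above it
    raises an exception (for example a read timeout). Returns 0 if no probed
    limit succeeded.
    """
    lo, hi = 0, upper  # lo is either 0 (nothing confirmed yet) or a limit known to work
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        try:
            run_query(mid)
            lo = mid  # mid works, so the answer is at least mid
        except Exception:
            hi = mid - 1  # mid fails, so the answer is below mid
    return lo


def fake_query(limit: int) -> list:
    """Hypothetical stand-in for gloria.query(...): fails above 243, as in the report."""
    if limit > 243:
        raise TimeoutError("Read timed out.")
    return [{}] * limit


print(largest_working_limit(fake_query))
```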

The expected behavior for all of these queries would be to simply return every entry that matches the filtering criteria when no limit is set, or when the limit value is greater than the number of matching entries. This is how queries on other tables behave, such as the following query into the ZTF_sources_20210401 table:

source_query = {
    'query_type': 'find',
    'query' : {
        'catalog': 'ZTF_sources_20210401',
        'filter': {
            'field': int(562),
            'ccd': int(4),
            'quad': int(3),
            'filter': int(1)
        },
        'projection': {
            '_id': 0
        }
    }
}
response = gloria.query(query=source_query)

which successfully returns 27,400 entries.

I am running this code using penquins 2.2.0 and Python 3.8.15.

Queries are synchronous

The query method is synchronous, meaning that if it is used in, e.g., a web handler, it blocks the handler until the response arrives. Would it be possible to use httpx or similar instead of requests so that .query(...) can be awaited?
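
As an interim workaround on the caller's side, a blocking call can be pushed onto a thread pool so the event loop stays responsive. This is only a sketch, not the httpx-based change requested above: `blocking_query` is a hypothetical stand-in for a synchronous client call such as `.query(...)`, and `query_async` is an illustrative wrapper name.

```python
import asyncio
import functools
import time


def blocking_query(query: dict) -> dict:
    """Hypothetical stand-in for a synchronous client call."""
    time.sleep(0.1)  # simulate network latency
    return {"status": "success", "echo": query}


async def query_async(query: dict) -> dict:
    """Run the blocking call in the default thread pool so the event loop is not blocked."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, functools.partial(blocking_query, query))


async def main() -> list:
    # The two queries run in separate worker threads and overlap in time.
    return await asyncio.gather(
        query_async({"query_type": "find"}),
        query_async({"query_type": "count_documents"}),
    )


results = asyncio.run(main())
print([r["echo"]["query_type"] for r in results])
```

This keeps requests underneath, so each in-flight query still occupies a worker thread; a native async client would avoid that cost.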

Retry deprecation warning

I get a deprecation warning from Retry:

DeprecationWarning: Using 'method_whitelist' with Retry is deprecated and will be removed in v2.0. Use 'allowed_methods' instead

Should be super easy to fix :-)
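
One possible shape for the fix, assuming older urllib3 releases still need to be supported: `allowed_methods` was introduced in urllib3 1.26, so the keyword can be chosen from the installed version. The helper name `retry_methods_kwarg` and the default method list are hypothetical, not taken from the penquins source.

```python
def retry_methods_kwarg(urllib3_version: str, methods=("GET", "POST")) -> dict:
    """Pick the Retry keyword name that matches the given urllib3 version.

    urllib3 1.26 added 'allowed_methods' and deprecated 'method_whitelist';
    urllib3 2.0 removed 'method_whitelist'.
    """
    parts = urllib3_version.split(".")
    major, minor = int(parts[0]), int(parts[1])
    key = "allowed_methods" if (major, minor) >= (1, 26) else "method_whitelist"
    return {key: frozenset(methods)}


print(retry_methods_kwarg("1.26.15"))
print(retry_methods_kwarg("1.25.11"))
```

The result could then be unpacked into the constructor, e.g. Retry(total=3, **retry_methods_kwarg(urllib3.__version__)). If urllib3 older than 1.26 does not need to be supported, simply renaming the argument to allowed_methods is enough.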
