dmitryduev / penquins
License: MIT License
A python client for Kowalski
When performing a find query on any ZTF_exposures table in kowalski, gloria, or melman, errors are being returned whenever any filtering criteria are applied. As an example, if I want to query the ZTF_exposures_20210401 table on gloria for all exposures that have been acquired in the g-band in field 562, ccd 4, quad 3, I compose and execute the following query:
from penquins import Kowalski

exp_query = {
    'query_type': 'find',
    'query': {
        'catalog': 'ZTF_exposures_20210401',
        'filter': {
            'field': int(562),
            'ccd': int(4),
            'quad': int(3),
            'filter': int(1)
        },
        'projection': {
            '_id': 0
        }
    }
}

gloria = Kowalski(
    token=my_token,
    protocol='https',
    host='gloria.caltech.edu',
    port=443
)

response = gloria.query(query=exp_query)
The resulting error message is quite long, but basically looks like a timeout error:
ReadTimeoutError: HTTPSConnectionPool(host='gloria.caltech.edu', port=443): Read timed out. (read timeout=5)
If I change the query to include a limit on the number of returned entries that is smaller than the total number of entries that match the filtering criteria, then the query will work, such as:
exp_query = {
    'query_type': 'find',
    'query': {
        'catalog': 'ZTF_exposures_20210401',
        'filter': {
            'field': int(562),
            'ccd': int(4),
            'quad': int(3),
            'filter': int(1)
        },
        'projection': {
            '_id': 0
        }
    },
    'kwargs': {
        'limit': 10
    }
}
For this particular field+ccd+quad+filter, I can increase the limit up to 243 without any errors, but at 244 I get the same timeout error as before. Presumably, 243 is the total number of entries that match the filtering criteria. If I knew the total number beforehand, I could work with this, but the count_documents query is also failing in the same way as the queries above. For example, the following query also results in a timeout error:
doc_query = {
    'query_type': 'count_documents',
    'query': {
        'catalog': 'ZTF_exposures_20210401',
        'filter': {
            'field': int(562),
            'ccd': int(4),
            'quad': int(3),
            'filter': int(1),
        },
    }
}

response = gloria.query(query=doc_query)
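Until the underlying problem is fixed, the manual probing of the limit described above could be automated. The following is only a sketch of one way to do that, not part of penquins: largest_working_limit and its succeeds argument are hypothetical names, and the approach assumes that a query succeeds for every limit up to some threshold and fails for every larger one, as observed for 243 versus 244. In real use, succeeds would wrap a gloria.query call with the given limit in a try/except and return False on a timeout; here a stand-in function is used so the example runs on its own.

```python
from typing import Callable


def largest_working_limit(succeeds: Callable[[int], bool], upper: int = 1_000_000) -> int:
    """Return the largest limit in [1, upper] for which succeeds(limit) is True.

    Assumes succeeds is monotonic: True up to some threshold, False beyond it.
    Returns 0 if even limit=1 fails.
    """
    if not succeeds(1):
        return 0

    # Phase 1: double the limit until the first failure (or until upper is passed).
    lo, hi = 1, 2
    while hi <= upper and succeeds(hi):
        lo, hi = hi, hi * 2

    # Anything above `upper` is treated as failing.
    hi = min(hi, upper + 1)

    # Phase 2: binary search. Invariant: lo succeeds, hi fails (or is out of range).
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if succeeds(mid):
            lo = mid
        else:
            hi = mid
    return lo


# Stand-in for the behaviour reported above: limits up to 243 work, 244 and up time out.
def fake_succeeds(limit: int) -> bool:
    return limit <= 243


print(largest_working_limit(fake_succeeds))  # prints 243
```

Each failed probe costs one full read timeout, so this is a diagnostic aid rather than something to run on every query.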
The expected behavior for all of these queries would be to simply return all entries that match the filtering criteria when no limit is set, or when the limit value is greater than the number of matching entries. This is how queries on other tables behave, such as the following query into the ZTF_sources_20210401 table:
source_query = {
    'query_type': 'find',
    'query': {
        'catalog': 'ZTF_sources_20210401',
        'filter': {
            'field': int(562),
            'ccd': int(4),
            'quad': int(3),
            'filter': int(1)
        },
        'projection': {
            '_id': 0
        }
    }
}

response = gloria.query(query=source_query)
which successfully returns 27,400 entries.
I am running this code using penquins 2.2.0 and python 3.8.15.
The query method is synchronous, meaning that if it is used in, e.g., a web handler, it blocks. Would it be possible to use httpx or similar instead of requests so that .query(...) can be awaited?
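As a stopgap that does not require changing the HTTP library, a blocking call can be pushed onto a worker thread so the event loop stays free. The sketch below shows that workaround, not the httpx-based change requested above. BlockingClient is a stand-in written for this example and is not the penquins API; query_async is a hypothetical helper name. It uses loop.run_in_executor, which is available on Python 3.8.

```python
import asyncio
import functools
import time


class BlockingClient:
    """Stand-in for a synchronous client; not the penquins API."""

    def query(self, query):
        time.sleep(0.1)  # simulate a blocking HTTP round trip
        return {"status": "success", "echo": query}


async def query_async(client, query):
    """Run the blocking client.query call in the default thread pool."""
    loop = asyncio.get_running_loop()
    call = functools.partial(client.query, query=query)
    return await loop.run_in_executor(None, call)


async def main():
    client = BlockingClient()
    queries = [{"query_type": "find", "n": i} for i in range(3)]
    # The three blocking calls overlap instead of running back to back,
    # and results come back in the same order as the queries.
    return await asyncio.gather(*(query_async(client, q) for q in queries))


if __name__ == "__main__":
    results = asyncio.run(main())
    print(len(results))
```

This keeps a web handler responsive, but each in-flight query still occupies a thread, so a natively asynchronous client would scale better.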
In an effort to keep the code maintained, up to date, and extensible, we would like to move it from your GitHub account @dmitryduev to SkyPortal, where Kowalski is as well.
I get a deprecation warning from Retry:
DeprecationWarning: Using 'method_whitelist' with Retry is deprecated and will be removed in v2.0. Use 'allowed_methods' instead
Should be super easy to fix :-)
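The warning itself names the fix: pass allowed_methods to Retry instead of method_whitelist. The allowed_methods keyword was added in urllib3 1.26, so if older urllib3 releases still need to be supported, the keyword has to be chosen at runtime. The sketch below shows one way to do that. retry_methods_kwarg is a hypothetical helper written for this example, not something taken from penquins or urllib3.

```python
import re


def retry_methods_kwarg(urllib3_version: str) -> str:
    """Return the Retry keyword that lists retryable HTTP methods.

    urllib3 1.26 introduced 'allowed_methods' and deprecated
    'method_whitelist'; the old name is removed in 2.0.
    """
    # Take the first two numeric components, padding with zeros if missing.
    parts = [int(p) for p in re.findall(r"\d+", urllib3_version)[:2]] + [0, 0]
    major, minor = parts[:2]
    return "allowed_methods" if (major, minor) >= (1, 26) else "method_whitelist"


print(retry_methods_kwarg("1.25.11"))  # prints method_whitelist
print(retry_methods_kwarg("1.26.0"))   # prints allowed_methods
print(retry_methods_kwarg("2.0.0a1"))  # prints allowed_methods
```

The returned name can then be used as a keyword when constructing the retry policy, for example `Retry(**{retry_methods_kwarg(urllib3.__version__): ["GET", "POST"]})`, where the method list is purely illustrative. If only urllib3 1.26 or newer is supported, renaming the keyword directly is simpler.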