Comments (8)
- Could you please try your use case in the Python driver to make sure that it works there?
- How many Goroutines are you running concurrently?
- Have you tried using presigned URLs for your Parquet files to rule out possible bad credentials? See: duckdb/duckdb#5310
As we're relaying the query to DuckDB and letting DuckDB execute it, I think the only way to debug this is to include the HTTP response body of any failed request in the error message. I'm not sure whether the DuckDB developers already have something planned.
from go-duckdb.
Hi @marcboeker
I am running 5-10 Goroutines concurrently. As I did not have time to try it in Python, I resorted to copying the Postgres data locally into DuckDB's native database file format. I will try it later when I am free again. You can close this ticket if you want.
On a different note, I am setting up DuckDB in a Go service on AWS ECS (a container service) to power the backend of an analytics page. I am getting a very slow response from the container (10-20 s), even though the query itself is fast (400 ms) when we run it independently. I increased the container capacity as well, but no luck.
Do you know of similar projects where DuckDB is deployed with a containerized Go service? If you can share any references (code, documentation, GitHub repos) related to this use case, it would help me a lot.
Sorry for asking this question here. I had no other place to ask go-duckdb-related questions.
> I am getting a very slow response from the container (10-20 s), even though the query itself is fast (400 ms) when we run it independently. I increased the container capacity as well, but no luck.
Are your DuckDB queries accessing third-party services like S3? Is the response still slow if you do a SELECT 1? Please try different queries to pinpoint the problem to either the DuckDB Go driver or the query itself.
Nope, just querying the local .duckdb file without any external dependencies. SELECT 1 took around 400 ms, which is higher than expected. From the profiler, it can be seen that the CGO call takes a long time inside the container.
I created this bare-bones API with a single query, and the results are similar.
https://github.com/nirdosh17/go-service-with-duckdb
I've tried to reproduce your problem using your repo on different platforms:
Docker container on macOS host with GOARCH=amd64 and libs provided by DuckDB
[GIN] 2023/02/22 - 12:23:21 | 200 | 373.418126ms | 172.17.0.1 | GET "/stats"
[GIN] 2023/02/22 - 12:23:25 | 200 | 251.109292ms | 172.17.0.1 | GET "/stats"
Docker container on macOS host with GOARCH=amd64 and our libs
[GIN] 2023/02/22 - 12:50:01 | 200 | 253.020833ms | 172.17.0.1 | GET "/stats"
[GIN] 2023/02/22 - 12:50:03 | 200 | 152.675833ms | 172.17.0.1 | GET "/stats"
[GIN] 2023/02/22 - 12:50:04 | 200 | 162.026166ms | 172.17.0.1 | GET "/stats"
[GIN] 2023/02/22 - 12:50:05 | 200 | 126.905958ms | 172.17.0.1 | GET "/stats"
[GIN] 2023/02/22 - 12:50:05 | 200 | 158.63575ms | 172.17.0.1 | GET "/stats"
[GIN] 2023/02/22 - 12:50:06 | 200 | 154.09875ms | 172.17.0.1 | GET "/stats"
Native on macOS with GOARCH=arm64 and our libs
[GIN] 2023/02/22 - 13:52:11 | 200 | 16.091792ms | 127.0.0.1 | GET "/stats"
[GIN] 2023/02/22 - 13:52:15 | 200 | 14.091167ms | 127.0.0.1 | GET "/stats"
[GIN] 2023/02/22 - 13:52:16 | 200 | 14.013292ms | 127.0.0.1 | GET "/stats"
[GIN] 2023/02/22 - 13:52:16 | 200 | 12.201916ms | 127.0.0.1 | GET "/stats"
As you can see, it runs fastest natively on macOS with GOARCH=arm64. The main cause of the slowdown is not Docker itself but the emulation of the container's amd64 binary on the arm64 host, which is done by Rosetta.
What is your host platform? Maybe macOS with arm64? You could try running the container in EC2 on a native amd64 platform to better compare the results.
Thanks @marcboeker. That seems to be the main reason for the performance degradation on my local machine. However, in ECS, I am running the container on linux/amd64. At the moment, I am writing the DuckDB file inside the container after copying the data from Postgres. Maybe there is some I/O limitation while reading the file, so I am trying to use another storage (EFS). Let's see how it goes.
@nirdosh17 I'm going to close this issue. Feel free to reopen it, if there is a reproducible bug in go-duckdb.