percona-lab / clickhousedb_fdw Goto Github PK
View Code? Open in Web Editor NEWPostgreSQL's Foreign Data Wrapper For ClickHouse
License: Other
PostgreSQL's Foreign Data Wrapper For ClickHouse
License: Other
In some cases after grouping by certain column(s) ClickHouse still might return a lot of rows, so performing LIMIT on clickhouse side will be beneficial.
Limit (cost=0.00..0.00 rows=1 width=32)
Output: city
-> Foreign Scan on public.t1 (cost=0.00..0.00 rows=0 width=32)
Output: city
Remote SQL: SELECT city FROM "default".t1
(5 rows)
PostgreSQL 11
OS: Centos 7.6
postgres# explain (analyze,verbose) select count(year) from ontime;
QUERY PLAN
Aggregate (cost=0.00..0.01 rows=1 width=8) (actual time=31441.737..31441.737 rows=1 loops=1)
Output: count(year)
-> Foreign Scan on public.ontime (cost=0.00..0.00 rows=0 width=4) (actual time=13.313..30108.858 rows=17189046 loops=1)
Output: year, quarter, month, dayofmonth, dayofweek, flightdate, uniquecarrier, airlineid, carrier, tailnum, flightnum,
originairportid, originairportseqid, origincitymarketid, origin, origincityname, originstate, originstatefips, originstatename, o
riginwac, destairportid, destairportseqid, destcitymarketid, dest, destcityname, deststate, deststatefips, deststatename, destwac
, crsdeptime, deptime, depdelay, depdelayminutes, depdel15, departuredelaygroups, deptimeblk, taxiout, wheelsoff, wheelson, taxii
n, crsarrtime, arrtime, arrdelay, arrdelayminutes, arrdel15, arrivaldelaygroups, arrtimeblk, cancelled, cancellationcode, diverte
d, crselapsedtime, actualelapsedtime, airtime, flights, distance, distancegroup, carrierdelay, weatherdelay, nasdelay, securityde
lay, lateaircraftdelay, firstdeptime, totaladdgtime, longestaddgtime, divairportlandings, divreacheddest, divactualelapsedtime, d
ivarrdelay, divdistance, div1airport, div1airportid, div1airportseqid, div1wheelson, div1totalgtime, div1longestgtime, div1wheels
off, div1tailnum, div2airport, div2airportid, div2airportseqid, div2wheelson, div2totalgtime, div2longestgtime, div2wheelsoff, di
v2tailnum, div3airport, div3airportid, div3airportseqid, div3wheelson, div3totalgtime, div3longestgtime, div3wheelsoff, div3tailn
um, div4airport, div4airportid, div4airportseqid, div4wheelson, div4totalgtime, div4longestgtime, div4wheelsoff, div4tailnum, div
5airport, div5airportid, div5airportseqid, div5wheelson, div5totalgtime, div5longestgtime, div5wheelsoff, div5tailnum
Remote SQL: SELECT year FROM “default”.ontime
Planning Time: 0.111 ms
Execution Time: 31441.813 ms
(7 rows)
Time: 31442.579 ms (00:31.443)
Hello!
This happens at installation. Ubuntu 20. Please advise. Thanks.
The README.md reads as follows:
The clickhousedb_fdw is open-source (GPLv2 licensed).
But the project license is Apache 2.0, which seems correct since it matches ClickHouse and is compatible with PostgreSQL code as well.
If not, do you have any suggestions on adjusting both Citus as well as clickhousedb_fdw such that clickhousedb_fdw could work within Citus just like cstore_fdw? After all, the olap capability of cstore_fdw is far from good enough compared with clickhouse~
Some times, clickhousedb_fdw is unable to connect to ClickhousDB.
Currently, we are using clickhousedb_fdw in PG 11 on our production server.
We are planning to upgrade PG version to 12 or 13. But we found that clickhousedb_fdw does not support PG 12 and PG 13 version because of that we won't be able to upgrade PG version on server.
Do you plan to support Postgresql version 12 and 13? If yes, then any deadline for it.
Please let me know.
Thanks,
Nishit Shah
I've noticed that some queries shows different results in postgres and clickhouse when, if I'm not wrong, they should be the same.
A simple example:
SELECT COUNT(DISTINCT relation_id) FROM clickhouse_table WHERE relation_id IN (SELECT relation_id FROM clickhouse_table WHERE date_from > '2019-02-01');
count
-------
116
(1 row)
SELECT COUNT(DISTINCT relation_id) FROM table WHERE relation_id IN (SELECT relation_id FROM table WHERE date_from > '2019-02-01');
┌─uniqExact(report_id)─┐
│ 677 │
└──────────────────────┘
Same inconsistency if I remove the WHERE clause.
Do you have any idea what could be happening?
Thanks a lot!
I have run the test in sql/clickhousedb_fdw.sql
It appears that the aggregate doesn't push down as expected
Aggregate
Output: count(*)
-> Foreign Scan on public.ft1
Remote SQL: SELECT NULL FROM regression.t1
(4 rows)
Initial commit for ClickHouse Foreign Data Wrapper Database.
when it can be supported in window system?
Hello,
are there any plans to add support for Postgresql 14 ?
I'm currently checking out the fdw and struggling with seemingly random but frequent postgres crashes when querying a clickhouse table via the FDW. Can't figure out yet what library is causing this, and postgres log is not helpful: server process (PID 2809) was terminated by signal 6: Aborted
Any idea where I can start looking for error/debug messages?
Not too sure what the issue is, but if i do select * from table
it produces rows just fine, however the moment I put an order by col
on the end it raises an error ERROR: 24000:Invalid cursor state
Some JOIN syntax does not work like non-equality JOIN condition not supported in clickhousedb.
2019.06.26 17:06:18.942407 [ 49 ] {309308e1-6710-40f5-ad68-491d1f8a3fd5} virtual DB::MergingAndConvertingBlockInputStream::~MergingAndConvertingBlockInputStream(): Waiting for threads to finish
2019.06.26 17:06:18.944886 [ 49 ] {309308e1-6710-40f5-ad68-491d1f8a3fd5} MemoryTracker: Peak memory usage (total): 109.37 MiB.
2019.06.26 17:06:18.945094 [ 49 ] {309308e1-6710-40f5-ad68-491d1f8a3fd5} MemoryTracker: Peak memory usage (for query): 103.14 MiB.
2019.06.26 17:06:18.945185 [ 49 ] {309308e1-6710-40f5-ad68-491d1f8a3fd5} HTTPHandler: Code: 396, e.displayText() = DB::Exception: Limit for result exceeded, max bytes: 13.35 MiB, current bytes: 13.41 MiB, Stack trace:
Some PostgreSQL operator does not work in ClickhouseDB, therefore there is a need for operator conversion.
I can access Array
types by calling them varchar
and then parsing them with ('{' || left(right(field, -1), -1) || '}')::int[]
, perhaps in a local view. But is there some way for native support for Array type to be added to this driver?
I've gone through and resolved all the dependencies that a fresh install is missing to be able to make and install this project.
The only .so files i can find are: libclickhouse-1.0.so and clickhousedb_fdw.so while the documentation references a file named libclickhouseodbc.so when defining the server using the foreign data wrapper.
Using either file in it's place returns the following error when i try to run a query through it:
ERROR: could not connect to server "clickhouse_svr"
DETAIL: IM004:[unixODBC][Driver Manager]Driver's SQLAllocHandle on SQL_HANDLE_HENV failed
The make install has this output on CentOS 7 at the moment:
m -rf libclickhouse-1.0.so lib/*.o
make -f lib/Makefile
make[1]: Entering directory /root/clickhousedb_fdw' g++ -O0 -g3 -Wno-unused-variable -fPIC -Wall -I. -Ilib -c -o lib/clickhouse-client.o lib/clickhouse-client.cpp g++ lib/clickhouse-client.o -o libclickhouse-1.0.so -fPIC -shared -ldl -lstdc++ -L -lodbc make[1]: Circular test <- test dependency dropped. g++ -O0 -g3 -Wno-unused-variable -Iinclude lib/example/odbc_test.o -o test -ldl -L. -lclickhouse-1.0 -lodbc -lodbcinst make[1]: Leaving directory
/root/clickhousedb_fdw'
make -f lib/Makefile
make[1]: Entering directory /root/clickhousedb_fdw' make[1]: Circular test <- test dependency dropped. make[1]: Nothing to be done for
all'.
make[1]: Leaving directory `/root/clickhousedb_fdw'
/usr/bin/install -c -m 755 libclickhouse-1.0.so /usr/pgsql-11/lib/libclickhouse-1.0.so
/usr/bin/mkdir -p '/usr/pgsql-11/lib'
/usr/bin/mkdir -p '/usr/pgsql-11/share/extension'
/usr/bin/mkdir -p '/usr/pgsql-11/share/extension'
/usr/bin/install -c -m 755 clickhousedb_fdw.so '/usr/pgsql-11/lib/clickhousedb_fdw.so'
/usr/bin/install -c -m 644 .//clickhousedb_fdw.control '/usr/pgsql-11/share/extension/'
/usr/bin/install -c -m 644 .//clickhousedb_fdw--1.0.sql '/usr/pgsql-11/share/extension/'
/usr/bin/mkdir -p '/usr/pgsql-11/lib/bitcode/clickhousedb_fdw'
/usr/bin/mkdir -p '/usr/pgsql-11/lib/bitcode'/clickhousedb_fdw/
/usr/bin/install -c -m 644 clickhousedb_fdw.bc '/usr/pgsql-11/lib/bitcode'/clickhousedb_fdw/./
/usr/bin/install -c -m 644 clickhousedb_option.bc '/usr/pgsql-11/lib/bitcode'/clickhousedb_fdw/./
/usr/bin/install -c -m 644 clickhousedb_deparse.bc '/usr/pgsql-11/lib/bitcode'/clickhousedb_fdw/./
/usr/bin/install -c -m 644 clickhousedb_connection.bc '/usr/pgsql-11/lib/bitcode'/clickhousedb_fdw/./
/usr/bin/install -c -m 644 clickhousedb_shipable.bc '/usr/pgsql-11/lib/bitcode'/clickhousedb_fdw/./
cd '/usr/pgsql-11/lib/bitcode' && /usr/lib64/llvm5.0/bin/llvm-lto -thinlto -thinlto-action=thinlink -o clickhousedb_fdw.index.bc clickhousedb_fdw/clickhousedb_fdw.bc clickhousedb_fdw/clickhousedb_option.bc clickhousedb_fdw/clickhousedb_deparse.bc clickhousedb_fdw/clickhousedb_connection.bc clickhousedb_fdw/clickhousedb_shipable.bc
Are there any repos which currently hold prebuilt deb or rpm packages that the readme references?
The following two queries yield different results while I understand they should be equivalent. The second one yields the correct result.
SELECT COUNT(1) FROM (SELECT DISTINCT report_id FROM clickhouse_reportusages) t;
count
-------
2000
SELECT COUNT(DISTINCT report_id) FROM clickhouse_reportusages;
count
-------
2956
Their respective plans are:
EXPLAIN VERBOSE SELECT COUNT(1) FROM (SELECT DISTINCT report_id FROM clickhouse_reportusages) t;
QUERY PLAN
--------------------------------------------------------------------------------------------------
Aggregate (cost=1.51..1.52 rows=1 width=8)
Output: count(1)
-> Unique (cost=1.00..1.50 rows=1 width=4)
Output: clickhouse_reportusages.report_id
-> Foreign Scan on public.clickhouse_reportusages (cost=1.00..-1.00 rows=1000 width=4)
Output: clickhouse_reportusages.report_id
Remote SQL: SELECT report_id FROM "default".reportusages ORDER BY report_id ASC
(7 rows)
EXPLAIN VERBOSE SELECT COUNT(DISTINCT report_id) FROM clickhouse_reportusages;
QUERY PLAN
----------------------------------------------------------------------------
Foreign Scan (cost=1.00..-1.00 rows=1000 width=8)
Output: (count(DISTINCT report_id))
Relations: Aggregate on (clickhouse_reportusages)
Remote SQL: SELECT count(DISTINCT report_id) FROM "default".reportusages
(4 rows)
Can you make a release version? The release version can help us to get bug.
hello. i have two questions.
is it possible to see datatype array(string) via clickhouse_fdw?
for example
table in clickhouse:
create table events (id int8, field1 Array(String));
Create foreign table in postgre:
CREATE foreign TABLE public.events (
id int8, field1 text []
) SERVER clickhouse_svr OPTIONS (table_name 'events');
query:
select id, field1 FROM public.events
error: cannot convert clickhouse value to postgres value
where i can see all possible options in this structure?
create foreign table table1 ()SERVER clickhouse_svr OPTIONS ( all possible options ?)
Hi, I'm having trouble installing this.
I'm on ubuntu 18.04 PostgreSQL 11.
After building odbc and clickhousedb_fdw, in psql when running:
create extension clickhousedb_fdw;
I get
ERROR: could not load library "/usr/lib/postgresql/11/lib/clickhousedb_fdw.so": /usr/lib/postgresql/11/lib/clickhousedb_fdw.so: undefined symbol: odbc_fetch
Do you plan to support Postgresql version 12 ?.
clickhousedb fdw does not support the options: user and password.
postgres=# CREATE SERVER clickhouse_svr FOREIGN DATA WRAPPER clickhousedb_fdw OPTIONS(dbname 'test_database', driver '/usr/local/lib/odbc/libclickhouseodbc.so', host '127.0.0.1', user 'default', password 'clickhouse');
postgres=# error: invalid options - user
postgres=# tips: options: host, hostaddr, port, dbname, driver
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.