pavanvd / postgres-xl Goto Github PK

Public Postgres-XL repository

License: Mozilla Public License 2.0

Emacs Lisp 0.01% Makefile 0.83% Shell 0.44% C++ 2.44% SQLPL 0.58% C 87.76% PLpgSQL 4.72% PLSQL 0.26% Yacc 1.44% Lex 0.49% Perl 0.88% Assembly 0.01% Groff 0.08% DTrace 0.01% XS 0.02% Python 0.01% Batchfile 0.02% Objective-C 0.01%

postgres-xl's People

Contributors

Stargazers

Watchers

Forkers

carinekanouo jim-mlodgenski

postgres-xl's Issues

DISCARD SEQUENCES does not discard session state for the sequence

postgres=# CREATE SEQUENCE s1;
CREATE SEQUENCE
postgres=# SELECT nextval('s1'::regclass);
 nextval 
---------
       1
(1 row)
postgres=# SELECT currval('s1'::regclass);
 currval 
---------
       2
(1 row)

postgres=# DISCARD SEQUENCES ;
DISCARD SEQUENCES
postgres=# SELECT currval('s1'::regclass);
 currval 
---------
       2
(1 row)

The last statement should have given error such as
ERROR: currval of sequence "s1" is not yet defined in this session

"Failed to synchronize data node" error

This is from regression tests:

regression=# CREATE TABLE tt_22 (a int, b int) distribute by replication;
CREATE TABLE
regression=# 
regression=# INSERT INTO tt_22 VALUES (10);
INSERT 0 1
regression=# 
regression=# BEGIN;
BEGIN
regression=# SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SET
regression=# DECLARE c1 NO SCROLL CURSOR FOR SELECT * FROM tt_22 ORDER BY a FOR UPDATE;
DECLARE CURSOR
regression=# INSERT INTO tt_22 VALUES (2);
INSERT 0 1
regression=# FETCH ALL FROM c1; -- should not show the row (2)
 a  | b 
----+---
 10 |  
(1 row)

regression=# END;
ERROR:  Failed to synchronize data node
regression=#

REFRESH MATERIALISED VIEW gives error

Issue by pavanvd
Monday Jun 29, 2015 at 06:22 GMT
Originally opened as https://github.com/pavanvd/Postgres-XL-Private/issues/6

Regression test shows errors of these kinds:

***************
*** 434,439 ****
--- 437,444 ----
  INSERT INTO foo VALUES(3, 4, 5);
  REFRESH MATERIALIZED VIEW mv;
  REFRESH MATERIALIZED VIEW CONCURRENTLY mv;
+ ERROR:  schema "pg_temp_2" does not exist
+ CONTEXT:  SQL statement "ANALYZE pg_temp_2.pg_temp_21987_2"
  DROP TABLE foo CASCADE;
  NOTICE:  drop cascades to materialized view mv
  -- allow subquery to reference unpopulated matview if WITH NO DATA is specified

REFRESH MATERIALISED VIEW internally creates a temp table to hold temporary data and also analyses the same before running queries against it. Looks like something is broken in that logic. (We'd tried to fix some issues around REFRESH by implementing LOCAL TEMP tables etc, but looks like something was missed out)

EVENT TRIGGER not yet supported

Issue by pavanvd
Monday Jun 29, 2015 at 12:59 GMT
Originally opened as https://github.com/pavanvd/Postgres-XL-Private/issues/12

Could not find commutator for operator error

Issue by pavanvd
Monday Jun 29, 2015 at 11:37 GMT
Originally opened as https://github.com/pavanvd/Postgres-XL-Private/issues/8

This case from "equivclass" fails in regression

regression=# -- let's try that as a mergejoin
regression=# set enable_mergejoin = on;
SET
regression=# set enable_nestloop = off;
SET
regression=# 
regression=# explain (costs off)
regression-#   select * from ec1,
regression-#     (select ff + 1 as x from
regression(#        (select ff + 2 as ff from ec1
regression(#         union all
regression(#         select ff + 3 as ff from ec1) ss0
regression(#      union all
regression(#      select ff + 4 as x from ec1) as ss1
regression-#   where ss1.x = ec1.f1 and ec1.ff = 42::int8;
ERROR:  could not find commutator for operator 66437

DISCARD SEQUENCES fails to discard sequence value

Issue by pavanvd
Monday Jun 29, 2015 at 12:58 GMT
Originally opened as https://github.com/pavanvd/Postgres-XL-Private/issues/11

*** a/src/test/regress/expected/sequence.out    Tue Jun  2 17:01:55 2015
--- b/src/test/regress/results/sequence.out Mon Jun 29 13:01:14 2015
***************
*** 167,173 ****

  DISCARD SEQUENCES;
  SELECT currval('sequence_test'::regclass);
! ERROR:  currval of sequence "sequence_test" is not yet defined in this session
  DROP SEQUENCE sequence_test;
  -- renaming sequences
  CREATE SEQUENCE foo_seq;
--- 167,177 ----

  DISCARD SEQUENCES;
  SELECT currval('sequence_test'::regclass);
!  currval 
! ---------
!       99
! (1 row)
! 
  DROP SEQUENCE sequence_test;
  -- renaming sequences
  CREATE SEQUENCE foo_seq;

Recursive queries do not work

Even simple recursive queries which do not involve any real tables are not working.. We had seen such reports in the past and probably had a patch to fix them as well. Needs investigation

 WITH RECURSIVE t(n) AS (
      SELECT (VALUES(1))
  UNION ALL
      SELECT n+1 FROM t WHERE n < 5
  )
  SELECT * FROM t ORDER BY n;

Transaction timestamp is not set until a new transaction ID is fetched from GTM

Issue by pavanvd
Friday Jun 26, 2015 at 12:22 GMT
Originally opened as https://github.com/pavanvd/Postgres-XL-Private/issues/3

The following test fails today:

*** a/src/test/regress/expected/timestamp.out   
--- b/src/test/regress/results/timestamp.out   
***************
*** 46,52 ****
  SELECT count(*) AS One FROM TIMESTAMP_TBL WHERE d1 = timestamp(2) without time zone 'now';
   one
  -----
!    1
  (1 row)

  COMMIT;
--- 46,52 ----
  SELECT count(*) AS One FROM TIMESTAMP_TBL WHERE d1 = timestamp(2) without time zone 'now';
   one
  -----
!    0
  (1 row)

  COMMIT;

Initial investigations show that the transaction timestamp is not set correctly until a new transaction identifier is obtained via AssignTransactionId(). This function fetches a timestamp from the GTM and assigns it to GTMxactStartTimestamp which is then returned by now() for the lifetime of the transaction. But before that now() will continue to return stale value of GTMxactStartTimestamp (used for previous transaction)

The bug can be demonstrated with a simple test:

regression=# BEGIN;
BEGIN
regression=# SELECT now();
                 now                 
-------------------------------------
 Fri Jun 26 05:05:50.080561 2015 PDT
(1 row)

regression=# SELECT now();
                 now                 
-------------------------------------
 Fri Jun 26 05:05:50.080561 2015 PDT
(1 row)

regression=# CREATE TABLE test (a timestamp);
CREATE TABLE
regression=# SELECT now();
                 now                 
-------------------------------------
 Fri Jun 26 05:20:28.859395 2015 PDT
(1 row)

regression=# SELECT now();
                 now                 
-------------------------------------
 Fri Jun 26 05:20:28.859395 2015 PDT
(1 row)

regression=# abort;
ROLLBACK

DDL inside the transaction block fetches an XID from the GTM and the transaction timestamp changes at that point (and remains constant thereafter)

"cache lookup failed" error

Issue by pavanvd
Monday Jun 29, 2015 at 09:43 GMT
Originally opened as https://github.com/pavanvd/Postgres-XL-Private/issues/7

Reproduction case:

regression=# 
regression=# create temp table table_a(a int);
CREATE TABLE
regression=#   create temp view view_a as select * from table_a;
CREATE VIEW
regression=#   select view_a from view_a;
ERROR:  cache lookup failed for type 0

CREATE TABLE ... (LIKE .. ) fails for a temporary table

Issue by pavanvd
Friday Jun 26, 2015 at 10:45 GMT
Originally opened as https://github.com/pavanvd/Postgres-XL-Private/issues/1

Here is an example from the test case xc_temp

-- Check of inheritance between temp and non-temp tables
CREATE TEMP TABLE table_parent (a int);
CREATE TABLE table_child (like table_parent, b int);

ERROR: relation "table_parent" does not exist

Since temporary tables are not created on the remote coordinators, the second CREATE TABLE must be failing to find the base table definition

Fail to create rule on temporary tables

Issue by pavanvd
Monday Jun 29, 2015 at 12:56 GMT
Originally opened as https://github.com/pavanvd/Postgres-XL-Private/issues/10

Since temp tables are not created on remote coordinators, CREATE RULE on such tables could be failing if the DDL is replicated on a remote coordinator.

***************
*** 1732,1737 ****
--- 1728,1734 ----
  create temp table tt_log(f1 int, data text);
  create rule insert_tt_rule as on insert to tt do also
    insert into tt_log values(new.*);
+ ERROR:  relation "tt_log" does not exist
  select insert_tt2('foollog','barlog') limit 1;

Type corresponding to a view does not exist on datanode

Issue by pavanvd
Monday Jun 29, 2015 at 05:32 GMT
Originally opened as https://github.com/pavanvd/Postgres-XL-Private/issues/5

Here is a representative failure from regression testcase "updatable_view"

  CREATE VIEW rw_view1 AS SELECT b AS bb, a AS aa FROM base_tbl;
  CREATE FUNCTION rw_view1_aa(x rw_view1)
    RETURNS int AS $$ SELECT x.aa $$ LANGUAGE sql;
+ ERROR:  type rw_view1 does not exist

Since views are only created on the coordinators, it seems that this error could be coming from the datanodes as CREATE FUNCTION DDLs are executed on all the nodes.

ERROR: Could not begin transaction on data node.

'rowsecurity' testcase in regression shows many such errors. Needs investigation

NOTICEs from remotes nodes are lost

Looking at regression diffs, it seems that the NOTICEs from the remote nodes are not forwarded to clients. We recently fixed similar issue for HINTs, may be somewhat similar treatment is required for NOTICEs too.

Fail to see DDL's effect inside a function

Issue by pavanvd
Monday Jun 29, 2015 at 11:55 GMT
Originally opened as https://github.com/pavanvd/Postgres-XL-Private/issues/9

We have seen that the issue can be reproduced only when the function is executed for the first time in a new psql session

-- Check DDL via SPI, immediately followed by SPI plan re-use
-- (bug in original coding)

create function cachebug() returns void as $$
declare r int;
begin
  drop table if exists temptable cascade;
  create temp table temptable as select * from generate_series(1,3) as f1;
  create temp view vv as select * from temptable;
  for r in select * from vv order by 1 loop
    raise notice '%', r;
  end loop;
end$$ language plpgsql;

select cachebug();
select cachebug();

Output mismatch for a column of type reltime

The @ character goes missing in the output. The stock postgres also gives output without the @ character unless intervalstyle is set to postgres_verbose. So looks like the session parameter is not passed down to the datanode correctly (or not setup correctly at the start time)

This from regression:

  SELECT '' AS one, * FROM RELTIME_TBL
     WHERE RELTIME_TBL.f1 = reltime '@ 34 years' ORDER BY f1;
!  one |     f1     
! -----+------------
!      | @ 34 years
  (1 row)

  SELECT '' AS two, * FROM RELTIME_TBL
--- 60,68 ----

  SELECT '' AS one, * FROM RELTIME_TBL
     WHERE RELTIME_TBL.f1 = reltime '@ 34 years' ORDER BY f1;
!  one |    f1    
! -----+----------
!      | 34 years
  (1 row)

  SELECT '' AS two, * FROM RELTIME_TBL

PGOPTIONS not passed down to datanodes?

Issue by pavanvd
Friday Jun 26, 2015 at 11:30 GMT
Originally opened as https://github.com/pavanvd/Postgres-XL-Private/issues/2

The following test fails in "make installcheck" even after adding "-c interval_style=postgres_verbose" to PGOPTIONS. Looks like datanodes don't honour that setting and send back results in a different style.

** a/src/test/regress/expected/reltime.out 
--- b/src/test/regress/results/reltime.out
***************
*** 60,68 ****

  SELECT '' AS one, * FROM RELTIME_TBL
     WHERE RELTIME_TBL.f1 = reltime '@ 34 years' ORDER BY f1;
!  one |     f1     
! -----+------------
!      | @ 34 years
  (1 row)

  SELECT '' AS two, * FROM RELTIME_TBL
--- 60,68 ----

  SELECT '' AS one, * FROM RELTIME_TBL
     WHERE RELTIME_TBL.f1 = reltime '@ 34 years' ORDER BY f1;
!  one |    f1    
! -----+----------
!      | 34 years
  (1 row)

  SELECT '' AS two, * FROM RELTIME_TBL

Tableoid to relation name mapping broken

Issue by pavanvd
Monday Jun 29, 2015 at 05:20 GMT
Originally opened as https://github.com/pavanvd/Postgres-XL-Private/issues/4

When the ::regclass is applied at the coordinator, it either fails to map OID to relname or even worse returns a wrong result. The following query from "inherit" test case in regression demonstrates this issue:

regression=# select tableoid::regclass::text as relname, bar.* from bar order by 1,2;
 relname | f1 | f2 
---------+----+----
 35090   |  3 |  3
 35090   |  4 |  4
 35093   |  3 |  3
 35093   |  4 |  4
 35267   |  1 |  1
 35267   |  2 |  2
 35270   |  1 |  1
 35270   |  2 |  2
(8 rows)

regression=# explain verbose select tableoid::regclass::text as relname, bar.* from bar order by 1,2;
                                                   QUERY PLAN                                                    
-----------------------------------------------------------------------------------------------------------------
 Sort  (cost=387.51..392.61 rows=2041 width=12)
   Output: (((bar.tableoid)::regclass)::text), bar.f1, bar.f2
   Sort Key: (((bar.tableoid)::regclass)::text), bar.f1
   ->  Result  (cost=100.00..275.30 rows=2041 width=12)
         Output: ((bar.tableoid)::regclass)::text, bar.f1, bar.f2
         ->  Append  (cost=100.00..265.10 rows=2041 width=12)
               ->  Remote Subquery Scan on all (datanode_1,datanode_2)  (cost=100.00..100.02 rows=1 width=12)
                     Output: bar.tableoid, bar.f1, bar.f2
                     ->  Seq Scan on pg_temp_2.bar  (cost=0.00..0.00 rows=1 width=12)
                           Output: bar.tableoid, bar.f1, bar.f2
               ->  Remote Subquery Scan on all (datanode_1,datanode_2)  (cost=100.00..165.08 rows=2040 width=12)
                     Output: bar2.tableoid, bar2.f1, bar2.f2
                     ->  Seq Scan on pg_temp_2.bar2  (cost=0.00..30.40 rows=2040 width=12)
                           Output: bar2.tableoid, bar2.f1, bar2.f2
(14 rows)

Since OIDs of objects may differ on coordinators and datanodes, trying to convert an OID generated by a datanode at the coordinator gives wrong result.

INSERT into a table gives wrong result

regression=# CREATE SCHEMA test_schema;
CREATE SCHEMA
regression=# SET search_path   TO test_schema ;
SET
regression=# CREATE TABLE test1 (a int, b int);
CREATE TABLE
regression=# CREATE TABLE test2 (a int, b int);
CREATE TABLE
regression=# SELECT oid, relname, relfilenode, relkind, reltoastrelid, relnamespace                                                                       FROM pg_class                                                                                                                                             WHERE relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = 'test_schema');
  oid  | relname | relfilenode | relkind | reltoastrelid | relnamespace 
-------+---------+-------------+---------+---------------+--------------
 27333 | test1   |       27333 | r       |             0 |        27332
 27336 | test2   |       27336 | r       |             0 |        27332
(2 rows)

regression=# EXPLAIN VERBOSE SELECT oid, relname, relfilenode, relkind, reltoastrelid, relnamespace                                                       FROM pg_class                                                                                                                                             WHERE relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = 'test_schema');
                                                           QUERY PLAN                                                            
---------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on pg_catalog.pg_class  (cost=1.18..38.27 rows=108 width=81)
   Output: pg_class.oid, pg_class.relname, pg_class.relfilenode, pg_class.relkind, pg_class.reltoastrelid, pg_class.relnamespace
   Filter: (pg_class.relnamespace = $0)
   InitPlan 1 (returns $0)
     ->  Seq Scan on pg_catalog.pg_namespace  (cost=0.00..1.18 rows=1 width=4)
           Output: pg_namespace.oid
           Filter: (pg_namespace.nspname = 'test_schema'::name)
(7 rows)


regression=# EXECUTE DIRECT ON (datanode_1) 'SELECT oid, relname, relfilenode, relkind, reltoastrelid, relnamespace                                       FROM pg_class                                                                                                                                             WHERE relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = ''test_schema'')';
  oid  |  relname  | relfilenode | relkind | reltoastrelid | relnamespace 
-------+-----------+-------------+---------+---------------+--------------
 35617 | test1     |       35617 | r       |             0 |        35616
 35620 | test2     |       35620 | r       |             0 |        35616
(2 rows)

regression=# EXECUTE DIRECT ON (datanode_2) 'SELECT oid, relname, relfilenode, relkind, reltoastrelid, relnamespace                                       FROM pg_class                                                                                                                                             WHERE relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = ''test_schema'')';
  oid  |  relname  | relfilenode | relkind | reltoastrelid | relnamespace 
-------+-----------+-------------+---------+---------------+--------------
 35438 | test1     |       35438 | r       |             0 |        35437
 35441 | test2     |       35441 | r       |             0 |        35437
(2 rows)

regression=# CREATE TABLE inserttab AS SELECT oid, relname, relfilenode, relkind, reltoastrelid, relnamespace                                             FROM pg_class                                                                                                                                             WHERE relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = 'test_schema');
INSERT 0 1
regression=# SELECT * FROM inserttab ;
  oid  | relname | relfilenode | relkind | reltoastrelid | relnamespace 
-------+---------+-------------+---------+---------------+--------------
 35620 | test2   |       35620 | r       |             0 |        35616
(1 row)

regression=#

The SELECT query involving the pg_catalog tables correctly returns 2 rows when run on the coordinator or directly on either datanodes. But when we try to INSERT the query result into a table (this happens with both CREATE TABLE AS as well as INSERT INTO), only one row inserted. Upon further inspection, we found out that the rows get inserted only on one datanode.

One possible reason could be that the subquery (SELECT oid FROM pg_namespace WHERE nspname = 'test_schema') is probably evaluated on one datanode, but when that result is used on the other datanode, no rows qualify for the test and hence nothing gets inserted. Anyways, need further investigations.

Type corresponding to temp table is not usable

Many tests fails because type corresponding to a temp table is not available on the remote coordinator. While this is expected because we don't create temp tables on the remote coordinator, this needs to be addressed or at least documented clearly.

Unexpected response from datanode

This from regression tests:

regression=# create table my_tab1 (a int);
CREATE TABLE
regression=#   insert into my_tab1 values(1);
INSERT 0 1
regression=#   create function f1 () returns setof my_tab1 as $$ create table my_tab2 (a int); select * from my_tab1; $$ language sql;
CREATE FUNCTION
regression=#   select f1();
ERROR:  Unexpected response from the Datanodes for 'T' message, current request type 1
CONTEXT:  SQL function "f1" statement 1
ERROR:  Unexpected response from the data nodes for 'D' message, current request type 0
regression=#

ERROR: failed to deconstruct sort operators into partitioning/ordering operators

The following test case fails in XL:

postgres=# CREATE TABLE test (a int, b int);
postgres=# EXPLAIN SELECT last_value(a) OVER (PARTITION by b) FROM (SELECT * FROM test  ORDER BY a) AS s ORDER BY a;
ERROR:  failed to deconstruct sort operators into partitioning/ordering operators