pgmemento / pgmemento
Audit trail with schema versioning for PostgreSQL using transaction-based logging
License: GNU Lesser General Public License v3.0
TL;DR: the to_regtype
call in the event trigger procedure table_alter_pre_trigger
either throws syntax errors or doesn't catch data types with spaces (= no log)
Details
The event trigger procedure table_alter_pre_trigger
tries to parse different kinds of DDL statements that modify columns. The general idea is to split the DDL text by spaces into an array. Whenever the keywords ADD
, ALTER
or DROP
are found within a loop over the array, the column name is expected in the next iteration or the one after that. For ALTER
and DROP
events the extracted column name can be checked against the audit_column_log
table.
For ADD COLUMN events this is not possible, as the column does not exist yet. Therefore, the current assumption is to look out for a data type in the next iteration. I thought this could easily be checked with Postgres' to_regtype
function: if the current text input resembles a data type, to_regtype
returns the name, else NULL. Unfortunately, splitting up the DDL text can produce input that causes syntax errors, as reported by @coutoPL in his PR #11.
What's even worse is that data types with spaces do not produce the error but can bypass the logging. For example, a data type like double precision
results in two iterations with double
and precision
. to_regtype('double')
returns NULL, so the ADD COLUMN event is not logged. There would also be no entry in the audit_column_log
table.
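The token-splitting problem can be seen directly in psql; a minimal illustration using plain Postgres calls (this is not pgMemento code):

```sql
-- Splitting "double precision" on spaces yields two tokens,
-- neither of which to_regtype recognizes on its own:
SELECT to_regtype('double');            -- NULL
SELECT to_regtype('precision');         -- NULL
SELECT to_regtype('double precision');  -- double precision
```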
When changing a table name, column name or type name, nothing is changed in either audit_table_log
or audit_column_log
. This prevents properly restoring data on the renamed table.
I have tried the following types of queries and not seen the updates to audit tables as I would expect:
ALTER TABLE old_table RENAME TO new_table;
ALTER TYPE old_enum RENAME TO new_enum;
ALTER TABLE table RENAME COLUMN old_column TO new_column;
Not sure if there are more queries that are being missed.
Have an audit_schema_log
table to control for which schemas pgMemento has been initialized and when, to handle CREATE TABLE
events. With this it should also be possible to add support for CREATE TABLE AS
DDL events.
TEST.sql fails with
TEST 1: pgMemento setup
psql:test/setup/TEST_INSTALL.sql:182: ERROR: Error: Incorrect number of audit tables!
CONTEXT: PL/pgSQL function inline_code_block line 16 at ASSERT
Fails because the spatial_ref_sys
table is automatically created in the pgmemento
schema when creating the postgis extension.
PostGIS also creates these views: geography_columns, geometry_columns, raster_columns, raster_overviews,
and a bunch of stored procedures.
ENV:
Clean Postgres 10 database
postgis-2.5
I'd like to prevent certain users from editing log tables.
This should be possible by revoking editing permissions on the log tables from a user and setting the trigger functions to run as SECURITY DEFINER
so the log tables cannot be edited directly.
Is it reasonable to set trigger functions to SECURITY DEFINER
by default?
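A minimal sketch of the idea (the role name app_user is hypothetical; pgmemento.log_transaction() is one of the trigger functions mentioned elsewhere on this page):

```sql
-- Hypothetical example: block direct writes to a log table for a role ...
REVOKE INSERT, UPDATE, DELETE ON pgmemento.row_log FROM app_user;

-- ... and let the trigger function run with its owner's privileges
-- instead of the caller's:
ALTER FUNCTION pgmemento.log_transaction() SECURITY DEFINER;
```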
Nice work you have done.
Not really an issue, but I hope I am on the right track.
Will this work? (customer with 4 fields)
CREATE TABLE public.customer
(
name text,
email text,
phone text,
address text,
id serial
)
-- all the events that occurred
select event_id from pgmemento.row_log where audit_id = ( select audit_id from public.customer where id = 1)
-- and now i want to see how data looked at specific event (7)
SELECT
event_id,
first_value(name) over (partition by name_partition order by event_id) as name,
first_value(email) over (partition by email_partition order by event_id) as email,
first_value(phone) over (partition by phone_partition order by event_id) as phone,
first_value(address) over (partition by address_partition order by event_id) as address
FROM (
select
event_id,
rl.changes->>'name' as name,
sum(case when (rl.changes->'name')::text is null then 0 else 1 end) over (order by event_id) as name_partition,
rl.changes->>'email' as email,
sum(case when (rl.changes->'email')::text is null then 0 else 1 end) over (order by event_id) as email_partition,
rl.changes->>'phone' as phone,
sum(case when (rl.changes->'phone')::text is null then 0 else 1 end) over (order by event_id) as phone_partition,
rl.changes->>'address' as address,
sum(case when (rl.changes->'address')::text is null then 0 else 1 end) over (order by event_id) as address_partition
from row_log rl
inner join public.customer target
on target.audit_id = rl.audit_id
where
rl.event_id = 7
) as q
Hi @FxKu. What do you think of using transition tables in the row_log
triggers (or maybe other triggers as well) to improve performance? The only drawback I can think of is that it would require Postgres 10.
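For context, a minimal sketch of a statement-level trigger using transition tables (requires PostgreSQL 10+; all names here are illustrative, not pgMemento's):

```sql
CREATE OR REPLACE FUNCTION log_changes() RETURNS trigger AS $$
BEGIN
  -- The transition table new_rows exposes all rows touched by the
  -- statement at once, instead of one trigger call per row.
  INSERT INTO change_log (data)
  SELECT to_jsonb(n) FROM new_rows AS n;
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER log_update
AFTER UPDATE ON my_table
REFERENCING NEW TABLE AS new_rows
FOR EACH STATEMENT EXECUTE PROCEDURE log_changes();
```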
If a column has been changed multiple times in one transaction, multiple entries are added to audit_column_log
. All but one of these entries will have an empty txid_range
. This kind of makes sense; there isn't a transaction that would match those intermediate columns, so the range is empty.
However, when these empty range entries are present, the restore functionality no longer works for these tables. All of the restore functions end up using pgmemento.restore_query()
, and that function returns duplicate statements for the columns that were changed multiple times in one transaction earlier. pgmemento.get_column_list()
does not duplicate the columns, so I'm thinking it's something to do with the way pgmemento.restore_query()
deals with ranges that is incompatible with empty ranges.
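The affected entries can be spotted directly; a sketch, assuming txid_range is a standard range type as shown elsewhere on this page:

```sql
-- List audit_column_log entries whose logging range is empty
SELECT *
FROM pgmemento.audit_column_log
WHERE isempty(txid_range);
```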
Hi,
I've called the function create_table_audit, and all audit_table_log/audit_column_log tables are empty. I saw you have the function register_audit_table, but it's called only in the modify_ddl_log_tables function (which is called in the trigger that modifies the table itself).
In that function you do have this line
-- remember audit_id_column when registering table in audit_table_log later
but there is no call to register_audit_table?
In the opposite function, drop_table_audit, you do call unregister_audit_table (for some reason).
BR
When restoring data from a backup, some history data appears to be lost.
More details are here: #19 (comment)
While #71 might have fixed restoring JSONB and array values, it introduced a bug when restoring NULL values.
Before: COALESCE( '{"a":null}'::jsonb->'a') was NOT NULL (a JSON null is not an SQL NULL), and the subsequent ->> 0 extraction produced NULL.
Now: COALESCE( '{"a":null}'::jsonb->>'a') is NULL, so the query falls through to the recent state from the table.
Instead it must be:
CASE WHEN jsonb_typeof( '{"a":null}'::jsonb->'a') = 'null'
THEN NULL::columntype
ELSE COALESCE( '{"a":null}'::jsonb->>'a')::columntype -- else query recent state from table
END
I am a little confused - when I run this query I get the correct state of the data before transaction 32 - i.e. grade:1 and login_count:3
SELECT * FROM pgmemento.restore_record(1,32,'user','public', 1048, TRUE) AS (log JSONB);
However, when I run this query:
SELECT * FROM pgmemento.restore_recordset(1,32,'user','public',TRUE) AS (log JSONB);
I don't get any rows. I would have thought that restore_recordset was a superset of restore_record, or have I misunderstood?
row_log:
1247 32 1048 "{"grade": 1, "login_count": 3}"
TL;DR: the test returns the expected value, but it is not passed correctly to the assertion.
The final test in testrun.sh does not work properly during the Travis build (it passes only locally).
Problem: The executed SQL file sets a temporary configuration parameter that only lives for one session. For some reason it is not known when running the DO block passed with the same psql command (with -1 set).
Hi @FxKu, thank you for creating such an awesome tool!
Did you consider adding it to the PGXN registry? I guess this might help more people to find it.
If you don't have time, I could perhaps write an initial META.json spec file and make a PR to this repo, but I guess you'd still have to upload the final package to PGXN as the author.
Let me know what you think.
Cheers
There appears to be a bug with array columns when trying to restore records, giving this error message:
ERROR: malformed array literal: "abc"
DETAIL: Array value must start with "{" or dimension information.
CONTEXT: PL/pgSQL function restore_records(integer,integer,text,text,bigint,boolean) line 11 at RETURN QUERY
This is a transcript of what I did to produce this behaviour. I have tried using both master version and v0.7, installing using the provided INSTALL_PGMEMENTO.sql script.
testdb=# create table test2 (id serial primary key, values text[]);
CREATE TABLE
testdb=# insert into test2 (values) values ('{abc,def}');
INSERT 0 1
testdb=# select * from test2;
id | values | pgmemento_audit_id
----+-----------+--------------------
1 | {abc,def} | 2
(1 row)
testdb=# update test2 set values = '{ghi,jkl}';
UPDATE 1
testdb=# select * from test2;
id | values | pgmemento_audit_id
----+-----------+--------------------
1 | {ghi,jkl} | 2
(1 row)
testdb=# select * from pgmemento.row_log;
id | audit_id | event_key | old_data | new_data
----+----------+----------------------------------------------------------+----------------------------+--------------------------------------------------------------
1 | 1 | 1594320164.98008;1594320164.98008;2298892;3;test;public | | {"id": 1, "values": {"Test": 3}, "pgmemento_audit_id": 1}
2 | 2 | 1594320573.5166;1594320573.5166;2301281;3;test2;public | | {"id": 1, "values": ["abc", "def"], "pgmemento_audit_id": 2}
3 | 2 | 1594320585.49882;1594320585.49882;2301282;4;test2;public | {"values": ["abc", "def"]} | {"values": ["ghi", "jkl"]}
(3 rows)
testdb=# select * from pgmemento.restore_records(1, 100, 'test2', 'public', 2, true) as (log jsonb);
ERROR: malformed array literal: "abc"
DETAIL: Array value must start with "{" or dimension information.
CONTEXT: PL/pgSQL function restore_records(integer,integer,text,text,bigint,boolean) line 11 at RETURN QUERY
Hi,
I've got this error message:
ERROR: extension "pgmemento" has no update path from version "0.7.1" to version "0.7.2"
SQL state: 22023
pgmemento.create_table_audit
function throws an error.
It's easy to replicate:
CREATE DATABASE test;
--switch to database test;
CREATE EXTENSION pgmemento;
CREATE TABLE test_tab (
id INTEGER
);
SELECT pgmemento.create_table_audit(
tablename := 'test_tab',
schemaname := 'public',
audit_id_column_name := 'audit_id');
You'll get:
[42701] ERROR: column "audit_id" of relation "test_tab" already exists Where: SQL statement "ALTER TABLE public.test_tab ADD COLUMN audit_id BIGINT DEFAULT nextval('pgmemento.audit_id_seq'::regclass) UNIQUE NOT NULL" PL/pgSQL function pgmemento.create_table_audit_id(text,text,text) line 8 at EXECUTE SQL statement "SELECT pgmemento.create_table_audit_id($1, $2, $3)" PL/pgSQL function pgmemento.create_table_audit(text,text,text,boolean,boolean,boolean) line 41 at PERFORM
I guess create_table_audit_id
function is called twice here:
First create_table_audit
finds nothing in audit_schema_log
table and calls create_schema_audit
Which in turn calls create_table_audit for this particular table. This one creates an id column, so the outer call fails.
My guess is there should be an ELSE
block here, so standalone create_table_audit_id
is called only when schema audit is already set.
I'm getting an error when trying to delete from a table with enabled audit.
[2019-05-30 19:40:08] [42501] ERROR: permission denied for table transaction_log
[2019-05-30 19:40:08] Where: SQL statement "INSERT INTO pgmemento.transaction_log
[2019-05-30 19:40:08] (txid, stmt_date, process_id, user_name, client_name, client_port, application_name, session_info)
[2019-05-30 19:40:08] VALUES
[2019-05-30 19:40:08] ($1, transaction_timestamp(), pg_backend_pid(), current_user, inet_client_addr(), inet_client_port(),
[2019-05-30 19:40:08] current_setting('application_name'), session_info_obj
[2019-05-30 19:40:08] )
[2019-05-30 19:40:08] ON CONFLICT (txid, stmt_date)
[2019-05-30 19:40:08] DO NOTHING
[2019-05-30 19:40:08] RETURNING id"
[2019-05-30 19:40:08] PL/pgSQL function pgmemento.log_table_event(bigint,oid,text) line 29 at SQL statement
[2019-05-30 19:40:08] SQL statement "SELECT pgmemento.log_table_event(txid_current(), TG_RELID, TG_OP)"
[2019-05-30 19:40:08] PL/pgSQL function pgmemento.log_transaction() line 3 at PERFORM
[2019-05-30 19:40:08] SQL statement "DELETE FROM ONLY "auth"."lnk_users_user_roles" WHERE $1 OPERATOR(pg_catalog.=) "user_guid""
I'm logged in and executing a query as a postgres superuser, so this looks quite strange.
I've looked into the pgmemento.log_transaction()
definition and found just a plain insert into the transaction_log
table. Nothing that could prevent inserting into it.
Any clues on how to debug this?
Creating a table like this
CREATE TABLE public."My-table"
(
"My id" character varying(15),
...
)
triggers several errors:
..INIT.sql:36: ERROR: relation «public.my-table» does not exist
Any time estimates on when 0.7 would be released?
It'd be great to have an extension instead of installation script.
Extension is way more migration friendly.
I would like to import data frequently. About every 5 minutes.
I would like to track when the data changes, but without filling up transaction_log
and table_event_log
.
Any ideas?
pgMemento v0.6
implements row level tracking by adding a field named audit_id
to tables.
This RFC proposes to improve the audit_id
implementation to make it less invasive.
audit_id
is a quite common name for a table column, which may conflict with a table's native data structure.
Currently, pgMemento does not provide any means to configure or change that behaviour.
Customization of the column name might seem to be too complicated and even over-engineering. However, it could be done some time in the future without introducing breaking changes to the module.
On the other hand, picking a more unique name for the column
introduces somewhat breaking changes, whereas it is quite easy to implement without much complexity and effort.
I suggest we change the audit_id
column name to something quite unique so that it reduces chances of conflicting with native data column names. Ideally it should contain the name of the extension as prefix.
Option A is "pgmemento audit_id"
, using a space to separate prefix and column name.
Option B is "pgmemento__audit_id"
, using two underscores to separate prefix and column name.
Maybe pgMemento can already do this, and I just didn't see it, but is there a way to specify on the client side what user is making the changes? My applications by and large all connect as the same PG user in a pool, and every query runs from that pool. So every change through the app would just log the user_name as myapp
or whatever, which isn't super useful.
I could implement an updated_by
or updated_by_id
field on each audited table, and that would have the user that made the change in the audit log, but it's not at the top level with the connection username where it would be more convenient to access. Also, does pgMemento store a diff of the changes? If so, then if the same user updated a row multiple times, that updated_by
field wouldn't be present in the diff, making it difficult to track.
I'm also thinking about things such as logging maybe the ip address of the application user which has similar problems.
Is there any way for pgMemento to lift this information up to the transaction_log
table? Maybe PG has some kind of connection context/session that could be used to set this information that pgMemento could then read in? Even if it just dumped this information into a JSONB field like transaction_log.client_session
or something could be really useful.
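Postgres does offer per-session custom settings that a trigger could read; a sketch of that pattern (the setting name myapp.current_user is hypothetical, and pgMemento would have to read it itself):

```sql
-- In the application, right after checking out a pooled connection:
SELECT set_config('myapp.current_user', 'alice@example.com', false);

-- In a trigger or logging function, read it back; the second argument
-- makes current_setting return NULL instead of raising an error
-- when the setting was never set in this session:
SELECT current_setting('myapp.current_user', true);
```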
So far, pgMemento can be initialized per schema via an interactive INIT script, but there should also be a function one can use.
This function could also init logging for more schemas at once by passing a text array (or all schemas if the array is empty). When realizing #24, init
should also register schemas for logging.
We store our main data inside one JSONB field. Can it track only what was changed inside jsonb?
Hi all,
When we add new columns to a wide(ish) table that is tracked, we get the following issue which makes the migration fail.
2022-08-23 08:57:26.084 UTC [34] ERROR: cannot pass more than 100 arguments to a function at character 104
2022-08-23 08:57:26.084 UTC [34] QUERY: INSERT INTO pgmemento.row_log AS r (audit_id, event_key, new_data)
SELECT audit_trail_id, $1, jsonb_build_object('col', "col", 'col2', col2, ...) AS content
The statement itself is
2022-08-23 08:57:26.084 UTC [34] STATEMENT: ALTER TABLE "public"."table" ADD COLUMN "column" TIMESTAMP WITH TIME ZONE DEFAULT NULL;
Please let us know how to proceed with this as we might have to disable tracking in PROD due to this issue if we want to release these new columns.
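A note on the limit itself: by default Postgres caps any function call at 100 arguments (FUNC_MAX_ARGS), so jsonb_build_object with one name/value pair per column hits the cap at 50 columns. One possible direction, sketched here as an assumption and not pgMemento's actual fix, is to serialize the whole row record instead of passing every column as an argument:

```sql
-- Build the row's JSONB from the record itself and drop the audit
-- column afterwards; some_table is a placeholder name.
SELECT to_jsonb(t) - 'pgmemento_audit_id'
FROM some_table AS t;
```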
When using pgmemento.revert_transaction(id)
on a transaction with changes for an array column, the update to the array column fails. In this case, the table has a default value of []
NOTICE: Could not revert UPDATE event for table public.Table: null value in column "arrayColumn" violates not-null constraint
NOTICE: Could not revert UPDATE event for table public.Table: malformed array literal: "arrayValue"
The audit_column_log
does look correct for this column.
db=# select * from pgmemento.audit_column_log where column_name = 'arrayColumn';
id | audit_table_id | column_name | ordinal_position | data_type | column_default | not_null | txid_range
-----+----------------+----------------+------------------+--------------------------+--------------------------------------------------------+----------+------------
214 | 26 | arrayColumn | 53 | character varying(255)[] | ARRAY[]::character varying[]::character varying(255)[] | t | (1,)
(1 row)
Hi,
With this setup:
create table users (id int, name text, age int);
select pgmemento.create_table_audit(
schemaname => 'public',
tablename => 'users',
audit_id_column_name => 'audit_log_id',
log_old_data => TRUE,
log_new_data => TRUE,
log_state => TRUE
);
create function update_age (_id int, _age int) returns void language 'sql' as $$
update users set age = _age where id = _id;
$$;
create function update_user (_id int, _name text, _age int) returns void language 'sql' as $$
update users set name = _name where id = _id;
select update_age(_id, _age);
$$;
insert into users values (1, 'foo', 10);
Data in the row_log:
select old_data, new_data from pgmemento.row_log;
-- old_data | new_data
-- ---------+-----------------------------------------------------------
-- | {"id": 1, "age": 10, "name": "foo", "audit_log_id": 1}
Update via the function above:
select update_user(1, 'foo updated', 20);
select old_data, new_data from pgmemento.row_log;
-- old_data | new_data
-- ----------------+-----------------------------------------------------------
-- | {"id": 1, "age": 10, "name": "foo", "audit_log_id": 1}
-- {"name": "foo"} | {"age": 20}
Whereas the expected data would be:
-- old_data | new_data
-- ---------------------------+-----------------------------------------------------------
-- | {"id": 1, "age": 10, "name": "foo", "audit_log_id": 1}
-- {"name": "foo", "age": 10} | {"name": "foo updated", "age": 20}
Or perhaps:
-- old_data | new_data
-- ----------------+-----------------------------------------------------------
-- | {"id": 1, "age": 10, "name": "foo", "audit_log_id": 1}
-- {"name": "foo"} | {"name": "foo updated"}
-- {"age": 10} | {"age": 20}
Which is what happens when one runs the updates directly:
begin;
update users set name = 'foo updated' where id = 1;
update users set age = 20 where id = 1;
commit;
Tested on PostgreSQL version 12.2 with version 0.7.2 of pgMemento. Is this a valid bug or am I doing something wrong?
Hi, I installed pgmemento
(pre-compiled) in dockerfile image bitnami/postgres:12.2
like so:
FROM bitnami/postgresql:12
USER root
RUN apt-get update
RUN DEBIAN_FRONTEND=noninteractive apt-get -y --no-install-recommends install unzip wget make
RUN wget https://github.com/pgMemento/pgMemento/releases/download/v0.7/pgmemento-0.7.zip
RUN unzip pgmemento-0.7.zip
RUN cd pgmemento-0.7 \
&& make install PG_CONFIG=/opt/bitnami/postgresql/bin/pg_config
COPY ./init/ /docker-entrypoint-initdb.d/
USER 1001
When trying to install the extension, I'm getting error
SQL Error [22023]: ERROR: extension "pgmemento" has no installation script nor update path for version "0.7"
input:
SELECT * FROM pg_available_extensions WHERE name = 'pgmemento';
CREATE EXTENSION pgmemento;
output:
"name","default_version","installed_version","comment"
pgmemento,"0.7",,Audit trail with schema versioning for PostgreSQL using transaction-based logging
SQL Error [22023]: ERROR: extension "pgmemento" has no installation script nor update path for version "0.7"
It would be nice if a future version made it possible to use table partitioning (https://www.postgresql.org/docs/11/ddl-partitioning.html).
What do you think about replicating stmt_date in the table_event_log and row_log tables?
Currently, if one calls pgmemento.restore_record
with an audit_id
value that does not exist, one row is still returned, containing only NULL
values. The expected behaviour would be that no row is returned at all.
@FxKu Can you please share your thoughts on what it takes to make it possible?
I'd like to use my existing UUID
columns as audit keys.
Hi,
this is a feature request: I would like to start auditing for some columns of table.
For instance:
SELECT pgmemento.create_table_audit(
tablename := 'contacts',
schemaname := 'public',
audit_id_column_name := 'audit_trail_id',
log_old_data := TRUE,
log_new_data := TRUE,
log_state := TRUE,
include_columns := ARRAY['firstname', 'lastname']
);
or
SELECT pgmemento.create_table_audit(
tablename := 'contacts',
schemaname := 'public',
audit_id_column_name := 'audit_trail_id',
log_old_data := TRUE,
log_new_data := TRUE,
log_state := TRUE,
exclude_columns := ARRAY['foobar']
);
Best regards,
Stéphane
I have a JSONB column in one of my records. Edits to this column are recorded correctly in the row_log
, but when I use restore_record
on that record, the JSONB column is always null
. Have I set something up incorrectly, or is this a bug?
I have attached a transcript of what I did to produce this:
testdb=# create table test (id SERIAL PRIMARY KEY, values JSONB NOT NULL DEFAULT '{}');
CREATE TABLE
testdb=# insert into test (values) values ('{ "test": 3 }');
INSERT 0 1
testdb=# select * from test;
id | values | pgmemento_audit_id
----+-------------+--------------------
1 | {"test": 3} | 53
(1 row)
testdb=# select * from pgmemento.restore_record(1, 50, 'test', 'public', 53, true) as (log JSONB);
log
-----------------------------------------------------
{"id": 1, "values": null, "pgmemento_audit_id": 53}
(1 row)
testdb=# update test set values = '{ "test": 4 }';
UPDATE 1
testdb=# select * from pgmemento.restore_records(1, 50, 'test', 'public', 53, true) as (log JSONB);
log
---------------------------------------------------------------------------------------------------------------------------------------------------------
{"id": 1, "values": null, "stmt_time": "2020-07-09T14:15:34.823561-04:00", "transaction_id": 44, "table_operation": "INSERT", "pgmemento_audit_id": 53}
{"id": 1, "values": null, "stmt_time": "2020-07-09T14:16:53.936469-04:00", "transaction_id": 45, "table_operation": "UPDATE", "pgmemento_audit_id": 53}
(2 rows)
testdb=# select * from pgmemento.row_log where audit_id = 53;
id | audit_id | event_key | old_data | new_data
----+----------+---------------------------------------------------------+-------------------------+------------------------------------------------------------
57 | 53 | 1594318534.82356;1594318534.82356;2295516;3;test;public | | {"id": 1, "values": {"test": 3}, "pgmemento_audit_id": 53}
58 | 53 | 1594318613.93647;1594318613.93647;2295517;4;test;public | {"values": {"test": 3}} | {"values": {"test": 4}}
I would expect in the output that I see the full JSONB object, but the values
are always null. They are correctly recorded in the row_log
though.
I have installed PgMemento to my database using the provided INSTALL_PGMEMENTO.sql
script, both master and v0.7 versions.
Hi all!
First of all.. congrats for this amazing script!. I just love it!
I have a newbie question (sorry)... I would like to have a "full history" of a record (or even better, a range of records). So, given an audit_id (or a set of them) and a "from date" and "to date", I would like to see the series of records ordered by date, where each record has the full state (all fields with their data) at a given date.
date | audit_id | field 1 | field 2 | ... | field n |
---|---|---|---|---|---|
2020/09/18 | 1 | original value 1 | original value 2 | ... | original value n |
2020/09/20 | 1 | original value 1 | updated value 2 | ... | original value n |
2020/09/25 | 1 | updated value 1 | updated value 2 | ... | updated value n |
This is kind of what you can do with temporal_tables.
Is there any "easy" way to reach that?.
I've seen that if I call
SELECT
*
FROM
pgmemento.restore_record(1, 9, 'company_limits_assets', 'platform', 1, TRUE)
AS (log jsonb);
I get the final status... so I have to query the transaction table, get all transactions between two dates and call pgmemento.restore_record
one time per transaction... however, I think that this is far from being efficient.
Is there any other way?
Thank you!
-- Jorge
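For what it's worth, the per-transaction loop described above could be driven by a query like this (a sketch against pgMemento's transaction_log; the column names are assumed from the output shown elsewhere on this page):

```sql
-- Transactions logged between two dates; one restore_record call
-- would then be issued per returned id.
SELECT id, txid_time
FROM pgmemento.transaction_log
WHERE txid_time >= '2020-09-18'
  AND txid_time <  '2020-09-26'
ORDER BY txid_time;
```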
Hey again, I believe I found another issue. I tried to create a table containing a foreign key, but I get an error in the create table trigger:
> createdb testing
> psql -f INSTALL_PGMEMENTO.sql -d testing
> psql -f INIT.sql -d testing
SET
Please enter the name of the schema to be used along with pgMemento: public
Store new data in audit logs, too? (y|N): y
Log existing data as inserted (baseline)? (y|N): y
Trigger CREATE TABLE statements? (y|N): y
Specify tables to be excluded from logging processes (separated by comma):
Initializing pgMemento in public schema ...
init
---------------------------------------------
pgMemento is initialized for public schema.
> psql -d testing
testing=# create table test (id serial primary key, value integer);
CREATE TABLE
testing=# create table test2 (id serial primary key, test_id integer references test (id));
ERROR: column "pgmemento_audit_id" of relation "test2" already exists
CONTEXT: SQL statement "ALTER TABLE public.test2 ADD COLUMN pgmemento_audit_id BIGINT DEFAULT nextval('pgmemento.audit_id_seq'::regclass) UNIQUE NOT NULL"
PL/pgSQL function create_table_audit_id(text,text,text) line 8 at EXECUTE
SQL statement "SELECT pgmemento.create_table_audit_id($1, $2, $3)"
PL/pgSQL function create_table_audit(text,text,text,boolean,boolean,boolean) line 42 at PERFORM
SQL statement "SELECT pgmemento.create_table_audit(
tablename,
schemaname,
current_default_column,
current_log_old_data,
current_log_new_data,
FALSE
)"
PL/pgSQL function table_create_post_trigger() line 42 at PERFORM
This occurred both in v0.7 and master.
The context of ALTER TABLE
events is retrieved with GET DIAGNOSTICS stack = PG_CONTEXT;
and then parsed character by character for the table name and then word by word for column names.
So far, whenever comment symbols such as --
and /* ... */
are found in the context, an error is thrown, because these comments could contain SQL commands which would then be parsed as well. But there should be a way to handle these comments:
-- comments could be cut out
/* ... */ comments can be handled later in the word-wise parsing stage
Postgres 11.1
PgMemento scripts executed from latest commit available at the moment of reporting this issue:
cc0f1ff
Dropping table without error
After creating a table and dropping it, an error occurs:
create table "scheduled_messages" ("id" BIGSERIAL NOT NULL PRIMARY KEY);
drop table "scheduled_messages";
sql> create table "scheduled_messages" ("id" BIGSERIAL NOT NULL PRIMARY KEY)
[2019-01-30 13:42:07] completed in 30 ms
sql> drop table "scheduled_messages"
[2019-01-30 13:42:10] [42704] ERROR: unrecognized configuration parameter "pgmemento.688"
[2019-01-30 13:42:10] Where: SQL statement "UPDATE
[2019-01-30 13:42:10] pgmemento.audit_table_log
[2019-01-30 13:42:10] SET
[2019-01-30 13:42:10] txid_range = numrange(lower(txid_range), current_setting('pgmemento.' || txid_current())::numeric, '(]')
[2019-01-30 13:42:10] WHERE
[2019-01-30 13:42:10] table_name = $1
[2019-01-30 13:42:10] AND schema_name = $2
[2019-01-30 13:42:10] AND upper(txid_range) IS NULL
[2019-01-30 13:42:10] AND lower(txid_range) IS NOT NULL
[2019-01-30 13:42:10] RETURNING
[2019-01-30 13:42:10] id"
[2019-01-30 13:42:10] PL/pgSQL function unregister_audit_table(text,text) line 6 at SQL statement
[2019-01-30 13:42:10] SQL statement "SELECT pgmemento.unregister_audit_table(obj.object_name, obj.schema_name)"
[2019-01-30 13:42:10] PL/pgSQL function table_drop_post_trigger() line 11 at PERFORM
Could anyone confirm the same behaviour in their environment?
Hey there :)
I believe I have found a bug where a trigger fires once too often, or something like that.
This is something I noticed when updating the tests for postgraphile-audit-plugin
to 0.7 and it also occurs on the current master.
It can be reproduced by running INSTALL_PGMEMENTO.sql
followed by:
CREATE SCHEMA postgraphile_audit_plugin;
SELECT pgmemento.init(
'postgraphile_audit_plugin', -- schema_name
'pgmemento_audit_id', -- audit_id_column_name
TRUE, -- log_old_data
TRUE, -- log_new_data
FALSE, -- log_state
TRUE, -- trigger_create_table
'{}' -- except_tables
);
CREATE TABLE postgraphile_audit_plugin.users (
id SERIAL PRIMARY KEY
);
CREATE TABLE postgraphile_audit_plugin.users_emails (
user_id INT NOT NULL REFERENCES postgraphile_audit_plugin.users(id)
);
The second CREATE TABLE
statement will fail with
psql:__tests__/setupSchema.sql:27: ERROR: column "pgmemento_audit_id" of relation "users_emails" already exists
CONTEXT: SQL statement "ALTER TABLE postgraphile_audit_plugin.users_emails ADD COLUMN pgmemento_audit_id BIGINT DEFAULT nextval('pgmemento.audit_id_seq'::regclass) UNIQUE NOT NULL"
PL/pgSQL function create_table_audit_id(text,text,text) line 8 at EXECUTE
SQL statement "SELECT pgmemento.create_table_audit_id($1, $2, $3)"
PL/pgSQL function create_table_audit(text,text,text,boolean,boolean,boolean) line 41 at PERFORM
SQL statement "SELECT pgmemento.create_table_audit(
tablename,
schemaname,
current_default_column,
current_log_old_data,
current_log_new_data,
FALSE
)"
PL/pgSQL function table_create_post_trigger() line 42 at PERFORM
To reproduce this, you can use our test setup:
Just clone https://github.com/mayflower/postgraphile-audit-plugin and run yarn run-tests
(you'll need yarn
and docker
installed).
Had it in the beginning. Dropped it then. But it should not be hard to bring it back. The question is which literal shall be used for the target_table_type
parameter:
MATERIALIZED VIEW
- probably good to integrate in scripts
MVIEW
or MATVIEW
- shorter and faster to write when executed manually
Maybe both forms should be supported.
Hi, i'm trying to install pgMemento as an extension.
The make install puts the file down at this location: /usr/share/pgsql/extension/pgmemento.control
However, my Postgres instance wants to look for it at this location: /usr/pgsql-11/share/extension/pgmemento.control
I copied the file there, and can successfully run SELECT * FROM pg_available_extensions WHERE name = 'pgmemento';
- However, if I now run CREATE EXTENSION pgmemento;
I get the following error:
ERROR: extension "pgmemento" has no installation script nor update path for version "0.7.3"
How do i resolve this?
Would you be able to tell me how I can find out which schemas of my database are monitored at any one time? I have multiple schemas in my databases and have started and stopped auditing for some of them multiple times (using the stop_auditing.sql and start_auditing.sql scripts) and lost track of it. Is there a way to find out which ones are currently monitored?
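One approach that follows from pgMemento's use of txid_range shown elsewhere on this page (a sketch, not an official API): tables whose range in audit_table_log is still open are the ones currently being logged.

```sql
-- Schemas that currently contain at least one audited table
SELECT DISTINCT schema_name
FROM pgmemento.audit_table_log
WHERE upper(txid_range) IS NULL
  AND lower(txid_range) IS NOT NULL;
```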
I've got an endless loop from a simple command:
DROP TABLE IF EXISTS myschema.nonexisting_table CASCADE;
Digging into the code of the table_drop_pre_trigger function, I've found that this line causes the trouble: it sets the fetch_next variable to TRUE, and this results in an endless loop.
@FxKu Can you please explain the logic behind this EXCEPTION block? Why does it set fetch_next to FALSE on invalid_name and to TRUE on undefined_table? Isn't NULL an invalid name?
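To make the question concrete, here is a hypothetical reduction of the pattern; the variable and loop structure are assumptions, not the actual pgMemento source:

```sql
-- hypothetical PL/pgSQL sketch: if the relation never exists,
-- undefined_table is raised on every pass, fetch_next stays TRUE,
-- and the loop never terminates
LOOP
  BEGIN
    PERFORM 'myschema.nonexisting_table'::regclass;
    fetch_next := FALSE;
  EXCEPTION
    WHEN invalid_name THEN
      fetch_next := FALSE;  -- malformed identifier: give up
    WHEN undefined_table THEN
      fetch_next := TRUE;   -- table missing: try again -> endless loop for IF EXISTS
  END;
  EXIT WHEN NOT fetch_next;
END LOOP;
```

With DROP TABLE IF EXISTS on a non-existing table the relation can never be resolved, so any retry-on-undefined_table strategy cannot terminate.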
In Line 53 in 9baeba2 the value extract(epoch from t.txid_time) is repeated twice. This is unnecessary and doesn't add any uniqueness to the key values, so when you have two events of the same type in one transaction the key won't be unique.
I guess what was really meant here is extract(epoch from e.stmt_time), so the whole key looks like:
event_key = concat_ws(';', extract(epoch from t.txid_time), extract(epoch from e.stmt_time), t.txid, e.op_id)
@FxKu Is that true?
So far, an ALTER COLUMN event is only logged if the data type changes and the DDL command contains the keyword USING. The idea behind this was that it's not necessary to log the whole data of the affected column if it would still work against the old data type. However, the changes in the table layout should still be logged in audit_column_log, which is unfortunately not the case.
The alter_table_pre_trigger function should also care for commands where:
- USING is used
- the nullable behaviour changes
If this is parsed correctly and the event is logged, the necessary updates of audit_column_log will be applied by the alter_table_post_trigger procedure.
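For illustration, these are the kinds of statements meant above; the table t and column c are hypothetical names:

```sql
-- type change with USING: the only ALTER COLUMN case logged so far
ALTER TABLE t ALTER COLUMN c TYPE integer USING c::integer;

-- type change without USING (old data still castable): currently not
-- reflected in audit_column_log
ALTER TABLE t ALTER COLUMN c TYPE text;

-- nullable behaviour change: currently not reflected either
ALTER TABLE t ALTER COLUMN c SET NOT NULL;
```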
The session object stored in pgmemento.transaction_log
is not a valid object after calling pgmemento.init()
. The session is there, but it is stored as a string rather than an object:
id | txid | txid_time | process_id | user_name | client_name | client_port | application_name | session_info
------+------+-------------------------------+------------+-----------+---------------+-------------+------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 | 1112 | 2020-08-20 15:05:16.323992+00 | 74 | user | 172.26.0.1/32 | 40024 | | "{\"pgmemento_init\": {\"schema_name\": '\"public\"', \"default_audit_id_column\": '\"pgmemento_audit_id\"', \"default_log_old_data\": 'true', \"default_log_new_data\": 'true', \"log_state\": 'true', \"trigger_create_table\": 'true', \"except_tables\": '[\"SequelizeMeta\", \"Sessions\"]'}}"
...
296 | 1553 | 2020-08-20 15:14:59.944585+00 | 80 | user | 172.27.0.1/32 | 34734 | | {"message": "Added test users", "client_user": "seed"}
Setting the session_info for further changes does work correctly; it is just the init that creates a string instead of an object.
The test for this is also broken, as it attempts to check for a pgmemento_init session_info but doesn't actually work correctly.
ASSERT (
SELECT EXISTS (
SELECT session_info ? 'pgmemento_init'
FROM pgmemento.transaction_log
WHERE id = current_setting('pgmemento.' || txid_current())::numeric
)
), 'Error: Could not find entry in transaction_log for stopping audit trail in schema %!', tab_schema;
The SELECT EXISTS just hides the false value returned from SELECT session_info ? 'pgmemento_init'.
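To sketch why the assertion always passes: EXISTS only checks that the subquery returns a row at all, so the boolean in its SELECT list is ignored. Moving the predicate into WHERE (using the table layout shown above, with a hard-coded id for illustration) makes the check meaningful:

```sql
-- always true as soon as the transaction_log row exists,
-- even when session_info does not contain the key:
SELECT EXISTS (
  SELECT session_info ? 'pgmemento_init'
  FROM pgmemento.transaction_log
  WHERE id = 1
);

-- only true when the row exists AND session_info is an object
-- with the 'pgmemento_init' key:
SELECT EXISTS (
  SELECT 1
  FROM pgmemento.transaction_log
  WHERE id = 1
    AND session_info ? 'pgmemento_init'
);
```

Note that jsonb's ? operator matches top-level keys of an object; on the string-encoded value produced by init it returns false, which is exactly what the EXISTS wrapper swallows.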
https://github.com/pgMemento/pgMemento#quickstart says:
To use pgMemento as an extension download and unzip the pgmemento-.zip archive, change to extracted folder and run the following commands from within a shell environment:
make
sudo make install
However, there is no Makefile in the root of the archive, nor in src. There is one in extension/pgxn, but trying to use that results in an error:
~/pgMemento-0.7.1/extension/pgxn# make
grep: pgmemento.control: No such file or directory
grep: pgmemento.control: No such file or directory
make: *** No rule to make target 'sql/pgmemento.sql', needed by 'sql/pgmemento--.sql'. Stop.
It's probably not the Makefile the README was referencing anyway.
Installation worked perfectly fine for me using the described "classic way", but I wanted to let you know. :)
Hi there!
It would be great if there were a way to let pgMemento not only save the value before a transaction, but also the value after a transaction. Currently we're calculating the value after by executing restore_record, but this is quite expensive for a lot of changes at once, so making the storage <-> runtime tradeoff would be well worth it.
Thank you very much!
(Here's what we are currently doing)
Idea: Add a timestamp column to table_event_log and feed it with statement_timestamp(). This way multiple events of the same type can be logged per transaction. The important question is: can INSERTs into row_log be referenced correctly and fast enough (statement_timestamp needs to be the same in the BEFORE and AFTER phase!)? Previous attempts to use transaction variables also for table events decreased the logging performance, so this might not be an option then.
A change in the event logging would also affect the code in DDL_LOG, REVERT and SETUP, where logging only one event per transaction is used as a feature (by calling log_table_event manually).
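A quick illustration of the timing guarantee the idea relies on (plain SQL, no pgMemento objects involved): statement_timestamp() is frozen for the duration of one statement, and therefore identical in all BEFORE and AFTER triggers fired by that statement, while clock_timestamp() advances during execution:

```sql
-- statement_timestamp() is constant within one statement,
-- clock_timestamp() is read from the wall clock on every call
SELECT statement_timestamp() = statement_timestamp() AS stable_within_statement,
       clock_timestamp()    = clock_timestamp()      AS wall_clock_equal;
-- stable_within_statement is always true;
-- wall_clock_equal is usually false (barring sub-microsecond coincidence)
```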
When uninstalling pgMemento from my PostgreSQL 10 database, using the UNINSTALL_PGMEMENTO.sql script, I get the following error message:
Removing audit_id columns from audited tables ...
psql:UNINSTALL_PGMEMENTO.sql:40: ERROR: relation "slp20110006.d20180328_sa_births_pslk_source" does not exist
CONTEXT: SQL statement "SELECT EXISTS (
SELECT
1
FROM
pg_attribute
WHERE
attrelid = ($2 || '.' || $1)::regclass::oid
AND attname = 'audit_id'
AND attislocal = 't'
AND NOT attisdropped
)"
PL/pgSQL function drop_table_audit_id(text,text) line 4 at IF
SQL statement "SELECT pgmemento.drop_table_audit_id($1, $2)"
PL/pgSQL function drop_table_audit(text,text) line 4 at PERFORM
I have checked table pgmemento.audit_table_log and found an entry for table slp20110006.d20180328_sa_births_pslk_source with txid_range [1248,]. However, there is indeed no table d20180328_sa_births_pslk_source in schema slp20110006. This is a table name that previously existed but that has now been renamed.
Should the uninstall script cater for this eventuality of tables being deleted or renamed?
Is there any other way to safely uninstall/remove pgMemento from my PostgreSQL database?
Thanks in advance for any help.
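Until the uninstall script handles this case, a query along these lines might help spot such stale entries beforehand; the column names schema_name and table_name are assumptions based on the log output above:

```sql
-- audit_table_log entries whose txid_range is still open but whose
-- relation no longer exists (e.g. renamed or dropped outside the log);
-- to_regclass() returns NULL instead of raising an error
SELECT schema_name, table_name
FROM pgmemento.audit_table_log
WHERE upper(txid_range) IS NULL
  AND to_regclass(format('%I.%I', schema_name, table_name)) IS NULL;
```

Entries returned here are the ones that would make drop_table_audit_id fail with "relation ... does not exist" during uninstall.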