sysadminmike / couch-to-postgres Goto Github PK

View Code? Open in Web Editor NEW

110.0 7.0 17.0 59 KB

Node libary to stream CouchDB changes into PostgreSQL

License: BSD 2-Clause "Simplified" License

JavaScript 94.44% Shell 5.56%

couchdb postgresql synchronization change-data-capture cdc cloudant

couch-to-postgres's People

Stargazers

Watchers

Forkers

neogis-de rusllonrails omaranda septima vingao wiecklabs grengojbo jeaneric dgleba uka4me teamofmonkeysnet jmorganh noahdiewald sylvanotombo davida-natolotra mahmoudfathy557

couch-to-postgres's Issues

Postgres disconnects after inserting into couchdb (http_put)

I've installed couch-to-postgres and the http extension for postgres. When I insert from Postgres to Couchdb, postgres disconnects, however in Couchdb data has been succesfully inserted.

This is the change in the trigger function couchdb_put() I made, because with POST it never worked for me:

SELECT status FROM http_put('http://127.0.0.1:5984/' || TG_TABLE_NAME || '/' || NEW.id::text, NEW.doc::text, 'Content-Type:application/json'::text) INTO RES;

This is the test:
INSERT INTO example (id, doc, from_pg) VALUES ('i', json_object('{_id,myvar}','{i, 100}')::jsonb, true);

This is the message I get from wireshark:
PUT /example/i HTTP/1.1\r\n

this is the message I get from the couchdb log:
Sun, 04 Oct 2015 16:29:33 GMT] [info] [<0.439.0>] 127.0.0.1 - - PUT /example/i 201

and this is the postgres log:

2015-10-04 10:33:35 CST [1727-3] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally an
d possibly corrupted shared memory.
2015-10-04 10:33:35 CST [1727-4] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2015-10-04 10:33:35 CST [709-20] LOG: all server processes terminated; reinitializing
2015-10-04 10:33:35 CST [1770-1] LOG: database system was interrupted; last known up at 2015-10-04 10:29:34 CST
2015-10-04 10:33:35 CST [1770-2] LOG: database system was not properly shut down; automatic recovery in progress
2015-10-04 10:33:35 CST [1770-3] LOG: record with zero length at 0/18146D0
2015-10-04 10:33:35 CST [1770-4] LOG: redo is not required
2015-10-04 10:33:35 CST [1770-5] LOG: MultiXact member wraparound protections are now enabled

Versions:
postgresql-9.4
couchdb 1.6.1
pgsql-http 1.1

I would appreciate any help from you, thank you very much in advance!!

G.Carranza

sort out couch_put - replace with couch_post

Function needs to deal with status code returns from http_post
Also use POST not PUT requests so no need to edit pgsql-http extension before compilation.

ie implement:

post_docs(docs,chunk_size) - returning recordset of status codes? or just true/false?

update readme to reflect change to http_post function in postgres http extension

Old way:
http_post(url VARCHAR, params VARCHAR, data VARCHAR, contenttype VARCHAR DEFAULT NULL)

is no good any more

multi master replication

is it possible use couchdb replicator over the postgres database on couch-to-postgres?

look at forever - for node daemon

Look at: https://www.npmjs.com/package/forever for keeping client up in case of issue

example use sql

Do a wiki/readme page of example sql queries eg:

SELECT DISTINCT doc->>'type' as doctype, count(doc->>'type')
FROM mytable GROUP BY doctype ORDER BY doctype

And any things which might bite like the ORDER BY issue

daemon - __sidekiq esque

web interface to add/remove enable/disable reset initial_since checkpoint restart stream etc

daemon - add some kind of stats collection

we have postgres or couch to dump some statistics about the feeds to

eg inserts/updates/deletes a sec / min / hr

should be pretty easy to pump to couch and use elastic search + kibrana for some pretty graphs

possible issue with couch restarting

I think the daemon may stop following a _changes feed but think all is ok if couch crashes and is restarted in between a watchdog check.

Still testing and not sure if its the daemon part or library - could possible be the npm 'follow' but not sure.

A restart of the daemon is all that is needed to fix this but can leave postgres out of sync with couch

add _add + friends feed to api

_add
_enable
_disable
_remove
All need to accept json as get/post request and keep it couchy

_feeds_status - list all feeds - current /_status does this - add more info about feed
_status - change to give status about daemon + postgres connection + watchdog + pg_watchdog and any other global info

make libary and daemon into proper npm

Need help with this as i am node beginner and think code needs some re-factoring

daemon wakeup idea

When no changes for a while it can take a while before the daemon notices the next change

Perhaps add a field in checkpoint_settings - when sending stuff to couch check after send

if time is < inactivity time send NOTIFY to daemon to wake up and check for the new changes.
make inactivity time unique per db and store in checkpoint settings

attachments - issue to hold info on them

I dont think works with _attachments - or is ignoring them - as they are in couch and I think postgres is more use manipulating/generating reports/ad hoc queries on the data rather than dealing with attahments.

Not done any tests with them yet

example.sql

Need to collect up current db schema as its currently on my laptop/head/readme

tests - need to setup some automated tests

not sure how to do this

SQL DELETE - document functionality

Do with bulk updates setting doc._deleted flag to true

Should be straight forward to do once bulk function in place - as this can be then done with

json_object_set_key(doc::json, '_deleted'::text, true):

Make a function to help which just gets passed the doc id and rev plus array of fields to keep in the deleted doc so not upset anyone with elastic search couch river (https://github.com/elasticsearch/elasticsearch-river-couchdb - Indexing Databases with Multiple Types)

ie bulk submit needs to be something like:
"docs" : [{
"_id: 2,
"_rev" : "rev",
"_deleted" : true,
"type" : "Person"
}]'

we want to find some tools to publish postgresql table to redis or couchdb as a restful service

Hovercraft / pl/Erlang / pl/sh - a place for any ideas on this

Replace put function with https://github.com/jchris/hovercraft - needs erlang extension For postgres like PL/Perl - any one know if this exists as i think it would be quite simple to embed hovercraft in postgres then and should then be possible to do proper transactions and do very large updates (i havnt tried more than a few dozen docs at the moment) - oif no pl/erlang then perhaps using pl/sh - https://github.com/petere/plsh

cat ~/.bashrc | erl -noshell -s rot13 rot13 | wc

http://www.erlang.org/faq/how_do_i.html

and do something like:
http://www.softwarepassion.com/importing-data-to-couchdb-java-ruby-and-erlang-way/

Maybe have 2 options so one for when postgres and couch are on the same machine and can communicate via pipes (for bulk updates I am sure this will be the fastest method without PL/Erlang plus less bits to go wrong and maybe with exit codes from the shell transactions could be possible) and another version for calls over http.

Change logic for from_pg to from_couch

Need to modify the sql in the library and trigger.

Update readme accordingly

I dont think its possible not to do this without somehow identifying to the trigger where the query came from

look at upsert on postgres 9.5 for update function

http://www.craigkerstiens.com/2015/05/08/upsert-lands-in-postgres-9.5/

replace the 'select count' and insert or update stuff in update function to make just one request of an upsert to postgres for new/changes to docs - i am guessing but is possible there could be a real performance boost by doing this ... need to test

works on workstation, not working on server

I worked through getting this to work on my desktop with PG and couch on our server.
When I copy the folder( couch-to-Postgres) to my server (same server as pg and couch) it will not sync.

I get this on the server. there is blue text that starts with "follow:" This text was green when I first ran it.

follow:stream http://administrator:[email protected]:5984/tcsoffice:debug } +9ms
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug JSON: '{"seq":"25-g1AAAACTeJzLYWBgYMpgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUklMiTV____PyuDOZEhFyjAbmxqaGiRYpzCwFmal5KalpmXmoJHex4LkGRoAFL_oaZIQkyxNDNISjbBpi8LAJ9ZLRw","id":"wo::1010","changes":[{"rev":"2-779eb03dff0f507cbedf9dd44b5c1a24"}],"deleted":true,"doc":{"_id":"wo::1010","_rev":"2-779eb03dff0f507cbedf9dd44b5c1a24","_deleted":true}}' +19ms
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug Object: {
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug seq: '25-g1AAAACTeJzLYWBgYMpgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUklMiTV____PyuDOZEhFyjAbmxqaGiRYpzCwFmal5KalpmXmoJHex4LkGRoAFL_oaZIQkyxNDNISjbBpi8LAJ9ZLRw',
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug id: 'wo::1010',
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug changes: [ { rev: '2-779eb03dff0f507cbedf9dd44b5c1a24' } ],
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug deleted: true,
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug doc: {
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug _id: 'wo::1010',
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug _rev: '2-779eb03dff0f507cbedf9dd44b5c1a24',
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug _deleted: true
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug }
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug } +3ms
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug JSON: '{"seq":"27-g1AAAACTeJzLYWBgYMpgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUklMiTV____PyuDOZEhFyjAbmxqaGiRYpzCwFmal5KalpmXmoJHex4LkGRoAFL_oaZIQ0yxNDNISjbBpi8LAJ-dLR4","id":"wo::12365","changes":[{"rev":"2-e0d0a9a121da8776801f59ccea7cc691"}],"deleted":true,"doc":{"_id":"wo::12365","_rev":"2-e0d0a9a121da8776801f59ccea7cc691","_deleted":true}}' +6ms
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug Object: {
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug seq: '27-g1AAAACTeJzLYWBgYMpgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUklMiTV____PyuDOZEhFyjAbmxqaGiRYpzCwFmal5KalpmXmoJHex4LkGRoAFL_oaZIQ0yxNDNISjbBpi8LAJ-dLR4',
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug id: 'wo::12365',
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug changes: [ { rev: '2-e0d0a9a121da8776801f59ccea7cc691' } ],
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug deleted: true,
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug doc: {
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug _id: 'wo::12365',
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug _rev: '2-e0d0a9a121da8776801f59ccea7cc691',
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug _deleted: true
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug }
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug } +13ms
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug emit: data +3ms
follow:http://administrator:[email protected]:5984/tcsoffice:debug Data from 2021-03-26T20:33:11.106Z +2ms
follow:http://administrator:[email protected]:5984/tcsoffice:debug Req 2021-03-26T20:33:11.106Z timeout=37500, inactivity=30000: http://192.168.0.12:5984/tcsoffice +11ms
example: Starting checkpointer
example: Checkpoint 1 is current next check in: 120 seconds
follow:http://administrator:[email protected]:5984/tcsoffice:debug Req 2021-03-26T20:33:11.106Z made no changes for 30.002s +30s
follow:http://administrator:[email protected]:5984/tcsoffice:debug Stop +1ms
follow:http://administrator:[email protected]:5984/tcsoffice:debug Destroying req 2021-03-26T20:33:11.106Z +4ms
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug destroy +1ms
follow:http://administrator:[email protected]:5984/tcsoffice:debug Checking database: http://192.168.0.12:5984/tcsoffice +4ms
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug write: { data: '', buf: '' } +8ms
follow:http://administrator:[email protected]:5984/tcsoffice:debug Confirmed database: http://192.168.0.12:5984/tcsoffice +7ms
example: {"db_name":"tcsoffice","purge_seq":"0-g1AAAABXeJzLYWBgYMpgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUnlsQBJhgYg9R8IshIZ8KhNZEiqhyjKAgBm5Rxs","update_seq":"151-g1AAAABXeJzLYWBgYMpgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUnlsQBJhgYg9R8IshL78KhNZEiqByvizAIAfJkdAw","sizes":{"file":467308,"external":95,"active":25024},"props":{"partitioned":true},"doc_del_count":39,"doc_count":1,"disk_format_version":8,"compact_running":false,"cluster":{"q":2,"n":1,"w":1,"r":1},"instance_start_time":"0"}
follow:http://administrator:[email protected]:5984/tcsoffice:debug Feed query 2021-03-26T20:33:41.278Z: http://192.168.0.12:5984/tcsoffice/_changes?since=11-g1AAAACTeJzLYWBgYMpgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUklMiTV____PyuDOZEhFyjAbmxqaGiRYpzCwFmal5KalpmXmoJHex4LkGRoAFL_oaZwQ0yxNDNISjbBpi8LAJ19LQ4&feed=continuous&heartbeat=30000&include_docs=true +2ms
follow:http://administrator:[email protected]:5984/tcsoffice:debug Remove feed from agent pool: 2021-03-26T20:33:41.278Z +9ms
follow:http://administrator:[email protected]:5984/tcsoffice:debug Good response: 2021-03-26T20:33:41.278Z +1ms
follow:http://administrator:[email protected]:5984/tcsoffice:debug Req 2021

sysadminmike / couch-to-postgres Goto Github PK

couch-to-postgres's People

Stargazers

Watchers

Forkers

couch-to-postgres's Issues

Recommend Projects

Recommend Topics

Recommend Org