Giter Site home page Giter Site logo

sysadminmike / couch-to-postgres Goto Github PK

View Code? Open in Web Editor NEW
110.0 7.0 17.0 59 KB

Node libary to stream CouchDB changes into PostgreSQL

License: BSD 2-Clause "Simplified" License

JavaScript 94.44% Shell 5.56%
couchdb postgresql synchronization change-data-capture cdc cloudant

couch-to-postgres's People

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

couch-to-postgres's Issues

Postgres disconnects after inserting into couchdb (http_put)

I've installed couch-to-postgres and the http extension for postgres. When I insert from Postgres to Couchdb, postgres disconnects, however in Couchdb data has been succesfully inserted.

This is the change in the trigger function couchdb_put() I made, because with POST it never worked for me:

SELECT status FROM http_put('http://127.0.0.1:5984/' || TG_TABLE_NAME || '/' || NEW.id::text, NEW.doc::text, 'Content-Type:application/json'::text) INTO RES;

This is the test:
INSERT INTO example (id, doc, from_pg) VALUES ('i', json_object('{_id,myvar}','{i, 100}')::jsonb, true);

This is the message I get from wireshark:
PUT /example/i HTTP/1.1\r\n

this is the message I get from the couchdb log:
Sun, 04 Oct 2015 16:29:33 GMT] [info] [<0.439.0>] 127.0.0.1 - - PUT /example/i 201

and this is the postgres log:

2015-10-04 10:33:35 CST [1727-3] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally an
d possibly corrupted shared memory.
2015-10-04 10:33:35 CST [1727-4] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2015-10-04 10:33:35 CST [709-20] LOG: all server processes terminated; reinitializing
2015-10-04 10:33:35 CST [1770-1] LOG: database system was interrupted; last known up at 2015-10-04 10:29:34 CST
2015-10-04 10:33:35 CST [1770-2] LOG: database system was not properly shut down; automatic recovery in progress
2015-10-04 10:33:35 CST [1770-3] LOG: record with zero length at 0/18146D0
2015-10-04 10:33:35 CST [1770-4] LOG: redo is not required
2015-10-04 10:33:35 CST [1770-5] LOG: MultiXact member wraparound protections are now enabled

Versions:
postgresql-9.4
couchdb 1.6.1
pgsql-http 1.1

I would appreciate any help from you, thank you very much in advance!!

G.Carranza

sort out couch_put - replace with couch_post

Function needs to deal with status code returns from http_post
Also use POST not PUT requests so no need to edit pgsql-http extension before compilation.

ie implement:

post_docs(docs,chunk_size) - returning recordset of status codes? or just true/false?

example use sql

Do a wiki/readme page of example sql queries eg:

SELECT DISTINCT doc->>'type' as doctype, count(doc->>'type')
FROM mytable GROUP BY doctype ORDER BY doctype

And any things which might bite like the ORDER BY issue

daemon - add some kind of stats collection

we have postgres or couch to dump some statistics about the feeds to

eg inserts/updates/deletes a sec / min / hr

should be pretty easy to pump to couch and use elastic search + kibrana for some pretty graphs

possible issue with couch restarting

I think the daemon may stop following a _changes feed but think all is ok if couch crashes and is restarted in between a watchdog check.

Still testing and not sure if its the daemon part or library - could possible be the npm 'follow' but not sure.

A restart of the daemon is all that is needed to fix this but can leave postgres out of sync with couch

add _add + friends feed to api

_add
_enable
_disable
_remove
All need to accept json as get/post request and keep it couchy

_feeds_status - list all feeds - current /_status does this - add more info about feed
_status - change to give status about daemon + postgres connection + watchdog + pg_watchdog and any other global info

daemon wakeup idea

When no changes for a while it can take a while before the daemon notices the next change

Perhaps add a field in checkpoint_settings - when sending stuff to couch check after send

  • if time is < inactivity time send NOTIFY to daemon to wake up and check for the new changes.
  • make inactivity time unique per db and store in checkpoint settings

attachments - issue to hold info on them

I dont think works with _attachments - or is ignoring them - as they are in couch and I think postgres is more use manipulating/generating reports/ad hoc queries on the data rather than dealing with attahments.

Not done any tests with them yet

example.sql

Need to collect up current db schema as its currently on my laptop/head/readme

SQL DELETE - document functionality

Do with bulk updates setting doc._deleted flag to true

Should be straight forward to do once bulk function in place - as this can be then done with

json_object_set_key(doc::json, '_deleted'::text, true):

Make a function to help which just gets passed the doc id and rev plus array of fields to keep in the deleted doc so not upset anyone with elastic search couch river (https://github.com/elasticsearch/elasticsearch-river-couchdb - Indexing Databases with Multiple Types)

ie bulk submit needs to be something like:
"docs" : [{
"_id: 2,
"_rev" : "rev",
"_deleted" : true,
"type" : "Person"
}]'

Hovercraft / pl/Erlang / pl/sh - a place for any ideas on this

Replace put function with https://github.com/jchris/hovercraft - needs erlang extension For postgres like PL/Perl - any one know if this exists as i think it would be quite simple to embed hovercraft in postgres then and should then be possible to do proper transactions and do very large updates (i havnt tried more than a few dozen docs at the moment) - oif no pl/erlang then perhaps using pl/sh - https://github.com/petere/plsh

cat ~/.bashrc | erl -noshell -s rot13 rot13 | wc

http://www.erlang.org/faq/how_do_i.html

and do something like:
http://www.softwarepassion.com/importing-data-to-couchdb-java-ruby-and-erlang-way/

Maybe have 2 options so one for when postgres and couch are on the same machine and can communicate via pipes (for bulk updates I am sure this will be the fastest method without PL/Erlang plus less bits to go wrong and maybe with exit codes from the shell transactions could be possible) and another version for calls over http.

Change logic for from_pg to from_couch

Need to modify the sql in the library and trigger.

Update readme accordingly

I dont think its possible not to do this without somehow identifying to the trigger where the query came from

works on workstation, not working on server

I worked through getting this to work on my desktop with PG and couch on our server.
When I copy the folder( couch-to-Postgres) to my server (same server as pg and couch) it will not sync.

I get this on the server. there is blue text that starts with "follow:" This text was green when I first ran it.


follow:stream http://administrator:[email protected]:5984/tcsoffice:debug } +9ms
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug JSON: '{"seq":"25-g1AAAACTeJzLYWBgYMpgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUklMiTV____PyuDOZEhFyjAbmxqaGiRYpzCwFmal5KalpmXmoJHex4LkGRoAFL_oaZIQkyxNDNISjbBpi8LAJ9ZLRw","id":"wo::1010","changes":[{"rev":"2-779eb03dff0f507cbedf9dd44b5c1a24"}],"deleted":true,"doc":{"_id":"wo::1010","_rev":"2-779eb03dff0f507cbedf9dd44b5c1a24","_deleted":true}}' +19ms
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug Object: {
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug seq: '25-g1AAAACTeJzLYWBgYMpgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUklMiTV____PyuDOZEhFyjAbmxqaGiRYpzCwFmal5KalpmXmoJHex4LkGRoAFL_oaZIQkyxNDNISjbBpi8LAJ9ZLRw',
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug id: 'wo::1010',
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug changes: [ { rev: '2-779eb03dff0f507cbedf9dd44b5c1a24' } ],
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug deleted: true,
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug doc: {
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug _id: 'wo::1010',
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug _rev: '2-779eb03dff0f507cbedf9dd44b5c1a24',
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug _deleted: true
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug }
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug } +3ms
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug JSON: '{"seq":"27-g1AAAACTeJzLYWBgYMpgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUklMiTV____PyuDOZEhFyjAbmxqaGiRYpzCwFmal5KalpmXmoJHex4LkGRoAFL_oaZIQ0yxNDNISjbBpi8LAJ-dLR4","id":"wo::12365","changes":[{"rev":"2-e0d0a9a121da8776801f59ccea7cc691"}],"deleted":true,"doc":{"_id":"wo::12365","_rev":"2-e0d0a9a121da8776801f59ccea7cc691","_deleted":true}}' +6ms
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug Object: {
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug seq: '27-g1AAAACTeJzLYWBgYMpgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUklMiTV____PyuDOZEhFyjAbmxqaGiRYpzCwFmal5KalpmXmoJHex4LkGRoAFL_oaZIQ0yxNDNISjbBpi8LAJ-dLR4',
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug id: 'wo::12365',
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug changes: [ { rev: '2-e0d0a9a121da8776801f59ccea7cc691' } ],
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug deleted: true,
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug doc: {
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug _id: 'wo::12365',
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug _rev: '2-e0d0a9a121da8776801f59ccea7cc691',
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug _deleted: true
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug }
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug } +13ms
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug emit: data +3ms
follow:http://administrator:[email protected]:5984/tcsoffice:debug Data from 2021-03-26T20:33:11.106Z +2ms
follow:http://administrator:[email protected]:5984/tcsoffice:debug Req 2021-03-26T20:33:11.106Z timeout=37500, inactivity=30000: http://192.168.0.12:5984/tcsoffice +11ms
example: Starting checkpointer
example: Checkpoint 1 is current next check in: 120 seconds
follow:http://administrator:[email protected]:5984/tcsoffice:debug Req 2021-03-26T20:33:11.106Z made no changes for 30.002s +30s
follow:http://administrator:[email protected]:5984/tcsoffice:debug Stop +1ms
follow:http://administrator:[email protected]:5984/tcsoffice:debug Destroying req 2021-03-26T20:33:11.106Z +4ms
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug destroy +1ms
follow:http://administrator:[email protected]:5984/tcsoffice:debug Checking database: http://192.168.0.12:5984/tcsoffice +4ms
follow:stream http://administrator:[email protected]:5984/tcsoffice:debug write: { data: '', buf: '' } +8ms
follow:http://administrator:[email protected]:5984/tcsoffice:debug Confirmed database: http://192.168.0.12:5984/tcsoffice +7ms
example: {"db_name":"tcsoffice","purge_seq":"0-g1AAAABXeJzLYWBgYMpgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUnlsQBJhgYg9R8IshIZ8KhNZEiqhyjKAgBm5Rxs","update_seq":"151-g1AAAABXeJzLYWBgYMpgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUnlsQBJhgYg9R8IshL78KhNZEiqByvizAIAfJkdAw","sizes":{"file":467308,"external":95,"active":25024},"props":{"partitioned":true},"doc_del_count":39,"doc_count":1,"disk_format_version":8,"compact_running":false,"cluster":{"q":2,"n":1,"w":1,"r":1},"instance_start_time":"0"}
follow:http://administrator:[email protected]:5984/tcsoffice:debug Feed query 2021-03-26T20:33:41.278Z: http://192.168.0.12:5984/tcsoffice/_changes?since=11-g1AAAACTeJzLYWBgYMpgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUklMiTV____PyuDOZEhFyjAbmxqaGiRYpzCwFmal5KalpmXmoJHex4LkGRoAFL_oaZwQ0yxNDNISjbBpi8LAJ19LQ4&feed=continuous&heartbeat=30000&include_docs=true +2ms
follow:http://administrator:[email protected]:5984/tcsoffice:debug Remove feed from agent pool: 2021-03-26T20:33:41.278Z +9ms
follow:http://administrator:[email protected]:5984/tcsoffice:debug Good response: 2021-03-26T20:33:41.278Z +1ms
follow:http://administrator:[email protected]:5984/tcsoffice:debug Req 2021

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.