zalando / patroni
A template for PostgreSQL High Availability with Etcd, Consul, ZooKeeper, or Kubernetes.
License: MIT License
Just a note, because I won't get to it in the next couple of days: the pgpass file created by Patroni is named incorrectly; it needs a leading period.
Also, this pgpass file is automatically created in the working directory from which Patroni is invoked, which isn't necessarily the user's home directory.
Replicas should have the following attributes available. These would be added as tags within the replica description, both in the API and in the SIS. If the tags are absent, the default behavior remains the same; in the config, the tags below are set to True or False:
nofailover: this replica should not be a candidate for automatic failover. Used for ephemeral replicas, and for multi-replica setups where there are a few designated failover targets. If the tag is missing, the replica is assumed to be a failover target.
noloadbalance: this replica should not be part of any load-balancing, if HAProxy autoconfig is being used.
clonefrom: take basebackups from this replica instead of from the master. If multiple clonefrom replicas are available, one is chosen at random for new basebackups. If cloning uses WAL-E or another archival backup method, this tag is ignored.
replicatefrom: not true or false, but the IP address/hostname of another replica. Used to support cascading replication.
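A sketch of how these might look in the YAML config (section name and values are illustrative, not a final spec):

    tags:
      nofailover: false
      noloadbalance: false
      clonefrom: true
      replicatefrom: 172.17.0.94   # hostname/IP of another replica, not a boolean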
Steps to reproduce
waiting for server to start....LOG: database system was interrupted; last known up at 2015-10-13 11:27:23 CEST
WARNING: recovery command file "recovery.conf" specified neither primary_conninfo nor restore_command
HINT: The database server will regularly poll the pg_xlog subdirectory to check for files placed there.
LOG: entering standby mode
LOG: database system was not properly shut down; automatic recovery in progress
LOG: redo starts at 0/40001D8
LOG: invalid record length at 0/40002B8
LOG: consistent recovery state reached at 0/40002B8
LOG: database system is ready to accept read only connections
done
server started
2015-10-13 11:28:24,422 INFO: started as readonly because i had the session lock
2015-10-13 11:28:24,423 INFO: Lock owner: postgresql0; I am postgresql0
2015-10-13 11:28:24,428 INFO: no action. i am the leader with the lock
I think we are not clearing the role flag after the crash; I will look into it today.
I'd like to rationalize how we track, store, and use postgres database connections. Currently, the methods of passing connection information are quite haphazard, resulting in strange and unpredictable issues with hba and passwords. It also makes it really hard to follow the code, or to make simple fixes like issue #61.
I need to get agreement from all contributors on the way to address this, because otherwise we're headed for merge conflict hell, since touching connections and connection-passing will touch quite a bit of code.
Here is my thinking on how we should do this:
The two types of connections are:
All other connections should go through the API; there is no reason why a patroni node should need to connect directly to another node for any purpose other than replication (and replication support).
This would also have some other, follow-on changes:
So ... feedback/changes/etc.?
Currently, all replicas in the Spilo cluster are equal and can take over the role of the master when necessary. However, it is impossible to designate a replica as delayed by an hour or two, which in practice is very useful for mitigating disasters like dropping a very important table or running a DELETE without a WHERE clause.
It would be an improvement to have a replica running in a delayed mode with a configurable delay. Obviously, such a replica won't be able to participate in the failover process, but that's OK for the use case we are addressing.
We might think about implementing this after we have the ability to mark a single node as non-participating in the election. Alternatively, using the external master feature, we may just run the whole Spilo cluster in standby mode and add another parameter to customize that mode: either a replication delay for delayed replicas, or max_standby_archive_delay/max_standby_streaming_delay to run analytical replicas without affecting the availability of the normal ones.
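For reference, PostgreSQL 9.4+ already exposes the delay knob at the recovery.conf level, so a delayed replica mostly needs Patroni to write out something like this (values illustrative):

    standby_mode = 'on'
    primary_conninfo = 'host=172.17.0.94 port=5432 user=replicator'
    recovery_min_apply_delay = '2h'   # hold WAL replay back by two hours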
Previously, we could only use pg_basebackup to bootstrap a new replica, which obviously required a running master. Currently, we also have the option of using WAL-E, as well as user-supplied replica creation methods that don't require a running master, since the data is stored on S3 or other external storage. We need to change Patroni's behavior to bootstrap even when the leader is not present, in case the replica creation method allows it. Since there is no way to auto-detect whether a replica creation method can create replicas without a running master, we should add a parameter to the method configuration indicating that capability.
The result is that we can recover from stupid errors, like the case of someone just shutting down all Patroni instances at once.
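A sketch of the proposed method configuration, assuming a boolean flag on the method section (the flag name no_master and the script path are illustrative):

    create_replica_method:
      - wal_e
      - basebackup
    wal_e:
      command: /path/to/wale_restore.sh
      no_master: true   # this method can build a replica from S3 without a running master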
Using synchronous_standby_names (one sync replica, 0+ async replicas)
http://www.postgresql.org/docs/9.4/static/warm-standby.html#SYNCHRONOUS-REPLICATION
Currently the API is completely unsecured. This prevents us from adding new features to the API that could actually destroy or reveal data, and isn't a great idea in general. As such, I'd like two things for the API:
There are two ways we could approach this: one would be to add these things to the existing api.py using various snippets (for example: https://gist.github.com/fxsjy/5465353). The second would be to offer sample configurations for an nginx proxy in front of the API.
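As a rough illustration of the first approach, a minimal basic-auth check in the style of that gist (standalone Python 2; credentials and wiring are illustrative, not Patroni's actual api.py):

    import base64
    from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer

    EXPECTED = 'Basic ' + base64.b64encode('admin:admin')

    class AuthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # reject any request without the expected Authorization header
            if self.headers.get('Authorization') != EXPECTED:
                self.send_response(401)
                self.send_header('WWW-Authenticate', 'Basic realm="patroni"')
                self.end_headers()
                return
            self.send_response(200)
            self.end_headers()
            self.wfile.write('ok\n')

    HTTPServer(('127.0.0.1', 8008), AuthHandler).serve_forever()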
Thoughts?
Patroni should be compatible with traditional datacenters as well as with most cloud providers. Therefore, we don't want AWS-specific stuff (like backups to S3) to be part of it. Instead, it should be possible to call external programs via callbacks at various moments, i.e. at server start/stop, restart/reload, and promote/demote.
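A sketch of the configuration shape this implies (section and callback names are illustrative):

    postgresql:
      callbacks:
        on_start: /usr/local/bin/notify.sh
        on_stop: /usr/local/bin/notify.sh
        on_restart: /usr/local/bin/notify.sh
        on_role_change: /usr/local/bin/update_dns.sh   # e.g. an S3 backup or DNS script lives outside Patroni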
For systems with 1 master and 1 replica, add pgbouncer auto-configuration for rw and ro database handles.
This is a TODO item for me.
Currently, on exit Patroni tries to shut down the PostgreSQL database it manages. This generally makes sense (otherwise we can't guarantee that the leftover database will keep its role), but on some occasions (e.g. upgrading Patroni itself) we might want to detach Patroni from the PostgreSQL cluster and keep the latter running.
This is on the movebasebackup branch, but does not appear to be caused by any of the code there.
2015-10-27 17:51:05,351 INFO: Starting new HTTP connection (1): 172.17.0.5
2015-10-27 17:51:05,357 INFO: trying to bootstrap from leader
waiting for server to start....LOG: database system was interrupted; last known up at 2015-10-27 17:51:05 PDT
LOG: entering standby mode
LOG: redo starts at 0/2000028
LOG: consistent recovery state reached at 0/20000F0
LOG: database system is ready to accept read only connections
FATAL: could not start WAL streaming: ERROR: replication slot "postgresql1" does not exist
done
server started
2015-10-27 17:51:07,240 INFO: bootstrapped from leader
The strange thing about that error is that the slot exists, the replica is replicating, and everything is working. It seems we could avoid sending the error to the output by checking pg_replication_slots on the master first. I'm not sure what code is giving rise to this, though.
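For reference, the master-side check could be as simple as this (slot name taken from the log above):

    -- run on the master before deciding to (re)create the slot
    SELECT 1 FROM pg_replication_slots WHERE slot_name = 'postgresql1';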
With HEAD as of a couple of days ago, if you set the following:
superuser:
username: postgres
password: zalando
For a new database system, you will get this error on initialization:
2015-10-29 19:40:06,302 INFO: established a new patroni connection to the postgres cluster
ERROR: role "postgres" already exists
STATEMENT: CREATE ROLE "postgres" WITH LOGIN SUPERUSER PASSWORD 'super'
2015-10-29 19:40:06,316 INFO: removing initialize key after failed attempt to initialize the cluster
2015-10-29 19:40:06,317 INFO: closed patroni connection to the postgresql cluster
waiting for server to shut down...LOG: received immediate shutdown request
.WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
done
server stopped
2015-10-29 19:40:07,349 INFO: renaming data directory to /pgdata/data_2015-10-29-19-40-07
2015-10-29 19:40:07,350 ERROR: Error communicating with Postgresql. Will try again later
Traceback (most recent call last):
File "build/bdist.linux-x86_64/egg/patroni/ha.py", line 380, in _run_cycle
return self.bootstrap() # new node
File "build/bdist.linux-x86_64/egg/patroni/ha.py", line 79, in bootstrap
self.state_handler.bootstrap()
File "build/bdist.linux-x86_64/egg/patroni/postgresql.py", line 647, in bootstrap
self.create_connection_users()
File "build/bdist.linux-x86_64/egg/patroni/postgresql.py", line 588, in create_connection_users
self.superuser['username']), self.superuser['password'])
File "build/bdist.linux-x86_64/egg/patroni/postgresql.py", line 160, in query
return self.retry(self._query, sql, *params)
File "build/bdist.linux-x86_64/egg/patroni/utils.py", line 153, in __call__
return func(*args, **kwargs)
File "build/bdist.linux-x86_64/egg/patroni/postgresql.py", line 153, in _query
raise e
ProgrammingError: role "postgres" already exists
Then initialization will fail ... and be reattempted, in a continuous loop.
For all of the roles, we should check if they already exist, and only create them if they don't (but set passwords if they do).
I will work on a patch this week.
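A sketch of the check-then-create logic (function name and quoting style are illustrative, not the actual patroni code; CREATE/ALTER ROLE are utility statements and can't take bind parameters for the password, hence the string building):

    def create_or_update_role(cursor, name, password, options='LOGIN SUPERUSER'):
        # create the role only if it is missing; otherwise just reset the password
        cursor.execute('SELECT 1 FROM pg_roles WHERE rolname = %s', (name,))
        verb = 'ALTER' if cursor.fetchone() else 'CREATE'
        cursor.execute("{0} ROLE \"{1}\" WITH {2} PASSWORD '{3}'".format(
            verb, name, options, password))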
I wanted to suggest an alternative to the current config-file backup functionality on that branch.
We could support a conf.d directory, which would be outside PGDATA, and thus not wiped out by restores.
I believe this is a much better route than backing up and restoring conf files:
... in fact, I plan to implement conf.d support anyway, so why not just go that way?
If there's some reason why you also need pg.conf backup, then can we please make it an optional feature? For container-based deployments, it makes no sense to be "backing up" pg.conf etc.
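For reference, the conf.d route needs only one directive in the postgresql.conf that Patroni writes (the directory path is illustrative):

    include_dir = '/etc/patroni/conf.d'   # extra *.conf files kept outside PGDATA, so restores don't wipe them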
I don't think moving restore.py to an outside script was, on the whole, a net benefit. While I understand not wanting to burden the main code with platform-specific dependencies, it puts us in the odd position of needing a full file path in the config for something which is a critical function of Patroni. Also, if separating out the WAL-E content was your goal, naming the script "restore.py" wasn't the way to do it.
I suggest instead that we have a core function of Patroni called restore, which does a pg_basebackup-only restore. Other restore scripts would be called only if they are defined in config.yml, and only for alternate mechanisms (e.g. wal-e.py, barman.py).
Scenario: Created Spilo with version 18
Oct 21 09:21:11 ip-172-31-173-176 docker/88c787eafc06[785]: TypeError: cannot concatenate 'str' and 'int' objects
Oct 21 09:21:36 ip-172-31-173-176 docker/88c787eafc06[785]: Traceback (most recent call last):
Oct 21 09:21:36 ip-172-31-173-176 docker/88c787eafc06[785]: File "/home/postgres/patroni/patroni.py", line 6, in <module>
Oct 21 09:21:36 ip-172-31-173-176 docker/88c787eafc06[785]: main()
Oct 21 09:21:36 ip-172-31-173-176 docker/88c787eafc06[785]: File "/home/postgres/patroni/patroni/__init__.py", line 67, in main
Oct 21 09:21:36 ip-172-31-173-176 docker/88c787eafc06[785]: patroni = Patroni(config)
Oct 21 09:21:36 ip-172-31-173-176 docker/88c787eafc06[785]: File "/home/postgres/patroni/patroni/__init__.py", line 22, in __init__
Oct 21 09:21:36 ip-172-31-173-176 docker/88c787eafc06[785]: self.dcs = self.get_dcs(self.postgresql.name, config)
Oct 21 09:21:36 ip-172-31-173-176 docker/88c787eafc06[785]: File "/home/postgres/patroni/patroni/__init__.py", line 31, in get_dcs
Oct 21 09:21:36 ip-172-31-173-176 docker/88c787eafc06[785]: return Etcd(name, config['etcd'])
Oct 21 09:21:36 ip-172-31-173-176 docker/88c787eafc06[785]: File "/home/postgres/patroni/patroni/etcd.py", line 148, in __init__
Oct 21 09:21:36 ip-172-31-173-176 docker/88c787eafc06[785]: super(Etcd, self).__init__(name, config)
Oct 21 09:21:36 ip-172-31-173-176 docker/88c787eafc06[785]: File "/home/postgres/patroni/patroni/dcs.py", line 126, in __init__
Oct 21 09:21:36 ip-172-31-173-176 docker/88c787eafc06[785]: self._base_path = '/service/' + self._scope
Oct 21 09:21:36 ip-172-31-173-176 docker/88c787eafc06[785]: TypeError: cannot concatenate 'str' and 'int' objects
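The root cause is visible in the last frame: a bare numeric scope in the YAML config is parsed as an int, so '/service/' + self._scope fails. A one-line sketch of the fix (the actual patch might instead validate the config on load):

    self._base_path = '/service/' + str(self._scope)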
Hello, I am testing different failover cases with Patroni, and I have run into the following issue:
the ex-master has a different xlog_location and cannot catch up with the new master.
Additional info:
From ZooKeeper:
ex-master:
{
"conn_url": "postgres://replicator:[email protected]:5432/postgres",
"api_url": "http://172.17.0.94:8019/patroni",
"tags": {},
"state": "running",
"role": "replica",
"xlog_location": 167772304
}
new-master:
{
"conn_url": "postgres://replicator:[email protected]:5432/postgres",
"api_url": "http://172.17.0.98:8059/patroni",
"tags": {},
"state": "running",
"role": "master",
"xlog_location": 318767632
}
And there is no record for the ex-master as a replica of the new master:
select * from pg_stat_replication; (on new-master)
6465 16384 replicator walreceiver 172.17.0.102 47561 2015-11-17 18:34:09 streaming 0/13000060 0/13000060 0/13000060 0/13000060 1 sync
2368 16384 replicator walreceiver 172.17.0.97 36181 2015-11-17 18:24:23 streaming 0/13000060 0/13000060 0/13000060 0/13000060 1 potential
2374 16384 replicator walreceiver 172.17.0.95 57300 2015-11-17 18:24:23 streaming 0/13000060 0/13000060 0/13000060 0/13000060 1 potential
2373 16384 replicator walreceiver 172.17.0.100 53910 2015-11-17 18:24:23 streaming 0/13000060 0/13000060 0/13000060 0/13000060 1 potential
And, in the end, some logs from the ex-master after the net split:
2015-11-17 15:30:27,512 INFO: Zookeeper session lost, state: EXPIRED_SESSION
2015-11-17 15:30:28,227 INFO: Connecting to 172.17.0.93:2181
2015-11-17 15:30:28,230 INFO: Zookeeper connection established, state: CONNECTED
172.17.0.101 - - [17/Nov/2015 15:30:28] "OPTIONS / HTTP/1.0" 503 -
2015-11-17 15:30:28,916 INFO: Lock owner: node5; I am node1
2015-11-17 15:30:28,916 INFO: does not have lock
2015-11-17 15:30:28,926 INFO: established a new patroni connection to the postgres cluster
2015-11-17 15:30:28,928 INFO: closed patroni connection to the postgresql cluster
2015-11-17 15:30:28,996 INFO: no action. i am a secondary and i am following a leader
2015-11-17 15:30:29,014 INFO: Lock owner: node5; I am node1
2015-11-17 15:30:29,014 INFO: changing primary_conninfo and restarting in progress
LOG: received fast shutdown request
waiting for server to shut down...LOG: aborting any active transactions
.FATAL: terminating connection due to administrator command
LOG: shutting down
LOG: database system is shut down
done
server stopped
waiting for server to start....LOG: database system was shut down in recovery at 2015-11-17 15:30:29 UTC
LOG: entering standby mode
LOG: consistent recovery state reached at 0/A000090
LOG: record with zero length at 0/A000090
LOG: database system is ready to accept read only connections
LOG: fetching timeline history file for timeline 2 from primary server
LOG: fetching timeline history file for timeline 3 from primary server
FATAL: could not start WAL streaming: ERROR: requested starting point 0/A000000 on timeline 1 is not in this server's history
DETAIL: This server's history forked from timeline 1 at 0/9CC0688.
LOG: new timeline 3 forked off current database system timeline 1 before current recovery point 0/A000090
2015-11-17 15:30:30,909 ERROR: get_postgresql_status
Traceback (most recent call last):
File "build/bdist.linux-x86_64/egg/patroni/api.py", line 231, in get_postgresql_status
pg_is_in_recovery() AND pg_is_xlog_replay_paused()""", retry=retry)[0]
File "build/bdist.linux-x86_64/egg/patroni/api.py", line 217, in query
return self.server.query(sql, *params)
File "build/bdist.linux-x86_64/egg/patroni/api.py", line 287, in query
raise PostgresConnectionException('connection problems')
PostgresConnectionException: 'connection problems'
172.17.0.101 - - [17/Nov/2015 15:30:30] "OPTIONS / HTTP/1.0" 503 -
done
server started
172.17.0.101 - - [17/Nov/2015 15:30:32] "OPTIONS / HTTP/1.0" 503 -
172.17.0.101 - - [17/Nov/2015 15:30:34] "OPTIONS / HTTP/1.0" 503 -
172.17.0.101 - - [17/Nov/2015 15:30:36] "OPTIONS / HTTP/1.0" 503 -
172.17.0.101 - - [17/Nov/2015 15:30:38] "OPTIONS / HTTP/1.0" 503 -
2015-11-17 15:30:38,997 INFO: established a new patroni connection to the postgres cluster
2015-11-17 15:30:39,009 INFO: Lock owner: node5; I am node1
2015-11-17 15:30:39,009 INFO: does not have lock
2015-11-17 15:30:39,013 INFO: no action. i am a secondary and i am following a leader
172.17.0.101 - - [17/Nov/2015 15:30:40] "OPTIONS / HTTP/1.0" 503 -
172.17.0.101 - - [17/Nov/2015 15:30:42] "OPTIONS / HTTP/1.0" 503 -
172.17.0.101 - - [17/Nov/2015 15:30:44] "OPTIONS / HTTP/1.0" 503 -
LOG: new timeline 3 forked off current database system timeline 1 before current recovery point 0/A000090
FATAL: could not start WAL streaming: ERROR: requested starting point 0/A000000 on timeline 1 is not in this server's history
DETAIL: This server's history forked from timeline 1 at 0/9CC0688.
There's some configuration information for WAL-E integration. However, there doesn't seem to be any code for it. What's intended here other than just writing to pg.conf?
Currently, in some places we import different modules depending on the Python version.
Most such cases (possibly all of the ones we have) are covered by "six". Considering that six is required by kazoo anyway, it doesn't make sense to reinvent the wheel rather than use the existing solution.
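For example, a typical version-dependent import pair collapses to one line with six:

    # replaces: try: from urlparse import urlparse / except ImportError: from urllib.parse import urlparse
    from six.moves.urllib.parse import urlparse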
In order to implement a number of features for patroni, among them issue #11, we need an actual UI for dynamic administration of Patroni systems. I anticipate having both a CLI and a web UI. I plan to start with the CLI.
The CLI will be a python program which installs with Patroni. It will have the following spec:
Call: patroni-cli --config config-file --cluster clustername command [options]
--config is not optional, and contains connection information for the DCS, as well as potentially auth information for the APIs. Possibly it should be a pure positional argument.
--cluster is optional, and is there mainly for shops which have multiple clusters. For teams with only one cluster, the cluster name could be contained in the config file instead. The command-line parameter supersedes the cluster name in the config file.
Initially, I expect to implement the following commands:
status: prints out cluster status
status $node: prints detailed information on the status of a particular node
list: lists all nodes
shutdown $node: shuts down the named node
restart $node: makes node restart
failover: initiates failover by having the master release the key and reboot read-only
failover $node: initiates manual failover per above, but to the named node
disable $ttl: disables auto-failover for the whole cluster, with optional TTL
resume: resumes auto-failover for the cluster
initialize $node: makes node re-initialize itself
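A sketch of the argument handling for the spec above (all names provisional):

    import argparse

    def main():
        parser = argparse.ArgumentParser(prog='patroni-cli')
        parser.add_argument('--config', required=True,
                            help='DCS connection info, plus optional API auth')
        parser.add_argument('--cluster',
                            help='cluster name; supersedes the one in the config file')
        sub = parser.add_subparsers(dest='command')
        for cmd in ('status', 'shutdown', 'restart', 'failover', 'initialize'):
            sub.add_parser(cmd).add_argument('node', nargs='?')
        sub.add_parser('list')
        sub.add_parser('disable').add_argument('ttl', nargs='?', type=int)
        sub.add_parser('resume')
        args = parser.parse_args()
        # dispatch args.command to the DCS/API from here

    if __name__ == '__main__':
        main()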
In theory, it could happen that all information about cluster topology is wiped from the configuration store. In this case, Patroni will immediately start a new election, but at that moment the list of cluster members will be empty or consist of just one member. That member will consider itself the healthiest node and promote. This could lead to a split-brain situation, because it takes some time for the real leader to figure out that it has lost the leader lock.
We can overcome this issue by caching the previous cluster configuration (from when the cluster was healthy) and running is_healthiest_node against that cached configuration.
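A rough sketch of the idea (attribute names are illustrative):

    def members_for_election(self):
        # if the DCS suddenly reports a suspiciously small membership,
        # run is_healthiest_node against the last healthy membership instead
        if len(self.cluster.members) <= 1 and self._cached_members:
            return self._cached_members
        self._cached_members = self.cluster.members
        return self.cluster.members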
Currently, the local patroni instance can only connect using TCP. We should support local unix socket connections as well; that's both faster and more secure.
This will require some replumbing, which I regard as a cleanup anyway. Currently, we read config information, use it to create a URL, and then parse that URL to get the config information back. This seems unnecessarily ornate; we should either use URLs in the config in the first place, or skip this whole unnecessary step and use the config fields directly. For backwards-compatibility reasons, I plan to take the latter path.
I'll also accept a file path for the socket directory if that's what the user prefers. It should be simple to differentiate this from an IPv4/v6 address.
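For reference, libpq/psycopg2 already makes that distinction for us: a host value starting with a slash is treated as a socket directory (paths illustrative):

    import psycopg2
    # a leading '/' in host selects a unix socket instead of TCP
    conn = psycopg2.connect(host='/var/run/postgresql', port=5432,
                            user='postgres', dbname='postgres')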
I plan to do this in the next couple of days, so let me know if you have comments on the design.
Doing a failover test, I got this output from the replica:
2015-10-28 11:49:41,781 ERROR: request failed: GET http://127.0.0.1:8008/patroni
Traceback (most recent call last):
File "/home/josh/patroni/patroni/ha.py", line 147, in fetch_node_status
response = requests.get(member.api_url, timeout=2, verify=False)
File "/home/josh/patronienv/local/lib/python2.7/site-packages/requests/api.py", line 69, in get
return request('get', url, params=params, **kwargs)
File "/home/josh/patronienv/local/lib/python2.7/site-packages/requests/api.py", line 50, in request
response = session.request(method=method, url=url, **kwargs)
File "/home/josh/patronienv/local/lib/python2.7/site-packages/requests/sessions.py", line 468, in request
resp = self.send(prep, **send_kwargs)
File "/home/josh/patronienv/local/lib/python2.7/site-packages/requests/sessions.py", line 576, in send
r = adapter.send(request, **kwargs)
File "/home/josh/patronienv/local/lib/python2.7/site-packages/requests/adapters.py", line 423, in send
raise ConnectionError(e, request=request)
ConnectionError: HTTPConnectionPool(host='127.0.0.1', port=8008): Max retries exceeded with url: /patroni (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7ff78ea0efd0>: Failed to establish a new connection: [Errno 111] Connection refused',))
What's happening there is that the replica is failing to connect to the master because it's down. Which is as it should be; so why are we getting an error with a trace dump?
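A sketch of how fetch_node_status could treat this as the expected condition it is (the requests exception hierarchy is real; the logging style is illustrative):

    import logging
    import requests

    def fetch_node_status(member):
        try:
            return requests.get(member.api_url, timeout=2, verify=False)
        except requests.exceptions.RequestException as e:
            # a dead member is expected during failover: one line, no traceback
            logging.warning('request failed: GET %s (%s)', member.api_url, e)
            return None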
https://os-registry.stups.zalan.do/acid/patroni-1.0
That's a 404.
I'm going to upload something to the docker hub, once I have it built.
The connection strings to members of the cluster are in the form of URIs. For PostgreSQL 9.2 and later, URIs are supported by psycopg2 as connection strings, and are supported by various PostgreSQL utilities. Yet for some reason, the Patroni code spends a lot of time parsing and recomposing URIs instead of just using them. Why?
This relates to issue #61.
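For reference, the conn_url values stored for each member can be passed straight through (libpq accepts URIs since 9.2, and psycopg2 hands the dsn to libpq; credentials here are illustrative):

    import psycopg2
    conn = psycopg2.connect('postgres://replicator:secret@172.17.0.94:5432/postgres')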
Folks,
I'm doing a lot of testing which involves masters that fail to start. The thing is, if this happens, the initialize lock is left behind, and the only way to fix it is to manually delete the lock using curl.
There still needs to be a way that the initialize lock can be expired if there are no nodes in the cluster.
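For reference, the manual cleanup under etcd currently looks something like this (key path per the /service/<scope> layout seen in dcs.py; the cluster name is illustrative):

    import requests
    # remove the stale initialize key so a fresh node can bootstrap
    requests.delete('http://127.0.0.1:2379/v2/keys/service/testcluster/initialize')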
It has happened that PostgreSQL does not shut down when signaled, and Patroni then waits indefinitely for postgres to shut down.
It may be a good idea to do a shutdown with -m immediate after n minutes if postgres doesn't shut down.
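A sketch of the escalation, using pg_ctl flags that exist today (the timeout value is illustrative):

    import subprocess

    def stop_postgres(pg_ctl, data_dir, escalate_after=300):
        # try a fast shutdown first; escalate to immediate if it doesn't finish in time
        try:
            subprocess.check_call([pg_ctl, 'stop', '-D', data_dir, '-m', 'fast',
                                   '-w', '-t', str(escalate_after)])
        except subprocess.CalledProcessError:
            subprocess.check_call([pg_ctl, 'stop', '-D', data_dir, '-m', 'immediate', '-w'])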
Add the following lines to patroni.yml:
postgresql_config_file: /path/to/custom/file
pg_hba_file: /path/to/custom/file
If present, these configuration variables will cause the files named to be copied to the data directory after it is initialized and before PostgreSQL is started up.
A TODO for me.
We need a simple HTTP check for HAProxy load balancing (and others) that reports whether the server whose API is queried is currently the master, without parsing JSON, using a URL like /ismaster/. This will allow using http-check expect in HAProxy to determine the master.
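A sketch of the handler side (status code only, so http-check expect status 200 works without body parsing; the has_lock() wiring is illustrative):

    def do_GET(self):
        # /ismaster/: 200 if this node holds the leader lock, 503 otherwise
        if self.path.startswith('/ismaster'):
            code = 200 if self.server.patroni.ha.has_lock() else 503
            self.send_response(code)
            self.end_headers()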
Make sense?
So, not making much progress. When trying to start a master node I get this:
WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
Success. You can now start the database server using:
/usr/lib/postgresql/9.4/bin/postgres -D /pgdata/datacd p
or
/usr/lib/postgresql/9.4/bin/pg_ctl -D /pgdata/datacd p -l logfile start
waiting for server to start....LOG: database system was shut down at 2016-01-08 01:15:51 UTC
LOG: MultiXact member wraparound protections are now enabled
LOG: database system is ready to accept connections
LOG: autovacuum launcher started
done
server started
2016-01-08 01:15:52,558 INFO: established a new patroni connection to the postgres cluster
2016-01-08 01:15:53,411 INFO: established a new patroni connection to the postgres cluster
2016-01-08 01:15:54,164 INFO: established a new patroni connection to the postgres cluster
2016-01-08 01:15:55,267 INFO: established a new patroni connection to the postgres cluster
2016-01-08 01:15:56,309 INFO: established a new patroni connection to the postgres cluster
2016-01-08 01:15:56,311 INFO: removing initialize key after failed attempt to initialize the cluster
2016-01-08 01:15:56,320 INFO: renaming data directory to /pgdata/datacd p_2016-01-08-01-15-56
2016-01-08 01:15:56,321 ERROR: Error communicating with Postgresql. Will try again later
Traceback (most recent call last):
... and things fail from there. The thing is, PostgreSQL is running, as subsequent attempts to start it fail due to port 5432 already being bound.
Currently, is_healthiest_node() involves each replica connecting to every other replica to compare xlog positions and find the one furthest ahead. This is fine if you have, say, two replicas, but if you have 20, it's liable to result in a really long failover time.
My proposal for changing this is as follows:
If we make sure that the publication interval for replicas is more frequent than the master key TTL, then the information about replay location should be fairly good, and the connections should be unnecessary.
This should greatly reduce the number of connections, and lag, in the event of a failure event.
Drawbacks: this will mean skipping the check for an existing master in the cluster if check_all is disabled.
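A sketch of the selection using only published member data (the xlog_location field is visible in the member JSON quoted in another issue above; lag limits and tie-breaking are omitted):

    def is_healthiest_node(my_name, my_xlog_location, members):
        # rely on each member's last published replay position
        # instead of connecting to every other replica
        for m in members:
            if m['name'] != my_name and m.get('xlog_location', 0) > my_xlog_location:
                return False
        return True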
I'm trying to test the code which merges in the basebackup back to postgresql.py. However, when I try to start the first replica, I get this error:
2015-10-22 18:49:34,852 INFO: Starting new HTTP connection (1): 172.17.0.5
pg_ctl: directory "data/postgresql2" is not a database cluster directory
Traceback (most recent call last):
File "./patroni.py", line 6, in <module>
main()
File "/home/josh/patroni/patroni/__init__.py", line 69, in main
patroni.run()
File "/home/josh/patroni/patroni/__init__.py", line 50, in run
logger.info(self.ha.run_cycle())
File "/home/josh/patroni/patroni/ha.py", line 414, in run_cycle
return self._run_cycle()
File "/home/josh/patroni/patroni/ha.py", line 364, in _run_cycle
self.touch_member()
File "/home/josh/patroni/patroni/ha.py", line 53, in touch_member
'state': self.state_handler.state,
AttributeError: Postgresql instance has no attribute 'state'
It seems that it's not correctly detecting that there's an empty data dir, and I can't quite figure out why. This wasn't happening before I merged the most recent HEAD.
We need a better way to set passwords for the three PostgreSQL accounts we use than keeping them in a configuration file, which may live in a poorly secured CMS repo. I suggest that we use envdir, as several other Python projects (like WAL-E) do.
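A sketch of the envdir convention (one file per variable, the file's contents being the value; directory path and variable names are illustrative):

    import os

    def read_envdir(path):
        # envdir convention: each file name is a variable, its contents the value
        return dict((name, open(os.path.join(path, name)).read().strip())
                    for name in os.listdir(path))

    secrets = read_envdir('/etc/patroni/env')   # e.g. SUPERUSER_PASSWORD, REPLICATION_PASSWORD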
Other ideas?
Using the patroni-compose project (https://github.com/pgexperts/patroni-compose):
dbnode1_1 | scope: &scope testcluster
dbnode1_1 | ttl: &ttl 30
dbnode1_1 | loop_wait: &loop_wait 10
dbnode1_1 | restapi:
dbnode1_1 | listen: 172.17.0.12:8001
dbnode1_1 | connect_address: 172.17.0.12:8001
dbnode1_1 | auth: 'admin:admin'
dbnode1_1 | certfile: /etc/ssl/certs/ssl-cert-snakeoil.pem
dbnode1_1 | keyfile: /etc/ssl/private/ssl-cert-snakeoil.key
dbnode1_1 | etcd:
dbnode1_1 | scope: *scope
dbnode1_1 | ttl: *ttl
dbnode1_1 | host: etcd:2379
dbnode1_1 | postgresql:
dbnode1_1 | name: dbnode1
dbnode1_1 | scope: *scope
dbnode1_1 | listen: 0.0.0.0:5432
dbnode1_1 | connect_address: 172.17.0.12:5432
dbnode1_1 | data_dir: /pgdata/data
dbnode1_1 | maximum_lag_on_failover: 10485760 # 10 megabyte in bytes
dbnode1_1 | use_slots: True
dbnode1_1 | pgpass: /tmp/pgpass0
dbnode1_1 | create_replica_method: basebackup
dbnode1_1 | pg_hba:
dbnode1_1 | - local all all trust
dbnode1_1 | - host all all 0.0.0.0/0 md5
dbnode1_1 | - hostssl all all 0.0.0.0/0 md5
dbnode1_1 | replication:
dbnode1_1 | username: replicator
dbnode1_1 | password: replicate
dbnode1_1 | network: 172.17.0.12/16
dbnode1_1 | superuser:
dbnode1_1 | username: patroni
dbnode1_1 | password: super
dbnode1_1 | admin:
dbnode1_1 | username: admin
dbnode1_1 | password: admin
dbnode1_1 | parameters:
dbnode1_1 | archive_mode: "off"
dbnode1_1 | archive_command: mkdir -p ../wal_archive && cp %p ../wal_archive/%f
dbnode1_1 | wal_level: hot_standby
dbnode1_1 | max_wal_senders: 10
dbnode1_1 | hot_standby: "on"
dbnode1_1 | max_replication_slots: 7
etcd_1 | 2015-12-10 21:32:29.231848 I | raft: ce2a822cea30bfca is starting a new election at term 1
etcd_1 | 2015-12-10 21:32:29.232039 I | raft: ce2a822cea30bfca became candidate at term 2
etcd_1 | 2015-12-10 21:32:29.232325 I | raft: ce2a822cea30bfca received vote from ce2a822cea30bfca at term 2
etcd_1 | 2015-12-10 21:32:29.232492 I | raft: ce2a822cea30bfca became leader at term 2
etcd_1 | 2015-12-10 21:32:29.232682 I | raft: raft.node: ce2a822cea30bfca elected leader ce2a822cea30bfca at term 2
etcd_1 | 2015-12-10 21:32:29.233238 I | etcdserver: published {Name:default ClientURLs:[http://0.0.0.0:2379 http://0.0.0.0:4001]} to cluster 7e27652122e8b2ae
etcd_1 | 2015-12-10 21:32:29.233343 I | etcdserver: setting up the initial cluster version to 2.2
etcd_1 | 2015-12-10 21:32:29.238663 N | etcdserver: set the initial cluster version to 2.2
dbnode1_1 | 2015-12-10 21:32:29,346 INFO: Starting new HTTP connection (1): 172.17.0.11
dbnode1_1 | 2015-12-10 21:32:29,354 INFO: removing initialize key after failed attempt to initialize the cluster
dbnode1_1 | 2015-12-10 21:32:29,375 INFO: renaming data directory to /pgdata/data_2015-12-10-21-32-29
dbnode1_1 | Traceback (most recent call last):
dbnode1_1 | File "/patroni/patroni.py", line 6, in <module>
dbnode1_1 | main()
dbnode1_1 | File "/patroni/patroni/__init__.py", line 73, in main
dbnode1_1 | patroni.run()
dbnode1_1 | File "/patroni/patroni/__init__.py", line 54, in run
dbnode1_1 | logger.info(self.ha.run_cycle())
dbnode1_1 | File "/patroni/patroni/ha.py", line 439, in run_cycle
dbnode1_1 | return self._run_cycle()
dbnode1_1 | File "/patroni/patroni/ha.py", line 405, in _run_cycle
dbnode1_1 | return self.bootstrap() # new node
dbnode1_1 | File "/patroni/patroni/ha.py", line 80, in bootstrap
dbnode1_1 | self.state_handler.bootstrap()
dbnode1_1 | File "/patroni/patroni/postgresql.py", line 672, in bootstrap
dbnode1_1 | ret = self.initialize() and self.start()
dbnode1_1 | File "/patroni/patroni/postgresql.py", line 201, in initialize
dbnode1_1 | ret = subprocess.call(self._pg_ctl + ['initdb'] + ['-o', ' '.join(options)] if options else []) == 0
dbnode1_1 | File "/usr/lib/python2.7/subprocess.py", line 522, in call
dbnode1_1 | return Popen(*popenargs, **kwargs).wait()
dbnode1_1 | File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
dbnode1_1 | errread, errwrite)
dbnode1_1 | File "/usr/lib/python2.7/subprocess.py", line 1206, in _execute_child
dbnode1_1 | executable = args[0]
dbnode1_1 | IndexError: list index out of range
patronicompose_dbnode1_1 exited with code 1
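The crash comes from the initialize() line shown in the traceback: the conditional expression binds looser than +, so when options is empty the entire argument list collapses to [] and Popen receives no args (hence the IndexError on args[0]). A sketch of the fix is just parenthesizing the conditional:

    ret = subprocess.call(self._pg_ctl + ['initdb'] +
                          (['-o', ' '.join(options)] if options else [])) == 0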
Two places in postgresql.py, it does this:
for setting, value in self.config['parameters'].items():
There are two problems with this code:
I'll fix this later if someone else doesn't do it first.
There is a valid use case for moving already-running databases to Spilo without doing an expensive dump/restore. In order to support it, we can just pretend that an external PostgreSQL data service is part of the Spilo cluster, acting as a leader with a TTL that never expires. Once the cluster is started, Spilo nodes will choose replica roles (as the leader role is already taken) and will continue to act as replicas until the leader key is explicitly removed by Spilo (since it cannot expire on its own).
In order to support this use case, the following parameters should be changed:
From the user's point of view, the Spilo cluster should be configured with a primary_conninfo pointing to the external PostgreSQL database.
This connection string should also allow normal (non-replication) superuser connections, and the pg_hba.conf on the cluster should be configured to permit such connections from the addresses that would belong to the spilo cluster.
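For reference, on the PostgreSQL level this boils down to every Spilo node keeping a recovery.conf like the following while the external leader exists (host and user are illustrative):

    standby_mode = 'on'
    primary_conninfo = 'host=external-db.example.com port=5432 user=replicator'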
The changes required in the spilo workflow are the following:
If the active spilo node dies while the standby_mode is still set, another node should notice it and take its place.
If patroni is initdb'ing, we should enable data page checksums because:
... this should be possible.
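For reference, this only requires passing the real initdb flag through:

    initdb -D /pgdata/data --data-checksums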
We should create some kind of lock file in the PostgreSQL data directory that indicates the instance is managed by Patroni of a given cluster, in order to prevent multiple Patroni daemons from attaching to a single data directory. Right now this is possible: if you run a second Patroni with the same data_dir but point it to another cluster, both will compete to set the instance into their desired state (master, replica) and will likely render the instance unusable.
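A sketch of the guard (file name and contents are illustrative):

    import os

    def check_cluster_lock(data_dir, scope):
        # refuse to attach if another Patroni cluster already claims this data dir
        lock = os.path.join(data_dir, 'patroni.lock')
        if os.path.exists(lock):
            owner = open(lock).read().strip()
            if owner != scope:
                raise Exception('data directory belongs to cluster %r' % owner)
        else:
            with open(lock, 'w') as f:
                f.write(scope)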
In a multi-replica environment, HAProxy needs to be configured and updated with the correct current master and replicas. Add a method to update haproxy.cfg with this information.
How I imagine this working is a separate, smaller daemon which monitors for changes of master and updates haproxy.cfg on the haproxy server(s).
Currently, pg_rewind can only process a former master that was shut down cleanly. This is useful for the case of split brain/connection loss between the former master and the DCS, but in real life some shutdowns will happen because of a PostgreSQL crash, due to bad hardware or bugs.
What we could do is to start the node that crashed in a single user mode and issue a checkpoint before trying to do pg_rewind on it. We have to check, though, that no data loss occurs as a result.
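A sketch of the single-user step (postgres --single is the real single-user mode; the surrounding handling is illustrative, and per the above we still need to verify no data loss results):

    import subprocess
    # replay WAL in single-user mode and force a clean shutdown checkpoint,
    # so that pg_rewind sees a cleanly shut down cluster
    p = subprocess.Popen(['postgres', '--single', '-D', data_dir, 'postgres'],
                         stdin=subprocess.PIPE)
    p.communicate('CHECKPOINT\n')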
This is a TODO item for me.
Add sample init/upstart/systemd scripts for the most popular platforms.
Hi,
Just trying out Patroni for the first time (after giving up on governor). I seem to have replication working correctly, but on my secondaries I see the following error:
2015-12-10 21:04:19,823 ERROR: Exception when changing replication slots
Log snippet with stack trace: http://pastie.org/private/aup2rpvjfxsmgsjiwv9noa
Is that something to be worried about? Changes done on master are visible on the secondaries.
-Sajal
We need the following commands in the API to support manual failover and other reasonable operations:
This will require adding new paths to the API. Given that, do we want to stick with SimpleHTTPServer?
To enable easy monitoring it would be helpful to have an api endpoint providing a healthcheck for the cluster.
We should include:
Zookeeper has ephemeral nodes, and Consul has health checks. Either of these mechanisms provides a means to detect master failure -- and even replica failure -- faster than via the polling method we need to use with etcd.
We should consider using these mechanisms on each of these SIS platforms.