I don't think moving restore.py to an outside was on the whole a net benefit.

In the above design, s could would have two return values; an error if they fail

move restore back to core libraries about patroni HOT 26 CLOSED

zalando commented on August 28, 2024

move restore back to core libraries

from patroni.

Comments (26)

alexeyklyukin commented on August 28, 2024

Actually, we are not putting a full path, but a path related to patroni, and if we agree that scripts should reside in the scripts subdirectory, then we can limit the configuration parameter to only the script name.
As for the name, I agree that it's too generic, but the goal was to combine all restore methods in one script, which is why it's named this way. I think splitting it to one script per restore method and moving the core pg_basebackup functionality back to patroni is a good idea.

If we make this parameter part of the section describing external tools, like wal-e, then we have to assign a priority as well, so that patroni knows in which order it should launch those tools. Perhaps, something like create_replica_tools = 'wal-e, barman', which a fallback to the internal pg_basebackup?

from patroni.

jberkus commented on August 28, 2024

So relative directories do NOT work if you've used setup.py to install patroni as a command. Then you're forced to use absolute directories.

Well, first ... I'm not sure that we should automatically fall back to anything. I certainly think that the circumstances under which a user had both wal-e and barman available would be highly peculiar. Here's the combinations I can think of being potentially desireable, and my personal estimated % of the % of the time we'll see them:

just use pg_basebackup: 90%
use wal-e and don't fall back: 2%
use wal-e and fall back to pg_basebackup: 3%
use highly custom script and don't fall back: 3%
use barman and fall back to pg_basebackup: 1%
use rsync and fall back to pg_basebackup: 1%

... so well theoretically "use barman and fall back to rsync" is a possible combination, I think it's completely possible that the patroni project could exist for years before anyone actually asks for it. So adding a lot of plumbing to make scripts combinable seems like a waste of time.

from patroni.

alexeyklyukin commented on August 28, 2024

Hm...I'm not sure what setup.py does, but if it does not preserve the directory structure of the original distribution we have to look into making it do so.

As for the fallback mode, I think it would be logical to implement fallback to pg_basebackup from another tool (basically 5% in your estimates if we combine fallback from rsync, wal-e and barman). We can move pg_basebackup related code to the core patroni and implement a possibility to create replicas by calling a user-defined script.

Something like
replica_method = 'wal-e' (or 'barman').

If specified, then look for the 'replica_command' parameter inside the [wal-e] or [barman] section of the patroni configuration. If this parameter is missing or empty, if replica_method is missing or empty, or if the user-defined script returned non-zero, fallback to pg_basebackup.

from patroni.

jberkus commented on August 28, 2024

I think fallback should also be an option for the user. I have a specific use-case where we're planning to deploy patroni where they do NOT want to fallback to pg_basebackup. So:

fallback_to_basebackup = False (default True)

from patroni.

jberkus commented on August 28, 2024

So, if you install patroni using setup.py, it becomes basically a system command, with no directory context except its own libraries. This is a feature, it's what you want.

But it does mean that any scripts directory has to have an absolute path.

from patroni.

alexeyklyukin commented on August 28, 2024

I'm curious about the use-case to run patroni without fallback... might make sense for huge database and high-loaded clusters, where doing things via pg_basebackup may totally kill the running master.

What we also have to do is to terminate patroni replica if it fails to follow the master, otherwise in the AWS/Docker configuration it will still keep a node in the auto-scaling group.

from patroni.

jberkus commented on August 28, 2024

Actually ....

It just occurred to me that we can really cover all cases by having:

replica_method =
replica_fallback_method =

... without making the code any more complex. The keyword "basebackup" would mean use the built-in pg_basebackup method; anything else would need to be the name of a script. So here's the defaults:

replica_method = basebackup
replica_fallback_method = none

Here's "use wal-e for large DBs, fallback to basebackup"

replica_method = /patroni/scripts/wal-e-large.py
replica_fallback_method = basebackup

Here's "use custom, don't fall back"

replica_method = /patroni/scripts/san3_rsync.py
replica_fallback_method =

Here's "use pg_basebackup, fall back to wal-e" (in case of the master being short on connections, for example):

replica_method = basebackup
replica_fallback_method = /patroni/scripts/wal-e.py

Make sense?

from patroni.

jberkus commented on August 28, 2024

In the above design, scripts could would have two return values; an error if they fail, and False if they don't meet conditions defined inside the script, like the existing wal-e script. Either would cause fallback, but a failure would be logged.

One question with this design is how do we decide if we want to retry a replica creation method? I mean, for basebackup, if the master just is too busy we should retry. Maybe we define retries in the script itself, I guess ...

from patroni.

jberkus commented on August 28, 2024

Yeah, the rsync without failback case is exactly that. 1.5TB very busy master. We do NOT want to pull a basebackup; it uses 100% of the master's network bandwidth. Later I'll take care of this with config which allows pulling basebackups from other replicas.

And yeah, right now we're not good about failing if initialization fails. I've been able to make initialization hang in several different ways.

from patroni.

alexeyklyukin commented on August 28, 2024

I think the configuration with a primary and fallback methods make sense, but I'd change it to include just the name of the method, i.e repica_method = 'wal-e' and specify the actual script in the [wal-e] section. The rationale is that there are likely several configuration parameters required to run the script, and it would be easier to configure them all and the script full path in one section. We might also serialize the content of the configuration section to the script being called. Obviously, basebackup would be an exception from this rule...

Actually, providing a list to the replica_method would spare us one parameter, i.e:
replica_methods = 'basebackup, wal-e'.
Patroni might not assume anything if this parameter is empty and just terminate with an error.

By default, this will be set to
replica_methods = 'basebackup'.

As for the retry, I think patroni should do retries the same way for any replica method, i.e. it should try to create the new replica N times, after which it would log the failure and terminate.

from patroni.

jberkus commented on August 28, 2024

Changing it to a list is fine.

Do we really want to get into having configuration sections for each method though? Is that something you have a use for?

I ask because this is something I did with HandyRep and regretted it; maintaining docs about which configs were available/required with which supplied custom methods turned into a real mess.

from patroni.

alexeyklyukin commented on August 28, 2024

We have other configuration parameters for the existing WAL-E script, for example, the threshold for the accumulated size of the WAL files on S3 since the last base backup, which triggers pg_basebackup instead of WAL-E, to avoid having excessively long recovery time after the new replica is initialised.

I think the task to provide the parameter description for the specific replica method belongs to the author of the script implementing that method. The script and the docs can be distributed totally outside of core patroni, patroni will take the configuration section with the same name as the replica method and pass it directly to the script.

I think it only makes sense to do a list in the configuration if it consists of method names, a list of script paths would look quite ugly.

from patroni.

jberkus commented on August 28, 2024

Regarding retries:

I guess the question for here is how many times should we retry basebackup? Even if we make this a configuration parameter, we need a default. I suggest two, with a 15 second sleep.

from patroni.

alexeyklyukin commented on August 28, 2024

Looks reasonable to me, although I don't think there is much value in having a per-method retry option, I'd rather create a replica with the currently configured combination of methods, and call both the primary and fallback methods on each retry, assuming all of them are stateless (i.e a failed half-way first attempt of running WAL-E does not give a speed up to the second attempt after the retry.)

from patroni.

jberkus commented on August 28, 2024

Well, I'm thinking specifically of the basebackup option. It's pretty common for network wonkiness to cause basebackup to not connect on the first try. Seems like we should give it at least two tries before switching to the fallback option.

from patroni.

jberkus commented on August 28, 2024

So I hacked out some of this. One fundamental question: do we want to run the user-supplied restore scripts as command-line scripts, or do we want to run them as python modules (including loading them)? Either is possible.

Here's a functions for the script approach (from postgresql.py). Will submit a PR when I'm done rewriting wale_restore.py for it, and have tested everything: https://gist.github.com/jberkus/4b6a949f569f0f435824

Anyway, if you want to take the module approach, speak up quickly.

from patroni.

alexeyklyukin commented on August 28, 2024

Thank you, I'm for sticking with the scripts.

from patroni.

jberkus commented on August 28, 2024

Good, because the code is mostly done.

Which tests do I need to update? Or will I just be writing new ones?

from patroni.

alexeyklyukin commented on August 28, 2024

As for the tests, I think test_postgresql.py and also test_restore.py.

I think the name "command" for the script to create a new replica is too generic, how about replica_command instead?

Also, it's still a question of whether you want to have a timeout and retries configured globally in patroni, or make a responsibility of a script author to retry a failed attempt?

I think the global configuration makes more sense, as it gives more information to patroni on what's going on (although I don't see any use for this information so far albeit from logging it), and makes replica_command scripts easier.

from patroni.

jberkus commented on August 28, 2024

well, "command" isn't a top-level tag. it would look like this:

wale_restore:
    command:
    traffic_limit:

... so it seems to me that "command" already makes sense in context.

from patroni.

jberkus commented on August 28, 2024

regarding timeouts/retries: I think it makes a lot more sense for retries to be configured by the script author with each method. How many retries are reasonable ... and whether retries are reasonable at all ... depends on what exactly is happening in the script. For example, imagine that replicas are being provisioned via ZFS snapshot export. If that failed, something really wonky could be wrong, and you wouldn't want to retry it. For some scripts, the user is going to want to retry only on specific error codes.

from patroni.

alexeyklyukin commented on August 28, 2024

I was thinking more in the line of:

wal_e:
    bucket_name:
    env_dir: 
    replica_command:

from patroni.

alexeyklyukin commented on August 28, 2024

What about including retries/timeouts in the method-specific configuration, and assume the global setting take effect if there is no one? I like the idea with the error codes, i.e. the method can indicate there is no sense in retrying it if something fatal has happened (i.e. there is no matching base backup on WAL-E)....

from patroni.

jberkus commented on August 28, 2024

Again, I don't see the need to have this be part of main patroni. See basebackup() in the gist for an example of how retries would be handled in the method itself. The user has the ability to add a "retries" config to the method config, if they want, and they they decide how those retries are implemented.

Given that 90% of users will just use the built-in basebackup method, I'm really trying to avoid making the mainstream case more complicated in order to support obscure setups.

from patroni.

alexeyklyukin commented on August 28, 2024

Yes, keeping the core thing simple makes sense 👍

from patroni.

alexeyklyukin commented on August 28, 2024

And merged, 28f1d51, thank you for the pull-request

from patroni.

move restore back to core libraries about patroni HOT 26 CLOSED

Comments (26)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent