tritondatacenter / pg_prefaulter
Faults pages into PostgreSQL shared_buffers or filesystem caches in advance of WAL apply
License: Apache License 2.0
It would be clever to hook into PostgreSQL as a background worker. There are actually two candidate paths for this enhancement in the future: register the existing pg_prefaulter process as a PostgreSQL background worker, or rewrite pg_prefaulter into a background worker. The former has benefits because you could stop and upgrade the sidecar process without restarting PostgreSQL. The latter is more in line with the traditional monolithic C mindset. Given the low volume of RPCs, the former is probably preferred over the latter, but additional investigation is required before making a determination.
Inspiration taken from: https://github.com/prest/bgworker
For reasons that aren't clear, the enable/disable boolean doesn't work for Circonus metrics.
As a workaround, use an invalid API key.
Reported by @gwydirsam
Suggestion from the audience: while the memory bandwidth consumed by copying data from the OS into userland is not much, there is room for improvement through the use of posix_fadvise(2) with the POSIX_FADV_WILLNEED flag instead of pread(2).
During a recent incident, the prefaulter panicked on more than 200 PostgreSQL instances, with the following set of log messages:
19:01:00.598Z ERROR pg_prefaulter: unable to find WAL files (next step=retrying)
error: unable to query PostgreSQL checkpoint information: write unix ->/tmp/.s.PGSQL.5432: write: broken pipe: (retriable: true, purge cache: true)
19:01:01.599Z LVL70 pg_prefaulter: bad, open vs close count not the same after purge (close-count=769548809, open-count=981315695)
panic: bad, open vs close count not the same after purge
Core files for the failures are tagged with this ticket in thoth.
pg_prefaulter should pre-fault in the entire WAL segment referenced by pg_last_xlog_receive_location(), plus the additional segments specified by the read-ahead setting.
It is conceivable that the WAL receiver could be blocked on a read-modify-write cycle when using a page size different from PostgreSQL's 8K pages. Warming the filesystem cache in advance of the receiver writing a WAL record would prevent the synchronous read(2) call from blocking replication when synchronous_commit=remote_write is being used. This problem will likely only manifest itself after PostgreSQL has begun recycling its WAL files.
When PostgreSQL is not running, the prefaulter appears to retain a large number of open files within the PostgreSQL data directory. So that we can cleanly unmount and remount the data file system in Manatee, the prefaulter needs to close these descriptors promptly, and avoid opening any new ones, until it is determined that PostgreSQL is running again. It would also be good to make sure the current working directory of the prefaulter process never resides within the PostgreSQL data directory.
If Consul is running, automatically register pg_prefaulter as a service in Consul so that pg_prefaulter can report its health and state to Consul.
Add a target to pg_prefaulter that dumps the necessary tab-completion targets.
https://github.com/spf13/cobra/blob/master/bash_completions.md
Hey guys,
I've been wondering why this project was left behind entirely. I find it pretty interesting, but I'm eager to know the reason for the inactivity. Is it simply a loss of interest among contributors, or has it somehow proven harmful to use?
I would be glad to hear from you.
Thanks.
I am considering using this tool to alleviate replica lag caused by heavy primary activity, but I see many issues created three years ago with no responses, and the last update was about a year and a half ago.
Can I get a response, please, to let me know about the future of this project?
Change the SIGINFO handler to dump stats instead of showing the stack. The current behavior dumps the stack of the process, which is useful, but not necessarily what is desired.
Though at least some log messages are presently formatted as linefeed-delimited JSON records, the content of the messages is not quite right. The log records must contain at least the Bunyan core fields in order for the bunyan tools to parse them correctly.
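For reference, Bunyan's required core fields are v (format version, currently 0), level (numeric, e.g. 30 for info), name, hostname, pid, time (ISO 8601), and msg; a conforming record might look like:

```json
{"v":0,"level":30,"name":"pg_prefaulter","hostname":"db0","pid":1234,"time":"2017-09-13T03:18:10.841Z","msg":"Starting pg_prefaulter"}
```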
Add a generic status command that shows:
It would appear that the prefaulter does not start up correctly if the database is not available initially. When run as an SMF service, it fails fast this way a few times and then the service goes into maintenance:
...
unable to start agent: unable to initialize db connection pool: unable to create a new DB connection pool: dial unix /tmp/.s.PGSQL.5432: connect: no such file or directory
[ Sep 13 03:18:10 Stopping because all processes in service exited. ]
[ Sep 13 03:18:10 Executing stop method (:kill). ]
[ Sep 13 03:18:10 Executing start method ("/opt/pg_prefaulter/pg_prefaulter run --config=/opt/pg_prefaulter/pg_prefaulter.toml &"). ]
[ Sep 13 03:18:10 Method "start" exited with status 0. ]
{"time":"2017-09-13T03:18:10.84142588Z","level":"debug","config-file":"/opt/pg_prefaulter/pg_prefaulter.toml"}
{"time":"2017-09-13T03:18:10.841509168Z","level":"debug","message":"args: []"}
{"time":"2017-09-13T03:18:10.841778917Z","level":"debug","message":"starting gops(1) agent"}
{"time":"2017-09-13T03:18:10.841794234Z","level":"debug","postgresql.pgdata":"/manatee/pg/data","postgresql.host":"/tmp","postgresql.port":5432,"postgresql.user":"postgres","postgresql.xlog.mode":"pg","postgresql.xlog.pg_xlogdump-path":"/opt/postgresql/current/bin/pg_xlogdump","postgresql.poll-interval":1000,"message":"flags"}
{"time":"2017-09-13T03:18:10.842199269Z","level":"info","message":"Starting pg_prefaulter"}
{"time":"2017-09-13T03:18:10.842205727Z","level":"info","message":"Stopped pg_prefaulter"}
Error: unable to start agent: unable to initialize db connection pool: unable to create a new DB connection pool: dial unix /tmp/.s.PGSQL.5432: connect: no such file or directory
Usage:
pg_prefaulter run [flags]
Flags:
-h, --help help for run
-m, --mode string Mode of operation of the database: "auto", "primary", "follower" (default "auto")
-N, --num-io-threads uint Number of IO threads to spawn for IOs
-i, --poll-interval string Interval to poll the database for state change (default "1s")
-n, --wal-readahead uint Number of WAL entries to perform read-ahead into (default 4)
-X, --xlog-mode string pg_xlogdump(1) variant: "xlog" or "pg" (default "pg")
-x, --xlogdump-bin string Path to pg_xlogdump(1) (default "/usr/local/bin/pg_xlogdump")
Global Flags:
-a, --circonus-api-key string Circonus API token
--circonus-api-url string Circonus API URL (default "https://api.circonus.com/v2")
--circonus-broker-id string Circonus Broker ID
--circonus-broker-max-response-time string Circonus Broker Max Response Time (default "500ms")
--circonus-broker-select-tag string Circonus Broker Select Tag
--circonus-check-display-name string Circonus Check Display Name (default "pg_prefaulter")
--circonus-check-force-metric-activation string Circonus Check Force Metric Activation (default "false")
--circonus-check-id string Circonus Check ID
--circonus-check-instance-id string Circonus Check Instance ID (default "4042552b-5fe0-45c4-9df4-f21eca88464d:pg_prefaulter")
--circonus-check-max-url-age string Circonus Check Max URL Age (default "5m")
--circonus-check-search-tag string Circonus Check Search Tag (default "app:pg_prefaulter,host:4042552b-5fe0-45c4-9df4-f21eca88464d")
--circonus-check-secret string Circonus Check Secret
--circonus-check-tags string Circonus Check Tags (default "app:pg_prefaulter")
--circonus-check-target-host string Circonus Check Target Host (default "4042552b-5fe0-45c4-9df4-f21eca88464d")
--circonus-debug Enable Circonus Debug
--circonus-enable-metrics Enable Circonus metrics (default true)
--circonus-submission-url string Circonus Check Submission URL
--config string config file (default "pg_prefaulter.toml")
-d, --database string postgres (default "postgres")
--enable-agent Enable the gops(1) agent interface (default true)
-H, --hostname string Hostname to connect to PostgreSQL (default "/tmp")
-l, --log-level string Log level (default "INFO")
-D, --pgdata string Path to PGDATA (default "pgdata")
-p, --port uint Hostname to connect to PostgreSQL (default 5432)
-U, --username string Username to connect to PostgreSQL (default "postgres")
unable to start agent: unable to initialize db connection pool: unable to create a new DB connection pool: dial unix /tmp/.s.PGSQL.5432: connect: no such file or directory
[ Sep 13 03:18:10 Stopping because all processes in service exited. ]
[ Sep 13 03:18:10 Executing stop method (:kill). ]
[ Sep 13 03:18:10 Restarting too quickly, changing state to maintenance. ]
Once the database is running, I can clear the maintenance status and the service appears to start correctly:
[ Sep 13 03:20:11 Leaving maintenance because clear requested. ]
[ Sep 13 03:20:11 Enabled. ]
[ Sep 13 03:20:11 Executing start method ("/opt/pg_prefaulter/pg_prefaulter run --config=/opt/pg_prefaulter/pg_prefaulter.toml &"). ]
[ Sep 13 03:20:11 Method "start" exited with status 0. ]
{"time":"2017-09-13T03:20:11.206137022Z","level":"debug","config-file":"/opt/pg_prefaulter/pg_prefaulter.toml"}
{"time":"2017-09-13T03:20:11.20630263Z","level":"debug","message":"args: []"}
{"time":"2017-09-13T03:20:11.20648666Z","level":"debug","message":"starting gops(1) agent"}
{"time":"2017-09-13T03:20:11.20670297Z","level":"debug","postgresql.pgdata":"/manatee/pg/data","postgresql.host":"/tmp","postgresql.port":5432,"postgresql.user":"postgres","postgresql.xlog.mode":"pg","postgresql.xlog.pg_xlogdump-path":"/opt/postgresql/current/bin/pg_xlogdump","postgresql.poll-interval":1000,"message":"flags"}
{"time":"2017-09-13T03:20:11.207273131Z","level":"info","message":"Starting pg_prefaulter"}
{"time":"2017-09-13T03:20:11.226491329Z","level":"debug","backend-pid":63146,"version":"PostgreSQL 9.6.4 on i386-pc-solaris2.11, compiled by gcc (GCC) 4.9.3, 64-bit","message":"established DB connection"}
{"time":"2017-09-13T03:20:11.226914389Z","level":"debug","rlimit-nofile":65536,"filehandle-cache-size":65486,"filehandle-cache-ttl":3600000,"message":"filehandle cache initialized"}
{"time":"2017-09-13T03:20:11.243983658Z","level":"info","io-worker-threads":1000,"message":"started IO worker threads"}
{"time":"2017-09-13T03:20:11.244136082Z","level":"debug","message":"Starting wait"}
{"time":"2017-09-13T03:20:11.244148015Z","level":"debug","wal-worker-thread-id":2,"message":"starting WAL worker thread"}
{"time":"2017-09-13T03:20:11.244151999Z","level":"debug","wal-worker-thread-id":1,"message":"starting WAL worker thread"}
{"time":"2017-09-13T03:20:11.244147788Z","level":"debug","wal-worker-thread-id":3,"message":"starting WAL worker thread"}
{"time":"2017-09-13T03:20:11.244208579Z","level":"debug","wal-worker-thread-id":0,"message":"starting WAL worker thread"}
{"time":"2017-09-13T03:20:11.244208705Z","level":"debug","message":"Starting pg_prefaulter agent"}
{"time":"2017-09-13T03:21:11.227143486Z","level":"debug","hit":23176,"miss":646,"lookup":23822,"hit-rate":0.9728822097221056,"message":"filehandle-stats"}
{"time":"2017-09-13T03:21:11.24432601Z","level":"debug","hit":1,"miss":2,"lookup":3,"hit-rate":0.3333333333333333,"message":"walcache-stats"}
{"time":"2017-09-13T03:21:11.244413444Z","level":"debug","hit":8178,"miss":41757,"lookup":49935,"hit-rate":0.16377290477620907,"message":"iocache-stats"}
{"time":"2017-09-13T03:22:11.228855455Z","level":"debug","hit":113623,"miss":734,"lookup":114357,"hit-rate":0.9935815035371687,"message":"filehandle-stats"}
{"time":"2017-09-13T03:22:11.244911682Z","level":"debug","hit":3,"miss":4,"lookup":7,"hit-rate":0.42857142857142855,"message":"walcache-stats"}
{"time":"2017-09-13T03:22:11.244947642Z","level":"debug","hit":57646,"miss":191404,"lookup":249050,"hit-rate":0.23146356153382855,"message":"iocache-stats"}
{"time":"2017-09-13T03:23:11.229500586Z","level":"debug","hit":199725,"miss":754,"lookup":200479,"hit-rate":0.9962390075768535,"message":"filehandle-stats"}
{"time":"2017-09-13T03:23:11.245253655Z","level":"debug","hit":6,"miss":8,"lookup":14,"hit-rate":0.42857142857142855,"message":"walcache-stats"}
{"time":"2017-09-13T03:23:11.246306122Z","level":"debug","hit":128811,"miss":331991,"lookup":460802,"hit-rate":0.279536547150403,"message":"iocache-stats"}
{"time":"2017-09-13T03:24:11.230050485Z","level":"debug","hit":240793,"miss":759,"lookup":241552,"hit-rate":0.9968578194343247,"message":"filehandle-stats"}
{"time":"2017-09-13T03:24:11.245531156Z","level":"debug","hit":8,"miss":10,"lookup":18,"hit-rate":0.4444444444444444,"message":"walcache-stats"}
{"time":"2017-09-13T03:24:11.246573786Z","level":"debug","hit":165312,"miss":400260,"lookup":565572,"hit-rate":0.29229169760879253,"message":"iocache-stats"}
{"time":"2017-09-13T03:25:11.230626347Z","level":"debug","hit":282746,"miss":766,"lookup":283512,"hit-rate":0.9972981743277181,"message":"filehandle-stats"}
{"time":"2017-09-13T03:25:11.24583358Z","level":"debug","hit":10,"miss":12,"lookup":22,"hit-rate":0.45454545454545453,"message":"walcache-stats"}
{"time":"2017-09-13T03:25:11.246880287Z","level":"debug","hit":203032,"miss":468442,"lookup":671474,"hit-rate":0.3023676270414044,"message":"iocache-stats"}
{"time":"2017-09-13T03:26:11.231247847Z","level":"debug","hit":325980,"miss":773,"lookup":326753,"hit-rate":0.9976342986904482,"message":"filehandle-stats"}
{"time":"2017-09-13T03:26:11.2461834Z","level":"debug","hit":12,"miss":14,"lookup":26,"hit-rate":0.46153846153846156,"message":"walcache-stats"}
{"time":"2017-09-13T03:26:11.247218878Z","level":"debug","hit":241664,"miss":538432,"lookup":780096,"hit-rate":0.3097875133316925,"message":"iocache-stats"}
{"time":"2017-09-13T03:27:11.231782623Z","level":"debug","hit":367433,"miss":778,"lookup":368211,"hit-rate":0.9978870810486379,"message":"filehandle-stats"}
{"time":"2017-09-13T03:27:11.246419942Z","level":"debug","hit":14,"miss":16,"lookup":30,"hit-rate":0.4666666666666667,"message":"walcache-stats"}
{"time":"2017-09-13T03:27:11.247501626Z","level":"debug","hit":278505,"miss":607739,"lookup":886244,"hit-rate":0.3142531853530179,"message":"iocache-stats"}
{"time":"2017-09-13T03:28:11.232316668Z","level":"debug","hit":430939,"miss":780,"lookup":431719,"hit-rate":0.9981932692329964,"message":"filehandle-stats"}
{"time":"2017-09-13T03:28:11.246682366Z","level":"debug","hit":18,"miss":20,"lookup":38,"hit-rate":0.47368421052631576,"message":"walcache-stats"}
{"time":"2017-09-13T03:28:11.247703691Z","level":"debug","hit":356427,"miss":744053,"lookup":1100480,"hit-rate":0.32388321459726666,"message":"iocache-stats"}
...
While writing this I defaulted to using string slices as the arguments, which is expedient, but not necessarily ideal.
The current _IOCacheKey is too large:
type _IOCacheKey struct {
Tablespace string
Database string
Relation string
Block string
}
_IOCacheKey.Tablespace string: 0-16 (size 16, align 8)
_IOCacheKey.Database string: 16-32 (size 16, align 8)
_IOCacheKey.Relation string: 32-48 (size 16, align 8)
_IOCacheKey.Block string: 48-64 (size 16, align 8)
Plus the size of the byte array backing each string. It could/should be:
// type'ed uint64's
type _IOCacheKey struct {
Tablespace uint64
Database uint64
Relation uint64
Block uint64
}
_IOCacheKey.Tablespace uint64: 0-8 (size 8, align 8)
_IOCacheKey.Database uint64: 8-16 (size 8, align 8)
_IOCacheKey.Relation uint64: 16-24 (size 8, align 8)
_IOCacheKey.Block uint64: 24-32 (size 8, align 8)
This change will incur a parsing overhead, but we have CPU to burn in the common case and are memory-limited, not CPU-limited. The current CPU overhead of pg_prefaulter running on a busy follower is between 1-2% CPU, so we have CPU budget to spare.
We've found another case where the prefaulter implicitly hangs onto references to a database's filesystem after the database itself has shut down. This is a problem for Manatee because Manatee attempts to unmount and mount the filesystem as part of starting the database (itself a separate problem), but it can't unmount it while the prefaulter is holding these references. This is a similar problem as in #13, but a different way that it can happen.
In this case, there was only one process holding open a file on the database's dataset:
[root@fa0e1075 (postgres) ~]$ fuser -c /manatee/pg
/manatee/pg: 8115o
It's a pg_xlogdump process forked by the prefaulter:
[root@HA99RHND2 (eu-central-1b) ~]# ptree 8115
892 zsched
975 /sbin/init
7127 /opt/pg_prefaulter/pg_prefaulter run --config=/opt/pg_prefaulter/pg_prefau
8115 /opt/postgresql/current/bin/pg_xlogdump -f /manatee/pg/data/pg_xlog/0000
It's not holding a file via its working directory:
[root@fa0e1075 (postgres) ~]$ pwdx 8115
8115: /root
But rather it seems to be one of its open fds:
[root@HA99RHND2 (eu-central-1b) ~]# pfiles 8115
8115: /opt/postgresql/current/bin/pg_xlogdump -f /manatee/pg/data/pg_xlog/00
Current rlimit: 65536 file descriptors
0: S_IFCHR mode:0666 dev:561,8 ino:3001421048 uid:0 gid:3 rdev:38,2
O_RDONLY|O_LARGEFILE
/zones/fa0e1075-8c65-4e2c-a062-e23013e40247/root/dev/null
offset:0
1: S_IFIFO mode:0000 dev:558,0 ino:29756267 uid:0 gid:0 rdev:0,0
O_RDWR
2: S_IFIFO mode:0000 dev:558,0 ino:29756268 uid:0 gid:0 rdev:0,0
O_RDWR
3: S_IFREG mode:0600 dev:90,65658 ino:65565 uid:907 gid:0 size:16777216
O_RDONLY|O_LARGEFILE
offset:14827520
Unfortunately we don't have the path for fd 3, but it is almost certainly the WAL file that was given as an argument to the program:
[root@HA99RHND2 (eu-central-1b) /var/tmp/dap]# pargs core.8115
core 'core.8115' of 8115: /opt/postgresql/current/bin/pg_xlogdump -f /manatee/pg/data/pg_xlog/00000001000
argv[0]: /opt/postgresql/current/bin/pg_xlogdump
argv[1]: -f
argv[2]: /manatee/pg/data/pg_xlog/000000010000118600000084
The inode number reported by stat on that file matches the one that pfiles reported:
[root@HA99RHND2 (eu-central-1b) /var/tmp/dap]# stat /zones/fa0e1075-8c65-4e2c-a062-e23013e40247/root/manatee/pg/data/pg_xlog/000000010000118600000084
File: `/zones/fa0e1075-8c65-4e2c-a062-e23013e40247/root/manatee/pg/data/pg_xlog/000000010000118600000084'
Size: 16777216 Blocks: 14213 IO Block: 8192 regular file
Device: 169007ah/23658618d Inode: 65565 Links: 1
Access: (0600/-rw-------) Uid: ( 907/ UNKNOWN) Gid: ( 0/ root)
Access: 2017-10-13 16:34:10.812840776 +0000
Modify: 2017-10-13 16:34:44.373763334 +0000
Change: 2017-10-13 16:34:44.373763334 +0000
Birth: -
The "-f" flag appears to block when it reaches the end of valid WAL data. And it does look like we're at the end:
[root@HA99RHND2 (eu-central-1b) /var/tmp/dap]# ls -l /zones/fa0e1075-8c65-4e2c-a062-e23013e40247/root/manatee/pg/data/pg_xlog/ | tail
-rw------- 1 907 root 16777216 Oct 13 16:33 00000001000011860000007D
-rw------- 1 907 root 16777216 Oct 13 16:33 00000001000011860000007E
-rw------- 1 907 root 16777216 Oct 13 16:33 00000001000011860000007F
-rw------- 1 907 root 16777216 Oct 13 16:33 000000010000118600000080
-rw------- 1 907 root 16777216 Oct 13 16:33 000000010000118600000081
-rw------- 1 907 root 16777216 Oct 13 16:33 000000010000118600000082
-rw------- 1 907 root 16777216 Oct 13 16:34 000000010000118600000083
-rw------- 1 907 root 16777216 Oct 13 16:34 000000010000118600000084
-rw------- 1 907 root 16777216 Oct 13 05:46 000000010000118600000085
drwx------ 2 907 907 20002 Oct 13 16:35 archive_status
There's one more file, but it has a much older mtime -- it looks like that might be a recycled file. It's not valid at all:
[root@fa0e1075 (postgres) /manatee/pg/data/pg_xlog]$ pg_xlogdump 000000010000118600000085
pg_xlogdump: FATAL: could not find a valid record after 1186/85000000
and xxd shows that file "84" ends in all zeros, which suggests it's all invalid:
[root@HA99RHND2 (eu-central-1b) /var/tmp/dap]# xxd -a 000000010000118600000084 | tail
...
0e22000: 0000 0000 0000 0000 0000 0000 0000 0000 ................
*
0fffff0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
Offset e22000 is 14819328, just 8K behind where the pg_xlogdump process was in the pfiles output. And pg_xlogdump is in a loop sleeping from main():
[root@HA99RHND2 (eu-central-1b) /var/tmp/dap]# pstack 8115
8115: /opt/postgresql/current/bin/pg_xlogdump -f /manatee/pg/data/pg_xlog/00
ffffbf7fff25532a pollsys (ffffbf7fffdfd6a0, 0, ffffbf7fffdfd790, 0)
ffffbf7fff1ea27b pselect (0, 0, 0, 0, ffffbf7fffdfd790, 0) + 26b
ffffbf7fff1ea62a select (0, 0, 0, 0, ffffbf7fffdfd7f0) + 5a
00000000004102cb pg_usleep () + 92
000000000040853c main () + a5e
000000000040603c _start () + 6c
[root@HA99RHND2 (eu-central-1b) /var/tmp/dap]# truss -p 8115
pollsys(0xFFFFBF7FFFDFD6A0, 0, 0xFFFFBF7FFFDFD790, 0x00000000) = 0
lseek(3, 0, SEEK_SET) = 0
read(3, "93D007\001\0\0\0\0\0\084".., 8192) = 8192
lseek(3, 14811136, SEEK_SET) = 14811136
read(3, "93D005\001\0\0\0\0\0E284".., 8192) = 8192
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0".., 8192) = 8192
which is clearly what it does when it reaches the end of the WAL file:
for (;;)
{
    /* try to read the next record */
    record = XLogReadRecord(xlogreader_state, first_record, &errormsg);
    if (!record)
    {
        if (!config.follow || private.endptr_reached)
            break;
        else
        {
            pg_usleep(1000000L);    /* 1 second */
            continue;
        }
    }
On timing, this process started at 16:34:11:
[root@HA99RHND2 (eu-central-1b) /var/tmp/dap]# ps -opid,stime,args -p 8115
PID STIME COMMAND
8115 16:34:11 /opt/postgresql/current/bin/pg_xlogdump -f /manatee/pg/data/pg_xlog/00000001000
and was still running until we killed it, well after 16:58Z. The manatee-sitter log shows that we brought down postgres at 16:40, and started it again shortly after that.
I have saved:
I will tar all of these up and upload them to "/thoth/stor/tickets/pg_prefaulter#40" in the JPC Manta.
I get the following error while starting pg_prefaulter:
Error: unable to start agnet: unable to create a stats agent: invalid check manager configuration
I tried to disable Circonus but still no luck.
Any pointers would be greatly appreciated.
Thanks
J
In this commit, a translation layer is added to enable pg_prefaulter to work with postgres 10+.
However, the old SQL query
SELECT timeline_id, redo_location, pg_last_xlog_replay_location() FROM pg_control_checkpoint()
is translated as
SELECT timeline_id, redo_lsn, pg_last_wal_receive_lsn() FROM pg_control_checkpoint()
This seems to cause the code to prefault files just ahead of the most-recently-received WAL, whereas the old behaviour was to prefault just ahead of the most-recently-replayed WAL. I am unsure whether this change in functionality was intended, but the old behaviour does seem more logical to me. To revert to the old behaviour, change pg_last_wal_receive_lsn() to pg_last_wal_replay_lsn().
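Concretely, a replay-based translation of the original query, using the function names quoted above, would be:

```sql
-- PostgreSQL 10+ naming; tracks the replay LSN rather than the receive LSN
SELECT timeline_id, redo_lsn, pg_last_wal_replay_lsn() FROM pg_control_checkpoint();
```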
The prefaulter was running under SMF. I disabled the service with the usual svcadm disable pg_prefaulter, but the stop method timed out. It appears that the software does not correctly shut down on SIGTERM. Log entries from the failure, including SMF log messages:
[ Sep 13 03:16:25 Stopping because service disabled. ]
[ Sep 13 03:16:25 Executing stop method (:kill). ]
{"time":"2017-09-13T03:16:25.340818386Z","level":"info","signal":"terminated","message":"Received signal"}
{"time":"2017-09-13T03:16:25.341335305Z","level":"debug","message":"Shutting down"}
{"time":"2017-09-13T03:16:26.167822291Z","level":"error","error":"unable to execute primary check: dial unix /tmp/.s.PGSQL.5432: connect: connection refused","message":"unable to determine if database is primary or not, retrying"}
[ Sep 13 03:17:25 Method or service exit timed out. Killing contract 1830449. ]
[ Sep 13 03:17:26 Method or service exit timed out. Killing contract 1830449. ]
[ Sep 13 03:17:27 Method or service exit timed out. Killing contract 1830449. ]
[ Sep 13 03:17:28 Method or service exit timed out. Killing contract 1830449. ]
[ Sep 13 03:17:29 Method or service exit timed out. Killing contract 1830449. ]
Note that the failure to connect to the database is expected, as this was during zone shutdown. The manatee-sitter service (and thus PostgreSQL) was already offline at the time.
Change the console logger to use https://github.com/mattn/go-isatty and use the ConsoleWriter for pretty-printing when stdout is hooked up to a TTY (and not explicitly disabled in the config or on the CLI).
As part of the broader effort to provide statistics and monitoring for Manta components, we are moving towards exposing a Prometheus client in each software component. The rationale and some details about the plan for this are described in RFD 99.
If this were a Node program, I'd say we should use artedi. I'm not sure what the Prometheus client library landscape is like in the Go ecosystem, but I have to imagine it's pretty reasonable, as Prometheus itself is written in Go!
The listen port for Prometheus connections should be configurable.
When PostgreSQL is starting up, it's not possible to connect to PG to query the WAL file that it's replaying. Fortunately this information is available in the process title. Unfortunately this is very OS-specific. Doubly unfortunate, this is often very important in recovery situations when PG is busy applying WAL files which means it's absolutely worth while to do something OS-specific and dirty (read: scraping information from the OS to figure out what WAL file we need to interrogate).
It would be very handy to have a tool that would skip the part where we talk to PostgreSQL to figure out what WAL files to process and instead allow an operator to explicitly process specific WAL files.
Add a sampling logger to enable high-volume error points that provide sufficient fidelity about the underlying condition without drowning the system in IO or useless repeated data. Zerolog supports this, but a global singleton sampling logger needs to exist somewhere.
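Zerolog does ship samplers (e.g. its BasicSampler); the underlying idea, as a stdlib-only sketch with illustrative names:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// sampledLogger emits only every nth message, so a high-volume error
// path cannot flood the log while the occurrence count is preserved.
type sampledLogger struct {
	n     uint64
	count atomic.Uint64
}

// Logf reports whether the message was actually emitted.
func (s *sampledLogger) Logf(format string, args ...any) bool {
	c := s.count.Add(1)
	if (c-1)%s.n != 0 {
		return false // suppressed; only the running count advances
	}
	fmt.Printf(format+" (occurrence %d)\n", append(args, c)...)
	return true
}

func main() {
	lg := &sampledLogger{n: 100}
	for i := 0; i < 250; i++ {
		lg.Logf("unable to prefault block %d", i)
	}
	// Only occurrences 1, 101, and 201 are printed.
}
```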
There are two regular expressions used to parse the output of either pg_xlogdump or https://github.com/snaga/xlogdump.
https://github.com/joyent/pg_prefaulter/blob/master/agent/walcache/cache.go#L50
and
https://github.com/joyent/pg_prefaulter/blob/master/agent/walcache/cache.go#L56
When operating in xlog-mode (i.e. using https://github.com/snaga/xlogdump), the output differs between operation types, and the regular expression may not match all operations.
HEAP_INSERT https://github.com/snaga/xlogdump/blob/master/xlogdump_rmgr.c#L726
HEAP_DELETE https://github.com/snaga/xlogdump/blob/master/xlogdump_rmgr.c#L769
pg_xlogdump mode is the default, and that mode does not appear to have this problem.
On a running process, when it receives a signal (probably SIGHUP or SIGUSR2), purge all of the caches.