PostgreSQL WAL Archival
This page describes how to turn on enable PostgreSQL's WAL file archive feature and how to perform certain workflows, assuming a tDAR infrastructure or development environment.
HOWTO: Configure PostgreSQL for WAL archiving
- Identify/create the associated paths on the postgres server. Unless otherwise stated It is not necessary for these paths to be unique.
- BASEBACKUP - the directory that will hold the "base backup" folder. There should only be one copy of this file at any given time
- XLOG - the directory that holds the archived WAL files
- PGDATA - location of the postgres database directory
- PGCONFIG - location of the postgres configuration files (postgresql.conf, pg_hba.conf, etc.)
- ARCHIVE_SCRIPT - script that postgres executes to process WAL files that are ready to be archived. See separate HOWTO.
- BACKUP_SCRIPT- script that is responsible for pruning the XLOG folder and creating a new creating a new baseline backup in BASEBACKUP. See separate HOWTO
- Configure postgresql.conf. Add / modify the following properties to these values:
- wal_level = archive
- archive_mode = on
- archive_command = '/path/to/ARCHIVE_SCRIPT %f %p' (%f represents the file name, %p represents path to file's directory)
- max_wal_senders = 2
- archive_timeout = 3600 # sets minimum WAL creation every 3600s (1 hr) - can be set at your descretion but consider impact on space requirements.
- Enable replication for postgres user
- pg_hba.conf - add separate line for replication connections (examples in the file's comments).
- ALTER USER postgres WITH REPLICATION;
- register BACKUP_SCRIPT to execute daily - via cron, cron.daily, or (whatever you windows people use).
- restart postgres
- Pray.
HOWTO: Create valid archive shell scripts
Up-to-date examples of the tDAR WAL scripts are attached to this document. These are the guidelines for creating the scripts called by postgres during the archival/recovery process
Archive Command Script
The "archive command" is the command specified by "archive_command" in postgresql.conf. This is the command postgres executes to indicate that a WAL file is ready for archival. Rules for this script:
- Exit with zero code only if successful. Postgres will repeatedly execute the command if a previous attempt failed until the operation is successful (or until pg_xlog runs out of space and postgres shuts down).
- The script should not attempt to move or modify the original file - postgres will take care of that.
Baseline Backup Script
Backups created via pg_dump or pg_dumpall are insuffiencent for a restore baseline; you must use a "base backup", e.g.
pg_basebackup --pgdata=$BASELINE_DEST --format=tar --gzip --xlog --username=postgres 1>>$LOG
Recovery Command Script
Similar to the 'archive_command' script, the recovery.conf file specifies a "restore_command" that the postgres recovery process executes when it requests a previously-archived file. This can be as simple as a unix copy command, but will likely need to be a shell script of some kind.
- Postgres will execute the comand specified by the 'restore_command' property in recovery.conf, e.g. "cp /archives/%f %p". Occurances '%f' are replaced by the filename, and '%p' are replaced by the full path to the destination where your script should place the retrieved archive.
- The restore script must exit with zero code only when successful
- The restore script must exit with non-zero if a requested archive is not present (note: this is expected behavior - postgres *will* request a missing file at least once during a normal recovery process.
- The script and it's parent directory need "rwx" permissions for userid=postgres
HOWTO: Restore a database from WAL archive
Follow these general steps to recover an archive. Using the following terminology:
- PGDATA - path to the postgres cluster data direcory
- PGRESTORE - this is a temporary path we will use for staging the files.
- base.tar.gz - path to a previous baseline backup
- PGWALARCHIVE - path to WAL archive directory
Procedure:
Stop postgres, if it's running
pg_ctl -D $PGDATA -m smart
Restore the database files from base.tar.gz to $PGRESTORE
gunzip base.tar.gz
tar -xvf base.tar --directory $PGRESTORE
Duplicate the permissions from PGDATA to PGRESTORE, including file owner and file permissions, for example
chmod 700 $PGRESTORE
find $PGRESTORE | xargs chown postgres:daemon
- Copy configuration files from $PGDATA (e.g. $PGDATE/*.conf) to $PGRESTORE.
- (Important!) comment out the following properties from $PGRESTORE/postgresql.conf
- "data_directory"
- "hba_file"
- "ident_file"
- "external_pid_file"
- Add a recovery.conf file to $PGRESTORE. At the very least this should include a 'restore_command' entry (see previous section).
Start postgres
pg_ctl -D $PGRESTORE start
Postgres will log it's progress to the standard postgres log file. Outside of the log file, the only way to tell if the restore was successful is that postgres will rename the 'recovery.conf' file to 'recovery.done'.