May 5, 2010

Using Continuous Archiving for Surround SCM PostgreSQL Databases

Surround SCM
PostgreSQL supports continuous archiving that records every change made to a database. Data is logged in the write ahead log (WAL), which is stored in the pg_xlog/ directory in the PostgreSQL data directory (e.g., C:Program FilesPostgreSQL8.3datapg_xlog). WAL archiving helps provide more complete crash recovery and allows point-in-time recovery. Following is basic information for setting up WAL archiving, creating a base backup, and recommendations for the frequency of performing base backups. For more information about WAL archiving and data recovery, refer to the PostgreSQL documentation.

Setting up WAL archiving

To set up WAL archiving, you need to identify a location and method for copying the files and enable archiving for the database. The copy operation moves files from the pg_xlog directory to another location available to the database. You can use a simple copying approach, such as using a cp command, or a more complex approach, such as using an underlying shell script, to move the file to another directory. You set parameters in the postgresql.conf file to enable WAL archiving. The archive_command parameter values may be different based on the server type and installation. Set the following parameters to enable archiving:
archive_mode = true
archive_command = ‘cp “%p” /mnt/server/archivedir/”%f”’
archive_command = ‘copy “%p” “C:\server\archivedir\%f”’
The archive command must return zero if the copy is successful and refuse to overwrite any existing archive files. When a zero result is returned, PostgreSQL automatically removes or recycles the WAL segment file. If a non-zero result is returned, the file was not archived and PostgreSQL reattempts archiving until it is successful. When you design your archiving setup, consider what will happen if the archive command fails repeatedly because user intervention is required or the archive runs out of space. For example, if you write data to a tape without an autochanger and the tape is full, data cannot be archived until the tape is swapped. Make sure error conditions or requests for user intervention are reported appropriately so the situation can be resolved quickly and archiving can continue without interruption. The pg/x_log directory will continue to be populated with WAL segment files until the situation is resolved. If the filesystem that contains the pg_xlog directory is full, PostgreSQL performs a PANIC shutdown. While no previous transactions are lost, the database will not be available until you free disk space.

Creating a base backup

After WAL archiving is set up, you can create a filesystem backup using a label to create a checkpoint. The label checkpoint has minimal impact on the database query processing. If the checkpoint must be completed quickly, issue a manual checkpoint and then issue the label command. Perform the following steps to create a base backup:
  1. Connect to the PostgreSQL database as a superuser.
  2. Issue the following command:
    SELECT pg_start_backup(‘label’)
  3. Perform a normal filesystem backup using tar, cpio, copy, or another method. You do not need to stop the database operation while the backup is running.
  4. When the backup is complete, reconnect to the database as a superuser.
  5. Issue the following command to stop backup mode and automatically switch to the next WAL segment to archive:
    SELECT pg_stop_backup();
To use the backup, you need to keep all the WAL segment files generated during and after the filesystem backups. A backup history file is stored in the WAL archive area. This text file contains the label you used in the pg_start_backup command, the starting and ending times, and WAL segments of the backup.

Base backup frequency

The time it takes to perform recovery depends on the number of WAL files that have to the applied to the base backup. To decrease recovery time, perform base backups on a regular basis. The frequency of backups depends on the volume of WAL files produced by the database, which depends on activity. A small- to mid-size amount of activity generally produces a moderate number of WAL files. In this case, a weekly base backup is probably sufficient to manage a reasonable recovery process. After a new base backup is complete and verified as successful, the old archived WAL files and previous base backup can be stored together or deleted. The old data is obsolete because of the new base backup and WAL files. A new backup history file is automatically generated.