May 5, 2010

Maintaining Surround SCM PostgreSQL Databases

You need to perform the following maintenance tasks for Surround SCM PostgreSQL databases on a regular basis:
  1. Vacuum the database
  2. Rebuild the database index
  3. Maintain the log files
Following is an overview of recommended database maintenance tasks. Additional information is available in the PostgreSQL documentation.

Vacuuming the database

Vacuuming the database is an important routine task. It removes old versions of updated or deleted rows from tables to reclaim space, and it updates table statistics so the PostgreSQL query planner can plan queries efficiently. Most importantly, it protects against loss of old data due to transaction ID wraparound, which can cause database operations to stop or result in data loss.

We recommend using the PostgreSQL autovacuum daemon to ensure that vacuuming is performed regularly and efficiently. Autovacuuming also helps prevent the deadlocks that can occur with manual vacuuming. Autovacuuming is enabled by default if PostgreSQL is installed with the Surround SCM installation. If you need to enable it, set the following parameters in the postgresql.conf file:
autovacuum = true
track_counts = true
You also need to verify that the pg_autovacuum system table does not contain any rows, because per-table settings in that table override the global autovacuum settings and can disable autovacuuming for specific tables. Issue the following statement against the database and verify that no rows are returned:
SELECT * FROM pg_autovacuum;
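After reloading the configuration, you can also confirm that the settings took effect by querying the pg_settings view (a sketch; the exact columns returned may vary by PostgreSQL version):

```sql
-- Check that autovacuum and statistics collection are enabled.
-- Both rows should show 'on' in the setting column.
SELECT name, setting
FROM pg_settings
WHERE name IN ('autovacuum', 'track_counts');
```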

Rebuilding the index

Tables with indexed values that are frequently updated or deleted need to have their indexes rebuilt regularly. When an index is first created, it can be accessed quickly because its data is generally grouped together on disk. As data is moved and removed over time, the freed space is not always reclaimed by new data, and index access may become slower. Some Surround SCM data, such as configuration, security, workflow, trigger, and report data, is routinely updated and deleted. If destroying files is enabled in Surround SCM, tables related to files are also routinely updated and deleted.

Use the PostgreSQL REINDEX command to rebuild bloated indexes. You must be the index or table owner or a superuser to use this command. The REINDEX command is similar to dropping and re-creating an index, but it only locks the index’s parent table against writes. Data can be read, but inserts and updates are not allowed, and any query that uses the index for reads is blocked.
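For example, you can rebuild a single index, all indexes on one table, or every index in a database (the names below are placeholders; substitute your actual Surround SCM index, table, and database names):

```sql
-- Rebuild one index (placeholder name).
REINDEX INDEX my_index;

-- Rebuild all indexes on one table (placeholder name).
REINDEX TABLE my_table;

-- Rebuild every index in the current database (placeholder name);
-- must be run by the database owner or a superuser.
REINDEX DATABASE my_database;
```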
If a data set has low-to-moderate activity, we recommend reindexing quarterly. Systems with more activity may require more frequent reindexing. If there is a regular period when application activity is low, you may want to use a standard scheduling tool, such as cron on Unix or Task Scheduler on Windows, to run scripts that perform the reindexing.
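As a sketch, a crontab entry like the following could run a quarterly reindex during off hours. The database name, user, and schedule here are assumptions; adjust them for your installation, and supply the password through ~/.pgpass rather than in the crontab:

```
# Reindex the Surround SCM database at 2:00 a.m. on the first day of
# January, April, July, and October (placeholder database and user names).
0 2 1 1,4,7,10 * psql -U postgres -d surroundscm -c "REINDEX DATABASE surroundscm;"
```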

Maintaining the log files

PostgreSQL has a built-in, file-based log rotation program that is disabled by default. Enabling built-in log collection and rotation allows you to review the log data if errors or issues occur with the PostgreSQL database. The default logging level captures errors, log messages, fatal errors, and panics. PostgreSQL supports several methods for logging server messages, including stderr, csvlog, and syslog. On Windows, eventlog is also supported. The default is to log to stderr only. Set the following parameters in the postgresql.conf file to enable built-in log collection:
logging_collector = true
log_directory = '/<directory>'
log_filename = 'postgresql.log.%a'
log_truncate_on_rotation = true
After the parameters are set, a log file is created in the specified directory for each day of the week, and the logs are overwritten every seven days. The log file names are suffixed with the abbreviated day of the week (e.g., postgresql.log.Mon, postgresql.log.Tue). Other filename patterns are available in log_filename for keeping the log files for longer or shorter periods of time. If logging is configured at a more verbose level, the log may need to be configured to roll over hourly or even every few minutes. You can also trigger log rotation based on elapsed time or file size with the log_rotation_age and log_rotation_size parameters.
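As a sketch, the following postgresql.conf settings (the values shown are example assumptions, not recommendations) keep one log file per hour and also rotate whenever a file reaches 10 MB:

```
log_filename = 'postgresql.log.%H'
log_rotation_age = 60        # rotate every 60 minutes
log_rotation_size = 10MB     # also rotate when a file reaches 10 MB
log_truncate_on_rotation = true
```

Note that log_truncate_on_rotation only truncates files during time-based rotation, so size-based rotations append to the existing file for that hour.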