Understanding Incremental Data Backup in pg_rman

Previously, we discussed how to perform a simple backup using pg_rman. Because pg_rman backs up PostgreSQL at the physical file level, it is quite useful for daily database maintenance. In particular, pg_rman supports incremental backup, which copies only the data that has changed. In this post, I will explain how incremental backup works in pg_rman.

Firstly, all of this relies on the table and index page layout implemented in PostgreSQL. In PostgreSQL, each data page is laid out like this.

[Figure: page header data]

The PageHeaderData structure has many attributes, and one of the most important is pd_lsn, which records the LSN of the last WAL record that changed this page.
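To make the role of pd_lsn concrete, here is a minimal sketch of reading it from a raw page image. It assumes the default 8KB block size and a page read on a machine with the same byte order as the one that wrote it. The helper names page_get_lsn and format_lsn are invented for this illustration and are not part of PostgreSQL or pg_rman.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BLCKSZ 8192 /* default PostgreSQL block size (assumed here) */

/*
 * pd_lsn is the first field of the page header and is stored as two
 * 32-bit halves: the high word (xlogid) followed by the low word (xrecoff).
 */
uint64_t page_get_lsn(const unsigned char *page)
{
    uint32_t xlogid;  /* high 32 bits of the LSN */
    uint32_t xrecoff; /* low 32 bits of the LSN */

    assert(page != NULL);
    memcpy(&xlogid, page, sizeof(xlogid));
    memcpy(&xrecoff, page + sizeof(xlogid), sizeof(xrecoff));
    return ((uint64_t) xlogid << 32) | xrecoff;
}

/* Format an LSN in the usual high/low hexadecimal form, e.g. "0/3000028". */
void format_lsn(uint64_t lsn, char *buf, size_t buflen)
{
    snprintf(buf, buflen, "%X/%X",
             (unsigned int) (lsn >> 32),
             (unsigned int) (lsn & 0xFFFFFFFFu));
}
```

Given one BLCKSZ-sized block read from a relation file, page_get_lsn(block) returns the value that an incremental backup can compare against a reference LSN.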

[Figure: page layout]

To support incremental backup at the data page level, pg_rman checks the pd_lsn of each 8KB data block in the current database cluster against the previous full backup. If the pd_lsn shows that the page was modified after that backup started, pg_rman copies the page; otherwise, it skips the block. Using pd_lsn, pg_rman can perform an incremental backup based on a previous valid full backup. If there is no valid full backup, pg_rman can’t create the incremental backup, and it can fall back to a full backup instead.

Here are some code snippets to demonstrate how this incremental backup is implemented in pg_rman. Suppose a user executes an incremental backup as shown below:

pg_rman backup --backup-mode=incremental -B /tmp/rman/backup -D /tmp/rman/pgdata -A /tmp/rman/archive -p 5432 -d postgres

First, pg_rman reads the pg_control file to confirm the WAL segment size and then verifies that the system_identifier recorded in the backup folder matches that of the database cluster.

Second, pg_rman executes the following logic to decide how to proceed with an incremental backup. It looks for the previous valid full backup. If none is found on the same timeline, it switches to a full backup (in the excerpt below, this path is guarded by current.full_backup_on_error). If a valid full backup is found, pg_rman loads that backup's file list and uses its start LSN as the threshold for identifying changed pages.

/*
 * To take incremental backup, the file list of the latest validated
 * full database backup is needed.
 * ...
 */
if (current.backup_mode < BACKUP_MODE_FULL)
{
	/* find last completed database backup */
	prev_backup = catalog_get_last_data_backup(backup_list);
	if (prev_backup == NULL || prev_backup->tli != current.tli)
	{
		if (current.full_backup_on_error)
		{
			ereport(NOTICE,
				(errmsg("turn to take a full backup"),
			/* ... */
		}
		/* ... */
	}
	else
	{
		pgBackupGetPath(prev_backup, prev_file_txt, lengthof(prev_file_txt),
						DATABASE_FILE_LIST);
		prev_files = dir_read_file_list(pgdata, prev_file_txt);

		/*
		 * Do backup only pages having larger LSN than previous backup.
		 */
		lsn = &prev_backup->start_lsn;
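The decision in the excerpt above can be condensed into a small standalone sketch. The types and names here (sketch_mode, sketch_backup, decide_backup_mode) are hypothetical and do not come from pg_rman. The MODE_ERROR outcome for the case where full_backup_on_error is not set is my assumption, since the excerpt elides that branch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcome of the mode decision; not pg_rman's own enum. */
typedef enum { MODE_INCREMENTAL, MODE_FULL, MODE_ERROR } sketch_mode;

/* Hypothetical, trimmed-down view of a catalog entry for a previous backup. */
typedef struct
{
    uint32_t tli;       /* timeline the backup was taken on */
    uint64_t start_lsn; /* LSN at which the backup started */
} sketch_backup;

/*
 * Decide how to proceed when an incremental backup is requested.
 * On MODE_INCREMENTAL, *threshold_lsn receives the previous backup's
 * start LSN; in the other cases it is set to 0.
 */
sketch_mode decide_backup_mode(const sketch_backup *prev,
                               uint32_t current_tli,
                               int full_backup_on_error,
                               uint64_t *threshold_lsn)
{
    assert(threshold_lsn != NULL);
    *threshold_lsn = 0;

    /* No usable previous backup, or it belongs to another timeline. */
    if (prev == NULL || prev->tli != current_tli)
        return full_backup_on_error ? MODE_FULL : MODE_ERROR; /* ERROR is assumed */

    /* Only pages changed at or after this LSN need to be copied. */
    *threshold_lsn = prev->start_lsn;
    return MODE_INCREMENTAL;
}
```

Returning the threshold LSN through an output parameter keeps the later page-level comparison independent of how the previous backup was found.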

Once all the changed files are identified, the backup_files function is called to process each page.

/* if the page has not been modified since last backup, skip it */
if (!prev_file_not_found && lsn && !XLogRecPtrIsInvalid(page_lsn) && page_lsn < *lsn)
	continue;

If a page’s LSN is less than the start LSN of the previous backup, the page has not changed since that backup and is simply skipped. Otherwise, the page is copied into the incremental backup.
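As a rough end-to-end illustration, the sketch below applies the same skip condition to every page of a relation file held in memory and collects the block numbers that would be copied. It is a simplified stand-in rather than pg_rman's code. The names sketch_page_lsn and select_changed_pages are invented, an LSN of 0 is treated as invalid, and the 8KB block size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLCKSZ 8192 /* assumed default block size */

/* Read pd_lsn (two 32-bit halves, high word first) from a page image. */
uint64_t sketch_page_lsn(const unsigned char *page)
{
    uint32_t hi;
    uint32_t lo;

    memcpy(&hi, page, sizeof(hi));
    memcpy(&lo, page + sizeof(hi), sizeof(lo));
    return ((uint64_t) hi << 32) | lo;
}

/*
 * Write the block numbers of the pages to copy into out[] (which must
 * have room for nblocks entries) and return how many were selected.
 * have_prev_file is 0 when the file was absent from the previous backup,
 * in which case every page is copied.
 */
size_t select_changed_pages(const unsigned char *file, size_t nblocks,
                            uint64_t start_lsn, int have_prev_file,
                            size_t *out)
{
    size_t n = 0;

    assert(file != NULL && out != NULL);
    for (size_t blk = 0; blk < nblocks; blk++)
    {
        uint64_t page_lsn = sketch_page_lsn(file + blk * SKETCH_BLCKSZ);

        /* Skip pages with a valid LSN older than the previous backup's start. */
        if (have_prev_file && page_lsn != 0 && page_lsn < start_lsn)
            continue;
        out[n++] = blk;
    }
    return n;
}
```

Note that a page with an invalid (zero) LSN is never skipped under this condition, so it is always copied.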
