Selective backup and restore¶

Version added: 2.0.0

Implementation history

The following table lists the changes in the implementation of selective backups and the versions that introduced those changes:

Version	Description
2.0.3	Support for non-sharded collections in sharded clusters
2.1.0	Added support for sharded collections
2.5.0	Ability to restore databases with users and roles
2.8.0	Ability to define multiple namespaces for backup

You can back up and restore certain namespaces - databases or collections. For example, if your “Payments” collection in the “Customers” database was corrupted, you can restore only this collection from your full backup. Or, if your “Invoices” database contains sensitive data and must be backed up frequently, you can configure the backup of only this database.

Starting in version 2.8.0, you can define several databases or collections for a single backup. This simplifies backup management: instead of keeping a separate backup for every namespace, you collect the required data in one backup.
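As an illustration of defining several namespaces for one backup, the following minimal sketch builds a `pbm backup` command line from a list of namespaces. It assumes that the `--ns` option takes a comma-separated list and that `database.*` selects a whole database; the `backup_command` helper is hypothetical and only assembles the arguments without contacting a cluster. Check `pbm backup --help` for your version before relying on the exact syntax.

```python
import shlex


def backup_command(namespaces):
    """Build a `pbm backup` argument list for one or more namespaces.

    Assumption: --ns accepts a comma-separated list such as
    "Customers.Payments,Invoices.*".
    """
    if not namespaces:
        raise ValueError("at least one namespace is required")
    return ["pbm", "backup", "--ns=" + ",".join(namespaces)]


# One collection plus one whole database in a single backup.
cmd = backup_command(["Customers.Payments", "Invoices.*"])

# shlex.join quotes the argument so the shell does not expand "*".
print(shlex.join(cmd))  # pbm backup '--ns=Customers.Payments,Invoices.*'
```

The argument list can be passed to `subprocess.run(cmd, check=True)` on a host where `pbm` is installed and configured.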

Using selective backups and restores, you work only with the desired subset of data without disrupting the operations of your whole cluster.

You also reduce the time spent on backup and restore operations compared to processing the whole data set, and you save on storage consumption.

With the selective backup and restore functionality, you have the following options:

Back up a single database or a specific collection and restore all data from it.
Back up certain databases and / or collections and restore either full data or specific databases / collections from it.
Restore a specific collection from a single database backup.
Restore certain databases and / or collections from a full backup.
Make a point-in-time recovery for the specified databases / collections. Available for replica sets only.
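The restore options listed above can be sketched as command lines. This is a hedged example, not a reference: it assumes the `pbm restore` flags `--ns`, `--base-snapshot`, and `--time`, and the `restore_command` helper, the backup name, and the timestamp are hypothetical placeholders. Verify the flags with `pbm restore --help` for your version.

```python
def restore_command(backup_name, namespaces, time=None):
    """Build a `pbm restore` argument list for a selective restore.

    Without `time`, restore the namespaces from the named backup.
    With `time`, request a point-in-time recovery that uses the named
    backup as the base snapshot (replica sets only, per the limitations).
    """
    ns_flag = "--ns=" + ",".join(namespaces)
    if time is None:
        return ["pbm", "restore", backup_name, ns_flag]
    return ["pbm", "restore", "--base-snapshot=" + backup_name,
            "--time=" + time, ns_flag]


# Hypothetical backup name; list real ones with `pbm list` or `pbm status`.
backup = "2024-11-12T09:00:00Z"

# Restore one collection from a full or selective backup.
plain = restore_command(backup, ["Customers.Payments"])

# Point-in-time recovery of the same collection.
pitr = restore_command(backup, ["Customers.Payments"],
                       time="2024-11-12T10:30:00")

print(" ".join(plain))
print(" ".join(pitr))
```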

Known limitations of selective backups and restores¶

Only logical backups and restores are supported.
Selective backups and restores are supported in sharded clusters for non-sharded collections starting with version 2.0.3. Sharded collections are supported starting with version 2.1.0.
Sharded time series collections are not supported.
Multi-collection transactions are not yet supported for selective restore. If your workload uses them, a selective restore may break ACID guarantees because not all operations in the transaction are restored. PBM applies only the oplog events that relate to the specified namespace(s), so from the transaction’s point of view, data consistency may be broken.

For example, you have a transaction that involves collections A and B. When you restore collection A, PBM replays oplog events only for collection A and ignores those related to collection B. As a result, the state of collection B remains unchanged and is no longer consistent with collection A.
System collections in admin, config, and local databases cannot be backed up and restored selectively. You must make a full backup and restore to include them.
Selective point-in-time recovery is not supported for sharded clusters.
Selective backups are not supported for deployments with config shards - config server replica sets that also store application data.
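The multi-collection transaction limitation can be illustrated with a toy model. The sketch below is not PBM code: the oplog entry shape, the `replay` function, and the `shop.A` / `shop.B` namespaces are hypothetical. It only shows why replaying the events of one namespace from a two-collection transaction leaves the data inconsistent.

```python
from copy import deepcopy


def replay(oplog, state, selected=None):
    """Apply oplog entries to a copy of `state`.

    When `selected` is given, entries for other namespaces are skipped,
    which mimics a selective restore.
    """
    result = deepcopy(state)
    for entry in oplog:
        if selected is not None and entry["ns"] not in selected:
            continue  # event for a non-selected namespace is ignored
        result[entry["ns"]]["balance"] += entry["delta"]
    return result


def total(state):
    """Invariant of the transfer: the sum of balances must not change."""
    return sum(coll["balance"] for coll in state.values())


before = {"shop.A": {"balance": 100}, "shop.B": {"balance": 100}}

# One transaction that moves 50 from collection A to collection B.
oplog = [
    {"txn": 7, "ns": "shop.A", "delta": -50},
    {"txn": 7, "ns": "shop.B", "delta": 50},
]

full = replay(oplog, before)
selective = replay(oplog, before, selected={"shop.A"})

print(total(before), total(full), total(selective))  # 200 200 150
```

In the selective case, collection A reflects the transaction while collection B does not, so the total no longer adds up.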

Sharded collections¶

Version added: 2.1.0

You can back up and restore sharded collections. During backup, pbm-agents on each shard save the documents for the specified databases/collections and the full oplog for the period of the backup process. A pbm-agent on the config server replica set saves router config documents from the config database required for restoring the selected namespaces.

During the restore, the reverse process occurs:

A pbm-agent on each shard restores only the specified databases/collections and replays the oplog that relates only to the specified namespaces. The operations for other namespaces are ignored.
On the config server replica set, the pbm-agent restores the router configuration only for the specified sharded collections. The router configuration for other databases, collections and chunks remains intact.
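The config server part of the restore can be pictured with a small model. This is only a sketch under simplifying assumptions: router configuration is reduced to a dictionary keyed by namespace, and the `restore_routing` function and chunk labels are hypothetical, not PBM internals.

```python
def restore_routing(current, backup, selected):
    """Return routing data with the selected namespaces taken from the backup.

    Entries for all other namespaces are copied from the current
    configuration unchanged.
    """
    merged = dict(current)  # other namespaces remain intact
    for ns in selected:
        if ns in backup:
            merged[ns] = backup[ns]  # overwrite only the selected collection
    return merged


current = {
    "Staff.badges": ["chunk-x on B"],
    "Sales.orders": ["chunk-y on A"],
}
backup = {
    "Staff.badges": ["chunk-1 on A", "chunk-2 on B"],
    "Sales.orders": ["chunk-old on B"],
}

result = restore_routing(current, backup, ["Staff.badges"])
print(result["Staff.badges"])  # ['chunk-1 on A', 'chunk-2 on B']
print(result["Sales.orders"])  # ['chunk-y on A']
```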

The restore for sharded time series collections is not supported.

Note that selective backups and restores operate only on data and router configuration. The cluster configuration and topology-related settings are ignored. Therefore, we recommend restoring the databases/collections in the same environment.

Implementation specifics¶

During the selective restore, the primary shard for a database is set to the state it had during the backup. For example, the primary shard for the database “Staff” during backup was A. After you restore the “Staff” database, the primary shard is set to A even if you moved the primary from A to B before the restore. All non-sharded collections are restored on A; however, they are not deleted from B. You must handle the leftover collections yourself: either clean them up on B or move the primary back to B.
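The “Staff” example can be traced with a toy model of the cluster state. The data layout, the `selective_restore` function, and the `badges` collection are hypothetical and serve only to show which leftovers need attention after the restore.

```python
def selective_restore(cluster, backup, db):
    """Toy model: reset the primary shard and restore non-sharded collections.

    Collections are restored on the primary shard recorded in the backup.
    Copies on the previous primary are left untouched.
    """
    primary = backup["primary"][db]
    cluster["primary"][db] = primary
    for coll in backup["unsharded"][db]:
        cluster["shards"][primary].add(f"{db}.{coll}")
    return cluster


# Before the restore, the primary shard for "Staff" was moved from A to B.
cluster = {
    "primary": {"Staff": "B"},
    "shards": {"A": set(), "B": {"Staff.badges"}},
}
# At backup time, the primary shard for "Staff" was A.
backup = {"primary": {"Staff": "A"}, "unsharded": {"Staff": ["badges"]}}

selective_restore(cluster, backup, "Staff")

# Collections still present on the former primary need cleanup or a move.
leftovers = sorted(ns for ns in cluster["shards"]["B"] if ns.startswith("Staff."))

print(cluster["primary"]["Staff"])  # A
print(leftovers)                    # ['Staff.badges']
```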

Restore a database with users and roles¶

Version added: 2.5.0

You can restore the specified databases together with the users and roles that were created in them. This feature is useful for deployments where each user has an individual database and authenticates against it. This way, you can recover the desired datasets to their state prior to data corruption or loss.

Consider these specifics of selective restore with users and roles:

You can restore custom databases from a full backup.
Users and roles must be created in custom databases. For security reasons, users created in the admin, config and local databases cannot be part of a selective restore.
If users and roles exist in a database during the restore, they will be overwritten from the backup.
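The specifics above can be combined into a small pre-flight check. This is a minimal sketch, assuming the `--with-users-and-roles` flag of `pbm restore` and whole-database namespaces of the form `database.*`; the helper name and the backup name are hypothetical. It rejects the admin, config, and local databases before building the command.

```python
SYSTEM_DBS = {"admin", "config", "local"}


def restore_with_users_command(backup_name, namespaces):
    """Build a `pbm restore` argument list that also restores users and roles.

    Raises ValueError for namespaces in system databases, because their
    users cannot be part of a selective restore.
    """
    for ns in namespaces:
        db = ns.split(".", 1)[0]
        if db in SYSTEM_DBS:
            raise ValueError(f"users from system database '{db}' cannot be restored selectively")
    return ["pbm", "restore", backup_name,
            "--ns=" + ",".join(namespaces), "--with-users-and-roles"]


# Hypothetical backup name; existing users and roles in "Invoices"
# would be overwritten from the backup.
cmd = restore_with_users_command("2024-11-12T09:00:00Z", ["Invoices.*"])
print(" ".join(cmd))
```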



Last update: November 12, 2024
Created: November 12, 2024