The Cloudberry Database does not include the utility gpbackup
by default. It's maintained separately. Please follow the README to install gpbackup
before using it.
The parallel dump utility gpbackup
backs up the CloudberryDB master instance and each active segment instance at the same time.
By default, gpbackup creates dump files in the backups subdirectory.
Several dump files are created for the master, containing database information such as DDL statements, the CloudberryDB system catalog tables, and metadata files. gpbackup creates dump files for each segment.
You can perform full or incremental backups. To restore a database to its state when an incremental backup was made, it will restore the previous full backup and all subsequent incremental backups.
Each file created for a backup begins with a 14-digit timestamp key that identifies the backup set the file belongs to.
gpbackup can be run directly in a terminal on the master host, or you can add it to crontab on the master host to schedule regular backups.
The parallel restore utility gprestore
takes the timestamp key generated by gpbackup, validates the backup set, and restores the database objects and data into a distributed database in parallel. Parallel restore operations require a complete backup set created by gpbackup, a full backup and any required incremental backups.
The gpbackup utility provides flexibility and verification options for use with the automated backup files produced by gpbackup or with backup files moved from the CloudberryDB array to an alternate location.
Exercises
These exercises will walk through how to create a full backup of your database and then restore a table.
-
To run a full backup, use
gpbackup --dbname database --backup-dir /path/for/backup
. This will backup the entire database to the directory given.[gpadmin@mdw tmp]$ gpbackup --dbname tutorial --backup-dir /tmp/
20230727:10:28:19 gpbackup:gpadmin:mdw:020061-[INFO]:-gpbackup version = 1.2.7-beta1+dev.7
20230727:10:28:19 gpbackup:gpadmin:mdw:020061-[INFO]:-Greenplum Database Version = oudberry Database 1.0.0 build dev
20230727:10:28:19 gpbackup:gpadmin:mdw:020061-[INFO]:-Starting backup of database tutorial
20230727:10:28:19 gpbackup:gpadmin:mdw:020061-[INFO]:-Backup Timestamp = 20230727102819
20230727:10:28:19 gpbackup:gpadmin:mdw:020061-[INFO]:-Backup Database = tutorial
20230727:10:28:19 gpbackup:gpadmin:mdw:020061-[INFO]:-Gathering table state information
20230727:10:28:19 gpbackup:gpadmin:mdw:020061-[INFO]:-Acquiring ACCESS SHARE locks on tables
Locks acquired: 28 / 28 [==============================================================] 100.00% 0s
20230727:10:28:19 gpbackup:gpadmin:mdw:020061-[INFO]:-Gathering additional table metadata
20230727:10:28:20 gpbackup:gpadmin:mdw:020061-[INFO]:-Getting storage information
20230727:10:28:20 gpbackup:gpadmin:mdw:020061-[INFO]:-Metadata will be written to /tmp/gpseg-1/backups/20230727/20230727102819/gpbackup_20230727102819_metadata.sql
20230727:10:28:20 gpbackup:gpadmin:mdw:020061-[INFO]:-Writing global database metadata
20230727:10:28:20 gpbackup:gpadmin:mdw:020061-[INFO]:-Global database metadata backup complete
20230727:10:28:20 gpbackup:gpadmin:mdw:020061-[INFO]:-Writing pre-data metadata
20230727:10:28:20 gpbackup:gpadmin:mdw:020061-[INFO]:-Pre-data metadata metadata backup complete
20230727:10:28:20 gpbackup:gpadmin:mdw:020061-[INFO]:-Writing post-data metadata
20230727:10:28:20 gpbackup:gpadmin:mdw:020061-[INFO]:-Post-data metadata backup complete
20230727:10:28:20 gpbackup:gpadmin:mdw:020061-[INFO]:-Writing data to file
Tables backed up: 11 / 11 [============================================================] 100.00% 8s
20230727:10:28:28 gpbackup:gpadmin:mdw:020061-[INFO]:-Data backup complete
20230727:10:28:28 gpbackup:gpadmin:mdw:020061-[INFO]:-Skipped data backup of 3 external/foreign table(s).
20230727:10:28:28 gpbackup:gpadmin:mdw:020061-[INFO]:-See /home/gpadmin/gpAdminLogs/gpbackup_20230727.log for a complete list of skipped tables.
20230727:10:28:29 gpbackup:gpadmin:mdw:020061-[INFO]:-Found neither /usr/local/cloudberry-db/bin/gp_email_contacts.yaml nor /home/gpadmin/gp_email_contacts.yaml
20230727:10:28:29 gpbackup:gpadmin:mdw:020061-[INFO]:-Email containing gpbackup report /tmp/gpseg-1/backups/20230727/20230727102819/gpbackup_20230727102819_report will not be sent
20230727:10:28:29 gpbackup:gpadmin:mdw:020061-[INFO]:-Backup completed successfullyThis runs a full backup of the database created during the previous exercises.
-
To view the backups:
$ ls -al /tmp/
drwxrwxr-x 3 gpadmin gpadmin 21 Jul 27 10:28 gpseg-1
drwxrwxr-x 3 gpadmin gpadmin 21 Jul 27 10:28 gpseg1
drwxrwxr-x 3 gpadmin gpadmin 21 Jul 27 10:28 gpseg0 -
Now, that we have a full backup let's remove data from a table to simulate a failure.
psql (14.4, server 14.4)
Type "help" for help.
tutorial=# select count(*) from faa.otp_r;
count
---------
1024552
(1 row)This should return 1024552 rows in the table. Let's truncate the table and then check the row count:
tutorial=# truncate table faa.otp_r;
TRUNCATE TABLE
tutorial=# select count(*) from faa.otp_r;
count
-------
0
(1 row)
tutorial=#The report should now show 0 rows in the table.
-
Let's restore the data that was lost. First, exit from the psql shell by typing
\q
then issue the gprestore command(14 digital timestamp information could be got from previous gpbackup output, also you may record it down once you have done any backup before.):gprestore --include-table faa.otp_r --data-only --backup-dir /tmp/ --timestamp 20230727102819
20230727:10:38:29 gprestore:gpadmin:mdw:020373-[INFO]:-Restore Key = 20230727102819
20230727:10:38:30 gprestore:gpadmin:mdw:020373-[INFO]:-gpbackup version = 1.2.7-beta1+dev.7
20230727:10:38:30 gprestore:gpadmin:mdw:020373-[INFO]:-gprestore version = 1.2.7-beta1+dev.7
20230727:10:38:30 gprestore:gpadmin:mdw:020373-[INFO]:-Greenplum Database Version = oudberry Database 1.0.0 build dev
20230727:10:38:30 gprestore:gpadmin:mdw:020373-[INFO]:-Restoring sequence values
20230727:10:38:30 gprestore:gpadmin:mdw:020373-[INFO]:-Sequence values restore complete
Tables restored: 1 / 1 [===============================================================] 100.00% 5s
20230727:10:38:35 gprestore:gpadmin:mdw:020373-[INFO]:-Data restore complete
20230727:10:38:35 gprestore:gpadmin:mdw:020373-[INFO]:-Found neither /usr/local/cloudberry-db/bin/gp_email_contacts.yaml nor /home/gpadmin/gp_email_contacts.yaml
20230727:10:38:35 gprestore:gpadmin:mdw:020373-[INFO]:-Email containing gprestore report /tmp/gpseg-1/backups/20230727/20230727102819/gprestore_20230727102819_20230727103829_report will not be sent
20230727:10:38:35 gprestore:gpadmin:mdw:020373-[INFO]:-Restore completed successfully -
Finally, verify the row count
[gpadmin@mdw tmp]$ psql -U gpadmin tutorial
psql (14.4, server 14.4)
Type "help" for help.
tutorial=# select count(*) from faa.otp_r;
count
---------
1024552
(1 row)
The table should show 1024552 rows again as it was prior to the truncate call.