SQL dump

The dump technique allows you to create a file with SQL commands that recreates a database in the same state as it was at the time the dump was created.

Create a backup

PostgreSQL includes the pg_dump utility that implements this functionality. It does not block other users' access to the database and makes consistent backups even while the database is in use.

The basic use of pg_dump is:

pg_dump <dbname> > <dumpfile>

Where:

  • dbname is the database name to be backed up;

  • dumpfile is the name of the backup file to be created.

The pg_dump utility writes its result to the standard output. Dumps can be output in script or archive file formats (custom, directory, or tar). See the Options section of the pg_dump article for details. The command above creates a script text file.

Script dumps are plain text files that contain SQL commands. To restore a database from a script file, use psql. Script files can be used to recreate the database on a machine with a different architecture, and you can even modify them manually to load the database into another DBMS.

The pg_dump application backs up a single database. To back up an entire cluster or global objects that are common to all databases in a cluster (roles and tablespaces), use pg_dumpall.

Since pg_dump is a client application, you can create a backup from any remote host that has access to the database. The user running pg_dump must have read access to all tables that you want to back up, so to back up an entire database you almost always have to run it as a database superuser. If you do not have this privilege, you can back up the parts of the database to which you have read access. Use the -n (schema) or -t (table) options as described in the following article: pg_dump.

The example below backs up the books_store database to the books_store20221215.sql file:

 pg_dump books_store > books_store20221215.sql

The following example creates a backup of the books table from the books_store database:

 pg_dump -d books_store -t books > books_20221215.sql

To specify which database server pg_dump should contact, use the command line options -h (host) and -p (port). The default host is localhost or the host specified in the PGHOST environment variable. The default port is taken from the PGPORT environment variable; if PGPORT is not set, pg_dump uses the value specified during compilation. The default port value for the ADPG service is 5432. The pg_dump program connects using a database user name equal to the current operating system user name. To override this, specify the -U option or set the PGUSER environment variable.

The following example backs up the books_store database located on the host 10.92.6.26; pg_dump establishes the connection on behalf of the postgres user:

pg_dump -h 10.92.6.26 -U postgres books_store > books_store20221215.sql
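The same connection settings can also be supplied through environment variables instead of command line options. A sketch equivalent to the example above, using the host, user, and file names from that example:

```shell
# Connection settings read by pg_dump and other libpq-based client tools.
export PGHOST=10.92.6.26   # overrides the default host (localhost)
export PGPORT=5432         # default port for the ADPG service
export PGUSER=postgres     # overrides the OS user name as the connection role

pg_dump books_store > books_store20221215.sql
```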

The archive file formats are designed to be portable across architectures. See pg_dump for details.

NOTE
The pg_dump output can be re-loaded into newer versions of PostgreSQL, whereas file-level backups and continuous archiving are both version-specific.

Restore the dump

Use the psql program to restore dump files. The basic command syntax to restore a dump is:

psql <dbname> < <dumpfile>

Where:

  • dbname is the database name to be restored;

  • dumpfile is the name of the backup file.

The command above does not create the dbname database. You must create it from template0 before executing the command.

createdb -T template0 books_store
NOTE
The pg_dump utility creates dumps that are relative to template0, so languages, procedures, and other objects added via template1 are also dumped. If you utilize a customized template1, create an empty database from template0, as in the example above.

The psql utility continues executing the script after an SQL error occurs. You can set the ON_ERROR_STOP variable to alter this behavior and force psql to exit with status 3 if an SQL error occurs.

psql --set ON_ERROR_STOP=on books_store < books_store20221215.sql

Before restoring an SQL dump, all users who own objects or have permissions on objects in the dumped database must already exist. Otherwise, errors occur when the restore attempts to recreate objects with their original ownership or permissions.

After restoring a backup, run ANALYZE on each database to provide the query optimizer with correct statistics.
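For example, the statistics can be refreshed with the vacuumdb utility, which runs ANALYZE for you. A sketch, assuming the connection settings for the restored server are already in place:

```shell
# Refresh planner statistics after a restore.
# --analyze-only skips vacuuming and only collects statistics;
# --all processes every database in the cluster.
vacuumdb --all --analyze-only
```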

For archive file dumps, use the pg_restore utility.

Dump a database cluster

The pg_dump program backs up a single database. It does not dump information about roles or tablespaces because they are cluster-wide rather than per-database objects. The pg_dumpall utility dumps the contents of an entire database cluster: it backs up each database in a given cluster and preserves cluster-wide data (role and tablespace definitions). It has the following syntax:

pg_dumpall > <dumpfile>

Where dumpfile is the name of the backup file to be created.

The resulting dump can be restored with psql:

psql -f <dumpfile> postgres

The -f option makes psql read commands from the specified file (dumpfile). postgres is the database to connect to first; you can specify any existing database instead, but when restoring a dump of multiple databases into an empty cluster, use postgres. Database superuser access is also required to restore a pg_dumpall dump. If you use tablespaces, make sure that the tablespace paths in the dump are appropriate for the new installation.

The pg_dumpall program adds to the dump file the commands that re-create roles, tablespaces, and empty databases, and then invokes pg_dump for each database. This means that while each database is internally consistent, the snapshots of different databases are not synchronized.

To dump only cluster-wide data, use the pg_dumpall --globals-only option.
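For example, to save only the role and tablespace definitions (the output file name below is illustrative):

```shell
# Dump cluster-wide objects (roles, tablespaces) without any database contents.
pg_dumpall --globals-only > globals_20221215.sql
```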

Back up large databases

Operating systems may have maximum file size limits that cause problems when creating large pg_dump output files. Since pg_dump writes to the standard output, you can use standard Unix tools to solve this problem.

Use compressed dumps

You can use a compression program, for example gzip:

pg_dump <dbname> | gzip > <filename.gz>

You can restore the dump as follows:

gunzip -c <filename.gz> | psql <dbname>

or

cat <filename.gz> | gunzip | psql <dbname>

Use split

The split command allows you to split the output into smaller files. The following example splits a file into 2 GB chunks:

pg_dump <dbname> | split -b 2G - <filename>

You can restore the dump as follows:

cat <filename>* | psql <dbname>

If you utilize GNU split, you can combine it with gzip:

pg_dump <dbname> | split -b 2G --filter='gzip > $FILE.gz' - <filename>

Use the zcat utility to restore it:

zcat <filename>*.gz | psql <dbname>
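The chunking mechanics can be verified without a database server. The sketch below uses generated text as a stand-in for pg_dump output, splits it into separately compressed chunks, and checks that zcat reassembles the original stream (file names are illustrative):

```shell
# Stand-in for a dump: any text stream behaves the same way.
seq 1 100000 > original.sql

# Split into 64 KB chunks, compressing each chunk with gzip (GNU split).
split -b 64K --filter='gzip > $FILE.gz' original.sql chunk_

# gzip treats concatenated members as one stream, so zcat restores the
# chunks in glob (suffix) order; in a real restore, pipe this into psql.
zcat chunk_*.gz > restored.sql

cmp -s original.sql restored.sql && echo "chunks match"
```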

Use custom dump format

Since ADPG includes the zlib library, the custom pg_dump format compresses data. In this format, the size of the dump file is close to the size obtained using gzip, but it allows you to restore tables selectively. The following command dumps the database in the custom format:

pg_dump -Fc <dbname> > <filename>

A dump file in the custom format is not a script; use the pg_restore utility to restore it:

pg_restore -d <dbname> <filename>

For very large databases, you might need to combine using the split utility with the custom dump format or compressed dump files.
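For example, a custom-format dump can be piped through split and the chunks later reassembled for pg_restore. A sketch using the placeholders from the examples above:

```shell
# Dump in the custom format and split the output into 2 GB chunks.
pg_dump -Fc <dbname> | split -b 2G - <filename>

# Reassemble the chunks and feed them to pg_restore via standard input.
cat <filename>* | pg_restore -d <dbname>
```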

Use parallel dump feature

To speed up the dump of a large database, you can use the pg_dump parallel mode. It dumps multiple tables at the same time. You can control the degree of parallelism with the -j parameter. Parallel dumps are only supported for the directory archive format.

pg_dump -j <number> -F d -f <out_directory> <dbname>

You can use pg_restore -j to restore a dump in parallel. This works for any archive in the custom or directory format, even if it was not created with pg_dump -j.
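For example, a sketch that restores the directory-format dump created above into an existing database using four parallel jobs:

```shell
# Restore the directory-format archive with 4 parallel worker jobs.
pg_restore -j 4 -d <dbname> <out_directory>
```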
