Performance tuning
Query performance can be affected by many factors. Some of them can be controlled by a user or administrator, while others are mostly determined by the underlying system architecture. This article provides tips for understanding and tuning PostgreSQL performance.
Optimize resource consumption
This section contains general guidelines on configuration parameter values that can be used as a starting point. Optimal values depend on the workload of a particular cluster and the hardware in use. Note that there are open-source configurators that calculate basic PostgreSQL settings based on a specified hardware configuration; some of them are available online.
Memory settings
The table below contains baseline recommendations for memory settings depending on the amount of server RAM.
Parameter name | Description | Recommended value |
---|---|---|
effective_cache_size | Sets the planner’s assumption about the effective size of the disk cache that is available to a single query. It is taken into account when estimating the cost of using an index: a higher value makes it more likely that index scans will be used, while a lower value makes sequential scans more likely. When setting this parameter, consider both PostgreSQL shared buffers and the portion of the kernel’s disk cache that will be used for PostgreSQL data files, though some data might exist in both places. Also, take into account the expected number of concurrent queries on different tables, since they will have to share the available space. This parameter does not affect the size of shared memory allocated by PostgreSQL, and it does not reserve kernel disk cache; it is used only for estimation purposes. The system also does not assume data remains in the disk cache between queries. If this value is specified without units, it is taken as blocks, that is BLCKSZ bytes, typically 8 kB | 2/3 RAM |
shared_buffers | Sets the amount of memory the database server uses for shared memory buffers. If you have a dedicated database server with 1 GB or more of RAM, a reasonable starting value for shared_buffers is 25% of the system memory | RAM/4 |
temp_buffers | Sets the maximum amount of memory used for temporary buffers within each session. These are session-local buffers used only for access to temporary tables. If this value is specified without units, it is taken as blocks, that is BLCKSZ bytes, typically 8 kB | 256 MB |
work_mem | Sets the base maximum amount of memory to be used by a query operation (such as a sort or hash table) before writing to temporary disk files. Note that a complex query might run several sort and hash operations at the same time, and several sessions can execute such operations concurrently, so the total memory actually used can be many times the value of work_mem. If this value is specified without units, it is taken as kilobytes | RAM/32 |
maintenance_work_mem | Specifies the maximum amount of memory to be used by maintenance operations, such as VACUUM, CREATE INDEX, and ALTER TABLE ADD FOREIGN KEY. Note that when autovacuum runs, this memory may be allocated up to autovacuum_max_workers times, so be careful not to set the value too high. It may be useful to utilize the autovacuum_work_mem configuration parameter to control the memory used by autovacuum workers separately | RAM/16 |
You can set all the described parameters except autovacuum_work_mem and temp_buffers in the ADPG configurations section of the Configuration tree on the Clusters → ADPG cluster → Services → ADPG → Primary configuration page in ADCM UI. To specify autovacuum_work_mem and temp_buffers, also use the ADPG configurations section of the Configuration tree. See Configuration parameters for details.
After changing the described settings, execute the Reconfigure & Restart action of the ADPG service to apply changes.
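After the restart, you can verify which values are actually in effect by querying the pg_settings view from psql. The query below is a simple check; the parameter list only repeats the settings described above:
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('effective_cache_size', 'shared_buffers', 'temp_buffers',
               'work_mem', 'maintenance_work_mem', 'autovacuum_work_mem');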
Settings related to I/O operations
The table below lists parameters related to I/O operations whose values can significantly affect performance.
Parameter name | Description |
---|---|
fsync | If this parameter is on, the PostgreSQL server tries to make sure that updates are physically written to disk by issuing fsync() system calls or various equivalent methods. This ensures that the database cluster can recover to a consistent state after an operating system or hardware crash. Turning fsync off can improve performance, but it can result in unrecoverable data corruption after a power failure or system crash, so it is only advisable if you can easily recreate the entire database from external data. Examples where disabling fsync may be acceptable include the initial loading of a new database cluster from a backup file or processing a batch of data after which the database will be dropped and recreated. In many situations, turning off synchronous_commit for noncritical transactions can provide much of the potential performance benefit of disabling fsync without the risk of data corruption. You can set this parameter in ADPG configurations on the Clusters → ADPG cluster → Services → ADPG → Primary configuration page in ADCM UI (see Configuration parameters). |
synchronous_commit | Specifies how much WAL processing must complete before the database server returns a success indication to the client. See synchronous_commit for a detailed description. Setting this parameter to off can significantly improve performance; unlike disabling fsync, it does not create a risk of database inconsistency, although an operating system or database crash might cause some recently committed transactions to be lost. The behavior of each transaction is determined by the setting in effect when it commits, so it is possible to commit some transactions synchronously and others asynchronously. For example, to make a single multistatement transaction commit asynchronously when the default is the opposite, issue SET LOCAL synchronous_commit TO OFF within the transaction |
checkpoint_completion_target | Specifies the target of checkpoint completion as a fraction of total time between checkpoints. The default is 0.9, which spreads the checkpoint I/O load over most of the interval between checkpoints |
effective_io_concurrency | Sets the number of concurrent disk I/O operations that can be executed simultaneously. Raising this value will increase the number of I/O operations that a PostgreSQL session attempts to initiate in parallel. The allowed range is 1 to 1000, or 0 to disable issuance of asynchronous I/O requests. For magnetic drives, a good starting point for this setting is the number of separate drives comprising a RAID 0 or RAID 1 array being used for the database. For RAID 5, the parity drive should not be counted. If the database is often busy with multiple queries issued in concurrent sessions, lower values may be sufficient to keep the disk array busy. A value higher than needed to keep the disks busy will only result in extra CPU overhead. SSD storage can often process many concurrent requests, so the best value might be in the hundreds. This value can be overridden for tables in a particular tablespace by setting the tablespace parameter of the same name (see ALTER TABLESPACE) |
random_page_cost | Sets the planner’s estimate of the cost of a non-sequentially-fetched disk page. The default is 4.0. Reducing this value relative to seq_page_cost causes the system to prefer index scans; raising it makes index scans look relatively more expensive. Random access to mechanical disk storage is normally much more expensive than four times sequential access. However, a lower default is used (4.0) because the majority of random accesses to disk, such as indexed reads, are assumed to be in cache; the default can be thought of as modeling random access as 40 times slower than sequential access while expecting 90% of random reads to be cached. If you believe a 90% cache rate is an incorrect assumption for your workload, you can increase random_page_cost to better reflect the true cost of random storage reads. Correspondingly, if your data is likely to be entirely in cache, for example, when the database is smaller than the total server memory, decreasing random_page_cost can be appropriate. Storage with a low random read cost relative to sequential, such as solid-state drives, may also be better modeled with a lower value, for example, 1.1. This value can be overridden for tables and indexes in a particular tablespace by setting the tablespace parameter of the same name (see ALTER TABLESPACE) |
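As an illustration of the per-transaction and per-tablespace overrides mentioned above, the sketch below commits a single noncritical transaction asynchronously and lowers random_page_cost for an SSD-backed tablespace. The app_log table and the ssd_space tablespace are hypothetical names used only for this example:
-- Commit one noncritical transaction asynchronously while keeping
-- the server-wide synchronous_commit default intact.
BEGIN;
SET LOCAL synchronous_commit TO OFF;
INSERT INTO app_log (message) VALUES ('noncritical event');  -- app_log is hypothetical
COMMIT;
-- Override planner and I/O settings for a tablespace on fast SSD storage
-- (ssd_space is a hypothetical tablespace name).
ALTER TABLESPACE ssd_space SET (random_page_cost = 1.1, effective_io_concurrency = 200);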
Determine disk usage
In ADPG/PostgreSQL, each table has a primary heap disk file where most of the data is stored. If a table has columns with potentially large values, there can also be a TOAST file associated with the table, which is used to store these large values. If a TOAST table exists, it has one valid index on disk. In addition, there can be indexes associated with the base table. PostgreSQL stores each table and index in a separate disk file. If a file exceeds 1 GB, additional files are created. Naming conventions for these files are described in the Database file layout article.
You can monitor disk space in three ways:
Use database object size functions
These SQL functions are the easiest to use and are generally recommended. They are listed in the Database object size functions table.
For example, the pg_relation_size()
function returns the size of the specified table (pgbench_accounts
in this example) in bytes. You can use the psql tool to run the following query:
SELECT pg_relation_size('pgbench_accounts');
Result:
 pg_relation_size
------------------
        671481856
The pg_size_pretty()
function converts a size in bytes into a more human-readable format with size units: bytes, KB, MB, GB, or TB:
SELECT pg_size_pretty(pg_relation_size('pgbench_accounts'));
Result:
 pg_size_pretty
----------------
 640 MB
You can use the pg_database_size()
function to compute the total disk space used by the specified database (example
in this example):
SELECT pg_size_pretty (pg_database_size('example'));
Result:
 pg_size_pretty
----------------
 756 MB
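The object size functions can also be combined in a single query. The sketch below lists the five largest ordinary tables together with their total size (including indexes and TOAST data) in a human-readable form:
SELECT relname, pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relkind = 'r'          -- ordinary tables only
ORDER BY pg_total_relation_size(oid) DESC
LIMIT 5;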
Inspect system catalogs
You can issue queries in psql to see the disk usage of a table. For the results to be accurate, VACUUM or ANALYZE must have been run on the table recently. The following query determines the number of pages (the relpages
column) that belong to the pgbench_accounts
table:
SELECT pg_relation_filepath(oid), relpages FROM pg_class WHERE relname = 'pgbench_accounts';
Result:
 pg_relation_filepath | relpages
----------------------+----------
 base/19191/19244     |    81968
Each page is typically 8 kilobytes, so 81968 pages correspond to 81968 × 8 kB = 655744 kB, which matches the 671481856 bytes that the pg_relation_size()
function returns. Note that relpages
is only updated by VACUUM
, ANALYZE
, and a few DDL commands such as CREATE INDEX
. The file path (the pg_relation_filepath
column) is of interest if you want to examine the table’s disk file directly.
To determine the space used by the TOAST tables associated with the my_table
table, use the following query:
SELECT relname, relpages
FROM pg_class,
(SELECT reltoastrelid
FROM pg_class
WHERE relname = 'my_table') AS ss
WHERE oid = ss.reltoastrelid OR
oid = (SELECT indexrelid
FROM pg_index
WHERE indrelid = ss.reltoastrelid)
ORDER BY relname;
Result:
        relname        | relpages
-----------------------+----------
 pg_toast_295925       |        0
 pg_toast_295925_index |        1
The query below displays the index size for the pgbench_branches
table:
SELECT c2.relname, c2.relpages
FROM pg_class c, pg_class c2, pg_index i
WHERE c.relname = 'pgbench_branches' AND
c.oid = i.indrelid AND
c2.oid = i.indexrelid
ORDER BY c2.relname;
Result:
        relname        | relpages
-----------------------+----------
 ind_bbalance          |        2
 pgbench_branches_pkey |        2
To find the largest tables and indexes, use the following query:
SELECT relname, relpages
FROM pg_class
ORDER BY relpages DESC LIMIT 5;
Result:
        relname        | relpages
-----------------------+----------
 pgbench_accounts      |    81968
 pgbench_accounts_pkey |    13713
 pg_proc               |       94
 pg_depend             |       65
 pg_toast_2618         |       63
Use the oid2name module
The oid2name utility program helps administrators examine the file structure used by ADPG/PostgreSQL.
oid2name connects to a target database and extracts OID, filenode (a unique numeric identifier used to name on-disk files belonging to a relation), and/or table name information:
$ oid2name
Result:
All databases:
    Oid  Database Name  Tablespace
-----------------------------------
  19191        example  pg_default
  14486       postgres  pg_default
  14485      template0  pg_default
      1      template1  pg_default
For example, let’s examine the example database from the list above. To do this, change the current directory to the directory of this database:
cd /pg_data1/adpg16/base/19191
where 19191 is the OID of the example database from the command output above.
Display the first 5 database objects in the default tablespace, ordered by size from largest to smallest:
ls -lS * | head -5
Result:
-rw------- 1 postgres postgres 671481856 Jul 31 10:54 19244
-rw------- 1 postgres postgres 112336896 Jul 31 11:03 19261
-rw------- 1 postgres postgres 770048 Jul 31 10:43 1255
-rw------- 1 postgres postgres 532480 Aug 1 08:11 2608
-rw------- 1 postgres postgres 516096 Jul 31 10:43 2838
Determine what the 19244
file is:
oid2name -d example -f 19244
Result:
Filenode  Table Name
----------------------------
   19244  pgbench_accounts
The oid2name utility requires a running database server with non-corrupt system catalogs. In situations with catastrophic database corruption, its use is limited.
Populate a database
This section provides some suggestions on how to efficiently insert a large amount of data when populating a database for the first time or during the process of database use.
Disable autocommit
When using multiple INSERT
statements, turn off autocommit and do a single commit at the end. In plain SQL, this means issuing BEGIN at the start and COMMIT at the end. Some client libraries may do this behind your back; in that case, make sure the library does it when you want it done. If you allow each insertion to be committed separately, PostgreSQL does a lot of work for each added row. An additional benefit of doing all insertions in one transaction is that if the insertion of one row fails, the insertion of all rows is rolled back, so you are not left with partially loaded data.
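For example, a minimal sketch of this approach (the accounts table and its values are hypothetical):
-- One transaction around many inserts: a single commit at the end,
-- and a failure rolls back all rows at once.
BEGIN;
INSERT INTO accounts (id, balance) VALUES (1, 100);
INSERT INTO accounts (id, balance) VALUES (2, 200);
-- ... more INSERT statements ...
COMMIT;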
Use COPY
Use COPY to load all the rows in one command, instead of using a series of INSERT
commands. The COPY
command is optimized for loading large numbers of rows. It is less flexible than INSERT
but has significantly less overhead for large data loads. Since COPY
is a single command, there is no need to disable autocommit if you use this method to populate a table.
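A minimal COPY sketch is shown below; the accounts table and the /tmp/accounts.csv file are hypothetical. Note that COPY ... FROM a server-side file requires appropriate privileges; from psql, the \copy meta-command reads a client-side file instead:
-- Load rows from a CSV file in one command.
COPY accounts (id, balance)
FROM '/tmp/accounts.csv'
WITH (FORMAT csv, HEADER true);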
If you cannot use COPY
, it may be useful to utilize PREPARE to create a prepared INSERT
statement, and then use EXECUTE as many times as required. This allows you to avoid the overhead of repeatedly parsing and planning INSERT
.
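A possible shape of this approach, again using the hypothetical accounts table:
-- Prepared INSERT: parse and plan once, execute many times.
PREPARE bulk_insert (integer, numeric) AS
    INSERT INTO accounts (id, balance) VALUES ($1, $2);
EXECUTE bulk_insert(1, 100);
EXECUTE bulk_insert(2, 200);
-- ... repeat EXECUTE for each row ...
DEALLOCATE bulk_insert;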
Note that loading a large number of rows using COPY
is almost always faster than using INSERT
, even if PREPARE
is used and multiple insertions are batched into a single transaction.
If the wal_level
parameter is minimal
, COPY
is fastest when executed in the same transaction as the CREATE TABLE
or TRUNCATE
commands. In such cases, there is no need to write WAL, since in case of an error, the files containing the newly loaded data will be removed.
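For example, assuming wal_level is set to minimal and the hypothetical accounts table and CSV file from the previous examples, the load can be wrapped as follows:
-- COPY into a table truncated (or created) in the same transaction
-- can skip WAL writing when wal_level = minimal.
BEGIN;
TRUNCATE accounts;
COPY accounts FROM '/tmp/accounts.csv' WITH (FORMAT csv, HEADER true);
COMMIT;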
Remove indexes
If you load data into a table you just created, the fastest method is to add data using COPY
and then create indexes. It takes less time to create an index for existing data than to update it sequentially as each row is added.
If you need to add large amounts of data to an existing table, it may make sense to drop the indexes, load the table, and then recreate the indexes. Note that the database performance for other users might suffer during the time the indexes are missing. Additionally, you should think twice before dropping unique indexes, as without them the corresponding key checks will not be performed.
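A sketch of this pattern, with hypothetical table, index, and file names:
-- Drop the index, load the data, then rebuild the index in one pass.
DROP INDEX accounts_balance_idx;
COPY accounts FROM '/tmp/accounts.csv' WITH (FORMAT csv, HEADER true);
CREATE INDEX accounts_balance_idx ON accounts (balance);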
Remove foreign key constraints
Just as with indexes, a foreign key constraint can be checked "in bulk" more efficiently than row-by-row. It can be useful to drop foreign key constraints, load data, and re-create the constraints. In this case, you have to choose between the speed of loading data and loss of error checking while the constraint is missing.
Moreover, when you load data into a table with existing foreign key constraints, each new row requires an entry in the server’s list of pending trigger events, since it is the firing of a trigger that checks the row’s foreign key constraint. Loading many millions of rows can cause the trigger event queue to overflow available memory, leading to intolerable swapping or even the failure of the command. Therefore, it may be necessary, not just desirable, to drop and re-apply foreign keys when loading large amounts of data. If temporarily removing the constraint is not acceptable, the only possible solution may be to split the load operation into smaller transactions.
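A sketch of this pattern with hypothetical orders and customers tables:
-- Drop the foreign key before the bulk load and re-create it afterwards,
-- so the constraint is checked once in bulk rather than row by row.
ALTER TABLE orders DROP CONSTRAINT orders_customer_fk;
COPY orders FROM '/tmp/orders.csv' WITH (FORMAT csv, HEADER true);
ALTER TABLE orders
    ADD CONSTRAINT orders_customer_fk
    FOREIGN KEY (customer_id) REFERENCES customers (id);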
Increase maintenance_work_mem
Temporarily increasing the maintenance_work_mem configuration parameter when loading large amounts of data can improve performance. It helps to speed up the CREATE INDEX
commands and ALTER TABLE ADD FOREIGN KEY
commands. This does not affect the speed of the COPY
command, so this tip is only useful when you are using one or both of the techniques described above. maintenance_work_mem
is located in the ADPG configurations section of the Configuration tree on the Clusters → ADPG cluster → Services → ADPG → Primary configuration page in ADCM UI. See Configuration parameters.
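If changing the cluster-wide value in ADCM is not convenient, maintenance_work_mem can also be raised for a single session, leaving other sessions unaffected. The 1GB value and the index name below are illustrative:
-- Raise maintenance_work_mem for the current session only, then build the index.
SET maintenance_work_mem = '1GB';
CREATE INDEX accounts_balance_idx ON accounts (balance);
RESET maintenance_work_mem;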
Increase max_wal_size
Temporarily increasing the max_wal_size configuration parameter can also make large data loads faster. Loading a large amount of data causes checkpoints to occur more often than the normal checkpoint frequency (specified by the checkpoint_timeout
configuration parameter). Whenever a checkpoint occurs, all dirty pages must be flushed to disk. The number of checkpoints that are required can be reduced by increasing max_wal_size
during bulk data loads. max_wal_size
is also located in the ADPG configurations section of the Configuration tree on the Clusters → ADPG cluster → Services → ADPG → Primary configuration page in ADCM UI.
Disable WAL archiving and backups
When loading large amounts of data to a cluster that uses WAL archiving and streaming replication, it might be faster to take a new base backup after the load is complete than to process a large amount of incremental WAL data.
To disable WAL archiving, set the following configuration parameters:
- wal_level to minimal;
- archive_mode to off;
- max_wal_senders to 0.
Use ADPG configurations on the Clusters → ADPG cluster → Services → ADPG → Primary configuration tab in ADCM UI to specify these parameters. See Configuration parameters for details.
To disable backup creation, switch off the Enable backups toggle button displayed on the Primary configuration tab of the ADPG service (Clusters → ADPG cluster → Services → ADPG → Primary configuration).
To apply changes of these settings, execute the Reconfigure & Restart action.
IMPORTANT
Disabling WAL archiving makes any backups taken before unusable for archive recovery and for setting up a standby server, which may lead to data loss.
These settings also make certain commands run faster because they do not write WAL if wal_level
is set to minimal
.
Run ANALYZE afterwards
Whenever you have significantly altered the distribution of data within a table, running the ANALYZE command is strongly recommended. Running ANALYZE
(or VACUUM ANALYZE
) ensures that the planner has up-to-date statistics about the table. With no statistics or obsolete statistics, the planner might make poor decisions during query planning, leading to poor performance. Note that if the autovacuum daemon is enabled, it might run ANALYZE
automatically.
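For example, for the pgbench_accounts table used earlier in this article:
ANALYZE pgbench_accounts;
-- or, to also reclaim dead space and update visibility information:
VACUUM ANALYZE pgbench_accounts;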
Improve pg_dump performance
Dump scripts generated by pg_dump automatically apply several, but not all, of the above guidelines. To restore a dump as quickly as possible, you need to do a few extra things manually. Note that these points apply while restoring a dump, not while creating it.
By default, pg_dump uses COPY
, and when it is generating a complete schema-and-data dump, it is careful to load data before creating indexes and foreign keys. So, several guidelines are handled automatically. All you have to consider is the following:
- Set appropriate (larger than normal) values for maintenance_work_mem and max_wal_size.
- If you use WAL archiving or streaming replication, disable them during the restore as described in the Disable WAL archiving and backups section.
- Experiment with the parallel dump and restore modes of both the pg_dump and pg_restore utilities and find the optimal number of concurrent jobs. Parallel data copying and restoring, controlled by the -j option, should give you significantly higher performance than the serial mode (see the example after this list).
- In some cases, it may be preferable to restore the dump in a single transaction. To do this, pass the -1 or --single-transaction command-line option to psql or pg_restore. COPY commands also run faster if you use a single transaction and WAL archiving is disabled. Note that in this mode, any error rolls back the entire restore, potentially resulting in the loss of many hours of processing.
- If multiple CPUs are available in the database server, consider using the --jobs option of pg_restore. This allows concurrent data loading and index creation.
- Run ANALYZE after restoring the backup.
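As an illustration of the parallel modes, the commands below create a directory-format dump of the example database with 4 parallel jobs and then restore it with 4 jobs. The output path and the job count are illustrative; note that --single-transaction cannot be combined with a parallel restore:
pg_dump -Fd -j 4 -f /backups/example_dump example
pg_restore -j 4 -d example /backups/example_dump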
A data-only dump also uses COPY, but it does not drop or recreate indexes, and by default it does not touch foreign keys. So, when loading a data-only dump, it is your responsibility to drop and recreate indexes and foreign keys. It is still useful to increase max_wal_size while loading the data, but not maintenance_work_mem; maintenance_work_mem is worth changing only if you manually recreate indexes and foreign keys afterwards. It is also recommended to run ANALYZE at the end of the restore.