Data directory structure
This article describes a data directory structure of an ADPG cluster.
ADPG stores data files in the directory specified by the Data directory configuration parameter. The default value is /pg_data1
. You can change this value during cluster installation before the ADPG service is installed.
Contents of the cluster data directory
The directory specified in Data directory contains a subdirectory for the current cluster. For ADPG v16.x.x, it is adpg16. The contents of the cluster subdirectory are listed in the table below.
Item | Description |
---|---|
PG_VERSION |
File containing the ADPG/PostgreSQL major version number |
base |
Directory containing per-database subdirectories |
current_logfiles |
File that records what log files the logging collector currently writes to |
global |
Directory containing cluster-wide tables, such as |
pg_commit_ts |
Directory containing transaction commit timestamp data |
pg_dynshmem |
Directory containing files used by the dynamic shared memory subsystem |
pg_logical |
Directory containing status data of subscriptions or replication slots for logical decoding |
pg_multixact |
Directory containing multi-transaction status data used for shared row locks |
pg_notify |
Directory containing |
pg_replslot |
Directory containing replication slot data |
pg_serial |
Directory containing information about committed serializable transactions |
pg_snapshots |
Directory containing exported snapshots |
pg_stat |
Directory containing permanent files for the statistics subsystem |
pg_stat_tmp |
Directory containing temporary files for the statistics subsystem |
pg_subtrans |
Directory containing subtransaction status data |
pg_tblspc |
Directory containing symbolic links to tablespaces |
pg_twophase |
Directory containing state files for prepared transactions |
pg_wal |
Directory containing WAL files |
pg_xact |
Directory containing transaction commit status data |
postgresql.auto.conf |
File used for storing configuration parameters that are set by the |
postmaster.opts |
File recording the command-line options the server was last started with |
postmaster.pid |
Lock file recording the current postmaster process ID (PID), cluster data directory path, postmaster start timestamp, port number, Unix-domain socket directory path (could be empty), the first valid value of the |
For each database, there is a subdirectory within the base directory named after the database’s OID. You can obtain OID from the pg_database system catalog:
SELECT oid, datname FROM pg_database;
oid | datname -----+----------- 5 | postgres 1 | template1 4 | template0
To list the contents of the base directory, run the following command:
$ cd /pg_data1/adpg16/base && ls
Result:
1 4 5
The database subdirectory is the default location for the database’s files including system catalogs.
Relation files
Standard relations such as tables or indexes are stored in separate files. These files are named by the table or index filenode
value, which is stored in the relfilenode
column of the pg_class system catalog. As an example, display a relation name and filenode
number for the first five records from pg_class
:
SELECT relname, relfilenode FROM pg_class LIMIT 5;
relname | relfilenode -------------------+------------- books_book_id_seq | 16390 books | 16391 books_pkey | 16396 authors_id_seq | 16398 authors | 16399
Since the listed relations belong to the postgres
database, the corresponding files with names 16390
, 16391
, 16396
, 16398
, and 16399
are located in the /pg_data1/adpg16/base/5 directory.
For temporary relations, the file name has the t<backend>_<filenode>
format, where <backend>
is the number of the backend process, which creates the file, and <filenode>
is the filenode
value. For example, t8_16433
.
In addition to the main file, each table and index has a free space map, which stores information about free space available in the relation. The free space map has the same file name as a corresponding relation plus the _fsm
suffix. For example, 16432_fsm
.
Tables also have visibility maps to track pages that do not contain dead tuples. A visibility map is stored using the same file name as a corresponding table with the _vm
suffix. For example, 16432_vm
.
Unlogged tables and indexes have an additional file known as the initialization fork, which is stored with the suffix _init
.
Note that although a table’s filenode
often matches its OID, this is not always the case. Some operations as TRUNCATE
, REINDEX
, CLUSTER
, and some ALTER TABLE
forms can change filenode
while preserving the OID.
When a table or index exceeds 1 GB, it is divided into segments with a maximum size of 1 GB. The first segment’s file name is the same as filenode
. Subsequent segments are named filenode.1
, filenode.2
, etc. This arrangement avoids problems on platforms that have file size limitations.
If a table contains columns with potentially large entries, a TOAST table is created to store these large values separately. The main table is linked to its TOAST table via the reltoastrelid
field of the pg_class system catalog. For more information, see TOAST.
Temporary files required for resource-consuming operations are created within base/pgsql_tmp. The name of a temporary file has the following format: pgsql_tmp<PID>.<Number>
, where <PID>
is the PID of the owning backend process, <Number>
distinguishes different temporary files of the current backend process. For example, pgsql_tmp29522.1
.
Tablespace files
Each user-defined tablespace has a symbolic link in the pg_tblspc subdirectory located in the cluster data directory (by default, /pg_data1/adpg16), which points to the physical tablespace directory — the location specified in the CREATE TABLESPACE
command. This symbolic link is named according to the tablespace OID. Inside the physical tablespace directory, ADPG creates a subdirectory with a name that depends on the ADPG version, for example, PG_16_202307071. Within this subdirectory, there is a subdirectory for each database that has elements in the tablespace. It is named according to the database’s OID. Tables and indexes are stored within this directory using the filenode
naming scheme described above.
The pg_default
tablespace is not accessed through pg_tblspc, but corresponds to the base subdirectory located in the cluster data directory. Similarly, the pg_global
tablespace is not accessed through pg_tblspc, but corresponds to global.
Temporary files are created within the pgsql_tmp subdirectory of a tablespace directory.
Obtain path to relation file
The pg_relation_filepath(relation regclass)
function returns the path relative to the cluster data directory for any relation. Keep in mind that this function gives the name of the first segment related to the current relation. You may need to append a segment number and/or the _fsm
, _vm
, or _init
suffix to obtain all the files associated with the relation.
Example:
SELECT pg_relation_filepath('books');
pg_relation_filepath ---------------------- base/5/16391
It is also possible to display the relation name and its path to a file for each relation from the pg_class
system catalog:
SELECT relname, pg_relation_filepath(oid) FROM pg_class;
books_book_id_seq | base/5/16390 books | base/5/16391 (1) books_pkey | base/5/16396 authors_id_seq | base/5/16398 authors | base/5/16399 authors_pkey | base/5/16403 pg_statistic | base/5/2619 pg_type | base/5/1247 pgbench_accounts | base/5/16422 pgbench_branches | base/5/16423 pgbench_tellers | base/5/16425 pgbench_branches_pkey | base/5/16426 pgbench_tellers_pkey | base/5/16428 pgbench_accounts_pkey | base/5/16430 pgbench_history | base/5/16457 pg_toast_1255 | base/5/2836 pg_toast_1255_index | base/5/2837 ... table1 | pg_tblspc/16451/PG_16_202307071/5/16454 (2) ...
1 | Table from the default tablespace. |
2 | Table from a user-defined tablespace. |