Data directory structure

Eugenia Kuzina

Contents

Contents of the cluster data directory
Relation files
Tablespace files
Obtain path to relation file

This article describes a data directory structure of an ADPG cluster.

ADPG stores data files in the directory specified by the Data directory configuration parameter. The default value is /pg_data1. You can change this value during cluster installation before the ADPG service is installed.

Contents of the cluster data directory

The directory specified in Data directory contains a subdirectory for the current cluster. For ADPG v16.x.x, it is adpg16. The contents of the cluster subdirectory are listed in the table below.

Item Description

PG_VERSION

File containing the ADPG/PostgreSQL major version number

base

Directory containing per-database subdirectories

current_logfiles

File that records what log files the logging collector currently writes to

global

Directory containing cluster-wide tables, such as pg_database

pg_commit_ts

Directory containing transaction commit timestamp data

pg_dynshmem

Directory containing files used by the dynamic shared memory subsystem

pg_logical

Directory containing status data of subscriptions or replication slots for logical decoding

pg_multixact

Directory containing multi-transaction status data used for shared row locks

pg_notify

Directory containing LISTEN/NOTIFY status data

pg_replslot

Directory containing replication slot data

pg_serial

Directory containing information about committed serializable transactions

pg_snapshots

Directory containing exported snapshots

pg_stat

Directory containing permanent files for the statistics subsystem

pg_stat_tmp

Directory containing temporary files for the statistics subsystem

pg_subtrans

Directory containing subtransaction status data

pg_tblspc

Directory containing symbolic links to tablespaces

pg_twophase

Directory containing state files for prepared transactions

pg_wal

Directory containing WAL files

pg_xact

Directory containing transaction commit status data

postgresql.auto.conf

File used for storing configuration parameters that are set by the ALTER SYSTEM SQL command

postmaster.opts

File recording the command-line options the server was last started with

postmaster.pid

Lock file recording the current postmaster process ID (PID), cluster data directory path, postmaster start timestamp, port number, Unix-domain socket directory path (could be empty), the first valid value of the listen_address parameter (IP address, or *, or empty if not listening on TCP), and shared memory segment ID. This file is not present after server shutdown

For each database, there is a subdirectory within the base directory named after the database’s OID. You can obtain OID from the pg_database system catalog:

SELECT oid, datname FROM pg_database;

 oid |  datname
-----+-----------
   5 | postgres
   1 | template1
   4 | template0

To list the contents of the base directory, run the following command:

$ cd /pg_data1/adpg16/base && ls

Result:

1  4  5

The database subdirectory is the default location for the database’s files including system catalogs.

Relation files

Standard relations such as tables or indexes are stored in separate files. These files are named by the table or index filenode value, which is stored in the relfilenode column of the pg_class system catalog. As an example, display a relation name and filenode number for the first five records from pg_class:

SELECT relname, relfilenode FROM pg_class LIMIT 5;

      relname      | relfilenode
-------------------+-------------
 books_book_id_seq |       16390
 books             |       16391
 books_pkey        |       16396
 authors_id_seq    |       16398
 authors           |       16399

Since the listed relations belong to the postgres database, the corresponding files with names 16390, 16391, 16396, 16398, and 16399 are located in the /pg_data1/adpg16/base/5 directory.

For temporary relations, the file name has the t<backend>_<filenode> format, where <backend> is the number of the backend process, which creates the file, and <filenode> is the filenode value. For example, t8_16433.

In addition to the main file, each table and index has a free space map, which stores information about free space available in the relation. The free space map has the same file name as a corresponding relation plus the _fsm suffix. For example, 16432_fsm.

Tables also have visibility maps to track pages that do not contain dead tuples. A visibility map is stored using the same file name as a corresponding table with the _vm suffix. For example, 16432_vm.

Unlogged tables and indexes have an additional file known as the initialization fork, which is stored with the suffix _init.

Note that although a table’s filenode often matches its OID, this is not always the case. Some operations as TRUNCATE, REINDEX, CLUSTER, and some ALTER TABLE forms can change filenode while preserving the OID.

When a table or index exceeds 1 GB, it is divided into segments with a maximum size of 1 GB. The first segment’s file name is the same as filenode. Subsequent segments are named filenode.1, filenode.2, etc. This arrangement avoids problems on platforms that have file size limitations.

If a table contains columns with potentially large entries, a TOAST table is created to store these large values separately. The main table is linked to its TOAST table via the reltoastrelid field of the pg_class system catalog. For more information, see TOAST.

Temporary files required for resource-consuming operations are created within base/pgsql_tmp. The name of a temporary file has the following format: pgsql_tmp<PID>.<Number>, where <PID> is the PID of the owning backend process, <Number> distinguishes different temporary files of the current backend process. For example, pgsql_tmp29522.1.

Tablespace files

Each user-defined tablespace has a symbolic link in the pg_tblspc subdirectory located in the cluster data directory (by default, /pg_data1/adpg16), which points to the physical tablespace directory — the location specified in the CREATE TABLESPACE command. This symbolic link is named according to the tablespace OID. Inside the physical tablespace directory, ADPG creates a subdirectory with a name that depends on the ADPG version, for example, PG_16_202307071. Within this subdirectory, there is a subdirectory for each database that has elements in the tablespace. It is named according to the database’s OID. Tables and indexes are stored within this directory using the filenode naming scheme described above.

The pg_default tablespace is not accessed through pg_tblspc, but corresponds to the base subdirectory located in the cluster data directory. Similarly, the pg_global tablespace is not accessed through pg_tblspc, but corresponds to global.

Temporary files are created within the pgsql_tmp subdirectory of a tablespace directory.

Obtain path to relation file

The pg_relation_filepath(relation regclass) function returns the path relative to the cluster data directory for any relation. Keep in mind that this function gives the name of the first segment related to the current relation. You may need to append a segment number and/or the _fsm, _vm, or _init suffix to obtain all the files associated with the relation.

Example:

SELECT pg_relation_filepath('books');

pg_relation_filepath
----------------------
 base/5/16391

It is also possible to display the relation name and its path to a file for each relation from the pg_class system catalog:

SELECT relname, pg_relation_filepath(oid) FROM pg_class;

 books_book_id_seq                              | base/5/16390
 books                                          | base/5/16391 (1)
 books_pkey                                     | base/5/16396
 authors_id_seq                                 | base/5/16398
 authors                                        | base/5/16399
 authors_pkey                                   | base/5/16403
 pg_statistic                                   | base/5/2619
 pg_type                                        | base/5/1247
 pgbench_accounts                               | base/5/16422
 pgbench_branches                               | base/5/16423
 pgbench_tellers                                | base/5/16425
 pgbench_branches_pkey                          | base/5/16426
 pgbench_tellers_pkey                           | base/5/16428
 pgbench_accounts_pkey                          | base/5/16430
 pgbench_history                                | base/5/16457
 pg_toast_1255                                  | base/5/2836
 pg_toast_1255_index                            | base/5/2837
...
 table1                                         | pg_tblspc/16451/PG_16_202307071/5/16454 (2)
...

1	Table from the default tablespace.
2	Table from a user-defined tablespace.

Found a mistake? Seleсt text and press Ctrl+Enter to report it