Table types

Overview

ADB (as well as Greengage DB) supports several types of tables with different data storage models:

  • Heap. This is the default table type. Heap tables are recommended for OLTP-type workloads. You can use them if the data is frequently modified after being initially loaded. It is also the most suitable kind of tables for single-row queries INSERT, UPDATE, and DELETE.

  • Append-optimized (AO). These tables are preferable for OLAP-type workloads. They are also optimized for bulk data loading. You can use them if the data is rarely modified after being initially loaded, i.e. the read-only queries prevail. Unlike heap tables (that can be only row-oriented), append-optimized tables support two forms of data orientation:

    • Row-oriented. This storage model is recommended for retrieving all or most of the table columns at once.

    • Column-oriented. This model is good for data computing over a small number of columns. You can also use it for tables that need regular updates of a small part of the columns — without modifying the rest of them.

Table types
Table types
Table types
Table types

When you create a table, you should choose the most suitable table type. Starting with ADB 7, this type is defined by the table access method — an interface introduced in PostgreSQL 12.

The following subsections show the difference between the table types that are possible in ADB. This information can be useful for choosing the best storage model for your workloads.

Heap vs Append-optimized

Criteria Heap Append-optimized

Load

OLTP

OLAP

Size of tables

Small tables, e.g. dimension tables

Denormalized fact tables that are typically the largest ones in the system

Frequency of data updates after the initial loading

Often

Rarely

Prevailing queries

Many iterative transactions including single-row updates and parallel batch operations

  • Loading by batches.

  • Access by read-only queries.

Ability to define unique values

Supported — by using the UNIQUE and PRIMARY KEY column constraints

 — 

Compression

 — 

  • Row-oriented: ZSTD, ZLIB (table level).

  • Column-oriented: ZSTD, ZLIB, RLE_TYPE (table and column level).

Syntax

Heap tables are created by default:

CREATE TABLE test_heap (a int, b text)
    DISTRIBUTED BY (a);

To create an append-optimized row-oriented table, use the USING ao_row clause.

CREATE TABLE test_ao_row (a int, b text)
  USING ao_row
  DISTRIBUTED BY (a);

To create an append-optimized column-oriented table, use the USING ao_column clause:

CREATE TABLE test_ao_column (a int, b text)
  USING ao_column
  DISTRIBUTED BY (a);

The legacy WITH (appendoptimized=true|false, orientation=row|column) way is also supported

Recommendations on usage

OLTP-type workloads — with many iterative transactions over small tables

Data warehouse environment (OLAP-type workloads) — with bulk loading and subsequent read-only requests over large denormalized tables

CAUTION

It is not recommended to perform the following commands over append-optimized tables due to their inefficiency:

  • The single-row INSERT, UPDATE, or DELETE operations.

  • The parallel batch UPDATE or DELETE operations (this does not apply to the batch INSERT operations).

Row-oriented vs Column-oriented

Criteria Row-oriented Column-oriented

Table type

Heap and append-optimized

Append-optimized

Frequency of data updates after the initial loading

Often

Rarely

Frequency of data insertion

Often

Rarely  
 
(column-oriented tables are not optimized for the INSERT operations, as column values for a row are written to different places on disk)

Number of columns in the table

Relatively small

Many columns (dozens or more)

Number of requested columns

All or the majority of columns

A small subset of columns

Prevailing queries

Any

  • The SELECT queries that:

    • Aggregate many values of a single column (the WHERE and HAVING predicates also use this column):

      SELECT SUM(price) FROM book;
      SELECT SUM(price) FROM book WHERE price > 1000;
    • Return a relatively small number of columns (the WHERE predicate uses a single column):

      SELECT title FROM book WHERE public_year=2022;
  • Rare UPDATE queries that modify a small subset of columns without modifying other column data.

Possible at the table level in append-optimized tables.  
Algorithms: ZSTD, ZLIB

Possible at the table and the column levels in append-optimized tables.  
Algorithms: ZSTD, ZLIB, RLE_TYPE

Syntax

  • Heap:

    CREATE TABLE test_heap (a int, b text)
        DISTRIBUTED BY (a);
  • Append-optimized:

    CREATE TABLE test_ao_row (a int, b text)
      USING ao_row
      DISTRIBUTED BY (a);

Append-optimized only:  

CREATE TABLE test_ao_column (a int, b text)
  USING ao_column
  DISTRIBUTED BY (a);

Recommendations on usage

  • OLTP-type workloads — with many iterative transactions including frequent inserts and updates.

  • OLAP-type workloads or mixed workloads — with queries where many columns of a single row are retrieved all at once.

Data warehouse environment (OLAP-type workloads) — with aggregations of many values of a single column or regular selects/updates of a small number of columns

Found a mistake? Seleсt text and press Ctrl+Enter to report it