What’s new in Greenplum 7. Conclusion
- Migrate to partitioning from Postgres
- Row Level Security
- Default permissions for schema and role
- WALs are only retained for the last checkpoint
- Just-in-Time (JIT) compilation
- Generated columns
- SQL/JSON
- Autovacuum and autoanalyze
- ANALYZE optimization for AO tables
- External tables
- Resource groups
- Summing up
Hello everyone! Here is the final, third part of the overview of changes in Greenplum 7. If for some reason you missed the previous publications, you can find them in our blog: Part 1 and Part 2. Today we will talk about changes in table partitioning and resource groups, JIT compilation of expressions, and much more. Let's get started!
Migrate to partitioning from Postgres
Previous versions of Greenplum used their own partitioning scheme, implemented at a time when Postgres had no such support. Later, Postgres 10 introduced its own native partitioning (f0e4475).
Greenplum 7 has migrated completely to the Postgres partitioning infrastructure. This means that many familiar tools have been removed and are no longer available (for example, the pg_partitions view, the pg_partition and pg_partition_rule catalog relations, and so on). The table definition syntax familiar from the Postgres documentation is now the primary one and is the one used when dumping a schema.
However, support for the classic syntax familiar from previous versions of Greenplum has been preserved as much as possible, except in cases where the expressions it defines have no analogues in the Postgres variant.
For example, range partitions now always include the lower bound and exclude the upper one. Using the classic syntax you can still define the other variant, but only for data types that allow a non-strict bound to be rewritten as a strict one (for an integer column, END (10) INCLUSIVE can become END (11) EXCLUSIVE).
The new features are only available with the Postgres syntax: splitting a table into ranges over several columns, partitioning by the remainder of a hash value, and heterogeneous multi-level partitioning. Another caveat is that only the Postgres-derived planner supports them; ORCA does not.
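As a sketch of the new syntax, here is a range partitioning declaration over two columns (the table and column names are invented for illustration):

```sql
-- Postgres-style declarative partitioning in Greenplum 7:
-- a multi-column range partition key.
CREATE TABLE sales (
    sale_date date,
    region_id int,
    amount    numeric
) DISTRIBUTED BY (region_id)
  PARTITION BY RANGE (sale_date, region_id);

-- Partitions are created as separate tables and attached explicitly.
-- Note the bound semantics: lower bound inclusive, upper bound exclusive.
CREATE TABLE sales_2024_h1 PARTITION OF sales
    FOR VALUES FROM ('2024-01-01', 0) TO ('2024-07-01', 0);
```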
Row Level Security
Starting with Postgres 9.5, in addition to the standard access model at the level of table operation types, it is possible to define data access policies within a table. Each policy is specified using a condition that defines the set of rows the associated role can read or modify. In the example below, we create a policy for the alice role that grants access only to rows containing an even value in the b column.
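A minimal sketch of such a policy (the table definition here is assumed; the role and column come from the description above):

```sql
-- Illustrative table; RLS must be enabled explicitly.
CREATE TABLE t (a int, b int) DISTRIBUTED BY (a);
ALTER TABLE t ENABLE ROW LEVEL SECURITY;

-- alice can only read rows with an even b (USING),
-- and only insert rows that satisfy the same condition (WITH CHECK).
CREATE POLICY even_b ON t
    FOR ALL
    TO alice
    USING (b % 2 = 0)
    WITH CHECK (b % 2 = 0);

GRANT SELECT, INSERT ON t TO alice;
```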
The expression specified in the policy definition is used to filter rows during scanning or to check for a match during insertion.
Because the implementation of this functionality is abstracted from the storage format, policies are also available for AO tables. On the other hand, at the time of writing, the ORCA planner does not support queries against tables with active access policies.
Default permissions for schema and role
Continuing the topic of access rights, I would like to note the ability, introduced in Postgres 10, to set default permissions for objects that will be created later within a schema. Alternatively, default permissions can be set for objects that will be created by a specific role.
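Both variants can be sketched as follows (the schema and role names are invented for illustration):

```sql
-- Every table created later in schema reports will be readable by analysts.
ALTER DEFAULT PRIVILEGES IN SCHEMA reports
    GRANT SELECT ON TABLES TO analysts;

-- Every table the etl role creates (in any schema) will be readable by analysts.
ALTER DEFAULT PRIVILEGES FOR ROLE etl
    GRANT SELECT ON TABLES TO analysts;
```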
WALs are only retained for the last checkpoint
Postgres up to and including version 10 (and therefore Greenplum before version 7) retained write-ahead log segments covering the last two checkpoints for crash recovery. The rationale was to improve recovery reliability when the most recent checkpoint record could not be found in the write-ahead log. However, in some cases such recovery would still be impossible, for example because tablespace directories had been permanently deleted by the time of the last checkpoint. It was therefore decided to abandon this practice (4b0d28d), which also significantly reduces the disk space occupied by write-ahead logs.
Just-in-Time (JIT) compilation
In my opinion, another significant innovation for analytical workloads is Just-in-Time compilation of generic code paths into versions specialized for a particular query. Evaluating user-specified conditions and "deforming" rows from the on-disk representation into one suitable for further processing are operations performed for every retrieved row, and a self-respecting analytical query touches millions of them. As a result, the cost of compiling a specialized version instead of interpreting the generic one is repaid by the gain in execution speed. JIT is disabled by default; its state is controlled by the GUC of the same name (jit). The output of JIT statistics in EXPLAIN ANALYZE is gated by a separate configuration variable, gp_explain_jit. ORCA supports JIT; however, to accommodate the differences in this planner's cost model, duplicates of all the related configuration parameters were introduced with the optimizer_ prefix (for example, optimizer_jit_above_cost).
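A quick way to try this out in a session might look as follows (the table and predicate are invented for illustration; the GUC names are those mentioned above):

```sql
-- Enable JIT and its reporting for the current session.
SET jit = on;
SET gp_explain_jit = on;

-- ORCA uses its own cost-based trigger for JIT,
-- configured via the optimizer_-prefixed duplicate.
SET optimizer_jit_above_cost = 100000;

-- EXPLAIN ANALYZE will now include a JIT section with
-- compilation counts and timings.
EXPLAIN (ANALYZE) SELECT count(*) FROM big_table WHERE x > 42;
```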
Generated columns
Postgres 12 introduced standard-compliant generated column syntax. Currently, only stored generated columns are supported (fc22b66); support for virtual columns, computed during scanning, was planned for later. Unlike an attribute with a default value, a generated column cannot be assigned by the user, and its expression may reference other (non-generated) attributes of the row. A synthetic example of using this functionality is shown below.
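A possible version of such an example (the table and columns are invented for illustration):

```sql
-- total is computed on every write and cannot be assigned directly.
CREATE TABLE orders (
    id       int,
    price    numeric,
    quantity int,
    total    numeric GENERATED ALWAYS AS (price * quantity) STORED
) DISTRIBUTED BY (id);

INSERT INTO orders (id, price, quantity) VALUES (1, 9.99, 3);
-- total is filled in automatically;
-- an INSERT that lists total would be rejected.
```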
- Only immutable functions can be used in the expression (for example, converting a time instant to a string is not suitable, since that is a stable function).
- A generated column value is not exported or imported by COPY.
- In Greenplum, a generated column cannot be used as a distribution key, which makes sense: the value of such a column is calculated in the Modify Table node after the row has potentially been redirected to the segment that will store it.
SQL/JSON
With the Postgres 12 commits, support for the JSON path language became available, and the new functions and operators significantly extend the capabilities for processing JSON. More details can be found on the corresponding page of the official documentation. As an example, here is a simple query that extracts only those flights whose destination is the Murmansk airport.
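A sketch of such a query, assuming a flights table with a route jsonb column (the schema here is invented for illustration):

```sql
-- jsonb_path_exists() evaluates a JSON path expression against the document;
-- "MMK" stands in for the Murmansk airport code.
SELECT f.flight_no
FROM flights f
WHERE jsonb_path_exists(
    f.route,
    '$.destination ? (@.airport == "MMK")'
);
```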
Autovacuum and autoanalyze
Cleaning tables of deleted rows and keeping statistics up to date have always been a headache for Greenplum administrators: scheduled jobs that overflow the maintenance window, hand-picked criteria for when to run them, and so on. What has changed in the new version?
Unfortunately, automatic VACUUM for user tables did not make it. Autovacuum processes operate only on tables in the pg_catalog schema and, optionally, on the auxiliary tables of Append Optimized relations (gp_autovacuum_scope). Judging by the code, heap tables dangerously close to transaction ID wraparound should also be vacuumed, although I do not see this in the documentation. In addition, autovacuum processes operate independently on each segment; distributed VACUUM has been postponed by the VMware team indefinitely.
Automatic statistics collection, however, did make it. It is initiated only from the coordinator; statistics on segments, as before, are not used. The decision to analyze is made using the Postgres approach, and to support it, the reporting of statistics after DML operations from writer processes to the coordinator was improved.
A table is re-analyzed once the number of rows modified since the last ANALYZE exceeds anl_base_threshold + anl_scale_factor * reltuples, where anl_base_threshold is a base threshold (50 tuples by default) and anl_scale_factor is a fraction of the number of tuples stored in the table (10% by default). Specific values can be configured for each table individually.
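Assuming the standard Postgres per-table storage parameters apply here (worth verifying against the Greenplum 7 documentation), the per-table tuning could look like this:

```sql
-- Re-analyze big_table once 5% of its rows change instead of the default 10%.
ALTER TABLE big_table SET (autovacuum_analyze_scale_factor = 0.05);

-- For a small, frequently updated table, lower the base threshold instead.
ALTER TABLE small_table SET (autovacuum_analyze_threshold = 500);
```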
However, how this works in production remains to be assessed; the analyzedb utility has not disappeared from the distribution.
ANALYZE optimization for AO tables
Another issue related to statistics collection was ANALYZE performance for AO tables. The reason is the inability to use the approach taken for heap tables, where pages are sampled first (the scan_analyze_next_block function of the table access method) and then the tuples within them (scan_analyze_next_tuple), relying on a fixed page size. As a result, the table was scanned in full. This, in turn, led to attempts to track the number of updates to a table in external tools (analyzedb) in order to decide whether to collect statistics again.
One of our colleagues, after studying the issue, suggested reworking the approach by introducing virtual sampling blocks that assume a fixed number of rows in them. The performance gain should be achieved by skipping blocks of arbitrary length without wasting resources on decompression. However, the discussion about the patch has died down.
In 2023, the developers eventually rolled out their own patch (2939f9a), in which they abandoned two-level sampling, added an optimization for the case when the table has at least one index (and therefore a Block Directory), and achieved a 20-fold performance improvement. To implement this approach, the table access method interface was extended with the relation_acquire_sample_rows function.
External tables
Support for external table protocols survived the major version update, but working with them is now hidden behind the FDW API. For this purpose, a built-in "remote" server, gp_exttable_server, was implemented, which is responsible for working with the selected protocol. Full support for the CREATE EXTERNAL TABLE syntax is declared, but the statement is converted on the fly into a foreign table definition, and the query plan will show a Foreign Scan.
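For illustration, the classic syntax is still accepted (the host, port, and file names below are invented):

```sql
-- Classic external table definition over the gpfdist protocol...
CREATE EXTERNAL TABLE ext_events (id int, payload text)
    LOCATION ('gpfdist://etl-host:8081/events.csv')
    FORMAT 'CSV';

-- ...but under the hood it becomes a foreign table served by the built-in
-- gp_exttable_server, and EXPLAIN on a query against it shows a Foreign Scan.
```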
Resource groups
The resource limitation mechanism has been significantly redesigned. To begin with, the GUC gp_resource_manager now has four values: none (default), group, group-v2, and queue. And yes, the queues are very much alive.
The group-v2 mode implements support for cgroup v2, which is relevant for modern Linux distributions. The io controller available in this version made it possible to limit bandwidth when accessing disk devices (the IO_LIMIT resource group parameter: read/write, MB/s or IOPS), configurable individually per tablespace. Keep in mind that for writes this mechanism works well only for Append Optimized tables, since heap table pages are normally flushed to disk by a separate background process, the background writer, which is not subject to per-role restrictions.
However, this process is subject to the restrictions of the new built-in system_group resource group. The postmaster process and those of its child processes (for example, autovacuum workers) that have not been assigned another group or allowed to run without restrictions are initially placed in this group.
Limits are implemented with different controllers depending on the cgroup version. For example, to limit CPU, cgroup v1 uses cpu.cfs_quota_us in combination with cpu.shares, while v2 uses cpu.max and cpu.weight. The only remaining option is a hard limit, similar to the Greenplum 6 combination of CPU_RATE_LIMIT with the GUC gp_resource_group_cpu_ceiling_enforcement set to true; that GUC has been removed. CPU_RATE_LIMIT has been split into two separate options, CPU_MAX_PERCENT and CPU_WEIGHT, which control the cgroup parameters above.
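Putting the two CPU options together (the group name and values are invented for illustration; the cgroup mappings are those described above):

```sql
-- A hard ceiling of 30% CPU plus a relative weight for contention.
CREATE RESOURCE GROUP analytics WITH (
    concurrency = 10,
    cpu_max_percent = 30,  -- maps to cpu.max (v2) / cpu.cfs_quota_us (v1)
    cpu_weight = 200       -- maps to cpu.weight (v2) / cpu.shares (v1)
);
```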
Perhaps the most controversial changes occurred in the memory consumption limiting subsystem. The developers present this as a simplification of the configuration. In fact, MEMORY_LIMIT now only affects the calculation of statement_mem for queries running in the group, and a user can easily override it via the GUC gp_resgroup_memory_query_fixed_mem. Resource groups no longer control memory consumption during execution at all. Once memory consumption on a segment exceeds gp_vmem_protect_limit * runaway_detector_activation_percent, Greenplum begins to cancel queries starting with the heaviest one on the segment, not in the group as before. The MEMORY_SPILL_RATIO and MEMORY_SHARED_QUOTA options and the GUCs gp_resource_group_enable_recalculate_query_mem and gp_resource_group_memory_limit have been removed.
Summing up
As we have seen, Greenplum 7 brings many new capabilities for analytical workloads. For myself, I would highlight the modern partitioning scheme, BRIN indexes, JIT compilation, the expanded FDW capabilities, and numerous optimizations. On the other hand, some Postgres 12 functionality will not be available in its original form for objective reasons, and the ORCA planner will require further work to accommodate the new features. Radical changes always need a period of stabilization. I hope this article helped you answer the question of why we need a new Greenplum.