What’s new in Greenplum 7. Conclusion
- Migrate to partitioning from Postgres
- Row Level Security
- Default permissions for schema and role
- WALs are only retained for the last checkpoint
- Just-in-Time (JIT) compilation
- Generated columns
- SQL/JSON
- Autovacuum and autoanalyze
- ANALYZE optimization for AO tables
- External tables
- Resource groups
- Summing up
Hello everyone! Here is the final, third part of the overview of changes in Greenplum 7. If for some reason you missed the previous publications, you can find them in our blog: Part 1 and Part 2. Today we will talk about changes in table partitioning and resource groups, JIT compilation of expressions, and much more. Let's get started!
Migrate to partitioning from Postgres
Previous versions of Greenplum used their own partitioning scheme, implemented at a time when Postgres had no such support. Later, Postgres 10 introduced its own native partitioning (f0e4475).
Greenplum 7 has migrated completely to the Postgres partitioning infrastructure. This means that many familiar tools have been removed and are no longer available (for example, the pg_partitions view, the pg_partition and pg_partition_rule catalog relations, and so on). The table definition syntax familiar from the Postgres documentation is now the primary one and is the one used when dumping a schema.
However, support for the classic syntax familiar from previous versions of Greenplum has been preserved as much as possible, except in cases where the expressions it defines have no analogues in the Postgres variant.
For example, range partitions now always include the lower bound and exclude the upper one. Using the classic syntax you can still define the other variant, but only for data types that allow a non-strict bound to be rewritten as a strict one (for an integer column, END (10) INCLUSIVE can become END (11) EXCLUSIVE).
The new features are only available with the Postgres syntax: splitting a table into ranges over several columns, partitioning by the remainder of a hash value, and heterogeneous multi-level partitioning. Another caveat is that only the Postgres-derived planner supports them; ORCA does not.
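As a sketch of the new syntax, here is a range partitioning declaration over two columns (the table and column names are invented for illustration):

```sql
-- Postgres-style declarative partitioning in Greenplum 7:
-- a multi-column range partition key.
CREATE TABLE sales (
    sale_date date,
    region_id int,
    amount    numeric
) DISTRIBUTED BY (region_id)
  PARTITION BY RANGE (sale_date, region_id);

-- Partitions are created as separate tables and attached explicitly.
-- Note the bound semantics: lower bound inclusive, upper bound exclusive.
CREATE TABLE sales_2024_h1 PARTITION OF sales
    FOR VALUES FROM ('2024-01-01', 0) TO ('2024-07-01', 0);
```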
Row Level Security
Starting with Postgres 9.5, in addition to the standard access model at the level of table operation types, it is possible to define data access policies within a table. Each policy is specified using a condition that defines the set of rows the associated role can read or modify. In the example below, we create a policy for the alice role that grants access only to rows containing an even value in the b column.
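A minimal sketch of such a policy (the table definition here is assumed; the role and column come from the description above):

```sql
-- Illustrative table; RLS must be enabled explicitly.
CREATE TABLE t (a int, b int) DISTRIBUTED BY (a);
ALTER TABLE t ENABLE ROW LEVEL SECURITY;

-- alice can only read rows with an even b (USING),
-- and only insert rows that satisfy the same condition (WITH CHECK).
CREATE POLICY even_b ON t
    FOR ALL
    TO alice
    USING (b % 2 = 0)
    WITH CHECK (b % 2 = 0);

GRANT SELECT, INSERT ON t TO alice;
```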
The expression specified in the policy definition is used to filter rows during scanning or to check for a match during insertion.
Because the implementation of this functionality is abstracted from the storage format, policies are also available for AO tables. On the other hand, at the time of writing, the ORCA planner does not support queries against tables with active access policies.
Default permissions for schema and role
Continuing the topic of access rights, I would like to note the ability, introduced in Postgres 10, to set default permissions for objects that will be created later within a schema. Alternatively, default permissions can be set for objects that will be created by a specific role.
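Both variants can be sketched as follows (the schema and role names are invented for illustration):

```sql
-- Every table created later in schema reports will be readable by analysts.
ALTER DEFAULT PRIVILEGES IN SCHEMA reports
    GRANT SELECT ON TABLES TO analysts;

-- Every table the etl role creates (in any schema) will be readable by analysts.
ALTER DEFAULT PRIVILEGES FOR ROLE etl
    GRANT SELECT ON TABLES TO analysts;
```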
WALs are only retained for the last checkpoint
Postgres up to and including version 10 (and therefore Greenplum before version 7) retained write-ahead log segments covering the last two checkpoints for crash recovery. The rationale was to improve recovery reliability when the most recent checkpoint record could not be found in the write-ahead log. However, in some cases such recovery would still be impossible, for example because tablespace directories had been permanently deleted by the time of the last checkpoint. It was therefore decided to abandon this practice (4b0d28d), which also significantly reduces the disk space occupied by write-ahead logs.
Just-in-Time (JIT) compilation
In my opinion, another significant innovation for analytical workloads is Just-in-Time compilation of generic code paths into versions specialized for a particular query. Evaluating user-specified conditions and "deforming" rows from the on-disk representation into one suitable for further processing are operations performed for every retrieved row, and a self-respecting analytical query touches millions of them. As a result, the cost of compiling a specialized version instead of interpreting the generic one is repaid by the gain in execution speed. JIT is disabled by default; its state is controlled by the GUC of the same name (jit). The output of JIT statistics in EXPLAIN ANALYZE is gated by a separate configuration variable, gp_explain_jit. ORCA supports JIT; however, to accommodate the differences in this planner's cost model, duplicates of all the related configuration parameters were introduced with the optimizer_ prefix (for example, optimizer_jit_above_cost).
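A quick way to try this out in a session might look as follows (the table and predicate are invented for illustration; the GUC names are those mentioned above):

```sql
-- Enable JIT and its reporting for the current session.
SET jit = on;
SET gp_explain_jit = on;

-- ORCA uses its own cost-based trigger for JIT,
-- configured via the optimizer_-prefixed duplicate.
SET optimizer_jit_above_cost = 100000;

-- EXPLAIN ANALYZE will now include a JIT section with
-- compilation counts and timings.
EXPLAIN (ANALYZE) SELECT count(*) FROM big_table WHERE x > 42;
```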
Generated columns
Postgres 12 introduced standard-compliant generated column syntax. Currently, only stored generated columns are supported (fc22b66); support for virtual columns, computed during scanning, was planned for later. Unlike an attribute with a default value, a generated column cannot be assigned by the user, and its expression may reference other (non-generated) attributes of the row. A synthetic example of using this functionality is shown below.
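A possible version of such an example (the table and columns are invented for illustration):

```sql
-- total is computed on every write and cannot be assigned directly.
CREATE TABLE orders (
    id       int,
    price    numeric,
    quantity int,
    total    numeric GENERATED ALWAYS AS (price * quantity) STORED
) DISTRIBUTED BY (id);

INSERT INTO orders (id, price, quantity) VALUES (1, 9.99, 3);
-- total is filled in automatically;
-- an INSERT that lists total would be rejected.
```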
- Only immutable functions can be used in the expression (for example, converting a time instant to a string is not suitable, since that is a stable function).
- A generated column value is not exported or imported by COPY.
- In Greenplum, a generated column cannot be used as a distribution key, which makes sense: the value of such a column is calculated in the Modify Table node after the row has potentially been redirected to the segment that will store it.
SQL/JSON
With the Postgres 12 commits, support for the JSON path language became available, and the new functions and operators significantly extend the capabilities for processing JSON. More details can be found on the corresponding page of the official documentation. As an example, here is a simple query that extracts only those flights whose destination is the Murmansk airport.
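A sketch of such a query, assuming a flights table with a route jsonb column (the schema here is invented for illustration):

```sql
-- jsonb_path_exists() evaluates a JSON path expression against the document;
-- "MMK" stands in for the Murmansk airport code.
SELECT f.flight_no
FROM flights f
WHERE jsonb_path_exists(
    f.route,
    '$.destination ? (@.airport == "MMK")'
);
```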
Autovacuum and autoanalyze
Cleaning tables of deleted rows and keeping statistics up to date have always been a headache for Greenplum administrators: scheduled jobs that overflow the maintenance window, hand-picked criteria for when to run them, and so on. What has changed in the new version?
Unfortunately, automatic VACUUM for user tables did not make it. Autovacuum processes operate only on tables in the pg_catalog schema and, optionally, on the auxiliary tables of Append Optimized relations (gp_autovacuum_scope). Judging by the code, heap tables dangerously close to transaction ID wraparound should also be vacuumed, although I do not see this in the documentation. In addition, autovacuum processes operate independently on each segment; distributed VACUUM has been postponed by the VMware team indefinitely.
Automatic statistics collection, however, did make it. It is initiated only from the coordinator; statistics on segments, as before, are not used. The decision to analyze is made using the Postgres approach, and to support it, the reporting of statistics after DML operations from writer processes to the coordinator was improved.
A table is re-analyzed once the number of rows modified since the last ANALYZE exceeds anl_base_threshold + anl_scale_factor * reltuples, where anl_base_threshold is a base threshold (50 tuples by default) and anl_scale_factor is a fraction of the number of tuples stored in the table (10% by default). Specific values can be configured for each table individually.
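Assuming the standard Postgres per-table storage parameters apply here (worth verifying against the Greenplum 7 documentation), the per-table tuning could look like this:

```sql
-- Re-analyze big_table once 5% of its rows change instead of the default 10%.
ALTER TABLE big_table SET (autovacuum_analyze_scale_factor = 0.05);

-- For a small, frequently updated table, lower the base threshold instead.
ALTER TABLE small_table SET (autovacuum_analyze_threshold = 500);
```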
However, how this works in production remains to be assessed; the analyzedb utility has not disappeared from the distribution.
ANALYZE optimization for AO tables
Another issue related to statistics collection was ANALYZE performance for AO tables. The reason is the inability to use the approach taken for heap tables, where pages are sampled first (the scan_analyze_next_block function of the table access method) and then the tuples within them (scan_analyze_next_tuple), relying on a fixed page size. As a result, the table was scanned in full. This, in turn, led to attempts to track the number of updates to a table in external tools (analyzedb) in order to decide whether to collect statistics again.
One of our colleagues, after studying the issue, suggested reworking the approach by introducing virtual sampling blocks that assume a fixed number of rows in them. The performance gain should be achieved by skipping blocks of arbitrary length without wasting resources on decompression. However, the discussion about the patch has died down.
In 2023, the developers eventually rolled out their own patch (2939f9a), in which they abandoned two-level sampling, added an optimization for the case when the table has at least one index (and therefore a Block Directory), and achieved a 20-fold performance improvement. To implement this approach, the table access method interface was extended with the relation_acquire_sample_rows function.
External tables
Support for external table protocols survived the major version update, but working with them is now hidden behind the FDW API. For this purpose, a built-in "remote" server, gp_exttable_server, was implemented, which is responsible for working with the selected protocol. Full support for the CREATE EXTERNAL TABLE syntax is declared, but the statement is converted on the fly into a foreign table definition, and the query plan will show a Foreign Scan.
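For illustration, the classic syntax is still accepted (the host, port, and file names below are invented):

```sql
-- Classic external table definition over the gpfdist protocol...
CREATE EXTERNAL TABLE ext_events (id int, payload text)
    LOCATION ('gpfdist://etl-host:8081/events.csv')
    FORMAT 'CSV';

-- ...but under the hood it becomes a foreign table served by the built-in
-- gp_exttable_server, and EXPLAIN on a query against it shows a Foreign Scan.
```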
Resource groups
The resource limitation mechanism has been significantly redesigned. To begin with, the GUC gp_resource_manager now has four values: none (default), group, group-v2, and queue. And yes, the queues are very much alive.
The group-v2 mode implements support for cgroup v2, which is relevant for modern Linux distributions. The io controller available in this version made it possible to limit bandwidth when accessing disk devices (the IO_LIMIT resource group parameter: read/write, MB/s or IOPS), configurable individually per tablespace. Keep in mind that for writes this mechanism works well only for Append Optimized tables, since heap table pages are normally flushed to disk by a separate background process, the background writer, which is not subject to per-role restrictions.
However, this process is subject to the restrictions of the new built-in system_group resource group. The postmaster process and those of its child processes (for example, autovacuum workers) that have not been assigned another group or allowed to run without restrictions are initially placed in this group.
Limits are implemented with different controllers depending on the cgroup version. For example, to limit CPU, cgroup v1 uses cpu.cfs_quota_us in combination with cpu.shares, while v2 uses cpu.max and cpu.weight. The only remaining option is a hard limit, similar to the Greenplum 6 combination of CPU_RATE_LIMIT with the GUC gp_resource_group_cpu_ceiling_enforcement set to true; that GUC has been removed. CPU_RATE_LIMIT has been split into two separate options, CPU_MAX_PERCENT and CPU_WEIGHT, which control the cgroup parameters above.
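Putting the two CPU options together (the group name and values are invented for illustration; the cgroup mappings are those described above):

```sql
-- A hard ceiling of 30% CPU plus a relative weight for contention.
CREATE RESOURCE GROUP analytics WITH (
    concurrency = 10,
    cpu_max_percent = 30,  -- maps to cpu.max (v2) / cpu.cfs_quota_us (v1)
    cpu_weight = 200       -- maps to cpu.weight (v2) / cpu.shares (v1)
);
```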
Perhaps the most controversial changes occurred in the memory consumption limiting subsystem. The developers present this as a simplification of the configuration. In fact, MEMORY_LIMIT now only affects the calculation of statement_mem for queries running in the group, and a user can easily override it via the GUC gp_resgroup_memory_query_fixed_mem. Resource groups no longer control memory consumption during execution at all. Once memory consumption on a segment exceeds gp_vmem_protect_limit * runaway_detector_activation_percent, Greenplum begins to cancel queries starting with the heaviest one on the segment, not in the group as before. The MEMORY_SPILL_RATIO and MEMORY_SHARED_QUOTA options and the GUCs gp_resource_group_enable_recalculate_query_mem and gp_resource_group_memory_limit have been removed.
Summing up
As we have seen, Greenplum 7 brings many new capabilities for analytical workloads. For myself, I would highlight the modern partitioning scheme, BRIN indexes, JIT compilation, the expanded FDW capabilities, and numerous optimizations. On the other hand, some Postgres 12 functionality will not be available in its original form for objective reasons, and the ORCA planner will require further work to accommodate the new features. Radical changes always need a period of stabilization. I hope this article helped you answer the question of why we need a new Greenplum.