Manage fault-tolerant execution in Trino

By default, when a Trino node runs out of resources while executing a query or fails due to some other reason, the entire query fails and has to be re-submitted manually. Typically, the longer a query execution lasts, the more likely it is to fail.

The Trino service provides a fault-tolerant execution (FTE) mechanism that mitigates query failures by retrying failed queries and their subtasks. If FTE is enabled and a query fails, Trino persists intermediate data in a file system, allowing other Trino workers to perform a retry using this data.

NOTE
  • The support for FTE is connector-dependent. The list of Trino connectors that support FTE is available in the Trino documentation.

  • The Trino FTE mechanism does not apply to broken queries or other user errors. For example, Trino does not retry a query that has failed due to invalid SQL syntax or incompatible column types.

Enable FTE

By default, the FTE mechanism is disabled for the Trino service. You can enable FTE using ADCM (Clusters → <clusterName> → Services → Trino → Primary Configuration → Fault-tolerant execution). Particularly, the retry-policy parameter should be specified. The parameter accepts the following values:

  • NONE — disables FTE (default).

  • QUERY — if a Trino worker fails, Trino retries the entire query. This policy is recommended when the majority of Trino requests are small queries.

  • TASK — if a Trino worker fails, Trino retries individual query tasks. This policy is best suited for retrying large batch queries, however, it can result in higher latency for big number of short-running queries.

    NOTE
    Selecting the TASK retry policy automatically adjusts some relevant Trino configurations to follow best practices for a fault-tolerant Trino cluster. For the TASK retry policy, it is strongly recommended to set the task.low-memory-killer.policy=total-reservation-on-blocked-nodes parameter; otherwise, queries may need to be killed manually if Trino worker components run out of memory.

Exchange manager and exchange data

In Trino, Exchange manager is a component responsible for transferring exchange data between Trino workers while executing a distributed query. When FTE is enabled, this component becomes responsible for spooling exchange data to a file system like HDFS, Ozone, etc. When activating FTE, you can select an Exchange manager implementation to save intermediate query data in a specific file system.

FTE settings

The Trino service provides several settings groups in ADCM, which are used for tuning the FTE behavior. These are:

  • Fault-tolerant execution. The settings define the retry behavior, including retry policies, intervals, encryption, etc.

  • Exchange manager settings. The settings are used to control the intermediate spooling data generated by an Exchange manager (storage type and location, permissions, buffers, etc.).

Spool data to HDFS, Ozone, and local FS

By default, the Trino service provides two Exchange manager implementations that can spool data to HDFS, Ozone, and to a local file system.

When you enable FTE, the Trino service automatically detects whether HDFS or Ozone is installed in the ADH cluster, and updates all the required settings to work with the current storage type.

To store exchange data to HDFS, set the exchange-manager.name=hdfs parameter and specify the path for storing spooling data by using exchange.base-directories with the hdfs:/// schema in the path.

To store exchange data in Ozone, set the exchange-manager.name=hdfs parameter and specify the storage path using the ofs:/// schema.

NOTE
Specifying both HDFS and Ozone paths in exchange.base-directories is not allowed.

To store exchange data in the local file system, set the exchange-manager.name=local parameter and specify the path for storing spooling data using exchange.base-directories (use the file:/// scheme).

The Trino service can automatically generate directories for spooling data. Using FTE settings, you can specify the location, directory permissions, etc.

Found a mistake? Seleсt text and press Ctrl+Enter to report it