Run Flink on YARN

This section provides an example of running Flink on a YARN cluster. The example runs a Flink application in application mode, which is the recommended mode for production use because it provides better isolation between applications. In this mode, the main() method of the Flink application JAR is executed on the cluster, in a YARN container, rather than on the client.

To run Flink on YARN, follow the steps below:

  1. On the cluster node with a Flink client installed, open a new bash session under the flink user.

    $ sudo -u flink bash
  2. In the opened bash session, source the Flink environment script to load its settings into the session:

    $ . /etc/flink/conf/flink-env.sh
  3. Run the Flink application with the following command:

    $ flink run-application (1)
    -t yarn-application (2)
    -Dsecurity.kerberos.token.provider.hadoopfs.renewer=yarn (3)
    /usr/lib/flink/examples/batch/WordCount.jar
    1 Runs Flink in the application mode.
    2 Sets YARN as the deployment target for the given application.
    3 Sets yarn as the user that renews delegation tokens (the default user is flink).
    NOTE
    The -Dsecurity.kerberos.token.provider.hadoopfs.renewer=yarn flag is mandatory if your ADH cluster is kerberized.

    The output should look similar to the following:

    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/usr/lib/flink/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-reload4j-1.7.35.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
    2023-12-21 13:44:07,573 INFO  org.apache.hadoop.yarn.client.RMProxy                        [] - Connecting to ResourceManager at ka-adh-3.ru-central1.internal/10.92.40.84:8032
    2023-12-21 13:44:07,744 INFO  org.apache.hadoop.yarn.client.AHSProxy                       [] - Connecting to Application History server at ka-adh-2.ru-central1.internal/10.92.40.180:10200
    2023-12-21 13:44:07,751 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
    2023-12-21 13:44:07,913 INFO  org.apache.hadoop.conf.Configuration                         [] - resource-types.xml not found
    2023-12-21 13:44:07,913 INFO  org.apache.hadoop.yarn.util.resource.ResourceUtils           [] - Unable to find 'resource-types.xml'.
    2023-12-21 13:44:07,960 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - Cluster specification: ClusterSpecification{masterMemoryMB=2048, taskManagerMemoryMB=2048, slotsPerTaskManager=1}
    2023-12-21 13:44:08,413 WARN  org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory      [] - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
    2023-12-21 13:44:08,803 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - Cannot use kerberos delegation token manager, no valid kerberos credentials provided.
    2023-12-21 13:44:08,809 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - Submitting application master application_1702943615795_0009
    2023-12-21 13:44:08,841 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl        [] - Submitted application application_1702943615795_0009
    2023-12-21 13:44:08,841 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - Waiting for the cluster to be allocated
    2023-12-21 13:44:08,842 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - Deploying cluster, current state ACCEPTED
    2023-12-21 13:44:13,620 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - YARN application has been deployed successfully.
    2023-12-21 13:44:13,621 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - Found Web Interface ka-adh-2.ru-central1.internal:43046 of application 'application_1702943615795_0009'.
  4. Open the YARN web interface. The web UI link can be found in ADCM (Clusters → <ADH_cluster> → Services → YARN → Info). The submitted application with the SUCCEEDED status is available in the list as shown below.

YARN web interface
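
As a complement to the steps above, the following shell sketch shows one way to assemble the command from step 3 and to extract the YARN application ID from the Flink client output, so that the result can also be checked from the command line. The function names build_flink_cmd and extract_app_id are hypothetical helpers introduced for illustration and are not part of Flink or ADH. The sketch assumes a POSIX-compatible shell and a grep that supports the -o option.

```shell
# Illustrative helpers (hypothetical names, not part of Flink or ADH).

# Prints the arguments of the run-application command, one per line.
#   $1: path to the Flink application JAR
#   $2: "yes" if the cluster is kerberized (adds the renewer option);
#       any other value, or no value, omits the option
# The body runs in a subshell so that no variables leak into the caller.
build_flink_cmd() (
  jar="$1"
  kerberized="${2:-no}"
  set -- flink run-application -t yarn-application
  if [ "$kerberized" = "yes" ]; then
    # Option taken from step 3: use the yarn user to renew delegation tokens.
    set -- "$@" -Dsecurity.kerberos.token.provider.hadoopfs.renewer=yarn
  fi
  set -- "$@" "$jar"
  printf '%s\n' "$@"
)

# Reads Flink client output on stdin and prints the first
# YARN application ID found in it (application_<digits>_<digits>).
extract_app_id() {
  grep -oE 'application_[0-9]+_[0-9]+' | head -n 1
}
```

For example, build_flink_cmd /usr/lib/flink/examples/batch/WordCount.jar yes prints the arguments of the command from step 3, one per line, which is convenient for reviewing the command before running it. Piping saved client output into extract_app_id yields an ID such as application_1702943615795_0009, which can then be passed to yarn application -status to check the application state without opening the web UI.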