Run Flink on YARN
This section provides an example of running Flink on a YARN cluster.
The example runs a Flink application in application mode, which is the recommended mode for production use because it provides better isolation between applications.
In this mode, the main() method of the Flink application JAR is executed on the cluster, in a YARN container (on the JobManager), rather than on the client.
To run Flink on YARN, follow the steps below:
-
On the cluster node with a Flink client installed, open a new bash session under the flink user:
$ sudo -u flink bash
-
In the opened bash session, run the command:
$ . /etc/flink/conf/flink-env.sh
-
Run the Flink application with the following command:
$ flink run-application -t yarn-application -Dsecurity.kerberos.token.provider.hadoopfs.renewer=yarn /usr/lib/flink/examples/batch/WordCount.jar
The parts of the command are as follows:
run-application: runs Flink in application mode.
-t yarn-application: sets YARN as the deployment target for the given application.
-Dsecurity.kerberos.token.provider.hadoopfs.renewer=yarn: forces the use of the yarn user to renew delegation tokens (the default user is flink).
NOTE
The -Dsecurity.kerberos.token.provider.hadoopfs.renewer=yarn flag is mandatory if your ADH cluster is kerberized.
The output should look similar to the following:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/flink/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-reload4j-1.7.35.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2023-12-21 13:44:07,573 INFO org.apache.hadoop.yarn.client.RMProxy [] - Connecting to ResourceManager at ka-adh-3.ru-central1.internal/10.92.40.84:8032
2023-12-21 13:44:07,744 INFO org.apache.hadoop.yarn.client.AHSProxy [] - Connecting to Application History server at ka-adh-2.ru-central1.internal/10.92.40.180:10200
2023-12-21 13:44:07,751 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2023-12-21 13:44:07,913 INFO org.apache.hadoop.conf.Configuration [] - resource-types.xml not found
2023-12-21 13:44:07,913 INFO org.apache.hadoop.yarn.util.resource.ResourceUtils [] - Unable to find 'resource-types.xml'.
2023-12-21 13:44:07,960 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Cluster specification: ClusterSpecification{masterMemoryMB=2048, taskManagerMemoryMB=2048, slotsPerTaskManager=1}
2023-12-21 13:44:08,413 WARN org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory [] - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
2023-12-21 13:44:08,803 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Cannot use kerberos delegation token manager, no valid kerberos credentials provided.
2023-12-21 13:44:08,809 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Submitting application master application_1702943615795_0009
2023-12-21 13:44:08,841 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl [] - Submitted application application_1702943615795_0009
2023-12-21 13:44:08,841 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Waiting for the cluster to be allocated
2023-12-21 13:44:08,842 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Deploying cluster, current state ACCEPTED
2023-12-21 13:44:13,620 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - YARN application has been deployed successfully.
2023-12-21 13:44:13,621 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Found Web Interface ka-adh-2.ru-central1.internal:43046 of application 'application_1702943615795_0009'.
-
Open the YARN web interface. The web UI link can be found in ADCM (Clusters → <ADH_cluster> → Services → YARN → Info). Once the job finishes, the submitted application appears in the application list with the SUCCEEDED status.
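As an alternative to the web UI, the application state can also be checked from the command line. The following is a minimal sketch, not part of the original procedure: it pulls the YARN application ID out of a submission log line and, if a yarn client is available, passes it to yarn application -status. The extract_app_id helper is a hypothetical name introduced here for illustration, and the sample line is copied from the output shown in step 3.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of Flink or YARN): reads log text from stdin
# and prints the first YARN application ID it finds, or nothing if there is none.
extract_app_id() {
  grep -oE 'application_[0-9]+_[0-9]+' | head -n 1
}

# Sample line taken from the submission output shown in step 3.
sample='2023-12-21 13:44:08,841 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl [] - Submitted application application_1702943615795_0009'

app_id=$(printf '%s\n' "$sample" | extract_app_id)
echo "$app_id"

# Query the application report only when a yarn client is installed on this host.
if command -v yarn >/dev/null 2>&1; then
  yarn application -status "$app_id"
fi
```

On a real cluster, the sample line would be replaced by the saved output of the flink run-application command, for example by piping a log file into extract_app_id. This assumes the client log contains the application ID in the same form as in the output above.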