Logging in Airflow
Overview
Airflow writes text logs that can be used to analyze errors occurring while DAGs run. These logs are located in the /var/log/airflow/ directory on the Airflow server's host, but they are also available in the Airflow UI.
To view Airflow logs on the host:
1. Connect to the Airflow server via SSH and run the following command:

$ ls -la /var/log/airflow/

The output looks similar to this:

total 12
drwxr-xr-x.  5 airflow airflow   77 Jul 15 14:18 .
drwxr-xr-x. 16 root    root    4096 Aug  5 08:44 ..
drwxr-xr-x. 13 airflow airflow 4096 Jul 30 10:12 dag_id=adcm_check
drwxr-xr-x.  2 airflow airflow  109 Jul 29 23:54 dag_processor_manager
drwxr-xr-x. 20 airflow airflow 4096 Aug  5 07:56 scheduler
2. View the desired log file:

$ cat /var/log/airflow/dag_id=adcm_check/run_id=manual__2024-07-15T14:18:43.743847+00:00/task_id=runme_1/attempt=1.log

The output is similar to this:

[2024-07-15T14:18:46.143+0000] {taskinstance.py:1103} INFO - Dependencies all met for dep_context=requeueable deps ti=<TaskInstance: adcm_check.runme_1 manual__2024-07-15T14:18:43.743847+00:00 [queued]>
[2024-07-15T14:18:46.144+0000] {taskinstance.py:1308} INFO - Starting attempt 1 of 4
[2024-07-15T14:18:46.157+0000] {taskinstance.py:1327} INFO - Executing <Task(BashOperator): runme_1> on 2024-07-15 14:18:43.743847+00:00
[2024-07-15T14:18:46.164+0000] {standard_task_runner.py:57} INFO - Started process 7082 to run task
[2024-07-15T14:18:46.168+0000] {standard_task_runner.py:84} INFO - Running: ['airflow', 'tasks', 'run', 'adcm_check', 'runme_1', 'manual__2024-07-15T14:18:43.743847+00:00', '--job-id', '12', '--raw', '--subdir', 'DAGS_FOLDER/adcm_check.py', '--cfg-path', '/tmp/tmpr_3qshf4']
[2024-07-15T14:18:46.171+0000] {standard_task_runner.py:85} INFO - Job 12: Subtask runme_1
[2024-07-15T14:18:46.233+0000] {task_command.py:410} INFO - Running <TaskInstance: adcm_check.runme_1 manual__2024-07-15T14:18:43.743847+00:00 [running]> on host elenas-adh3.ru-central1.internal
[2024-07-15T14:18:46.326+0000] {taskinstance.py:1545} INFO - Exporting env vars: AIRFLOW_CTX_DAG_OWNER='airflow' AIRFLOW_CTX_DAG_ID='adcm_check' AIRFLOW_CTX_TASK_ID='runme_1' AIRFLOW_CTX_EXECUTION_DATE='2024-07-15T14:18:43.743847+00:00' AIRFLOW_CTX_TRY_NUMBER='1' AIRFLOW_CTX_DAG_RUN_ID='manual__2024-07-15T14:18:43.743847+00:00'
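If a task is still running, its attempt file keeps growing as the task writes to it, so you can follow the log live with tail. A minimal sketch, reusing the example task from above:

$ tail -f /var/log/airflow/dag_id=adcm_check/run_id=manual__2024-07-15T14:18:43.743847+00:00/task_id=runme_1/attempt=1.log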
Grep logs
You can search the logs for specific information, such as error messages. To do this, connect to the host with the logs you want to inspect and use the grep command.
For example:
$ cat /var/log/airflow/dag_processor_manager/dag_processor_manager.log | grep -i -A3 -B1 error
This command searches for messages containing the word error in the DAG processor manager log. The -i option makes the search case-insensitive. The -B1 and -A3 options expand the output to include one line before and three lines after each matching line.
Example output:
File Path                                                                                           PID    Runtime    # DAGs    # Errors    Last Runtime    Last Run
-------------------------------------------------------------------------------------------------  -----  ---------  --------  ----------  --------------  -------------------
/opt/airflow/lib/python3.10/site-packages/airflow/example_dags/example_setup_teardown.py                                 0         0           0.03s           2024-08-05T10:38:25
/opt/airflow/lib/python3.10/site-packages/airflow/example_dags/example_setup_teardown_taskflow.py                        0         0           0.03s           2024-08-05T10:38:25
--
...
File Path                                                                                           PID    Runtime    # DAGs    # Errors    Last Runtime    Last Run
-------------------------------------------------------------------------------------------------  -----  ---------  --------  ----------  --------------  -------------------
/opt/airflow/lib/python3.10/site-packages/airflow/example_dags/example_setup_teardown.py                                 0         0           0.03s           2024-08-05T10:38:55
/opt/airflow/lib/python3.10/site-packages/airflow/example_dags/example_setup_teardown_taskflow.py                        0         0           0.03s           2024-08-05T10:38:55
--
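grep can also search a whole log directory recursively, which is useful when you do not know which task attempt failed. A sketch reusing the example DAG from above (adjust the path to your DAG):

$ grep -riE -B1 -A3 'error|exception' /var/log/airflow/dag_id=adcm_check/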
Logging levels
Airflow uses the standard Python logging framework and supports the following log levels (from least to most informative):
- CRITICAL — reports a serious error, indicating that the service itself may be unable to continue running.
- FATAL — indicates that an operation can't continue execution and will terminate.
- ERROR — notifies that a program is not working correctly or has stopped.
- WARN — warns about potential problems. This doesn't mean that the service is not working, but it raises a concern.
- INFO — informs about the program lifecycle or state.
- DEBUG — prints debugging information about the internal states of the program.
Enabling one level of logging enables that level and all levels above it. For example, if you set the logging level to ERROR, only ERROR, FATAL, and CRITICAL messages get into the log files, while INFO, DEBUG, and WARN messages do not.
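Note that log files use Python's full level names (for example, WARNING rather than WARN). To see which levels actually made it into a given log file, you can count the level markers, as in this sketch reusing the example task log from above:

$ grep -oE 'DEBUG|INFO|WARNING|ERROR|CRITICAL' /var/log/airflow/dag_id=adcm_check/run_id=manual__2024-07-15T14:18:43.743847+00:00/task_id=runme_1/attempt=1.log | sort | uniq -c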
Logging configuration
To change logging properties via ADCM:
1. On the Clusters page, select the desired cluster.
2. Go to the Services tab and click Airflow2.
3. Select the required parameter and make the necessary changes.
4. Confirm the changes by clicking Save.
5. In the Actions drop-down menu, select Restart, make sure the Apply configs from ADCM option is set to true, and click Run.
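After the restart, you can check on the host that the new values reached the generated Airflow configuration file. A minimal sketch, assuming airflow.cfg lives under AIRFLOW_HOME (the exact path may differ in your installation):

$ grep -E '^(logging_level|fab_logging_level)' "$AIRFLOW_HOME"/airflow.cfg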
Airflow logging parameters:
- Logging level — the logging level for the Airflow service.
- Logging level for Flask-appbuilder UI — the logging level for Flask-appbuilder.
- cfg_properties_template — the Airflow configuration file template that contains logging settings for tasks:
[logging]
base_log_folder = /var/log/airflow
remote_logging = False
remote_log_conn_id =
google_key_path =
remote_base_log_folder =
encrypt_s3_logs = False
{% endraw %}
logging_level = {{ services.airflow2.config.airflow_cfg.logging_level }}
{% raw -%}
celery_logging_level =
{% endraw %}
fab_logging_level = {{ services.airflow2.config.airflow_cfg.fab_logging_level }}
{% raw -%}
logging_config_class =
colored_console_log = True
colored_log_format = [%%(blue)s%%(asctime)s%%(reset)s] {%%(blue)s%%(filename)s:%%(reset)s%%(lineno)d} %%(log_color)s%%(levelname)s%%(reset)s - %%(log_color)s%%(message)s%%(reset)s
colored_formatter_class = airflow.utils.log.colored_log.CustomTTYColoredFormatter
log_format = [%%(asctime)s] {%%(filename)s:%%(lineno)d} %%(levelname)s - %%(message)s
simple_log_format = %%(asctime)s %%(levelname)s - %%(message)s
task_log_prefix_template =
log_filename_template = dag_id={{ ti.dag_id }}/run_id={{ ti.run_id }}/task_id={{ ti.task_id }}/{%% if ti.map_index >= 0 %%}map_index={{ ti.map_index }}/{%% endif %%}attempt={{ try_number }}.log
log_processor_filename_template = {{ filename }}.log
dag_processor_manager_log_location = /var/log/airflow/dag_processor_manager/dag_processor_manager.log
task_log_reader = task
extra_logger_names =
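To verify the effective logging settings after the configuration is applied, you can query them with the Airflow CLI (the airflow config get-value command is available in Airflow 2.1 and later):

$ airflow config get-value logging logging_level
$ airflow config get-value logging base_log_folder

The commands print the current values of logging_level and base_log_folder, respectively.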