Logging in Airflow

Overview

Airflow writes text logs that help you analyze errors that occur while running DAGs. These logs are stored in the /var/log/airflow/ directory on the Airflow server host and are also available in the Airflow UI.

Log paths

The full path to each DAG log file looks like this:

/var/log/airflow/dag_id=<DAG_ID>/run_id=<DAG_Run_ID>/task_id=<Task_ID>/attempt=<try_number>.log

where:

  • <DAG_ID> is the DAG identifier.

  • <DAG_Run_ID> is the identifier of the DAG run, which combines the run type and a timestamp. For example, run_id=scheduled__2024-07-14T14:18:33.254657+00:00.

  • <Task_ID> is the task identifier.

  • <try_number> is the number of the task attempt (starts at 1).

The DAG processor manager log files are located in the /var/log/airflow/dag_processor_manager/ directory.

The scheduler log files are located in the /var/log/airflow/scheduler/ directory.
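
To list all log files that exist for a particular DAG, you can use find. A minimal sketch, assuming the adcm_check DAG from the examples below:

$ find /var/log/airflow/dag_id=adcm_check -name '*.log' | sort

Each attempt=<try_number>.log file in the output corresponds to one execution attempt of a task.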

To view Airflow logs on the host:

  1. Connect to the Airflow server via SSH and run the following command:

    $ ls -la /var/log/airflow/

    The output looks similar to this:

    total 12
    drwxr-xr-x.  5 airflow airflow   77 Jul 15 14:18 .
    drwxr-xr-x. 16 root    root    4096 Aug  5 08:44 ..
    drwxr-xr-x. 13 airflow airflow 4096 Jul 30 10:12 dag_id=adcm_check
    drwxr-xr-x.  2 airflow airflow  109 Jul 29 23:54 dag_processor_manager
    drwxr-xr-x. 20 airflow airflow 4096 Aug  5 07:56 scheduler
  2. View the desired log file:

    $ cat /var/log/airflow/dag_id=adcm_check/run_id=manual__2024-07-15T14:18:43.743847+00:00/task_id=runme_1/attempt=1.log

    The output is similar to this:

    [2024-07-15T14:18:46.143+0000] {taskinstance.py:1103} INFO - Dependencies all met for dep_context=requeueable deps ti=<TaskInstance: adcm_check.runme_1 manual__2024-07-15T14:18:43.743847+00:00 [queued]>
    [2024-07-15T14:18:46.144+0000] {taskinstance.py:1308} INFO - Starting attempt 1 of 4
    [2024-07-15T14:18:46.157+0000] {taskinstance.py:1327} INFO - Executing <Task(BashOperator): runme_1> on 2024-07-15 14:18:43.743847+00:00
    [2024-07-15T14:18:46.164+0000] {standard_task_runner.py:57} INFO - Started process 7082 to run task
    [2024-07-15T14:18:46.168+0000] {standard_task_runner.py:84} INFO - Running: ['airflow', 'tasks', 'run', 'adcm_check', 'runme_1', 'manual__2024-07-15T14:18:43.743847+00:00', '--job-id', '12', '--raw', '--subdir', 'DAGS_FOLDER/adcm_check.py', '--cfg-path', '/tmp/tmpr_3qshf4']
    [2024-07-15T14:18:46.171+0000] {standard_task_runner.py:85} INFO - Job 12: Subtask runme_1
    [2024-07-15T14:18:46.233+0000] {task_command.py:410} INFO - Running <TaskInstance: adcm_check.runme_1 manual__2024-07-15T14:18:43.743847+00:00 [running]> on host elenas-adh3.ru-central1.internal
    [2024-07-15T14:18:46.326+0000] {taskinstance.py:1545} INFO - Exporting env vars: AIRFLOW_CTX_DAG_OWNER='airflow' AIRFLOW_CTX_DAG_ID='adcm_check' AIRFLOW_CTX_TASK_ID='runme_1' AIRFLOW_CTX_EXECUTION_DATE='2024-07-15T14:18:43.743847+00:00' AIRFLOW_CTX_TRY_NUMBER='1' AIRFLOW_CTX_DAG_RUN_ID='manual__2024-07-15T14:18:43.743847+00:00'
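
To watch a task log while the task is still running, you can follow the same file with tail instead of reading it with cat. A minimal sketch, using the log file from the step above:

$ tail -f /var/log/airflow/dag_id=adcm_check/run_id=manual__2024-07-15T14:18:43.743847+00:00/task_id=runme_1/attempt=1.log

New records appear in the terminal as the task writes them; press Ctrl+C to stop following the file.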

Grep logs

You can search the logs for specific information, such as error messages. To do this, connect to the host that stores the logs you want to inspect and use the grep command.

For example:

$ cat /var/log/airflow/dag_processor_manager/dag_processor_manager.log | grep -i -B1 -A3 error

This command searches the DAG processor manager log for messages containing the word error. The -i option makes the search case-insensitive. The -B1 and -A3 options expand the output to one line before and three lines after each matching line.

Example output:

File Path                                                                                                                  PID    Runtime      # DAGs    # Errors  Last Runtime    Last Run
-------------------------------------------------------------------------------------------------------------------------  -----  ---------  --------  ----------  --------------  -------------------
/opt/airflow/lib/python3.10/site-packages/airflow/example_dags/example_setup_teardown.py                                                            0           0  0.03s           2024-08-05T10:38:25
/opt/airflow/lib/python3.10/site-packages/airflow/example_dags/example_setup_teardown_taskflow.py                                                   0           0  0.03s           2024-08-05T10:38:25
--
...
File Path                                                                                                                  PID    Runtime      # DAGs    # Errors  Last Runtime    Last Run
-------------------------------------------------------------------------------------------------------------------------  -----  ---------  --------  ----------  --------------  -------------------
/opt/airflow/lib/python3.10/site-packages/airflow/example_dags/example_setup_teardown.py                                                            0           0  0.03s           2024-08-05T10:38:55
/opt/airflow/lib/python3.10/site-packages/airflow/example_dags/example_setup_teardown_taskflow.py                                                   0           0  0.03s           2024-08-05T10:38:55
--
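
The same approach works for task logs. Below is a minimal sketch that recursively scans all log files under /var/log/airflow/ for the word error (the pattern is only an example):

$ grep -ri -B1 -A3 --include='*.log' error /var/log/airflow/

The -r option descends into the dag_id=... subdirectories, and --include restricts the search to files with the .log extension.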

Logging levels

Airflow uses the standard Python logging framework and supports the following log levels (from least to most informative):

  1. CRITICAL — reports on a serious error, indicating that the service itself may be unable to continue running.

  2. FATAL — indicates that an operation can’t continue execution and will terminate. In the standard Python logging framework, FATAL is an alias for CRITICAL.

  3. ERROR — notifies that a program is not working correctly or has stopped.

  4. WARN — warns about potential problems. This doesn’t mean that the service is not working, but it raises a concern.

  5. INFO — reports on the program lifecycle or state.

  6. DEBUG — prints debugging information about the internal states of the program.

Setting a logging level enables that level and all levels above it in the list. For example, if you set the logging level to ERROR, only ERROR, FATAL, and CRITICAL messages get into the log files, while WARN, INFO, and DEBUG messages are suppressed.
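
You can also override the logging level for a single run without touching the service configuration, because Airflow reads environment variables of the form AIRFLOW__{SECTION}__{KEY}. A minimal sketch that tests one task with DEBUG logging, reusing the adcm_check DAG and the runme_1 task from the examples above:

$ AIRFLOW__LOGGING__LOGGING_LEVEL=DEBUG airflow tasks test adcm_check runme_1 2024-07-15

This override affects only the launched process; the service-wide level is set via the configuration described in the next section.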

Logging configuration

To change logging properties via ADCM:

  1. On the Clusters page, select the desired cluster.

  2. Go to the Services tab and click on Airflow2.

  3. Select the required parameter and make the necessary changes.

  4. Confirm changes by clicking Save.

  5. In the Actions drop-down menu, select Restart, make sure the Apply configs from ADCM option is set to true, and click Run.

Airflow logging parameters:

  • Logging level — the logging level for the Airflow service.

  • Logging level for Flask-appbuilder UI — the logging level for Flask-appbuilder.

  • cfg_properties_template — the Airflow configuration file that contains logging settings for tasks.

Logging parameters in the cfg_properties_template config
[logging]
base_log_folder = /var/log/airflow
remote_logging = False
remote_log_conn_id =
google_key_path =

remote_base_log_folder =
encrypt_s3_logs = False
{% endraw %}
logging_level = {{ services.airflow2.config.airflow_cfg.logging_level }}
{% raw -%}
celery_logging_level =
{% endraw %}
fab_logging_level = {{ services.airflow2.config.airflow_cfg.fab_logging_level }}
{% raw -%}
logging_config_class =
colored_console_log = True
colored_log_format = [%%(blue)s%%(asctime)s%%(reset)s] {%%(blue)s%%(filename)s:%%(reset)s%%(lineno)d} %%(log_color)s%%(levelname)s%%(reset)s - %%(log_color)s%%(message)s%%(reset)s
colored_formatter_class = airflow.utils.log.colored_log.CustomTTYColoredFormatter
log_format = [%%(asctime)s] {%%(filename)s:%%(lineno)d} %%(levelname)s - %%(message)s
simple_log_format = %%(asctime)s %%(levelname)s - %%(message)s
task_log_prefix_template =
log_filename_template = dag_id={{ ti.dag_id }}/run_id={{ ti.run_id }}/task_id={{ ti.task_id }}/{%% if ti.map_index >= 0 %%}map_index={{ ti.map_index }}/{%% endif %%}attempt={{ try_number }}.log
log_processor_filename_template = {{ filename }}.log
dag_processor_manager_log_location = /var/log/airflow/dag_processor_manager/dag_processor_manager.log
task_log_reader = task
extra_logger_names =
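
After the restart, you can verify that the new values took effect by querying the configuration on the Airflow host. A minimal sketch using the airflow config CLI:

$ airflow config get-value logging logging_level

The command prints the current value of the logging_level option from the [logging] section, for example INFO.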