NiFi configuration best practices

Default Linux settings are not necessarily well tuned for I/O-intensive applications such as NiFi. For each of the operating system areas covered below, the requirements may vary from one distribution to another.

The guidelines below may be helpful in improving compatibility between NiFi and your operating system.

To apply these recommendations correctly, consult the documentation for your specific distribution.

Increase the number of file handles

NiFi can have a very large number of file handles open at any given time.

A file handle is an integer that identifies an open file. The Linux kernel describes each open file with a data structure called a file structure, and the file handle is the index that refers to that structure.

To change the open file limit, edit /etc/security/limits.conf, the file that sets resource limits for users who log in through Pluggable Authentication Modules (PAM). For all users, raise the nofile parameter (the maximum number of open files) for both the hard and soft limit types by adding the following lines to /etc/security/limits.conf:

*  hard  nofile  50000
*  soft  nofile  50000

When raising the limit, take into account the amount of kernel memory available, because every open file consumes kernel memory.
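
Limits from limits.conf are applied when a session is opened, so the new values become visible only after logging in again. The following is a minimal sketch for checking them, assuming a POSIX-like shell with the `ulimit` builtin; the function name `show_nofile_limits` is illustrative.

```shell
# Print the soft and hard open-file limits of the current shell session.
# Run this as the user that starts NiFi, in a fresh login session.
show_nofile_limits() {
    echo "soft nofile: $(ulimit -S -n)"
    echo "hard nofile: $(ulimit -H -n)"
}

show_nofile_limits
```

If both lines report 50000, the entries above are in effect for that user.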

Increase the number of forked processes

NiFi can be configured to create a significant number of threads. To raise the allowed number, edit the /etc/security/limits.conf file again. For all users, raise the nproc parameter (the maximum number of processes) for both the hard and soft limit types by adding the following lines to /etc/security/limits.conf:

*  hard  nproc  10000
*  soft  nproc  10000

If the file /etc/security/limits.d/90-nproc.conf or /etc/security/limits.d/20-nproc.conf exists, edit it as well, because it overrides the nproc value specified in /etc/security/limits.conf. In that file, raise the nproc parameter for the soft limit type for all users by adding the line:

*  soft  nproc  10000
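
Because a drop-in file can silently override limits.conf, it can help to list every nproc entry in one place. The following is a minimal sketch, assuming GNU or BSD grep (for the `-H` option); the function name `list_nproc_entries` is illustrative and not part of any tool.

```shell
# List every non-comment nproc entry in limits.conf and limits.d/*.conf.
# $1: security config directory (default /etc/security); overridable for testing.
list_nproc_entries() {
    dir="${1:-/etc/security}"
    # -s hides errors for missing files; -H prefixes each match with its file name.
    grep -sH -E '^[^#]*[[:space:]]nproc[[:space:]]' \
        "$dir/limits.conf" "$dir"/limits.d/*.conf
}

list_nproc_entries || echo "no nproc entries found"
```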

Increase the number of available TCP socket ports

The number of available TCP socket ports is especially important if your flow sets up and tears down a large number of sockets in a short period of time.

To widen the range of local ports available for TCP sockets, run the command:

$ sudo sysctl -w net.ipv4.ip_local_port_range="10000 65000"
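
To see how many local ports a given range provides, the bounds can be subtracted directly. The following is a minimal sketch, assuming a POSIX-like shell; the function name `port_count` is illustrative.

```shell
# Count the ports in a "low high" range string, as used by
# net.ipv4.ip_local_port_range (both bounds are inclusive).
port_count() {
    set -- $1            # split "low high" into $1 and $2
    echo $(( $2 - $1 + 1 ))
}

port_count "10000 65000"    # the range set above; prints 55001

# Current kernel setting, if the file is readable on this system.
range_file=/proc/sys/net/ipv4/ip_local_port_range
if [ -r "$range_file" ]; then
    port_count "$(cat "$range_file")"
fi
```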

Set the duration of the TIME_WAIT state for sockets

TIME_WAIT is the state in which a closed socket waits for a timeout period so that delayed packets still in the network can be collected. Shortening the timeout keeps closed sockets from lingering, so that new sockets can be set up and torn down quickly.

The following commands are used to change the timeout duration:

  • for distributions with kernel 2.6:

$ sudo sysctl -w net.ipv4.netfilter.ip_conntrack_tcp_timeout_time_wait="1"

  • for distributions with kernel 3.0:

$ sudo sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait="1"
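
Rather than relying on the kernel version, a script can check which of the two keys actually exists. The following is a minimal sketch, assuming sysctl keys are exposed under /proc/sys with dots replaced by slashes; the function name `time_wait_key` is illustrative.

```shell
# Print the name of the conntrack TIME_WAIT timeout key present on this
# system, or nothing if neither key exists.
# $1: root of the sysctl tree (default /proc/sys); overridable for testing.
time_wait_key() {
    root="${1:-/proc/sys}"
    for key in net.netfilter.nf_conntrack_tcp_timeout_time_wait \
               net.ipv4.netfilter.ip_conntrack_tcp_timeout_time_wait; do
        # A sysctl key maps to a path with dots replaced by slashes.
        path="$root/$(echo "$key" | tr . /)"
        if [ -e "$path" ]; then
            echo "$key"
            return 0
        fi
    done
    return 1
}

key="$(time_wait_key)" && echo "use: sudo sysctl -w $key=1" \
    || echo "no conntrack TIME_WAIT key found"
```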

Disable swapping

The swappiness setting controls how strongly the kernel prefers moving application memory pages to swap space over reclaiming memory from the page cache.

To improve NiFi performance, it is recommended to set the swappiness parameter to zero. When memory is limited, the kernel then shrinks the page cache in an attempt to reclaim memory before it moves the application's pages to swap space.

To set the parameter, edit /etc/sysctl.conf, the file that holds kernel parameter settings, and add the following line:

vm.swappiness = 0
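
The value currently in effect can be read back from /proc. The following is a minimal sketch, assuming a Linux system with /proc mounted; the function name `current_swappiness` is illustrative.

```shell
# Show the swappiness value currently in effect.
# $1: file to read (default /proc/sys/vm/swappiness); overridable for testing.
current_swappiness() {
    cat "${1:-/proc/sys/vm/swappiness}"
}

# To load /etc/sysctl.conf without rebooting (requires root):
#   sudo sysctl -p
if [ -r /proc/sys/vm/swappiness ]; then
    echo "vm.swappiness is currently $(current_swappiness)"
fi
```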

For the partitions that hold the NiFi repositories, disable atime, the recording of the timestamp of each file access. Disabling it can noticeably increase throughput. To do this, in the /etc/fstab file (which is used to configure mount options for various block devices, disk partitions, and remote file systems), add the noatime option for the partitions of interest, which completely disables the recording of file access time.
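
Once a partition has been remounted, the option can be verified against the table of mounted file systems. The following is a minimal sketch, assuming a Linux system with /proc/mounts and any POSIX awk; the function name `has_noatime`, the device, and the mount point are all hypothetical.

```shell
# Example /etc/fstab entry (device and mount point are hypothetical):
#   /dev/sdb1  /data/nifi/content_repository  ext4  defaults,noatime  0  2

# Succeed if the given mount point is mounted with the noatime option.
# $1: mount point; $2: mounts table (default /proc/mounts), overridable for testing.
has_noatime() {
    awk -v mp="$1" '
        $2 == mp { found = ($4 ~ /(^|,)noatime(,|$)/) }   # field 4 holds the mount options
        END      { exit !found }
    ' "${2:-/proc/mounts}"
}

mp=/data/nifi/content_repository    # hypothetical mount point
if has_noatime "$mp"; then
    echo "$mp: noatime is set"
else
    echo "$mp: noatime is not set"
fi
```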

Increase the hardware resources

For NiFi to operate efficiently, the following is recommended:

  • If possible, use separate servers or racks for NiFi instances.

  • Provide HA (High Availability) in accordance with the requirements described in the Hardware requirements article.
