NiFi configuration best practices
Default Linux settings are often unsuitable for I/O-intensive applications such as NiFi, and the exact parameters to tune vary between distributions.
The guidelines below can help you tune your operating system for NiFi.
To apply these recommendations correctly, refer to the documentation for your specific distribution.
Increase the number of file handles
NiFi can have a very large number of file handles open at any given time.
A file handle is an integer that identifies a kernel data structure. The Linux kernel calls these structures file structures because they describe open files; the file handle is the index of the file structure.
To change the open file limit, edit /etc/security/limits.conf, a file that sets resource limits for users logged in through Pluggable Authentication Modules (PAM).
To do this, for all users, increase the hard and soft values of the nofile parameter (the maximum number of open files) by adding the following lines to /etc/security/limits.conf:
* hard nofile 50000
* soft nofile 50000
When increasing the limit, take into account the amount of kernel memory available, because every open file consumes kernel memory.
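To see what the limits are before and after the change, a minimal sketch like the following may help. The function names show_nofile_limits and count_open_fds are illustrative helpers, not part of NiFi, and the descriptor count relies on the Linux /proc file system. New values in limits.conf normally take effect only for sessions started after the edit.

```shell
#!/bin/sh
# Print the soft and hard open-file limits of the current shell session.
show_nofile_limits() {
    echo "soft nofile: $(ulimit -S -n)"
    echo "hard nofile: $(ulimit -H -n)"
}

# Count the file descriptors currently open by the process with the given PID.
# Prints 0 if /proc/<pid>/fd cannot be read.
count_open_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

show_nofile_limits
echo "descriptors open in this shell: $(count_open_fds "$$")"
```

Passing the PID of the NiFi process to count_open_fds instead of `$$` shows how close NiFi is to the configured limit.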
Increase the number of forked processes
NiFi can be configured to create a significant number of threads. To increase the allowed number, edit the /etc/security/limits.conf file as well.
To do this, for all users, increase the hard and soft values of the nproc parameter (the maximum number of processes) by adding the following lines to /etc/security/limits.conf:
* hard nproc 10000
* soft nproc 10000
In addition, if the file /etc/security/limits.d/90-nproc.conf or /etc/security/limits.d/20-nproc.conf exists, edit it as well, because it overrides the nproc value specified in /etc/security/limits.conf. In this file, increase the soft value of the nproc parameter (the maximum number of processes) for all users by adding the line:
* soft nproc 10000
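Because several files can override each other, it can be useful to check which limits a running process actually received. The sketch below assumes a Linux system that exposes /proc/<pid>/limits; show_limits_file and show_process_limits are hypothetical helper names introduced for illustration.

```shell
#!/bin/sh
# Print the "Max processes" and "Max open files" rows of a limits file
# in the format used by /proc/<pid>/limits.
show_limits_file() {
    grep -E '^Max (processes|open files) ' "$1"
}

# Print the effective nproc and nofile limits of a running process.
show_process_limits() {
    show_limits_file "/proc/$1/limits"
}

# Example: inspect the current shell. Replace $$ with the NiFi PID.
if [ -r "/proc/$$/limits" ]; then
    show_process_limits "$$"
fi
```

The columns printed are the soft limit, the hard limit, and the unit.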
Increase the number of available TCP socket ports
The number of available TCP socket ports is especially important if your flow sets up and tears down a large number of sockets in a short period of time.
To set the range of local ports available for TCP sockets, run the command:
$ sudo sysctl -w net.ipv4.ip_local_port_range="10000 65000"
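As a rough check of how many ports a given range provides, the following sketch reads the live range from /proc and computes its size. The helper name port_count is illustrative. A value set with sysctl -w does not survive a reboot unless it is also written to a sysctl configuration file.

```shell
#!/bin/sh
# Number of ports in an inclusive range: high - low + 1.
port_count() {
    echo $(( $2 - $1 + 1 ))
}

# Report the range currently in effect (Linux only).
if [ -r /proc/sys/net/ipv4/ip_local_port_range ]; then
    read -r low high < /proc/sys/net/ipv4/ip_local_port_range
    echo "current range: $low-$high ($(port_count "$low" "$high") ports)"
fi

# Size of the range used in the command above.
echo "recommended range: $(port_count 10000 65000) ports"
```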
Set the duration of the TIME_WAIT state for sockets
TIME_WAIT is the state in which a closed socket waits for a timeout to expire so that packets delayed in the network can still be handled. Shortening this timeout lets sockets be set up and torn down quickly.
Use the following commands to change the timeout duration:
- For distributions with kernel 2.6:
$ sudo sysctl -w net.ipv4.netfilter.ip_conntrack_tcp_timeout_time_wait="1"
- For distributions with kernel 3.0:
$ sudo sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait="1"
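Which of the two parameters exists depends on the kernel and on whether the connection tracking module is loaded, so it may be worth checking before running either command. The sketch below maps a sysctl key to its path under /proc/sys and reports the keys that are present. The sysctl_path helper is an illustrative name, and the dot-to-slash mapping assumes keys with no dots inside individual components, which holds for the two keys above.

```shell
#!/bin/sh
# Convert a sysctl key such as "net.ipv4.tcp_fin_timeout"
# to its path under /proc/sys.
sysctl_path() {
    echo "/proc/sys/$(echo "$1" | tr '.' '/')"
}

# Report which of the two TIME_WAIT timeout keys exists on this system.
for key in \
    net.ipv4.netfilter.ip_conntrack_tcp_timeout_time_wait \
    net.netfilter.nf_conntrack_tcp_timeout_time_wait
do
    path=$(sysctl_path "$key")
    if [ -r "$path" ]; then
        echo "$key = $(cat "$path")"
    else
        echo "$key is not available"
    fi
done
```

If neither key is available, connection tracking is probably not loaded and there is nothing to set here.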
Disable swapping
The swappiness setting controls how strongly the kernel prefers moving application memory pages to swap space over releasing memory from the page cache.
To improve the efficiency of NiFi, it is recommended to set the swappiness parameter to zero. With this value, when memory is limited, the kernel shrinks the page cache in an attempt to reclaim memory before it moves the application's pages to swap space.
To apply this setting, edit /etc/sysctl.conf, the file that controls kernel parameters, and add the following line:
vm.swappiness = 0
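The edit can be checked with a sketch like the one below, which tests whether a sysctl configuration file sets vm.swappiness to 0 and prints the value currently in effect. The function name swappiness_is_zero_in is a hypothetical helper. The file is read at boot; running `sudo sysctl -p` loads /etc/sysctl.conf immediately.

```shell
#!/bin/sh
# Succeed if the given sysctl configuration file sets vm.swappiness to 0.
swappiness_is_zero_in() {
    grep -Eq '^[[:space:]]*vm\.swappiness[[:space:]]*=[[:space:]]*0[[:space:]]*$' "$1"
}

# Check the configuration file, if it exists.
if [ -r /etc/sysctl.conf ]; then
    if swappiness_is_zero_in /etc/sysctl.conf; then
        echo "/etc/sysctl.conf sets vm.swappiness = 0"
    else
        echo "/etc/sysctl.conf does not set vm.swappiness = 0"
    fi
fi

# Show the value the running kernel is using (Linux only).
if [ -r /proc/sys/vm/swappiness ]; then
    echo "live vm.swappiness: $(cat /proc/sys/vm/swappiness)"
fi
```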
For the partitions that hold the various NiFi repositories, disable atime, the field in which the file system records the time of each file access. Disabling it can give a surprisingly large increase in throughput. To do this, in the /etc/fstab file (which configures mount options for block devices, disk partitions, and remote file systems), add the noatime option for the partitions of interest. This option completely disables the recording of file access times.
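After the partition is remounted, the active mount options can be checked as in the sketch below. The mount point /data/content_repo and the device in the sample fstab line are hypothetical, and mount_options is an illustrative helper that reads a table in the /proc/mounts format.

```shell
#!/bin/sh
# Hypothetical /etc/fstab entry with noatime added to the options column:
#   /dev/sdb1  /data/content_repo  ext4  defaults,noatime  0  2

# Print the mount options of a mount point.
# $1 = mount point, $2 = mounts table (defaults to /proc/mounts).
mount_options() {
    awk -v mp="$1" '$2 == mp { opts = $4 } END { if (opts != "") print opts }' \
        "${2:-/proc/mounts}"
}

# Report whether noatime is active for a mount point (here the root
# file system; replace "/" with a NiFi repository partition).
case ",$(mount_options /)," in
    *,noatime,*) echo "/ is mounted with noatime" ;;
    *)           echo "/ is not mounted with noatime" ;;
esac
```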
Increase the hardware resources
For NiFi to operate efficiently, the following is recommended:
- If possible, use separate servers or racks for NiFi instances.
- Provide HA (High Availability) in accordance with the requirements described in the Hardware requirements article.