CentOS – Difference between ulimit, lsof, and cat /proc/sys/fs/file-max


I've been getting java.io.IOException: Too many open files while running a Kafka instance with a single topic that has 1000 partitions, so I started investigating the file descriptor limits on my EC2 VM. I cannot work out what exactly the limit for open files is on a CentOS 7 machine, since the following commands all produce different results:

  • ulimit -a: open files 1024
  • lsof | wc -l: 298280
  • cat /proc/sys/fs/file-max: 758881 (which is consistent with /proc/sys/fs/file-nr)

If the actual limit is the one reported by the last command, then I am well below it (lsof | wc -l: 298280). But in that case the output of ulimit is unclear to me, since I am well above 1024 open files.

According to the official documentation, the best way to check the file descriptor limit on CentOS is the /proc/sys/fs/file-max file, but why are there all these seeming inconsistencies between the commands?

Best Answer

  1. file-max is the maximum number of files that can be opened across the entire system. This is enforced at the kernel level.

  2. The man page for lsof states that:

In the absence of any options, lsof lists all open files belonging to all active processes.

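This is consistent with your observations, since the number of files reported by lsof is well below the file-max setting. Note that lsof | wc -l is not a count of file descriptors: the listing also includes entries such as each process's current directory, program text, and memory-mapped libraries, so the total overstates the number of open descriptors.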

  3. Finally, ulimit is used to enforce resource limits at the user level. The 'number of open files' parameter is set per user, but it is applied to each process started by that user. In this case, a single Kafka process can have up to 1024 file handles open (the soft limit).

You can raise this limit yourself up to the hard limit, 4096. Raising the hard limit requires root access.
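
To make the difference between the system-wide and per-process limits concrete, here is a minimal Python sketch, assuming a Linux machine with /proc mounted. The helper names parse_file_nr and raise_soft_nofile are made up for illustration; only the resource module calls and the /proc/sys/fs/file-nr path are real. It reads the same per-process soft and hard limits that ulimit -n reports and the system-wide counters behind file-max.

```python
import os
import resource


def parse_file_nr(text):
    """Split the contents of /proc/sys/fs/file-nr into its three counters:
    allocated handles, allocated-but-unused handles, system-wide maximum."""
    allocated, unused, maximum = (int(field) for field in text.split())
    return allocated, unused, maximum


def raise_soft_nofile(target):
    """Raise this process's soft open-files limit toward `target`.

    The value is capped at the hard limit and is never lowered.
    Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited, nothing to raise
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if target <= soft:
        return soft
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("per-process soft limit:", soft, "hard limit:", hard)

    path = "/proc/sys/fs/file-nr"  # Linux only
    if os.path.exists(path):
        with open(path) as fh:
            allocated, unused, maximum = parse_file_nr(fh.read())
        print("system-wide handles allocated:", allocated, "of", maximum)
```

A change made with setrlimit applies only to the calling process and its children, so this illustrates the mechanism rather than changing the limit of an already running Kafka process.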

If Kafka is running as a single process, you could find the number of files opened by that process by using lsof -p [PID].
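
As an alternative to counting lsof -p lines, the descriptors of one process can be counted from /proc. This is a rough sketch, assuming Linux; count_open_fds and its proc_root parameter are hypothetical names chosen for the example, and reading another user's process requires root.

```python
import os


def count_open_fds(pid="self", proc_root="/proc"):
    """Count the entries in <proc_root>/<pid>/fd.

    On Linux each entry is one open file descriptor of that process.
    For pid="self" the count may include the descriptor that listing
    the directory itself opens temporarily.
    """
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    return len(os.listdir(fd_dir))


if __name__ == "__main__":
    if os.path.isdir("/proc/self/fd"):
        print("descriptors open in this process:", count_open_fds())
```

Calling count_open_fds with the Kafka PID and comparing the result with the 1024 soft limit shows how close that one process is to the per-process limit, which is the number that matters for the Too many open files error.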

Hope this clears things up.
