[Solved] Dotnet core: kestrel hangs, Too many open files in system.

Mahdi Shanak
4 min read · Dec 6, 2017


Linux systems come with many predefined security configurations; one of them is a limit on the maximum number of files that can be open at the same time (per user / per process). The .NET runtime needs to open many files during processing, for caching and to improve performance. This sometimes leads to a situation in which the operating system (in our case CentOS 7) kills the process once it exceeds the maximum allowed number of open files.

The problem

When we deployed the .NET Core 2 application (MVC) to CentOS 7, we used nginx as a reverse proxy for Kestrel (as explained in the documentation). Everything worked correctly for a while, but after some time (about 12 hours) the .NET application hung and stopped responding.

The reason

log file screenshot

After days of searching, and after enabling logging in supervisor, the problem turned out to be “Too many open files”. The same problem sometimes appears with a Postgres database on Linux (CentOS, Ubuntu).
Linux has a security configuration that limits the number of files users and processes can open, which in our case was very small (default = 1024).
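Before changing anything, it is worth checking what the current per-process limits actually are. A quick check, run in the shell that launches the application (the 1024 default mentioned above is the soft limit):

```shell
# Per-process limit in the current shell (and anything it launches):
ulimit -n    # soft limit -- the one a process actually hits; often 1024
ulimit -Hn   # hard limit -- the ceiling the soft limit can be raised to
```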

The solution

To solve the problem, we need to:

  1. Increase the maximum number of open files for the system and for each process.
  2. Resolve the CLOSE_WAIT problem.

1- Increase the maximum number of open files:

Step 1: increase the limit for the system

To view the system-wide max-open-files value, use the following command:

cat /proc/sys/fs/file-max

To increase the limit, we need to open the settings file:

sudo vi /etc/sysctl.conf

and add the following line (or update it if it already exists):

fs.file-max = 640000

For this setting to take effect, you need to restart the server:

sudo reboot
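If you would rather not wait for a reboot, `sysctl -p` reloads /etc/sysctl.conf immediately; this is a standard alternative that achieves the same thing for this setting:

```shell
# Reload /etc/sysctl.conf so fs.file-max takes effect without a reboot
sudo sysctl -p

# Confirm the kernel picked up the new value
cat /proc/sys/fs/file-max
```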

After reboot, you can check the setting again with the same command:

cat /proc/sys/fs/file-max

Step 2: increase the limit for processes

We found that the previous instruction does not increase the maximum number of open files for each process.

To check the max open files for dotnet (we assume your dotnet application is running), we first need to find the process id of dotnet using the following command:

ps -A

then search for dotnet in the list:

our dotnet process Id is 480
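Instead of scanning the whole `ps -A` listing by eye, `pgrep` can look the process up by name (here we assume the process is literally named `dotnet`; substitute your own process name if it differs):

```shell
# Print the PID(s) of any process named "dotnet"; the fallback message
# keeps the command from failing silently when nothing matches
pgrep -x dotnet || echo "no dotnet process found"
```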

Then we use this process id (in our example, id = 480) to view the maximum allowed open files for dotnet, as follows:

cat /proc/480/limits      # change 480 to your process id

You will notice that the max open files value is still small and did not pick up the 640000 we configured in step 1.
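To pull out just the relevant row rather than reading the whole limits table, `grep` works well. Here `/proc/self` refers to the current shell; replace `self` with your dotnet process id (e.g. 480):

```shell
# Show only the open-files row of the limits table
# (columns are: soft limit, hard limit, units)
grep "Max open files" /proc/self/limits
```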

Before continuing, I recommend using the following command to show the number of files currently open by dotnet:

sudo ls /proc/480/fd | wc -l  # change 480 to your process id

To change the max-open-files number, we have to update two files (if you use supervisor for running the application).

File 1: open the following file:

sudo vi /etc/security/limits.conf

In this file we need to add the following four lines (you can simply append them at the end of the file):

*    soft nofile 640000
*    hard nofile 640000
root soft nofile 640000
root hard nofile 640000

File 2: if you use supervisor, we need to update its configuration to increase the number of open files, as follows:

sudo vi /etc/supervisord.conf  
# the path of your supervisor configuration

then find and change the “minfds” setting to:

minfds=640000
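For context, the `minfds` key belongs under the `[supervisord]` section of the configuration file; if the key is missing entirely, add it there:

```ini
; /etc/supervisord.conf (excerpt)
[supervisord]
minfds=640000
```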

Reboot

For these changes to take effect, we prefer to restart the system:

sudo reboot

After reboot, you can check the max open files for the dotnet process as explained in the previous step (step 2).

2- Fix CLOSE_WAIT problem

This problem happens when a request opens a connection but the connection stays open even after the request completes. There are two main causes: an nginx buffering issue, or dotnet HttpClient requests.
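To check whether this is happening on your server, you can count the sockets stuck in CLOSE_WAIT with `ss` (from iproute2); a number that keeps growing over time is the symptom described here:

```shell
# Count TCP sockets currently in CLOSE_WAIT; the tail skips the header line
ss -tan state close-wait | tail -n +2 | wc -l
```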

2.1 Nginx:

We need to disable proxy buffering and set a timeout for requests. Edit the configuration file:

sudo vi /etc/nginx/conf.d/default.conf

Then add the following lines to the location / block:

proxy_buffering off;
proxy_read_timeout 7200;

The full example will look like this:

location / {
    proxy_pass http://localhost:5000;
    proxy_buffering off;
    proxy_read_timeout 7200;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection keep-alive;
    proxy_set_header Host $host;
    proxy_cache_bypass $http_upgrade;
    fastcgi_buffers 16 16k;
    fastcgi_buffer_size 32k;
}

Finally, restart nginx:

sudo systemctl restart nginx

2.2 HttpClient

If you issue many HttpWebRequest or HttpClient calls, you should share a single HttpClient instance across requests, since each HttpClient instance maintains its own connection pool; creating a new one per request leaves sockets behind in CLOSE_WAIT.

You can refer to this tutorial.
