We have a server application whose application pool is recycled at the default recycle interval (29 hours). Sometimes the new pool does not start up after the recycle. When that happens, every request hitting the server fails, which results in support calls.
What is happening?
To test this on my laptop, I recycled my app pool several times. Each time I could see a new app pool starting up while the old one was terminated a few seconds later. No issues here at all; everything seemed to be working as expected.
But why are there two app pools?
The reason I see two app pools is that IIS overlaps the old and new worker processes when recycling, so there usually isn't any downtime during a recycle. Requests keep coming in without ever knowing that the app pool has been recycled, and they are served seamlessly.
Looking at the Event Viewer on the server, I saw warnings that the shutdown time limit had been exceeded.
IIS has a configurable Shutdown Time Limit (in seconds), which you can find in the Advanced Settings for the application pool in IIS Manager.
This determines how long IIS 7 and later gives a worker process to finish all requests before the WWW service terminates the worker. It is set to 90 seconds here, so IIS will wait up to 90 seconds and then kill the old worker process if it has not shut down by then.
So what was stopping the app pool from closing down and why was the new app pool not starting?
On the server where the issue is seen, there is a high number of requests coming in from external devices (set-top boxes). So shutting down an app pool does not happen instantaneously, because requests already in progress have to be served before the pool can close. In fact, the set-top boxes were requesting video streams, which means far more data being streamed from the server and longer-running requests.
When the app pool starts, it opens Lucene indexes, and in order to do that it needs to acquire a lock. If the old pool has not closed down, it has not released the lock, which means the new app pool cannot start.
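The lock contention described above can be reproduced in miniature. The following is only a rough sketch, not the application's code or Lucene's implementation: `IndexLock`, `recycle` and the short timeout are made-up names and values for illustration, and the lock is modelled as a plain lock file created exclusively in a temporary directory.

```python
import os
import tempfile
import time


class IndexLock:
    """Hypothetical exclusive lock backed by a lock file in the index directory."""

    def __init__(self, index_dir):
        self.path = os.path.join(index_dir, "write.lock")
        self.held = False

    def acquire(self, timeout=0.2, poll=0.05):
        """Try to create the lock file, retrying until the timeout expires."""
        deadline = time.monotonic() + timeout
        while True:
            try:
                # O_EXCL makes creation fail if the lock file already exists.
                fd = os.open(self.path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
                os.close(fd)
                self.held = True
                return True
            except FileExistsError:
                if time.monotonic() >= deadline:
                    return False
                time.sleep(poll)

    def release(self):
        if self.held:
            os.remove(self.path)
            self.held = False


def recycle(index_dir, overlapped):
    """Simulate one recycle; return True if the new worker got the lock."""
    old_worker = IndexLock(index_dir)
    assert old_worker.acquire()
    new_worker = IndexLock(index_dir)
    if overlapped:
        # The new worker starts while the old one is still serving requests.
        started = new_worker.acquire()
        old_worker.release()
    else:
        # The old worker shuts down completely before the new one starts.
        old_worker.release()
        started = new_worker.acquire()
    new_worker.release()
    return started


with tempfile.TemporaryDirectory() as index_dir:
    overlapped_ok = recycle(index_dir, overlapped=True)
    sequential_ok = recycle(index_dir, overlapped=False)

print(overlapped_ok, sequential_ok)
```

In this toy model the overlapped recycle fails because the new worker gives up before the old one lets go of the lock, while the sequential recycle succeeds.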
How to fix this issue?
IIS has another setting, 'Disable Overlapped Recycle', in the same Advanced Settings for the application pool. When enabled, it shuts down the old pool first before starting a new one.
Setting this value to true ensures that the existing app pool is no longer running by the time the new app pool starts, so any Lucene locks will already have been released and the new app pool will no longer have trouble acquiring them.
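For reference, the IIS Manager setting is stored in the application pool's configuration as the `disallowOverlappingRotation` attribute on the `recycling` element, and the Shutdown Time Limit as `shutdownTimeLimit` on `processModel`. The sketch below only flips the attribute on an in-memory sample fragment so the mapping is visible; the pool name `MyAppPool` and the helper function are made up, and it does not touch a real applicationHost.config.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment shaped like the <applicationPools> section of
# applicationHost.config. The pool name is invented for illustration.
SAMPLE = """
<applicationPools>
  <add name="MyAppPool">
    <processModel shutdownTimeLimit="00:01:30" />
    <recycling disallowOverlappingRotation="false">
      <periodicRestart time="1.05:00:00" />
    </recycling>
  </add>
</applicationPools>
"""


def disable_overlapped_recycle(xml_text, pool_name):
    """Return the XML with overlapped recycling turned off for one pool."""
    root = ET.fromstring(xml_text)
    pool = root.find(f"./add[@name='{pool_name}']")
    if pool is None:
        raise KeyError(pool_name)
    recycling = pool.find("recycling")
    if recycling is None:
        # Pools that rely on defaults may not have the element at all.
        recycling = ET.SubElement(pool, "recycling")
    recycling.set("disallowOverlappingRotation", "true")
    return ET.tostring(root, encoding="unicode")


updated = disable_overlapped_recycle(SAMPLE, "MyAppPool")
print(updated)
```

In practice the checkbox in IIS Manager is the simpler route; the fragment is mainly useful for recognising the setting when reviewing or comparing server configuration.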
To learn more about overlapped recycling in IIS, see the excellent video by Scott Forsyth.