We had an issue where one of our Application Pools (IIS 6) was 'dying'. The interesting thing about it was while the application itself was dead (error page, sometimes showing an outOfMemory exception, other times showing a meaningless exception), the appPool itself appeared in the GUI to be running. The more troubleshooting I do with Windows OS's, the more I do not trust what the GUI says. Appearances in Windows can be deceiving. For several weeks - until we understood the root cause of the issue - are resolution to the dead application was to restart the 'running' appPool.
A side note - this appPool was running on a Network Load-balanced server, on the dominant node. The appPool for this particular application only ever failed on the dominant node. It never failed on the other node of the cluster, so we had reason to believe that the 'load' had something to do with the root issue.
In the end, we discovered the root cause of our dying application by accident. We were cleaning up our files on the prod servers when it became apparent to us that log4net on another application (and another appPool) was set to log 'ALL'. This resulted in 1.5Mb of log created about every 5 minutes. I had a hunch that all this logging was having a memory impact on the server - we certainly didn't want to be logging at the 'ALL' level anyway, so we created a change request and fixed that issue ASAP. Then we crossed our fingers, hoping that our dying AppPool issue would go away as well.
It's been more than two weeks since we implemented this log4net change and our flaky appPool hasn't died once. Lessons learned:
- Watch for excessive logging on prod servers
- Excessive logging can apparently cause other appPools to through outOfMemory Exceptions!