Wednesday, February 9, 2011

Windows VM False Performance Monitor Stats

We now run VMs almost exclusively in most of our environments. Recently, we added two new Windows VMs in production for a particular solution. Thinking we were being prudent, we asked to have the boxes provisioned with 4 gigs of memory, since that's what the old 'hardware' boxes had. We deployed our solutions to the new boxes and tested them without a hitch. A week later the switch was flipped: the new VMs went live and the old boxes stopped taking requests. After a few days, we discovered (by looking at the task manager) that we were getting some conflicting and concerning readings for memory utilization. Check this out:

- the task manager graph said we were using 3.7 gigs of memory
- looking at the actual numbers below the graph, we only had 300 MB of memory available on the machine
- looking in the processes tab, none of the processes were using a significant amount of memory (several were using between 200 and 300 MB, but that was within normal tolerances when compared with servers running the same system in other environments). However, these same processes were paging memory like they shouldn't have been! Page faults were very high.
- totaling the memory used by the individual processes, we were NOT using 3.7 gigs of memory.
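
The comparison in that last bullet is simple arithmetic, and it can be written down as a quick sanity check. What follows is only a minimal Python sketch: the function name `unaccounted_mb` and the per-process figures are made up for illustration, and only the 3.7 gig reading comes from the task manager graph.

```python
def unaccounted_mb(reported_used_mb, process_mb):
    """Return memory the OS reports as used that no listed process accounts for."""
    return reported_used_mb - sum(process_mb)


# Hypothetical per-process figures in MB (illustrative only, not measured values).
process_mb = [300, 250, 200, 150, 100]

# 3.7 gigs from the task manager graph, taken as 3700 MB to keep the numbers round.
gap = unaccounted_mb(3700, process_mb)
print(f"Unaccounted for: {gap} MB")
```

Some gap is expected, because per-process numbers leave out kernel and cached memory. A gap measured in gigs is the kind of reading that should prompt a closer look.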

After some more investigation, we asked the team that built the servers how the memory had been provisioned for these new boxes. It turned out that both had originally been given 1 gig of memory, and 3 more gigs were added later (while they were running). At this point our hunch was that these boxes weren't seeing the additional 3 gigs. A hard reboot of both boxes confirmed it.
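
One way to catch this earlier would be to compare what was provisioned with what the guest OS reports as its total memory. The sketch below assumes both numbers are read off by hand (from the VM's settings and from the OS); the function name `missing_memory_mb` and the 128 MB tolerance are hypothetical and not part of any tool we used.

```python
def missing_memory_mb(provisioned_mb, guest_visible_mb, tolerance_mb=128):
    """Return MB assigned to the VM that the guest does not see, or 0 if within tolerance."""
    shortfall = provisioned_mb - guest_visible_mb
    # Assumption: a small shortfall is normal, so only larger gaps are reported.
    return shortfall if shortfall > tolerance_mb else 0


# Under our hunch: 4 gigs assigned in total, guest still working with the original 1 gig.
print(missing_memory_mb(4096, 1024))

# After a hard reboot, assuming the guest then sees everything that was assigned.
print(missing_memory_mb(4096, 4096))
```

Any non-zero result means the guest and the provisioning records disagree, which is worth sorting out before trusting the memory statistics.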

Lesson learned: make sure you follow VM procedure (hard reboots) when adding or removing resources on VM servers. Shortcuts can create some pretty confusing statistics in the OS.
