Tuesday, June 9, 2009

Troubleshooting

Things to always watch out for when troubleshooting:
- locked out users - users who have been locked out of the system because the wrong password has been entered 3 consecutive times
- out of disk space (proper monitoring would catch this before it becomes a problem)
- GUI display that is contrary to what is actually happening on the system, for example dll's that show as gac'ed but really aren't. In other words, assume nothing.
- inconsistent application of permissions across environments.
- duplicate libraries/dll's on machines.
- closed ports, or ports/services that aren't listening/running. Also port conflicts, where more than one service wants to use the same port.
- services, IO connections, db connections not getting closed/terminated properly.
- logic between functions or classes that creates infinite loops
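
A couple of the items above (disk space, and ports that aren't listening) are easy to check from a script instead of trusting a GUI. Here is a rough sketch in Python using only the standard library. The function names, the 10% threshold, and the host/port in the usage lines are my own placeholders, not anything from a particular monitoring tool.

```python
import shutil
import socket


def disk_free_fraction(path):
    """Return the fraction of free space (0.0 to 1.0) on the volume holding path."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def port_is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    # Placeholder threshold and endpoint; adjust for the machine being checked.
    if disk_free_fraction(".") < 0.10:
        print("WARNING: less than 10% disk space free")
    if not port_is_listening("127.0.0.1", 80):
        print("WARNING: nothing is listening on 127.0.0.1:80")
```

Note that a successful connect only shows that something is listening on the port. It does not tell you it's the service you expected, which matters when you suspect a port conflict.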

Friday, June 5, 2009

IIS binary formatting error

Oh man, did I get a slap in the face yesterday. I was helping/coordinating a patch into production for our team and was asked if we needed to 'drainstop' (read: turn off load balancing) to deploy to our app servers. I said no, I had never done that before here. Later I realized that I probably haven't done a prod patch to the app servers here, period.
Anyway, we pushed our change onto running servers and everything appeared to deploy fine until we actually tried to use them. It seemed that even though we pushed new components and gac'ed them, the new versions were ignored because the server still had a request running against the old ones. Once that request was gone, our app servers were down. When our app servers go down and our web servers try to access them in that state, we get this wonderfully intuitive error called a 'binary formatter error' with a bunch of junk that doesn't make any sense.
So, in the end, two lessons learned:
1. Always drainstop servers when you're deploying onto a live system.
2. Binary Formatter errors can mean that your communication with a remote server is down.
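
For what it's worth, the order of operations behind lesson 1 can be sketched like this. It's Python and deliberately generic: `drain`, `active_requests`, `deploy`, and `resume` are hypothetical callables you would supply for your own load balancer and deployment tooling, and the timeout values are arbitrary. It is not the API of any real load balancer.

```python
import time


def deploy_with_drain(server, drain, active_requests, deploy, resume,
                      poll_seconds=1.0, max_wait_seconds=300.0):
    """Drain a server, wait for in-flight requests to finish, then deploy.

    All four callables are hypothetical hooks that take the server name.
    """
    # Stop the load balancer from sending new requests to this server.
    drain(server)

    # Wait until the requests that were already running have completed.
    deadline = time.monotonic() + max_wait_seconds
    while active_requests(server) > 0:
        if time.monotonic() >= deadline:
            # Nothing has been changed yet, so it is safe to put it back.
            resume(server)
            raise TimeoutError(f"{server} still has requests in flight")
        time.sleep(poll_seconds)

    # Only now is it safe to replace components on the server.
    deploy(server)

    # Put the server back into rotation only after a successful deploy;
    # if deploy raises, the server stays drained for investigation.
    resume(server)
```

The point is simply that nothing gets replaced while a request is still running against the old components, which is exactly what bit us.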