We have a number of different development environments. Deployments follow a natural progression through these environments before they go into production. We were getting ready for a release and doing deploys and testing in our Staging environment (the environment just before production in our progression). One morning recently, our NUnit tests failed after a run, complaining that adodb.dll was not registered.
We were surprised because adodb.dll is a Microsoft DLL and we certainly weren't deploying any DLLs like that - for that matter, we hadn't done any deploys in the last day or two. So why were our NUnit tests suddenly failing?
A quick look at the event logs showed that SSRS (SQL Server Reporting Services) had been uninstalled from our transaction server during the evening, unbeknownst to us. I went and spoke with the infrastructure group who did the uninstall, and they got very concerned because they were ready to uninstall SSRS in production the next evening.
In the end, it appears that uninstalling SSRS did, in fact, remove adodb.dll from the GAC on our transaction server. Our infrastructure group cancelled the uninstall from production. We are trying to impress upon them and others the importance of communicating changes like this, even if the assumption is that there is no impact. Fortunately, our process of percolating changes up through the different environments caught this one before it blew up on us in production. However, it wasn't caught in the lower environments because the change was never announced there. Communication is key!
1 comment:
An Addendum to this post:
After a succession of successful releases, don't get lured into a false sense of security - especially if you are releasing new components. Here are some tips:
- Escalation/contingency plans for releases should be mapped out in advance. They should answer questions like 'Who should we bring in to help us troubleshoot this problem, and when?' and 'At what point do we need to make a go/rollback decision?'
- Always deploy the version that was in your staging/acceptance environment. No exceptions, no skipping environments.
- Give some thought to what you can 'silently' put into production before the actual release. These are usually net new components that won't be started until the actual release is over, but can be 'pre-deployed' to minimize the amount of content pushed to production during the release.
- If you have 'build verification tests' or 'configuration tests' that you run against production, make sure they test as much as possible. The more configurations you can quickly test to get assurance that what you want to be there really is there, the better.
- Having a written 'script' of the deploy to remind you of everything you need to do is key. There is always something (not necessarily technical) that gets in the way of completely automating a deploy into production, so having a hard copy reminder of everything that needs to be done is important. One forgotten permission can lead to hours of headache.
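
To make the 'configuration tests' tip a little more concrete, here is a minimal sketch in Python of what such a check might look like. It is only an illustration, not what we actually run: the `ConfigCheck` type, the `file_exists` helper and the example paths are all hypothetical, and a real suite would add probes for things like registered assemblies, services and permissions.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, List


@dataclass
class ConfigCheck:
    """One named expectation about the environment (hypothetical type)."""
    name: str
    probe: Callable[[], bool]  # returns True when the expectation holds


def file_exists(path: str) -> Callable[[], bool]:
    """Build a probe that passes only if a regular file exists at `path`."""
    return lambda: Path(path).is_file()


def run_checks(checks: List[ConfigCheck]) -> List[str]:
    """Run every check and return the names of the ones that failed.

    A probe that raises is treated as a failure, so a single broken
    check cannot hide the results of the others.
    """
    failures = []
    for check in checks:
        try:
            ok = check.probe()
        except Exception:
            ok = False
        if not ok:
            failures.append(check.name)
    return failures


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    checks = [
        ConfigCheck("app config present", file_exists("C:/apps/service/app.config")),
        ConfigCheck("shared library present", file_exists("C:/apps/service/shared.dll")),
    ]
    for name in run_checks(checks):
        print("FAILED:", name)
```

Running something like this in every environment after any change, not just after our own deploys, is the kind of cheap check that would likely have flagged the missing DLL sooner.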