All Day Outage
Exchange datastore repairs take hours
Yesterday, I worked almost 16 hours. And nothing went right.
For those who have been reading here for some time, I am a Network Admin [for over 28 years now] and have been at my current company just shy of 11 years. I am in the process of looking for another place of employment, in the hope of not celebrating 12 years at this job.
Yesterday’s outage was mostly focused on core networking upgrades. Specifically, the back end going from multiple 1 Gb copper links to 10 Gb fiber. On a 10-year-old Cisco 6509 switch. Most of this was out of my hands. I was simply on hand to help where needed while the Cisco experts took over. After 8 hours, we couldn’t make it work. So we rolled it back. In the process, the entire network was down, which meant servers and workstations couldn’t connect to each other. No big deal. They will connect when it all comes back online. And some of the VLAN connections weren’t affected. Or so we thought.
My specialty is Exchange Server. I’ve been building and running Exchange platforms for almost 20 years now. While I’ve not touched newer versions like Exchange 2010 or 2013, I have had my share of supporting older versions. Because of that experience, we normally dismount the datastores before any maintenance window. I didn’t do that this time, as there wasn’t any reason to think the datastores would be affected.
Three of the four datastores mounted without issue. Came up, no errors or consistency problems.
The fourth datastore had consistency errors. And the last 3 days of backups had the same consistency errors. So this means spending the next 24 hours running the repair and defrag steps to bring this last datastore back online.
I did get some sleep, about 5 hours, so I can be alert while monitoring the defrag process today before I copy the datastore back to its rightful place on the NetApp SAN.
Much Needed Upgrade
I’ve been pushing for an Exchange upgrade for over 3 years now. Budgeted, yet never approved. Each and every year since. Frustrating for me, as those who control the purchasing decisions never have to spend one minute recovering from these types of outages. A newer version of Exchange is easier to manage, back up, and restore [since it will be placed on our VMware vCenter platform running on FlexPod].
So we soldier on with restoring another damaged datastore, with the hope that maybe, just maybe, we’ll use this to finally upgrade and make future life easier on me or the person who eventually replaces me.
This would be so much easier if TRON were available.