Those WTF moments

October 14, 2016

Sometimes in the world of IT, you have moments where all you can do is scratch your head and ask WTF happened. Such was the case on Monday this week, before I had even got back into work and before the term started. I received a text from my head of IT, who said that he was unable to access one of our (virtual) servers to post the daily PowerPoint notice we show our learners. In between getting dressed and packing my bags for work, I remoted in to take a look.

I couldn’t see the server on the network, nor could I Remote Desktop into it. Surprisingly, ping worked, but that seemed to be about it. Hyper-V Manager showed that the server was on, and I could connect via the console. I picked up a clue as to what could be wrong when I noticed that the Heartbeat status wasn’t being reported to Hyper-V Manager. This indicated that the heartbeat service inside the VM had stopped running for some reason.
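As an aside, the "ping answers but Remote Desktop doesn’t" symptom is easy to confirm from a script. The sketch below is illustrative only and is not something from the original troubleshooting session: it assumes RDP is listening on its default TCP port 3389, and the function name `tcp_port_open` is made up for this example.

```python
import socket

RDP_PORT = 3389  # default Remote Desktop port (assumed; a server may use another)


def tcp_port_open(host: str, port: int = RDP_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Calling `tcp_port_open("my-server")` with a hypothetical host name returns `False` when the machine still answers ping but nothing accepts connections on the RDP port, which matches what was seen here.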

The previous Friday I had rebooted all our servers in order to finish their update cycles, as well as to prepare them for the term ahead. This particular server had come up from the reboot OK, so I didn’t do an in-depth check. It had never given me an issue like this before, so I made the mistake of assuming all was well. Anyway, after connecting, I could see that none of the Hyper-V services were running inside the VM. Manually trying to start them didn’t work. I tried to upgrade the Integration Components, since Hyper-V indicated that my other VMs needed an update for the components. No matter how I ran the setup file, it would not execute on the sick VM. By this time I had to leave to get to work, so the problem had to wait until I got in.

After arriving at work and settling in, I cloned the VM to my PC so I could play around more easily. Numerous attempted fixes all failed, until I came across a post on the internet that described the same symptoms I had. It linked to a Microsoft KB article, which included steps on how to fix the problem. The KB dated from a few years back, so I found it incredibly bizarre that the problem only hit us now. Still, the sick server is running Server 2008, so I went ahead and made the change in the registry as documented. A reboot later, the server on my PC was suddenly working normally again. All relevant services were starting up correctly and the server was back in action.

Since the fix was successful on my local cloned image, I went ahead and made the same change on the sick VM itself. Sure enough, one reboot later and we were back in business. In the aftermath, I spent a lot of time trying to figure out what caused this issue. While I did have IIS installed on the server years ago, I don’t recall there ever being an SSL certificate on that server. How exactly we ended up in this situation is probably something I’ll never fully know. As I said to my colleague, we’ve both seen random stuff over the years, but this one was really a WTF moment in a big way.

Categories: General, Software