
When a server gives you grey hair…


About a month ago I did a fairly large server migration project at our school. In addition to installing a brand new server, I also did a lot of logical migrations of virtual servers or server roles. The main hub around which my plans revolved was our generic Intel server.


Originally purchased and installed in March/April 2011, the server has been humming away quietly for the last 3 years, doing what it was meant for: virtualization. It hasn’t been all smooth sailing, however, as there have been some quirks. The original 16GB SSD died one day, and by died I mean a complete and utter death: it was not even visible in the BIOS. That took XenServer along with it, giving me a miserable 2 days getting the server back up to scratch. Funnily enough, the mechanical 160GB drive I used to run XenServer after the crash was still running fine right up to when I removed it as part of the switch to Microsoft Hyper-V.

Another issue that came up early in the machine’s life was XenServer hanging at random intervals, usually at night when the network was quiet. There was no info in any logs, just a random hard lock-up that only a power cycle could clear. My ex-colleague and I eventually discovered that it was due to XenServer not liking the lower-power C-states on the then-new Intel “Nehalem” Xeon chips. Disabling the lower C-states in the BIOS solved that issue, and XenServer then chugged away for the rest of its life.

This server had never had its firmware updated, so when I did the server migration I took the opportunity to flash the latest firmware available. This would also help in getting the server ready for Server 2012 R2 and Hyper-V. I prepared my flash drive with the update, no problem there. Updating modern Intel servers is pretty easy, so I didn’t expect any issues. Of course, that’s when the issues started…

During the update, the updater was unable to determine the chassis type, which I found odd. Still, I manually entered the model when given a choice. A few seconds later, all the fans in the chassis ramped up to full speed and stayed there, and the system health LED started blinking an ominous amber. Prior to this, the server was near whisper quiet, only ramping up the fans during POST. I ran the firmware update again, thinking maybe something hadn’t stuck. No joy, the system was still sounding like a jet engine. Although the server was perfectly usable, I couldn’t imagine the system running like that 24/7 in the server room. That noise level would drive anyone mad after a while! I left defeated for that night and decided to hit it head on the next morning. I did some research when I got home, and based on what info I could discover using the server’s built-in event log, I was able to narrow down where the potential problem was: fans not being installed on the correct headers for that specific chassis.

The next morning, my colleague and I took the server out of the rack to look at the internals. With the help of some guides from Intel’s web site, we confirmed that the cause of the jet-like noise was what I had suspected from my research. When our hardware IT provider built the server, he didn’t connect the AUX cable from the power supply to the motherboard, and 2 of the 3 on-board fans were plugged into the wrong fan headers. On a desktop PC this wouldn’t be an issue, but server systems are picky about what is present and what is missing. If the firmware expects a fan to be plugged into header X and it’s not there, it’s going to make a fuss.
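To make the "header X" point a bit more concrete, here is a rough Python sketch of the kind of check the firmware seems to be doing. This is only my mental model, not Intel’s actual logic: the header names, the RPM readings and the `fan_policy` function are all made up for illustration.

```python
# Hypothetical header names for one chassis profile (an assumption for this
# sketch; the real names come from the board and chassis documentation).
EXPECTED_HEADERS = {"SYS_FAN_1", "SYS_FAN_2", "SYS_FAN_3"}


def check_fans(expected, readings):
    """Return (healthy, missing) for a mapping of header name -> RPM.

    A header counts as missing when it has no reading or reports 0 RPM.
    """
    missing = sorted(h for h in expected if readings.get(h, 0) <= 0)
    return (not missing, missing)


def fan_policy(expected, readings):
    """Model the fail-safe reaction: any missing expected fan means
    full fan speed and a blinking amber health LED."""
    healthy, missing = check_fans(expected, readings)
    if healthy:
        return {"led": "green", "fan_duty": "auto", "missing": []}
    return {"led": "amber-blink", "fan_duty": "100%", "missing": missing}


# Two fans spinning happily, but on headers this chassis profile does not expect.
miswired = {"SYS_FAN_1": 3200, "CPU_FAN_1": 3100, "CPU_FAN_2": 3000}
print(fan_policy(EXPECTED_HEADERS, miswired))
```

The thing to notice is that the fans themselves can be perfectly healthy. If they sit on headers the chassis profile doesn’t list, they count as missing, and the safe reaction is to run everything flat out.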

After sorting out the cabling, the server went back to being whisper quiet after the next boot. The system health LED went back to green, the way it’s meant to be. Just for safety, I ran the firmware update again, which worked perfectly this time. The chassis was detected correctly and the system behaved normally after the update. With the hardware issues resolved, we moved on to installing Windows and getting Hyper-V up and running.

Since the migration, the server has been behaving well. It is showing its age a little, as the Nehalem chips are quite old now. Surprisingly, the system is still doing OK with 48GB RAM, though we may bump that up to 64GB to enable the system to host more VMs. In the end, it just goes to show that it’s sometimes the smallest of things that give us the most grief.

  1. Marius
    April 13, 2015 at 11:02

    Not sure I have found the right guy or if you have already found a solution, but on EduGeek you have described a problem with Server 2012 and slow performance with Office files. I’m not a member there so I can’t post to your thread, but since the issue there is still not resolved: we have found a solution that works for us.

    Try disabling SMB v2 and 3 on server or clients: https://support.microsoft.com/en-us/kb/2696547

    Your thread on edugeek: http://www.edugeek.net/forums/windows-server-2012/135125-slow-opening-office-files-server-2012r2.html

    Hope this reaches you 🙂
