The long hunt for a cure


At the end of March 2014, our school took ownership of a new Intel 2600GZ server to replace our previous HP ML350 G5, which was the heart of our network. The HP had done a fantastic job over the years, but it was aging rapidly and wasn’t officially supported by Windows Server 2012 R2. Our new server has 32GB of RAM, dual Xeon processors, dual power supplies, 4 network ports and a dedicated remote management card. Although a little pricier than I had originally budgeted for, it matched what the HP had and would earn its keep over the next 5-7 years of service.

After racking and powering up the server, I installed firmware updates and then Server 2012 R2. The install was quicker than on any other server I’ve set up in the past, thanks to the SSD boot volume. After going through all the driver installs, Windows Updates and so on, the server was almost ready to start serving. One of the last things I did was to bond all 4 network ports together to create a network team. My thinking was that a 4Gb/s team would prevent any bottlenecks to the server under heavy load, as well as provide redundancy should a cable or switch port fail. Good idea in theory, but in reality I’ve never had a cable or port in the server room go bad in 6+ years.

Looking back now, I’m not sure exactly why I bothered creating a team. While the server is heavily used as a domain controller, DHCP, DNS and file server, it never comes close to saturating 1Gb/s, let alone 4. Almost every computer in the school is still connected at 100Mb/s, so the server itself never really comes under too much strain.
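As a rough back-of-the-envelope check on the numbers above, the sketch below works out how many 100Mb/s clients would have to be transferring flat out at the same moment to fill a 1Gb/s link or a 4Gb/s team. It ignores protocol overhead and assumes every client really does sustain its full line rate, so treat the result as an optimistic lower bound rather than a measurement. The function name is made up for illustration.

```python
import math


def clients_to_saturate(link_mbps: float, client_mbps: float) -> int:
    """Smallest number of clients, each at full line rate, needed to fill the link."""
    return math.ceil(link_mbps / client_mbps)


# Compare a single 1Gb/s port with the 4-port team, using 100Mb/s clients.
for link_mbps in (1000, 4000):
    needed = clients_to_saturate(link_mbps, 100)
    print(f"{link_mbps} Mb/s link: {needed} clients at 100 Mb/s")
```

In other words, it would take at least 10 clients pulling data at full speed simultaneously to fill even one port, and 40 to fill the team.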

Either way, once everything was set up, I proceeded to copy all the files across from the old HP to the new Intel server. I used Robocopy to bulk move files, and in some cases had to let the process finish overnight because there were so many files, especially small ones. Data deduplication was turned on, shares were shared and everything looked good to go.
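For anyone scripting a similar migration, here is a minimal sketch of how a Robocopy run could be assembled from Python. The exact switches I used at the time aren't recorded here, so the ones below are an assumption: a fairly typical set for copying a share with its permissions. The server name, paths and helper name are hypothetical.

```python
def build_robocopy_command(source, destination, log_file, threads=8):
    """Return a Robocopy argument list for copying a folder tree (illustrative only)."""
    return [
        "robocopy", str(source), str(destination),
        "/E",              # copy subdirectories, including empty ones
        "/COPYALL",        # copy data, attributes, timestamps, ACLs, owner, auditing info
        "/R:2",            # retry a failed file twice...
        "/W:5",            # ...waiting 5 seconds between retries
        f"/MT:{threads}",  # multithreaded copy, which helps with lots of small files
        f"/LOG:{log_file}",
    ]


# Hypothetical source share and destination folder.
command = build_robocopy_command(r"\\OLDHP\Staff", r"D:\Shares\Staff", r"C:\Logs\staff.log")
print(" ".join(command))
```

The list can be handed to `subprocess.run(command)` on the Windows machine doing the copy. Note that Robocopy uses non-zero exit codes for success too; only codes of 8 and above indicate that something failed.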

When school resumed after the holidays, the biggest problem came to light right on the first morning: users being unable to simultaneously access Office files. We have a PowerPoint slideshow that is run every morning in the register period that has all the daily notices for meetings, events, reminders, detention etc. Prior to the move, this system worked without fault for many years. After the move, the moment the 2nd or 3rd teacher tried to access the slideshow, they would get this result:

[Photo: WP_20140409_001]

A green bar of doom would crawl across the navigation pane, while an odd Downloading box would appear, take forever to do anything and tend to lock Explorer up. Complaints naturally came in thick and fast, and the worst part was that I couldn’t pinpoint what the issue was, aside from my suspicion that the new SMB3 protocol was to blame. I had hoped that the big Update 1 release for Windows 8.1 and Server 2012 R2 would help, but it didn’t. Disabling SMB signing didn’t help either. At one point, my colleague and I even installed Windows 8.1 and Office 2013 on some test machines to rule out the client OS and Office version as the cause, but they ended up doing the same thing. As a stopgap measure, I made a dedicated Notices drive on the old HP, which was still running Server 2008 and handled concurrent access to the same file without problems. Online forums weren’t any real help, and none of the other admins in Cape Town I spoke to had encountered the problem either.

In the school holidays just gone by, we finally had a decent gap between other jobs to experiment on the new server and see if we could correct the problem. I broke the network team, unplugged 3 of the 4 cables and disabled LACP on the switch. After reassigning the correct IP to the now single network port, we ran some tests, opening files on 2 and then 3 computers at the same time. We opened 15MB Word documents, 5MB complicated Excel files, 200MB video files and more. The Downloading box never showed up once. Unfortunately, without heavier real-world testing by the staff, I don’t know if the problem has been resolved once and for all. I intend to move the Notices drive back during the next school holiday, and we will see what happens after that.
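We did these tests by hand, but a rough scripted version of the same idea is sketched below: several readers open the same file at the same instant and each one is timed. This is only an approximation of what we tested. Threads on one PC share a single SMB client and its cache, and Office opens files with its own locking behaviour, so a clean run here doesn't prove much on its own. The function name is made up, and the demo uses a temporary local file so it runs anywhere; in practice you would point it at a file on the share (something like a hypothetical `\\SERVER\Notices\notices.pptx`).

```python
import os
import tempfile
import threading
import time


def concurrent_read(path, readers=3, chunk_size=1024 * 1024):
    """Read `path` from several threads at once; return seconds taken per reader."""
    barrier = threading.Barrier(readers)
    timings = [None] * readers

    def worker(index):
        barrier.wait()  # hold every reader here so they all start together
        start = time.perf_counter()
        with open(path, "rb") as handle:
            while handle.read(chunk_size):
                pass  # discard the data; only the elapsed time matters
        timings[index] = time.perf_counter() - start

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(readers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return timings


# Demo against a 1MB temporary file standing in for a document on the share.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 1024 * 1024)
try:
    for index, seconds in enumerate(concurrent_read(tmp.name, readers=3)):
        print(f"reader {index}: {seconds:.4f}s")
finally:
    os.remove(tmp.name)
```

If one reader takes wildly longer than the others, or the script hangs once the second or third reader joins in, that would at least hint at the same behaviour the teachers saw.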

Chalk one up for strange issues that are almost impossible to hunt down.
