The Clean Windows PC experience

February 15, 2015

Microsoft Windows is an amazing piece of software. It powers an incredibly wide range of hardware and runs on wildly different system specifications. One person may have a bargain basement Celeron or Pentium laptop, while another is running a fully tricked out Core i7 beast – Windows covers it all. With multiple OEMs making products, the consumer is generally spoiled for choice across a wide range of price points. The downside, however, is that Windows has often been associated with a race to the bottom of the barrel, while Apple, for example, refuses to go below a certain line and, rightly or wrongly, maintains a prestigious, upmarket image.

Part of the race to the bottom means that profits for OEMs are razor thin. Make a mistake and your competitors are going to pounce. Fail to keep up and likewise. Fail to cut down on costs and you risk going bust. As a result of this fierce competition, consumers have sometimes been the victims. Laptops are built with creaky plastic that doesn’t always sit flush, screen resolutions haven’t increased in years, mechanical hard drives are still king, model ranges are so crowded that people are often left confused about what separates one model from another, the amount of RAM is just enough to get by with, and cheap Realtek network and audio solutions are used. On the software side of things, OEMs take money from anti-virus vendors to preload their wares onto the computers. Throw in CD/DVD burning trial solutions, vendor backup programs and other useless vendor software, and you are left with a horrible laptop/desktop experience. Users don’t love Windows, they just tolerate it.

The hardware issue is tricky, since it depends on economies of scale. An SSD, for example, would greatly improve people’s experience with their computer, but a 250GB SSD still costs much more than a 1TB mechanical drive. Screen resolution in laptops is slowly starting to move forward again, but it will take time. Trackpads are also finally starting to improve, but it’s still hit and miss. With desktops, it’s really become about trying to cut down on size as much as possible and go small.

The software side of things is where the most immediate improvement can be made. If OEMs followed Microsoft’s Windows Signature Edition experience, I think many a customer would be happy. Instead of having Windows loaded down with bloatware, trials and other software, Windows would come clean out of the box, with a few minimal applications installed – Flash, Adobe Reader, Skype and Microsoft Security Essentials (for Windows 7). For Windows 8 based machines, the OEMs should make sure that the devices are shipped with Windows 8.1 at minimum, but ideally Update 1 should be installed as well, which improves the experience on traditional laptops/desktops. OEMs should strive to keep their images as up to date as possible, so that the end user isn’t downloading a few GB worth of updates after their first boot. There’s nothing worse than powering up and watching Windows Update fire up and tear through a few GB worth of bandwidth as it pulls down patches.

Lastly, hardware in the computer should not require an application to be installed just so that the driver gets installed as well. I’ve had this problem with Lenovo and Samsung laptops, where in order to get rid of an outstanding entry in Device Manager, I’ve had to install one of the Samsung/Lenovo utilities. Often these utilities don’t work well and just add frustration for the end user.

Famed Windows blogger Paul Thurrott has a few articles up where he goes right back to basics and does completely clean installs of Windows on some of his devices. As he notes, it’s sometimes the only way to truly be rid of all the bloatware OEMs like to install. Included are steps on how to legally download clean ISO images you can burn to disc or write to a USB stick for a clean install of Windows. You can find his articles here, here, here, here and here.

SMART Board and USB port fun

February 1, 2015

Over the last two weeks, we’ve slowly been ramping up our classroom computer swap program at work. 6 year old Core 2 based computers with horrid chassis and power supplies are coming out, being replaced with first gen Core i3 boxes that are quieter, smaller and faster. However, a recent event almost threatened to derail the project.

I placed one of the replacement computers in a class, had it set up as per usual and all was going well. After rebooting, however, I noticed that the SMART Board (model SB-680) was not behaving properly. The board was either vanishing just before the computer had fully booted into Windows, or it would constantly reset and be basically unusable. Changing USB ports did not help; it only gave a temporary fix that lasted until the next boot.

I got the reseller of the board involved to do deeper technical diagnostics, though in honesty it was more a case of handing the problem over to someone else. This past Friday afternoon they arrived and we started a long troubleshooting process. The board was hooked up to the techie’s laptop and after some time, it settled down and behaved normally. We then swapped out the controller card, swapped out the board itself and tested on the replacement PC. All to no avail, the problem kept coming back. We even tried a new USB booster cable and USB cable, same result.

In desperation, I went into the BIOS to change the USB settings for the newer classroom motherboards. The Intel DQ57TM motherboards had been running completely fine for the last 4 years without issues in both our computer labs, so I couldn’t understand why it would give issues now. They are all flashed to the latest firmware Intel offers, so there would be no fix that way. It turns out that one simple BIOS setting may have caused the issue.

When I set up the computer via network boot and install using Microsoft’s Deployment Toolkit, I had to set the USB Backward Compatibility option to Disabled in the BIOS, as the keyboard and mouse were non-functional in Windows PE. After the whole install process was over, I didn’t bother to change the setting back, since I didn’t believe it would affect anything. Suffice to say, enabling the option caused Windows to install a whole bunch of extra USB root hubs and stuff after the reboot. In turn, this then let the SMART Board behave properly. Our reseller’s techie learned something new, as did I. Now I know that I must make sure the setting goes back to Enabled before installation in the classroom, so that headaches can be avoided.

The truly bizarre thing, however, is that the problem only seems to be triggered if the SMART Board is hooked up to the computer via a USB booster extension cable. If the board is close enough to the computer desk and doesn’t use the booster extension, the board seems to work fine with the setting at Disabled. I have 2 classrooms where such is the case, and neither of those rooms has reported issues with its board since the school year started.

Another quirky problem to add to the knowledge base of fun when it comes to SMART Boards.

UCT IT Management short course

October 28, 2014

After 10 weeks of study, thought provoking questions as well as the odd bit of frustration, I finally finished off the last module of my UCT IT Management short course last night. Offered in partnership between the University of Cape Town and a private company called GetSmarter, the course is aimed at widening the knowledge of IT managers from all walks of life. There is a wide range of other courses on offer from the site, ranging from 8 – 10 weeks, all of them offered online. I decided to do the management course in the hope that I would pick up some new skills and get some new ideas, since these days I’m doing a lot more management rather than just purely technical work. Shaping budgets and policy is something new to me, so all the more reason I was eager to take the course.

The IT course is 10 weeks long as mentioned, so a week for every module. The entire course is run through GetSmarter’s VLE, which is a heavily modified version of Moodle. 5 of the 10 modules are tested via online quizzes of the usual fare, i.e. multiple choice, True/False, pick the correct one etc. The other 5 modules are written assignments where you download a document with a case scenario in it as well as questions. From there you have to answer questions as well as outline various scenarios, all while watching a line count per answer. Once completed, these documents are uploaded back into the VLE for marking.

I found that as the course went past the half way mark and into week 6, the content of the course became quite theoretical and abstract and dealt less with current trends and topics. Coming from a network administrator’s position in a school, a large amount of the terms and concepts I was exposed to were completely new to me. Changing my thinking to think along business lines proved to be quite a challenge, since the corporate world moves quite differently than the educational world. I know that out of the 10 modules, module 3 was definitely my least favourite, as it was incredibly densely packed with jargon and enormous amounts of theoretical knowledge.

Overall, I think the course is worth the money asked for it. Extra studies are always good for jogging the brain out of its set ways. However, if you are new to network administration or IT, it’s definitely not the course for you – vendor qualifications are more appropriate in that case. This course is more for techies and admins who are moving up towards managing IT in their place of work, though as mentioned the course is almost completely focussed on the corporate world.

On an unrelated note, the course also showed me that Moodle can definitely work if enough effort is put into it – custom theme, disabling many end user features and so on. My experience is limited, but it’s been the best Moodle experience I’ve ever had.

Ddrescue to the rescue

September 20, 2014

A few weeks back, thanks to the blue screen caused by Microsoft’s batch of faulty updates, I formatted a teacher’s class computer and redid it from scratch – this was before I managed to find the workaround to fix the blue screen issues. The computer had been running fine since then, until this past week. The teacher started complaining bitterly about how slow the PC had become. I checked for malware, as well as for any other crappy software that may have been causing the slowdown. I found nothing. I asked the teacher to monitor the PC, while I investigated further.

A few days later, the teacher was even more frustrated with the machine. Now it was taking forever to start up, shut down and was hanging on applications. I looked through Event Viewer, only to discover ATAPI errors were being logged. Not just one either, there were dozens of errors. The moment I saw this, I knew that the hard drive was on the way out. While the SATA port could be faulty or even the cable, the odds of those being the culprits were rather low. Too many bad experiences in the past have taught me that it is almost always the drive at fault.

I procured a spare drive and decided the quickest fix was to simply clone one drive to the other. Using Clonezilla I tried to do the clone. On my first pass, about 75% of the way through the PC looked like it went to sleep and I couldn’t see any output on the monitor. I couldn’t revive the PC, so I rebooted and tried the procedure again. This time, it got up to about 97.5% before it crashed out. Based on what I saw, Clonezilla was hitting bad sectors, corrupt files or the mechanical weakness in the drive. Now I was getting worried, because any more cloning attempts could hasten the end of the faulty drive. Not only that, it was wasting time. Setting up the PC from scratch again was my last resort, since it would take hours. Before I gave up and did that, I remembered Ddrescue.

I had tried to use Ddrescue on my home computer more than a year ago when the hard drive holding my Windows 8 install died. Sadly, that drive was too damaged even for Ddrescue to be able to save. I was hoping that the teacher’s hard drive hadn’t yet hit that stage.

I ran Ddrescue and then waited as the drive literally copied itself sector by sector over to the new drive. What I wasn’t aware of is that Ddrescue doesn’t understand file systems – it just copies raw data from one drive to another. This means it will copy any file system, but in order to do so, it must copy every block on the disk. A tool like Clonezilla will understand a file system and only copy used data blocks, therefore saving lots of time by not copying essentially blank space.
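To make the difference concrete, here is a rough Python sketch of the raw, block-by-block approach described above. It is not how Ddrescue is actually implemented (the real tool keeps a map file and uses smarter read strategies); the function name `rescue_copy`, the 512 byte block size and the `FlakyDisk` stand-in for a failing drive are all assumptions made purely for illustration.

```python
import io

BLOCK_SIZE = 512  # assumed sector size for this sketch


def rescue_copy(src, dst, total_size, block_size=BLOCK_SIZE):
    """Copy src to dst block by block, with no knowledge of any file system.

    Unreadable blocks are skipped on the first pass and retried once at the
    end. Returns the offsets that were still unreadable after the retry.
    """
    bad_offsets = []
    # First pass: copy every block, used or not, and keep going on errors.
    for offset in range(0, total_size, block_size):
        length = min(block_size, total_size - offset)
        try:
            src.seek(offset)
            data = src.read(length)
        except OSError:
            bad_offsets.append(offset)
            data = b"\x00" * length  # placeholder so later offsets line up
        dst.seek(offset)
        dst.write(data)

    # Second pass: come back and try to pull out what was skipped.
    still_bad = []
    for offset in bad_offsets:
        length = min(block_size, total_size - offset)
        try:
            src.seek(offset)
            data = src.read(length)
        except OSError:
            still_bad.append(offset)
            continue
        dst.seek(offset)
        dst.write(data)
    return still_bad


class FlakyDisk(io.BytesIO):
    """In-memory stand-in for a failing drive (hypothetical helper)."""

    def __init__(self, data, flaky_offsets=(), dead_offsets=()):
        super().__init__(data)
        self.flaky = set(flaky_offsets)  # fail once, then read fine
        self.dead = set(dead_offsets)    # never readable

    def read(self, size=-1):
        offset = self.tell()
        if offset in self.dead:
            raise OSError("unreadable block")
        if offset in self.flaky:
            self.flaky.discard(offset)
            raise OSError("transient read error")
        return super().read(size)
```

Because the loop walks every offset regardless of what is stored there, the run time depends on the size of the disk rather than on how much data is on it, which is why a file system aware tool like Clonezilla is normally so much quicker.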

Ddrescue did hit one patch of bad data, but was able to continue going, then came back at the end to try and pull out what it could. Thankfully, whatever bad data there was wasn’t too major, and Ddrescue completed successfully. Booting from the new drive was a success, and best of all, the speed was back again. I did run an sfc /scannow at the command prompt to check for any potential corrupt system files. SFC did say it fixed some errors, and I rebooted. Apart from that, it looks like I managed to save this system in the nick of time. The old hard drive was still under warranty, and has been returned to the supplier. He can return that drive and get a replacement for us, which will become a new hot spare for some other classroom.

When Windows Update goes wrong

Windows Update is usually a very reliable method of keeping Windows based computers up to date. Rough in the early days, it’s come a long way since then. Smooth and mostly transparent in the background, it isn’t often that bad updates slip through.

Unfortunately, after August’s Patch Tuesday, such an event occurred. After a number of updates were either automatically approved or approved by myself, we had some computers blue screen and go into a reboot loop. Thankfully, out of almost 180 computers, only 5 have suffered the problem seen below:

[Image: WP_20140814_001]

All of the affected computers were running Windows 7 x64 SP1 with all updates applied. The first 3 times this happened, I couldn’t find a cure for the problem and ended up wiping and redoing the computer from scratch. Later in the week, I found some instructions online on how to get out of the loop and get back into working order.

  1. Get into the Recovery Console either from install media or by letting the Repair your Computer wizard run after a number of crashes.
  2. Open up a Command Prompt and delete the FNTCACHE.DAT file located in C:\Windows\System32
  3. Reboot the computer, and you should now be able to get back into Windows.
  4. Delete the FNTCACHE.DAT file again, as it will have been recreated by Windows.
  5. Lastly, go to Windows Update in the Control Panel, then view Installed Updates. Remove KB2982791 and optionally KB2970228. The other 2 updates mentioned out there on the web only apply to Windows 8.1/Server 2012 and so are irrelevant to Windows 7 computers.
  6. Reboot after the patches are removed.

As I said, it’s not often anymore that bad updates slip through all of Microsoft’s testing, but it does happen. Although it’s frustrating, I don’t intend to modify how I approve patches. I’d rather take the risk of something like this happening than get hammered by Alureon or Conficker or some other nasty because I ignored security patches.
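For anyone who would rather script the clean-up than click through the Control Panel, the steps above could be sketched roughly as follows. This is an assumption on my part rather than what I actually did: it swaps the Installed Updates screen for the `wusa` command line tool, and the helper name `removal_commands` is made up for this example. The script only builds the command lines so they can be reviewed before being run from an elevated Command Prompt.

```python
# Font cache file named in the steps above.
FONT_CACHE = r"C:\Windows\System32\FNTCACHE.DAT"

# Updates named in the steps above; the second one is optional.
REQUIRED_KBS = ["2982791"]
OPTIONAL_KBS = ["2970228"]


def removal_commands(include_optional=True):
    """Build (but do not run) the commands for steps 4 and 5."""
    kbs = REQUIRED_KBS + (OPTIONAL_KBS if include_optional else [])
    # Step 4: delete the font cache that Windows recreated.
    commands = [f'del "{FONT_CACHE}"']
    # Step 5: uninstall each update without forcing an immediate restart.
    for kb in kbs:
        commands.append(f"wusa /uninstall /kb:{kb} /quiet /norestart")
    return commands


for command in removal_commands():
    print(command)
```

A manual reboot is still needed afterwards, as in step 6.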

Firmware update fun

A couple of years ago, flashing any device’s firmware was often a difficult, frustrating and sometimes downright dangerous task. Always hoping that the device wouldn’t get bricked due to some unknown bug in the firmware, or worse still, a power failure right in the middle of the flash.

These days for things like motherboards, it can be as easy as flashing inside Windows, or using the built-in feature on the motherboard. Generally speaking, you no longer have to use MS-DOS and try to find floppy disks or use an alternative, it just works. Intel motherboards in particular are usually very straightforward when it comes to this: run the Express Update inside Windows. Windows reboots, the motherboard flashes itself, reboots and you are back into Windows. No other intervention required.

Thus it was a bit irritating a few weeks ago when I decided to flash some of our Z68 motherboards to their latest (and last) BIOS version. I ran the Express Update inside Windows as I’ve done countless other times. The computer rebooted, failed to flash the firmware and then went back into Windows. No matter what I tried, the firmware would not update. My next step was to download the *.bio file from Intel’s website, place it on a flash drive and press F7 during boot, so that I could update the BIOS. This didn’t work either:

[Image: WP_20140715_002]

That left me only one option – use Intel’s Iflash tool. I don’t have a copy of MS-DOS lying around, and I didn’t feel like jumping through hoops just to get a flash drive set up correctly. I discovered that Iflash works with FreeDOS, so I simply placed the files on a flash drive I have set up with Ultimate Boot CD, which includes FreeDOS. I ran Iflash and the computer rebooted, but then sat for a while doing nothing. I was about to reset the computer when I noticed the power LED on the computer doing a slow pulse. I remembered that Intel motherboards generally do this when updating the firmware or when in sleep mode, so I let the process go on. Sure enough, after about 3 minutes, the computer rebooted by itself. The latest BIOS was now installed and working correctly.

Thankfully there were only about 5 computers to do this on. I’m not sure why this model of motherboard was so fussy, but it’s done now.

Fixing Windows Update issues

August 3, 2014

About three weeks ago, I approved a number of updates to be downloaded into WSUS for distribution on the school network. Among those updates was an update for the Windows Update client itself. I watched the WSUS console as the computers started reporting back and after a while I began to notice an odd pattern. 36 out of 39 computers in our main computer lab were not reporting in.

Taking a look at one of the affected computers in the lab, the cause of the computer not reporting in became clear: Windows Update Agent 7.6.7600.320 was failing to install repeatedly. Since this new version was required to download and install updates from WSUS, the computers would not be able to patch themselves until this Agent issue was fixed.

I tried numerous approaches to get the issue fixed: uninstalling the anti-virus software, installing updates at shutdown instead of through Windows Update in the Control Panel, running the System Update Readiness tool, and running System File Checker from the command prompt. Nothing worked. I was on the verge of wiping the lab and reimaging the computers when I came across the answer.

Thanks to some vigorous internet scouring, I came across this Knowledge Base article on Microsoft’s website: http://support.microsoft.com/kb/2887535. Thankfully, the latest update agent was available to download right there from the article. I downloaded the 64 bit version and attempted to install the update manually on my affected lab computer. After a required reboot, I had success. Windows Update connected again and proceeded to download the now missing 17 updates and installed them. With this proving to be the solution, I went to each computer and installed the new update agent by hand. One by one, these computers were cured of the issue.

One computer however refused to install the updated agent. Checking the CBS Log file found in C:\Windows\Logs\CBS revealed that it thought it needed to be rebooted before updates could be installed. However, rebooting did not solve the problem. I’ve had issues in the past with Server 2008 where it got stuck on updates and needed a certain XML file to be deleted before it would boot again. Going to the location of the XML file, I couldn’t find the usual XML file. I did however find a reboot.xml file, which I viewed. This file pointed to a registry key that I assume was supposed to be deleted after the last round of updates. Since the key wasn’t deleted, the computer still thought it needed to be restarted. Deleting this key and rebooting solved the issue – I could now install the updated agent and install updates again.
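Digging through the CBS log by eye is tedious, so a small helper along these lines could speed up the hunt. This is only a sketch under a few assumptions: that the log is the CBS.log file in the folder mentioned above, that it can be read as text, and that searching for the word "reboot" is a reasonable starting point. The function names `find_mentions` and `scan_log` are made up for this example.

```python
from pathlib import Path

# Assumed file name inside the CBS log folder mentioned above.
CBS_LOG = Path(r"C:\Windows\Logs\CBS\CBS.log")


def find_mentions(lines, keyword="reboot"):
    """Return (line_number, text) pairs for lines containing keyword.

    The match ignores case; line numbers start at 1.
    """
    needle = keyword.lower()
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if needle in line.lower()
    ]


def scan_log(path=CBS_LOG, keyword="reboot"):
    """Search a log file on disk; undecodable bytes are replaced, not fatal."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_mentions(handle, keyword)
```

Calling `scan_log()` on the affected machine would list every line that mentions a reboot, which at least narrows down where to start reading.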

At this point in time, I’m still not exactly sure why this lab of computers failed to install the update agent while the rest of the school did so without much fuss. About the only thing I can think of is that it’s somehow related to how the lab was cloned. Reading through the CBS logs doesn’t shed much light on the issue, since I don’t fully understand everything that’s captured in those log files.

I suppose this serves as a good reminder that while WSUS and Windows Update in general normally just work, sometimes things can go wrong.
