Keith's Tech Musings: August 2012

Monday 20 August 2012

Windows 8 - First thoughts and future telephone support issues

Now that Windows 8 has been released I've been having a play with it this weekend. In terms of usability I'm not entirely sure how I feel about it currently. On the one hand I'm sure in time I could come to get used to the new way of doing things, and the workarounds to enable the old ways, but on the other I can only imagine the pain and misery to come from trying to do telephone support for it.

The key problem I see is with the schizophrenic nature of Windows 8. Having two completely different UIs within a single OS will at best make things interesting, at worst confuse the hell out of people.

It's not helped by a lack of distinction between the two in regards to names. For instance, now that it's no longer officially called Metro, we have the Desktop and the Modern UI, but the latter isn't a name, it's a description! The Desktop UI doesn't exactly look dated either; in fact I'd say it was positively modern. What will we call it in several years' time when the "Modern UI" is anything but? It's not helped by many sources also referring to it as the Windows 8 UI, throwing a third name into the mix.

Then there's Internet Explorer, with two essentially different applications using the same name. They function differently, have different options and ways of doing things, yet how do you easily differentiate them verbally? I've seen enough users struggle to understand what IE is at the best of times, but at least then you can tell them to look for the big blue "e". Now you also have to check which UI they're using, otherwise your instructions will be meaningless.

Internet Explorer also handles downloads differently, which could be fun for anyone using a remote support application (e.g. getting the client to go to a URL, enter some details, and run a program to give you remote access to their machine). Since IE 9 the download dialog has been tied to the window that initiated it. If the download was initiated via a popup (e.g. one that provides the download link to click), and that popup closed once the download started, the download dialog would disappear once the download completed. Fortunately you could bring it back by pressing CTRL-J, and then choose to save or run the download. This still works in the desktop IE 10, but doesn't in Metro IE 10. Instead a dialog appears for 2 or 3 seconds giving you the run / save / cancel options, and if you don't respond in time it vanishes. Thus far I've been unable to find a way to retrieve it other than running the download again.

All that said, I have found a few handy features.

Windows + D - Show Desktop - Nothing new here, but importantly it will get you out of Metro to the normal desktop UI.

Windows + C - Charms menu - This is the Metro menu you get by hovering over the right-hand corners to access things like settings, shutdown/restart etc.

Windows + Z - Tabs / Address menu in Metro IE and All apps from Metro start page - Again, saves messing around with tricky mouse positioning.

Finally, while there's no way to disable Metro on startup, there is a workaround to make getting to the desktop easier. On the Metro start screen, drag the Desktop tile up to the top left-hand corner (i.e. top row, first column). Now after typing your password and pressing Enter to log in you can immediately go to the desktop by pressing Enter a second time (since the top left tile is automatically selected).

http://social.technet.microsoft.com/Forums/en/w8itprogeneral/thread/66df8624-d6bc-46c8-a7a8-2600c6f12193

Sunday 19 August 2012

The joys of TechNet

Thanks to the wonders of a TechNet subscription and a new computer I've been having a play with VirtualBox at home. While certainly more basic than VMware or Hyper-V, it does the job. I think it's easier to use than VMware, and of course Hyper-V won't run on Windows 7.

I'm definitely very tempted by Windows 8 since that includes Client Hyper-V, though due to the train wreck that is the Metro interface I may well stick with VirtualBox if it continues to do what I need, especially if the reviews of the latest release preview turn out to be accurate.

Perhaps it's sad, but I'm actually looking forward to being able to play with various things at home. All those things that have caught my interest and that I fancy looking at, but which I can't justify spending time on while at work.

Sunday 12 August 2012

When Linux load averages lie

To follow on from my last post, having data showing the load on your server throughout the day is great, but what do you do with it and how do you interpret the information?

Looking through the data I could see clear short periods where the load average would skyrocket into double figures, sometimes into the 20s. Every article and post I'd seen said that if the load divided by the number of cores/threads was more than one, you have a problem. No ifs, no buts, you have a problem. I was seeing 20+ on a dual-CPU server... oh dear!
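As a rough illustration of that rule of thumb, the sketch below parses a line laid out like /proc/loadavg and divides each average by the core count. The function name, the sample line and the core count of 2 are my own, chosen to mirror the figures above. It only shows the arithmetic people apply, not whether the result means anything.

```python
def per_core_load(loadavg_line, cores):
    """Return the 1, 5 and 15 minute load averages divided by the core count."""
    fields = loadavg_line.split()
    # The first three fields are the 1, 5 and 15 minute averages.
    averages = [float(value) for value in fields[:3]]
    return [avg / cores for avg in averages]


# Made-up sample in the /proc/loadavg layout: three averages,
# running/total tasks, then the most recent PID.
sample = "20.50 12.40 6.00 3/250 12345"

ratios = per_core_load(sample, cores=2)
# By the "more than one per core" rule, every period here would be flagged.
flagged = [ratio > 1.0 for ratio in ratios]
```

On these numbers the rule flags all three periods, which is exactly the conclusion I'd jumped to.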

The problem was I couldn't see an obvious cause. CPU usage wasn't high, the idle time was good, memory was fine with the swap file not being used, the disk queue wasn't long, and the process list didn't show anything to indicate an issue.

Fortunately I came across a few fantastic explanations of load averages that showed where I (and it seems many others) had been going wrong. I've linked to them all below, and I recommend reading them for more info, but the upshot is that it's not as cut and dried as people make out.

To quote from Jon Emmons' blog, the load average "is the average sum of the number of processes waiting in the run-queue plus the number currently executing over 1, 5, and 15 minute time periods."
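The "average" here isn't a plain arithmetic mean. As the TeamQuest papers linked below describe, the Linux kernel updates each figure about every 5 seconds as an exponentially damped moving average of the number of active tasks. Here is a minimal floating-point sketch of that update. The kernel itself uses fixed-point arithmetic, and the function names and sample numbers are mine.

```python
import math

SAMPLE_INTERVAL = 5.0  # approximate seconds between kernel samples


def update_load(previous, active_tasks, window_seconds):
    """Apply one damped-average update for a 60, 300 or 900 second window."""
    decay = math.exp(-SAMPLE_INTERVAL / window_seconds)
    return previous * decay + active_tasks * (1.0 - decay)


def simulate(active_counts, window_seconds, start=0.0):
    """Feed a sequence of sampled task counts through the update."""
    load = start
    for count in active_counts:
        load = update_load(load, count, window_seconds)
    return load


# One minute (12 samples) with 20 active tasks, starting from an idle box.
burst = [20] * 12
one_minute = simulate(burst, 60)
fifteen_minute = simulate(burst, 900)
```

After that one-minute burst the 1 minute figure has covered roughly 63% of the distance to 20, while the 15 minute figure has barely moved, which is why a short spike looks so different across the three numbers.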

The load average is far more complex than many people make out, and while it can be a good initial indicator of a problem it must be examined in conjunction with other factors. It doesn't allow for the fact that a process could be waiting for not just the CPU, but also disk or network IO, and doesn't allow for the priority of the running/waiting processes in the queue.
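One way to see what sits behind a high figure is to count process states. On Linux the load average includes tasks in uninterruptible sleep (state D, usually waiting on disk) as well as runnable ones (state R). The sketch below tallies both from a list of one-letter states, one per line, such as `ps -eo state=` prints. The function name and the sample data are made up for illustration.

```python
from collections import Counter


def load_contributors(ps_state_output):
    """Count runnable (R) and uninterruptible (D) tasks from one-letter states."""
    states = Counter(
        line.strip()[0]
        for line in ps_state_output.splitlines()
        if line.strip()
    )
    return {"runnable": states["R"], "uninterruptible": states["D"]}


# Made-up sample: one state letter per process.
sample = "R\nS\nD\nD\nS\nR\nD\n"

counts = load_contributors(sample)
```

If most of the count sits under uninterruptible rather than runnable, that suggests the CPU isn't the bottleneck, which would fit a high load alongside plenty of idle time.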

If you have a long-running, low-priority process for instance, it will always make way for more urgent, time-critical requests. In the meantime that process will sit in the queue, and will cause the load average to increase. Add some more of these low-priority processes, for instance a backup job, and the load average will increase further, which appears to indicate a problem. Higher-priority processes like email, websites etc. will be handled immediately however, causing no delay for users, and as such the reported high load isn't really an issue.

So the key is that it's fine to track the load average to indicate a possible problem, but don't rely on it for proof that you have one. Always remember to check the other figures provided to see IF you have a problem, not necessarily WHAT the problem is.

http://www.lifeaftercoffee.com/2006/03/13/unix-load-averages-explained/
http://blog.mellowhost.com/confusing-server-load-average-explained.html
http://www.teamquest.com/pdfs/whitepaper/ldavg1.pdf
http://www.teamquest.com/pdfs/whitepaper/ldavg2.pdf

Tuesday 7 August 2012

Tracking Linux server load and processes over time

Tracking down the cause of high load on a server can be a challenge, especially when issues are happening when you can't keep an eye on it, but it's a challenge I've been facing recently. While there were no issues with the server's response, I discovered that at certain times (often in the evenings) the load average would become high enough to cause Exim to briefly pause processing the mail queue.

After hunting round for an easy way to track the server's state I came across a simple script from Craig Edmonds that did the job. It very simply gathers a variety of status information, including most importantly the process list data from Top, and sends that to you in an email. By scheduling the script to run every minute you get a snapshot of the server's state at regular intervals. Since the subject line includes the load average you can easily look through the messages, spot those times with a high load, and see what the server is doing.

In my case I adjusted the script slightly to include $todaydate in the subject line, since the above issue with Exim meant I couldn't always rely on the messages being received in the correct order.
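The idea is simply that a timestamp in the subject lets you put the messages back in order regardless of when they arrive. The original script is PHP; the Python sketch below only illustrates the principle, and the subject layout, hostname and figures are all hypothetical.

```python
from datetime import datetime


def build_subject(hostname, load_1min, when):
    """Build a subject that sorts chronologically as plain text."""
    # Year-first timestamps sort correctly as strings.
    stamp = when.strftime("%Y-%m-%d %H:%M:%S")
    return f"[{stamp}] {hostname} load {load_1min:.2f}"


# Messages listed in the order they happened to arrive (out of sequence).
arrived = [
    build_subject("server1", 21.4, datetime(2012, 8, 7, 19, 32, 0)),
    build_subject("server1", 3.1, datetime(2012, 8, 7, 19, 30, 0)),
    build_subject("server1", 18.75, datetime(2012, 8, 7, 19, 31, 0)),
]

# Sorting on the subject restores the order the snapshots were taken in.
in_order = sorted(arrived)
```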

There was one problem I found with this solution. The script runs a single iteration of Top and inserts the output of that into the email; however, as you can see from the man page for Top:

       The top command calculates Cpu(s) by looking at the change in CPU time
       values between samples. When you first run it, it has no previous
       sample to compare to, so these initial values are the percentages since
       boot. It means you need at least two loops or you have to ignore
       summary output from the first loop. This is problem for example for
       batch mode. There is a possible workaround if you define the CPULOOP=1
       environment variable. The top command will be run one extra hidden loop
       for CPU data before standard output.

each email I received had identical CPU data. While I could tell the server's load was high, I couldn't see what the state of the processor was at that time. I didn't fancy messing around with environment variables, so instead opted for a solution found on the unix.com forums (linked in the references below), and adjusted the line calling Top as follows:

    $process_list = shell_exec('top -b -n2 | awk "/^top/{i++}i==2"');

So I found a simple and easy way to track what's happening on the server, though of course with an email a minute it's not something I'll be running long term.
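For anyone unsure what the awk filter in the adjusted line is doing: it counts lines beginning with "top" (the header of each iteration) and prints only from the second header onwards, so the since-boot figures from the first sample are thrown away. The same logic is sketched below in Python against a made-up, heavily trimmed sample of batch output. The function name and the sample text are mine, not part of the script.

```python
def second_sample(top_output):
    """Keep only the second iteration from `top -b -n2` style output."""
    iteration = 0
    kept = []
    for line in top_output.splitlines():
        # Each iteration's summary starts with a line beginning "top".
        if line.startswith("top"):
            iteration += 1
        if iteration == 2:
            kept.append(line)
    return "\n".join(kept)


# Made-up, trimmed sample: the first block holds the since-boot CPU figures.
sample = "\n".join([
    "top - 19:30:01 up 10 days, load average: 20.11, 8.02, 3.15",
    "Cpu(s):  1.2%us,  0.5%sy, 98.0%id",
    "",
    "top - 19:30:04 up 10 days, load average: 20.11, 8.02, 3.15",
    "Cpu(s): 35.0%us, 10.0%sy, 20.0%id",
])

current = second_sample(sample)
```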

I wish I could say that was the end of it, but unfortunately this turned out to be the beginning of my struggle and confusion, due in no small part to the number of confused explanations of how the load average works, but I'll discuss that in my next post.

References :

http://www.craig-edmonds.com/abuse-monitoring-script-for-cpanel/

http://www.unix.com/gentoo/77494-top-batch-mode-cpu-info-wrong.html