Friday, August 03, 2012

Scotgrid Calling

It has been a while since we last updated the blog, which is generally a sign of being busy. Unfortunately, we have encountered several infrastructure issues recently that needed to be repaired. These mostly revolved around the air conditioning units on the roof of the Kelvin Building. That work was completed a few weeks ago, but no sooner was one thing fixed than another issue appeared, in the form of a failing Air Handling Unit in room 141. The knock-on effect is that we can't take full advantage of the cluster servers located in that room, and the overall cluster is currently running at two thirds capacity.

While these events are less than optimal, they have allowed us to plan the next set of cluster upgrades, which will introduce another 256 job slots into the cluster. Thanks to the new resilient network fabric we have developed, the deployment of these services is no longer limited to one room supporting 10 gig interfaces.

Other developments include the re-introduction of an independent control network and a new WAN testing platform, Perfsonar. We will blog about these separately shortly.

