Monday, March 24, 2014

The Three Co-ordinators

It has been a while since we posted on the blog. Generally, this means that things have been busy and interesting. Things have been busy and interesting.

We are presently going through a redevelopment of the site, evaluating new techniques for service delivery such as Docker containers, and updating multiple services throughout the sites.

The development of the programme presented at CHEP on automation and different approaches to delivering HEP-related Grid services is underway. An evaluation of container-based solutions for service deployment will be presented at the next GridPP collaboration meeting later this month. Other evaluation work on using Software Defined Networking hasn't progressed as quickly as we would have liked, but is still underway.

Graeme (left), Mark (center) and Gareth.

In other news, Gareth Roy is taking over as the Scotgrid Technical Co-ordinator this month. Mark is off for adventures with the Urban Studies Big Data Group within Glasgow University. And if Doctor Who can do it, so can we: Co-ordinator Past, Present and Future all appear in the same place at the same time.

Will the fabric of Scotgrid be the same again?

Very much so.

Monday, October 14, 2013

Welcome to CHEP 2013

Greetings from CHEP 2013 in a rather wet Amsterdam.

The conference season is upon us and Sam, Andy, Wahid and myself find ourselves in Amsterdam for CHEP 2013. CHEP started here in 1983 and it is hard to believe that it has been 18 months since New York.

As usual the agenda for the next 5 days is packed. Some of the highlights so far have included advanced facility monitoring, the future of C++ and Robert Lupton's excellent talk on software engineering for Science.

As with all of my visits to Amsterdam, the rain is worth mentioning. So much so that it made local news this morning. However, the venue is the rather splendid Beurs van Berlage in central Amsterdam.

CHEP 2013

There will be further updates during the week as the conference progresses.

Busy Year

We haven't posted a great deal this year as there has been a huge amount going on within Scotgrid since January.

The main news this year has been Stuart's departure from the Glasgow Scotgrid Team to Saint Andrews University. Stuart's roles within EGI, ROD and Grid Ops, his work on the Glasgow site, and his development work on MPI at Glasgow say a lot about his rather busy part-time role within Scotgrid. The word factotum, literally "do everything", springs to mind when describing his input.
We all wish Stuart the very best at Saint Andrews.

The Glasgow site has suffered from issues with the coolant infrastructure since January. To mitigate this the University is upgrading both the power and air conditioning within the Kelvin Building. This work will include the installation of a Generator and UPS system as well as new air conditioning units. This is a long term project and will be completed by the summer of 2014.

ECDF has performed incredibly well since January and Durham, while suffering from air-con and power issues earlier in the year, is now relatively stable.

We have brought in additional VOs with the MVLS group at Glasgow University and are presently in discussions with other non-HEP groups such as biochemistry. The most technically challenging project is the proposed investigation into the Lairg magnetic anomaly by the EarthSci group at Glasgow. This project is difficult due to the lack of network connectivity in the area where the data is being generated; we will report on this soon.

Our research focus, outside of running the sites, has covered GPU work at ECDF and more efficient data management and deployment strategies at Glasgow and ECDF. Additionally, Gareth at Glasgow has investigated how we utilise containerisation and build smarter cluster restart environments. David Crooks has done excellent work aggregating the multiple monitoring platforms that have sprung up within the Grid by utilising the Graphite package.
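For readers curious what feeding a site metric into Graphite looks like, samples can be pushed over carbon's plaintext protocol, one "path value timestamp" line per sample, to the carbon listener (port 2003 by default). The sketch below is illustrative only; the metric path and hostname are made up, not the ones used in the actual aggregation work:

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Render one sample in carbon's plaintext protocol: 'path value timestamp\\n'."""
    ts = int(time.time() if timestamp is None else timestamp)
    return f"{path} {value} {ts}\n"


def send_metrics(lines, host="graphite.example.org", port=2003):
    """Push pre-formatted plaintext lines to a carbon-cache listener."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall("".join(lines).encode("ascii"))


# Example: one sample of free job slots for a hypothetical site metric.
sample = format_metric("scotgrid.glasgow.cluster.free_slots", 412, timestamp=1380000000)
```

The plaintext protocol is the simplest way to get heterogeneous monitoring sources into one Graphite instance: anything that can open a TCP socket can report.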

We have attended and presented in multiple conferences and public outreach events including one during the Edinburgh Festival.

So that brings us up to date in time for CHEP, which is on this week. We are still trying to work out how quickly the last 18 months went.

Tuesday, December 25, 2012

A Merry Christmas and a Happy New Year to all our followers, users and co-workers from Scotgrid Glasgow.
Keeping to a Physics theme, as always.

See you all in 2013.

Thursday, December 06, 2012

2012: A Grid Odyssey

We haven't published on the blog since September this year which is a bit remiss of us.
There are many reasons for this. Primarily we have been working through the final backlog of the DRI grant until October. The expansion of the Glasgow site to 4000 cores, tera-scale networking and changes to the disk farm have not been simple. Once the long-standing issues with the internal data network were resolved with the upgrade to the Extreme Networks equipment, additional issues around the placement of data by DPM became evident. This was not a trivial task to investigate. Stuart and Sam are in the process of developing a software patch to allow more sensible placement of data files within the cluster.

In addition to this work, we are currently considering software and hardware changes to our data storage architecture in the new year. More of this in January.

Again this year Glasgow has been plagued with infrastructure problems which have caused several major disruptions to the site's operation. We are now in a position where there is a major upgrade programme underway to deliver more robust power, fire suppression and air conditioning systems throughout the computer rooms.

While these combined problems have caused considerable disruption, the Glasgow site saw a return to 100% availability and reliability metrics for November in the WLCG accounting earlier this week.
Hopefully, this is how we will continue through the Christmas period and into 2013.

As the end of the year approaches, with the winter solstice just over 15 days away and Christmas following shortly behind it, we would like to wish everyone a Merry Christmas and a Happy New Year from all of those at Scotgrid.

Monday, September 03, 2012

All hands to the pumps, oh wait, that is Glycol

Unfortunately on Friday the Air Con fairy visited Glasgow and, due to a faulty pressure valve, decided to sprinkle some magic in one of our plant rooms by dumping liquid coolant onto the floor. We took emergency action and shut down the cluster as the temperature in room 141 was climbing above 30 degrees centigrade. The faulty equipment and associated devices were replaced. Thankfully, there hasn't been any damage to the equipment and we will be coming out of downtime and going back into production shortly.

The Higgs appears

This year we haven't been great at keeping up with blog posts, but there are many reasons behind this. We have installed a new network, added cores taking us up to 4000 available job slots, and upgraded the server infrastructure throughout GU Scotgrid's cluster. We have also fallen foul of infrastructure issues and have had problems with both the old and the replacement Air Con systems. We are slowly extracting ourselves from these issues, and recently Professor David Britton gave a lecture at this year's Turing Festival on the Grid's role in the announcement made at CERN in July. A surprise appearance at the event was made by Professor Higgs himself.

Professor David Britton (left) and Professor Peter Higgs

Also presenting at the event were Professors Tejinder Singh Virdee and John Ellis of Imperial College and Dr Ben Segal from CERN. The event was one of the kickstart activities for the Turing Festival and enabled the public and academics to get a better overview of what has been involved in getting the experiments this far.

Friday, August 03, 2012

Scotgrid Calling

It has been a while since we last updated the blog, which is generally a sign of being busy. Unfortunately we have encountered several infrastructure issues recently which needed to be repaired. Predominantly these revolved around the air conditioning units on the roof of the Kelvin Building. This work was completed a few weeks ago, but as one thing was fixed another issue presented itself in the form of a failing Air Handling Unit in room 141. The knock-on effect of this is that we can't take full advantage of the cluster servers located in that room, and the overall cluster is presently running at two-thirds capacity.

While these events are less than optimal, they have allowed us to plan the next set of cluster upgrades, which will introduce another 256 job slots into the cluster. Thanks to the new resilient network fabric we have developed, the deployment of these services is no longer limited to the one room supporting 10 gig interfaces.

Other developments include the re-introduction of an independent control network and a new WAN testing platform, perfSONAR. We will blog about this separately shortly.

Thursday, May 24, 2012

CHEP Update

As we are into the fourth day of CHEP, a quick overview of the activities of Scotgrid, GridPP and the conference as a whole is now in order.

We have presented our posters and have generated interest in storage, job failures, network security and some of the work we have conducted with IPv6. Several potential collaborations with other sites and developers have resulted from these presentations. Andy and Wahid gave several successful talks and there was a high volume of interest in the work being discussed.

From a GridPP perspective Chris Walker's poster on using Lustre for low cost petascale storage also generated a large volume of interest. Talks given by other members of the collaboration were equally well received.

The conference itself has covered the multiple developments within the field over the last 12-18 months, with presentations investigating a variety of topics including federations for data, the future of CPUs/GPUs, ultra-high-speed networking and common software architectures for the experiments.

The variety of techniques being deployed and approaches taken to Grid centric problems are always of interest.