Showing posts with label LCG CE. Show all posts

Monday, February 21, 2011

The CE is dead. Long live the CE. Nos paenitet incommodo

As part of the ongoing developments to the ScotGrid cluster at Glasgow, we have decommissioned our final LCG-CE, which resided on svr021. Removing this CE allows us to concentrate support and development on two CE platforms: Cream and ARC. We are planning a series of tests on the three Cream CEs we have deployed at Glasgow, to gain a better understanding of the maximum number of running jobs they can handle and how to tune them to get the most efficiency from this service.

Additionally, we will be testing our availability metrics over the next month, as the LCG-CE was one of the cornerstones of Steve Lloyd's tests of our overall availability. This will now be monitored primarily through our SRM availability.

We decommissioned the LCG-CE for three reasons: we would have had to remove it in the near future anyway, none of the big VOs have issues submitting to Cream CEs, and it simplifies our internal support requirements.

The new servers running Cream are svr008, svr014 and svr026.

Thank you LCG-CE and goodnight.

Friday, November 19, 2010

Second Cream CE for Glasgow: First steps

We are currently installing a second Cream CE at Glasgow, which will replace one of our LCG CEs. As this is my first major service install since joining ScotGrid and the GridPP project at the end of August, I thought I would share the process for this type of service change with the wider community.


My first step was to drain the current LCG-CE to prepare it for the new install. The commands are shown below.



" For multiple CEs with shared queues, edit the GIP file on the CE you wish to drain. This blocks WMS submission:

vim /opt/lcg/libexec/lcg-info-dynamic-pbs

change: push @output, "GlueCEStateStatus: $Status\n"
to: push @output, "GlueCEStateStatus: Draining\n" "
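
The same edit can be scripted instead of made by hand in vim. The following is only a minimal sketch: the function name set_draining is hypothetical and not part of any grid middleware, and it assumes the status line in the GIP file reads exactly as quoted above. It keeps a .bak copy so the original line can be restored once the drain is over.

```shell
#!/bin/sh
# Hypothetical helper (not part of the LCG/gLite tooling): rewrite the
# published CE status in a GIP plugin file so that it always reads "Draining".
set_draining() {
    file="$1"
    # Keep a backup so the dynamic $Status line can be restored later.
    cp -p "$file" "$file.bak" || return 1
    # Replace the Perl variable $Status with the fixed string "Draining".
    # The backslash makes sed treat the dollar sign as a literal character.
    sed 's/GlueCEStateStatus: \$Status/GlueCEStateStatus: Draining/' \
        "$file.bak" > "$file"
}

# Example call, using the path quoted above:
#   set_draining /opt/lcg/libexec/lcg-info-dynamic-pbs
```

Writing through a backup copy rather than editing in place avoids relying on any particular sed implementation's in-place option.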


" On the batch machine, edit /etc/hosts.equiv (vim /etc/hosts.equiv), comment out the machine from which you wish to stop accepting jobs, and restart maui:

svr016:~# cat /etc/hosts.equiv
svr021.gla.scotgrid.ac.uk
#svr026.gla.scotgrid.ac.uk "
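
The hosts.equiv change can be sketched as a small script too. This is an illustration under assumptions, not the procedure actually used: disable_submit_host is a hypothetical name, and the sketch assumes each host sits alone on its own line, as in the listing above. Restarting maui is left as a comment because the exact restart command depends on how maui is installed on the batch machine.

```shell
#!/bin/sh
# Hypothetical helper: comment out one host in a hosts.equiv-style file
# so that the batch server stops accepting jobs from it.
disable_submit_host() {
    file="$1"
    host="$2"
    # Keep a backup so the entry can be restored later.
    cp -p "$file" "$file.bak" || return 1
    # Prefix "#" only to a line that matches the host name exactly;
    # every other line is copied through unchanged.
    awk -v h="$host" '$0 == h { print "#" $0; next } { print }' \
        "$file.bak" > "$file"
}

# Example call, matching the listing above:
#   disable_submit_host /etc/hosts.equiv svr026.gla.scotgrid.ac.uk
# Afterwards restart maui using whatever mechanism your installation provides.
```

An exact whole-line comparison is used instead of a pattern match so that the dots in the host name cannot match other characters by accident.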
 
However, I had not updated the GOCDB to declare scheduled downtime for this service change; after a GGUS ticket this was quickly rectified. We are now waiting for the jobs to drain from the LCG-CE before continuing with the install early next week.