Friday, June 13, 2008
Much improved CE
Our bugbear in the past was always the lcg-CE, a service that was easy to overload, which could push the whole site into meltdown (we have collected a lot of examples at http://scotgrid.blogspot.com/search/label/CE).
A few months ago a new daemon, the globus cache marshal, was introduced, which promised to substantially reduce the load on the old CE.
Recently we have had a few job spikes from local ATLAS and pheno users, and I'm very happy to say that the CE seems much healthier than in the past. With more than 1.5k jobs running and queued, the load on the CE was modest and the CPU usage was < 20%.
This is a huge improvement over past performance and has removed a major source of site instability.