Tuesday, June 05, 2007

An Unhappy Night With Steve's Tests

I'm very pleased that Glasgow is the top site for Steve's ATLAS tests, but last night we had a miserable time, failing about 10 of 30 tests. These were all ABORTS from the RB, which was the IC RB in each case. All of the successful jobs also came from the IC RB, though, so it isn't that we had completely fallen out with it. Glasgow is the only site affected, so I think it must be a site issue. However, there just isn't enough information in the logfiles to tell why the jobs are aborting.

The first thing I checked was autofs (I added a new map yesterday), but this was ok. /home and /tmp are also fine. I'll have to dig into torque and see what I can find.
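For what it's worth, the /home and /tmp check can be scripted so it is easy to repeat on each worker node. This is only a minimal sketch, not what I actually ran: the function names `check_writable` and `check_paths` are made up for illustration, and it assumes the check is run as a user who is expected to have write access to the directories listed (for /home that would really mean a pool account's own home directory, not /home itself).

```python
import os
import tempfile


def check_writable(path):
    """Return True if a temporary file can be created and removed under path."""
    try:
        # NamedTemporaryFile deletes the file again when the block exits.
        with tempfile.NamedTemporaryFile(dir=path) as handle:
            handle.write(b"probe")
            handle.flush()
        return True
    except OSError:
        # Covers missing directories, permission problems and full or stale mounts.
        return False


def check_paths(paths):
    """Map each path to the result of its write check."""
    return {path: check_writable(path) for path in paths}


if __name__ == "__main__":
    # Hypothetical list of directories; adjust to whatever the jobs actually use.
    for path, ok in check_paths(["/home", "/tmp"]).items():
        print(f"{path}: {'ok' if ok else 'FAILED'}")
```

A failure here only says that the directory could not be written to at that moment; it does not by itself explain an abort reported by the RB.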

It really is annoyingly hard to pin these things down in the baroque dance which is EDG job submission...

