Thursday, May 13, 2010

A fistful of user jobs...

No ATLAS production to do in the UK, but we have a nice full cluster anyway, with more than 1000 user jobs running:


svr016:~# qstat -q

server: svr016.gla.scotgrid.ac.uk

Queue            Memory CPU Time Walltime Node  Run Que Lm  State
---------------- ------ -------- -------- ----  --- --- --  -----
q2d                --   48:00:00 48:00:00   --  134  12 --   E R
atlanaly           --   24:00:00 24:00:00   --  738 789 --   E R
atlprd             --   48:00:00 48:00:00   --    2  48 --   E R
q7d                --   168:00:0 168:00:0   --    0   0 --   E R
route2all          --      --       --      --    0   0 --   E R
q1d                --   24:00:00 24:00:00   --   94  16 --   E R
mpi                --      --    72:00:00   --    0   0 --   E R
atlas              --   24:00:00 24:00:00   --  470 312 --   E R
lhcb               --   48:00:00 48:00:00   --    0   0 --   E R
                                               ----- -----
                                                1438  1177

The atlanaly queue takes jobs from the panda backend, while the atlas queue takes WMS backend jobs.

Today's particular job mix was kind to the storage, with no overloads seen, but it's something we have to monitor constantly to pre-empt problems.
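For the curious, something along these lines is enough to pull the Run/Que totals out of qstat -q and shout when the farm gets busy. This is just a rough sketch rather than our actual monitoring code, and the 1000-job threshold is purely illustrative:

#!/usr/bin/env python
# Rough sketch: parse `qstat -q` and warn when running jobs pass a threshold.
# The threshold is illustrative only, not site policy.
import subprocess
import sys

RUN_THRESHOLD = 1000  # hypothetical alert level

def queue_counts():
    """Return {queue: (running, queued)} parsed from `qstat -q` output."""
    out = subprocess.check_output(["qstat", "-q"]).decode()
    counts = {}
    for line in out.splitlines():
        fields = line.split()
        # Data rows have 10 fields: name, mem, cpu, wall, node, run, que, lm, E, R
        if len(fields) == 10 and fields[5].isdigit() and fields[6].isdigit():
            counts[fields[0]] = (int(fields[5]), int(fields[6]))
    return counts

if __name__ == "__main__":
    counts = queue_counts()
    total_run = sum(r for r, q in counts.values())
    total_que = sum(q for r, q in counts.values())
    print("running: %d  queued: %d" % (total_run, total_que))
    if total_run > RUN_THRESHOLD:
        sys.stderr.write("warning: running jobs above %d\n" % RUN_THRESHOLD)

Hooking something like that into a cron job or Nagios check is the easy part; knowing which job mixes will actually hurt the storage is the harder one.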

Postscript: I had another look and realised that most of the WMS backend jobs were from hammercloud (Sam testing SSDs!). It seems genuine user WMS jobs numbered only about 20-30, with more than 1000 coming through the panda backend.
