Thursday, March 18, 2010

Corralling jobs in Maui.

Sometimes, when testing new hardware or software in a limited way, it is important to be able to set up a lightweight, temporary partition of a cluster that is restricted to a single user.
Now, you could repartition the cluster nodes between a "normal" partition and a "testing" partition, but for most PBS/Maui clusters (which have nothing but the 'ALL' partition set), this involves changing the configuration for all the nodes, rather than just the nodes we care about. (And then changing it back when you're finished.)

You might also consider doing this with reservations. Indeed, the Maui manual suggests that a reservation locked to a user specified with an & prefix will force precisely the behaviour we want: binding the reservation and the user to each other. In our testing, this does not appear to work.

Instead, the solution we've found to work is (all in maui.cfg):

  1. Create a reservation for the user only.
    SRCFG[ssdnodes] PERIOD=INFINITY
    SRCFG[ssdnodes] STARTTIME=00:00:00 ENDTIME=24:00:00
    SRCFG[ssdnodes] HOSTLIST=node30[0-9]
    SRCFG[ssdnodes] USERLIST=ssp001
  2. Create a quality of service class with the property that it only runs on that reservation.
    QOSCFG[ssd] QFLAGS=USERESERVED:ssdnodes
  3. Make the user a member of that quality of service class only.
    USERCFG[ssp001] QDEF=ssd QLIST=ssd
(In this case, the configuration mutually restricts the user ssp001 and the nodes node300 to node309 to each other.)
This has the benefit that it also generalises to any number of users, as long as you add them to the reservation and the QoS class.
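
As a sketch of that generalisation, here is roughly what the configuration might look like with a second user added. The name ssp002 is hypothetical, and the comma-separated USERLIST is an assumption based on the usual list syntax for reservation credentials, so treat this as an untested illustration rather than a verified configuration:

    # ssp002 is a hypothetical second test user
    SRCFG[ssdnodes] PERIOD=INFINITY
    SRCFG[ssdnodes] STARTTIME=00:00:00 ENDTIME=24:00:00
    SRCFG[ssdnodes] HOSTLIST=node30[0-9]
    SRCFG[ssdnodes] USERLIST=ssp001,ssp002
    QOSCFG[ssd] QFLAGS=USERESERVED:ssdnodes
    USERCFG[ssp001] QDEF=ssd QLIST=ssd
    USERCFG[ssp002] QDEF=ssd QLIST=ssd

Each extra user needs both changes: an entry in the reservation's USERLIST (so they may use the nodes) and a USERCFG line pinning them to the ssd QoS (so they may use nothing else).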

3 comments:

Arnau said...

Hi,

we do the same kind of reservations, but without QoS. Something like:

SRCFG[picsgm_64] GROUPLIST=atsgm,sgmcm,lhsgm,masgm,ctasgm,dtsgm,misgm,pasgm,picvosgm
SRCFG[picsgm_64] RESOURCES=PROCS:8
SRCFG[picsgm_64] PRIORITY=1000
SRCFG[picsgm_64] HOSTLIST=node
SRCFG[picsgm_64] STARTTIME=0:00:00 ENDTIME=24:00:00
SRCFG[picsgm_64] PERIOD=INFINITY

Is your QoS conf mandatory? Doesn't it work without it?

Cheers,
Arnau

Sam Skipsey said...

Well, a reservation like that, without a QoS or other limiting statement, will let only those groups run on those nodes, but doesn't (as far as we can determine with some testing here) prevent those groups from running on other nodes.
That second restriction is what the QoS class enforces.

(For SGM job priorities, merely providing a dedicated resource is enough. What we were doing was attempting to re-implement partitions (which are poorly supported in maui) but with the existing tool set.)

Arnau said...

Ok, I understand your point.

*I forgot to mention that we also have an extra property for SGM nodes, set by torque_submit_filter in the CE.
So the job goes to the regular queue but with an extra queue property. That is our limiting statement.

Thanks Sam,
Arnau