Now, I did run a distributed shell install of lcg-CA on these nodes, but with 100 batch workers I failed to notice that it had failed on 2 of them.
With cfengine, this check is built into the system:

    packages:
      grid::
        lcg-CA       action=install
        glite-yaim   action=install elsedefine=runyaim

So cfengine will check every hour whether lcg-CA is properly installed, and install it for us if it is not.
Note also how we can define the "runyaim" class when yaim is first installed: yaim then runs automatically after the metapackage is installed, which in turn triggers the switching on of the batch system.
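As a rough sketch of how the "runyaim" class might be picked up elsewhere in the configuration, something like the following cfengine 2 fragment could work. The yaim path, the site-info.def location and the node type are assumptions for illustration only, not our actual settings, and "packages" needs to come before "shellcommands" in the actionsequence for the class to be set in time.

```
# Hypothetical fragment: paths and node type are illustrative assumptions.
shellcommands:
  # Only evaluated on runs where the packages section defined "runyaim",
  # i.e. when glite-yaim was found missing and had to be installed.
  runyaim::
    "/opt/glite/yaim/bin/yaim -c -s /root/site-info.def -n glite-WN"
```

Because the class is only defined on the run that installs the package, yaim would not be re-run on every hourly pass.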
More details in the wiki soon...