Interactive Debugging on WNs

[Not strictly scotgrid, but I figured the scotgrid blog has a higher readership than my personal ramblings.] This is how to get an interactive bash shell on the worker nodes (with the grid environment) for debugging. Case in point: as part of the certification of the SL5 x86_64 WN, I could lcg-cr fine on the command line, but not as a job.

WARNING - Trying this as a user without the site administrator's assistance will probably lead to 'Bad Things' happening to your DN and the banned user list... You have been warned.

So - I wanted to get a shell to work out exactly what wasn't quite right.

On the Workernode:
1) install screen (yum install screen)
2) chmod 755 /var/run/screen
3) chmod +s /usr/bin/screen (yes, we know SUID is bad mmmkaaay.)
4) append to /etc/screenrc
multiuser on
acladd root
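
For anyone who prefers to script it, the four steps above can be sketched roughly as below. This is only a sketch under assumptions: the function names enable_multiuser and setup_wn are made up for illustration, and it assumes an SL5-style node where yum is available and you are root. Nothing runs until you call setup_wn yourself.

```shell
#!/bin/sh
# Hypothetical helper: append the two screenrc lines from step 4,
# skipping any line that is already present so reruns are harmless.
enable_multiuser() {
    rc="$1"
    for line in "multiuser on" "acladd root"; do
        grep -qxF "$line" "$rc" 2>/dev/null || printf '%s\n' "$line" >> "$rc"
    done
}

# Hypothetical wrapper for steps 1-4. Must be run as root on the worker node.
setup_wn() {
    yum install -y screen &&
    chmod 755 /var/run/screen &&
    chmod +s /usr/bin/screen &&   # SUID, as in step 3; undo it when you are done debugging
    enable_multiuser /etc/screenrc
}

# Usage (as root on the WN):
#   setup_wn
```

Keeping the screenrc edit in its own function means it can be tried against a scratch file before touching /etc/screenrc.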

Your JDL can then simply invoke 'screen -dm'. Root can then attach to the session on the same worker node using the screen -rx wnusername/pid... syntax, e.g.:

[root@vtb-generic-94 ~]# screen -r dteam013/
sh-3.2$ voms-proxy-info --all
subject : /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=aelwell/CN=671736/CN=Andrew Elwell/CN=proxy/CN=proxy/CN=limited proxy


Gotchas: Trying to be smart and putting Executable = "/usr/bin/screen"; and Arguments = "-d -m"; in the JDL doesn't help. Although the screen session launches as it should, the cleanup wipes all your proxy and other goodies.

Working with a noddy input sandbox of
screen -dm
sleep 3600

did the trick fine.
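
A rough sketch of how that job might be put together is below. The file names debug.sh and debug.jdl, the std.out/std.err names and the write_debug_job function are all assumptions made for illustration; only the two-line script body comes from the working example above. The JDL attributes used are the ordinary gLite ones, but check them against what your WMS expects.

```shell
#!/bin/sh
# Hypothetical helper: write the wrapper script and a matching JDL
# into the directory given as the first argument.
write_debug_job() {
    dir="$1"

    # The wrapper: start a detached screen session, then keep the job
    # alive for an hour so the cleanup does not run while you debug.
    cat > "$dir/debug.sh" <<'EOF'
#!/bin/sh
screen -dm
sleep 3600
EOF
    chmod +x "$dir/debug.sh"

    # A minimal JDL that ships the wrapper in the input sandbox.
    cat > "$dir/debug.jdl" <<'EOF'
Executable = "debug.sh";
InputSandbox = {"debug.sh"};
StdOutput = "std.out";
StdError = "std.err";
OutputSandbox = {"std.out", "std.err"};
EOF
}

# Usage:
#   write_debug_job .
```

The sleep is what matters here: the job stays alive, so the proxy and the rest of the job environment are still there when root attaches.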

Stephen Childs said...

The i2glogin program works really well for interactive login over the grid. You start up the server on your UI then submit a job which connects back to the correct port. After waiting a little while for the job to get through the grid, you get a shell on the WN.

Something like this in your JDL:

Executable = "i2glogin";
Arguments = "-p $port:$UI_address -r -t -c /bin/sh";
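
To make the placeholders concrete, here is a small sketch that fills in $port and $UI_address and prints the two JDL lines. The function name i2g_jdl, the port 20000 and the host ui.example.org are invented for illustration; the i2glogin flags are copied unchanged from the lines above, and starting the server side on the UI is not shown.

```shell
#!/bin/sh
# Hypothetical helper: print the JDL fragment for a given port and UI address.
i2g_jdl() {
    port="$1"
    ui_address="$2"
    printf 'Executable = "i2glogin";\n'
    printf 'Arguments = "-p %s:%s -r -t -c /bin/sh";\n' "$port" "$ui_address"
}

# Usage (made-up values):
#   i2g_jdl 20000 ui.example.org >> myjob.jdl
```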

RPM available at: