Wednesday, January 28, 2009

WMS purging fixed...

Ever since our WMSs were installed at Glasgow, job purging has appeared to be broken. What's supposed to happen is that, when a user retrieves their job's output, the associated sandbox on the WMS is cleaned out. Instead, users of the ScotGrid WMSs were seeing:

bash-3.00$ glite-wms-job-output https://svr023.gla.scotgrid.ac.uk:9000/IfNak9XhD80im39v5JVGNw

Connecting to the service https://svr023.gla.scotgrid.ac.uk:7443/glite_wms_wmproxy_server

Warning - JobPurging not allowed
(The Operation is not allowed: Unable to complete job purge)

A ticket was raised and, eventually, we figured out that each WMS needs DN entries for both WMSs in /opt/glite/etc/LB-super-users. On top of that, there's a bug which requires each DN to be present in two slightly different formats:

/C=UK/O=eScience/OU=Glasgow/L=Compserv/CN=svr022.gla.scotgrid.ac.uk/emailAddress=grid-certificate@physics.gla.ac.uk
/C=UK/O=eScience/OU=Glasgow/L=Compserv/CN=svr022.gla.scotgrid.ac.uk/Email=grid-certificate@physics.gla.ac.uk
/C=UK/O=eScience/OU=Glasgow/L=Compserv/CN=svr023.gla.scotgrid.ac.uk/emailAddress=grid-certificate@physics.gla.ac.uk
/C=UK/O=eScience/OU=Glasgow/L=Compserv/CN=svr023.gla.scotgrid.ac.uk/Email=grid-certificate@physics.gla.ac.uk

(compare emailAddress with Email)
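
If you have more than a couple of DNs to add, typing each one twice gets error-prone. Here's a rough sketch of a shell helper that prints both spellings of a DN. The function name dn_variants is made up for this post and isn't part of gLite; it just swaps the one attribute name with sed.

```shell
#!/bin/bash
# Hypothetical helper (not part of gLite): print a DN in both the
# emailAddress= and Email= spellings, one per line.
dn_variants() {
    # Normalise to the emailAddress= form first...
    long=$(printf '%s\n' "$1" | sed 's#/Email=#/emailAddress=#')
    # ...then derive the Email= form from it.
    short=$(printf '%s\n' "$long" | sed 's#/emailAddress=#/Email=#')
    printf '%s\n%s\n' "$long" "$short"
}

# Example: print both entries for one host so they can be reviewed
# before being added to /opt/glite/etc/LB-super-users by hand.
dn_variants "/C=UK/O=eScience/OU=Glasgow/L=Compserv/CN=svr022.gla.scotgrid.ac.uk/emailAddress=grid-certificate@physics.gla.ac.uk"
```

It deliberately only prints to stdout rather than editing the file, so you can check the result before pasting it in.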

Anyway, with these changes made (and a service gLite restart), the WMSs now purge job output when it is retrieved:

-bash-3.00$ glite-wms-job-output https://svr022.gla.scotgrid.ac.uk:9000/adlQbeXjpyURB3qpt-NQAA

Connecting to the service https://svr023.gla.scotgrid.ac.uk:7443/glite_wms_wmproxy_server

================================================================================
JOB GET OUTPUT OUTCOME

Output sandbox files for the job:
https://svr022.gla.scotgrid.ac.uk:9000/adlQbeXjpyURB3qpt-NQAA
have been successfully retrieved and stored in the directory:
/tmp/jobOutput/mkenyon_adlQbeXjpyURB3qpt-NQAA
================================================================================
