Wednesday, May 19, 2010

SGE and Lustre

On my list of things to do was to install (Sun/Oracle) Grid Engine and get a CREAM CE submitting to it on my development cluster. So far I have SGE installed and running qsub jobs. I am documenting the experience here for those who are interested. I have opted for Lustre rather than NFS 3, as NFS is painfully ill-equipped for the task and we have a test Lustre instance to play with, so why not go the whole hog?

the positives ...
1. The wealth of documentation on the Oracle page.
2. The interactive install is very easy to do.

and the negatives ...
1. The RPMs' default install location is /gridware, and I can't seem to get my yum repo to use a --prefix-like option, which is something you can do with rpm. Ideas welcome.
2. The automatic install scripts have no debugging in them at all. When they fail, they just fail silently with no output or logs. I have only managed an interactive install so far, but I will try running them with /bin/sh -x and see if it makes a difference. Hopefully I can get the automatic script running so that cfengine can deal with the install of the execution hosts, rather than doing it by hand.
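
One way to get some visibility into a silent failure is to run the installer under sh -x and keep the trace in a file. Below is a minimal sketch in Python, assuming the installer is a Bourne shell script. The function name run_traced and its parameters are made up for illustration and are not part of the SGE distribution.

```python
import subprocess


def run_traced(script, args, log_path, shell="/bin/sh"):
    """Run a shell script under `sh -x` and save the combined output to a log.

    Returns the script's exit status, so the caller can tell a failure
    from a success even when the script itself prints nothing.
    (Hypothetical helper, not an SGE tool.)
    """
    cmd = [shell, "-x", script] + list(args)
    with open(log_path, "w") as log:
        # -x writes each command to stderr before running it; merge stderr
        # into stdout so the trace and the normal output stay interleaved.
        proc = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return proc.returncode
```

A call might look like run_traced("./install_script", ["--some-flag"], "/tmp/install.log"), where the script name, flag and log path are placeholders. Note that -x only traces the script it is given; any separate scripts that it launches as child processes run untraced.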

2 comments:

Blissex said...

«my yum repo to use a --prefix like option. Something you can do with rpm.»

Some RPM packages are relocatable and some are not (flag in the metadata). Those that are not must be installed in the given location.

If the RPMs *are* relocatable (very very very few are) then you can just download them with YUM '--downloadonly' and then install them with RPM.
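
The relocatable check described above can be scripted. The sketch below is only an illustration: it assumes the flag shows up as the "Relocations" field in the output of rpm -qpi, and the function names are made up for this example.

```python
import re
import subprocess


def relocation_prefixes(info_text):
    """Return the relocatable prefixes listed in `rpm -qi` / `rpm -qpi` output.

    An empty list means the package reports "(not relocatable)" or the
    text has no Relocations field at all.
    """
    match = re.search(r"Relocations\s*:\s*(.+)", info_text)
    if match is None:
        return []
    value = match.group(1).strip()
    if value == "(not relocatable)":
        return []
    # Several prefixes are separated by whitespace.
    return value.split()


def query_package(rpm_path):
    """Run `rpm -qpi` on a downloaded package file and return its text output."""
    result = subprocess.run(
        ["rpm", "-qpi", rpm_path],
        stdout=subprocess.PIPE, universal_newlines=True, check=True,
    )
    return result.stdout
```

If relocation_prefixes(query_package(path)) comes back non-empty, the downloaded file can be installed elsewhere with rpm -i --prefix NEWPATH; if it is empty, the package has to go where it was built to go.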

YUM itself supports only a subset of RPM's functionality, for example it does not allow two RPMs with different versions to be installed at the same time (it always installs with '-U' and does not allow installing with '-i').

Steve Traylen said...

There are SGE packages in EPEL called gridengine, which are probably more standard. Version 6.1u3-6 in .el5.