Thursday, December 20, 2007

ECDF Progress

We finally seem to be homing in on the problems at ECDF. Any job that forks too many processes seems to die in the batch system. A simple Python job that forks child processes works fine with 20 children, but dies with 50. The same job at Glasgow runs happily with 100 children.
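
For anyone who wants to try the same thing on their own site, a fork test along these lines is enough. This is a minimal sketch and not the exact job we submitted: the function name `fork_children`, the `hold_seconds` parameter and the default of 20 children are all made up for illustration. Each child just sleeps briefly and exits, and the parent collects the exit statuses.

```python
import os
import sys
import time


def fork_children(n, hold_seconds=0.1):
    """Fork n children that sleep briefly, then return their wait statuses.

    Hypothetical helper for illustration; not the original test job.
    """
    pids = []
    for _ in range(n):
        pid = os.fork()
        if pid == 0:
            # Child: hold the process slot for a moment, then exit
            # immediately without running the parent's cleanup handlers.
            time.sleep(hold_seconds)
            os._exit(0)
        pids.append(pid)

    # Parent: reap every child so none are left as zombies.
    statuses = []
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        statuses.append(status)
    return statuses


if __name__ == "__main__":
    # Number of children comes from the command line (assumed default: 20).
    n = int(sys.argv[1]) if len(sys.argv) > 1 else 20
    statuses = fork_children(n)
    failed = sum(1 for s in statuses if s != 0)
    print("forked %d children, %d exited abnormally" % (n, failed))
```

Submitting it with 20, then 50, then 100 as the argument should show roughly where a site starts killing the job. If `os.fork()` itself fails, the script dies with an `OSError`, which is a different symptom from the batch system killing the whole job from outside.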

I can see that this is not a ulimit issue, but something is unhappy. We need to track down whether the culprit is SGE or some gatekeeper weirdness.
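
One way to rule out ulimits from inside the job itself is to have it print the limits it actually sees on the worker node. The sketch below uses Python's standard `resource` module. The helper name `describe_limits` and the choice of which limits to report are my own assumptions, not something taken from the ECDF setup.

```python
import resource


def describe_limits():
    """Return {limit name: (soft, hard)} for a few fork-related limits.

    Hypothetical helper; the selection of limits is an assumption.
    """
    limits = {}
    for name in ("RLIMIT_NPROC", "RLIMIT_NOFILE", "RLIMIT_AS"):
        # Not every platform defines every limit, so skip missing ones.
        const = getattr(resource, name, None)
        if const is None:
            continue
        limits[name] = resource.getrlimit(const)
    return limits


def fmt(value):
    # RLIM_INFINITY means no limit is imposed.
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    for name, (soft, hard) in sorted(describe_limits().items()):
        print("%s soft=%s hard=%s" % (name, fmt(soft), fmt(hard)))
```

Running this as a batch job and comparing the output with an interactive login on the same node shows whether the batch environment is imposing tighter limits than expected.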

