Friday, July 20, 2007

GridICE Ate My CPU...

After upgrading the CE yesterday, its CPU usage and load were rather high. Re-running YAIM had re-enabled the GridICE monitoring system, which had merrily decided to swallow an entire CPU by itself.

When I switched it off, CPU load on the CE dropped from 70% to 20%. (See the ganglia plot; the difference is pretty obvious.)

Although GridICE gives some interesting monitoring information and aggregation at the Grid level, it duplicates information available elsewhere (like gstat), and consuming a whole CPU is absurd.

I put in a GGUS ticket about this, but for the moment GridICE is disabled on our CE.


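To keep it that way, I've added this to cfagent.conf, so that cfengine sends SIGTERM to the GridICE processes if they turn up again: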

"/opt/gridice/monitoring/bin/*" signal=term
"/opt/edg/sbin/edg-fmon-agent" signal=term
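
For anyone wanting to copy this, here is roughly where those two lines sit in the file. This is only a sketch, assuming cfengine 2 and that the entries go under the generic "any" class in the processes section; the class is my assumption for illustration, so scope it to whatever class your own CE belongs to.

```
processes:

   any::

      # Send SIGTERM to any GridICE monitoring process found running
      "/opt/gridice/monitoring/bin/*" signal=term

      # Same treatment for the edg-fmon-agent
      "/opt/edg/sbin/edg-fmon-agent" signal=term
```

The quoted strings are matched against the process table each time cfagent runs, so anything that restarts the daemons gets undone on the next pass.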

