Monday, December 08, 2008

Analysis Challenge: Round 3

Last week's analysis challenge at Glasgow showed extreme load and sluggishness in the DPM (see the attached plots of awfulness). Although we managed a much better event rate, we also suffered from incomplete processing, and the DPM was a clear bottleneck.

I had a chat with JPB today, who spotted the very high memory consumption of the dpm daemon. He thinks there's probably a memory leak and that this might be slowing things down. He also said it might be worth running more dpns daemons, as these also do connection authentication.

So, to get ready for tomorrow I have:
  1. Allowed core dumps for the DPM and DPNS daemons.
  2. Increased the number of threads in the DPNS daemon to 60.
  3. Restarted all the daemons.
That last operation freed up about 3GB of memory!
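
To check whether the memory creeps back up after the restart, one option is to sample the daemon's resident set size from /proc at intervals. The following is only a minimal sketch, assuming a Linux host; the function names (parse_vmrss_kb, rss_kb, growth_kb) are made up for illustration and are not part of DPM, and the pid of the dpm daemon has to be found separately.

```python
import re


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from the text of /proc/<pid>/status.

    Returns None if no VmRSS line is present (e.g. for a kernel thread).
    """
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def rss_kb(pid):
    """Read the current resident set size (kB) of a running process.

    Linux only: relies on the /proc filesystem.
    """
    with open("/proc/%d/status" % pid) as status_file:
        return parse_vmrss_kb(status_file.read())


def growth_kb(samples):
    """Return the change in kB between the first and last sample."""
    return samples[-1] - samples[0]
```

Calling rss_kb(pid) from cron every few minutes and logging the result would give a simple record of memory use over time; a steadily positive growth_kb under a constant load would be consistent with a leak, although it would not prove one.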

If we still see problems tomorrow then at least we should have some good information for the developers to chew on.

