Friday, September 29, 2006

More Edinburgh storage-related news: one of our disks died during the night. As with the failing SAM tests (see below), this happened during the Edinburgh<->Durham simultaneous read/write transfer test. The test appears to have been quite stressful for our site, although it should be noted that we also lost power to the ScotGRID machines this week when the computing facility went down for maintenance. Unfortunately, that shutdown happened a day earlier than we had originally been told. You can draw your own conclusions about that one.

The client tools for analysing the RAID setup confirm that a disk has failed, but they also show that, even though we have hot spares available to replace it, none of them has been used. The software is also highlighting a couple of other problems. Looks like some manual intervention is required.
