[Tracker-discuss] [issue504] Server error

Sun Jan 20 08:35:08 CET 2013

I worked on it quite a bit yesterday. I pushed the load average to 36
at times, almost entirely because of disk issues (Linux counts
processes blocked on disk I/O toward the load average, so a high load
average often points to a long disk queue). It should have been better
for about the last 7 hours. I'm still writing zeroes to the rest of
the free space, in an attempt to force the disk to remap the bad
sectors (which it can usually only do on a write). The RAID array
rebuilt to 95% last night before it failed, so we're getting close to
getting redundancy back. I strongly suspect a disk swap might still be
necessary, but at the moment it seems the disk that is not in the
array is the better one, so I want the array back in sync first.
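
As a rough illustration of the load-average point above, here is a
minimal Python sketch, assuming a Linux /proc filesystem. It reads the
load averages and counts how many processes are runnable (state R)
versus stuck in uninterruptible sleep (state D, typically waiting on
disk). The function names are made up for this example, and it only
looks at each process's main thread, so the counts are approximate.

```python
import glob


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg text."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def state_of(stat_line):
    """Return the one-letter state from a /proc/<pid>/stat line.

    The command name sits in parentheses and may itself contain spaces or
    parentheses, so split on the last ')' rather than on whitespace.
    """
    return stat_line.rsplit(")", 1)[1].split()[0]


def count_states(stat_lines):
    """Count processes that are running (R) or in uninterruptible sleep (D)."""
    counts = {"R": 0, "D": 0}
    for line in stat_lines:
        state = state_of(line)
        if state in counts:
            counts[state] += 1
    return counts


def read_stat_lines():
    """Read every readable /proc/<pid>/stat; processes may exit mid-scan."""
    lines = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as handle:
                lines.append(handle.read())
        except OSError:
            continue
    return lines


if __name__ == "__main__":
    with open("/proc/loadavg") as handle:
        print("load averages:", parse_loadavg(handle.read()))
    print("process states:", count_states(read_stat_lines()))
```

If the D count is large while the R count stays small, the load is
most likely coming from the disk queue rather than from the CPU.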

At the moment I'm not having much joy with smartmontools. The initial
stats showed some 28 bad sectors pending reallocation, which isn't too
bad, but a full offline scan (which despite its name can be done
while the disk is online) will take a full day.
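
For anyone who wants to keep an eye on the pending-sector count, here
is a small Python sketch that pulls it out of the attribute table
printed by "smartctl -A". It assumes the usual ten-column ATA
attribute layout. The helper name and the sample text are made up for
illustration; the sample is not output captured from this machine.

```python
def pending_sectors(smartctl_output):
    """Return the raw Current_Pending_Sector count, or None if absent.

    Assumes the ten-column ATA attribute table, where the raw value is
    the last column.
    """
    for line in smartctl_output.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[1] == "Current_Pending_Sector":
            return int(fields[9])
    return None


# Illustrative sample only, shaped like a typical attribute table.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       28
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       28
"""

if __name__ == "__main__":
    print("pending sectors:", pending_sectors(SAMPLE))
```

In real use the text would come from running smartctl against the
disk in question. A count that drops as the zeroes get written would
suggest the sectors are being remapped.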

The files we lost were almost all log files, even in the other virtual
hosts on that machine. One of PostgreSQL's WAL files also failed, but I
could recover it from a previous copy. A few successive rsyncs got all
the data back.

regards,
Izak