Keeping my fingers crossed, but I replaced the SCSI controller (Tekram) in the server with one I had in my old-stuff-stash, an Adaptec 2940W. Yep, old. I had to enable “Load BIOS” on the card, but after that the system booted like I never changed a thing. For now the system seems stable; it’s running Ubuntu’s daily.find now. Heavy disk access would previously reboot the system, but so far so good.
On a side note, my desktop PC no longer runs Linux (OpenSuSE). I’ve had it with it. There’s always something that doesn’t work. Or that keeps crashing. Or that suddenly stops working until a reboot. Sound support sucks. After every kernel update I had to do a manual install of the NVidia drivers to get X running. Always waiting for the X64 versions, since the Linux community still thinks we all run 32-bit computers (same on Windows, but at least that lets me run 32-bit programs without problems).
I bought a new video card (a Sapphire HD7970 to be exact), bought an SSD (OCZ Vertex 4 128GB) and switched to Windows 7. Everything works. Period. And I can now use Adobe software, so I bought Adobe Photoshop Lightroom to finally be able to organize my pictures (latest count: over 26000) the way I like it (F-spot and Picasa are nice, but not more than that).
The main reason for the video card was gaming, of course. Racing in the first place, FPS as a close second. Bought Deus Ex: Human Revolution (FPS-ish) today, more to follow.
The server reboots are not gone. They are less frequent, but certainly not gone. I suspect it has something to do with the old SCSI disk the root-fs is on, so in the days to come I will try to do a fresh install of a newer Linux on the disk that came from my desktop (500GB SATA, system now runs from 9GB SCSI). And then hope that removing the old noisy disk will solve the problem. Can’t think of what else to replace…
I just replaced the CPU cooler in the server, along with its (broken) mounting bracket. The old one was the stock AMD cooler, on a plastic bracket that came with the motherboard. One of the plastic hooks broke a long time ago, so the cooler was in place, but only just.
I have now installed a Cooler Master Hyper 212 EVO, so hopefully the reboots are gone now. In the BIOS the CPU temperature dropped from 57 degrees (C) previously to somewhere around 34-35 degrees now. Much better.
The server reboots did not stop. Just 20 minutes ago, while checking logs and other possibly strange behaviour, the server just rebooted as I was typing the next “pg /var/log/…..” command. I entered the BIOS, to see if the system itself was healthy. CPU temperature went up from 57 to 61 in a couple of seconds whilst the system was doing nothing.
I unplugged the system, opened the case, and voila………DUST!
With the vacuum cleaner I sucked out the DUST as gently (lowest power) but as thoroughly as possible, so hopefully things go better from now on.
For some reason the server has been rebooting a couple of times per day since June 5th. Not sure why, can’t find any strange things happening on the system. Hope it stays up now.