[rescue] Bad Sectors
Mark Brown
sunrescue at marknmel.com
Fri Jan 19 21:41:17 CST 2007
Patrick Giagnocavo wrote:
> On Jan 19, 2007, at 9:29 PM, Curtis H. Wilbar Jr. wrote:
>
>> self tests are only a small part of SMART... it sounds like you're just
>> getting the drive 'power on' self test report... since they spin up and
>> can see the media they 'pass'. You want to look at error count reports,
>> and any 'event log' type data that might be in there. Watch out for
>> seek errors.... those spell trouble...
>> (IMHO)
>>
>>
>
> I had a 200GB WDC drive that was being monitored by the Linux
> "smartctl" tools. It would report threshold changes ... every hour or
> so! Yet it ran without losing data for many months, until I retired
> the server it was in.
>
> I think that it would make sense to pay for the best IDE drive utility
> a person could find, if you are making a good/bad decision on
> potentially thousands of dollars worth of drives...
>
> --Patrick
>
>
I think all the advice so far is great! - I'll jump in here...
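For the error count / event log checks Curtis mentions, smartmontools will show
most of it, e.g. (the device name below is just an example - adjust for your box):

    # dump the device error log and the self-test log
    smartctl -l error /dev/sdb
    smartctl -l selftest /dev/sdb

    # dump the vendor attribute table; watch Reallocated_Sector_Ct,
    # Current_Pending_Sector and Seek_Error_Rate in particular
    smartctl -A /dev/sdb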
It has been my experience to run the vendor-specific diagnostic/tools
software for each disk, and "Low Level Format" them. I think that a
low level format on recent IDE disks is simply an exercise in testing a
read/write of each block and, on failure, remapping the bad block. (my
opinion here may be different than reality...;-) )
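If you don't have the vendor tool handy, something like badblocks on a Linux
box exercises the surface in roughly the same way (destructive - it wipes the
disk, so only on an empty drive; device name is just an example):

    # four-pattern destructive write/read pass over the whole disk
    badblocks -wsv /dev/sdb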
Usually I run the drive utility as follows:
- a couple of times, to capture all the bad blocks
- cool down the disk - turn it off - come back the next day....
- a couple more times, to capture any remaining bad blocks
- until a couple of runs come back clean.
If the defect list keeps growing rather than stabilizing, my confidence
level for that disk drops in inverse proportion....
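A quick way to watch whether the defect list is stabilizing is to log the
remap counters after each pass and compare, something like (again assuming
smartmontools; device name and log path are just examples):

    # append a dated snapshot of the remap counters after each pass
    ( date; smartctl -A /dev/sdb | egrep 'Reallocated_Sector_Ct|Current_Pending_Sector' ) \
        >> /var/tmp/sdb-defects.log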
Then I format/newfs/mount them, and untar the Perl source code a
whole bunch of times into a whole bunch of directories. Then dircmp against
a known good reference copy, rinse, lather, repeat. I would probably
use Solaris 9 for this, with a current patch set. I use the same method
for UFS testing. I'd rather not talk about that though, I've seen too
many UFS bugs.....
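The untar/dircmp pass is easy enough to script; a rough sketch (the mount
point and tarball name are made up, dircmp is the stock Solaris one, and
diff -r would do the same job elsewhere):

    # unpack a reference copy once
    mkdir -p /testfs/ref && ( cd /testfs/ref && tar xf /var/tmp/perl-src.tar )

    # unpack the same tarball many times and compare each copy to the reference
    i=1
    while [ $i -le 50 ]; do
        mkdir -p /testfs/run$i
        ( cd /testfs/run$i && tar xf /var/tmp/perl-src.tar )
        dircmp -s /testfs/ref /testfs/run$i     # -s: only report differences
        i=`expr $i + 1`
    done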
I have had some 20GB and 80GB disks that are doing quite well now (and
some that are still screwed).
Mind you, I don't think they are doing any tasks that are "server"
related. They are living life as disks for Windows boxen, and on a test
Solaris Nevada x86/Athlon that I have been tinkering with.
If you take the time to back up your data to tape, and you are willing to
assume the risk of these disks failing 1 week or 1 year from now -
they may turn out to be perfectly serviceable.
Good luck with your rescue!
/M