[geeks] Murphy, instantiated

Shannon Hendrix shannon at widomaker.com
Thu Jun 4 15:11:32 CDT 2009


On Jun 4, 2009, at 03:07 , Nadine Miller wrote:


> Not having access to the raw stats nor the stats background to dig  
> into it even if I did, I can only say a set of 80K disks is more  
> data than I have in my 10+ years using consumer and enterprise disks.

That doesn't mean they had any idea what they were doing when they  
wrote the report.

I think their report is more reflective of their local methods of  
operation and equipment than it is of drives in general.

Too many other sites with even more experience than Google disagree  
with them for me to believe everything they said.

I see it as a report on Google mostly, with some useful information on  
drives.

It's useful data, but still effectively a single data point from a  
single company.

This kind of information should be collected and analyzed
industry-wide, and to hell with any manufacturer who is afraid of
that.

> My personal experience disagrees with theirs as well, but who is to  
> say that it's not other environmental factors?

Like I said above, I think a lot of their report is more about  
Google's operational methods than any indication of drive  
characteristics.  For example, they use a lot of cheap equipment and  
I've heard from some people that they aren't always very careful about  
temperature control and maintenance.

Remember that one of Google's mantras about the data centers is to  
"use the cheap crap instead of the good stuff, and just replace it  
when it fails."

I would say that method of operation is still in the minority.

> I don't really worry about heat--if it gets too hot in my house,  
> I'll turn off the computer.

When I turn off my computer, I'm also turning off the flow of dollars,  
and I can't do that for remote systems and client machines.

I prefer to keep internal temperatures at 130F (about 54C) or below,
and it seems to pay off.

Sometimes there is nothing I can do about it of course.

> My interest in the report was a) disks that show errors early have a  
> higher incidence of failure; b) related to that,  no correlation  
> between SMART reporting and failure.

I agree with this part, but it's not new information.

I thought that early errors == early failure was something we figured  
out a long time ago.

Most shops I've been in were pretty slack, but the good ones always  
did "burn testing" and other shakedown tests on new equipment.

SMART: I find it more useful than Google evidently did, but it's  
definitely not that great.  It has false positives, and there are some  
drive errors that it never seems to report, and some that it cannot  
report, like firmware errors.

My IBM Ultrastar 9GB drives ran for 6 years, and SMART reported  
spindle and heat failure at just over a year.

However, I'm not sure that was the fault of SMART so much as the
fault of the human who set the threshold levels.  SMART actually did
its job; it just needed tuning.

I always check SMART when I can because it has saved my butt on  
occasion, and sometimes it is all you have.
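
For what it's worth, here is roughly what I mean by "checking SMART",
as a minimal sketch using smartmontools' smartctl.  The device list
is just a placeholder, and it only looks at the overall health flag:

#!/usr/bin/env python
# Minimal sketch: ask smartctl for each drive's overall SMART health.
# Assumes smartmontools is installed; device names are placeholders.
import subprocess

DEVICES = ["/dev/sda", "/dev/sdb"]   # hypothetical devices

for dev in DEVICES:
    # "smartctl -H" prints the drive's overall health self-assessment.
    result = subprocess.run(["smartctl", "-H", dev],
                            capture_output=True, text=True)
    if "PASSED" in result.stdout:
        print(dev, "looks OK")
    else:
        print(dev, "needs a closer look:")
        print(result.stdout)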

Most filesystems have no integrity-checking ability at all, so if I
really need to know, I have to write my own crypto signature checking
that sits on top, and it can take a really long time to run.
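
Something like this sketch is what I have in mind: record a SHA-256
for every file under a tree, then re-verify later.  The manifest name
and paths are made up for illustration:

#!/usr/bin/env python
# Record and verify per-file SHA-256 checksums on top of any
# filesystem.  Usage: "record <dir>" first, then "verify <dir>" later.
import hashlib, json, os, sys

def digest(path, bufsize=1 << 20):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(bufsize), b""):
            h.update(chunk)
    return h.hexdigest()

def snapshot(root):
    # Map every regular file under root to its SHA-256.
    return {os.path.join(dp, name): digest(os.path.join(dp, name))
            for dp, _, files in os.walk(root) for name in files}

if __name__ == "__main__":
    mode, root = sys.argv[1], sys.argv[2]       # "record" or "verify"
    manifest = "checksums.json"                 # made-up manifest name
    if mode == "record":
        with open(manifest, "w") as out:
            json.dump(snapshot(root), out)
    else:
        with open(manifest) as inp:
            old = json.load(inp)
        now = snapshot(root)
        for path, h in old.items():
            if now.get(path) != h:
                print("MISMATCH or MISSING:", path)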

> Which implies, if you are using ZFS, that you should be scrubbing  
> very regularly early in the disks' lives, to determine early errors  
> sooner, rather than later.  After 12 months, you can probably ease  
> off.

I do my scrubs weekly on new drives, and I'm undecided about what to
do after they appear to be "broken in".

I don't know if I should just keep doing that, or move to monthly or  
less frequently.  There isn't a lot of empirical knowledge on this  
yet, and it will vary from drive to drive as well.
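
If I ever settle on a policy, it will probably be something as dumb
as this sketch: scrub weekly while the pool is young, monthly after
that.  The pool name, in-service date, and 12-month cutoff are all
arbitrary, and it assumes it gets run once a day from cron:

#!/usr/bin/env python
# Sketch of an age-based scrub schedule: weekly while the pool is
# "new", monthly afterward.  Pool name, in-service date, and cutoff
# are made up; run once a day (e.g. from cron).
import subprocess
from datetime import date

POOL = "tank"                   # hypothetical pool name
IN_SERVICE = date(2009, 1, 15)  # when the drives went in (made up)
CUTOFF_DAYS = 365               # after ~12 months, ease off

age = (date.today() - IN_SERVICE).days
interval = 7 if age < CUTOFF_DAYS else 30

if age % interval == 0:
    # "zpool scrub" starts a background scrub of the whole pool.
    subprocess.run(["zpool", "scrub", POOL], check=True)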

> I liken this to something else I noticed these days--almost everyone  
> I know (or have read opinions from on various pc hardware forums)  
> tell you to stress test your RAM out of the gate with at least a 24  
> hr memtest run.  I never saw anyone recommending that outside of  
> builds for data centers 10 years ago.  The intent is to ferret out  
> the marginal components early so you can get them replaced via RMA  
> or before warranty runs out.  Doing stress tests on disks seems to  
> be just as logical.

Old news, but good to see people taking it to heart now.

> My observation: whatever the hell EMC uses in their arrays, they  
> fail a lot.  Yes, we beat those disks to death, but we have similar  
> (fewer) file systems on the NetApp in use, and I've seen no disks  
> fail in those since I started, nor have we had more than 1-2 disk  
> failures in our crappy Penguin boxes that get beat to death (MTAs  
> for example)--which I am sure are non-"enterprise" disks.

The EMC arrays I was last around, in 2002, were always hot and
vibrated a lot, and they also had high failure rates.  I don't know
whether it was their drives or their cabinets that were at fault.

I've not seen an EMC unit since then.

Another observation: at work we are seeing very high SSD failure
rates.  In some cases the rates are so high that the drives are
effectively useless.

>> We have tons of storage, no really good way to back it up, and  
>> besides ZFS there is almost no move to try and counter the  
>> situation in any way.
>>
>> I figure eventually something has to change to goad the industry  
>> into doing something about it.
>
> People keep buying bigger disks, I doubt it will change.

The more they buy, the more we need that change, though.

> In the enterprise space, "de-duplification" is the new hotness.   
> NetApp and EMC are fighting over Data Domain, even though both  
> already have de-dup systems.  Greenbytes has a modified OpenSolaris  
> ZFS ("ZFS+") thing they are selling for the same purpose (Thumper  
> with the software pre-installed).

I think this is a distraction.  Unless duplication is a really
serious problem, the redundancy is probably a benefit in most cases.

However, I do like the idea of Plan 9's crypto filesystems.  They
eliminate duplicates at the block level by using a crypto signature
as the address of each block.  Duplication is impossible.

They use this for their WORM filesystems.

In some ways that would be ideal for end-user systems.
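
A toy sketch of the idea (not Plan 9's actual code): if a block's
address is the hash of its contents, writing the same block twice can
only ever store it once:

#!/usr/bin/env python
# Toy content-addressed block store in the spirit of Plan 9's Venti:
# a block's address is the SHA-1 of its contents, so identical blocks
# collapse to a single stored copy.  Illustration only.
import hashlib

class BlockStore:
    def __init__(self):
        self.blocks = {}              # address -> block data

    def put(self, data: bytes) -> str:
        addr = hashlib.sha1(data).hexdigest()
        self.blocks[addr] = data      # storing a duplicate is a no-op
        return addr

    def get(self, addr: str) -> bytes:
        return self.blocks[addr]

store = BlockStore()
a1 = store.put(b"the same block")
a2 = store.put(b"the same block")     # identical content, same address
assert a1 == a2 and len(store.blocks) == 1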

> I think ZFS has gotten everyone to start looking at file systems in  
> a new way.  People are thinking more outside of the box now.

The users are... but the industry isn't.

It's pretty stupid that we have almost no pragmatic backup systems
for modern storage.

It might be a hard problem to solve, but I don't see that we have a  
real choice.

-- 
Shannon Hendrix
shannon at widomaker.com


