[rescue] Drive Replacement Question

Ahmed Ewing aewing at gmail.com
Wed Sep 5 22:32:58 CDT 2007


On 9/5/07, Brian Deloria <bdeloria at gmail.com> wrote:
> I've got a drive I need to replace in a V240, c1t1d0, which is a root
> mirror.  metastat still reports it as being OK, but I am debating whether to
> power the machine off and replace the disk, go through the process of
> booting the root device with less than 50% of the metainfo slices intact and
> just do a metareplace on it, or to delete the metadb off of the problem
> drive while the machine is still up in single user mode and hotswap the
> device with the machine still on.
>
> Most of the documentation that I've found simply refers to replacing the
> drive after it has failed completely, or to moving the metainfo over to
> another slice/drive or adding a new drive; none of these are possible based
> on the configuration left by my predecessor.  The situation that I'm in,
> however, doesn't quite fit into that, and I was wondering if one way was
> riskier than the other or preferred.  Taking the machine offline during off
> hours is acceptable, so I don't necessarily need to do this hot.

Unfortunately, Sun's own private InfoDocs have historically been less
than clear on the best approach. IMHO, the most sure-fire way of
taking care of this (especially if metastat is not reporting a disk
failure for c1t1d0) is breaking the mirror down to the one known-good
side (you don't have to remove DiskSuite control altogether, just
detach submirrors to end up with one-way mirrors).
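Before breaking anything down, it's worth capturing the current layout so
you know exactly which metadevices and replica slices live on c1t1d0
(generic commands; it's your own output that matters):

    metastat -p    # dump the metadevice configuration in metainit format
    metadb -i      # list state database replicas with their slices and status flags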

In other words: delete the state database replica(s) on c1t1d0 with
metadb, detach its submirror metadevice(s) from the mirror(s) with
metadetach, clear the submirror metadevice(s) built on c1t1d0 with
metaclear, then replace the disk however you choose, hot
(devfsadm/cfgadm) or cold. Assembly is the reverse of disassembly:
partition the new disk to match with prtvtoc | fmthard, recreate the
submirror metadevices with metainit, and put the mirror(s) back
together with metattach. You can check resync progress with "metastat
| grep %".
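
As a rough sketch only -- the metadevice names (d10/d12), the replica
slice (s7), and the surviving disk (c1t0d0) are assumptions; substitute
your own from the metastat -p and metadb output:

    metadb -d c1t1d0s7                   # remove the replica(s) on the suspect disk
    metadetach d10 d12                   # detach the c1t1d0 submirror (d12) from the root mirror (d10)
    metaclear d12                        # clear the detached submirror
    # ...replace the disk, hot (cfgadm/devfsadm) or cold...
    prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t1d0s2   # copy the VTOC from the good disk
    metainit d12 1 1 c1t1d0s0            # recreate the submirror on the new disk
    metattach d10 d12                    # reattach; resync starts automatically
    metastat | grep %                    # watch resync progress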

There's a very important reason for taking the extra steps instead of
diving straight into a quick metareplace. Depending on whether you're
running Solaris 8 (unbundled DiskSuite) or 9 (bundled Solaris Volume
Manager) on that V240, there's some occasional weirdness with the
DevID functionality, which is supposed to assign a unique identifier
to every disk used within metadevices (this allows disks to be
shuffled around between slots/buses without the state databases
losing coherency). If you simply try to replace the
disk and unfail it (using metareplace -e, for instance), the DevID
information may not be updated properly. Sun's own documentation is
far from clear on the exact circumstances under which it's safe to do
a quick metareplace -e versus breaking the mirror. Removing the suspect
disk from the DiskSuite configuration altogether guarantees the DevID
doesn't carry over from the old disk. (To be 110% sure, run
"metadevadm -u c1t1d0" after the physical replacement. This should be
redundant given that the DevID update should take place at the time of
the metainit/metattach of the good disk, but it never hurts.)
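
If you want to double-check the DevID housekeeping afterward (purely
belt-and-suspenders; both options are in the metadevadm man page):

    metadevadm -u c1t1d0    # update the device ID recorded for the replaced disk
    metadevadm -r           # rescan and update device IDs for all disks in metadevices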

One final thought regarding Nadine's comment: the problem with an odd
number of state database replicas is that you're essentially guessing
which side of the root mirror will fail and putting all your eggs in
that basket. Regarding replicas on additional disks, I agree, but
sometimes (as in 1U servers) they simply aren't available. If you find it irritating
that there's no way for the server to achieve state database quorum
(>50%) during a failure or routine replacement because replicas only
exist on two root submirrors, add the following to /etc/system and
reboot for it to take effect:

set md:mirrored_root_flag=1

This will make the quorum requirement exactly 50% instead of >50%,
allowing you to boot fully on the survivor without intervention if you
do lose/detach a root submirror.
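
For what it's worth, a quick way to verify the replica layout and add
that flag (a minimal sketch; run as root and double-check /etc/system
afterward):

    metadb -i                                           # list replicas, their slices, and status flags
    echo 'set md:mirrored_root_flag=1' >> /etc/system   # append the flag; takes effect after reboot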

To Brian: good luck with this. To the rest of the list: feel free to
correct any boneheaded logic/syntax errors in the above.

Hope that helps,

-A


