[SunRescue] Sun SCSI hard drive IDs

Tue Sep 14 08:39:08 CDT 1999

On Tue, 14 Sep 1999, Paul Pries wrote:
>Concerning the SCSI priority chain there is a reason for this swap... :)
>If I remember correctly, the higher the address the higher priority the device
>has on the bus. Having the system disk on id0 will give it the lowest priority
>thus decreasing system performance. Letting it have id3 should be a nice
>middle way thing, giving the system disk a better response, yet having some
>higher priority addresses left for devices that need a higher priority to do
>their job, like tape devices and other slow things.
>
>This might not be a big issue with today's fast disks, but once upon a time... :)

  Umm no, that's not the way SCSI works...those are simply addresses, not
priorities per se.  The SCSI target ID has nothing at all to do with the
performance one can expect from a particular disk, despite the fact that this
is a popularly held belief.

  This conversation prompted me to dig through my "misc info" archives.  I knew
I had this somewhere.

  Attached below is the explanation of SunOS' rather dumb sd0<->sd3 swappage
from the guy on the SS1 design team who was responsible for the change.  Happy
reading.

                                -Dave McGuire

--------------------------
>From owner-port-sparc at NetBSD.ORG  Fri Jun  2 04:43:58 1995
Received: from pain.lcs.mit.edu (pain.lcs.mit.edu [128.52.46.239]) by rocinante.digex.net (8.6.12/8.6.12) with ESMTP id EAA20633 for <mcguire at rocinante.digex.net>; Fri, 2 Jun 1995 04:43:58 -0400
Received: (from daemon at localhost) by pain.lcs.mit.edu (8.6.9/8.6.9) id EAA06070; Fri, 2 Jun 1995 04:02:38 -0400
Received: from excalibur.ens.fr by pain.lcs.mit.edu (8.6.9/8.6.9) with ESMTP id EAA06066 for <port-sparc at NetBSD.ORG>; Fri, 2 Jun 1995 04:02:33 -0400
Received: (from besancon at localhost) by excalibur.ens.fr (8.6.9/8.6.6) id KAA15841; Fri, 2 Jun 1995 10:00:55 +0200
Message-Id: <199506020800.KAA15841 at excalibur.ens.fr>
In-Reply-To: Theo de Raadt <deraadt at theos.com>'s message as of Jun  1,  5:12.
X-Organization: Laboratoire de Physique Statistique -- Ecole Normale Supérieure
                24 rue Lhomond, 75231 Paris Cedex 05, France
                tel: (33) 1 44 32 34 76; fax: (33) 1 44 32 34 33
X-Mailer: Mail User's Shell (7.2.5 10/14/92)
Precedence: list
X-Loop: port-sparc at NetBSD.ORG
From: besancon at excalibur.ens.fr (Thierry BESANCON)
Sender: owner-port-sparc at NetBSD.ORG
To: Theo de Raadt <deraadt at theos.com>, "Scott L. Burson" <gyro at zeta-soft.com>
Cc: port-sparc at NetBSD.ORG, bsd at inria.fr
Subject: Re: A new user's comments
Date: Fri, 2 Jun 1995 10:00:54 +0200

>> > - I have to say that I don't at all understand the purpose of numbering the
>> >   SCSI disks the way NetBSD does.
>> 
>> quite honestly, none of us understand the reason why SunOS numbers
>> its disks the way it does :-) the scsi code has been replaced. some
>> of your complaints still hold for the default system, but it is
>> possible to build a kernel that (if you wanted to) would map your
>> disks in exactly the way SunOS does.

Here is some explanation about the numbering of SCSI disks on Sun workstations.

My 0.02$ contribution.

	Thierry

------------------------------------------------------------------------------

Article 43984 of comp.sys.sun.admin
Newsgroups: comp.sys.sun.admin
From: szh at zcon.com (Syed Zaeem Hosain)
Subject: Re: What is the history of /dev/sd0 at scs
Date: Tue, 27 Dec 1994 20:30:04 GMT

In article 060q at Cytel.CUEHere.Edmonton.AB.CA,  Rainer_Heilke at Cytel.CUEHere.Edmonton.AB.CA (Rainer Heilke) writes:
>Mike Heins (mike at cd.com) wrote:
>
>> I respectfully disagree.  The reason that Sun used ID 3 is that the
>> higher the ID, the higher the SCSI bus priority.  If you pound 2 or
>> more disk drives with I/O, you will notice a better throughput on the
>> higher ID.  Since disk 0 is where most users keep their swap and root
>> partitions, that makes sense.
>
>OK, that makes sense - up to a point. Why didn't Sun start numbering
>at the top end, then, with 6 (assuming card = 7) or 7 (assuming card
>is 0)? Why *3*?

Uh ... here is something that will shed some light. This was posted
here by Jim Craven (craven at cg.emr.ca) back in January of 1994. The text
is by mjacob at kpc.com - it was written in August of 1992.

                        Chapter 1
                Hardware Matrix Limitations

Back in 1988, we were doing the initial work for SparcStation 1. At that
time, the only external SCSI peripherals that Sun shipped were the
"shoeboxes" (which required some serious disassembly to change the SCSI
addresses on). The peripheral h/w guys swore up and down that customers
hated doing this. They gave me the supported matrix of SCSI addresses.
What fell out of all this was that the disk SCSI addresses would look like
this:

        first internal disk             SCSI target id #3
        second internal disk            SCSI target id #1

        regular shoebox                 SCSI target id #0
                                        (tape at target id #4)
        expansion shoebox               SCSI target id #2

                        Chapter 2
                Unix unit numbers (part 1)

At that time, the SCSI 'config' file stuff for Sun machines
followed a somewhat limited and arcane mechanism. The modified
BSD config file semantics were overloaded such that a host
adapter was a "controller", a scsi device was a "drive", and
the following rules obtained:

        1. Disk unit numbers went from 0..n.

        2. Drive numbers ("slave" numbers internally) took
        on the property of:

                SCSI target = slave >> 3
                SCSI lun = slave % 8

        3. The minor device byte for disks, after partition bits (3),
        left 5 bits, or 32 total disks allowed.

        4. In order to allow for overlap with the old [34]/260 internal
        drive arrangement (SCSI target 0, luns 0 and 1), a unit numbering
        scheme was used such that even numbers mapped to lun 0, odd numbers
        mapped to lun 1. For example, the config file lines here are:

disk            sd0 at si0 drive 000 flags 0
disk            sd1 at si0 drive 001 flags 0
disk            sd2 at si0 drive 010 flags 0
disk            sd3 at si0 drive 011 flags 0
disk            sd4 at si0 drive 020 flags 0
disk            sd6 at si0 drive 030 flags 0
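
As a rough illustration of rules 2 and 4, the sketch below decodes the
drive numbers from the config lines above. It assumes those numbers are
written in octal, which is consistent with drive 030 landing on target 3.
The function names are made up for this example, and the unit formula
(2 * target + lun) is inferred from the sample lines rather than taken from
any SunOS source.

```python
def decode_slave(slave):
    """Split a 'slave' number into (target, lun) as in rule 2."""
    return slave >> 3, slave % 8


def unit_for(target, lun):
    """Unit number implied by the sample config lines (an inference)."""
    if lun not in (0, 1):
        raise ValueError("the even/odd scheme only covers luns 0 and 1")
    return 2 * target + lun


# Drive fields copied from the config lines, read as octal (assumption).
CONFIG_DRIVES = ["000", "001", "010", "011", "020", "030"]


def describe(drives=CONFIG_DRIVES):
    """Return (device name, target, lun) for each drive field."""
    rows = []
    for text in drives:
        target, lun = decode_slave(int(text, 8))
        rows.append((f"sd{unit_for(target, lun)}", target, lun))
    return rows


if __name__ == "__main__":
    for name, target, lun in describe():
        print(f"{name}: target {target}, lun {lun}")
```

Under these assumptions the drive at target 3, lun 0 comes out as sd6, which
is the name the author objects to in Chapter 4.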

                        Chapter 3
                Unix unit numbers (part 2)

SparcStation1 was going to have quite a different SCSI subsystem. Most
of the devices would be self-identifying (cutting down the config file
considerably), but there was still the problem of binding unit number,
minor device byte, and a specific SCSI bus, target and lun address.

The semantics of the config changes I made seemed fairly simple and
straightforward. They were something along the lines of:

disk            sdN at scsibusM target J lun K

(in other words, don't overload the slave field; ask for what you
want by name!)
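
A minimal sketch of how a line in that style could be read, assuming the
syntax is exactly the one shown above (the author only says it was
"something along the lines of" this). The parser, its name, and the sample
line are hypothetical and do not come from the real config(8) tool.

```python
import re

# Matches the illustrative form: disk sdN at scsibusM target J lun K
DISK_LINE = re.compile(
    r"disk\s+sd(\d+)\s+at\s+scsibus(\d+)\s+target\s+(\d+)\s+lun\s+(\d+)"
)


def parse_disk_line(line):
    """Return the unit, bus, target and lun named on one config line."""
    match = DISK_LINE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not a disk line: {line!r}")
    unit, bus, target, lun = (int(field) for field in match.groups())
    return {"unit": unit, "bus": bus, "target": target, "lun": lun}


if __name__ == "__main__":
    # Hypothetical line binding the first disk (target 3) to sd0.
    print(parse_disk_line("disk            sd0 at scsibus0 target 3 lun 0"))
```

Every part of the address is named explicitly, so nothing has to be packed
into or unpacked from a single slave field.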

This was all well and fine, but it then left open the question of what
the various values of N should be (for sdN, etc..).

                        Chapter 4
                What's in a name, anyway?

Should unix device names be logical, or should they contain enough physical
address information to infer where the device is? This quickly became a very
heated topic of discussion within the SS1 team.

Mitch Bradley (the author of the OpenBoot prom) held out for the notion
that the true name of a device should be *completely* physical, and that
you would use simple aliases for convenience. In OpenBoot, version 2
and later, you can see this philosophy in the prom pathnames of devices.
For example, the default disk to boot from on an SS2 is:

        /sbus/esp@0,800000/sd@3,0:a

This has even made it into the OS for Solaris 2.0 (aka 'SunOS 5.0')
as 'devfs'.

Clearly this naming scheme can get incredibly awkward for more complicated
machines. The above example is about as simple as it can get. For this
reason we all had the notion that, ultimately, your boot device would
be called "Fred", and that when you booted the machine, you typed "boot
Fred", which would be an alias for the above.
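
The alias idea can be sketched in a few lines. The path is the SS2 default
quoted above, written with the `@` separator that OpenBoot uses. The alias
table and the function names are hypothetical; this only illustrates the
"name is physical, alias is convenient" split and is not how the prom
implements it.

```python
# Hypothetical alias table: a friendly name maps to a full physical path.
ALIASES = {"Fred": "/sbus/esp@0,800000/sd@3,0:a"}


def resolve(name, aliases=ALIASES):
    """Return the physical path for an alias, or the name itself."""
    return aliases.get(name, name)


def split_path(path):
    """Break a device path into (node name, unit address, arguments)."""
    nodes = []
    for component in path.strip("/").split("/"):
        component, _, arguments = component.partition(":")
        name, _, address = component.partition("@")
        nodes.append((name, address, arguments))
    return nodes


if __name__ == "__main__":
    for node in split_path(resolve("Fred")):
        print(node)
```

The last node carries the SCSI address: `sd@3,0:a` is target 3, lun 0,
partition a.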

This was all well and fine, but at the time (and a short amount of time it
was! we did all the major OS/Prom design *and* implementation work between
about March and October), we still had traditional unix device names sitting
in /etc/fstab, e.g., sd0a, sd0g, etc, so the above philosophy and discussion
was a "future", and not directly pertinent to what we released at first.
This left us back at the discussion as to whether a sd unit number should
be completely logical (and convenient), or contain enough information
to derive a physical address from.

My (poor) contribution at this point was the following argument:

        1. There aren't enough minor device bits to do full SCSI
        address encoding. This is because there are 64 possible
        target/lun combinations per SCSI bus.

        2. Users want something simple, not complicated. The first
        disk you use should be sd0. The second should be sd1, etc.
        If I had kept the old SCSI numbering scheme, I thought that
        the notion that the primary drive would be 'sd6' was unbearably
        stupid and awkward.

        3. Therefore, the sd unit numbers should be completely
        logical. The first disk (at target id #3) should be
        sd0, and so on.

        4. If this arrangement turned out to be a problem for someone,
        I insisted on and got *mostly* implemented a mechanism to
        change such a logical mapping. The config file you could
        always cha
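
The arithmetic behind point 1 can be checked with a short sketch. It
assumes the partition sits in the low three bits of the minor byte with the
unit number above it; that layout is an assumption made for illustration,
and the names are invented here.

```python
MINOR_BITS = 8         # the minor device byte
PARTITION_BITS = 3     # partitions a..h
TARGETS_PER_BUS = 8
LUNS_PER_TARGET = 8


def bits_needed(count):
    """Number of bits required to distinguish `count` values."""
    return (count - 1).bit_length()


UNIT_BITS = MINOR_BITS - PARTITION_BITS                         # bits left for the unit
MAX_UNITS = 1 << UNIT_BITS                                      # disks that fit
ADDRESS_BITS = bits_needed(TARGETS_PER_BUS * LUNS_PER_TARGET)   # one bus, full address


def make_minor(unit, partition):
    """Pack a unit number and partition letter into one byte (assumed layout)."""
    if len(partition) != 1 or partition not in "abcdefgh":
        raise ValueError("partition must be one of a..h")
    if not 0 <= unit < MAX_UNITS:
        raise ValueError("unit does not fit in the remaining bits")
    return (unit << PARTITION_BITS) | "abcdefgh".index(partition)


if __name__ == "__main__":
    print(f"{UNIT_BITS} unit bits available, {ADDRESS_BITS} needed per bus")
```

Five bits are available and six are needed for a single bus, before any
bits are spent on a bus number.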

In retrospect, I found #1 was false for two reasons:

        a. I confused the *name* of the device with its minor device
        number in /dev. There was no reason why a *name* like sd030a
        couldn't have been used (sd bus 0 target 3 lun 0 partition a),
        but with a completely arbitrary minor device number. After
        all, /dev/MAKEDEV would do all the dirty work.

        b. It simply didn't occur to me to use more than one major
        number in order to extend the width of the 'minor' device
        byte. I just plain forgot that System V had been doing this
        for some time, and I also just didn't pay attention to what
        Joe Eykholt was doing with IPI at the same time (solving
        a similar problem).
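
Both points can be illustrated together. This is a sketch only: the name
format follows the sd030a example above, while the table-driven minor
assignment, the function names, and the major number 7 are hypothetical
stand-ins for what a MAKEDEV script and a multi-major driver might do.

```python
import re

# sd<bus><target><lun><partition>, one digit each, as in "sd030a".
PHYSICAL_NAME = re.compile(r"sd(\d)([0-7])([0-7])([a-h])")


def parse_name(name):
    """Recover (bus, target, lun, partition) from a physical-style name."""
    match = PHYSICAL_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not a physical disk name: {name!r}")
    bus, target, lun = (int(digit) for digit in match.group(1, 2, 3))
    return bus, target, lun, match.group(4)


def assign_minors(names):
    """Point a: hand out minor numbers in list order; they encode nothing."""
    return {name: minor for minor, name in enumerate(names)}


def split_across_majors(index, base_major, minor_bits=8):
    """Point b: spread an index wider than one byte over several majors."""
    return base_major + (index >> minor_bits), index & ((1 << minor_bits) - 1)


if __name__ == "__main__":
    print(parse_name("sd030a"))
    print(assign_minors(["sd030a", "sd010a"]))
    print(split_across_majors(300, base_major=7))
```

The name carries the physical address while the minor number stays
arbitrary, which is the separation the author says he overlooked.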

In retrospect, I found #2 was false, mainly for the reason that
I guessed that users would want that which *I* found simple, and
this is most certainly not the case. At SUG panels I got ripped
totally by pissed off users (and rightly, I suppose). I also
violated the principle of "least surprise", where something was
different from what customers expected. As a minor sidebar,
Sun Field service also ripped me up side the face because, in
their opinion, customers changed addresses of shoeboxes all
the time, and that we (engineering) should have made the internal
drives SCSI id 0 and 1 (resp), thus obviating any change to
numbering schemes whatsoever.

All in all, though, given the constraints of hardware, time,
and a desire to make things easier, the major mistake here was
#b above. If I had thought of this, and made the unix names
encode full physical information, I *think* everyone would have
been happier.

-- 
-------------------------------------------------------------------------
| Syed Zaeem Hosain          P. O. Box 610097            (408) 441-7021 |
| Z Consulting Group        San Jose, CA 95161             szh at zcon.com |
-------------------------------------------------------------------------