Altos 5000

Dick Dunn rcd at ico.isc.com
Tue Aug 28 04:38:21 AEST 1990


tyager at maxx.UUCP (Tom Yager) challenges some of my points about the 5000.
At the end, I've got some observations on unintended flames, but first
let's take the disk-mirroring issue; he hit some shortcomings in my posting...

I had said:
> > ...disk mirroring is mostly a pawn in the
> > feature game.  It takes substantially more I/O bandwidth to do the double
> > output, and it doubles the cost of disk storage.  Why not spend only a few
> > bucks extra and buy reliable disks?
> 
> Even "reliable" disks eventually die.

True.  So do reliable controllers.

What I want to get at--and it's something I didn't say at all in my previous
posting--is that if you're looking for a certain level of reliability, it's
a lot harder than just tossing on extra disks and mirroring.  Let's have a
look at the overall reliability issues...although I'd remind everyone that
fault-tolerant system design is complicated, so we're bound to gloss over
things.

> You're running a service business, say, a distribution house. Your order entry,
> warehouse control, customer service--everything--is on the computer. You've
> got everything backed up like a good doobie. One of your drives gets smoked.
> Now, if you're mirrored, your system squawks at you but keeps running...

OK, this covers one failure mode: a disk dies.  There are two questions I
think we need to ask:
  - How likely is this failure mode relative to other failure modes?
  - Is there another way to get comparable recovery capability?
To the second question, I'll suggest "journaling" as providing a lot of
what you need, possibly at much less cost.  I'm more interested in the
first question.

I had pointed out that it takes extra I/O bandwidth to handle mirroring;
someone responded that if you have the right sort of controller, it will
write both disks at once for you.  OK, fine, now you've made the controller
a single-point-of-failure.  I've seen as many motherboard and controller
failures as disk failures.  I don't pretend my experience is typical, but
suppose that it might be.  The disks are not the only failure points in the
system.  Oh, and what about that controller, and the writing:  Are you
doing read-after-write on both disks to be sure you've got good copies of
everything?  Are you actually using both disks (always writing both, but
reading from whichever is free)?  There's a Catch-22 in the failure-mode
analysis here:  If you're using both disks (to improve performance) you
risk having an undetected failure on one disk give you inconsistent data
between them...which could quickly make a hash of the data on *both*
disks.  If you're essentially running on one disk and just writing the
other as a backup mirror, you're not getting the ongoing check that you
really need for reliability.
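
The Catch-22 above can be made concrete with a toy model. This is only a sketch under loud assumptions: the "disks" are Python dicts, "whichever is free" is modeled as simple alternation, and the names `MirroredPair`, `read_fast`, `read_checked`, and `increment` are invented for this example and do not describe any real controller or driver.

```python
class MirroredPair:
    """Toy model: each 'disk' is a dict mapping block number -> bytes."""

    def __init__(self):
        self.disks = [{}, {}]
        self.next_reader = 0  # stand-in for "whichever disk is free"

    def write(self, block, data):
        for disk in self.disks:          # always write both copies
            disk[block] = data

    def read_fast(self, block):
        disk = self.disks[self.next_reader]
        self.next_reader ^= 1            # alternate between the two disks
        return disk[block]

    def read_checked(self, block):
        # Read both copies and compare them: twice the read I/O.
        a, b = self.disks[0][block], self.disks[1][block]
        if a != b:
            raise IOError(f"mirror mismatch on block {block}")
        return a


def increment(pair, block, read):
    """Read-modify-write: the value read is written back to BOTH disks."""
    value = int(read(block).decode()) + 1
    pair.write(block, str(value).encode())
    return value


def demo():
    pair = MirroredPair()
    pair.write(7, b"100")
    pair.disks[1][7] = b"900"            # silent corruption on disk 1 only
    pair.read_fast(7)                    # served by disk 0, looks fine
    increment(pair, 7, pair.read_fast)   # served by disk 1, bad value reused
    return pair.disks[0][7], pair.disks[1][7]
```

In `demo()` the bad value read from one disk is written back to both, so the good copy is lost. Passing `pair.read_checked` instead would catch the mismatch, but only by reading both disks every time, which gives up the performance gain that motivated using both disks in the first place.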

One of the troublesome aspects of mirroring is that it's just a redundancy
at one fairly low level.  This is inherently subject to a certain class of
error--if there's something just-plain-wrong somewhere, you may end up with
nothing more than two wrong copies of your data.  It's why I hinted at
journaling, because that at least gives you a second copy of the data in a
different format, constructed with a different algorithm.  (Recovery using
a journal takes longer, of course...but it recovers from a different class
of problems.)
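
As a rough illustration of what "a second copy in a different format" might look like, here is a minimal journaling sketch. It assumes a simple key/value store and a journal of JSON records; the names `JournaledStore`, `apply_record`, and `recover` are hypothetical, and a Python list stands in for an append-only log kept on a separate device.

```python
import json


def apply_record(state, rec):
    """Apply one journal record to a dict of key -> value."""
    if rec["op"] == "set":
        state[rec["key"]] = rec["value"]
    elif rec["op"] == "delete":
        state.pop(rec["key"], None)


class JournaledStore:
    def __init__(self):
        self.state = {}      # the "live" data, as it would sit on the main disk
        self.journal = []    # stand-in for an append-only log on another device

    def _commit(self, rec):
        self.journal.append(json.dumps(rec))   # log the operation first
        apply_record(self.state, rec)          # then update the live data

    def set(self, key, value):
        self._commit({"op": "set", "key": key, "value": value})

    def delete(self, key):
        self._commit({"op": "delete", "key": key})

    def backup(self):
        """Return a copy of the data plus the journal position it matches."""
        return dict(self.state), len(self.journal)


def recover(backup_state, journal_tail):
    """Rebuild the data from an old backup plus the journal written since."""
    state = dict(backup_state)
    for line in journal_tail:
        apply_record(state, json.loads(line))
    return state
```

Recovery here means restoring the last backup and replaying every record logged after it, which is why it takes longer than failing over to a mirror. The journal holds operations rather than disk blocks, so it is not a byte-for-byte twin of whatever went wrong on the main disk.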

> There are a lot of copies of Netware SFT in the hands of businesspeople who
> agree with me, and a large part of the fuss over the Systempro is for its
> mirroring and data guarding features.

People who can afford extra hardware and software, and who can't afford to
be down, will buy solutions that promise greater reliability.  That's as it
should be.  Does the solution make sense?  I'm not convinced.  I would
admonish everyone to bear in mind that being able to sell something is not
an inherent indicator of its worth.  In this case, I'm not arguing that
mirroring is worthless, but I do argue that it's inordinately expensive
and only addresses one small part of the overall reliability problem.  A
single system with mirrored disks on one controller has only one element of
redundancy.
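
A back-of-the-envelope calculation shows why one element of redundancy buys less than it seems. This sketch assumes independent failures and uses made-up availability figures (0.99 for every component, chosen equal only to echo the "as many controller failures as disk failures" remark); none of these numbers are measurements of any real system.

```python
def series(*availabilities):
    """All components must work: multiply the availabilities."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


def mirrored(a_disk):
    """A mirrored pair is down only when both disks are down."""
    return 1.0 - (1.0 - a_disk) ** 2


# Made-up figures, chosen equal for illustration only.
A_DISK = A_CONTROLLER = A_MOTHERBOARD = 0.99

single = series(A_MOTHERBOARD, A_CONTROLLER, A_DISK)
with_mirror = series(A_MOTHERBOARD, A_CONTROLLER, mirrored(A_DISK))

print(f"single disk:   {single:.6f}")
print(f"mirrored pair: {with_mirror:.6f}")
```

Whatever figures are plugged in, the mirrored system can never do better than the motherboard and controller taken together, since those remain in series with the disk pair.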
_ _ _ _ _

Now, let's clean up something about the "insults":
> > > See the review of the System 5000 in the July issue of _UNIX WORLD_...
> > _UNIX_World_??  Oh, yeah...isn't that the magazine that just carried an
> > article about UNIX-based BBSes without a single word about either USENET or
> > ARPANET?  I think you need a stronger source of review than that.
> 
> Thank you for the insult. I can handle criticism as well as any writer, but I
> am not keen on those who badmouth my work without even having read it.

Tom:  I was *not* criticizing your article.  The complete truth of the
matter is this:  I saw the _UNIX_World_ which contains your article, but
what caught my eye first was the BBS article.  In my "carefully considered
professional (but personal) opinion" the BBS article was so poorly written
that it puts the magazine in a bad light.  In fact, my comments were
directed entirely at _UNIX_World_; at that point I hadn't seen your name on
the article.  Surely you realize that regardless of how well-researched
and well-written your article is, it will be judged in the context of the
magazine in which it appears..."birds of a feather" and all that.  The BBS
article was not the first of its ilk I've seen in _UNIX_World_, either.

Having now skimmed quickly through *your* article, I think it's a
reasonably good one--concise, to-the-point.  I have some criticisms (primarily
the matter of using Neal Nelson benchmarks) but they're specific, and not
general dissatisfaction with the article.  I've also seen your writing in
Byte.  I think you're generally reasonably on-target, and in any case you
make an obvious effort to get it right.  Perhaps you can help _UNIX_World_
improve its reputation within the UNIX community.

> ...In any case, I'm objective: The original
> posting complained about the 5000's inability to run a certain OS product with
> which you have a mild involvement.

Come on, come right out and say it.  If you think I failed to be objective,
point out *where* I failed to be objective.  If you think I have some
vested interest in whether Interactive's UNIX runs on the Altos 5000, tell
me about it.  (Honestly, I don't care.  They're playing a different game
than Interactive is.)

> Regardless, I think your swipe at UNIX World, and my review, was out of line.

I think your complaint is out of line.  I did not take a swipe at your
review.  I didn't say *anything* about your review.  I made a comment about
the wisdom of using _UNIX_World_ as a reference.
-- 
Dick Dunn     rcd at ico.isc.com -or- ico!rcd       Boulder, CO   (303)449-2870
   ...I'm not cynical - just experienced.


