SCSI & IPI rates (was: Re: disk write throughput)

Larry McVoy lm at sun.eng.sun.com
Sat May 26 05:09:24 AEST 1990


In article <7563 at brazos.Rice.edu> I wrote:
$ > So the question is, can a SCSI disk/controller that is rated at 4
$ >MB/s handle 2 MB/s for at least one minute? Preferably through the file
$ >system, but we could live with writes to the raw disk if necessary. And
$ >what about SMD and IPI (or IPI-2)?
$ 
$ I do I/O performance engineering.  You won't see 2meg / sec through SCSI
$ on any 4.1 system.  The fastest that I have seen is 1.3 on an IBM 320 meg
$ drive (nice drives - we don't sell them yet).  This was with a modified
$ kernel - your rates will be lower.

I should have been more careful here.  While it is true that it's hard to
get to 2 Mbyte/sec through SCSI, it is by no means impossible.  You need a
drive whose data rate is that high - you can compute the upper bound on the
data rate as follows [note: assumes only one head active at a time, a
reasonable assumption for many drives]:

	Kbyte / sec = nsects / 2 * rpm / 60

where nsects and rpm are available in /etc/format.dat, and nsects / 2 is
Kbytes per track assuming 512-byte sectors.  Note that this is
an upper bound, not a lower bound.  Many drives reserve some of those
sectors for sector slipping, effectively reducing the platter speed of the
drive.  For example, I found an entry in format.dat (commented out; what that
means I can only guess) for a CDC Wren VII 94601-12G that rotates at 3596 RPM
and has 80 sectors per track.  That works out to a max platter speed of about
2397 Kbytes/second.  Getting 2 megs/sec off of this drive should be no sweat.
Putting it on is a little harder, since SCSI typically doesn't do zero
latency writes.  If you do your writes in larger chunks, say 120K at a
time, then you don't blow revolutions as often and can approach the 2 meg
rate.  It is critical that the writes be large;  naive OS implementations
send writes down to the drive in 8K (file system block size) chunks.  The
difference between that and 120K chunks is almost a factor of two.  On
experimental versions of SunOS I've seen rates go from 800K to 1300K just
by increasing the block size.
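
The upper-bound formula above is easy to check by hand or with a few lines
of code.  The following is a minimal sketch, not anything from SunOS: the
function name is made up for illustration, and the 512-byte sector size is
an assumption implied by the "nsects / 2" term.  Plugging in the Wren VII
figures (80 sectors, 3596 RPM) gives a bound of roughly 2.4 Mbytes/second.

```python
def max_platter_rate_kbytes(nsects, rpm, sector_bytes=512):
    """Upper bound on sustained data rate in Kbytes/sec, one head active.

    sector_bytes=512 is an assumption; it is what makes
    Kbytes per track equal to nsects / 2.
    """
    kbytes_per_track = nsects * sector_bytes / 1024
    revs_per_sec = rpm / 60
    return kbytes_per_track * revs_per_sec


# CDC Wren VII 94601-12G figures quoted in the text.
wren_rate = max_platter_rate_kbytes(nsects=80, rpm=3596)
print(f"Wren VII upper bound: {wren_rate:.0f} Kbytes/sec")
```

Remember that sector slipping and other reserved sectors only push the real
number below this bound.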

The high order bit here is that you have to look at your drive to see what
data rate you can expect.  And expect slower writes than reads since most
drives have track buffers that are write through caches (i.e., help only
on reads).
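
One way to see what a given drive actually delivers is to time writes at
different chunk sizes.  The sketch below is only an illustration under
stated assumptions: the function name, the 4 Mbyte test size, and the use
of a temporary file are all made up for the example, and it goes through
the file system and buffer cache, so numbers from a current machine will
not resemble the 1990 figures above.  It compares 8K writes against 120K
writes, the two sizes discussed earlier.

```python
import os
import tempfile
import time


def timed_write(path, total_bytes, chunk_bytes):
    """Write total_bytes to path in chunk_bytes pieces; return Kbytes/sec."""
    buf = b"\0" * chunk_bytes
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        remaining = total_bytes
        while remaining > 0:
            # os.write may write fewer bytes than asked; count what it did.
            written = os.write(fd, buf[:min(chunk_bytes, remaining)])
            remaining -= written
        os.fsync(fd)  # include the time needed to push data to the device
        elapsed = max(time.perf_counter() - start, 1e-9)
    finally:
        os.close(fd)
    return (total_bytes / 1024) / elapsed


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "testfile")
    for chunk_kb in (8, 120):
        rate = timed_write(target, 4 * 1024 * 1024, chunk_kb * 1024)
        print(f"{chunk_kb:4d}K chunks: {rate:8.0f} Kbytes/sec")
```

Run it several times and against a file large enough to defeat caching
before trusting any one number.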

$ IPI will do what you want - I've seen 2 megs through them with vanilla
$ 4.1.

I lied.  I went and tried it.  On SunOS 4.1 PSRA (shipping currently) with
CDC IPI 9720 drives, it's easy to get 2.2 or 2.3 megs / second on a single
drive, reads or writes.  Those drives have smart controllers and I believe
they have zero latency write ability so they don't have the problem of
blowing revs between each write.

$ Note that the SunOS VM system is cool - your perceived performance will be
$ much higher than the disk rate for small (< 4meg) writes since the kernel
$ just copies in the data and tells you that it is done and then proceeds to
$ dribble it out to the drive.

Someone sent me mail pointing out that this isn't so for writes over NFS.
He's right.  Doing writes to an NFS file is similar to doing writes to a
UFS file where the file was opened with O_SYNC (the difference is that an
NFS file will cache small writes for a short time).  Write performance
over NFS suffers due to the stateless nature of NFS.  It is a requirement
for correct operation that the data be on the server's drive before the
server says OK to the client.  If this were not so then you would be in
serious trouble each time a server crashed.
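
For anyone who wants to see the O_SYNC behaviour described above on a local
file, here is a minimal sketch.  The helper name and file names are
hypothetical, and os.O_SYNC is not defined on every platform, so the code
falls back to an ordinary buffered open where it is missing.  With O_SYNC
each write returns only after the data has been handed to stable storage,
which is roughly the guarantee an NFS server has to give its clients.

```python
import os
import tempfile

# os.O_SYNC is missing on some platforms; 0 means "no extra flag" there.
O_SYNC = getattr(os, "O_SYNC", 0)


def write_file(path, data, synchronous):
    """Write data to path, optionally opening the file with O_SYNC."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    if synchronous:
        flags |= O_SYNC
    fd = os.open(path, flags, 0o600)
    try:
        view = memoryview(data)
        while view:
            # Advance past whatever os.write actually accepted.
            view = view[os.write(fd, view):]
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    payload = b"x" * (64 * 1024)
    write_file(os.path.join(tmp, "buffered"), payload, synchronous=False)
    write_file(os.path.join(tmp, "synced"), payload, synchronous=True)
```

Timing the two calls on a large payload should show the cost of waiting for
the drive on every write.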

Larry McVoy, Sun Microsystems    (415) 336-7627       ...!sun!lm or lm at sun.com


