Raw vs. block device

v.wales%ucla-locus at sri-unix.UUCP
Sat Jan 7 05:11:37 AEST 1984


From:            Rich Wales <v.wales at ucla-locus>

Jonathan --

Here is an attempt on my part to describe "block" and "raw" I/O in as
much detail as reasonably possible.  If I have inadvertently made some
misstatement, or left out some important feature, I trust one of the
other "veterans" on this list will correct me.

UNIX has two kinds of device interfaces:  "block", and "character" (also
called "raw").  I'll discuss here the "raw" interface first, since it is
the "lower-level" of the two, and since virtually all devices with block
interfaces will have a raw interface as well.

RAW (CHARACTER) DEVICE INTERFACE

    Generally speaking, the "raw" interface to a device gives you direct
    control over that device.  If you do a "read" system call on a disk
    via the "raw" interface, for example, you will generally invoke a
    single input operation on that disk to read your data.  (There may
    be exceptions here; for example, I once wrote a "raw" device driver
    for an RX02 floppy disk, and since this device can read or write
    only one sector at a time, I implemented long "read" or "write" re-
    quests via multiple I/O commands to the drive.)
    
    Raw I/O is "synchronous":  I/O operations are always done in the
    order requested.  There can never be more than one raw I/O request
    pending per device.  In 4.1BSD, this restriction is generally imple-
    mented by having the driver declare a single "buf" structure per
    device for all raw I/O on that device.  All raw I/O for the device
    goes through a routine called "physio" (in dev/bio.c); "physio" in
    turn checks and manipulates a "busy" status bit in the "buf" struc-
    ture, using the kernel's "sleep"/"wakeup" facility to force requests
    on a busy "buf" structure to wait.
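
    The busy-bit handshake described above can be modelled roughly as
    follows.  This is a single-threaded user-space sketch, not the
    4.1BSD source:  the structure, the flag values, and the function
    names "raw_begin" and "raw_done" are all made up for illustration.
    Where the kernel would call "sleep" or "wakeup", the sketch only
    reports that such a call would be needed.

```c
/* Illustrative flag values; only the idea of a "busy" bit is from the text. */
#define B_BUSY   0x1    /* the buf is in use by some raw request        */
#define B_WANTED 0x2    /* somebody is waiting for the buf to come free */

/* Hypothetical stand-in for the single raw-I/O "buf" a driver declares. */
struct rawbuf {
    int flags;
};

/*
 * Try to claim the buf for one raw transfer.
 * Returns 1 if the caller now owns it, or 0 if the caller would have
 * to sleep (a real kernel would block here and retry on wakeup).
 */
int raw_begin(struct rawbuf *bp)
{
    if (bp->flags & B_BUSY) {
        bp->flags |= B_WANTED;      /* remember that a waiter exists */
        return 0;
    }
    bp->flags |= B_BUSY;
    return 1;
}

/*
 * Finish the transfer and free the buf.
 * Returns 1 if a wakeup would be needed because someone was waiting.
 */
int raw_done(struct rawbuf *bp)
{
    int need_wakeup = (bp->flags & B_WANTED) != 0;

    bp->flags &= ~(B_BUSY | B_WANTED);
    return need_wakeup;
}
```

    Because a second "raw_begin" fails until "raw_done" has run, at
    most one request per buf is ever in progress, which is the
    one-request-per-device rule stated above.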

    Raw I/O is generally subject to any requirements imposed by the
    hardware itself.  For example, if a given disk demands (as most do)
    that all I/O operations start on a sector boundary and comprise an
    integral number of full sectors, then you must observe this restric-
    tion when doing raw I/O on that disk.
    
    If you do try to read or write arbitrary amounts of data at
    arbitrary places on a disk via a raw interface, you are likely to
    get unpredictable results.  (In particular, a misaligned "write" is
    liable to trash innocent data.)  If the driver is well written and
    checks for this situation, you may get an explicit error, but you
    shouldn't in general depend on this.  This, by the way, is why you
    can't use "adb" on a raw device:  "adb" reads and writes at
    arbitrary byte offsets.
    
    In the case of the RX02 driver I mentioned earlier, I chose to
    implement multi-sector "read"s and "write"s as a convenience to the
    user.  I could have forbidden them (because the RX02 hardware
    doesn't support them) and still have been perfectly within the
    philosophy of raw I/O interfaces.  My driver still required all
    transfers to start on sector boundaries and comprise an integral
    number of full sectors, though -- and I explicitly tested for
    violations of this constraint before doing the I/O.
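
    A check of that kind, together with the splitting of a long request
    into one-sector commands, might look roughly like the sketch below.
    It is not the RX02 driver itself:  the function name "raw_split"
    and its interface are hypothetical, and the sector size is passed
    in as a parameter rather than taken from any particular drive.

```c
#include <stddef.h>

/*
 * Validate a raw transfer and break it into single-sector commands.
 * On success, stores the sector numbers to be transferred, in order,
 * in sectors[] and returns how many there are.  Returns -1 if the
 * request does not start on a sector boundary, is not a whole number
 * of sectors, or needs more than max commands.
 */
long raw_split(long offset, long count, long sector_size,
               long *sectors, size_t max)
{
    long first, n, i;

    if (sector_size <= 0 || offset < 0 || count <= 0)
        return -1;
    if (offset % sector_size != 0 || count % sector_size != 0)
        return -1;                  /* misaligned: refuse before any I/O */

    first = offset / sector_size;
    n = count / sector_size;
    if ((size_t)n > max)
        return -1;

    for (i = 0; i < n; i++)
        sectors[i] = first + i;     /* one device command per sector */
    return n;
}
```

    With a sector size of 256, for example, a 768-byte request at
    offset 512 becomes three commands for sectors 2, 3 and 4, while a
    request at offset 100 is rejected outright.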

    Raw I/O on terminal lines is somewhat complicated by the use of the
    "clist" mechanism (see sys/prim.c).  Hence, terminal I/O may be to
    some extent asynchronous, even though a "raw" interface is in use.

BLOCK DEVICE INTERFACE

    The block interface (if one exists) to a device goes through a com-
    plicated buffering/caching scheme.  A number of buffers (each one
    1024 bytes long in 4.1BSD, or 512 bytes long in Version 7) are allo-
    cated by the kernel for block I/O.  Each buffer is labelled with the
    device (major/minor) and block numbers, so that repeated references
    to the same block do not result in actual "read" operations if the
    block is already in main memory.
    
    Each buffer has a "dirty" bit, so data need not be written back to
    disk immediately upon the issuance of a "write" system call.
    Instead, data is written back when the buffer is needed for another
    block (LRU caching strategy); when a "sync" system call is issued by
    a process; or when a block device is closed (and, if it was mounted,
    unmounted).
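
    To make the labelling, the dirty bit, and the LRU write-back a
    little more concrete, here is a toy user-space model.  It is only a
    sketch under simplifying assumptions:  the names ("cache_get",
    "cache_sync", and the structure fields) are invented, there are
    just three buffers, and device reads and writes are merely counted
    rather than performed.

```c
#include <string.h>

#define NBUF  3         /* deliberately tiny so eviction is easy to see */
#define BSIZE 1024      /* block size quoted above for 4.1BSD           */

struct buf {
    int           valid;        /* does this buffer hold a block?      */
    int           dev;          /* label: device number                */
    long          blkno;        /* label: block number                 */
    int           dirty;        /* modified since it was read in?      */
    unsigned long last_used;    /* larger means more recently used     */
    char          data[BSIZE];
};

struct cache {
    struct buf    bufs[NBUF];
    unsigned long clock;        /* bumped on every reference           */
    long          dev_reads;    /* simulated device reads              */
    long          dev_writes;   /* simulated device writes             */
};

/* Return the buffer holding (dev, blkno), "reading" it in on a miss. */
struct buf *cache_get(struct cache *c, int dev, long blkno)
{
    struct buf *victim = &c->bufs[0];
    int i;

    for (i = 0; i < NBUF; i++) {
        struct buf *bp = &c->bufs[i];

        if (bp->valid && bp->dev == dev && bp->blkno == blkno) {
            bp->last_used = ++c->clock;
            return bp;                      /* hit: no device I/O */
        }
        /* Prefer a free buffer; otherwise the least recently used. */
        if (!bp->valid) {
            if (victim->valid)
                victim = bp;
        } else if (victim->valid && bp->last_used < victim->last_used) {
            victim = bp;
        }
    }

    if (victim->valid && victim->dirty)
        c->dev_writes++;            /* the delayed write happens only now */

    victim->valid = 1;
    victim->dev = dev;
    victim->blkno = blkno;
    victim->dirty = 0;
    memset(victim->data, 0, BSIZE); /* stands in for the device read */
    c->dev_reads++;
    victim->last_used = ++c->clock;
    return victim;
}

/* Write every dirty buffer back, as a "sync" would. */
void cache_sync(struct cache *c)
{
    int i;

    for (i = 0; i < NBUF; i++) {
        if (c->bufs[i].valid && c->bufs[i].dirty) {
            c->dev_writes++;
            c->bufs[i].dirty = 0;
        }
    }
}
```

    A caller modifies "data" and sets "dirty" itself; nothing reaches
    the simulated device until the buffer is reused for another block
    or "cache_sync" is called.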

    A "block" driver interface to a device is free to perform I/O opera-
    tions in any order it sees fit -- not necessarily the order in which
    "read" or "write" system calls were issued.  (Hence, while raw I/O
    is "synchronous", block I/O is "asynchronous".)  Most disk drivers
    use a queue of pending I/O requests for each drive, sorted in order
    by cylinder so as to allow the disk arm to sweep back and forth
    across the surface in "elevator" fashion.  In a "raw" interface, on
    the other hand, there is no need for a queue of pending requests,
    since by definition only one raw I/O request can ever be pending for
    any given device.
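
    The "elevator" ordering can be illustrated with a short sketch.
    This is not taken from any actual driver:  it assumes the arm is
    currently moving toward higher cylinder numbers, and the function
    name "elevator_order" is invented.  Pending requests are reduced to
    bare cylinder numbers for simplicity.

```c
#include <stdlib.h>

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;

    return (x > y) - (x < y);
}

static void reverse(int *v, size_t n)
{
    size_t i;

    for (i = 0; i < n / 2; i++) {
        int t = v[i];

        v[i] = v[n - 1 - i];
        v[n - 1 - i] = t;
    }
}

/*
 * Reorder cyl[0..n-1] in place into the order an upward-moving arm at
 * cylinder `head` would service them: first everything at or beyond
 * the head in ascending order, then the rest in descending order on
 * the way back down.
 */
void elevator_order(int *cyl, size_t n, int head)
{
    size_t behind = 0;

    qsort(cyl, n, sizeof cyl[0], cmp_int);
    while (behind < n && cyl[behind] < head)
        behind++;                   /* cyl[0..behind) lie behind the arm */

    reverse(cyl, n);                /* ahead part now first, descending  */
    reverse(cyl, n - behind);       /* make the ahead part ascending     */
}
```

    With the arm at cylinder 53 and requests for cylinders 98, 183, 37,
    122, 14, 124, 65 and 67, the resulting order is 65, 67, 98, 122,
    124, 183, 37, 14:  one sweep up, then one sweep down.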

    The buffering scheme allows you to do I/O with arbitrary byte off-
    sets and byte counts, even if the device itself does not support
    such access.  For example, if you want to write a single byte in the
    middle of a block using the block interface, the kernel will read in
    the entire block and then change the single byte in question.  An
    I/O operation which spans multiple blocks (perhaps starting in the
    middle of one block and ending in the middle of another) is handled
    in a similar fashion.
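
    The read-modify-write step can be sketched as follows.  To keep the
    example short it leaves out the buffer cache and the delayed write
    entirely and talks to a made-up in-memory "device" that can only
    move whole blocks; the structure and the function "blk_write" are
    hypothetical.  Skipping the read when a whole block is overwritten
    is a natural shortcut in this sketch, not something stated above.

```c
#include <string.h>

#define BSIZE 1024      /* block size quoted above for 4.1BSD */

/* Hypothetical device that transfers whole blocks only. */
struct blkdev {
    char *store;        /* nblocks * BSIZE bytes of backing storage */
    long  nblocks;
    long  reads;        /* whole-block reads performed  */
    long  writes;       /* whole-block writes performed */
};

static void dev_read(struct blkdev *d, long blk, char *buf)
{
    memcpy(buf, d->store + blk * BSIZE, BSIZE);
    d->reads++;
}

static void dev_write(struct blkdev *d, long blk, const char *buf)
{
    memcpy(d->store + blk * BSIZE, buf, BSIZE);
    d->writes++;
}

/*
 * Write len bytes from src at byte offset off, with no alignment
 * requirement.  Returns the number of bytes written, or -1 if the
 * range falls outside the device.
 */
long blk_write(struct blkdev *d, long off, const char *src, long len)
{
    char buf[BSIZE];
    long done = 0;

    if (off < 0 || len < 0 || off + len > d->nblocks * BSIZE)
        return -1;

    while (done < len) {
        long blk    = (off + done) / BSIZE;
        long in_blk = (off + done) % BSIZE;
        long n      = BSIZE - in_blk;

        if (n > len - done)
            n = len - done;
        if (n < BSIZE)
            dev_read(d, blk, buf);  /* partial block: fetch old contents */
        memcpy(buf + in_blk, src + done, (size_t)n);
        dev_write(d, blk, buf);     /* always a whole block */
        done += n;
    }
    return done;
}
```

    Writing one byte at offset 1500 thus costs one block read and one
    block write, and a 1024-byte write starting at offset 512 touches
    two blocks, each of which is read, patched, and written back.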

    The block I/O mechanism is used by the routines which implement reg-
    ular file I/O, needless to say.

WHICH DEVICES ARE BLOCK?  WHICH DEVICES ARE RAW?

    In general, every device will have a raw interface.  Additionally,
    a device on which it would make sense to put a file system (i.e.,
    disks) will generally have a block interface.  Most tape drivers
    also have a block interface, although I have never had occasion to
    access a tape by anything but the raw interface.

    If you are doing a "dd" (byte-for-byte copy) of a large area of disk
    (say, for example, that you are moving a file system from one part
    of the disk to another), you should probably use the raw interface,
    since it is far more efficient than the block interface.  In partic-
    ular, large block sizes in "dd" can generally be handled by the raw
    disk interfaces, whereas the block interface will cut a large trans-
    fer down into 1K-byte chunks.
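
    As a back-of-the-envelope illustration of the difference, the
    sketch below counts device transfers for a copy of a given size.
    It assumes, as described above, that the raw interface can pass
    each "dd" request to the device in one piece and that the block
    interface moves 1K bytes at a time; the helper "device_ops" is
    invented for this example, and real figures will depend on the
    driver and the hardware.

```c
/*
 * Number of device transfers needed to move total_bytes when each
 * transfer carries at most chunk bytes (rounded up).
 */
long device_ops(long total_bytes, long chunk)
{
    if (total_bytes <= 0 || chunk <= 0)
        return 0;
    return (total_bytes + chunk - 1) / chunk;
}
```

    For a 10-megabyte area copied with a 64K-byte "dd" block size, this
    gives 160 transfers through the raw interface, against 10240
    transfers of 1K bytes each through the block interface.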

    Terminals have only a raw interface.  Also, such "funny" files as
    /dev/null and /dev/kmem are implemented via raw interfaces.  (Of
    course, you can still do I/O on /dev/kmem from random offsets and
    with random byte counts, since memory does not have the alignment
    restrictions that a disk does.)

DEVICE SPECIAL FILES AND RAW VS. BLOCK I/O

    There are two kinds of device special files in UNIX:  raw and block.
    The major device number of a device special file is associated with
    a set of device-driver routines via one of two tables in dev/conf.c:
    "cdevsw" ("c" = "character" = "raw") for raw devices, and "bdevsw"
    ("b" = "block") for block devices.  In particular, note that there
    is no necessary relationship whatsoever between "raw" major device
    number N and "block" major device number N.
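
    The independence of the two tables can be shown with a toy lookup.
    Everything in this sketch is illustrative:  the table contents and
    driver names are made up (they are not copied from any real
    dev/conf.c), the tables hold only a name where the real ones hold
    driver entry points, and the device number is assumed to be a
    16-bit value with the major number in the high byte.

```c
#include <stddef.h>

typedef unsigned short devnum;      /* stand-in for the kernel's dev_t */

#define DEV_MAJOR(d)     (((d) >> 8) & 0377)
#define DEV_MINOR(d)     ((d) & 0377)
#define MAKE_DEV(ma, mi) ((devnum)(((ma) << 8) | (mi)))

/* A real switch entry holds pointers to driver routines, not a name. */
struct devsw {
    const char *driver;
};

/* Hypothetical table contents, indexed by major device number. */
static const struct devsw bdevsw_demo[] = {
    { "disk" },         /* block major 0 */
    { "tape" },         /* block major 1 */
};

static const struct devsw cdevsw_demo[] = {
    { "console" },      /* raw major 0 */
    { "mem" },          /* raw major 1 */
    { "disk" },         /* raw major 2: same driver as block major 0 */
    { "tape" },         /* raw major 3 */
};

/*
 * Find the driver for a device special file.  The file's type (block
 * or raw) selects the table; the major number selects the entry.
 * Returns NULL if the major number is out of range.
 */
const char *driver_for(int is_block, devnum dev)
{
    const struct devsw *tab = is_block ? bdevsw_demo : cdevsw_demo;
    size_t n = is_block
        ? sizeof bdevsw_demo / sizeof bdevsw_demo[0]
        : sizeof cdevsw_demo / sizeof cdevsw_demo[0];
    unsigned ma = DEV_MAJOR(dev);

    return ma < n ? tab[ma].driver : NULL;
}
```

    In this made-up configuration, major number 0 means the disk when
    the special file is a block device but the console when it is a
    raw device, and the raw side of the disk lives at major number 2.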

I hope this covers your question adequately.  If not, let me know and I
will try to supply additional information.

-- Rich <v.wales at UCLA-LOCUS>


