single user while dumping

jorgnsn at qucis.queensu.ca
Wed Nov 23 17:03:02 AEST 1988


We just got an Exabyte 8 mm tape drive to do unattended dumps during the
night.  Since we have a couple of night-owl users who are active at all
hours, we were also wondering whether dumps run on an active filesystem
can be trusted.  So I tried a few experiments.

I started dump, then suspended it at different points and created or
deleted files, then continued the dump and tried a full restore with the
resulting tape.  The upshot is that it is possible, though unlikely, that
activity on the filesystem could invalidate dumped data other than the
active files.  Every time I deleted a directory during one of dump's first
three passes, dump would continue without complaint, but restore would
give me something like:

    <filename>: not found on tape
    .
    .        (similar messages)
    .
    expected next file 14340, got 14337
    .
    .        (similar messages)
    .
    cannot find directory inode 4
    abort? [yn] 

I would say `n', and then restore would dump core.  Many files, with no
obvious connection to the deleted directory, would be missing from the
restored file system.  These files were also missing from the listing you
get with restore t or restore i, so you would be able to tell you had a
bad dump if you checked the listing.
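
A minimal sketch of that check, not something from the experiments above:
record the file names with find before dumping, then compare them against
the output of restore t.  The function name and both file arguments are
made up for illustration, and the sketch assumes that each restore t line
is an inode number followed by a path beginning with "./".  On an active
filesystem a few differences are expected for files created or removed
during the dump; a long list of unrelated names is the warning sign.

```shell
#!/bin/sh
# missing_from_dump EXPECTED LISTING   (hypothetical helper)
#   EXPECTED: output of `find . -print` run at the top of the filesystem
#   LISTING:  saved output of `restore t`, assumed to be "<inode> <path>" lines
# Prints every path that is in EXPECTED but absent from LISTING.
missing_from_dump() {
    listed=${TMPDIR:-/tmp}/listed.$$

    # Strip the leading inode column; header lines without one pass through
    # unchanged and are harmless, since only names from EXPECTED are printed.
    sed 's/^[[:space:]]*[0-9][0-9]*[[:space:]]*//' "$2" | LC_ALL=C sort > "$listed"

    # comm -23 keeps lines that appear only in the first (expected) input.
    LC_ALL=C sort "$1" | LC_ALL=C comm -23 - "$listed"

    rm -f "$listed"
}
```

Empty output means every expected name was found in the tape listing.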

I thought that actually suspending dump to delete the directory might be
too severe a test, since from dump's point of view the directory is there
one split second, and completely gone the next.  So I also tried to run
dump on one terminal as root while I deleted directories on another as an
ordinary user.  The timing has to be just so (I tried to hit the return
key on my ``rm -r'' command just after the "DUMP:  mapping (Pass II)
[directories]" message), and the deleted directory has to be from the
right spot on the disk, but one time out of three, I did manage to spoil a
dump.

So now we're planning to unmount the filesystem on its server before
dumping it.  As far as I can tell, if you unmount the filesystem on the
server while it is still mounted on clients, users on the clients just get
messages about stale NFS handles when they try to look at the missing
filesystem.  (If anyone knows of any more drastic results of leaving a
filesystem mounted on clients when it is not mounted on the server, please
let me know.)
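
A rough sketch of what such a nightly job might look like.  The mount
point, raw device, and tape device are placeholders rather than the real
configuration, and the dump keys (0uf) are just a common choice for a full
dump.  By default the function only prints the commands it would run; set
RUN to the empty string to execute them as root.

```shell
#!/bin/sh
# Placeholder values; substitute the names used on your own server.
FS=${FS:-/home}              # mount point of the filesystem (assumed)
RAW=${RAW:-/dev/rxy0g}       # raw disk device behind it (assumed)
TAPE=${TAPE:-/dev/nrst0}     # tape device (assumed)
RUN=${RUN-echo}              # "echo" = dry run; RUN= executes for real

quiet_dump() {
    # umount fails if the filesystem is busy; in that case skip the dump
    # rather than dump an active filesystem.
    $RUN umount "$FS" || { echo "umount $FS failed; not dumping" >&2; return 1; }

    # Level 0 dump of the now-quiescent raw device.
    $RUN dump 0uf "$TAPE" "$RAW"
    status=$?

    # Remount even if dump failed, so users get the filesystem back.
    $RUN mount "$FS"
    return $status
}
```

Remounting whether or not dump succeeded keeps a tape error from leaving
the filesystem unavailable in the morning.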

John Jorgensen
jorgnsn at qucis.queensu.ca
jorgnsn at qucis.bitnet


