Backup of a live filesystem revisited

System Mangler mangler at cit-vax.Caltech.Edu
Sun Dec 21 05:56:08 AEST 1986


In article <1226 at ho95e.UUCP>, wcs at ho95e.UUCP (#Bill.Stewart) writes:
> File-system based programs can work on live systems as long as the individual
> files are not changing.  They are slow but flexible, and do incremental dumps
> well.
>
> Disk-based backup programs are normally much faster, but are unsafe on live
> file systems;

I claim that both types are unsafe, for the SAME reasons.

In both cases, a file's inode is read (either straight off the raw
disk, or via stat()), and based on that information the rest of the
file is read.  Reading
the inode is an atomic operation, because the inode is completely
contained in one disk sector, so the inode will always be internally
consistent.  However, after the inode is read, the information that
it points to may be freed by a creat(), and scribbled upon, before
the backup program reads it.  The program will either get garbage,
or EOF, but in either case it has to write SOMETHING on the tape now
that it has committed itself by writing out a header saying that the
next st_size bytes are the contents of the file.

That's one kind of corruption, and probably not that bad.  It doesn't
matter that you got garbage; the file was being zapped anyway, and
will appear on the next backup tape.  The important thing is not to
bomb on it.

Another is when the file is removed/renamed between the time that it's
selected for backup and the time it actually gets read.  This is simple
to handle; just skip that file.
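"Just skip it" amounts to treating ENOENT at open() time as routine
rather than fatal.  A sketch of what I mean (my illustration; the
function name open_for_backup is made up):

```c
/* Sketch: a file selected for backup via stat() may be gone by the
 * time we open() it.  ENOENT just means "skip this one"; any other
 * failure is a real error.  (Illustration only.) */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>

int open_for_backup(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0 && errno == ENOENT) {
        fprintf(stderr, "skipping %s: removed since selection\n", path);
        return -1;                 /* caller just moves on */
    }
    return fd;                     /* open fd, or -1 with errno set */
}
```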

The insidious case, though, is when subdirectories get moved out of
a directory that hasn't been backed up yet, and into one that has
already been done or was being skipped.  That subtree won't be restored
at all, and won't be on a subsequent incremental tape either, because
the files didn't change.

Filesystem-based backup programs won't even know that they missed
something; disk-based programs will at least have a way to know
that something happened, because they will come up with all these
orphaned inodes.  Presumably, these should get linked into lost+found.
(I haven't looked to see what *actually* happens).

Dump has the additional advantage that all the directories are read
very early, so the window of vulnerability is smaller.

Sure, I've gotten bad dumps.  In large part I think this happened
because the system mangler before me changed dump to wait for a tape
mount between pass II and pass III, and at that time tape mounts
often took hours - creating a very large window of vulnerability.

> Disk-based backup programs are normally much faster,

Making it feasible to keep one's backups more up-to-date.

Don Speck   speck at vlsi.caltech.edu  {seismo,rutgers,ames}!cit-vax!speck


