GNU-tar vs dump(1)

Fred Fish fnf at estinc.UUCP
Sun Jan 8 04:50:42 AEST 1989


In article <629 at mks.UUCP> egisin at mks.UUCP (Eric Gisin) writes:
>One of the potential problems with using tar or cpio for backups
>is that a sparse file (one with unallocated blocks)
>that uses little disk space will use more space in the backup.
> (example deleted)
>	  24 -rw-r--r--  1 egisin   100000004 Jan  3 17:59 big
>This file uses 24K on the BSD filesystem, and about 100M in a tar backup.

The problem of archive space consumed can be eliminated by compressing
(LZW) sparse files during the archive process.  This can be done totally
transparently to the user.  The example 100MB file compresses to about 200KB.

During extraction, each block can be tested (after decompression) for
being entirely null bytes, and seeks used to recreate the hole.  In
practice this test/seek is actually faster than writing the blocks of
null bytes.  I believe this was independently confirmed by someone who
posted results of modifying "cp" to create sparse files.
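The test/seek idea can be sketched like this (a hypothetical helper,
not BRU's actual code): each block that turns out to be all nulls is
skipped with a seek, leaving a hole, and a final truncate sets the
file's length so a trailing hole survives:

```python
import os

BLOCKSIZE = 512
ZERO = b"\0" * BLOCKSIZE

def write_sparse(path, blocks):
    """Write blocks to path, seeking over all-null blocks to leave holes."""
    with open(path, "wb") as f:
        size = 0
        for block in blocks:
            if block == ZERO[:len(block)]:
                f.seek(len(block), os.SEEK_CUR)  # skip: leave a hole
            else:
                f.write(block)
            size += len(block)
        f.truncate(size)  # materialize a trailing hole, if any

# Example: one data block, a 512K run of nulls, one more data block.
blocks = [b"A" * BLOCKSIZE] + [ZERO] * 1024 + [b"B" * BLOCKSIZE]
write_sparse("sparse.out", blocks)
print(os.path.getsize("sparse.out"))  # logical size: 1026 * 512 = 525312
```

On filesystems that support holes, the middle region consumes no disk
blocks, yet reads back as nulls, which is exactly the behavior the
original file had before it was archived.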

BRU (Backup and Restore Utility) uses both of these techniques, so this is
not just speculation.  A complete filesystem save and restore often
recovers additional free space from files that could have been sparse
but weren't.

-Fred
-- 
# Fred Fish, 1346 West 10th Place, Tempe, AZ 85281,  USA
# asuvax!nud!fishpond!estinc!fnf          (602) 921-1113



More information about the Comp.unix.wizards mailing list