Checkpoint/Restart (was "no subject - file transmission")

Gerard K. Newman gkn at ucsd.Edu
Sat Aug 18 06:42:02 AEST 1990


In article <13611 at smoke.BRL.MIL> gwyn at smoke.BRL.MIL (Doug Gwyn) writes:
>Any application that is EXPECTED to run for a long time should have
>interruptibility features built into it.  I did this back in 1967,
>and have little sympathy for people who are too lazy to deal with it.

True enough, but a minor nit:  suppose I am a more-or-less
non-computer-literate type who is using some canned commercial software
(pick your own favorite package -- there are lots of them) to do some
lengthy calculation.  In this case, it would be a real plus for the
operating system to provide some easy (even automatic) means of
periodically checkpointing the job state.  Such systems exist, and many
have existed for quite some time.

I think it's a bit unfair for every user of a system to have to
invent a way to do this that is specific to their particular
application.  In many cases it may not be possible (the above "canned
software" problem being an example).

I agree that adding this capability to many varieties of Unix may
require much skull sweat, especially to get it right.  But in the
environment here at SDSC (and in other places) checkpointing is a
remarkably useful feature.

Cheers,

gkn
San Diego Supercomputer Center


