Checkpoint/Restart (was "no subject - file transmission")

Mark Delany mdelany%hbapn1.prime.com at relay.cs.net
Thu Aug 16 15:43:02 AEST 1990


Mark Holcomb <mth at ROLF.STAT.UGA.EDU> writes:

> I've felt the need for a new tool that Sun doesn't have.

> Ever have a process that's been running for six weeks, and will
> need another week to finish when you need to make level 0 backups
> or would like to shut the computer down for a bad storm?

> I need a tool that would stop a running process and let it be
> restarted at a later date.

> I've thought of a couple of ways it might be done:

[A number of suggestions deleted]

...

A general solution would have to re-establish and re-position all open
files, sockets, message queues, pipes, semaphores, shared-memory segments,
environment variables (add your favourite externally visible entity) to
exactly the state they were in when the process was stopped.
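
To give a feel for the easiest of these cases, here is a minimal sketch
of saving and restoring the position of one open regular file. It is in
Python only for brevity; the helper names snapshot_fd and restore_fd are
made up for illustration. It assumes the file still exists under the same
path and has not changed in the meantime, and it does nothing for sockets,
pipes or the other entities listed above.

```python
import os

def snapshot_fd(fd, path, flags):
    """Record what is needed to reopen a regular file at the same offset.

    Hypothetical helper: the caller must supply the path and open flags,
    since they cannot be recovered portably from the descriptor alone.
    """
    return {
        "path": path,
        "flags": flags,
        # lseek with a zero offset from SEEK_CUR reports the current position.
        "offset": os.lseek(fd, 0, os.SEEK_CUR),
    }

def restore_fd(snap):
    """Reopen the file and seek back to the recorded offset.

    Assumes the file is unchanged; the new descriptor number may differ
    from the original one.
    """
    fd = os.open(snap["path"], snap["flags"])
    os.lseek(fd, snap["offset"], os.SEEK_SET)
    return fd
```

Even this simple case leaks: the descriptor number can differ after the
restore, so a program that remembered the old number would need it remapped.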

Once you've done this, it's a simple matter of re-constructing your memory
image.

Finally, you have to hope that none of the code in your program has
stashed the PID or date away in memory somewhere, as these may be
different when you next restart the prog :-)

Seriously, doing this in any substantive manner is difficult and I'm sure
it would be virtually impossible to bullet-proof it on UNIX.

When confronted with this array of problems, most people opt for
individualized, per-program solutions for those progs that run for long
periods.
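
As a rough sketch of what such a per-program solution might look like:
the program decides for itself what state matters, writes it out every
so often, and picks up from the last saved state when restarted. The
names run, save_checkpoint and load_checkpoint, the JSON file format and
the toy computation (summing step numbers) are all assumptions made for
illustration; stop_after merely simulates the machine going down mid-run.

```python
import json
import os

def save_checkpoint(path, state):
    """Write state to a temporary file, then rename it into place,
    so an interruption never leaves a half-written checkpoint."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)

def load_checkpoint(path, default):
    """Return the saved state, or default if no checkpoint exists yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default

def run(path, total_steps, stop_after=None, every=100):
    """Toy long-running job: adds up the step numbers 0..total_steps-1.

    stop_after simulates a shutdown after that many steps in this run;
    in that case the state is saved and None is returned.
    """
    state = load_checkpoint(path, {"step": 0, "acc": 0})
    done = 0
    while state["step"] < total_steps:
        if stop_after is not None and done >= stop_after:
            save_checkpoint(path, state)
            return None
        state["acc"] += state["step"]
        state["step"] += 1
        done += 1
        if state["step"] % every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["acc"]
```

Calling run() again with the same checkpoint path resumes from the saved
step rather than starting over. Note that only values the program chose to
save survive; anything like a PID or the current date should be re-read
after a restart, not restored.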


Mark D.



More information about the Comp.unix.wizards mailing list