Sun Checkpoint procedure

Seth Robertson seth at ctr.columbia.edu
Mon Apr 9 07:46:15 AEST 1990


Greetings:

Below I have included a beta-test checkpointing program for Sun 3s.

Checkpointing, for those of you who do not know, consists of saving the
program state every so often so that if the program crashes you can
restart it from the last checkpoint.

Basically, what you have to do is insert a couple of lines in your main()
and then select points in your program to do checkpoints (it should be
possible to set up an alarm to do it every hour or so, but I have not
tried this).  You need to select these points carefully because the
process of checkpointing has a lot of overhead, so it is
important not to do it too frequently.
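
The alarm idea mentioned above is untested by the author, so the following
is only a sketch of one way it might look.  The handler just sets a flag,
and the expensive checkpoint happens later at a point the program chooses.
The names schedule_checkpoint(), maybe_checkpoint() and checkpoint_due are
hypothetical, and the Checkpoint() here is a stand-in stub so the fragment
compiles by itself; in real use it would be the routine from chkpnt.c.

```c
#include <signal.h>
#include <unistd.h>

/* Stand-in for the real Checkpoint() in chkpnt.c so that this sketch
 * compiles by itself.  It only counts calls. */
int checkpoints_taken = 0;
int Checkpoint(void) { checkpoints_taken++; return 0; }

/* Set by the SIGALRM handler, cleared just before a checkpoint. */
volatile sig_atomic_t checkpoint_due = 0;

/* The handler does no real work; it only records that time is up. */
void on_alarm(int sig) { (void)sig; checkpoint_due = 1; }

/* Install the handler and ask for a SIGALRM in `seconds` seconds. */
void schedule_checkpoint(unsigned seconds)
{
  signal(SIGALRM, on_alarm);
  alarm(seconds);
}

/* Call this at a safe point in the compute loop.
 * Returns 1 if a checkpoint was taken, 0 if none was due,
 * and -1 if the checkpoint failed. */
int maybe_checkpoint(unsigned seconds)
{
  int rc;

  if (!checkpoint_due)
    return 0;

  checkpoint_due = 0;           /* Clear first so the saved image is clean */
  rc = Checkpoint();
  if (rc < 0)
    return -1;

  /* Reached both after a normal checkpoint and after a restore.
   * Assumption: a restored process has lost its handler and timer,
   * so both are set up again here. */
  schedule_checkpoint(seconds);
  return 1;
}
```

A program would call schedule_checkpoint(3600) once near the top of main()
and maybe_checkpoint(3600) once per outer loop iteration.  Since programs
that use signals need care with these routines, treat this as a starting
point rather than something known to work.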

This checkpointing program is very good for compute-bound programs.
Programs that do I/O have problems because my checkpointing routines
_DO_NOT_DO_ANYTHING_ABOUT_FILES_.  If you have open files, you MUST
reopen them and lseek() back to where you were after a restore.  Also,
programs that use signals need to be careful.  Basically, if your
program reads in data, thinks about it for a few days, then spits it
back out, this is for you.
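
Reopening and reseeking a file by hand might look roughly like the sketch
below.  The helper names save_offset() and reopen_at() are hypothetical and
are not part of the checkpoint routines.  The idea is to record the offset
in an ordinary variable before calling Checkpoint(), so that it is saved
with the rest of the data, and to reopen the file when Checkpoint() reports
that the program was restored.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Before checkpointing: remember the current position in the file.
 * Returns the offset, or (off_t)-1 on error. */
off_t save_offset(int fd)
{
  return lseek(fd, (off_t)0, SEEK_CUR);
}

/* After a restore: open the file again and seek back to the saved
 * position.  Returns the new descriptor, or -1 on error. */
int reopen_at(const char *path, int flags, off_t offset)
{
  int fd = open(path, flags);

  if (fd < 0)
    return -1;

  if (lseek(fd, offset, SEEK_SET) == (off_t)-1)
    {
      (void)close(fd);
      return -1;
    }

  return fd;
}
```

The descriptor returned by reopen_at() need not be the same number as the
one that was open before the checkpoint, so the program has to use the new
value.  Buffered stdio streams would need similar treatment, which this
sketch does not attempt.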

Now for some restrictions.  It currently works only on Sun 3s.
It compiles fine on Sun 4s and (I believe) any Vaxen (Ultrix or BSD),
but it does not work there because of the broken setjmp() routines on
Sun 4s and Vaxen.  On Sun 4s, what needs to be done is for someone to
write an assembler routine to save all of the registers, especially
the stack pointer.  I haven't done too much work on the Vaxen, but the
problem is pretty much the same.
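
For readers who have not used setjmp() and longjmp(), here is a small
portable sketch of the return-value convention that Checkpoint() relies
on: setjmp() returns 0 when the state is first saved, and returns the
value given to longjmp() when control comes back later.  The function
names are hypothetical.  The real routines go further and copy the saved
stack back into place before calling longjmp(), which is the part that
depends on the machine; this sketch does not try to show that.

```c
#include <setjmp.h>

static jmp_buf saved;           /* Saved registers, like svdloc in chkpnt.c */

/* Jump back to the point recorded in `saved`.  Never returns. */
static void resume(void)
{
  longjmp(saved, 1);
}

/* Returns 101 if control came back through the saved state after one
 * pass through the normal path, and -1 if longjmp() somehow returned. */
int roundtrip(void)
{
  volatile int passes = 0;      /* volatile so the value survives longjmp */

  if (setjmp(saved) != 0)
    return passes + 100;        /* "Restored" path: setjmp returned nonzero */

  passes++;                     /* Normal path: setjmp returned 0 */
  resume();

  return -1;                    /* Not reached */
}
```

Note that the jump here happens while roundtrip() is still active, which
is the only case standard C promises will work.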

I'm setting up a mailing list (which I envision to be very low volume,
but what do I know?) for future enhancements and the like.  The address
is "chkpnt-request at ctr.columbia.edu"

I would strongly encourage people to join the mailing list and report
their experiences.  I especially would like to hear from
anyone who gets it working on Sun 4s.

Enough of this, here is the code:


#! /bin/sh
# This is a shell archive.  Remove anything before this line, then unpack
# it by saving it into a file and typing "sh file".  To overwrite existing
# files, type "sh file -c".  You can also feed this as standard input via
# unshar, or by typing "sh <file", e.g..  If this archive is complete, you
# will see the following message at the end:
#		"End of shell archive."
# Contents:  Makefile README chkpnt.c chkpnt.h main.c
# Wrapped by seth at sirius on Sun Apr  8 17:42:56 1990
PATH=/bin:/usr/bin:/usr/ucb ; export PATH
if test -f 'Makefile' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'Makefile'\"
else
echo shar: Extracting \"'Makefile'\" \(1154 characters\)
sed "s/^X//" >'Makefile' <<'END_OF_FILE'
X# Makefile attached to demo checkpoint/restore program
X#
X# The CPP Flag TIMEINFO will cause the c/r to print info on how long
X# it took to checkpoint a program.
X#
X# The CPP Flag AOUT will cause the c/r to look in the argv[0] to determine
X# (for a vax) what some exec headers are, and it will also cause the symbol
X# table to be put on the checkpointed program.
X#
XCC = cc
XTARGET_ARCH=
X#LINTFLAGS= -hx
X# For SunOS 4.x
XCFLAGS= -g -Bstatic
XCPPFLAGS= -DSUNOS4 -DTIMEINFO -DAOUT
X# For SunOS 3.x
X#CFLAGS= -g -Bstatic
X#CPPFLAGS= -DTIMEINFO
X# For 4.3 BSD
X#CFLAGS = -g -DBSD43 -DTIMEINFO
X#CPPFLAGS=  -DBSD43 -DTIMEINFO
X# For Ultrix
X#CFLAGS = -g
X#CPPFLAGS= -DULTRIX -DTIMEINFO
X
XCSRC= chkpnt.c main.c
XOBJS= $(CSRC:.c=.o)
X
XFORKS= rforkd rforkt
X
Xpoint:	$(OBJS)
X	$(LINK.c) -o $@ $(OBJS)
X
Xforks: $(FORKS)
X
X# You need to make flip so that the program won't overwrite its own text
Xflip:
X	cp cpnt cpnt-test
X
Xlint:
X	$(LINT.c) $(CSRC)
X
Xclean:
X	rm -f core a.out *~ *.o cpnt*
X
Xrcpout:
X	rcp chkpnt.[ch] main.c sethr@cunixc:check
X	rcp chkpnt.[ch] main.c sethr@cs:check
Xrcpcs:
X	rcp sethr@cs:chkpnt.[ch] .
Xrcpcc:
X	rcp sethr@cunixc:chkpnt.[ch] .
X
Xchkpnt.o: chkpnt.h
Xmain.o: chkpnt.h
END_OF_FILE
if test 1154 -ne `wc -c <'Makefile'`; then
    echo shar: \"'Makefile'\" unpacked with wrong size!
fi
# end of 'Makefile'
fi
if test -f 'README' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'README'\"
else
echo shar: Extracting \"'README'\" \(1412 characters\)
sed "s/^X//" >'README' <<'END_OF_FILE'
X
X
XHello.
X
XThis is a short preliminary document describing the use and restrictions
Xof the checkpoint routines.
X
XThe restrictions are:
X
X1) You have to reopen files manually.
X2) Any funky things like signal handlers and the like can cause the
X   routine to crash and burn or just not be able to be restored
X3) You have to schedule the checkpoints manually.
X
XTo use, you have to include the file chkpnt.h in your program.
X
XYou have to have argc and argv as arguments to your main procedure
Xif you select AOUT as a Makefile option.
X
XThe first statement after the main() variable declarations should
Xbe "CheckRestore()" which will restore the program if it has
Xbeen checkpointed.
X
XYou will probably want to change the name the program is saved to in
Xchkpnt.h
X
XIf the program did crash and does need to be restored, you need to
Xmove it to another name (because otherwise the program will dump
Xa new version over the old program text, and that can sometimes be
Xvery very bad...)
X
XThe author is not responsible for anything that might happen because
Xof the use/nonuse of this routine.
X
X
XPlease send all questions/bug reports/enhancements to:
XSeth Robertson <seth at ctr.columbia.edu>
X
XOr to the mailing list:
X<chkpnt at ctr.columbia.edu>
X
XTo get added to the mailing list:
X<chkpnt-request at ctr.columbia.edu>
X
X                                        -Seth Robertson
X                                         seth at ctr.columbia.edu
END_OF_FILE
if test 1412 -ne `wc -c <'README'`; then
    echo shar: \"'README'\" unpacked with wrong size!
fi
# end of 'README'
fi
if test -f 'chkpnt.c' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'chkpnt.c'\"
else
echo shar: Extracting \"'chkpnt.c'\" \(16124 characters\)
sed "s/^X//" >'chkpnt.c' <<'END_OF_FILE'
X/*
X**  chkpnt
X**
X**  A routine that checkpoints and restores a program
X**
X**  Copyright (c) 1989
X**  All rights are currently reserved.
X**
X**  Redistribution and use in source and binary forms are permitted
X**  provided that the above copyright notice and this paragraph are
X**  duplicated in all such forms.
X**
X**  THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
X**  IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
X**  WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
X**
X**  If you have any enhancements, please send them so that they can
X**  be incorporated into the next version.
X**
X**  This program is quite possibly supported by:
X**	Seth Robertson
X**	seth at ctr.columbia.edu
X**	Columbia University
X**	1220 S.W. Mudd
X**	New York, New York  10027-6699
X**	(212) 854-6475
X**
X**
X**  This has been successfully tested on:
X**	Sun 3 running SunOS 4.0.3
X**	Sun 3 running SunOS 4.0
X**	Sun 3 running SunOS 3.4
X**
X**  This has been unsuccessfully tested on:
X**	Sun 4 running SunOS 4.0.3c
X**	Vax running Ultrix
X**	Vax running BSD 4.3
X*/
X
X#ifndef lint
Xchar copyright[] =
X"@(#) Copyright (c) 1989 Seth Robertson seth at ctr.columbia.edu\n     \
XAll rights reserved.\n";
Xstatic char ident[] = "@(#)chkpnt.c 0.5 Beta 89/11/25 SMI";
X#endif
X
X/*
X**	  The following note applies only to programs which do not
X**	  have AOUT defined.  AOUT tells the computer that it can
X**	  look at the filename argv[0] to find symbol tables and the
X**	  like. (Thus argv must be defined)
X**
X** NOTE:  All programs which use this *MUST* be compiled with static
X**	  linking.  This is because any and all symbols are lost when
X**	  the program is checkpointed.  For sun, compile with -Bstatic
X*/
X
X#include "chkpnt.h"
X#include <a.out.h>
X#include <errno.h>
X#include <sys/param.h>
X#ifdef BSD43
X#include <sys/fcntl.h>
X#else
X#include <fcntl.h>
X#endif
X#include <sys/dir.h>
X#include <machine/vmparam.h>
X#include <sys/types.h>
X#include <sys/timeb.h>
X#include <setjmp.h>
X#include <signal.h>
X#include <sys/user.h>
X#ifdef SUNOS4
X#include <alloca.h>
X#endif
X
X#ifdef BSD43
Xextern int errno;
X#endif
X
X#ifdef vax
Xextern int etext;
Xstatic u_long ETEXT;
X#endif
X
X#ifdef SUNOS4
Xtypedef void SigType;		/* SunOS 4.0 uses void */
X#else
Xtypedef int SigType;		/* Non 4.0 uses int */
X#endif
X
X#ifdef sun
Xtypedef void *DT;		/* Vaxen apparently don't have *void */
X#else
Xtypedef char *DT;
X#endif
X				/* Defines */
Xcaddr_t	brk(), sbrk(), alloca(), calloc(), malloc();
X
Xvoid Dbug();			/* Debug print */
X
X#ifdef AOUT
Xchar *_ProgName = (char *)NULL;	/* argv[0] */
X#endif
X
Xint _ChkPnt = 0;		/* Global (and thus static) checkpoint
X				 * variable.  This is needed to find
X				 * out if the program was checkpointed
X				 * so a restore can be run
X				 */
X
Xint DebugLevel = DEF_DEB_LEV;	/* Setup debugging */
X
X				/* This creates a signal stack that
X				 * the restore will need to use in
X				 * order to work
X				 */
Xstatic char SignalStack[STACKSIZE];
Xstatic caddr_t PSstack;		/* Pointer to saved stack */
Xstatic u_int StackSize;		/* Stack size */
Xstatic jmp_buf svdloc;		/* Non-local goto save */
X
X
X/*
X** Checkpoint
X**
X** Routine that checkpoints the current process at the current point
X** to the files SAVE_NAME and SAVE_STACK_NAME
X*/
Xint
XCheckpoint()
X
X{
X				/* Pointer to the current process's exec
X				 * structure
X				 */
X#ifdef sun
X  struct exec *UsrExec = (struct exec *) USRTEXT;
X#else /* sun */
X# ifdef vax
X  /* NOTHING */
X# else /* vax */
X  THIS IS AN ERROR!!!!  LEAVE ME ALONE
X# endif /* vax */
X#endif /* sun */
X
X
X  struct exec SvdExec;		/* Exec structure that the new checkpointed
X				 * program will have
X				 */
X
X  register int fd;		/* File descriptor of output file */
X
X  static caddr_t Btext;		/* Bottom of text */
X  static caddr_t Theap;		/* Top of heap */
X  static caddr_t Tstack;	/* Top (low end) of stack */
X
X  u_long ChkRst = 0xfeedface;	/* Check for correctness */
X
X#ifdef AOUT
X  int aout;			/* argv[0] file */
X#endif
X
X#ifdef TIMEINFO
X  struct timeb first, second;	/* Starting and ending times */
X  int diffs, diffm;		/* Sec and millisec of diff */
X#endif
X
X#ifdef vax
X  ETEXT = PageBrk(&etext);
X#endif /* vax */
X
X#ifdef TIMEINFO
X  ftime(&first);		/* Set starting time */
X#endif
X
X  Theap = sbrk(0);		/* Save top of heap */
X  Tstack = alloca(0);		/* Save top of stack */
X
X  Dbug(DEBUG,"Restore address is 0x%x\n",(DT)&ChkRst,(DT)0,(DT)0,(DT)0);
X
X  if (setjmp(svdloc))
X    {
X      Dbug(DEBUG,"TOS = 0x%x : OTOS = 0x%x\n",(DT)alloca(0),Tstack);
X
X      Dbug(DEBUG,"Restore address is 0x%x\n",(DT)&ChkRst,(DT)0,(DT)0,(DT)0);
X
X      if (ChkRst != 0xfeedface)
X	{
X	  Dbug(FATAL,"Restore constant 0x%x is not correct!!!\n",
X	       (DT)ChkRst,(DT)0,(DT)0,(DT)0);
X	  exit(RET_FAIL);
X	}
X
X      (void)brk(Theap);		/* Set the data segment back (it was at a
X				 * page break, which wastes space)
X				 */
X
X      return(RET_RESTORE);	/* Return (unneeded) info that I am restored
X				 * version */
X    }
X
X				/* Compute the stack size */
X  StackSize = (u_int)((u_long)(USRSTACK) - (u_long)(Tstack));
X
X  Dbug(DEBUG,"Stack is %d bytes long, starting from 0x%x\n",
X       (DT)StackSize,(DT)Tstack,(DT)0,(DT)0);
X
X  if ((PSstack = malloc(StackSize)) == NULL)
X    {
X      perror("Chkpnt");
X      Dbug(SERIOUS,"Cannot save stack in memory!!!\n",(DT)0,(DT)0,(DT)0,(DT)0);
X      return(RET_FAIL);
X    }
X
X				/* Copy the stack to the newly created
X				 * memory locations
X				 */
X  bcopy(Tstack,PSstack,(int)StackSize);
X
X  Dbug(DEBUG,"Stack saved in memory to 0x%x\n",(DT)PSstack,(DT)0,(DT)0,(DT)0);
X
X
X  if ((fd = open(SAVE_NAME,O_WRONLY|O_CREAT,SAVE_MODE)) < 0 )
X    {
X      Dbug(SERIOUS,"Cannot open checkpoint text/heap save file %s\n",
X	   (DT)SAVE_NAME,(DT)0,(DT)0,(DT)0);
X      return(RET_FAIL);
X    }
X
X  Dbug(DEBUG,"Text/heap save file opened successfully\n",
X       (DT)0,(DT)0,(DT)0,(DT)0);
X
X#ifdef AOUT			/* Lets open the argv[0] file */
X  if ((aout = open(_ProgName,O_RDONLY)) < 0 )
X    {
X      Dbug(SERIOUS,"Cannot open argv[0] (%s) for special info.\n",
X	   (DT)_ProgName,(DT)0,(DT)0,(DT)0);
X      return(RET_FAIL);
X    }
X
X  Dbug(DEBUG,"argv[0] opened successfully\n",
X       (DT)0,(DT)0,(DT)0,(DT)0);
X#endif /* AOUT */
X
X
X  /*
X  ** Create the checkpointed program's exec header
X  **
X  ** Most of the entries are the same, with the exception of
X  ** the data segment size and the symbol table.
X  **
X  ** The data segment size gets modified to reflect that the
X  ** data segment holds everything that was malloced or created
X  ** during the execution of the program so far as well as anything
X  ** that was there before.
X  **
X  ** The symbol table is set to zero to reflect that it is no longer present
X  */
X
X#ifdef vax
X#ifdef AOUT
X				/* The exec header isn't in memory.  Since
X				 * we will be using argv[0] anyway, we
X				 * might as well use the info.
X				 */
X  if (read(aout, &SvdExec,(size_t)sizeof(struct exec)) != sizeof(struct exec))
X    {
X      Dbug(SERIOUS,"Cannot read exec header\n");
X      return(RET_FAIL);
X    }
X
X  SvdExec.a_magic = ZMAGIC;
X/*  SvdExec.a_text stays the same */
X  SvdExec.a_data = PageBrk(sbrk((int)0)) - (u_long)(ETEXT);
X/*   SvdExec.a_bss stays the same */
X/*  SvdExec.a_syms stays the same */
X/*   SvdExec.a_entry stays the same */
X  SvdExec.a_trsize = 0;
X  SvdExec.a_drsize = 0;
X#else /* AOUT */
X
X				/* Well, the exec header is not apparently
X				 * in memory, so I have to guess.  The only
X				 * real problem is the a_bss, but I am not
X				 * happy about a_text and a_entry
X				 */
X  SvdExec.a_magic = ZMAGIC;
X  SvdExec.a_text = (u_long)ETEXT;
X  SvdExec.a_data = PageBrk(sbrk((int)0)) - (u_long)(ETEXT);
X  SvdExec.a_bss = 0;
X  SvdExec.a_syms = 0;
X  SvdExec.a_entry = USRTEXT;
X  SvdExec.a_trsize = 0;
X  SvdExec.a_drsize = 0;
X#endif /* AOUT */
X#else /* vax */
X#ifdef sun
X#ifdef SUNOS4
X  SvdExec.a_dynamic = UsrExec->a_dynamic;
X  SvdExec.a_toolversion = UsrExec->a_toolversion;
X#endif
X  SvdExec.a_machtype = UsrExec->a_machtype;
X  SvdExec.a_magic = UsrExec->a_magic;
X  SvdExec.a_text = UsrExec->a_text;
X  SvdExec.a_data = PageBrk(sbrk((int)0)) - (N_DATADDR((*UsrExec)));
X  SvdExec.a_bss = UsrExec->a_bss;
X#ifdef AOUT
X  SvdExec.a_syms = UsrExec->a_syms;
X#else
X  SvdExec.a_syms = 0;
X#endif
X  SvdExec.a_entry = UsrExec->a_entry;
X  SvdExec.a_trsize = 0;
X  SvdExec.a_drsize = 0;
X#else
X  THIS IS AN IMPORTANT ERROR!!!
X#endif
X#endif
X				/* Write the new exec header to the
X				 * checkpoint file
X				 */
X
X  if (write(fd, (char *)(&SvdExec), sizeof(SvdExec)) < 0)
X    {
X      Dbug(SERIOUS,"Cannot write exec structure to disk\n",
X	   (DT)0,(DT)0,(DT)0,(DT)0);
X      (void)close(fd);
X      return(RET_FAIL);
X    }
X
X  Dbug(DEBUG,"Wrote exec structure successfully\n",(DT)0,(DT)0,(DT)0,(DT)0);
X
X				/* Find beginning of text segment */
X#ifdef sun
X  Btext = (char *)(N_TXTADDR(*UsrExec) + SHSIZE);
X#else /* sun */
X# ifdef vax
X  Btext = USRTEXT;
X# else /* vax */
X  THIS IS AN ERROR!!!!  LEAVE ME ALONE
X# endif /* vax */
X#endif /* sun */
X
X  Dbug(DEBUG,"Text begins at 0x%x\n",(DT)Btext,(DT)0,(DT)0,(DT)0);
X
X				/* Write text segment to disk */
X#ifdef sun
X  if (write(fd, (char *)(Btext), (int)(SvdExec.a_text - SHSIZE)) < 0 && errno != EFAULT)
X#else /* sun */
X# ifdef vax
X				/* For a vax, there might be a zero filled
X				 * page up front
X				 */
X    lseek(fd,N_TXTOFF(SvdExec),0);
X
X  if (write(fd, (char *)(Btext), SvdExec.a_text) < 0 && errno != EFAULT)
X# else /* vax */
X  THIS IS AN ERROR!!!!  LEAVE ME ALONE
X# endif /* vax */
X#endif /* sun */
X
X    {
X      Dbug(SERIOUS,"Cannot write text segment to disk\n",
X	   (DT)0,(DT)0,(DT)0,(DT)0);
X      (void)close(fd);
X      return(RET_FAIL);
X    }
X
X  Dbug(DEBUG,"Wrote text segment successfully\n",(DT)0,(DT)0,(DT)0,(DT)0);
X
X
X  _ChkPnt = 1;			/* When/if program is restored, this
X				 * will make it go through restoral
X				 * process
X				 */
X
X#ifdef sun
X  Dbug(DEBUG,"Data begins at 0x%x\n",
X       (DT)(N_DATADDR((*UsrExec))),(DT)0,(DT)0,(DT)0);
X#else /* sun */
X# ifdef vax
X  Dbug(DEBUG,"Data begins at 0x%x\n",(DT)(ETEXT),(DT)0,(DT)0,(DT)0);
X# else /* vax */
X  THIS IS AN ERROR!!!!  LEAVE ME ALONE
X# endif /* vax */
X#endif /* sun */
X
X				/* Write data segment to disk */
X#ifdef sun
X  if (write(fd, (char *)(N_DATADDR((*UsrExec))), (int)SvdExec.a_data) < 0 && errno != EFAULT)
X#else /* sun */
X# ifdef vax
X  if (write(fd, (char *)(ETEXT), SvdExec.a_data) < 0 && errno != EFAULT)
X# else /* vax */
X  THIS IS AN ERROR!!!!  LEAVE ME ALONE
X# endif /* vax */
X#endif /* sun */
X    {
X      Dbug(SERIOUS,"Cannot write data segment to disk\n",
X	   (DT)0,(DT)0,(DT)0,(DT)0);
X      (void)close(fd);
X      return(RET_FAIL);
X    }
X
X  _ChkPnt = 0;			/* Reset */
X
X  Dbug(DEBUG,"Wrote data segment successfully\n",(DT)0,(DT)0,(DT)0,(DT)0);
X
X  /* AFTER THIS POINT, SVD_EXEC CAN NO LONGER BE RELIED UPON */
X
X#ifdef AOUT
X  lseek(aout,0,0);		/* Go to beginning of file (there might
X				 * have been i/o before this point)
X				 */
X
X  if (read(aout, &SvdExec,(size_t)sizeof(struct exec)) != sizeof(struct exec))
X    {
X      Dbug(SERIOUS,"Cannot read exec header\n");
X      return(RET_FAIL);
X    }
X
X				/* Go to beginning of sym_table */
X  if (lseek(aout,N_SYMOFF(SvdExec),0) >= 0)
X    {
X      char *symtbl = malloc((size_t)SvdExec.a_syms);
X
X      if (read(aout, symtbl,(size_t)SvdExec.a_syms) != SvdExec.a_syms)
X	{
X	  Dbug(SERIOUS,"Cannot read symbol table\n");
X	  return(RET_FAIL);
X	}
X
X      if (write(fd, symtbl, (size_t)SvdExec.a_syms) < 0)
X	{
X	  Dbug(SERIOUS,"Cannot write Symbol Table to disk\n",
X	       (DT)0,(DT)0,(DT)0,(DT)0);
X	  (void)close(fd);
X	  return(RET_FAIL);
X	}
X
X      free(symtbl);		/* Free up space wasted by symbol table */
X    }
X  else
X    {
X      Dbug(SERIOUS,"Cannot lseek Symbol Table\n",
X	   (DT)0,(DT)0,(DT)0,(DT)0);
X      return(RET_FAIL);
X    }
X
X  Dbug(DEBUG,"Symbol Table transferred to save file\n",
X       (DT)0,(DT)0,(DT)0,(DT)0);
X
X				/* Go to beginning of string_table */
X  if (lseek(aout,N_STROFF(SvdExec),0) >= 0)
X    {
X      char *strtbl;
X      u_int strsiz;		/* Size of string table */
X
X      if (read(aout, &strsiz,(size_t)sizeof(int)) != sizeof(int))
X	{
X	  Dbug(SERIOUS,"Cannot read string table size\n");
X	  return(RET_FAIL);
X	}
X
X				/* Allocate the storage */
X      strtbl = malloc((size_t)strsiz);
X
X				/* Go back to beginning */
X      lseek(aout,N_STROFF(SvdExec),0);
X
X
X      if (read(aout, strtbl,(size_t)strsiz) != strsiz)
X	{
X	  Dbug(SERIOUS,"Cannot read string table\n");
X	  return(RET_FAIL);
X	}
X
X      if (write(fd, strtbl, (size_t)strsiz) < 0)
X	{
X	  Dbug(SERIOUS,"Cannot write String Table to disk\n",
X	       (DT)0,(DT)0,(DT)0,(DT)0);
X	  (void)close(fd);
X	  return(RET_FAIL);
X	}
X
X      free(strtbl);		/* Free up space wasted by string table */
X    }
X  else
X    {
X      Dbug(SERIOUS,"Cannot lseek String Table\n",
X	   (DT)0,(DT)0,(DT)0,(DT)0);
X      return(RET_FAIL);
X    }
X
X  Dbug(DEBUG,"String Table transferred to save file\n",
X       (DT)0,(DT)0,(DT)0,(DT)0);
X
X  (void)close(aout);		/* Close argv[0] */
X#endif /* AOUT */
X
X  (void)close(fd);		/* Close executable */
X
X  free(PSstack);		/* Free up space wasted by stack */
X
X  Dbug(WARNING,"Checkpoint successfully completed\n",(DT)0,(DT)0,(DT)0,(DT)0);
X
X#ifdef TIMEINFO
X  ftime(&second);		/* Set end time */
X
X  diffs = second.time - first.time;
X  diffm = second.millitm - first.millitm;
X
X  if (diffm < 0)
X    {
X      diffm = diffm + 1000;	/* Borrow one second */
X      diffs = diffs - 1;
X    }
X
X  (void)fprintf(stderr,"Taking %d.%03d seconds of real time\n",diffs,diffm);
X
X#endif
X
X
X  return(RET_SUCCESS);
X}
X
X
X
X/*
X** RestorePoint
X**
X** Procedure to restore the processes at the saved state.
X**
X** It does this by jumping to a signal handler with a private stack
X** this handler routine does the actual restore and then a longjmp
X** to go back to the restored point.
X*/
Xvoid
XRestorePoint()
X{
X  SigType RestSigHndlr();	/* Signal handler */
X
X  struct sigvec SigVec;		/* Signal Vector data */
X  struct sigstack SigStack;	/* Signal Stack data */
X
X  Dbug(DEBUG,"Restore running\n",(DT)0,(DT)0,(DT)0,(DT)0);
X
X				/* Setup info for SigStack */
X  SigStack.ss_sp = &SignalStack[STACKSIZE];
X  SigStack.ss_onstack = 0;
X
X				/* Setup info for SigVec */
X  SigVec.sv_handler = RestSigHndlr;
X  SigVec.sv_mask = 0;
X  SigVec.sv_onstack = 1;
X
X  if (sigvec(SIGUSR2,&SigVec,(struct sigvec *)0) < 0)
X    {
X      Dbug(FATAL,"Could not install Signal Vector Interrupt\n",
X	   (DT)0,(DT)0,(DT)0,(DT)0);
X      exit(RET_FAIL);
X    }
X
X  if (sigstack(&SigStack,(struct sigstack *)0) < 0)
X    {
X      Dbug(FATAL,"Could not install Signal Stack\n",
X	   (DT)0,(DT)0,(DT)0,(DT)0);
X      exit(RET_FAIL);
X    }
X
X  Dbug(DEBUG,"Restore ready for SIGUSR2\n",(DT)0,(DT)0,(DT)0,(DT)0);
X
X  if (kill(0,SIGUSR2) < 0)
X    {
X      Dbug(FATAL,"SIGUSR2 Kill failed\n",(DT)0,(DT)0,(DT)0,(DT)0);
X      exit(RET_FAIL);
X    }
X
X  sleep(5);			/* Lets wait a sec... */
X
X  Dbug(FATAL,"SIGUSR2 Kill didn't happen!\n",(DT)0,(DT)0,(DT)0,(DT)0);
X  exit(RET_FAIL);
X}
X
X
X/*
X** RestSigHndlr
X**
X** Routine to do the actual physical restoring of the stack and then to
X** do the longjmp
X*/
X
XSigType
XRestSigHndlr()
X
X{
X  signal(SIGUSR2, SIG_IGN);	/* Ignore the signal */
X
X  Dbug(DEBUG,"SIGUSR2 succeeded\n",(DT)0,(DT)0,(DT)0,(DT)0);
X
X  Dbug(WARNING,"Ready to test stack memory access\n",(DT)0,(DT)0,(DT)0,(DT)0);
X
X				/* If this works, then the
X				 * read should work.  This will
X				 * make the kernel set up any
X				 * stuff that is needed, which
X				 * it might not do during a system
X				 * call
X				 */
X  Dbug(DEBUG,"Stack is %d bytes long starting from 0x%x\n",
X       (DT)StackSize,(DT)(USRSTACK-StackSize),(DT)0,(DT)0);
X
X  *(int *) (USRSTACK-StackSize) = 0;
X
X  bcopy(PSstack,(char *)(USRSTACK-StackSize),(int)StackSize);
X
X  Dbug(DEBUG,"Copy of stack succeeded\n",(DT)0,(DT)0,(DT)0,(DT)0);
X
X  Dbug(WARNING,"Ready to longjmp\n",(DT)0,(DT)0,(DT)0,(DT)0);
X
X  /* NOTE!!!  You can return different values!!! */
X
X  longjmp(svdloc,1);		/* Continue Program */
X
X  sleep(5);
X
X  Dbug(FATAL,"What happened to the longjmp???\n",(DT)0,(DT)0,(DT)0,(DT)0);
X  exit(RET_FAIL);
X}
X
X
X/*
X** Debugging procedure
X*/
Xvoid
XDbug(dlevel,string,p1,p2,p3,p4)
X
Xint dlevel;
Xchar *string;
XDT p1, p2, p3, p4;
X
X{
X  if (dlevel <= DebugLevel)
X    {
X/*      if (dlevel <= SERIOUS)
X	perror();*/
X      (void)fprintf(stderr,string,p1,p2,p3,p4);
X    }
X
X  if (dlevel == FATAL)
X    {
X/*      (void)fputs("Bailing Out\n",stderr);*/
X      ;
X    }
X}
END_OF_FILE
if test 16124 -ne `wc -c <'chkpnt.c'`; then
    echo shar: \"'chkpnt.c'\" unpacked with wrong size!
fi
# end of 'chkpnt.c'
fi
if test -f 'chkpnt.h' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'chkpnt.h'\"
else
echo shar: Extracting \"'chkpnt.h'\" \(1495 characters\)
sed "s/^X//" >'chkpnt.h' <<'END_OF_FILE'
X/*
X** All rights are currently reserved.
X** (c) 1989
X*/
X
X/*
X** All programs which have AOUT defined *MUST* have 
X** main(int argc,char **argv)  (argv is needed.  It doesn't matter
X** whether you use **argv or *argv[])
X*/
X
X
X#include <stdio.h>
X
X#define SAVE_NAME "cpnt"
X#define SAVE_STACK_NAME "cpnt.stk"
X#define SAVE_MODE 0755
X#define SAVE_STACK_MODE 0644
X
X#define RET_FAIL -1		/* Failed */
X#define RET_SUCCESS 0		/* Normal exec */
X#define RET_RESTORE 1		/* Program was restored */
X
X#define DEBUG 3			/* Verbose */
X#define WARNING 2		/* Semi-important events */
X#define SERIOUS 1		/* Failure events (but not program exit) */
X#define FATAL 0			/* Panic events (program exit) */
X#define DEF_DEB_LEV 0		/* Default debugging level */
X
X#define STACKSIZE (8*1024)	/* 8 K */
X
X#ifdef sun
X				/* 0x1000 aligned */
X#define PageBrk(address) ((((int)(address) >> 12) + 1) << 12)
X#else /* sun */
X# ifdef vax
X				/* 0x400 aligned */
X#define PageBrk(address) ((((int)(address) >> 10) + 1) << 10)
X# else /* vax */
X  THIS IS AN ERROR!!!!  LEAVE ME ALONE
X# endif /* vax */
X#endif /* sun */
X
X
X#ifdef AOUT
Xextern char *_ProgName;		/* Pointer to argv[0] */
X#define CheckRestore() if (!_ProgName) _ProgName = *argv; if (_ChkPnt != 0) RestorePoint()
X#else
X#define CheckRestore() if (_ChkPnt != 0) RestorePoint()
X#endif
X
Xextern int _ChkPnt;		/* Global checkpoint variable */
Xextern int DebugLevel;		/* Debugging level */
X
Xint Checkpoint();		/* Checkpoint routine */
Xvoid RestorePoint();		/* Restore at point of check */
END_OF_FILE
if test 1495 -ne `wc -c <'chkpnt.h'`; then
    echo shar: \"'chkpnt.h'\" unpacked with wrong size!
fi
# end of 'chkpnt.h'
fi
if test -f 'main.c' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'main.c'\"
else
echo shar: Extracting \"'main.c'\" \(1133 characters\)
sed "s/^X//" >'main.c' <<'END_OF_FILE'
X/*
X** All rights are currently reserved.
X** (c) 1989
X*/
X
X#include <stdio.h>
X#include "chkpnt.h"
X
Xmain(argc,argv)
Xint argc;
Xchar *argv[];
X{
X  int sum = 0;			/* Sum information to be saved */
X  register x;			/* Counter */
X  int testit = 0xfeedface;
X
X  (void)printf("Hello.  This is program %s with %d arguments\n",argv[0],argc);
X
X  if (argc == 2)		/* Set debugging level */
X    DebugLevel = atoi(argv[1]);
X
X  CheckRestore();
X
X  for (x=0;x<10;x++)
X    {
X      register y;
X
X      if ((y = Checkpoint()) < 0)
X	{
X	  (void)fprintf(stderr,"Error in checkpoint\n");
X	  exit(-1);
X	}
X      if (y>0)
X	{
X	  (void)printf("This is program %s with %d arguments\n",argv[0],argc);
X	  if (testit != 0xfeedface)
X	    {
X	      fputs("testit failed on Restore\n",stderr);
X	      exit(-1);
X	    }
X	  else
X	    (void)printf("Restored at %d loop %d\n",sum,x);
X	}
X
X				/* Yes, I know sum overflows by some
X				 * gigantic amount, but who cares?
X				 */
X      for(y=0;y<1000000;y++)
X	sum += y;
X    }
X
X  (void)printf("And the sum is %d\n",sum);
X
X  if (sum != 653067456)
X    (void)printf("Which is incorrect\n");
X  else
X    (void)printf("Which is correct\n");
X}
END_OF_FILE
if test 1133 -ne `wc -c <'main.c'`; then
    echo shar: \"'main.c'\" unpacked with wrong size!
fi
# end of 'main.c'
fi
echo shar: End of shell archive.
exit 0


