Is HDB locking safe?

Jim Rosenberg jr at oglvee.UUCP
Tue Aug 7 01:18:53 AEST 1990


I have a program which I need to be mutually exclusive with uuxqt.  I've
employed what I thought was pretty straightforward HDB locking using a lock
file, but wasn't sure how to handle one particular problem, and now I don't see
how HDB can handle it correctly either.  HDB assumes that if the pid recorded
in the lock file no longer corresponds to an active process, the lock file is
defunct and can safely be removed.  I can't for the life of me figure out a
safe way of doing this.  You can tell if there's an active process for the pid
by giving it a kill() with a signal number of 0.  Now suppose you get back
ESRCH for errno and conclude that the process holding the lock is no longer
active.  What do you do?
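
A minimal sketch of that liveness test, for illustration only.  The helper
name pid_is_alive is hypothetical and is not taken from HDB or pcomm; it
assumes a POSIX-style kill() where signal 0 performs only the existence and
permission checks.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not from HDB or pcomm).
 * Returns 1 if a process with this pid appears to exist, 0 otherwise.
 * Signal 0 delivers nothing; kill() only checks existence and permission.
 */
static int pid_is_alive(pid_t pid)
{
    if (pid <= 0)
        return 0;           /* not a single-process id; treat as stale */
    if (kill(pid, 0) == 0)
        return 1;           /* process exists and we may signal it */
    return errno != ESRCH;  /* e.g. EPERM: it exists but is not ours */
}
```

Note that a 0 from this test only describes the moment of the call; it says
nothing about what the lock file's name points to a moment later.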

"Elementary, my dear Watson, you remove the lock file!"  *** NOT SO FAST ***,
Holmes.  To unlink the lock file, the only thing you can supply to a system
call is the *name* of the file.  There is no way (so far as I know) to unlink
by i-number.  There's a narrow window in which another process may be doing
exactly the same thing: it may already have removed the stale lock and created
a fresh LCK.X of its own.  You have no guarantee that the LCK.X file you just
unlinked is in fact the same inode as the one from which you read the pid that
you concluded is no longer active.

Here's an example.  This code is taken from pcomm 1.1, which is hideously out
of date, but I had it lying around; it's a good example of some code written
by somebody who took some care and *thought* he was doing the right thing:

static int
checklock(lockfile)
char *lockfile;
{
	...
	if ((lfd = open(lockfile, 0)) < 0)
		return(0);
	...
	if ((kill(lckpid, 0) == -1) && (errno == ESRCH)) {
		/*
		 * If the kill was unsuccessful due to an ESRCH error,
		 * that means the process is no longer active and the
		 * lock file can be safely removed.
		 */
		unlink(lockfile);
		sleep(1);
		return(1);
	}

In this code there is no guarantee that lfd and lockfile correspond to the
same file at the time of the unlink.

I've racked my brains trying to think of a safe way to do this, and can't
think of one.  How does HDB do it??  Is HDB lock file handling *in fact
vulnerable* to this narrow window problem?  One thing I thought of was to link
the lockfile to a temp file, stat the temp file before the unlink, stat it
again afterwards; if the link count fails to go down you know you made a beeg
booboo and nuked an active lock file.  But then what?  You can't put the lock
file back -- you don't have a link to it.
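
The link-count idea can be sketched roughly as follows.  The function name
unlink_and_check and the caller-supplied temp file name are hypothetical.
The sketch assumes both names live on the same filesystem and that nobody
else is adding or removing links to the same inode.  As noted above, it can
only detect the mistake after the fact, not undo it.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical sketch of the link-count check.
 * Returns  1 if the link count dropped (we unlinked the inode we linked),
 *          0 if it did not drop (we removed somebody else's lock file),
 *         -1 if link(), stat() or unlink() failed.
 */
static int unlink_and_check(const char *lockfile, const char *tmpname)
{
    struct stat before, after;

    if (link(lockfile, tmpname) < 0)
        return -1;
    if (stat(tmpname, &before) < 0 || unlink(lockfile) < 0
        || stat(tmpname, &after) < 0) {
        unlink(tmpname);    /* clean up the extra link on any failure */
        return -1;
    }
    unlink(tmpname);
    return after.st_nlink < before.st_nlink;
}
```

A return of 0 is exactly the case with no remedy: the file that was removed
was a newer lock file, and the temp link points at the old one.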

Help!
-- 
Jim Rosenberg             #include <disclaimer.h>      --cgh!amanue!oglvee!jr
Oglevee Computer Systems                                        /      /
151 Oglevee Lane, Connellsville, PA 15425                    pitt!  ditka!
INTERNET:  cgh!amanue!oglvee!jr at dsi.com                      /      /



More information about the Comp.unix.wizards mailing list