Synchronous I/O (non-deferred writes) available

was-John McMillan jcm at mtunb.ATT.COM
Fri Feb 24 04:45:08 AEST 1989


Earlier flames regarding UNIX(rg) I/O and databases suggested that
UNIX Block-I/O deferred-writes leave databases in unreliable states.
This was because of the unflushed cache contents at the time of
crashes.  (At least so far as I could identify amidst the flames.)

An E-mail item from John R. MacMillan [!] raised an interesting
point:  in <sys/file.h>, on the 3B1, there's a flag FSYNC.

Examination of the sources, and testing, indicate:
	Non-deferred (block until written) I/O is available.

Add the following line to 
	/usr/include/fcntl.h:

#define	O_SYNC   020	/* synchronous write option */ /* JCM */

Either open(2) or fcntl(2) can be used to set this file attribute.
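
A minimal sketch of both routes follows. The helper names open_sync()
and set_sync(), the access flags, and the 0644 mode are illustrative
assumptions, not anything from the 3B1 sources. The #ifndef guard
repeats the define given above for systems whose fcntl.h lacks it.
Note that some later systems (Linux, for one) accept O_SYNC in
F_SETFL but silently ignore it, so the open(2) route is the safer bet
where portability matters.

```c
#include <assert.h>
#include <stddef.h>
#include <fcntl.h>
#include <unistd.h>

#ifndef O_SYNC
#define O_SYNC 020   /* value from this post; only needed where fcntl.h lacks it */
#endif

/* Open (creating if necessary) a file for writing so that each write()
 * blocks until the data has been written through the cache.
 * Returns the descriptor, or -1 on failure. */
int open_sync(const char *path)
{
    assert(path != NULL);
    return open(path, O_WRONLY | O_CREAT | O_SYNC, 0644);
}

/* Request synchronous writes on an already-open descriptor.
 * Returns 0 if fcntl() accepted the request, -1 otherwise.
 * Acceptance does not prove the system honours the flag. */
int set_sync(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_SYNC) == -1 ? -1 : 0;
}
```

Typical use would be fd = open_sync("journal") followed by ordinary
write() calls; nothing else in the program needs to change.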

This has been there for some time -- probably as long as there has
been an FSYNC entry in 'file.h'.  (As a kernel-repair person, there's
always the problem of living too close to the code to notice the
features!)

It is supported in SVR3, also. (In SVR3, however, there is NO need
to add the 'define'.)

Finally: remember, this is an abusable resource.  It consumes
disk-throughput by performing a write-through of the cache for
each write into the cache.

    E.g.:
    	for (i=0; i<256; i++) write(fid, bfr , 4);
    Using O_SYNC, the above causes a minimum of 256 disk-accesses.
    Using the standard deferred I/O, the above requires 1 disk-access.
    
    Another example:
    	for (l=0;l<1000;l++) {lseek(fid,0,0); write(fid, b, 4096);}
    Using O_SYNC:  Real=118.6s User=.05s Sys=8.7s (6386/135MB)
                   Real=100.8s User=.03s Sys=1.0s (3B1/67MB)
    Otherwise:     Real=  2.5s User=.04s Sys=2.4s (6386/135MB)
                   Real=  1.0s User=.01s Sys= .2s (3B1/67MB)
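
One way to soften the cost in the first example is to collect the
small records in user space and issue a single write().  The sketch
below is only an illustration: write_batched(), REC_SIZE and REC_COUNT
are made-up names, and how many disk accesses the one 1024-byte write
really costs under O_SYNC depends on the file system's block size.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

#define REC_SIZE  4      /* bytes per record, as in the first example */
#define REC_COUNT 256    /* most records one batch can hold */

/* Copy `count` copies of the REC_SIZE-byte record at `rec` into one
 * staging buffer and pass them to the kernel with a single write().
 * Returns the number of bytes written, or -1 on error. */
long write_batched(int fd, const char *rec, int count)
{
    char staging[REC_SIZE * REC_COUNT];
    int i;

    assert(count >= 0 && count <= REC_COUNT);
    for (i = 0; i < count; i++)
        memcpy(staging + i * REC_SIZE, rec, REC_SIZE);
    return (long)write(fd, staging, (size_t)count * REC_SIZE);
}
```

The price is that none of the batched records is on disk until that
single write() returns, so this only suits data that can be lost as a
group.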

It's possible the database issues were flared in another newsgroup.
But it's the 3B1 users who may need to add the define -- and I haven't
the foggiest recollection of where it came up ... sigh.

Back under the bridge...

john mcmillan	-- att!mtunb!jcm	-- muttering for himself, ONLY


