v13i047: Boolean expression array evaluator

Rich Salz rsalz at bbn.com
Mon Feb 22 23:32:45 AEST 1988


In <417 at fig.bbn.com>, I published some software that "confused" EOF and
char's.  I noted the problem but missed one spot.  Bill Stewart sent
the author and me a nice explanation of *WHY* this is a problem.

With his permission, I'm posting his message to the net.
	/r$

# Bill Stewart, AT&T Bell Labs 2G218, Holmdel NJ 1-201-949-0705 ihnp4!ho95c!wcs

Jim Frost posted his bool-eval program to comp.sources.unix, with a
comment that lint thought it was non-portable because it compared a
char to EOF, and that he couldn't fix it because EOF was defined in <stdio.h>.

Rich $alz, the moderator, added a comment to do:
[  Change the calls to the cget() macro to use int, rather than char.
   See build() in build.c  --r$  ]
This isn't enough - there's also a comparison with EOF in a switch
statement in eval_file(), and the code is complex enough that a fix
isn't immediately obvious.

Jim - the reason it's non-portable is that EOF is *not a character*,
and redefining it will not fix the problem, it will make it worse!
On Vaxes and 680*0-based machines, characters are signed; when you
compare them to integers, they have values between -128 and +127.  On
other machines (including the AT&T 3B series), characters are unsigned,
with values between 0 and +255.  EOF is a negative int (-1 in practice), and well it should be.
The getc() routines, including fgetc() and getchar(), return an INTEGER,
which either has the integer value corresponding to the character it
got, or EOF if appropriate.  If you assign this value to a character on
an unsigned-character machine, EOF becomes 255.  When you compare this
to EOF (e.g. ( (c=getchar()) != EOF ) ), C promotes the character 255
to an int 255, which is of course different from -1.  Typical result is
an endless loop.

Why would you want a machine that did such silly things?  Either
because your company makes one, or because you sometimes use data that
really uses the character 255.  I've been doing a lot of graphics
recently, and 255 (binary 11111111) is all-black (for bitmaps).
If I'm reading in data, and get to a black portion, I don't want my
program to falsely think it hit EOF - it probably just got to the
interesting stuff.  (I was on a non-AT&T machine, and was surprised to
find myself wishing I was on a machine with decent unsigned characters!)

How do you avoid this problem?  Make sure, when you read a value from
getc(), that you compare it for EOF *before* you store it in a
character - use int's.

============ in bool.h ================
#define cget(F,C) if (!feof(F)) {\
                    C= fgetc(F);\
                    if ((!no_print) && (C != EOF))\
                      printf("%c",C);\
                  } else
-- 
For comp.sources.unix stuff, mail to sources at uunet.uu.net.


