Initializing arrays of char

Flint Pellett flint at gistdev.gist.com
Tue Oct 9 01:11:16 AEST 1990


chris at mimsy.umd.edu (Chris Torek) writes:

>In article <15674 at csli.Stanford.EDU> poser at csli.Stanford.EDU
>(Bill Poser) writes:
>>Regarding the assignment of "12345" to char x[5] ... [K&R 2 says]
>>	...the number of characters in the string, NOT COUNTING
>>	THE TERMINATING NULL CHARACTER, must not exceed the
>>	size of the array. [emphasis mine]
>>Can anyone explain [why the ending '\0' is not counted]?

>This is a change in New (ANSI) C.  In Classic (K&R-1) C, a
>double-quoted string in an initializer context, when setting the
>initial value of a character array, was treated uniformly as if it were
>a bracketed initializer consisting of all the characters, including
>the terminating NUL, in the string.  That is,

>	char x[5] = "12345";

>meant exactly the same thing as

>	char x[5] = { '1', '2', '3', '4', '5', '\0' };

>(and was therefore in error, having too many characters).

On AT&T 3B2 machines about 2-3 years ago, it did not produce a compile
error: I know, I lived through it.  See story below.

>The X3J11 committee decided that this was overly restrictive, and
>relaxed the rule to `is equivalent to a bracketed initializer
>consisting of all the characters, including the terminating NUL if it
>fits'.  Thus

IMHO the committee blew it: their decision lets a programmer who will
only ever use the array without a terminating NUL (as with strncpy)
save 1 lousy byte, and it opens the door for a ton of mistakes to get
through.  I imagine their main motivation was compatibility, but I still
think it is a mistake: if I write it as a double-quoted string, _I_ mean
that I want it NUL terminated.

Here is a real life example of the impact of this decision: for
about a week we had a 3B2 machine which kept crashing about once an
hour because of this!  We finally traced the problem through this chain,
at a cost of 20 minutes per reboot and anywhere from 10 minutes to several
hours chasing the problem at each step.
1. It always crashed because it ran out of swap space.
2. It was incorrectly set up so that one user could use up all the swap.
3. One particular program was always running when it crashed.
4. Performance hit bottom when that program was run, and you couldn't
   abort the program without killing it from another terminal.
5. Only certain functions within the program caused the crash.
6. We were able to keep the system from crashing by retuning, but we
   still had performance problems, and this program wasn't working: it
   appeared to be in an infinite loop.
7. The critical routine that killed us (reduced to the part that mattered)
   eventually was this:

char foo[5] = "abcde";	/* NOTE: no room for terminating '\0' char */
char bar[]  = "fghi";	/* NOTE: declared immediately behind the foo array */
sprintf(bar,"%s",foo);	/* copy foo into bar: other tweaking omitted */

The problem was introduced by a maintenance change that corrected the
string in foo, making it 1 character longer, but forgot to update the
declared length of 5.  That, coupled with the fact that array bar
followed immediately after array foo, which was no longer NUL
terminated, turned the sprintf into an infinite loop chasing its own
tail: it read past the end of foo into bar, which by then held the
characters it had just written, so it never found a '\0'.

If the language is going to allow this, compilers __at least__ ought to
generate a warning message, because 99 times out of 100 it's going to be
a bug, not an intended use, and it is VERY hard to spot an error of this
nature when looking at the code: it "looks" right.
-- 
Flint Pellett, Global Information Systems Technology, Inc.
1800 Woodfield Drive, Savoy, IL  61874     (217) 352-1165
uunet!gistdev!flint or flint at gistdev.gist.com


