Cryptic C code?

haahr at siemens.UUCP
Tue Aug 13 05:21:00 AEST 1985


Relevant code: (Kernighan & Ritchie, chapter 5, page 105)

     strcpy(s, t)  /* copy t to s; pointer version 3 */
     char *s, *t;
     {
 	 while (*s++ = *t++)
 	    ;
     }
  

Bob Crane (tektools!bobc) writes:
>    ... [text from K&R, everyone owns it, no point quoting again] ...
> 
> Yeaacch!!!!!!  It was still very cryptic to me the tenth time that I read
> it!!!  A friend explained it to me by saying that the character in the
> 'while' expression is converted to an int and that the NULL character has
> an ascii value of 0 so the test will exit when the NULL character is
> encountered.
> 
> I have trouble believing that the above has advantages of great
> speed OR readability over:
> 
>    strcpy(s,t)  /* copy t to s; pointer version 2 */
>    char *s, *t;
>    {
>       while ((*s++ = *t++) != '\0')
> 	 ;
>    }
> 
> Does anyone out there support the author by saying that Version 3 of
> 'strcpy' is better than Version 2?

I do.  Why?  Read on.

Doug Gwyn (brl-tgr!gwyn) responds:
> I think case 2 is certainly more readable, but as the book says, you
> need to learn to read things like case 3 since a lot of code is like
> that.  More usually one will see something like
> 	char *s;
> 	...
> 	while ( *s++ )
> 		...
> This really is a standard C idiom, although I don't recommend writing
> code that way.  I personally prefer to distinguish between Boolean
> expressions (such as comparisons) and arithmetic expressions, using
> strictly Boolean expressions as conditions.  Thus:
> 	while ( *s++ != '\0' )
> 
> Tests for NULL pointers and flags often are written
> 	if ( p )
> 		...
> 	if ( flag & BIT )
> 		...
> rather than
> 	if ( p != NULL )
> 		...
> 	if ( (flag & BIT) != 0 )
> 		...
> (I prefer the latter.)  Get used to it..

The two pieces are different in terms of the abstraction presented.
	while (*s++ != ANYTHING) ...
This code looks for some character in a string.  In C, the character '\0'
is the character after the last character in a string, so when you find
that character, you have reached the end of the string.  It is an idiom that
we have all gotten used to, knowing to look for '\0'.  On the other hand:
	while (*s++) ...
loops through a string until the first 'false' character.  Now what does
falseness for a character mean?  A logical (and, in the case of C, correct)
interpretation is that we have reached the end of a string.
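To make the comparison concrete, here is a minimal sketch of the two loop styles side by side. The function names len_implicit and len_explicit are made up for illustration (they are not from K&R), and the code uses modern prototypes rather than the old-style declarations quoted above. Both functions count the characters before the terminating '\0'.

```c
#include <stddef.h>

/* Hypothetical helper: counts characters using the implicit test.
 * The loop stops at the first "false" character, i.e. '\0'. */
size_t len_implicit(const char *s)
{
    size_t n = 0;
    while (*s++)
        n++;
    return n;
}

/* Hypothetical helper: the same loop with the explicit comparison. */
size_t len_explicit(const char *s)
{
    size_t n = 0;
    while (*s++ != '\0')
        n++;
    return n;
}
```

The two compile to the same test; the only difference is what the source asks the reader to think about.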

With the case of (p != NULL) I can understand Doug's argument a little bit
better, because NULL is a better abstraction for pointer to nothing (i.e.
end of a list) than '\0' is for end of a string.  But code like
	while (p->next) ...
says "while there is a pointer after p on the list" very clearly.
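As a sketch of the list case, assuming a singly linked list whose last node has a null next pointer (struct node and last_node are names invented here for illustration):

```c
#include <stddef.h>

/* Hypothetical list node, assumed for this example. */
struct node {
    int value;
    struct node *next;   /* NULL marks the end of the list */
};

/* Return the last node of a non-empty list. */
struct node *last_node(struct node *p)
{
    while (p->next)      /* "while there is a node after p" */
        p = p->next;
    return p;
}
```

Writing the condition as (p->next != NULL) would behave identically; the short form simply reads closer to the sentence in the comment.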

The (flag & BIT) comparison is also easier for me to understand than
the explicit test because it allows me to forget about the low-level
bit-twiddling that is going on, and worry about the actual test.
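A small sketch of the flag test in both styles. The flag names FLAG_READ and FLAG_WRITE and the two functions are assumptions made up for this example, not taken from any real header.

```c
/* Hypothetical flag bits, assumed for this example. */
#define FLAG_READ  0x01u
#define FLAG_WRITE 0x02u

/* Implicit form: reads as "if the write flag is set". */
int can_write(unsigned flags)
{
    if (flags & FLAG_WRITE)
        return 1;
    return 0;
}

/* Explicit form: same result, with the comparison against 0 spelled out. */
int can_write_explicit(unsigned flags)
{
    if ((flags & FLAG_WRITE) != 0)
        return 1;
    return 0;
}
```

Note that the explicit form needs the extra parentheses, since != binds tighter than & in C.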

Now, the hard case is the one Bob brought up.
	while (*s++ = *t++) ...
looks very much like the
	while (*s++ == *t++) ...
one would expect from strcmp or similar functions.  I think a comment
or something in this case is much more help than the explicitly redundant
comparison against zero.  This is a matter of personal preference.  The
reason I wouldn't put the "!= '\0'" in this code is that it doesn't tell
you anything, unless you are used to a convention that says something like
"thou shalt always compare everything but explicit tests to 0."  But
putting in a '\0' test won't even make lint complain on the one where it
doesn't belong.  Still, because = and == are so easily confused, tests like
this one deserve special care.
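One way to take that care, sketched below with modern prototypes: keep the bare assignment but say so in a comment. The names copy_string and common_prefix are invented for this example. The doubled parentheses around the assignment are a convention that current GCC and Clang accept as a sign the = is intentional; that is an observation about those compilers, not something from K&R.

```c
#include <stddef.h>

/* Copy t to s, including the terminating '\0'; returns s. */
char *copy_string(char *s, const char *t)
{
    char *start = s;
    while ((*s++ = *t++))   /* assignment, not ==; stops after copying '\0' */
        ;
    return start;
}

/* The look-alike with ==: length of the common prefix of s and t. */
size_t common_prefix(const char *s, const char *t)
{
    size_t n = 0;
    while (*s && *s++ == *t++)   /* comparison; stop at end of s or mismatch */
        n++;
    return n;
}
```

Seen side by side, the two loops differ by one character in the condition, which is why the comment earns its place.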

The idea of abstracting a test beyond explicitly testing for zero is nice
and C is not the first language to do it.  Bjarne Stroustrup recognized this
and included in C++, as part of the general overloading capability, the
ability to overload comparisons, while retaining the convention that an if
is an implicit comparison against zero.  Any class can be the object of
an if or while and the appropriate comparison operator is called.  A
conditional of the form
	while (cin) ...		// cin is the stream associated with stdin
will fail when there is no more input on cin.  Exactly what one would
expect.  While
	while (cin.state != eof && cin.state != fail) ...
(or whatever it is exactly -- I forget) tells you explicitly what it is
doing, it tells you more than you normally need to know.

Because the values that fail tests in C (null pointer, character beyond end
of string) are logical and consistent, they provide a nice abstraction beyond
worrying about what should be implementation details (i.e. '\0' is the
end of a string, NULL is the pointer to nothing).

					Paul Haahr
					..!princeton!macbeth!haahr


