Why nested comments not allowed?

Derek R. Foster dfoster at jarthur.Claremont.EDU
Tue Feb 27 14:30:00 AEST 1990


In article <16023 at haddock.ima.isc.com> karl at haddock.ima.isc.com (Karl Heuer) writes:
>In article <4601 at jarthur.Claremont.EDU> dfoster at jarthur.Claremont.EDU (Derek R. Foster) writes:
>>I wasn't saying that these characters should never be used as string data.
>>I said that they should not be placed LITERALLY in a string, since they may
>>be mistaken (by the parser) for comments.  [So, when it is desirable to
>>put them in a string constant, they should be encoded] in some way that
>>breaks up the /* and */ pairs ... my favorite so far is this:
>>  #define CS "/""* "
>>  #define CE " *""/"
>>  printf("before comment"CS"comment"CE"after comment");
>
>If we're talking about C, then of course this is not necessary since the
>scanner already knows about strings.  Therefore, I assume we're talking about
>a hypothetical language which is a lot like C, but which has nestable comments
>and therefore must worry about the interaction between comments and strings.

NO!!!!!!!
Just because the scanner knows about strings doesn't mean this can't still
cause problems! With or without nested comments! In perfectly ordinary C!

Try this:
printf("some C code"); /* that code could be printf("/* with comments */"); */

Question: What is the result of trying to compile the above?

Answer (assuming no nested comments, although it is easy to construct a
different example that fails with nested comments):

printf("some C code"); "); */
                       ^
UNTERMINATED STRING CONSTANT

The scanner only knows about strings THAT ARE NOT STARTED INSIDE
COMMENTS. It can't recognize strings inside comments, because it can't
assume that comments contain valid, parsable C code. As someone pointed
out, if it did do this, it would report an error on something like

  /* These are quotation marks --> " */

in the input file.

The scanner has to interpret strings and comments based on whichever it
thinks it is in the middle of. For instance:

1) If it is parsing neither a string nor a comment, and it finds a ", then
   it is now parsing a string. Otherwise, if it finds a /* then it is now
   parsing a comment. Otherwise, it continues parsing neither.
2) If it is parsing a string, it will continue to do so until it finds an
   unescaped " (a \" does not end the string).
   (note: /* and */ are treated as STRING DATA under this condition)
3a) If it is parsing a comment and the compiler does not allow nested comments,
    it will continue to do so until it finds a */ . (note: " is treated like
    all other characters and simply IGNORED under this condition.)
3b) If it is parsing a comment and the compiler allows nested comments, it
    will continue to do so until it has encountered the same number of */
    as it has /* . (note: " is treated like all other characters and simply
    IGNORED under this condition.)
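
The rules above can be sketched as a small state machine. This is only an
illustration, not how any particular compiler is written: the names
scan_state and scan are made up, character constants and trigraphs are not
handled, and a backslash inside a string simply protects the next character.
Passing nested = 0 follows rule 3a; a nonzero value follows rule 3b.

```c
#include <assert.h>
#include <stddef.h>

/* Scanner states, one per case in the list above. */
enum scan_state { IN_CODE, IN_STRING, IN_COMMENT };

/*
 * Walk src and return the state the scanner is left in at the end.
 * nested == 0: the first closer ends the comment (rule 3a).
 * nested != 0: openers inside a comment raise a depth counter (rule 3b).
 */
enum scan_state scan(const char *src, int nested)
{
    enum scan_state state = IN_CODE;
    int depth = 0;
    size_t i = 0;

    assert(src != NULL);
    while (src[i] != '\0') {
        char c = src[i];
        char next = src[i + 1];   /* safe: src[i] is not the terminator */

        switch (state) {
        case IN_CODE:
            if (c == '"') {
                state = IN_STRING;
            } else if (c == '/' && next == '*') {
                state = IN_COMMENT;
                depth = 1;
                i++;              /* consume the star as well */
            }
            break;
        case IN_STRING:
            if (c == '\\' && next != '\0') {
                i++;              /* escaped character is string data */
            } else if (c == '"') {
                state = IN_CODE;
            }
            break;
        case IN_COMMENT:
            if (c == '*' && next == '/') {
                i++;
                depth--;
                if (depth == 0) {
                    state = IN_CODE;
                }
            } else if (nested && c == '/' && next == '*') {
                depth++;
                i++;
            }
            /* a quotation mark here is just another ignored character */
            break;
        }
        i++;
    }
    return state;
}
```

Fed the printf line from earlier in this post with nested = 0, scan ends in
IN_STRING, which is the unterminated string constant. With nested = 1 that
particular line ends in IN_CODE, but /* printf("/*"); */ is left stuck in
IN_COMMENT instead.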

>Surely if such a language existed, it would also have the escape sequences
>`\*' (literal star) and `\/' (literal slash), so you could simply write
>  printf("before comment/\* comment *\/after comment");
>(note that `\?' was added to ANSI C for essentially this reason).

I think this is a VERY good idea. I wasn't talking about a hypothetical
language, however, and unfortunately I haven't been able to find any
documentation for a feature like this in C, which seems like a rather
glaring omission, in my opinion. I definitely prefer this method to the
one I showed above; I just can't find any documentation that states
definitively that \* or \/ are allowable escape sequences.
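
One thing ANSI C does document is the octal escape, so the star can be
written as \052 without relying on \* or \/. A minimal sketch, assuming an
ASCII execution character set (where octal 052 is '*'); escaped_message and
pair_at are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * The message from the quoted example, with each star written as the
 * octal escape \052 so the source never spells a comment delimiter.
 * Assumes ASCII.
 */
const char *escaped_message(void)
{
    return "before comment/\052 comment \052/after comment";
}

/* Returns nonzero if the characters at s[i] and s[i + 1] are a and b. */
int pair_at(const char *s, size_t i, char a, char b)
{
    assert(s != NULL);
    return s[i] == a && s[i + 1] == b;
}
```

The source text never contains either delimiter, yet the string seen at run
time is the same as in the quoted example.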

>Karl W. Z. Heuer (karl at ima.ima.isc.com or harvard!ima!karl), The Walking Lint

Please! If people are going to post on this topic, realize the following
facts! I have stated them multiple times!

1) Whether or not C parses comments in strings DOES NOT BEAR ON MY ARGUMENT.
2) Whether or not C comments can be nested DOES NOT BEAR ON MY ARGUMENT.
3) In fact, part of my argument is that if you've written your code
   well, nesting/not nesting comments WILL NOT MATTER TO THE PARSER. 
   If people wish to argue about the aesthetics of nested comments vs.
   #ifdef/#endif, that's fine, but arguments like

   "But with nested (or with non-nested) comments, how can I write
      printf("/* some horrible abomination with lots of /* and */ */");
   ?"

   are NOT MEANINGFUL, since the same problems exist with/without nested
   comments, just in different situations! And it's the code-writer's fault!
4) If you write code that depends on whether comments are nested or not,
   it's not the compiler's fault, it's YOURS, and you deserve what you get!
   Encode /* and */ before putting them in strings!
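
As a sketch of point 4, here are the CS and CE macros from earlier in this
post, relying on adjacent string literals being concatenated as ANSI C
specifies. The function names message and count_delims are made up for the
example, and spaces have been added around the macro names purely for
readability.

```c
#include <assert.h>
#include <stddef.h>

#define CS "/""* "
#define CE " *""/"

/* Returns the text built from the encoded delimiters. */
const char *message(void)
{
    /* The same call, commented out. It holds no delimiter pair, so this
       comment ends where intended with or without nesting:
       printf("before comment" CS "comment" CE "after comment");
    */
    return "before comment" CS "comment" CE "after comment";
}

/* Counts comment delimiter pairs in the run-time text of s. */
int count_delims(const char *s)
{
    int n = 0;

    assert(s != NULL);
    for (; s[0] != '\0' && s[1] != '\0'; s++) {
        if ((s[0] == '/' && s[1] == '*') || (s[0] == '*' && s[1] == '/')) {
            n++;
        }
    }
    return n;
}
```

count_delims(message()) finds both delimiters in the string at run time,
even though the source above can be commented out as a whole without the
scanner ever seeing one inside a string.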

If you aren't sure you understand the above, please reread this
posting before posting one of your own on this topic. If you still don't
understand, please e-mail me.

P. S. Karl - please understand that this is not directed at you personally.

Derek Riippa Foster


