Standard C Digest - V2 #6

Orlando Sotomayor-Diaz osd7 at homxa.UUCP
Wed Jan 9 13:31:42 AEST 1985


ANSI Draft of Proposed C Language Std.

Mail your replies to the author(s) below or to cbosgd!std-c.
Cbosgd is reachable via most of the USENET nodes, including ihnp4,
ucbvax, decvax, hou3c....  Administrivia should be mailed to 
cbosgd!std-c-request.

ARPA -> mail to cbosgd!std-c at BERKELEY.ARPA (+++ NOT TO INFO-C +++)

**************** mod.std.c  Vol. 2 No. 6  1/8/85 ********************

Today's Topics:
		reply to Arnold's comments on the X3J11/84-161 draft
----------------------------------------------------------------------

Date: 8 Jan 85 06:20:06 CST (Tue)
From: cbosgd!ihnp4!utzoo!henry
Subject: reply to Ken Arnold's comments on the X3J11/84-161 draft
References: <596 at homxa.UUCP>

Toward the end of his contribution, Ken makes a key observation that will
be relevant when discussing his specific points (emphasis added):

> ...  The standard should not break existing
> code, *except where such code takes advantage of bugs, non-standard
> extensions, or implementation or machine dependent code (such as asm
> statements)*.

I think we are all in agreement on that.  Not breaking correct code was
a major objective of the committee; it is my contention that they have
in fact achieved it, and Ken's points are adequately rebutted by his own
observation.  In detail...

> Not scanning strings in the program for macro names is proper and
> current.  However, not scanning strings in the token sequence in the
> macro definition will break several current programs, including the
> 4.2bsd operating system (c.f. "CTRL" in <sys/ttychars.h>).  ...

To quote the K&R C reference manual (henceforth "CRM"), section 12.1
(emphasis added):

	A [#define] causes the preprocessor to replace subsequent
	instances of the identifier with the given string of tokens...
	Each occurrence of a [macro parameter] is replaced by the
	corresponding token string from the call...  *Text inside a
	string or a character constant is not subject to replacement*.

In other words, replacement inside strings -- be it for macros or
macro parameters -- is a non-standard extension.  It's a "feature" of
the Reiser C preprocessor, which is omnipresent in Unix C compilers
but not in others.  The closest thing we have to an implementation-
independent standard for C is the CRM, which explicitly outlaws replacement
inside strings.
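
To make the difference concrete, here is a minimal sketch.  The Reiser-style
definition in the comment is recalled from the header Ken cites and should be
read as approximate; CTRL_PORTABLE and ctrl_of are names made up for this
illustration, and the arithmetic assumes an ASCII character set.

```c
/*
 * Reiser-style definition (approximate, shown for contrast only):
 *
 *     #define CTRL(c) ('c' & 037)
 *
 * It relies on the parameter c being replaced inside the character
 * constant.  Under the CRM rule quoted above, 'c' is left alone, so
 * CTRL(a) would yield ('c' & 037) rather than ('a' & 037).
 */

/* Portable form: the caller supplies the character constant itself. */
#define CTRL_PORTABLE(c) ((c) & 037)

/* Hypothetical helper so the macro can be called like a function. */
int ctrl_of(int c)
{
    return CTRL_PORTABLE(c);
}
```

With this form a call is written CTRL_PORTABLE('a') rather than CTRL(a), and
nothing depends on what the preprocessor does inside quotes.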

I agree that this will break a number of things, including 4.2BSD.  How
sad.  Those programs, including 4.2BSD, were implementation-dependent
to begin with, and the authors have no right to cry about it.  It should
be clear from this that I disagree with the committee's expressed intent
to add such a capability later.  The current draft standard's neat new
string-concatenation convention (adjacent string literals -- note this
is literals only -- are concatenated at compile time) eliminates the
need for in-string replacement as a way to build filenames out of #defined
pieces, which to my mind was the only real need for in-string replacement.
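
A short sketch of that filename-building use, assuming a compiler that
implements the draft's adjacent-literal concatenation.  The directory names
and the helper is_active_file are hypothetical, chosen only for illustration.

```c
#include <string.h>

/* Hypothetical installation paths built from #defined pieces. */
#define LIBDIR      "/usr/lib"
#define NEWSDIR     LIBDIR "/news"      /* "/usr/lib" "/news"      */
#define ACTIVE_FILE NEWSDIR "/active"   /* three adjacent literals */

/* Returns nonzero if path names the (hypothetical) active file. */
int is_active_file(const char *path)
{
    return strcmp(path, ACTIVE_FILE) == 0;
}
```

Each macro expands to a sequence of adjacent string literals, which the
compiler joins into one; no replacement inside a string is involved.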

> 	"As indicated by the syntax, a token must not follow a #else or
> 	#endif directive before the terminating new-line character.
> 	However, comments may appear anywhere on any source line,
> 	including on a preprocessor directive."
> 
> This breaks many existing programs, including rmail, deroff, diction,
> efl, eqn, learn, lint, nroff, refer, struct, troff, uucp, and ingres.

Interestingly enough, I find *no* occurrences of the trouble-causing
syntax in rmail, deroff, eqn, learn, lint, nroff, refer, struct, troff,
or uucp on my system.  A quick inspection of the System V sources (we
have, but don't run, System V) also comes up empty.  So, this change
breaks Berklix and only Berklix programs; everybody else has been
following the CRM, which makes no provision for trailing tokens on
#else and #endif.  This is a non-standard and implementation-dependent
extension.

I have no personal objection to allowing this one, although I think the syntax
ought to be specific (i.e., one identifier only) rather than wide-open
(any random tokens).
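
For illustration, a sketch of the form that is acceptable under both the CRM
and the draft: the label goes in a comment instead of a bare token.
DIGEST_TRACE, TRACE_LEVEL and trace_level are hypothetical names, and
DIGEST_TRACE is assumed to be left undefined.

```c
/* DIGEST_TRACE is a hypothetical configuration macro, undefined here. */
#ifdef DIGEST_TRACE
#define TRACE_LEVEL 1
#else   /* !DIGEST_TRACE */
#define TRACE_LEVEL 0
#endif  /* DIGEST_TRACE */

/*
 * The Berkeley habit at issue, kept inside a comment because a
 * preprocessor following the draft may reject it:
 *
 *     #endif DIGEST_TRACE
 */

/* Reports which branch of the conditional was compiled in. */
int trace_level(void)
{
    return TRACE_LEVEL;
}
```

Converting existing code is mechanical: wrap the trailing identifier in
comment delimiters.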

> 	The implementation may further restrict the significance of an
> 	*external name* (an identifier that has external linkage) to
> 	six characters and may ignore distinctions of alphabetical case
> 	for such names.  ...
> 
> ...  If one is asking
> someone to rewrite a compiler (and many of the extensions would require
> some extensive modifications to existing compilers), asking them to
> modify a loader is not too much to add.  ...

As various people (including me) have pointed out, modifying (say) the
OS/360-aka-MVS linker is politically impossible, however desirable and
technically simple it may be.

I also note that the CRM addresses this point with a (partial) list of
implementations, and it is obvious at a glance that "six chars monocase"
is the lowest common denominator.  For those wishing to write portable
code, the conclusion is clear.
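
As a rough sketch of what "six chars monocase" means in practice, the function
below reports whether two external names would be indistinguishable to a
minimally-conforming implementation.  The function name, the constant
EXT_SIG_CHARS, and the sample identifiers are made up for illustration; this
is not anything specified by the draft.

```c
#include <ctype.h>

/* Significant characters in an external name under the minimum rule. */
#define EXT_SIG_CHARS 6

/*
 * Returns 1 if names a and b collide when only the first six
 * characters count and alphabetical case is ignored, else 0.
 */
int same_external_name(const char *a, const char *b)
{
    int i;

    for (i = 0; i < EXT_SIG_CHARS; i++) {
        int ca = tolower((unsigned char)a[i]);
        int cb = tolower((unsigned char)b[i]);

        if (ca != cb)
            return 0;       /* differ within the significant part */
        if (ca == '\0')
            return 1;       /* both names ended: identical */
    }
    return 1;               /* first six characters all matched */
}
```

By this test read_block and read_bytes are the same name, as are Getc and
getc, so portable code has to make external names differ within the first six
characters.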

I agree that a lot of code was written under the old pdp11 assumptions,
and minimally-conforming implementations of the standard will break such
code.  I observe, however, that minimally-conforming-to-the-CRM
implementations which break such code already exist.  So the standard is not
making the situation any worse.  For the reason mentioned two paragraphs
up, the standard is not in a position to make things any better.

P.S.:  To rebut an unpleasant misinterpretation that has been going
around, I do *not* like identifier-length limits.  Any identifier-length
limits.  The time has long since passed when there was any technical
justification for them, if indeed there ever was.  But it is of great
importance that the ANSI C standard be widely accepted, and that cannot
happen if none of the major manufacturers can implement it fully without
breaking hundreds of other programs.  Standards are necessarily compromises;
"can I live with it?" is a much more important question than "do I like it?".
I don't like it, but I think we can live with it.  I'm getting tired of
people who refuse to grasp the distinction.

				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry
--------------------------------------
End of Vol. 2, No. 6. Std-C  (Jan. 8, 1985  22:30:00)
-- 
Orlando Sotomayor-Diaz	/AT&T Bell Laboratories, Red Hill Road
			/Middletown, New Jersey, 07748 (HR 1B 316)
Tel: 201-949-9230	/UUCP: {ihnp4, houxm}!homxa!osd7  


