Standard C Digest - V2 #6
Orlando Sotomayor-Diaz
osd7 at homxa.UUCP
Wed Jan 9 13:31:42 AEST 1985
ANSI Draft of Proposed C Language Std.
Mail your replies to the author(s) below or to cbosgd!std-c.
Cbosgd is reachable via most of the USENET nodes, including ihnp4,
ucbvax, decvax, hou3c.... Administrivia should be mailed to
cbosgd!std-c-request.
ARPA -> mail to cbosgd!std-c at BERKELEY.ARPA (+++ NOT TO INFO-C +++)
**************** mod.std.c Vol. 2 No. 6 1/8/85 ********************
Today's Topics:
reply to Arnold's comments on the X3J11/84-161 draft
----------------------------------------------------------------------
Date: 8 Jan 85 06:20:06 CST (Tue)
From: cbosgd!ihnp4!utzoo!henry
Subject: reply to Ken Arnold's comments on the X3J11/84-161 draft
References: <596 at homxa.UUCP>
Toward the end of his contribution, Ken makes a key observation that will
be relevant when discussing his specific points (emphasis added):
> ... The standard should not break existing
> code, *except where such code takes advantage of bugs, non-standard
> extensions, or implementation or machine dependent code (such as asm
> statements)*.
I think we are all in agreement on that. Not breaking correct code was
a major objective of the committee; it is my contention that they have
in fact achieved it, and Ken's points are adequately rebutted by his own
observation. In detail...
> Not scanning strings in the program for macro names is proper and
> current. However, not scanning strings in the token sequence in the
> macro definition will break several current programs, including the
> 4.2bsd operating system (c.f. "CTRL" in <sys/ttychars.h>). ...
To quote the K&R C reference manual (henceforth "CRM"), section 12.1
(emphasis added):
    A [#define] causes the preprocessor to replace subsequent
    instances of the identifier with the given string of tokens...
    Each occurrence of a [macro parameter] is replaced by the
    corresponding token string from the call... *Text inside a
    string or a character constant is not subject to replacement*.
In other words, replacement inside strings -- be it for macros or
macro parameters -- is a non-standard extension. It's a "feature" of
the Reiser C preprocessor, which is omnipresent in Unix C compilers
but not in others. The closest thing we have to an
implementation-independent standard for C is the CRM, which explicitly
outlaws replacement inside strings.
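To make the distinction concrete, here is a minimal sketch in C. The
Reiser-style definition appears only inside a comment, since whether it
works depends on the preprocessor. The portable spelling and the helper
names ctrl_of and ctrl_demo are illustrative assumptions, not taken from
any particular header, and the checked values assume ASCII.

```c
#include <assert.h>

/*
 * Reiser-preprocessor idiom, which relies on parameter replacement
 * inside a character constant:
 *
 *     #define CTRL(c) ('c' & 037)      used as CTRL(a)
 *
 * Under the CRM rule quoted above, 'c' stays the literal character c
 * whatever argument is passed, so the idiom is not portable.
 */

/* Portable alternative: the caller supplies the character constant. */
#define CTRL(c) ((c) & 037)

/* Hypothetical helper: control code for a character known at run time. */
int ctrl_of(int ch)
{
    return CTRL(ch);
}

/* Small self-check; the values assume an ASCII character set. */
void ctrl_demo(void)
{
    assert(CTRL('a') == 1);     /* control-A */
    assert(ctrl_of('g') == 7);  /* control-G, the bell */
}
```

The cost of the portable form is two extra quote characters at each call
site, CTRL('a') rather than CTRL(a).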
I agree that this will break a number of things, including 4.2BSD. How
sad. Those programs, including 4.2BSD, were implementation-dependent
to begin with, and the authors have no right to cry about it. It should
be clear from this that I disagree with the committee's expressed intent
to add such a capability later. The current draft standard's neat new
string-concatenation convention (adjacent string literals -- note this
is literals only -- are concatenated at compile time) eliminates the
need for in-string replacement as a way to build filenames out of #defined
pieces, which to my mind was the only real need for in-string replacement.
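A short sketch of building a filename out of #defined pieces with the
string-concatenation convention described above. The macro names, the
path, and the function names are hypothetical, chosen only for
illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical site configuration; these names are not from the draft. */
#define LIBDIR  "/usr/lib/news"
#define ACTIVE  "active"

/*
 * Each macro expands to a string literal, and the adjacent literals
 * are then concatenated at compile time, so no replacement inside a
 * string is needed.
 */
static const char active_path[] = LIBDIR "/" ACTIVE;

const char *active_file(void)
{
    return active_path;
}

/* Small self-check of the assembled name. */
void concat_demo(void)
{
    assert(strcmp(active_file(), "/usr/lib/news/active") == 0);
}
```

Note that this works only for literals, as the draft specifies; a
char-array variable cannot take part in the concatenation.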
> "As indicated by the syntax, a token must not follow a #else or
> #endif directive before the terminating new-line character.
> However, comments may appear anywhere on any source line,
> including on a preprocessor directive."
>
> This breaks many existing programs, including rmail, deroff, diction,
> efl, eqn, learn, lint, nroff, refer, struct, troff, uucp, and ingres.
Interestingly enough, I find *no* occurrences of the trouble-causing
syntax in rmail, deroff, eqn, learn, lint, nroff, refer, struct, troff,
or uucp on my system. A quick inspection of the System V sources (we
have, but don't run, System V) also comes up empty. So, this change
breaks Berklix and only Berklix programs; everybody else has been
following the CRM, which makes no provision for trailing tokens on
#else and #endif. This is a non-standard and implementation-dependent
extension.
I have no personal objection to allowing a trailing tag on #else and
#endif, although I think the syntax ought to be specific (i.e., one
identifier only) rather than wide-open (any random tokens).
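For illustration, a sketch of the two spellings. The switch BIG_WORDS,
the macro WORD_BITS, and the function word_bits are hypothetical names,
and the trailing-token form is shown only inside a comment because the
draft does not accept it.

```c
#include <assert.h>

#define BIG_WORDS 1     /* hypothetical configuration switch */

/* Accepted by the draft: the tag is a comment, not a token. */
#ifdef BIG_WORDS
#define WORD_BITS 32
#else   /* !BIG_WORDS */
#define WORD_BITS 16
#endif  /* BIG_WORDS */

/*
 * The Berkeley-style spelling puts a bare token after the directive:
 *
 *     #endif BIG_WORDS
 *
 * That is the form the draft rules out.
 */

int word_bits(void)
{
    return WORD_BITS;
}
```

Converting existing code is mechanical: wrap the trailing tag in comment
delimiters.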
> The implementation may further restrict the significance of an
> *external name* (an identifier that has external linkage) to
> six characters and may ignore distinctions of alphabetical case
> for such names. ...
>
> ... If one is asking
> someone to rewrite a compiler (and many of the extensions would require
> some extensive modifications to existing compilers), asking them to
> modify a loader is not too much to add. ...
As various people (including me) have pointed out, modifying (say) the
OS/360-aka-MVS linker is politically impossible, however desirable and
technically simple it may be.
I also note that the CRM addresses this point with a (partial) list of
implementations, and it is obvious at a glance that "six chars monocase"
is the lowest common denominator. For those wishing to write portable
code, the conclusion is clear.
I agree that a lot of code was written under the old PDP-11 assumptions,
and minimally-conforming implementations of the standard will break such
code. I observe, however, that minimally-conforming-to-the-CRM
implementations which break such code already exist. So the standard is not
making the situation any worse. For the reason mentioned two paragraphs
up, the standard is not in a position to make things any better.
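As a rough sketch of what "six chars monocase" means for someone
writing portable code, the following checks whether two external names
would be indistinguishable to a minimally-conforming implementation.
The function extern_names_collide and the constant MIN_EXTERN_SIG are
hypothetical, written for this illustration and not part of the draft
or of any library.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Minimum significance the draft lets an implementation impose. */
#define MIN_EXTERN_SIG 6

/*
 * Return 1 if names a and b are the same once truncated to
 * MIN_EXTERN_SIG characters with case distinctions ignored,
 * 0 otherwise. Hypothetical helper, for illustration only.
 */
int extern_names_collide(const char *a, const char *b)
{
    size_t i;

    for (i = 0; i < MIN_EXTERN_SIG; i++) {
        int ca = toupper((unsigned char)a[i]);
        int cb = toupper((unsigned char)b[i]);

        if (ca != cb)
            return 0;       /* differ within the significant part */
        if (ca == '\0')
            return 1;       /* both names ended: identical */
    }
    return 1;               /* first six characters all match */
}

/* Small self-check with made-up names. */
void extern_demo(void)
{
    assert(extern_names_collide("buffer_in", "buffer_out") == 1);
    assert(extern_names_collide("count", "COUNT") == 1);
    assert(extern_names_collide("open", "opens") == 0);
}
```

A check of this kind could be run over a program's list of external
symbols to find names that need changing before a port to such a system.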
P.S.: To rebut an unpleasant misinterpretation that has been going
around, I do *not* like identifier-length limits. Any identifier-length
limits. The time has long since passed when there was any technical
justification for them, if indeed there ever was. But it is of great
importance that the ANSI C standard be widely accepted, and that cannot
happen if none of the major manufacturers can implement it fully without
breaking hundreds of other programs. Standards are necessarily compromises;
"can I live with it?" is a much more important question than "do I like it?".
I don't like it, but I think we can live with it. I'm getting tired of
people who refuse to grasp the distinction.
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
--------------------------------------
End of Vol. 2, No. 6. Std-C (Jan. 8, 1985 22:30:00)
--
Orlando Sotomayor-Diaz /AT&T Bell Laboratories, Red Hill Road
/Middletown, New Jersey, 07748 (HR 1B 316)
Tel: 201-949-9230 /UUCP: {ihnp4, houxm}!homxa!osd7