Request for Comments

Barry Shein bzs at bu-cs.UUCP
Tue Jul 1 04:58:48 AEST 1986


Any comments? This is taken from INFO-VAX (mod.computers.vax):

------------------------
Path: bu-cs!harvard!caip!think!nike!styx!YALE.ARPA!LEICHTER-JERRY
From: LEICHTER-JERRY at YALE.ARPA
Subject: Re: main() and entry points in C
Date: Thu, 26-Jun-86 08:50:52 EDT

To: John Owens <edison!jso%virginia.csnet at CSNET-RELAY.ARPA>
In-Reply-To: John Owens <edison!jso%virginia.csnet at CSNET-RELAY.ARPA>, Mon, 23 Jun 86 09:34:42 edt

In general, I agree with what you say.  A couple of small comments:

    C does *not* guarantee that address 0 doesn't contain anything, but
    that's another discussion.
C DOES guarantee that the integer constant 0, cast to any pointer type, will
never be equal to a pointer to any actual object of that type.  In principle,
the cast could change the bit pattern; it almost never does - certainly it
does not on a VAX.  Thus, _start == NULL.  Most users will never see this,
but an implementer of _start() would.  (Minor point, but the fact is there IS
an inconsistency - nothing keeps you from doing an extern void _start() and
looking at the resulting pointer.)
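
A minimal sketch of that guarantee, written in ANSI-style C rather than the K&R style of the day.  The names some_object, address_of_object and points_at_real_object are made up for illustration.  It checks only that a null pointer constant compares unequal to the address of an actual object; it says nothing about the bit pattern an implementation uses for a null pointer, or about what sits at address 0.

```c
#include <stddef.h>

/* Hypothetical object, used only to have something to point at. */
static int some_object;

/* Returns 1 if p compares unequal to the null pointer, else 0. */
int points_at_real_object(const int *p)
{
	return p != (int *)0;
}

/* Address of an actual object: the language says this is never null. */
const int *address_of_object(void)
{
	return &some_object;
}
```

Calling points_at_real_object(address_of_object()) should yield 1 on any conforming implementation, while passing (int *)0 or NULL yields 0.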

    --  Well, getting there.  I tried
    --  "ld foo.o /lib/libc.a".  No errors!  Running a.out produces "Hello
    --  world", followed by an access violation.  Adding an explicit exit(0)
    --  fixes that.
    
    This is certainly not supported.  You were lucky.  It's dependent on
    the implementation of the exec(2) system call whether or not you'll
    get your command line arguments this way.
Actually, I've since been informed that, while argc is passed correctly, argv
is screwy and envp isn't there at all.
    
    --  [...] but are you
    --  still going to bet that the first routine WON'T end up as the entry
    --  point?
    
    I won't bet on anything if the loader isn't invoked properly....
That gets to the crux of things:  The "proper" way to invoke the loader is
undocumented - you must use cc.  How then do you deal with a program written
in multiple languages?  Basically, you ask a wizard....

I find it rather amusing that Unix, which (quite properly) argues for separate
modules with separate functions, and clean interfaces between them, glues the
loader and the C compiler together in a very ad hoc, undocumented way!  (Side
comment:  You at least understand what Unix is doing here.  I had a couple of
other correspondents on this issue who had no real idea what was going on, and
ended up effectively claiming that the loader really is part of the compiler.
If that's the case, (a) it's going to be very hard to deal with multiple
compilers, ever; (b) it becomes hard to justify why the loader doesn't do more
to help the compiler/user out - e.g., check for type clashes in external
function calls.  This would have been trivial to do if the implementers had
wanted to, with minimal overhead, and much faster than lint.  Yes, it would
have required additional facilities in C - argument definitions as in ANSI C -
but then the language, compiler, and loader were developed by the same people
at the same time.  As for those other correspondents, their lack of knowledge
didn't slow them down a bit in defending their incorrect religious
statements....)
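
A small sketch of what "argument definitions as in ANSI C" buy you.  The function names half and half_of_three are hypothetical.  Note that this shows checking by the compiler, within the translation units that see the prototype; it is not the link-time check in the loader proposed above.

```c
/* Hypothetical function, declared with an ANSI-style prototype. */
double half(double x);

double half(double x)
{
	return x / 2.0;
}

/*
 * With the prototype in scope, the int argument 3 is converted to
 * double before the call.  Without any declaration of the parameter
 * types (K&R style), the int would be passed as-is and the callee
 * would misread it.
 */
double half_of_three(void)
{
	return half(3);
}
```

A call with a clearly incompatible argument, such as a pointer, would be diagnosed at compile time as long as the prototype is visible at the call site.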

    --  At worst, I was claiming that a lot of non-portable C code got written
    --  under Unix (since K&R certainly contains nothing to indicate that
    --  there can be an entry point other than main()).  And if you don't
    --  believe THAT, then you haven't looked at much Unix code.
    
    That code you've been looking at is going to have a hard time being
    ported to most UNIX systems then, much less any other system with a C
    compiler.  I've been porting, adapting, and randomly mangling C code
    for UNIX from a variety of sources for years, and haven't run into a
    single program that doesn't have an entry point of main().  Would you
    refer me to such a program that I might have access to, like something
    from USENET, a USENIX tape, or a System V or BSD distribution?
If you read more closely what I said, you'll see that I didn't claim to have
any examples of this kind of thing...I just claimed that, somewhere out
there, they were likely to exist.  I know the people who did the VAX C
compiler and run-time support, and they've tried really hard to be compatible
with Unix.  Unfortunately, that can be very hard to do, since Unix programs
make use of a lot of undocumented "features".  For example:  There is
absolutely nothing in any definition of C that says that in:

	f(a,b)
	int a,b;
	{	int *x;

		x = &a + 1;	/* pointer arithmetic already scales by sizeof(int) */
		...
	}

x will point to b.  In a field-test version of VAX C V2.0, this was NOT true.
(The VMS procedure-call spec says that the argument list is owned by the
CALLING procedure, which may place it in read-only memory, re-use it, etc.;
the CALLED procedure may only read it.  In that version of C, if you ever
took the address of a formal argument, the value passed was copied to a
temporary cell on entry, and the address you got was of the temporary.  As
far as documented C semantics are concerned, this is a completely correct
implementation - but it prevents you from screwing with the caller's argument
list.)  Anyway, cries of pain came from all over:  Despite the existence of
varargs - which WAS provided with that release, BTW - it turns out that there
are LOTS of C programs that assume you can scan through an argument list this
way.  So the final version of V2 put things back as they were, requiring a
waiver of conformance with this aspect of the procedure-call spec.  (As it
happens, VAX C (currently) always builds argument lists on the stack and then
discards them, so you can screw around to your heart's content - but try it
with a FORTRAN caller, and things get really weird....)
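
For contrast, a sketch of scanning an argument list without assuming anything about its layout.  This uses the ANSI <stdarg.h> interface rather than the older <varargs.h> that the text most likely means by "varargs"; the function name sum_ints and its count-first calling convention are invented for the example.

```c
#include <stdarg.h>

/* Hypothetical example: add up `count` int arguments that follow `count`. */
int sum_ints(int count, ...)
{
	va_list ap;
	int total = 0;
	int i;

	va_start(ap, count);		/* start after the last named argument */
	for (i = 0; i < count; i++)
		total += va_arg(ap, int);	/* fetch the next int argument */
	va_end(ap);
	return total;
}
```

A call such as sum_ints(3, 1, 2, 3) walks the three trailing arguments through va_arg, so it never depends on where the caller chose to build the argument list.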

Anyway, given that Unix programmers have historically grasped at ANYTHING they
can find the least justification for in the documentation - or no
justification at all - "compatibility" has to mean "put in EVERYTHING you
can, even if you can't think of anyone who's using it.  Someone will come
along who wants it, some day...."  Since the "entry point is the first
routine" IS, in fact, documented - even if only for wizards! - supporting it
couldn't hurt....

    	John Owens @ General Electric Company
							-- Jerry
-------


