sizeof(int) on 16-bit Atari ST.

Ray Butterworth rbutterworth at watmath.waterloo.edu
Thu Nov 16 01:49:15 AEST 1989


Thanks to everyone that replied to my question about sizeof(int),
both in news articles and by mail.

I wanted to know if there were any reasons,
other than making it easier to compile non-portable code,
for having sizeof(int)==sizeof(long) on the 16-bit Atari ST.

No one could come up with any other reason
(although maybe some think they did :-).

Below are excerpts from most of the responses.
If yours isn't there, it only means that I didn't disagree
with what you said.

If anyone disagrees with my comments below,
it is probably best to continue by mail or follow up to group
comp.lang.c since this topic is now about C programming and
not directly related to the ST.

====

> From: stephen at oahu.cs.ucla.edu (Steve Whitney)
> Organization: UCLA Computer Science Department
> 
> You can always use this simple hack:
> #define int short
> But you have to be careful with constants.  If you pass 3 as an argument
> to a routine expecting a 16 bit integer, it will pass 00 00 00 03 instead
> of 00 03.  To get around that, pass your constants as (short)3 instead.

No.  If the function is declared "func(int x)", or if there is no
prototype in scope, then "func( (short)3 )" will convert 3 to a short,
and then convert that short back to an int before putting it on the
stack.

The same thing will happen with non-constants too.
"func( (short)x )", if x is type char, short, or int, will convert the
value of x to a short, and then convert this possibly truncated value
back to an int before putting it on the stack.
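
To make the promotion concrete, here is a rough sketch (func() and
its declaration are made up for the example):

    extern int func();      /* old-style declaration: no prototype      */

    void caller()
    {
        short s = 3;

        func((short)3);     /* the cast makes a short, but the default  */
                            /* argument promotions widen it back to int */
        func((short)s);     /* same thing: only the truncation caused   */
                            /* by the cast survives, not the narrower   */
                            /* type itself                              */
    }

With no prototype in scope (or with a prototype whose parameter is
declared int), the argument that actually gets passed is an int
either way.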

> Of course if you haven't written your program yet, just write stuff to
> use shorts instead of ints.

I use shorts when I want shorts, longs when I want longs,
and ints when I don't care which I get.
That is what ints are supposed to be for:
when it doesn't matter whether the value is long or short,
the compiler is free to use the most efficient type.

====

> From: ron at argus.UUCP (Ron DeBlock)
> Organization: NJ Inst of Tech, Newark NJ
> 
> This  isn't a flame, just a reminder:
> 
> If you want 16 bit ints, you must declare short ints.
             ^
             at least
> If you want 32 bit ints, you must declare long ints.
             ^
             at least
> If you just declare int, the size is implementation dependent.
Correct.

The rule is that chars are at least 8 bits, shorts and ints are
at least 16 bits, longs are at least 32 bits, and
sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long).

Note that I sometimes use a machine that has 36-bit shorts:
the code to access 2-byte objects there is very inefficient,
and was judged not worth implementing, since most programs ask
for shorts for the wrong reason (i.e. except in explicitly
non-portable code, shorts should only be used in very large arrays).
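
To illustrate what I mean by "the wrong reason", here is a rough
sketch (the array and its size are invented for the example):

    #define NSAMPLES 20000          /* hypothetical table size          */

    short samples[NSAMPLES];        /* short: here the storage matters  */

    long sum_samples()
    {
        long total = 0;             /* needs more than 16 bits of range */
        int i;                      /* scratch index: let the compiler  */
                                    /* pick whatever width is fastest   */
        for (i = 0; i < NSAMPLES; i++)
            total += samples[i];
        return total;
    }

The array is where short's space saving pays off; the scalars are
where int's "whatever is fastest" property pays off.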

> You're right that a 16 bit int makes more sense in a 16 bit architecture,
> but DO NOT depend on all compilers to do it that way!

I don't.
I rely on ints being at least as big as shorts, and I hope that the
compiler will give me the appropriate size for the current architecture.
Many ST C compilers don't.  This makes the program a little
slower, but since I write portable code it still works fine.

====

> From: apratt at atari.UUCP (Allan Pratt)
> Organization: Atari Corp., Sunnyvale CA
> 
> > "so that badly written code will still work ok"
> 
> I *do* consider this a valid reason.

Sorry.  It is certainly a valid reason if one is faced with having
to compile non-portable code.  When I said "not valid", I meant not
valid with respect to my question.  I was simply asking if anyone knew
of any *other* possible reasons.  So far, no one has come up with one.

> Personally, I consider "int" to
> be an evil data type when the issue of portability comes up.

Funny.  Personally I consider "int" to be the best thing for portability.
In fact that is its sole reason for existing.
If you don't use "int" for portability, what do you use it for?

One should only use int in cases where it doesn't matter whether
the compiler generates code for long, short, or something in between.
What is wrong with int is the way so many programmers have misused it.

> But look
> at UNIX and all the libraries meant to look like UNIX: malloc, fwrite,
> etc. all take "int" arguments, because "int" is "big enough" on those
> machines.  A 16-bit library will probably work, but you can't malloc /
> write more than 32K at a time.  Thanks.  MWC suffers from this, as does
> Alcyon and some Mac-derived compilers (but for a different reason).

True, but the ANSI standard version of C fixes all or most of these problems.
That's why I'm using GNU C instead of one of the older non-standard compilers.

====

> From kirkenda at jove.cs.pdx.edu  Sat Nov  4 00:41:24 1989
> From: Steve Kirkendall <kirkenda%jove.cs.pdx.edu at RELAY.CS.NET>
> Organization: Dept. of Computer Science, Portland State University; Portland OR
> 
> GCC has been ported to Minix-ST, and that version has a 16-bit int option.
> Unfortunately, it has a few problems (eg. sizeof(foo) returns a long value,
> which breaks code such as "p = (foo *)malloc(sizeof(foo))" because malloc()
> doesn't expect a long argument).

That problem will go away in general as more compilers
conform to the standard.

If minix is using the GNU compiler and the GNU library,
there shouldn't be any problem.  Under ANSI C, malloc()'s argument
is supposed to be type (size_t), which is the same type as the result
returned by sizeof.  For an ST, this is probably typedefed to
(unsigned long).  I don't know why the minix version shouldn't work.
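
As a sketch of the declarations involved (the typedef below is only
my "probably unsigned long" guess, not something to rely on):

    typedef unsigned long size_t;   /* an ST implementation's likely choice */
    extern void *malloc(size_t n);  /* the ANSI prototype                   */

    struct foo { long a; short b; };

    void example()
    {
        struct foo *p;

        p = (struct foo *)malloc(sizeof(struct foo));
        /* sizeof yields a size_t, and the prototype says malloc()   */
        /* takes a size_t, so the argument never gets squeezed       */
        /* through an int on the way in.                             */
    }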

> As I said before, the speed difference is about 20% overall.

I didn't mean to claim that everything would be twice as fast.
I was simply curious as to what people thought they were gaining
by choosing an option that made things at best no slower and at
worst a lot slower.

====

> From iuvax!STONY-BROOK.SCRC.Symbolics.COM!jrd  Thu Nov  2 17:19:05 1989
> From: John R. Dunning <iuvax!STONY-BROOK.SCRC.Symbolics.COM!jrd>
> 
>     From: watmath!rbutterworth at iuvax.cs.indiana.edu  (Ray Butterworth)
>     This reminds me of something I've been wondering about for a while.
>     Why does GCC on the ST have 32 bit ints?
> 
> GCC is written with the assumption that modern machines have at least
> 32-bit fixnums.

Fine.  But it shouldn't assume that those 32-bit integers will
have type (int).
When I write code that needs more than 16 bits of integer,
I ask for type (long), not for type (int).

> (As near as I can tell, all GNU code is.  That's
> because it's mostly true) GCC will not compile itself if you break that
> assumption

Then I'd say the compiler is badly written
(though there are of course varying degrees of badness,
and of all the bad code I've seen, this stuff is pretty good).
You should be able to write code that doesn't rely on (int) being
the same as (long) or being big enough to hold a pointer,
without any loss of efficiency.  i.e. on architectures where (int)
and (long) are the same, it will generate exactly the same machine code.
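
For example (a made-up fragment; both halves do the same job, but
only the second half survives a 16-bit int):

    void sketch()
    {
        /* Relies on sizeof(int) == sizeof(long) == sizeof(char *):      */
        int  nbytes = 100000;           /* overflows a 16-bit int         */
        int  where  = (int)&nbytes;     /* assumes int can hold a pointer */

        /* Assumes nothing, and compiles to identical code on machines   */
        /* where int and long happen to be the same size:                */
        long  nbytes2 = 100000L;        /* long guarantees >= 32 bits     */
        char *where2  = (char *)&nbytes2;   /* pointers stay pointers     */
    }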

>   > Surely 16 is the obvious size considering it has 16 bit memory access.
> 
> Nonsense.  It makes far more sense to make the default sized frob be a
> common size, so you don't need to worry about it.  For cases you care
> about speed, you optimize.

Change "Nonsense." to "Nonsense:", and I'll agree.

The whole point of type (int) is so the compiler will optimize for me
when I don't know what architecture I am writing for.
And if I am writing portable code there is no way I should know what
architecture I am writing for.
If I want an integer value that is more than 16 bits, I'll ask for (long).
If I want an integer value that doesn't need to be more than 16 bits,
I'll ask for (int).  The compiler might give me 16, it might give me 32,
or it might give me 72; I don't really care.  The important thing is that
the compiler should give me the most efficient type that is at least 16
bits.

>   > (Note that I don't consider
>   >  "so that badly written code will still work ok"
>   >  as a valid reason.)

> Fine, turn it around.  Why is it valid for things that know they want to
> hack 16-bit frobs to declare them ints?

It isn't.  They should be declared int only if I want them to be
at least 16 bits.

> To avoid being 'badly written code', they should declare them shorts.

No.  If I want something that is *exactly* 16 bits,
I am in one of two different situations:
1) I am writing machine specific code.
2) I am writing a special numerical algorithm.

In the first case, my code is obviously non-portable,
so it is fine to use short, or char, or whatever type (other than int)
guarantees me 16 bit integers on this architecture.
Such code should of course be isolated from the rest of the portable
code, and documented as being very architecture specific so as to
minimize the amount of work required to port the program to a
different architecture.
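
One common way to do that isolation is a single machine-specific
header; a sketch (the file and typedef names are invented):

    /* machdep.h -- all of the ST-specific size assumptions live here.  */

    /* On this compiler a short happens to be exactly 16 bits.  Anyone  */
    /* porting the program to another architecture edits these typedefs */
    /* and nothing else.                                                */
    typedef short          int16;
    typedef unsigned short uint16;

The portable bulk of the program then says int16 only where exactly
16 bits is genuinely required, and the architecture dependence stays
confined to one small, well-marked file.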

In the second case, I can still write portable code,
but I have to be very careful about what assumptions I make.
e.g. #define SIGNBIT(x) (0x8000 & (x))
makes a big assumption about int being 16 bits.
But  #define SIGNBIT(x) ( (~(( (unsigned int)(~0) )>>1)) & (x) )
will work fine regardless of the size of int, and will generate
the same machine code as the first macro when int is 16 bits.
Coding for portability may require a little extra effort,
but it doesn't mean the result has to be any less efficient.
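
For anyone who wants to convince themselves, here is a throwaway
test (the test program itself is just illustration):

    #include <stdio.h>

    /* All-ones, shifted right once and complemented, leaves only the */
    /* top bit of an unsigned int set, whatever its width happens     */
    /* to be.                                                         */
    #define SIGNBIT(x) ( (~(( (unsigned int)(~0) )>>1)) & (x) )

    int main()
    {
        printf("%u\n", (unsigned int)SIGNBIT(-1));  /* the sign-bit mask */
        printf("%u\n", (unsigned int)SIGNBIT(1));   /* prints 0          */
        return 0;
    }

On a 16-bit-int compiler the first line prints 32768; on a 32-bit-int
compiler it prints 2147483648; either way the macro picks out the
sign bit.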

>   > Every time anything accesses an int in GCC, it requires two memory
>   > accesses.  Most programs are full of things like "++i" or "i+=7"
>   > or "if(i>j)", and such things take approximately 100% longer when
>   > ints are 32 bits.
> 
> If you don't optimize them or aren't careful about how you declare them,
> sure.

But I did optimize; I declared them (int) expecting the compiler to give
me the integral type that is at least 16 bits long and is the most
efficient for the current architecture.  On an ST, that is 16 bits.

I hope you're not suggesting that by "optimizing" I should have
declared them (short), knowing that I am working on an ST.
That is known as writing bad, non-portable code.

On some word addressable machines, for instance, there are
perfectly correct compilers for which the code for accessing a short
can be far less efficient than the code for accessing a long,
e.g. 3 or more instructions instead of 1.

>   > Has anyone made a 16-bit GCC and library and done a comparison?
> 
> Yes, it's for sure faster to use shorts than ints.

That's what I was objecting to.
By the original definition of (int),
using int should be exactly as fast as using the faster of (short) or (long).
i.e. regardless of architecture, the best choice should be (int).

> Rather than griping, why not just use the -mshort switch?  It's designed
> for just this kind of thing.  It's even documented.

Sorry, I wasn't intending to gripe (although I know I often sound like it).
I also wasn't wanting to know how to make it behave the way I wanted.
I was simply wondering why the default behaviour was the way it was.

I think the GNU compiler and libraries are the best available for the ST,
and I certainly can't complain about the price.

I really was curious to see if anyone had any reason why they would
want a 32-bit (int) on an ST, other than for the obvious reason that it
makes it easier to compile badly written code (i.e. code that makes
non-portable assumptions), and so far no one has.

====

> From: hyc at math.lsa.umich.edu (Howard Chu)
> Organization: University of Michigan Math Dept., Ann Arbor
> 
> For compatibility - using the 32 bit int mode, I can take just about
> any source file off a Sun, Usenet (comp.sources.unix), etc., type
> "cc" or "make" and have a running executable without ever having to
> edit a single source file.

i.e. so you can compile code that makes some non-portable assumptions,
namely that sizeof(int) == sizeof(long).
i.e. so you can easily compile badly written code.
I already know about that reason.  I spend half my time at work
trying to get all-the-world's-a-VAX Berkeley programs to work on
other hardware.
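
The classic offenders look something like this (a made-up fragment,
not code from any particular program):

    #include <stdio.h>
    #include <string.h>

    void vaxism(char *p)
    {
        int n    = (int)p;          /* stuffs a pointer into an int      */
        int size = 100000;          /* a count that needs 32 bits        */

        printf("%d\n", strlen(p));  /* strlen() returns size_t, not int  */
    }

Every line works by accident when int, long, and pointers are all
32 bits, and quietly breaks when int drops to 16.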

> >Surely 16 is the obvious size considering it has 16 bit memory access.
> Yep. So obvious, in fact, that every version of ST GCC that's been
> posted has been sent out with *both* 16 *and* 32 bit libraries. What
> are you complaining about?

The first version I had didn't have the 16-bit library.
I have since obtained updates.

====

> From: 7103_300 at uwovax.uwo.ca
> From Eric R. Smith
> 
> The most up-to-date version of the libraries (both 16 and 32 bit) are
> available on dsrgsun.ces.cwru.edu, in ~ftp/atari. If you don't have
> these libraries, get them; thanks to some nice work by Dale Schumacher
> (dLibs), Henry Spencer (strings), Jwahar Bammi (lots of stuff), the
> people at Berkeley (curses and doprnt) and yours truly, they're a big
> improvement on the original GCC library.

Yes.  I just picked these up last week.  Thank you all.

(Of course gulam still dropped a couple of bombs the first time I
 used "more", so I aliased it away and use vi.)


