Floating point puzzle

Richard A. O'Keefe ok at quintus.uucp
Mon Aug 8 11:16:14 AEST 1988


In article <3117 at emory.uucp> riddle at emory.uucp (Larry Riddle) writes:
>Notice that x and y, which have been declared as floats, and thus have
>a 32 bit representation (according to the manual this obeys IEEE
>floating point arithmetic standards), both are printed the same in hex,

This is >>C<<, remember?  Floats are 32 bits IN MEMORY, but when you
operate on them or pass them to functions they are automatically
converted to double.
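
A minimal sketch of that promotion, assuming a compiler that provides
<stdarg.h>.  The helper name first_vararg is hypothetical and is not
from the original article.  Under ISO C the float-to-double promotion
is only guaranteed for arguments passed through "..." or to a function
declared without a prototype, so the sketch uses a variadic function:

```c
#include <stdarg.h>

/* Hypothetical helper: returns the first argument after 'count'.
   A float passed through "..." arrives as a double, so it has to be
   fetched with va_arg(ap, double); fetching it as float would be wrong. */
static double first_vararg(int count, ...)
{
    va_list ap;
    double d;

    va_start(ap, count);
    d = va_arg(ap, double);   /* the promoted float */
    va_end(ap);
    return d;
}
```

With float x = 0.1f, the call first_vararg(1, x) hands back x widened
to double, which is the same thing printf receives when you pass it a
float.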

Since you specifically mention the Sun-4, I suggest that you read your
copy of the Sun Floating-Point Programmer's Guide.  In particular, if
you really want to pass 32-bit floating-point numbers in float format,
you will need the "-fsingle2" compiler option.  (I haven't tried this
on a Sun-4, but it works fine on Sun-3s.)

The two flags are

	-fsingle	If the operands of a floating-point operation are
			both 'float', the operation will be done in single
			precision instead of the normal double precision.

	-fsingle2	Float arguments are passed to functions as 32 bits,
			and float results are returned as 32 bits.  (Useful
			for calling Fortran functions.)

TRAP:  floating-point constants such as 1.0 are DOUBLE precision, so if the
compiler sees float x; ... x+1.0, it will do the addition in double
precision, because only one of the operands is 'float'.  In such a case,
I write

	float one = 1.0; ... x+one ...

so that both operands are 'float'.
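
The trap can be checked with sizeof, which reports the size of an
expression's type without evaluating it.  This is a sketch assuming an
ISO C compiler, where float+float has type float; the function names
are hypothetical.  An older compiler that widens every float operand
to double would report sizeof(double) in both cases unless something
like -fsingle is in effect.

```c
#include <stddef.h>

/* x + 1.0: the constant 1.0 is a double, so the sum is a double. */
static size_t size_with_double_constant(void)
{
    float x = 2.0f;
    return sizeof(x + 1.0);
}

/* x + one: both operands are float, so under ISO C the sum is a float. */
static size_t size_with_float_variable(void)
{
    float x = 2.0f;
    float one = 1.0;
    return sizeof(x + one);
}
```

On a machine with 32-bit floats and 64-bit doubles, the first function
returns 8 and the second returns 4.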


