Floating-point non-exactness
Paul Hilfinger
hilfingr at tully.Berkeley.EDU
Fri Aug 3 17:19:13 AEST 1990
Rick Jones asks
> Is there a general solution to this problem without continuously rounding
> everything? I believe it's possible by applying a "maximum reliable
> precision" concept to FP numbers.
I don't see how to do this sensibly. Presumably, you'd want to say
that fpcmp(x,y)==0 if x and y are within a few units in the last
place. How few? Consider first the case where x and y are the
representations of literals xl and yl. Assume the machine rounds
literals deterministically to the nearest representable FP number.
Then it would be reasonable to define
fpcmp(x,y) == 0 iff x == y
(Here, "x" means "the FP number representing xl", etc.) This is
sensible because, indeed, x != y ==> xl != yl.
Suppose now that x is the representation of the literal xl and y is
the representation of the result of yl1 * yl2. On VAXes and IEEE
machines, it would then be sensible to say that fpcmp(x,y) == 0 iff x
and y differed by no more than (I think) 1 ulp (maybe 2; it's late).
This continues; every new expression supplying x and y determines a
new "sensible".
> Can you ensure that "fpcmp (a, b)" and "fpcmp (a-b, 0.0)" yield the same
> result when a and b are very close?
The subtraction loses information, so "no". Let's suppose that we
carry 4 binary digits of significand. You certainly want to say that
fpcmp(0.125, 0.25) == -1. But to satisfy your condition, this
requires that fpcmp(-0.125, 0.0) == -1. That in turn requires that
fpcmp(1.125,1.25) == -1 (since 1.125-1.25 is also -0.125) [these are
all multiples of powers of 2, so the arithmetic is generally exact].
However, 1.125 and 1.25 differ by only one in the last place; thus, you are
forced to consider two FP numbers different if they differ in the last
place---in other words if they aren't == according to built-in FP
equality.
I assume in this response that you want an answer suitable for floating point,
and that therefore it is unacceptable to say that fpcmp(x,y)==0 if the
*absolute*
error |x-y| is less than some threshold. If absolute error were acceptable for
some reason (it does not seem to be for your intended application), then we
could make fpcmp(x,y) and fpcmp(x-y,0.0) compatible, as other responders
have indicated.
Paul N. Hilfinger, Associate Professor (415) 642-8401
Computer Science Division Hilfinger at Berkeley.EDU
Dept. of Electrical Engineering and Computer Sciences
University of California
Berkeley, CA 94720
Paul N. Hilfinger, Associate Professor (415) 642-8401
Computer Science Division Hilfinger at Berkeley.EDU
Dept. of Electrical Engineering and Computer Sciences
University of California
Berkeley, CA 94720
More information about the Comp.lang.c
mailing list