Example pgm that is 3X slower on SS2 than on SS1+
Mark G. Johnson
mark at mips.com
Sat Feb 2 11:41:46 AEST 1991
Here is a short C program that runs 3 times faster on an SS1+ than on an
SS2. Measured with /bin/time:
Compiled with                  SPARCstation 2      SPARCstation 1+
--------------------------------------------------------------------
cc -O prog.c -lm               29.3 user sec       9.5 user sec
/* ************* code follows *************** */
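#include <stdio.h>
#include <math.h>

/* compute the first 1000 digits of PI = 4*arctan(1) */
int main()
{
    long d = 4, r = 10000, n = 251, m = 3.322*n*d;
    long i, j, k, q;
    static long a[3340];

    /* initialize the working array */
    for (i = 0; i <= m; i++) a[i] = 2;
    a[m] = 4;

    /* each outer pass produces one group of d = 4 decimal digits */
    for (i = 1; i <= n; i++) {
        q = 0;
        /* the integer multiplies and divides below dominate the run time */
        for (k = m; k > 0; k--) {
            a[k] = a[k]*r+q;
            q = a[k]/(2*k+1);
            a[k] -= (2*k+1)*q;
            q *= k;
        }
        a[0] = a[0]*r+q;
        q = a[0]/r;
        a[0] -= q*r;
        /* q is a long, so print it with %ld; eight groups per line */
        printf("%04ld%s", q, i & 7 ? " " : "\n");
    }
    return 0;
}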
/* ************* end of code *************** */
The reason for this unexpected slowdown is rather obscure. SPARC has no
integer multiply or divide instructions, so the compiler calls library
subroutines for them, and those subroutines get placed in a very unlucky
spot in the SS2's cache. On the SS1+ they land in a less dangerous area.

This can be seen by having the compiler and/or the OS move the
multiplication and division subroutines to new positions, or by
finagling the *other* subroutines (e.g. fp math) that might get in the way
of the mult and div subroutines. When you do this, the SS2 becomes faster
than the SS1+:
Compiled with                  SPARCstation 2      SPARCstation 1+
--------------------------------------------------------------------
cc -O prog.c -lm               29.3 user sec       9.5 user sec
cc -O -Bstatic prog.c -lm       5.2 user sec       8.9 user sec
cc -O prog.c                    5.5 user sec       9.5 user sec
Mark Johnson
MIPS Computer Systems, 930 E. Arques M/S 2-02, Sunnyvale, CA 94086
(408) 524-8308 mark at mips.com {or ...!decwrl!mips!mark}