SPARC divide - really really slow!
djones at awesome.Berkeley.EDU
Wed Dec 27 15:45:36 AEST 1989
I was faced with a program which ran as fast on a SUN 3/60 as it did on a
SUN 4/280, when there should be a factor of 2-3 difference if you believe
the MIPS ratings.
Profiling with "cc -pg" and gprof made it evident that the time is going
into integer division. I gather SPARC has no divide instruction, so the
division is done by a software routine. This is, of course, part of
the RISC strategy. I'm still just a bit surprised that SUN/SPARC hasn't
figured out a way to get integer divisions done a little faster on a SUN
4/280 than on a SUN 3/60!
I was amused to see some of the "functions" that gprof found using up all
my CPU time. I gather the code checks to see if the numbers are
"not_really_big", or "not_too_big" to do the division (ahem) faster.
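The actual library routine is not reproduced here, so the following is only a rough illustration of why a software divide costs so much more than a single instruction. It is a minimal sketch of textbook shift-and-subtract (restoring) division in C, producing one quotient bit per loop iteration. The name soft_udiv and its interface are hypothetical, and the real routine behind labels such as "divloop" and "not_too_big" may well work differently (for example, by handling small operands on a shorter path).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, NOT the SunOS .udiv routine: restoring
 * shift-and-subtract division of two unsigned 32-bit values.
 * Returns the quotient; stores the remainder in *rem if rem != NULL. */
uint32_t soft_udiv(uint32_t n, uint32_t d, uint32_t *rem)
{
    uint32_t q = 0;
    uint64_t r = 0;   /* wider than 32 bits so the shift cannot overflow */
    int i;

    assert(d != 0);   /* division by zero is the caller's problem here */

    /* Walk the dividend from its most significant bit down. */
    for (i = 31; i >= 0; i--) {
        r = (r << 1) | ((n >> i) & 1u);   /* bring down the next bit */
        if (r >= d) {                     /* divisor fits: subtract it */
            r -= d;
            q |= (uint32_t)1 << i;        /* and set this quotient bit */
        }
    }
    if (rem != NULL)
        *rem = (uint32_t)r;
    return q;
}
```

For example, soft_udiv(100, 7, &r) returns 14 and leaves 2 in r. The loop always runs 32 times, each pass costing a shift, a compare and usually a subtract, which gives a feel for how roughly nine million calls can dominate a profile.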
So are we stuck with this poor multiply/divide performance in SPARC, or is
this shortcoming being addressed? Heck, would it be faster to hand off
these operations to the Floating Point chip?
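As a way of trying the floating-point idea, here is a minimal sketch in C that routes a 32-bit signed division through double precision. The name fp_div is hypothetical. The result matches integer division for every pair of 32-bit operands it accepts, because a double carries a 53-bit significand and the conversion back to an integer truncates toward zero. Whether this is actually faster on a given SPARC machine is not measured here; it depends on the FPU and on the cost of moving values between the integer and floating-point registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 32-bit signed division done in double precision.
 * The quotient is truncated toward zero, as C99 integer division is. */
int32_t fp_div(int32_t n, int32_t d)
{
    assert(d != 0);                          /* no divide by zero */
    assert(!(n == INT32_MIN && d == -1));    /* quotient would not fit */

    /* Both operands convert to double exactly; the cast truncates. */
    return (int32_t)((double)n / (double)d);
}
```

For example, fp_div(100, 7) gives 14 and fp_div(-100, 7) gives -14. A remainder can be recovered as n - fp_div(n, d) * d, at the price of one multiply, which is itself a software routine on a machine without a multiply instruction.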
  %  cumulative     self             self   total
 time   seconds  seconds    calls ms/call ms/call  name
 13.9    106.87    36.19                           divloop [4]
 13.8    142.71    35.84                           divloop [5]
  3.3    162.28     8.69                           divide [10]
  3.3    170.84     8.56                           not_really_big [11]
  3.2    179.13     8.29                           divide [12]
  3.1    187.27     8.14                           not_really_big [13]
  3.0    203.11     7.71                           end_regular_divide [15]
  2.9    210.67     7.56                           end_regular_divide [16]
  2.5    223.95     6.50  9326374    0.00    0.00  .rem [18]
  2.3    229.85     5.91  9326374    0.00    0.00  .div [20]
  1.6    239.22     4.27                           got_result [23]
  1.4    242.88     3.66                           got_result [24]
  0.6    248.88     1.69                           do_regular_divide [25]
  0.6    250.43     1.55                           do_regular_divide [26]
  0.5    251.65     1.22                           end_single_divloop [27]
  0.5    254.04     1.19                           end_single_divloop [29]
  0.2    256.09     0.62        4  155.02  155.02  .urem [33]
  0.1    257.94     0.38                           do_single_div [36]
  0.1    258.32     0.38                           do_single_div [37]
  0.1    259.03     0.36        5   71.01   71.01  .udiv [39]
  0.1    259.38     0.35                           not_too_big [40]
  0.1    259.64     0.27                           not_too_big [41]
  0.1    260.28     0.17                           single_divloop [45]
  0.0    260.35     0.07                           single_divloop [48]
  0.0    260.51     0.01                           zero_divide [55]