System/6000 statistics cf. other machines

Wed Feb 14 10:58:06 AEST 1990

Went to an interesting meeting this afternoon.  It wasn't non-disclosure,
so I guess I can talk about some of the stuff that was mentioned. :-)

First off, raw #'s:

the SPECratio scale had to be extended to 80... :-)

100x100 matrix multiply, in Fortran.  Figures are in MFLOPS

ETA-10P                  75.6
System/6000 aka rios     18.2 *
Titan                     6.7
SparcStation-1            1.4
DECstation 3100           0.8

(*) When they coded the problem by hand, they got around 40 MFLOPS.
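
For anyone who wants to line their own box up against these figures, here is
a rough sketch of how an MFLOPS number for an n x n matrix multiply is usually
worked out.  It assumes the conventional count of 2*n^3 floating point
operations (n^3 multiplies plus n^3 adds); the meeting did not say how the
figures above were counted, and the function names are made up for
illustration.  It is in Python, not the Fortran used for the real runs.

```python
import time

def flop_count(n):
    # Assumption: n*n*n multiplies plus n*n*n adds for an n x n multiply.
    return 2 * n ** 3

def mflops(n, seconds):
    # Millions of floating point operations per second.
    return flop_count(n) / seconds / 1e6

def seconds_at(n, rate_mflops):
    # Time one n x n multiply would take at a given MFLOPS rate.
    return flop_count(n) / (rate_mflops * 1e6)

def matmul(a, b):
    # Plain triple loop over square matrices stored as lists of rows.
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            aik = a[i][k]
            for j in range(n):
                c[i][j] += aik * b[k][j]
    return c

def measure(n=100):
    # Time one multiply on the local machine and convert to MFLOPS.
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    start = time.perf_counter()
    matmul(a, b)
    elapsed = time.perf_counter() - start
    return mflops(n, elapsed)
```

Under that counting, a 100x100 multiply is 2 million operations, so 18.2
MFLOPS works out to roughly 0.11 seconds per multiply.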

From my notes:

Can do one floating point, one integer and one branch instruction in one
cycle.  No mode bit.  32-bit instructions.  "varlen VLIW".  No explicit
pipe.  No delayed branch op (conventional branching).  No quash.
Basically a sequential machine; relies on the compiler for max performance.

single precision slightly slower than double. :-)

Integer P and float P have internal queueing.

Inst Cache handles relative addresses and PC 

Loop closing branch -- special loop countdown register that is decremented
*while* the branch is going.  Therefore, can close a loop in effectively zero
time (while executing next instruction ?)
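
To get a feel for why a free loop-closing branch matters, here is a toy cycle
count.  It is only a back-of-the-envelope model: the cycle numbers in the
usage note are illustrative assumptions, not figures from the meeting, and
the function names are hypothetical.

```python
def loop_cycles(iterations, body_cycles, overhead_cycles):
    # Total cycles for a loop where each trip costs the body plus
    # whatever it takes to decrement, test and branch.
    return iterations * (body_cycles + overhead_cycles)

def overhead_share(body_cycles, overhead_cycles):
    # Fraction of each trip spent on loop bookkeeping.
    return overhead_cycles / (body_cycles + overhead_cycles)
```

With an assumed 4-cycle body and 2 cycles of decrement/test/branch, 100 trips
cost 600 cycles, a third of it bookkeeping.  If the countdown and branch
overlap with the body (overhead 0), the same loop costs 400 cycles.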

line sizes are 128, Inst cache 4K, data 128K (?)

no pre-fetch (try to fix in compiler for now)

memory mapped I/O

penalty for cache miss: 2-4 cycles (speaker couldn't remember exactly)
(some mention of being worried about stride n misses by somebody)

PLA compiler ported from 370.

1 FP operation per cycle, including new multiply & add -> register.
FP is 151 bit precise for last instr.
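
My reading of the precision remark is that the multiply & add keeps the full
product and rounds only once at the end, but that is an assumption on my
part.  Here is a small sketch of what that buys you, emulating a single-
rounding multiply-add with exact rationals in Python.  It says nothing about
how the hardware does it; the function names are made up.

```python
from fractions import Fraction

def mul_then_add(a, b, c):
    # Two roundings: the product is rounded to a double, then the sum is.
    return a * b + c

def fused_mul_add(a, b, c):
    # Emulated single rounding: compute a*b + c exactly as a rational,
    # then round once when converting back to a double.
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)

# Inputs chosen so the exact product, 1 - 2**-60, does not fit in a double.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

separate = mul_then_add(a, b, c)   # product rounds to 1.0, so this is 0.0
fused = fused_mul_add(a, b, c)     # keeps the low bits: -2**-60
```

The separate multiply and add lose the answer entirely (0.0), while the
single-rounding version returns the exact result, -2**-60.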

--
J. Eric Townsend
University of Houston Dept. of Mathematics (713) 749-2120
jet at karazm.math.uh.edu
Skate UNIX(tm).