Longjmping back, and back again; Coroutines in C

Sat Dec 9 12:17:34 AEST 1989

In article <576 at kunivv1.sci.kun.nl> ge at kunivv1.sci.kun.nl (Ge' Weijers) writes:
| kenmoore at unix.cis.pitt.edu (Kenneth L Moore) writes:
| 
| >Yup. This is state of the art computater (sic) architecture. This idea
| >arose along with RISC (Reduced Instruction Set Computer) but can be used
| >on RISC or CISC (Complex Instruction Set Computer) machines.
| 
| And a sorry state it is. I'd rather have twice the number of registers,
| a store-multiple-registers instruction, and NO register windows.
| It's just as fast, because you mostly save registers that contain
| garbage. The SPARC was designed with the assumption in mind that nobody
| uses recursion anyway. This might be true for C programmers.

You should check out the Am29000.  It has 192 visible registers,
load-multiple and store-multiple instructions, and the register
windowing scheme is implemented in software, so that the allocated
register window size is exactly tailored to the frame size required by
the function.

| >Remember that RISC is the result of a statistical study that showed that
| >99% of the instructions used on a CISC machine were a sub-set of some 30
| >(out of maybe 256? on an IBM 360) commands. Also, 10 instructions
| >accounted for 80% and 21 accounted for 95%.
| 
| This has nothing to to with register windows.

True.

| >Another aspect that became apparent during these studies was that much
| >of the overhead in a processor was consumed in keeping track of
| >subroutine calls and returns.
| 
| Recent studies have also shown that you can do the same in software,
| using a simple analysis.

It isn't simple.  You need inter-procedural register allocation
("universal"), and even that is a static analysis -- it doesn't take
into account the dynamic nature of the call tree at runtime, like
hardware register windows do.

| >The original RISC-I guys had a lot of left over chip area and decided
| >that the thing to do with the extra area was to make extra registers.
| >And they shrewdly decided to use the registers to facilitate subroutine
| >handling.
| 
| It was a good idea as an experiment. It's a pain if your recursion goes
| 500 deep. But because the SPARC has no other way to save a lot of
| registers in reasonable time, you're stuck with it.
| It also introduces quite a bit of overhead when the OS has to switch
| contexts.

Actually, recursion isn't the problem -- it's large, quick changes in
the stack depth.  In fact, many times recursive routines are better on
register-windowed machines, because they only have to save some cached
portion of the stack if they overflow the cache.  Consider a dynamic call
chain (of a recursive routine) which looks like:

      1
      	2			2
	  3   3   3   3   3   3   3
	    4   4   4   4   4       4

a non-register-windowed machine must save and restore registers on
each function call and return, while the register-windowed machine
could run without any register state saved or restored to/from memory.

	-- Tim Olson
	Advanced Micro Devices
	(tim at amd.com)