Longjmping back, and back again; Coroutines in C

Ge' Weijers ge at kunivv1.sci.kun.nl
Mon Dec 11 23:25:10 AEST 1989


tim at nucleus.amd.com (Tim Olson) writes:
>You should check out the Am29000.  It has 192 visible registers,
>load-multiple and store-multiple instructions, and the register
>windowing scheme is implemented in software, so that the allocated
>register window size is exactly tailored to the frame size required by
>the function.

This sounds a lot better. One of the problems I have with the SPARC is that
it saves 16 registers, even in the simplest routines. This gives 
pedestrian performance compared to routines that don't overflow the register
window set. Especially on useless routines calculating Fibonacci numbers,
or (worse) functional programs doing spine evaluation on a list.
P.S. how long does context switching take, with a full window?

>It isn't simple.  You need inter-procedural register allocation
>("universal"), and even that is a static analysis -- it doesn't take
>into account the dynamic nature of the call tree at runtime, like
>hardware register windows do.

If you have register windows that only save the minimum amount of registers
necessary then in the worst case you do the same amount of work as
if you use a store-multiple instruction to save your registers.
I think my opinion was based on the SPARC only, not a bad design, but
suboptimal for certain uses. A shame you don't see many 29000 systems around,
except in the controller market. I recently saw a FDDI controller board for
the PC, featuring a 29000 and quite a bit of memory. Looks like the
equivalent of a rubber dingy with a Rolls Royce Merlin engine :-)
(the PC being the dingy, of course)
Interprocedural analysis is not too hard for the kind on (non-typical)
programming languages I have to deal with. It's positively easy, when compared
to C. You don't have to invalidate any register contents after a procedure
call, as there are no global variables.

>Actually, recursion isn't the problem -- it's large, quick changes in
>the stack depth.  In fact, many times recursive routines are better on
>register-windowed machines, because they only have to save some cached
>portion of the stack if they overflow the cache.  Consider a dynamic call
>chain (of a recursive routine) which looks like:

>      1
>      	2			2
>	  3   3   3   3   3   3   3
>	    4   4   4   4   4       4

>a non-register-windowed machine must save and restore registers on
>each function call and return, while the register-windowed machine
>could run without any register state saved or restored to/from memory.

4-deep recursion is hardly any recursion. I'm talking about 30-300,000
deep. If you store 6 unnecessary longwords/call, you use to much stack,
and too many cycles. The performance of the SPARC drops a factor of 2 if
you have a lot of recursion.

I think we agree.

>	-- Tim Olson
>	Advanced Micro Devices
>	(tim at amd.com)


Ge' Weijers


Ge' Weijers                                    Internet/UUCP: ge at cs.kun.nl
Faculty of Mathematics and Computer Science,   (uunet.uu.net!cs.kun.nl!ge)
University of Nijmegen, Toernooiveld 1         
6525 ED Nijmegen, the Netherlands              tel. +3180612483 (UTC-2)



More information about the Comp.lang.c mailing list