A BETTER way to do Multidimensional arrays in C

Rob Warnock rpw3 at redwood.UUCP
Tue Jan 15 20:14:19 AEST 1985


+---------------
| A few months ago I realized there was a better way to do multidimensional
| arrays in C.  This way is not new; in fact, it is discussed on page 110
| of Kernighan and Ritchie...
| 	Steve White
| 	Cornell Univ. Dept. of Physics
+---------------

If my memory serves me, in the Algol-60 community these arrays of pointers
to arrays are called "Illiffe vectors" or "dope vectors". The PDP-10 Algol
compiler used them, and due to the PDP-10's ability to indirect-and-index
through memory, you could access a multi-dimensional array in ONE instruction,
if the subscripts were in registers (as they often are in inner loops).

(The best I could do on a 68000 was 3n+1 instructions, where "n" is the
number of dimensions in the array, and 4n+1 if you don't want to destroy
the subscripts. Could someone who knows the VAX comment on the use of
Illiffe vectors on that machine?)

For example, suppose we have a three-dimensional array declared "A[17,12,25]".
Then let memory location "A_" be the first word of a 17-word array of addresses
of the second-level pointers. In each element of "A_" the indirect bit will be
turned on, and the index field will name (say) register "T2". To assist the
description, call the (normally nameless) second-level arrays A_0 through A_16.
Then "A_" is:

	OPDEF	Z	[0]	;allow the index and indirect syntax to work.

	A:	Z	@A_0(T2)	;A_0 indexed by T2, then indirect
		Z	@A_2(T2)	; through memory.
		...
		Z	@A_16(T2)
	
Then the second-level pointers are 12-word vectors (using A_13 as an example):

	A_13:	Z	A_13_0(T3)	;note that the 2nd level uses T3
		Z	A_13_1(T3)	; and since it's the last level
		...			; of pointers, no indirect bit ("@").
		Z	A_13_11(T3)

And on the final level are just the array elements themselves:

	A:				;also the address of the "real" A
	A_0_0:	 BLOCK	25		;size of third dimension
	A_0_1:	 BLOCK	25
		 ...
	A_13_11: BLOCK	25
	A_14_0:	 BLOCK	25
		 ...
	A_16_11: BLOCK	25

Now put the first subscript in register T1, the second in T2, and the third
in T3. Then to load register T0 with A[T1,T2,T3], simply say:

		MOVE	T0, at A_(T1)	;could also be an ADD, FMUL, etc.

That's nice!

On a 68000, the same "instruction" is:

		lshl	#2,d1		;multiply indices by 4
		lshl	#2,d2		; (plus more work if you needed to
		lshl	#2,d3		;  save them for later)
		addl	#A_,d1
		movl	d1,a0
		addl	a0@,d2
		movl	d2,a0
		addl	a0@,d3
		movl	d3,a0
		movl	a0@,d0

(Unfortunately, when I tried this on the C compiler I use, even with "-O"
it didn't realize you could "addl a0@,d?". It did "movl a0@,d0" instead,
and then added the index to d0, before moving it to a0. Oh well...)

ANyway, Illiffe vectors are the way to go for *speed*. They do cost you
a size penalty, which can be small if the last dimension is large. How
messy would it be to get the compiler to allocate the vectors for you?
(...for static arrays only, please!) Or if you don't want to fool with
the compiler, a simple pre-processor (driven from pragmats) could compute
the vectors and the initializers for them.

Rob Warnock
Systems Architecture Consultant

UUCP:	{ihnp4,ucbvax!dual}!fortune!redwood!rpw3
DDD:	(415)572-2607
USPS:	510 Trinidad Lane, Foster City, CA  94404



More information about the Comp.lang.c mailing list