a style question

Michael Meissner meissner at osf.org
Wed Oct 3 04:04:11 AEST 1990


In article <1990Oct2.151644.1581 at phri.nyu.edu> roy at phri.nyu.edu (Roy
Smith) writes:

| 	Are there actually any machines in which a compare-and-branch for
| inequality is any faster or slower than a compare-and-branch for less-than?
| It seems to me that either should take one pass through the ALU to do the
| comparison and set some flags, so they should both take the same amount of
| time.  I'm basing my assumption on experience with pdp-11 type machines,
| but I find it hard to imagine any other machine being significantly
| different.  Maybe if you had an asynchronous ALU?

Yes.  Any computer based on the MIPS chipset (MIPS, DECstation, SGI)
branches faster on equality and inequality than on the other
comparison operators.

| 	The only scenario I could think of would be a RISC machine which
| has only two branches; branch-on-equal, and branch-on-less-than.  The
| compiler could generate an appropriate stream of instructions to simulate
| any possible branch condition from just those two, and some streams might
| end up being longer than others, but that sounds pretty strange, and very
| un-orthogonal.

MIPS does not have a branch on a < b (unless b is 0).  Instead it has
an instruction that sets a register to 1 if a < b, and to 0 otherwise.
Thus to do the branch, you set a scratch register to the value of
a < b, and then branch on that register being zero.  It does have
direct instructions to branch if two registers are equal or not equal.

For example, consider the following program:

	int i, j, k, l, m, n;

	void foo(){
	  if (i < j)
	    k++;

	  if (l == m)
	    n++;
	}

It produces the following code after running it through the compiler
and assembler:

		foo:
	File 'test-branch.c':
	0: int i, j, k, l, m, n;
	1: 
	2: void foo(){
	3:   if (i < j)
	  [test-branch.c:   4] 0x0:	8f8e0000	lw	r14,0(gp)
	  [test-branch.c:   4] 0x4:	8f8f0000	lw	r15,0(gp)
	  [test-branch.c:   4] 0x8:	00000000	nop
	  [test-branch.c:   4] 0xc:	01cf082a	slt	r1,r14,r15
	  [test-branch.c:   4] 0x10:	10200005	beq	r1,r0,0x28
	  [test-branch.c:   4] 0x14:	00000000	nop
	4:     k++;
	  [test-branch.c:   5] 0x18:	8f980000	lw	r24,0(gp)
	  [test-branch.c:   5] 0x1c:	00000000	nop
	  [test-branch.c:   5] 0x20:	27190001	addiu	r25,r24,1
	  [test-branch.c:   5] 0x24:	af990000	sw	r25,0(gp)
	5: 
	6:   if (l == m)
	  [test-branch.c:   7] 0x28:	8f880000	lw	r8,0(gp)
	  [test-branch.c:   7] 0x2c:	8f890000	lw	r9,0(gp)
	  [test-branch.c:   7] 0x30:	00000000	nop
	  [test-branch.c:   7] 0x34:	15090005	bne	r8,r9,0x4c
	  [test-branch.c:   7] 0x38:	00000000	nop
	7:     n++;
	  [test-branch.c:   8] 0x3c:	8f8a0000	lw	r10,0(gp)
	  [test-branch.c:   8] 0x40:	00000000	nop
	  [test-branch.c:   8] 0x44:	254b0001	addiu	r11,r10,1
	  [test-branch.c:   8] 0x48:	af8b0000	sw	r11,0(gp)
	8: }
	  [test-branch.c:   9] 0x4c:	03e00008	jr	r31
	  [test-branch.c:   9] 0x50:	00000000	nop
	  0x54:	00000000	nop
	  0x58:	00000000	nop
	  0x5c:	00000000	nop
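
The two sequences in the listing above can be modelled in C roughly as
follows.  This is only a sketch: the helper names (slt, incr_if_less,
incr_if_equal) are made up for illustration, the globals are passed as
arguments to keep it self-contained, and the loads, stores and delay
slot nops are left out.

```c
#include <stdint.h>

/* Hypothetical model of the MIPS slt instruction:
   the result is 1 if a < b (signed compare), and 0 otherwise. */
static int32_t slt(int32_t a, int32_t b)
{
    return a < b ? 1 : 0;
}

/* "if (i < j) k++;" needs two instructions to decide the branch. */
static int32_t incr_if_less(int32_t i, int32_t j, int32_t k)
{
    int32_t r1 = slt(i, j);     /* slt   r1,r14,r15           */
    if (r1 == 0) goto skip;     /* beq   r1,r0,skip           */
    k = k + 1;                  /* addiu r25,r24,1            */
skip:
    return k;
}

/* "if (l == m) n++;" needs only one: the compare is in the branch. */
static int32_t incr_if_equal(int32_t l, int32_t m, int32_t n)
{
    if (l != m) goto skip;      /* bne   r8,r9,skip           */
    n = n + 1;                  /* addiu r11,r10,1            */
skip:
    return n;
}
```

Calling incr_if_less(1, 2, 10) takes the fall-through path and returns
11, while incr_if_less(2, 1, 10) branches around the increment and
returns 10.  The less-than test costs one more instruction than the
equality test, which is the difference visible in the listing.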

From looking at the way the bits are set down, the blez (branch on
less than or equal to zero) and bgtz (branch on greater than zero)
instructions could have been defined as ble and bgt, since the second
register field is required to be 0 (and register $0 is hardwired to
0).  I suspect the decision may have been due to chip real estate, and
the fact that equality comparisons happen more frequently in real
programs.
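
To make the point about the bit layout concrete, here is a small C
sketch that pulls apart the fields of a branch instruction word.  The
struct and function names are invented for illustration.  The field
positions are the usual MIPS I-type layout: 6 bits of opcode, 5 bits
each for two registers, and a 16 bit offset counted in words from the
delay slot.

```c
#include <stdint.h>

/* Fields of a MIPS I-type instruction word (names are hypothetical). */
struct itype {
    unsigned opcode;   /* bits 31..26: 4 = beq, 5 = bne, 6 = blez, 7 = bgtz */
    unsigned rs;       /* bits 25..21: first register                       */
    unsigned rt;       /* bits 20..16: second register (0 for blez/bgtz)    */
    int32_t  imm;      /* bits 15..0:  sign-extended word offset            */
};

static struct itype decode_itype(uint32_t word)
{
    struct itype f;
    f.opcode = (word >> 26) & 0x3f;
    f.rs     = (word >> 21) & 0x1f;
    f.rt     = (word >> 16) & 0x1f;
    f.imm    = (int32_t)(word & 0xffff);
    if (f.imm & 0x8000)          /* sign-extend the 16 bit offset */
        f.imm -= 0x10000;
    return f;
}

/* The branch target is relative to the delay slot, i.e. pc + 4. */
static uint32_t branch_target(uint32_t pc, struct itype f)
{
    return pc + 4 + (uint32_t)(f.imm * 4);
}
```

Decoding the word 0x15090005 from the listing gives opcode 5 (bne),
rs 8, rt 9 and offset 5, and branch_target(0x34, ...) comes out as
0x4c.  For blez and bgtz the same rt field is present in the word but
has to be 0, which is why they look as if they could have taken a
second register.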
--
Michael Meissner	email: meissner at osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142

Do apple growers tell their kids money doesn't grow on bushes?


