Efficient use of lighting models

Knobi der Rechnerschrat XBR2D96D at DDATHD21.BITNET
Thu Oct 25 23:01:07 AEST 1990


Hello,

  After reading the current discussion on the efficient use of
the lighting model (Scott Kahn/Kurt Akeley), I have a question.

  Using a 70/GT (and other models) we know that multi-colored surfaces
(every vertex using a different material) are slower than uni-colored
surfaces (all vertices using the same material). Our current algorithm
for the multi-colored case looks like this:

  bgntmesh();
  for (all vertices) {
    /* track the current material to avoid unnecessary lmbind()s */
    if (newmat != oldmat) { lmbind(MATERIAL, newmat); oldmat = newmat; }
    n3f(normals);
    v3f(coordinates);
  }
  endtmesh();

 The question is whether it would be faster to use just one material
and change its properties with lmcolor()/cpack(), like this:

  lmcolor(LMC_DIFFUSE);
  lmbind(MATERIAL, template);
  bgntmesh();
  for (all vertices) {
    /* track the current color to avoid unnecessary cpack()s */
    if (newcol != oldcol) { cpack(newcol); oldcol = newcol; }
    n3f(normals);
    v3f(coordinates);
  }
  endtmesh();
  lmcolor(LMC_COLOR);

 This of course allows changing only one material property (e.g.
DIFFUSE) inside the loop (two in the case of LMC_AD). If you HAVE TO
change more properties at once (and cannot simulate that by changing
a single property), I believe you have to insert lmcolor() calls
inside the loop (see the sketch below). As I understand Kurt, this is
not desirable. I am really interested in an answer to this problem,
as it may eventually force a design decision for our software. We
have also observed that the speed difference between uni- and
multi-colored surfaces (using the first algorithm) varies
dramatically across graphics platforms. The VGX in particular seems
to have problems with that algorithm. Is this observation correct,
and is there an answer to this problem that covers all SGI graphics
machines (PI, PI/TG, GT, GTX, VGX)?
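
 Just to illustrate the pattern I mean, here is a minimal sketch
(assuming lmcolor() may legally be called between bgntmesh() and
endtmesh(); the *_changed conditions and the new_* colors are
placeholders, not code from our program):

  lmbind(MATERIAL, template);
  bgntmesh();
  for (all vertices) {
    if (diffuse_changed) {
      lmcolor(LMC_DIFFUSE);   /* route cpack() to the DIFFUSE property */
      cpack(new_diffuse);
    }
    if (specular_changed) {
      lmcolor(LMC_SPECULAR);  /* route cpack() to the SPECULAR property */
      cpack(new_specular);
    }
    n3f(normals);
    v3f(coordinates);
  }
  endtmesh();
  lmcolor(LMC_COLOR);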

 A second question concerns the LMC_AD mode. How does it work? Does
it set AMBIENT and DIFFUSE to the same RGB values?
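
 In other words (my assumption about the semantics, not verified):

  lmcolor(LMC_AD);
  cpack(0x00804020);  /* does this set both AMBIENT and DIFFUSE
                         to (0.125, 0.25, 0.5)? */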

 Next I have a suggestion. Very often you find combinations like:

   n3f(normals);
   v3f(coordinates);

I may overestimate the overhead of the function calls and of the
interaction between the CPU and the graphics pipeline, but would a
combination of both routines (call it nv3f()) not result in some
performance gain?
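
 To make the intended interface concrete (nv3f() is hypothetical and
does not exist in GL; a trivial caller-side wrapper like the one below
of course saves nothing, the point is that a library-provided version
could send normal and vertex to the pipe in one transfer):

  /* hypothetical combined call; shown only to pin down the interface */
  void nv3f(float n[3], float v[3])
  {
      n3f(n);
      v3f(v);
  }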

 Finally, I have a question concerning the memory alignment of normal
and coordinate data. From the release notes I know that on the GTX
this kind of data has to be quadword aligned for best performance.
We currently allocate normals and coordinates in one-dimensional
float arrays of this form:

   x0,y0,z0,x1,y1,z1,....

and pass the address of the xn element to n3f()/v3f(). Should we
rather allocate additional (dummy) space for the w elements to get
the best performance on the GTX? As this would mean 33% more memory
for vertices and normals, we would like to avoid it if possible. How
large is the performance loss if one does not use quadword-aligned
data? What are the effects on other machines (especially the VGX)?
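
 For reference, the padded layout I have in mind would look like this
(a sketch; it assumes malloc() already returns quadword-aligned,
i.e. 16-byte-aligned, storage, which may need checking on a given
machine):

  /* four floats per vertex: x, y, z plus an unused w for padding */
  float *verts = (float *)malloc(nverts * 4 * sizeof(float));

  /* vertex i then starts on a 16-byte boundary */
  v3f(&verts[4 * i]);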


  This mail has gotten longer than I thought it would when I started.
"Sorry" to everybody who is not interested in the topic.

Regards
Martin Knoblauch

TH-Darmstadt
Physical Chemistry 1
Petersenstrasse 20
D-6100 Darmstadt, FRG

BITNET: <XBR2D96D at DDATHD21>


