Using Macros

Wed Aug 8 23:34:33 AEST 1990

In article <362.26be9dcc at astro.pc.ab.com> yoke at astro.pc.ab.com (Michael Yoke) writes:
[question about how to define multiple statement macros that can be used
 much like functions, esp. *require* a semicolon after their invocation]

My first recommendation: Don't write such macros, use a function.
My second recommendation: Same as first.

Still reading? Then you seem to be pretty sure that you *must*
do it with a macro. So read my third recommendation, but read it
carefully and completely:

The trick is to wrap the statements in a construct that needs
a trailing semicolon. Basically, there are two such constructs.

#define multi_stmt_macro1()  if (1) { put your stuff in here } else
or
#define multi_stmt_macro2()  do { put your stuff in here } while (0)

NOTE: In both cases you write *no* trailing semicolon in the #define.
So an invocation of this macro must supply the semicolon:

	multi_stmt_macro1() ;
or
	multi_stmt_macro2() ;

Either invocation expands to exactly one execution of the statements
you have written in the definition.
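
As a sketch of why the wrapper matters, here is the second form used as
the body of an unbraced if/else. SWAP_INT and order_pair are names made
up for this illustration; they do not come from the original question.

```c
/* Hypothetical example macro: swaps two ints via a block-local temporary.
 * Note that there is no trailing semicolon in the #define. */
#define SWAP_INT(a, b) \
    do { int swap_tmp_ = (a); (a) = (b); (b) = swap_tmp_; } while (0)

/* Puts *lo and *hi in ascending order; returns 1 if it had to swap. */
int order_pair(int *lo, int *hi)
{
    if (*lo > *hi)
        SWAP_INT(*lo, *hi);   /* the ';' completes the do-while statement */
    else
        return 0;
    return 1;
}
```

With a bare `{ ... }' as replacement text the same invocation would
expand to `{ ... };', and the compiler would reject the `else' because
it no longer belongs to the `if'.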

There is one reason why the second form is preferable. Consider the
case where you inadvertently forget to supply the trailing semicolon.

	multi_stmt_macro1() /* ; forgotten here */
	x = y;

The following statement (here: x = y;) becomes the else branch of the
macro's `if (1)' and will *never* be executed (some compilers and lint
may warn you, but if you prefer `super-silent' compilations - or
lint-ations - you have possibly put some fake stuff in or around your
macro so that you still get no message). On the other hand
	multi_stmt_macro2() /* ; forgotten here */
	x = y;

gives you what you deserve: A syntax error.
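
The difference can be seen in a small sketch. BUMP1 and BUMP2 are
hypothetical macros written in the first and second form; the function
leaves out the semicolon after BUMP1 on purpose.

```c
#define BUMP1(n)  if (1) { (n)++; } else
#define BUMP2(n)  do { (n)++; } while (0)

/* Returns x after the "forgotten semicolon" accident with the first form. */
int forgotten_semicolon(int *count)
{
    int x = 0, y = 5;

    BUMP1(*count)   /* ; forgotten here */
    x = y;          /* now the else branch of if (1): never executed */

    /* BUMP2(*count) without its ';' in the same place would not compile. */
    BUMP2(*count);  /* correct use, for comparison */

    return x;       /* still 0 */
}
```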

The above way to define a macro is the "most function-like" I know,
but it still has some drawbacks. The most important is that the trick
is not self-explanatory, so you should DESCRIBE THE TRICK IN A COMMENT,
so that the poor programmer who has to maintain your code one day will
not have too hard a time understanding what you have done.

Because of that I prefer using the trick in two steps, as the following
code excerpt shows:

/*
 * The following definitions of BMAC and EMAC are supplied for
 * writing multi-statement macros that resemble true functions
 * as closely as possible. Note that the flow-control statements
 * here have the sole purpose of requiring a trailing semicolon
 * after the invocation!
*/
#define BMAC do{
#define EMAC }while(0)
/*
 * Use BMAC to start the definition of a multi statement macro.
 * Use EMAC to end the definition. Here is an example:
 *
 * #define MULTI_STMT_MACRO(macro_parameters) BMAC\
 *	stmt;\
 *	stmt;\
 * EMAC
*/

................ many lines
................ may be here

#define A_MACRO() BMAC .............. EMAC

................. again many lines

#define B_MACRO(a,b) BMAC\
	............\
	............\
EMAC

The big advantage here comes when a later maintainer reads the code.
Generally, maintenance-reading may start anywhere in the middle, and
the programmer will eventually need to look up what A_MACRO really does.
After locating the #define A_MACRO, he or she finds two other things
that are probably not known: BMAC and EMAC. Looking these two up leads
straight to the comment which explains their purpose. So you need only
*one* place for your explanations, even if you use the
trick several times. Furthermore, the example in the comment near the
definition of BMAC and EMAC will help a maintainer who reads the code
"top down" and encounters BMAC and EMAC before he or she has seen the
actual usage.
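
To make this concrete, here is a small sketch in which two macros share
the one commented BMAC/EMAC pair. ZERO_PAIR, DIVMOD and digit_sum are
invented for the illustration.

```c
/* See the explanatory comment at the definition of BMAC and EMAC. */
#define BMAC do{
#define EMAC }while(0)

/* Hypothetical one-line block macro. */
#define ZERO_PAIR(p, q) BMAC (p) = 0; (q) = 0; EMAC

/* Hypothetical multi-line block macro: quotient and remainder at once. */
#define DIVMOD(n, d, quot, rem) BMAC\
    (quot) = (n) / (d);\
    (rem) = (n) % (d);\
EMAC

/* Sums the decimal digits of a non-negative n. */
int digit_sum(int n)
{
    int sum = 0, q, r;

    ZERO_PAIR(q, r);
    while (n > 0) {
        DIVMOD(n, 10, q, r);   /* n and 10 are each expanded twice */
        sum += r;
        n = q;
    }
    return sum;
}
```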

Finally some words to the wise: Such block macros are a kind of
"poor man's inline function". They have all the disadvantages of
inline code (more space, no recursion, ... - of course they have
advantages too, otherwise you would not use them) *plus* the common
disadvantages of macros vs. functions. Besides the fact that they
cannot return a value, the important pitfalls are:

1) Macro arguments you write more than once in the replacement text
   are expanded more than once, so side effects may take place
   more than once, WHICH IS NOT AT ALL VISIBLE FROM THE INVOCATION!
   (We just tried to make the invocation look like a function.)
2) If you need local variables, you must avoid giving them the same
   names as anything that appears in the arguments of an invocation.
   (See the PRINTARRAY example below.)
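
Both pitfalls can be reproduced with a short sketch. ADD_TWICE and
SUM_TO are hypothetical macros written only to show the effect; the
three small functions report what actually happens.

```c
#define BMAC do{
#define EMAC }while(0)

/* Pitfall 1: 'v' appears twice in the replacement text. */
#define ADD_TWICE(sum, v) BMAC (sum) += (v); (sum) += (v); EMAC

/* Pitfall 2: the macro has its own locals named 'total' and 'k'. */
#define SUM_TO(out, limit) BMAC\
    int total = 0, k;\
    for (k = 1; k <= (limit); k++) total += k;\
    (out) = total;\
EMAC

/* Returns i afterwards: 5, not the 4 a real function call would leave. */
int twice_incremented(int *sum)
{
    int i = 3;

    ADD_TWICE(*sum, i++);   /* (*sum) += (i++); (*sum) += (i++); */
    return i;
}

/* Works: 'result' does not clash with the macro's locals. */
int sum_ok(void)
{
    int result = -1;

    SUM_TO(result, 4);      /* 1 + 2 + 3 + 4 */
    return result;
}

/* Fails silently: the macro's own 'total' hides the caller's 'total',
 * so the final assignment never reaches the caller's variable. */
int sum_captured(void)
{
    int total = -1;

    SUM_TO(total, 4);
    return total;           /* unchanged */
}
```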

As I already recommended, such macros should not be used very often.
In fact, I can see only two places where they sometimes may be
desirable:

a) You have determined with some profiling tool that the overhead
   for a certain function call is too expensive compared to the
   work the function does and you want to "inline" the function
   code as painlessly as possible (or maybe choose between a true
   function and inlining with a #define option at compile time).
b) You have a common algorithm which you want to make "type independent".

In both cases you should make sure that every invocation avoids the
pitfalls described under 1) and 2). I usually choose
upper-case-only names for such pseudo-functions. An at least moderately
experienced later maintainer will be warned by that, as C functions
traditionally use lower-case-only or mixed-case names.
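
For case a), choosing between a true function and inlining at compile
time might look like the following sketch. CLAMP_INLINE is a
hypothetical option name (e.g. set with -DCLAMP_INLINE), and CLAMP_INT,
clamp_int and clamp_percent are invented for the example.

```c
#define BMAC do{
#define EMAC }while(0)

#ifdef CLAMP_INLINE     /* hypothetical compile-time option */
/* Inlined version: x, lo and hi may each be evaluated more than once. */
#define CLAMP_INT(x, lo, hi) BMAC\
    if ((x) < (lo)) (x) = (lo);\
    else if ((x) > (hi)) (x) = (hi);\
EMAC
#else
/* True function version, used when the option is not defined. */
static void clamp_int(int *x, int lo, int hi)
{
    if (*x < lo)
        *x = lo;
    else if (*x > hi)
        *x = hi;
}
#define CLAMP_INT(x, lo, hi) clamp_int(&(x), (lo), (hi))
#endif

/* The invocation is the same whichever version was selected. */
int clamp_percent(int v)
{
    CLAMP_INT(v, 0, 100);
    return v;
}
```

The upper-case name stays the same in both variants, so the warning to
the maintainer is there even when the function version is compiled in.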

Finally I have a `real life' example for b) - hopefully without typos :-)

#define PRINTARRAY(fmt, npl, arr) BMAC\
int _i_ = 0;\
while (_i_ < sizeof arr / sizeof arr[0]) {\
	printf(fmt, arr[_i_]);\
	putchar((++_i_ % npl) ? ' ' : '\n');\
}\
EMAC

With this macro you can print any(%) array "arr" as long as it consists
of elements for which you can give a printf-format specifier "fmt".
You get "npl" elements per line (separated by a blank). The only problem
with this is that you can NOT print an array named "_i_". (Of course
this is the reason why I gave the loop counter this strange name and
didn't call it "i" or something similarly common.)

%: Note that with `T a[100][200]' and `struct { T b[100]; } s'
PRINTARRAY still works for `a[33]' and `s.b', but that it fails
in more complex cases, e.g. with `T z[100]' for `z + 1', which is
a valid C expression, but produces a compile-time error when it
expands to `sizeof z + 1 / sizeof z + 1[0]'. The error could be
avoided by enclosing `arr' in parentheses - as is generally advisable
in macro replacement text - but in the above case I preferred the
failure over getting completely meaningless results (meaningless
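for my purpose). Of course, a clever programmer might specify `1 + z',
but that expands to `sizeof 1 + z / sizeof 1 + z[0]', which a standard
compiler should reject as well (it divides the pointer `z' by an
integer); any compiler that does accept it will fool PRINTARRAY badly.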
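
The working cases of this footnote can be checked with a sketch that
isolates the element-count expression. NELEMS is a hypothetical helper
with the same unparenthesized form PRINTARRAY uses, and `int' merely
stands in for the element type T.

```c
#include <stddef.h>

typedef int T;      /* assumption: any element type would do */

/* Same unparenthesized expression as in PRINTARRAY's loop condition. */
#define NELEMS(arr) (sizeof arr / sizeof arr[0])

T a[100][200];
struct { T b[100]; } s;
T z[100];

/* sizeof a[33] / sizeof a[33][0]: elements in one row of a */
size_t row_count(void)
{
    return NELEMS(a[33]);
}

/* sizeof s.b / sizeof s.b[0]: elements in the member array */
size_t member_count(void)
{
    return NELEMS(s.b);
}

/* NELEMS(z + 1) would expand to (sizeof z + 1 / sizeof z + 1[0]) and
 * is rejected by the compiler, because 1[0] subscripts an integer
 * with an integer. */
```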
-- 
Martin Weitzel, email: martin at mwtech.UUCP, voice: 49-(0)6151-6 56 83