Import variables into awk.

Arnold D. Robbins {EUCC} arnold at mathcs.emory.edu
Fri Nov 17 11:00:53 AEST 1989


OK. Hopefully this is the definitive word on how things work.

V7 awk (old awk, /usr/bin/awk on Suns and other 4.3 based machines)

	awk '....' a=1 b=2 file c=3 file

	a is set to 1, b to 2, then the files are read and no more
	assignments are done.  This feature was undocumented.  On my Sun,
	the values of a and b are NOT available in the BEGIN block.
	After the first file is read c gets set to 3. Then the next
	one is read.

S5R3.n, n >= 1 nawk (new awk)

	awk '....' a=1 b=2 file c=3 file

	a is set to 1, b to 2, and those values ARE available in the
	BEGIN block.  Then the first file is read, then c is set to 3,
	then the second file is read. The value of c is NOT set in the
	BEGIN block.

	There are inconsistencies here, since conceptually the assignments
	are done when it goes to do a file open, and it "notices" that it's
	really a variable assignment.  But a and b are assigned before
	any program execution begins, while files aren't opened until
	after the BEGIN block has been run.  Note that the assignment of
	c is done correctly, after the BEGIN block.

GNU Awk 2.11 and S5R4 nawk

	awk -v z=26 '....' a=1 b=2 file c=3 file

	z is set to 26 before the BEGIN block is executed.  Then
	the BEGIN block is run. a is set to 1, b to 2, the first file
	is opened and processed, then c is set to 3, and then the
	second file is processed.
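
	The ordering above can be checked with a small sketch like the
	following.  It assumes an awk that supports -v and follows the
	behavior just described (current gawk, mawk and nawk should all
	qualify) and a system that has mktemp.  The file names f1 and f2
	and their contents are made up for the illustration.

```shell
# Illustrative only: f1/f2 and their one-line contents are invented here.
tmp=$(mktemp -d)
printf 'one\n' > "$tmp/f1"
printf 'two\n' > "$tmp/f2"

# z comes from -v, so it is visible in BEGIN.
# a and b are set after BEGIN, just before f1 is opened.
# c is set only after f1 has been read, just before f2 is opened.
out=$(awk -v z=26 '
BEGIN { printf "BEGIN: z=%s a=[%s] c=[%s]\n", z, a, c }
      { printf "%s: a=%s b=%s c=[%s]\n", $0, a, b, c }
' a=1 b=2 "$tmp/f1" c=3 "$tmp/f2")

printf '%s\n' "$out"
rm -rf "$tmp"
```

	With such an awk, the BEGIN line should show z=26 with a and c
	empty, the line for "one" should show a=1 b=2 with c still empty,
	and the line for "two" should show c=3.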

Unfortunately, people had come to rely on the way nawk did assignments
before the BEGIN block was run.  Yet that behavior was inconsistent.
So, to have our cake and eat it too, ALL assignments that appear where
file names are expected are done after the BEGIN block.  To make a
variable available in the BEGIN block, the new -v option was added.
You must supply a separate -v option for each variable to be assigned.

It is important to note that each normal assignment is done AT THE TIME
its argument would have been opened as a file; don't expect c to be set
while the first file is being processed.
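
One practical consequence of this timing is that a setting such as FS
can be changed between files.  The sketch below assumes an awk with the
behavior described above and a system that has mktemp; the file names
"colon" and "comma" and their contents are invented for the example.

```shell
# Illustrative only: the two data files are invented for this sketch.
tmp=$(mktemp -d)
printf 'a:b\n' > "$tmp/colon"
printf 'c,d\n' > "$tmp/comma"

# FS=: takes effect just before "colon" is opened,
# FS=, takes effect just before "comma" is opened.
out=$(awk '{ print $2 }' FS=: "$tmp/colon" FS=, "$tmp/comma")

printf '%s\n' "$out"
rm -rf "$tmp"
```

Each file is split with the separator assigned immediately before it,
so the second field of each line (b, then d) should be printed.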

This is something that took some discussion and hammering out between
the GNU people (me and David Trueman), Brian Kernighan at Bell Labs
(and Al Aho through him), and Randall Howard at MKS.

In fact, when Brian first changed his awk to be consistent he got the loudest
complaints about needing variable assignments to happen before the BEGIN 
block was run (Hi Tom!).  Adding a command line option was the best compromise
we could come up with -- the text of the awk program does not change,
just the command line to invoke it, and everyone felt that while it
wasn't particularly pretty, we could all live with it.

(I mentioned the S5R4 awk above; I can't promise this, but I do know that
Brian has made his version of awk, which works as described above, available
to them for inclusion in S5R4.  Perhaps someone doing S5R4 at AT&T can
let us know if it made it in.  He also should have gotten his version
to the toolchest, but I don't know about that for sure either.)

GNU Awk 2.11.1 (version 2.11 at patchlevel 1) has been sent to
comp.sources.unix and should be appearing there shortly. Some version
of GNU Awk will be in 4.4 BSD, when that comes out.

***

There is the separate question, "what if I have a filename with an `=' in it?"
The short answer is "don't do that".  It should perhaps be possible to come
up with a simple and consistent rule.  I don't know what that rule is
right now though, since we haven't given it a lot of thought yet.  But
I suspect you can look for a change in gawk 2.12 to address this.
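
In the meantime, one workaround is to name the file so that the argument
does not look like "variable=value", for example by putting ./ in front
of it.  This is only a sketch, and it assumes an awk that treats an
argument as an assignment only when it begins with a variable name
followed by `=' (the rule POSIX later wrote down).  The file name x=1
and its contents are made up for the example.

```shell
# Illustrative only: a file literally named "x=1" is created in a temp dir.
tmp=$(mktemp -d)
printf 'hello\n' > "$tmp/x=1"

# "./x=1" begins with "." rather than a variable name, so awk is
# expected to open it as a file instead of assigning 1 to x.
out=$(cd "$tmp" && awk '{ print FILENAME ": " $0 }' ./x=1)

printf '%s\n' "$out"
rm -rf "$tmp"
```

If the argument is taken as a file, this should print "./x=1: hello".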

Any more questions, class? :-)
-- 
Arnold Robbins -- guest account at Emory Math/CS	| Laundry increases
DOMAIN: arnold at emory.mathcs.emory.edu			| exponentially in the
UUCP: gatech!emory!arnold  PHONE: +1 404 636-7221	| number of children.
BITNET: arnold at emory	   				| -- Miriam Hartholz


