self-printing C programs

utzoo!decvax!ittvax!swatt
Wed Nov 3 22:18:40 AEST 1982


Regarding ihuxr!lew's challenge to write a self-printing program
in canonical C style:


:::::::::::::::
introspect.c:
:::::::::::::::
	char *lines[] ={
		"char *lines[] ={",
		"	0",
		"};",
		"",
		"#define BSLASH	'\\'",
		"#define NL	'\\n'",
		"#define DQUOTE	'\"'",
		"#define HT	'\\t'",
		"#define SQUOTE	'''",
		"#define EOS	'\\0'",
		"main()",
		"{",
		"	register char **lp;",
		"",
		"	puts (lines[0]);",
		"	for (lp = lines; *lp; )",
		"		putq (*lp++);",
		"	puts (lines[1]);",
		"	puts (lines[2]);",
		"	for (lp = &lines[3]; *lp; )",
		"		puts (*lp++);",
		"}",
		"",
		"putq (s)",
		"char	*s;",
		"{",
		"	putchar (HT);",
		"	putchar (DQUOTE);",
		"	for (; *s != EOS; s++) {",
		"		if (*s == BSLASH || *s == DQUOTE)",
		"			putchar (BSLASH);",
		"		putchar (*s);",
		"	}",
		"	putchar (DQUOTE);",
		"	putchar (',');",
		"	putchar (NL);",
		"}",
		"",
		"puts (s)",
		"char	*s;",
		"{",
		"	for (; *s != EOS; s++) {",
		"		if ((*s == BSLASH || *s == SQUOTE)",
		"		    && (s[-1] == SQUOTE && s[1] == SQUOTE))",
		"			putchar (BSLASH);",
		"		putchar (*s);",
		"	}",
		"	putchar (NL);",
		"}",
		0
	};

	#define BSLASH	'\\'
	#define NL	'\n'
	#define DQUOTE	'"'
	#define HT	'\t'
	#define SQUOTE	'\''
	#define EOS	'\0'
	main()
	{
		register char **lp;

		puts (lines[0]);
		for (lp = lines; *lp; )
			putq (*lp++);
		puts (lines[1]);
		puts (lines[2]);
		for (lp = &lines[3]; *lp; )
			puts (*lp++);
	}

	putq (s)
	char	*s;
	{
		putchar (HT);
		putchar (DQUOTE);
		for (; *s != EOS; s++) {
			if (*s == BSLASH || *s == DQUOTE)
				putchar (BSLASH);
			putchar (*s);
		}
		putchar (DQUOTE);
		putchar (',');
		putchar (NL);
	}

	puts (s)
	char	*s;
	{
		for (; *s != EOS; s++) {
			if ((*s == BSLASH || *s == SQUOTE)
			    && (s[-1] == SQUOTE && s[1] == SQUOTE))
				putchar (BSLASH);
			putchar (*s);
		}
		putchar (NL);
	}
:::::::::::::::

This one uses nothing but "putchar" to do output -- no "printf"
nonsense.  It also uses approved C definitions for character
constants.  The only parts of the "lines" table that require any
special editing are in the first 9 lines; all the rest can be produced
by running "sed" commands on the program proper.
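
The "sed" commands themselves are not given above. As a rough sketch of
the same mechanical step, written in C rather than sed, the helper below
turns one line of program text into a table entry. The name
"quote_line" and its buffer interface are made up for illustration and
are not part of introspect.c.

```c
#include <stddef.h>

/*
 * Hypothetical helper (an assumption, not from introspect.c): write one
 * source line into `out` as a table entry, i.e. a tab, an opening double
 * quote, the text with every backslash and double quote preceded by a
 * backslash, a closing double quote and a comma.
 * Returns the number of characters written (excluding the NUL), or 0 if
 * the buffer of size `n` is too small.
 */
size_t quote_line(const char *s, char *out, size_t n)
{
	size_t i = 0;

	if (n < 5)		/* tab, quote, quote, comma, NUL */
		return 0;
	out[i++] = '\t';
	out[i++] = '"';
	for (; *s != '\0'; s++) {
		size_t need = (*s == '\\' || *s == '"') ? 2 : 1;

		/* keep room for the closing quote, the comma and the NUL */
		if (i + need + 3 > n)
			return 0;
		if (need == 2)
			out[i++] = '\\';
		out[i++] = *s;
	}
	out[i++] = '"';
	out[i++] = ',';
	out[i] = '\0';
	return i;
}
```

Fed the DQUOTE definition from the program proper, this yields the
matching table entry unchanged.  Fed the BSLASH or SQUOTE definition it
does not, because "puts" in the program adds a backslash to a quoted
backslash or single quote on the way out, so the table has to hold those
two without it.  That is why those entries need editing by hand.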

When compiled, "a.out | diff - introspect.c" produces silence, as does
"cb <introspect.c | diff - introspect.c".

It uses no special knowledge of ASCII, so it should work on any machine
that supports C and has an equivalent of "putchar".  It is 98 lines,
using a reasonable C source formatting convention (please no flames on
that!).

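The interesting thing about this approach is that for every line of code
you have, there will be one line of data, plus 6 (three for the table
pro/epi-logue, and three more for the string versions of the same).
Here that is 46 lines of code after the table (blank lines included),
46 matching lines of data, and the extra 6: 98 in all.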

Given typical C formatting conventions, you have to be able to produce
the special characters '\n' and '\t'.  Given C string conventions, you
also have to be able to produce '"', '\\', and '\''.  Absent simply
using integer constants, I can't think of a way to do this without a
routine to reconstruct a string as the compiler would have seen it.
The minimum size gambit therefore boils down to how compact a function
you can write to reconstruct C strings for these 6 characters.
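
For comparison, one compact way to get the source spelling of such a
character is a plain lookup.  This is only a sketch of the idea and not
what introspect.c does (its "puts" decides from the neighbouring quotes
instead); the name "char_constant_body" is invented here.

```c
#include <stddef.h>

/*
 * Hypothetical helper (an assumption, not from introspect.c): return the
 * text that goes between the single quotes of a C character constant for
 * character `c`, for the six characters the program defines.
 * Returns NULL for any other character.
 */
const char *char_constant_body(int c)
{
	switch (c) {
	case '\n': return "\\n";	/* backslash, n */
	case '\t': return "\\t";	/* backslash, t */
	case '\\': return "\\\\";	/* backslash, backslash */
	case '\'': return "\\'";	/* backslash, single quote */
	case '\0': return "\\0";	/* backslash, zero */
	case '"':  return "\"";		/* needs no escape inside '...' */
	default:   return NULL;
	}
}
```

The catch is that the lookup itself is written with the very escapes it
is meant to produce, so in a self-printing program it would have to be
reproduced through the same machinery.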

	Not ashamed to admit this took me 4 hours,

		- Alan S. Watt


