printf("%s", s) considered slow

Chris Torek chris at mimsy.umd.edu
Wed May 2 18:29:35 AEST 1990


>Chip Salzenberg <chip%tct at ateng.com> gripes:
>>Aargh!  Why do people use 'printf("%s", s)' when 'fputs(s, stdout)' is
>>faster on every C implementation known to humankind?  Gerkghd...

In article <E4?05k7 at cs.psu.edu> flee at shire.cs.psu.edu (Felix Lee) writes:
>Hmm.  Isn't fputs slower in some generation of BSD?  In fact, I'm
>pretty sure of it.

In 4.1 and 4.2 BSD, yes (and previous releases as well).  The old fputs()
code was, in essence,

	while ((c = *ptr++) != 0)
		if (putc(c, outf) == EOF)
			return EOF;
	return 0;

Since there are three (count 'em three :-/ ) different kinds of output
buffering, and since the C compiler could not tell that putc()---which
expanded to a horrible ?: expression including a call to _flsbuf()---
did not generally switch among them, this expanded to long and slow code
paths.
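
As a rough illustration of why that loop was expensive, here is a minimal sketch, not the 4.2BSD source. All names (toy_stream, toy_putc, toy_flsbuf, toy_fputs) are hypothetical, and only one kind of buffering is modelled; the real macro had to cope with all three. Every character written pays for the count test and the ?: branch.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical toy stream: a small buffer plus a count of free bytes. */
struct toy_stream {
    int cnt;            /* free bytes left in buf */
    char *ptr;          /* next free byte in buf */
    char buf[8];
    char out[64];       /* stands in for the underlying file */
    size_t outlen;
};

static void toy_init(struct toy_stream *s)
{
    s->cnt = (int)sizeof s->buf;
    s->ptr = s->buf;
    s->outlen = 0;
}

/* Drain buf into out.  Returns 0, or EOF if out has no room. */
static int toy_flush(struct toy_stream *s)
{
    size_t n = (size_t)(s->ptr - s->buf);

    if (s->outlen + n > sizeof s->out)
        return EOF;
    memcpy(s->out + s->outlen, s->buf, n);
    s->outlen += n;
    s->ptr = s->buf;
    s->cnt = (int)sizeof s->buf;
    return 0;
}

/* Slow path: the buffer is full, so drain it, then store c. */
static int toy_flsbuf(int c, struct toy_stream *s)
{
    if (toy_flush(s) == EOF)
        return EOF;
    s->cnt--;
    return (unsigned char)(*s->ptr++ = (char)c);
}

/* Fast path inline, slow path by function call, glued with ?:. */
#define toy_putc(c, s) \
    (--(s)->cnt >= 0 ? (int)(unsigned char)(*(s)->ptr++ = (char)(c)) \
                     : toy_flsbuf((c), (s)))

/* The old-style fputs loop: one macro expansion per character. */
static int toy_fputs(const char *p, struct toy_stream *s)
{
    int c;

    while ((c = (unsigned char)*p++) != 0)
        if (toy_putc(c, s) == EOF)
            return EOF;
    return 0;
}
```

Calling toy_init() on a toy_stream, then toy_fputs("hello, world", &s), then toy_flush(&s) leaves the twelve characters in s.out, after twelve separate trips through the macro.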

>The last time I looked at fputs it used putc wrapped in a while loop,
>which is a lot of needless work inside the loop---using strlen and
>fwrite is almost certainly faster.

Unfortunately, in those same implementations fwrite() itself was just
another loop around putc().

This Has Been Fixed (whether in 4.3BSD or only in 4.3-tahoe, I cannot
recall).
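
For reference, the strlen-plus-fwrite approach Felix describes might look like the sketch below. The helper name put_string is hypothetical. It is only a win where fwrite() moves whole blocks; on a library whose fwrite() loops over putc(), it buys nothing.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: hand the whole string to fwrite as one block.
 * Returns 0 on success, EOF on a short write. */
static int put_string(const char *s, FILE *fp)
{
    size_t len = strlen(s);

    if (len == 0)
        return 0;       /* nothing to write */
    return fwrite(s, 1, len, fp) == len ? 0 : EOF;
}
```

A call such as put_string(s, stdout) then stands in for fputs(s, stdout).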

>And _doprnt (printf) on the VAX is hand-coded assembly that avoids the
>needless work.  Well, I think it does; I never looked at it closely.

I did; it does, and it does not.  It was not terribly well coded, but
it did move characters generally more efficiently than did the old
fputs and fwrite.

>Anyway, an optimal printf("%s",s) is maybe a dozen more machine
>instructions than an optimal fputs(s,stdout).  Unless you're printing
>thousands of strings, I can't see it making a significant difference.
>But then, I doubt that anyone has an optimal stdio library . . .

I am trying.  I think I am getting there.  printf() will still be
a fair amount slower, however.  Among other things, fputs expands to
the sequence

	call strlen.
	set up single output vector.
	call __sfvwrite:
		if nothing to write, return early.
		check to make sure this stream can be written; if
		    not, return an error.
		if unbuffered: for each vector, write it.
		if fully buffered: for each vector, append it to the
		    partly-full buffer, or write one block directly,
		    or put the remaining less-than-a-block into the
		    buffer, whichever applies best; write the buffer
		    as necessary when it becomes full.
		if line buffered: for each vector, act as on fully
		    buffered files, but stop at newlines and do
		    fflush() as necessary to cause complete lines
		    to get emitted as they occur.
		return any error code.
	return any error code.
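
The fputs sequence above can be sketched roughly as follows. This is a toy model, not the real library code: every name (toy_iov, toy_uio, toy_file, toy_write, toy_sfvwrite, toy_fputs) is hypothetical, the "file" is a memory array, and line buffering is left out to keep it short.

```c
#include <stdio.h>
#include <string.h>

struct toy_iov { const char *base; size_t len; };   /* one output vector */
struct toy_uio { struct toy_iov *iov; int iovcnt; size_t resid; };

enum toy_mode { TOY_UNBUF, TOY_FULLBUF };

struct toy_file {
    enum toy_mode mode;
    int writable;
    char buf[8];        /* the stdio buffer (one "block") */
    size_t used;
    char out[128];      /* stands in for the file itself */
    size_t outlen;
    int nwrites;        /* how many times toy_write ran */
};

/* Stand-in for the write system call. */
static int toy_write(struct toy_file *f, const char *p, size_t n)
{
    if (f->outlen + n > sizeof f->out)
        return EOF;
    memcpy(f->out + f->outlen, p, n);
    f->outlen += n;
    f->nwrites++;
    return 0;
}

static int toy_flush(struct toy_file *f)
{
    size_t n = f->used;

    f->used = 0;
    return n ? toy_write(f, f->buf, n) : 0;
}

static int toy_sfvwrite(struct toy_file *f, struct toy_uio *uio)
{
    int i;

    if (uio->resid == 0)
        return 0;                   /* nothing to write */
    if (!f->writable)
        return EOF;                 /* stream cannot be written */
    for (i = 0; i < uio->iovcnt; i++) {
        const char *p = uio->iov[i].base;
        size_t n = uio->iov[i].len;

        while (n > 0) {
            size_t room = sizeof f->buf - f->used, w;

            if (f->mode == TOY_UNBUF ||
                (f->used == 0 && n >= sizeof f->buf)) {
                /* unbuffered, or a full block with an empty
                 * buffer: write directly, skipping the copy */
                w = f->mode == TOY_UNBUF ? n : sizeof f->buf;
                if (toy_write(f, p, w) == EOF)
                    return EOF;
            } else {
                /* append to the buffer; flush once it fills */
                w = n < room ? n : room;
                memcpy(f->buf + f->used, p, w);
                f->used += w;
                if (f->used == sizeof f->buf && toy_flush(f) == EOF)
                    return EOF;
            }
            p += w;
            n -= w;
        }
    }
    uio->resid = 0;
    return 0;
}

static int toy_fputs(const char *s, struct toy_file *f)
{
    struct toy_iov iov = { s, strlen(s) };      /* call strlen */
    struct toy_uio uio = { &iov, 1, iov.len };  /* single vector */

    return toy_sfvwrite(f, &uio);
}
```

With an 8-byte buffer, writing "abc" and then a 20-character string to a fully buffered toy_file, followed by toy_flush(), takes three calls to toy_write: one to drain the filled buffer, one direct block, and one for the 7-byte tail.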

while printf("%s") now expands to the sequence

	set up varargs stuff.
	call vfprintf:
		check to make sure this stream can be written.
		check to see if this is an unbuffered Unix output
		    stream, in which case it should be `optimised'
		    (via a secondary function that uses a temporary
		    buffer); here it is not.
		set up return value and io vector information.
		examine format, note %, stop examining format.
		make a vector out of any text leading up to the % (here
		    none, hence no vector).
		set defaults for precision, field width, etc.
		switch on format, case is 's'.
		undo any sign flag ("%+s" should not print a sign).
		check precision; since it is unspecified, call strlen
		    (if precision given, call memchr instead).
		break from case (go to common field-output code).
		check for prefixes (sign, leading blanks or 0s, etc),
		    setting up output vectors for them (none here).
		set up output vector for the field itself.
		check for suffixes (trailing zeros/blanks/etc),
		    setting up output vectors for them (none here).
		call __sprint:
			call __sfvwrite:
				__sfvwrite acts just as before.
			reset io vector information.
			return any error indication.
		if error, stop (but no error here).
		examine format, note '\0', stop examining format.
		make a vector out of any text between the last %s and
		    this point (here none, hence no vector).
		since there are no vectors, do not bother with __sprint.
		return the number of characters written.
	return what vfprintf returned.

(At a minimum, this seems like `about sixteen more things' than necessary
in fputs()---a bit more than Felix's 12, anyway.)
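
To make the extra steps concrete, here is a stripped-down sketch of the printf path. It is an assumption-laden toy, not the real vfprintf: it understands only literal text and a bare %s (no flags, width, or precision), writes into a memory buffer, and all names (toy_out, toy_iov, toy_sfvwrite, toy_sprint, toy_vfprintf, toy_printf) are hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define TOY_NIOV 8

struct toy_iov { const char *base; size_t len; };

struct toy_out {            /* stands in for the stream */
    char buf[256];
    size_t len;
};

/* Stand-in for __sfvwrite: append every vector to the output. */
static int toy_sfvwrite(struct toy_out *o, const struct toy_iov *iov, int n)
{
    int i;

    for (i = 0; i < n; i++) {
        if (o->len + iov[i].len > sizeof o->buf)
            return EOF;
        memcpy(o->buf + o->len, iov[i].base, iov[i].len);
        o->len += iov[i].len;
    }
    return 0;
}

/* Stand-in for __sprint: write pending vectors, then reset the count.
 * With no vectors pending, the write is skipped entirely. */
static int toy_sprint(struct toy_out *o, struct toy_iov *iov, int *n)
{
    int err = *n > 0 ? toy_sfvwrite(o, iov, *n) : 0;

    *n = 0;
    return err;
}

/* Returns the number of characters written, or -1 on error
 * (including any conversion other than a bare %s). */
static int toy_vfprintf(struct toy_out *o, const char *fmt, va_list ap)
{
    struct toy_iov iov[TOY_NIOV];
    int n = 0, ret = 0;

    for (;;) {
        const char *start = fmt;
        const char *s;

        while (*fmt != '\0' && *fmt != '%')     /* examine format */
            fmt++;
        if (fmt != start) {                     /* text before the % */
            iov[n].base = start;
            iov[n].len = (size_t)(fmt - start);
            ret += (int)iov[n++].len;
        }
        if (*fmt == '\0')
            break;
        fmt++;                                  /* skip the '%' */
        if (*fmt++ != 's')
            return -1;
        s = va_arg(ap, char *);
        iov[n].base = s;                        /* the field itself */
        iov[n].len = strlen(s);                 /* no precision: strlen */
        ret += (int)iov[n++].len;
        if (toy_sprint(o, iov, &n) != 0)
            return -1;
    }
    if (toy_sprint(o, iov, &n) != 0)            /* trailing text, if any */
        return -1;
    return ret;
}

static int toy_printf(struct toy_out *o, const char *fmt, ...)
{
    va_list ap;
    int ret;

    va_start(ap, fmt);                          /* set up varargs stuff */
    ret = toy_vfprintf(o, fmt, ap);
    va_end(ap);
    return ret;
}
```

Even in this reduced form, toy_printf(&o, "%s", "hello") scans the format twice and passes through two extra layers before reaching the same vector write that the fputs path reaches almost immediately.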

vfprintf() is a terribly complicated function, what with all the
different formats and flags allowed.  (And people complain that it does
not do enough!  If your printf correctly handles formats like
"%#0-1000g" and "%.400f", consider yourself lucky.)  All the
`io vector' goo above is a great simplification of the previous
mess (even if it does involve testing for newlines that can be
guaranteed not to be present---this could perhaps be improved).
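
One way to check how a given library copes with formats like those is to ask how long the output would be. The sketch below assumes a C99 snprintf(), which postdates this article; the helper name formatted_length is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: the number of characters fmt would produce
 * for x, without storing any of them (C99 snprintf with size 0). */
static int formatted_length(const char *fmt, double x)
{
    return snprintf(NULL, 0, fmt, x);
}
```

A conforming library gives 402 for formatted_length("%.400f", 1.0) (a digit, a point, and 400 zeros) and 1000 for formatted_length("%#0-1000g", 1.0), where the field is left-justified and padded out to the full width.
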
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris at cs.umd.edu	Path:	uunet!mimsy!chris


