Are 3B1 "pipes" really slower than molasses?

Thad P Floryan thad at cup.portal.com
Tue Nov 27 17:26:14 AEST 1990


Yet another chapter in the ongoing saga of "Don't shoe-shine MY data!"

While investigating why the tape backup operation on the 3B1 is so s-l-o-w,
even with double-buffering techniques, I finally pinpointed what appears to
be the cause: PIPES.  Pipes are used to transfer data to "tapecpio" in all
the supplied shell scripts, and pipes are typically used to pass data from
a "find" (e.g. "find * -print | cpio -oc > whatever").

"Piping" was the ONLY thing in common with all my testing, so I decided to
instrument some pipe runs and see what gives.  Seems the 3B1 pipes leak bits
out into the Great Bit Bucket or sumtin'.  This is the first time I've ever
had something "bad" to say about the 3B1.  And this "problem" affects more
than just backups; it affects ANYTHING using pipes, so this should be of
interest to you no matter what system you're using.

Specifically: the BEST performance observed is approx. 35 KBytes/Second between
two processes which are piped together.  Adding a third process to the
pipeline roughly halves that, to 18-19 KBytes/Second.  I tested 4 UNIXPC
systems, ranging from 4MB RAM/85MB HD to 1MB RAM/10MB HD, and the results are
all in the same ballpark: 35-36 KBytes per second.

Perhaps there's something I'm just not seeing, or perhaps some "ktune" params
are not obvious.  I'm working on the assumption that "pipes" are a performance
bottleneck on the UNIXPC and so I went and grabbed some tape utils from site
wsmr-simtel20.army.mil to see if a non-piped tape backup/restore program can
improve performance.  This will take some time to check out, so in the meantime
here are two things I'm asking:

1)	Enclosed are my test programs, a Makefile, and a shell-script to
	exercise the tests.  Try them on your system.  If the results are
	substantially different, please post them along with your present
	"ktune" parameters (you get these by: "su; ktune -d").  By results
	"substantially different" I mean you're getting 200 KBytes/Sec or
	something else radically different from my results (below).

2)	If you know of ways to improve pipe performance, please post them.
	I don't recall any discussion of this "problem" in this newsgroup
	before, so maybe I've opened a new "can-of-worms" here; wouldn't be
	the first time and definitely won't be the last!  :-)

Enclosed with this posting is a "shar" of my test suite.  You may need to
change the "gcc" in the Makefile to be "cc", but I tried both with no change
in the observed performance.  If nothing else, you may find the timing code
in "recv.c" interesting.  To run the tests, do either:

	$ ./test.sh	(OR)	$ nohup ./test.sh &

That second form places its output in a file named "nohup.out".  In all cases,
the output will look something like:

	$ ./test.sh

	send  <n>  |  recv

	100000 characters received in 2.783 seconds for 35928 CPS
	200000 characters received in 5.833 seconds for 34285 CPS
	300000 characters received in 8.350 seconds for 35928 CPS
	400000 characters received in 11.933 seconds for 33519 CPS
	500000 characters received in 14.100 seconds for 35460 CPS
	1000000 characters received in 28.200 seconds for 35460 CPS

	send  <n>  |  pass  |  recv

	100000 characters received in 5.566 seconds for 17964 CPS
	200000 characters received in 10.333 seconds for 19354 CPS
	300000 characters received in 16.200 seconds for 18518 CPS
	400000 characters received in 21.200 seconds for 18867 CPS
	500000 characters received in 26.733 seconds for 18703 CPS
	1000000 characters received in 53.050 seconds for 18850 CPS

If you see any flaws in my testing techniques, I'd appreciate knowing about
them, too.  But I've checked this out quite thoroughly and I'm convinced that
what I'm seeing with the results (above) is the actual piping throughput.

The "ktune" parameters on my systems are (the comments are my annotations):

	# ktune -d
	nbuf 100	#number of system buffers for block devices
	ninode 400	#number of memory-resident inodes at one time
	nfile 300	#number of files open on system at one time
	nproc 100	#number of processes existing at one time
	ntext 75	#number of text structures allocated in kernel
	nclist 150	#number of clist buffers available
	npbuf 16	#number of buffer headers in the raw I/O pool
	ncall 32	#number of callouts allowed in the kernel
	nttyhog 1024	#number of chars in tty buffers before implicit flush

Some other systems I've already tested with the same suite (results are for
1,000,000 chars in each test, rounded to the nearest 1000 CPS):

	System                        OS          send|recv   send|pass|recv
	HP-9000/840 (Spectrum RISC)   HP-UX 3.01  240000 CPS  120000 CPS
	HP-9000/350 (Motorola 68030)  HP-UX 7.0   156000 CPS   85000 CPS

Thad

Thad Floryan [ thad at cup.portal.com (OR) ..!sun!portal!cup.portal.com!thad ]

---- Cut Here and unpack ----
#!/bin/sh
# This is a shell archive (shar 3.32)
# made 11/27/1990 05:18 UTC by thad at thadlabs
# Source directory /u/thad/Filecabinet/WORK/pipe-test
#
# existing files WILL be overwritten
#
# This shar contains:
# length  mode       name
# ------ ---------- ------------------------------------------
#    485 -rw-r--r-- Makefile
#    247 -rw-r--r-- pass.c
#    824 -rw-r--r-- recv.c
#    332 -rw-r--r-- send.c
#    411 -rwxr-xr-x test.sh
#
if touch 2>&1 | fgrep 'amc' > /dev/null
 then TOUCH=touch
 else TOUCH=true
fi
# ============= Makefile ==============
echo "x - extracting Makefile (Text)"
sed 's/^X//' << 'SHAR_EOF' > Makefile &&
X# 3B1 makefile for pipe speed testing
X#
XCC	=	gcc
XCFLAGS	=	-O
XLDFLAGS	=	-s
XLIBS	=	/lib/crt0s.o /lib/shlib.ifile
XNAME1	=	send
XOBJS1	=	send.o
XNAME2	=	recv
XOBJS2	=	recv.o
XNAME3	=	pass
XOBJS3	=	pass.o
X
Xall	:	$(NAME1) $(NAME2) $(NAME3)
X
X$(NAME1):	$(OBJS1)
X		$(LD) $(LDFLAGS) -o $(NAME1) $(OBJS1) $(LIBS)
X
X$(NAME2):	$(OBJS2)
X		$(LD) $(LDFLAGS) -o $(NAME2) $(OBJS2) $(LIBS)
X
X$(NAME3):	$(OBJS3)
X		$(LD) $(LDFLAGS) -o $(NAME3) $(OBJS3) $(LIBS)
X
Xclean	:
X		rm -f $(OBJS1) $(OBJS2) $(OBJS3) core *~
SHAR_EOF
$TOUCH -am 1126050290 Makefile &&
chmod 0644 Makefile ||
echo "restore of Makefile failed"
set `wc -c Makefile`;Wc_c=$1
if test "$Wc_c" != "485"; then
	echo original size 485, current size $Wc_c
fi
# ============= pass.c ==============
echo "x - extracting pass.c (Text)"
sed 's/^X//' << 'SHAR_EOF' > pass.c &&
X/*	pass.c
X *
X *	just passes/handoffs chars from stdin to stdout until EOF for testing
X *	the speed of pipes on the system.
X *
X *	Thad Floryan, 26-Nov-1990
X */
X
X#include <stdio.h>
X
Xmain()
X{
X	int	c;
X
X	while ( (c = getchar()) != EOF ) putchar(c);
X
X}
SHAR_EOF
$TOUCH -am 1126045490 pass.c &&
chmod 0644 pass.c ||
echo "restore of pass.c failed"
set `wc -c pass.c`;Wc_c=$1
if test "$Wc_c" != "247"; then
	echo original size 247, current size $Wc_c
fi
# ============= recv.c ==============
echo "x - extracting recv.c (Text)"
sed 's/^X//' << 'SHAR_EOF' > recv.c &&
X/*	recv.c
X *
X *	just receives chars from stdin until EOF for testing the speed
X *	of pipes on the system.
X *
X *	Thad Floryan, 26-Nov-1990
X */
X
X#include <stdio.h>
X#include <sys/param.h>		/* for def of HZ */
X#include <sys/types.h>
X#include <sys/times.h>
X
Xmain()
X{
X	extern long times();
X
X	long startime, endtime, elapsed;
X	struct tms timebuf;
X	long	numchrs = 0;
X
X	startime = times(&timebuf);	/* get start time in HZ units */
X
X	while ( getchar() != EOF ) ++numchrs;
X
X	endtime = times(&timebuf);	/* get completion time in HZ units */
X
X	if ( (elapsed = endtime - startime) != 0L )
X	{
X	    printf("%d characters received in %d.%03d seconds for %d CPS\n",
X		numchrs,
X		elapsed / HZ,
X		((elapsed % HZ) * 1000L) / HZ,
X		((numchrs * HZ) / elapsed ));
X	}
X	else
X	{
X	    printf("Insufficient timer resolution for supplied input\n");
X	}
X}
SHAR_EOF
$TOUCH -am 1126045390 recv.c &&
chmod 0644 recv.c ||
echo "restore of recv.c failed"
set `wc -c recv.c`;Wc_c=$1
if test "$Wc_c" != "824"; then
	echo original size 824, current size $Wc_c
fi
# ============= send.c ==============
echo "x - extracting send.c (Text)"
sed 's/^X//' << 'SHAR_EOF' > send.c &&
X/*	send.c
X *
X *	just sends argv[1] number of characters out for testing the speed
X *	of pipes on the system.
X *
X *	Thad Floryan, 26-Nov-1990
X */
X
X#include <stdio.h>
X
Xmain(argc, argv)
X	int	argc;
X	char	*argv[];
X{
X	long	numchrs;
X
X	numchrs = atol(argv[1]);	/* dismiss error checks for now */
X
X	while ( --numchrs >= 0L ) putchar('X');
X}
SHAR_EOF
$TOUCH -am 1126044190 send.c &&
chmod 0644 send.c ||
echo "restore of send.c failed"
set `wc -c send.c`;Wc_c=$1
if test "$Wc_c" != "332"; then
	echo original size 332, current size $Wc_c
fi
# ============= test.sh ==============
echo "x - extracting test.sh (Text)"
sed 's/^X//' << 'SHAR_EOF' > test.sh &&
Xecho "\nsend  <n>  |  recv\n"
X./send  100000 | ./recv
X./send  200000 | ./recv
X./send  300000 | ./recv
X./send  400000 | ./recv
X./send  500000 | ./recv
X./send 1000000 | ./recv
Xecho "\nsend  <n>  |  pass  |  recv\n"
X./send  100000 | ./pass | ./recv
X./send  200000 | ./pass | ./recv
X./send  300000 | ./pass | ./recv
X./send  400000 | ./pass | ./recv
X./send  500000 | ./pass | ./recv
X./send 1000000 | ./pass | ./recv
SHAR_EOF
$TOUCH -am 1126175890 test.sh &&
chmod 0755 test.sh ||
echo "restore of test.sh failed"
set `wc -c test.sh`;Wc_c=$1
if test "$Wc_c" != "411"; then
	echo original size 411, current size $Wc_c
fi
exit 0


