Has anyone played with speeding up I/O transfers?

Sat Dec 8 11:42:05 AEST 1984

I posted this article to net.bugs.2bsd yesterday and decided that
I should have posted it to net.unix-wizards as well.  Sorry for
the double posting, but I think much of this has to do with
unix systems in general.

I was just poking around my kernel and remembered that I/O under
2.9 unix is very cpu intensive for block special devices, especially
when running on a small cpu (I have an 11/23).  The routines that
copy data between the user address space and the local block buffers
are not exactly the most efficient.  For example, copyin looks like

	jsr	pc,copsu
1:
	mfpd	(r0)+
	mov	(sp)+,(r1)+
	sob	r2,1b
	br	2f

Note that two memory reads and two writes are done for each word
transferred (mfpd pushes the user's word onto the stack, then mov
pops it into the buffer), and that 3 instructions are needed for
each such transfer.  Now, I don't really like that.  There are
routines in mch.s that copy a click at a time, used for
forking in core and other such things.  I wonder if anyone has
either:
	1)	considered and implemented the changes to use these much
		faster routines for I/O instead of just for forks and such
or, better yet (by quite a bit I imagine)
	2)	considered actually using the DMA capabilities of the disk
		to access the entire address space of the computer (at
		least in my case; I do have 22 bit controllers),
		bypassing the transfer to kernel space completely and
		transferring straight into user memory.  Now I realize
		that this only works for whole-block transfers but, if one
		uses stdio, that is the only kind of transfer one
		normally does.  Besides, one could easily use the
		existing mechanisms to transfer partial blocks.
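
To make the whole-block versus partial-block split in (2) concrete, here
is a rough sketch in C of how a request might be carved up.  Everything
in it is hypothetical: the names xfer_plan and plan_transfer are made up
for illustration, and BLKSIZE is an assumed value standing in for whatever
block size the kernel really uses.

```c
/* Assumed block size; substitute the kernel's real value. */
#define BLKSIZE 512L

/* Hypothetical description of how one read/write request is divided. */
struct xfer_plan {
	long head;	/* leading partial-block bytes, via the buffered path */
	long nblocks;	/* whole blocks that could be DMAed to user memory */
	long tail;	/* trailing partial-block bytes, via the buffered path */
};

/*
 * Split a transfer of `count' bytes starting at byte offset `offset'
 * (assumed non-negative).  Only the block-aligned middle is a
 * candidate for direct DMA.
 */
struct xfer_plan
plan_transfer(long offset, long count)
{
	struct xfer_plan p = { 0, 0, 0 };
	long into = offset % BLKSIZE;	/* how far into a block we start */

	if (count <= 0)
		return p;
	if (into != 0) {
		/* Finish off the block we start in the middle of. */
		p.head = BLKSIZE - into;
		if (p.head > count)
			p.head = count;
		count -= p.head;
	}
	p.nblocks = count / BLKSIZE;
	p.tail = count % BLKSIZE;
	return p;
}
```

With stdio-sized requests that start on a block boundary, head and tail
come out zero and everything goes the direct route.  A real version would
presumably also have to check that the user's address is suitably aligned
for the controller.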

I suspect that either change (especially the second) would
significantly reduce the time for I/O transfers and reduce the
need for disk buffers a bit.
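
As a back-of-the-envelope check on the copying cost, here is a small C
model that counts data-memory accesses for the copyin-style loop against
a plain memory-to-memory copy.  It is only a sketch: the names are made
up, instruction fetches are ignored, and it assumes the faster routine
manages one read and one write per word, which I have not verified
against mch.s.

```c
/* Counts of data-memory accesses made while copying. */
struct mem_cost {
	long reads;
	long writes;
};

/*
 * Model of the copyin loop: each word is read from the source and
 * written to a stack temporary (mfpd), then read back from the stack
 * and written to the destination (mov).
 */
struct mem_cost
copy_via_stack(const unsigned short *src, unsigned short *dst, long nwords)
{
	struct mem_cost c = { 0, 0 };
	unsigned short tmp;	/* stands in for the stack slot */
	long i;

	for (i = 0; i < nwords; i++) {
		tmp = src[i];		/* mfpd: read user word, write stack */
		c.reads++;
		c.writes++;
		dst[i] = tmp;		/* mov: read stack, write buffer */
		c.reads++;
		c.writes++;
	}
	return c;
}

/* Model of a direct word copy: one read and one write per word. */
struct mem_cost
copy_direct(const unsigned short *src, unsigned short *dst, long nwords)
{
	struct mem_cost c = { 0, 0 };
	long i;

	for (i = 0; i < nwords; i++) {
		dst[i] = src[i];
		c.reads++;
		c.writes++;
	}
	return c;
}
```

For a 256-word block the first model makes 1024 data accesses and the
second 512.  With direct DMA the cpu would make none at all, which is
why I expect the second change to win by quite a bit.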

Some ideas on direct DMA:

	you could set up another array of block pointers that
	point out into user space.  The same mechanisms that
	access the current struct block array could access this
	new array with one exception:  the data would not
	be considered valid after a return to the user process
	takes place.

	This would make changes to the drivers unnecessary as
	they already know how to deal with buffers mapped all
	over the place (at least on my Q-bus system, all 22
	bits of address are available to the devices; no
	unibus map sits in the way to spoil things).
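
Here is roughly what I have in mind for the second array, sketched in C.
All of the names (struct ubuf, ubuf_get, ubuf_done, ubuf_release, the
UB_ flags, NUBUF) are invented for illustration and do not come from the
2.9 sources; a real version would have to fit in with the existing
buffer headers and the drivers' strategy routines.

```c
#define NUBUF		4		/* assumed number of user-space headers */
#define UB_BUSY		01		/* header is in use for a transfer */
#define UB_VALID	02		/* data at ub_paddr may be trusted */
#define ADDR22		017777777L	/* mask for a 22 bit physical address */

/* Hypothetical header describing a block that lives in user memory. */
struct ubuf {
	int	ub_flags;
	long	ub_paddr;	/* physical address of the user's memory */
	long	ub_blkno;	/* device block number */
	int	ub_owner;	/* id of the process owning the memory */
};

static struct ubuf ubufs[NUBUF];

/* Claim a free header pointing at user memory; returns 0 if none is free. */
struct ubuf *
ubuf_get(int pid, long paddr, long blkno)
{
	int i;

	for (i = 0; i < NUBUF; i++) {
		struct ubuf *up = &ubufs[i];

		if ((up->ub_flags & UB_BUSY) == 0) {
			up->ub_flags = UB_BUSY;
			up->ub_paddr = paddr & ADDR22;
			up->ub_blkno = blkno;
			up->ub_owner = pid;
			return up;
		}
	}
	return 0;
}

/* Transfer finished: the data is good until we return to the user. */
void
ubuf_done(struct ubuf *up)
{
	up->ub_flags |= UB_VALID;
}

/*
 * Called on return to the user process: free every header the process
 * owns, since its data can no longer be considered valid.  Returns the
 * number of headers released.
 */
int
ubuf_release(int pid)
{
	int i, n = 0;

	for (i = 0; i < NUBUF; i++) {
		if ((ubufs[i].ub_flags & UB_BUSY) && ubufs[i].ub_owner == pid) {
			ubufs[i].ub_flags = 0;
			n++;
		}
	}
	return n;
}
```

The one exception mentioned above is the ubuf_release step: nothing in
this array is ever cached across a return to the user, so the headers
only describe a transfer in flight.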

Please send me mail if you either like these ideas or have
reason to suspect that they are not workable (I have only
booted about 3 2.9 systems, I haven't had much practice
poking about in the system internals except to write
a driver for my funky Xebec winchester drive).

Also, I would like to correspond with other small PDP11
unix owners out there.  We do have special needs you know;
Berkeley did not *really* intend for their operating system
to run on an 11/23.  I suspect they didn't even try it much.
For example:  I have a DLV11E card, full modem control and
baud rate select.  Berkeley did not support it except as
a generic DLV11 device.  I had to write all of the modem
control and speed selection stuff myself.  What a pain.
I suspect this has happened to many of you.

Sorry for rambling on like this, it's after midnight in the
middle of finals...
	keith packard
	...(almost anyone)!tektronix!reed!keith
	or
	...!tektronix!tekmdp!keithp

	"even a pdp11/23 with unix is better than a macintosh without."