problems with 2 drives under 386/ix 2.0.1, and a TCP/IP problem

Thu Nov 23 05:07:45 AEST 1989

In article <1989Nov14.175913.7840 at antel.uucp> mike at antel.uucp (Michael
Borza) described a two-disk problem and a tcp/ip problem.

I can't help you with your first problem, the disk related issues, but
I may be able to offer some insight into your tcp/ip woes.

Your tcp/ip problem description...

   This frequently causes hangs on one or the other of the systems,
   in which all system activity ceases (character echoing included).
   I've played around with the number of dblocks, which changes how
   early the hang occurs, but not ultimately whether it occurs at
   some time.

Sounds all too familiar. I've had that problem here for some time.
What I've learned so far is that you should set the number of dblocks
of each class (NBLK64, NBLK128 etc) big enough that there are never
any failures.

Your list (edited)...

   		 alloc	 inuse	   total     max    fail
   dblock class:
       1 (  16)	   128	    30	   41906      32       0
       2 (  64)	   128	    17	  214582     115      26  <<<<***
       3 ( 128)	   128	   108	   25676     115    9001  <<<<***
       4 ( 256)	   128	     0	   11969       8       0

shows some failures. These are not good.

My strategy has been to double the allocation for each dblock class
that shows failures until I get no more failures. So in your case, I
would double NBLK64 and NBLK128 each to 256. If failures continue to
show up, double them again. Also, watch for failures in the streams
and queues.
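
The doubling rule above can be sketched in a few lines of Python. This
is only an illustration, not a tool that ships with 386/ix: the
function name suggest_doubling is made up, the sample rows are copied
from your list, and the NBLK<size> naming for classes other than 64
and 128 is my assumption based on NBLK64 and NBLK128.

```python
import re

# Matches a dblock class row such as "    2 (  64)   128    17  214582  115  26".
# Anything after the fail column (e.g. a "<<<<***" marker) is ignored.
ROW = re.compile(
    r"^\s*(\d+)\s*\(\s*(\d+)\)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)"
)


def suggest_doubling(report):
    """Return {tunable_name: doubled_alloc} for each class with failures."""
    suggestions = {}
    for line in report.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # skip headers and non-class rows
        _cls, size, alloc, _inuse, _total, _peak, fail = map(int, m.groups())
        if fail > 0:
            # Assumed naming: NBLK<size>, extrapolated from NBLK64/NBLK128.
            suggestions[f"NBLK{size}"] = alloc * 2
    return suggestions


SAMPLE = """\
    1 (  16)   128    30   41906      32       0
    2 (  64)   128    17  214582     115      26  <<<<***
    3 ( 128)   128   108   25676     115    9001  <<<<***
    4 ( 256)   128     0   11969       8       0
"""

print(suggest_doubling(SAMPLE))
```

Run against the four rows from your list, it picks out only the two
classes with a nonzero fail column and proposes 256 for each.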

Our two 2.0.2 systems are set up like this...

		 alloc	 inuse	   total     max    fail
streams:	    96	    40	     304      53       0
queues: 	   512	   216	    1702     294       0
mblocks: 	  3270	   735	 4625499     969       0
dblocks: 	  2616	   735	 3954806     969       0
dblock class:
    0 (   4)	   256	     1	  138298       7       0
    1 (  16)	   256	    26	  560779     130       0
    2 (  64)	  1024	   601	 3061270     819       0
    3 ( 128)	   512	   104	   47186     148       0
    4 ( 256)	   256	     0	   60700     123       0
    5 ( 512)	   128	     0	   26377      40       0
    6 (1024)	    64	     0	   23847      11       0
    7 (2048)	    64	     3	   36349       8       0
    8 (4096)	    56	     0	       0       0       0

The 64-byte dblock class seems to be our biggest headache. I see from
this list that it is getting close to overflowing again (max of 819
against 1024 allocated).

It has been two days since I last booted this machine.
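
One rough way to spot a class that is getting close to its limit is to
compare the max column against the alloc column. The sketch below does
that for the numbers in our list; the function name near_limit and the
75% threshold are arbitrary choices of mine, not anything the system
defines.

```python
# (size, alloc, max) for each dblock class, copied from the list above.
CLASSES = [
    (4, 256, 7), (16, 256, 130), (64, 1024, 819), (128, 512, 148),
    (256, 256, 123), (512, 128, 40), (1024, 64, 11), (2048, 64, 8),
    (4096, 56, 0),
]


def near_limit(classes, threshold=0.75):
    """Return the sizes whose peak usage (max / alloc) is >= threshold."""
    return [
        size
        for size, alloc, peak in classes
        if alloc and peak / alloc >= threshold
    ]


print(near_limit(CLASSES))
```

With the default threshold only the 64-byte class is flagged, which
matches what I see by eye.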

What I have seen is that sometimes, if you wait long enough, the
system *will* come back to life.  If you are that lucky, and your
machine continues to breathe, run netstat -m; you will likely see
one of the dblock classes with a failure count in the hundreds of
thousands or possibly in the millions. I believe the kernel is in a
loop, trying to get the dblocks over and over again.

Having said all that, we still occasionally get these mysterious
hangs, but much less frequently now.

Also note that our load is different from yours. We have an RFS link
between the two 386/ix machines and up to 10 users rlogin'd in at any
given time from MS-DOS and BSD4.3 machines. I haven't brought NFS up
yet, so I don't know what effect it will have.

-larry