Should 'sync' stop terminal/system activity ? (SUMMARY)

Andy-Krazy-Glew aglew at urbana.mcd.mot.com
Wed Sep 6 11:29:26 AEST 1989


To: tuvie!dpmizar!lcz
Subject: Buffers and interactive response
Reply-To: aglew at urbana.mcd.mot.com
Bcc: 
Date: Tue, 05 Sep 89 21:14:38 CDT
From: Andy 'Krazy' Glew <aglew at chant>


I'm trying to post this, but having problems.

In-reply-to: ee at atbull.UUCP's message of 5 Sep 89 01:44:01 GMT
Newsgroups: comp.unix.wizards
Followup-To: comp.unix.wizards
Subject: Re: Should 'sync' stop terminal/system activity ? (SUMMARY)
References: <265 at atbull.UUCP> <266 at atbull.UUCP>
Distribution: world

>In article <265 at atbull.UUCP> i write:
>>_The configuration:
>>	68030,UNIX V.3,16MB
>>
>>_The Situation:
>>	2MB for BUFFERS
>>		'sync' stops all terminal/system activity for
>>		some seconds ( until buffers are written to
>>		disk ? )
>>
>>	1MB for BUFFERS
>>		no troubles with sync
>>	
>>	( of course 2MB BUFFERS is preferable, but users
>>	complain about terminals freezing for no obvious
>>	reason )
>>
>>_The question:
>>	Is it normal/ok for sync to freeze terminal activity ?
>Answers : 
>1>From: tuvie!dpmizar!lcz (Lee Ziegenhals)
>1>
>1>I would appreciate any information you have on this problem.  I ran into the
>1>same thing on a Motorola 68030 system running SystemV/68.  Motorola's response
>1>was basically (1) set the file hardening switch (which turns the cache into
>1>a write-through cache -- at a tremendous performance hit), or (2) use fewer
>1>buffers.  I don't really consider either of these an acceptable solution.
>1>
>1>I'm hoping to get more information from the engineers at Motorola, but I'm
>1>not holding my breath...
>1>
>1>-Lee Ziegenhals

This isn't the correct forum for a formal announcement of functionality,
and what I say must not be understood as an official Motorola policy, but
I feel a bit bad about Lee Ziegenhals' "not holding his breath" for help
from Motorola, so...

Yep, we found this performance problem, large buffer caches producing
big jerks in interactive response, as soon as we started living on large
memory machines.  So far the biggest jerk I measured was 13 seconds!

The problem was an O(n^2) algorithm in the buffer cache scanning code
(standard UNIX), when a lot of buffers were dirty.
    In Motorola SYSTEM V/68 R3V6 we have provided a different buffer
cache scanning algorithm that is O(n) and, moreover, reduces the
"jerk" by scanning the buffer cache in segments. So, if you are
scanning the buffer cache once every second (BDFLUSHR), we can now
do, say, 1/60 of that work on every clock tick instead.
    Yes, it performs a lot better. First of all, empirically (it's my
job to measure these things). Secondly, "feel" -- we installed the fix
on our production machines, and then took it off so that I could
provide before and after measurements of jerkiness on a real system.
I was almost lynched when I took it off. It's back now (down, down,
angry programmers!)
    There are still a few other O(n^2) algorithms in UNIX (remember,
simple, not sophisticated, algorithms? Uh-huh), but I think the buffer
cache was the biggy.  Tell me if it's still a problem after you update
to R3V6 -- I know how to fix some of 'em, just need the time and
justification (I cannot go fixing things that we have no evidence are 
problems - not without real good reason).
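
To make the "scan in segments" idea concrete, here is a minimal sketch
in C.  It is NOT the R3V6 code: the names (bufs, flush_slice, write_out)
and the numbers (600 buffers, 60 slices per sweep) are made up for
illustration, and the "write" is just a stand-in that clears the dirty
flag.  The point is only that each clock tick touches one slice, so a
full sweep visits every buffer once (O(n)) and no single tick does the
whole job.

```c
#include <assert.h>
#include <stddef.h>

#define NBUF       600   /* hypothetical number of buffers */
#define NSEGMENTS  60    /* hypothetical: slices per full sweep */

struct buf {
    int dirty;           /* nonzero if the buffer needs writing */
};

static struct buf bufs[NBUF];
static size_t scan_pos;  /* index where the next slice starts */

/* Stand-in for queueing an asynchronous disk write. */
static void write_out(struct buf *bp)
{
    bp->dirty = 0;
}

/*
 * Intended to be called once per clock tick.  Examines one slice of
 * the cache, writes out any dirty buffers found in it, and returns
 * how many buffers were flushed on this tick.
 */
size_t flush_slice(void)
{
    size_t slice = (NBUF + NSEGMENTS - 1) / NSEGMENTS;  /* round up */
    size_t flushed = 0;

    assert(scan_pos < NBUF);
    for (size_t i = 0; i < slice; i++) {
        struct buf *bp = &bufs[scan_pos];
        if (bp->dirty) {
            write_out(bp);
            flushed++;
        }
        scan_pos = (scan_pos + 1) % NBUF;  /* wrap around the cache */
    }
    return flushed;
}
```

With every buffer dirty, each call in this sketch flushes 10 buffers,
and 60 calls cover the whole cache; the worst case per tick is bounded
by the slice size rather than by the size of the cache.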

Motorola System V/68 R3V6 is not, I believe, formally released yet,
and you may want to double-check that the "syncfix" functionality is
in it.


For the moment, if you have a Motorola System V/68 R3V5 or earlier system,
and are having trouble with interactive response, you might:

    -- reduce NBUF (to reduce the number of buffers you need to scan.
    	    	    take a look at your buffer cache hit statistics,
    	    	    to see if you really need all those buffers)

    -- turn on FILEHARDN (with the problems mentioned above)

    -- change BDFLUSHR
    	    This is one that hasn't yet been mentioned, that you might
    	    want to consider. However, it's a bit of a toughie:
    	    	The O(n^2) behaviour I describe above is really
    	    more like O(n*d), where d is the number of dirty buffers
    	    (if d=c*n, then O(n^2)).
    	    	If your workload's buffer writing characteristics are 
    	    such that you dirty a lot of different buffers, you may want
    	    to reduce BDFLUSHR (increase the rate at which scanning is done).
    	    This way, you'll be writing the data out more frequently,
    	    but hopefully fewer buffers will have been dirtied each time,
    	    so the scan will take less time - you'll have smaller jerks,
    	    more frequently.
    	    	However, if you are constantly redirtying the same buffers,
    	    then a higher flush rate will just mean that you're writing out
    	    more data - probably not good. In this case, you might increase
    	    BDFLUSHR (frankly, I would reduce NBUF first - but I'm paranoid
    	    about reliability).
    	    	If you really feel daring, you can patch the value of
    	    "bdflushr" on the fly in your kernel.  The kernel sets
    	    bdflushcnt = tune.t_bdflushr, so change tune.t_bdflushr.
    	    (In R3V6 you have to change a different variable.)
    	    This way you could dynamically try out a few values, and
    	    see which you prefer.
    	    	NB. THIS IS NOT MOTOROLA RECOMMENDED STANDARD PROCEDURE!!!
    	    We do not recommend changing tuneable parameters except via
    	    sysgen, and any potential damage you do is on your own head.
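
For anyone who wants to play with the BDFLUSHR tradeoff on paper first,
here is a toy cost model in C.  It is an assumption-laden sketch, not a
measurement: it takes the O(n*d) behaviour literally (one scan costs
n*d "work units"), and it assumes the workload dirties a fixed number
of DIFFERENT buffers per second.  The function names and the sample
numbers are mine.

```c
#include <assert.h>

/* Toy model: work units for one scan of nbuf buffers, ndirty of them dirty. */
unsigned long scan_cost(unsigned long nbuf, unsigned long ndirty)
{
    assert(ndirty <= nbuf);
    return nbuf * ndirty;
}

/*
 * Dirty buffers found at scan time if `rate` distinct buffers are
 * dirtied per second and a scan runs every `interval` seconds.
 * Capped at nbuf, since you cannot dirty more buffers than you have.
 */
unsigned long dirty_at_scan(unsigned long nbuf, unsigned long rate,
                            unsigned long interval)
{
    unsigned long d = rate * interval;
    return d < nbuf ? d : nbuf;
}
```

With 2000 buffers and 100 distinct buffers dirtied per second, a scan
every second costs 2000*100 = 200000 units, while a scan every 4
seconds costs 2000*400 = 800000 units in one lump: the same work per
second in this model, but a jerk four times as big.  The model also
shows the d=c*n case: scan_cost(2000, 1000) is four times
scan_cost(1000, 500).  It does not cover the "constantly redirtying
the same buffers" case, where d stops growing with the interval and a
shorter interval just means more writes.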

    	    	
Hope this helps.
If there was a standard newsgroup for Motorola SYSTEM V/68 (and 88) systems,
I'd cross-post to it.

...

Now, finally - mind if I be commercial for just a little bit?  (I'm
normally a really good net.citizen, talking about anything *except* my
company's products, but I've got this character flaw: I'm proud of the
company I work for (and I only work for companies I'm proud of)):
    I'm sorry that some of you "don't hold your breath" for help from
Motorola, but -- Motorola Microcomputer Division (the part of Motorola
that sells computer systems as opposed to parts) is full of people trying
to make our products better.  Yeah, we've had problems, but we're getting
better quickly.  We've been challenged to produce the same sort of quality
in computer systems, hardware and software, that other parts of Motorola
put into chips and communications equipment. What 99.9999% defect free
means to software isn't always clear, but it certainly means solving 
our customers' problems.  So keep those bug reports coming - it may take
a while to get 'em fixed, but we're gonna.
    End of inspirational commercial.
    
Please, please, please - report those bugs to your sales office or
customer support. I didn't know about this buffer cache scanning
problem until I started working on a system with a lot of memory
myself.
    This isn't a commercial - I know that other companies have difficulty
getting customers to report problems.  Hell - I know that when I was
a sysadmin at school I was often too lazy to report bugs.  But, believe
it or not, people at system shops actually do look at your problem
reports.  Keep 'em coming!!

--
Andy "Krazy" Glew,  Motorola MCD,    	    	    aglew at urbana.mcd.mot.com
1101 E. University, Urbana, IL 61801, USA.          {uunet!,}uiucuxc!udc!aglew
   
My opinions are my own; I indicate my company only so that the reader
may account for any possible bias I may have towards our products.


