NFS load characterization studies, anyone?

Steve D. Miller steve at umiacs.umd.edu
Tue Mar 27 07:20:49 AEST 1990


[ I posted this to Sun-Spots and Sun-Nets, and it occurred to me that I should
send something out here, too.  Here goes... ]

   OK, I'm curious:  what have people Out There done in terms of modeling
NFS performance?  What I'd really like to see is information on the
following:

	-- for the mythical average fileserver, are there multiple requests
		arriving at the same time, or is it often the case that the
		server is servicing one workstation at a time?

	-- related to the above, does a 'packet-train' model apply to NFS?
		(See Steve Heimlich's recent Usenix paper for info on what a
		packet train is; there's also a rough analysis sketch just
		after this list.)

	-- How do different models of client/server organization change the
		load?  (For example, how much extra load do diskless
		workstations cause?  If all my workstations have local
		disks, and I put all my user files on the local disk with
		/usr on the server, how does that change things?  If I put
		the user files on the servers, and /usr on the local disks,
		what difference does that make?)

	-- Under different client/server models, which runs out first, the
		client CPU, the server CPU, or the network?  If the
		packet-train model applies, and there are few if any
		overlapping trains, which loses first?  If there is lots of
		overlap in service requests (because multiple clients are
		banging on the server at the same time, keeping it from
		doing much in the way of sequential reads), how does that
		change the picture?  How does the choice between "user data
		local" (few writes) and "system files local" (fewer reads,
		perhaps, but more writes) change things?

	-- Under the "user data local" and "system files local" models, are
		there files that are referenced much more frequently than
		others?
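
   By way of a sketch for the first two questions (this is purely my own
strawman, not anything out of Heimlich's paper): suppose the trace is one
"<seconds> <client-id>" line per request, and call a "train" a run of
requests from a single client separated by gaps under some threshold.
Then a first cut at both the arrival-overlap and packet-train questions
is just:

/*
 * Hypothetical trace format and train definition as above; GAP_SECS is
 * a knob to tune.  A request from a *different* client arriving within
 * the gap threshold counts as an interleaved train start, i.e.,
 * evidence that the server is juggling several clients at once.
 */
#include <stdio.h>

#define GAP_SECS 0.5	/* inter-train gap threshold, in seconds */

int
main(void)
{
	double t, last_t = -1.0;
	int client, last_client = -1;
	long requests = 0, trains = 0, interleaves = 0;

	while (scanf("%lf %d", &t, &client) == 2) {
		requests++;
		if (last_t >= 0.0 && client != last_client &&
		    t - last_t <= GAP_SECS)
			interleaves++;	/* another client cut in close behind */
		if (last_t < 0.0 || client != last_client ||
		    t - last_t > GAP_SECS)
			trains++;	/* a new train starts here */
		last_t = t;
		last_client = client;
	}
	printf("%ld requests, %ld trains", requests, trains);
	if (trains > 0)
		printf(" (%.1f requests/train)", (double)requests / trains);
	printf(", %ld interleaved train starts\n", interleaves);
	return 0;
}

If the interleave count stays near zero, the server really is seeing one
workstation at a time and the train model looks plausible; a large count
says the trains overlap.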

   Yes, I know that different disks, network configurations, CPU speeds,
etc., will all strongly influence the results.  I even think I have some
answers to some of these questions, at least for the UMCP CSD and/or UMIACS
configurations and networks.  What I'm looking for is enough data points
(i.e., "my configuration looks like this, and this is what I see") to begin
to build a general model.

   It occurs to me that hacking the kernel to record NFS requests and
timestamps is a reasonable way to get a handle on both the request-arrival
characterization problem and the "which files are referenced most" problem.
That seems like an easy hack, so I might whip that out and see what happens.
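
   For the "which files are referenced most" half, assuming the kernel hack
dumps one "<seconds> <client-id> <op> <fhandle-hex>" line per request (a
format I'm making up for the sake of argument), the user-level tally is
about as simple:

/*
 * Count references per file handle from the hypothetical trace above.
 * File handles are kept as hex strings in a small chained hash table.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKET 8192

struct ent {
	char *fh;		/* file handle, as a hex string */
	long count;		/* references seen so far */
	struct ent *next;
};

static struct ent *tab[NBUCKET];

static unsigned
hash(const char *s)
{
	unsigned h = 0;

	while (*s != '\0')
		h = h * 31 + (unsigned char)*s++;
	return h % NBUCKET;
}

static void
tally(const char *fh)
{
	unsigned h = hash(fh);
	struct ent *e;

	for (e = tab[h]; e != NULL; e = e->next)
		if (strcmp(e->fh, fh) == 0) {
			e->count++;
			return;
		}
	e = malloc(sizeof *e);
	if (e == NULL || (e->fh = strdup(fh)) == NULL) {
		fputs("out of memory\n", stderr);
		exit(1);
	}
	e->count = 1;
	e->next = tab[h];
	tab[h] = e;
}

int
main(void)
{
	double t;
	int client, i;
	char op[32], fh[128];
	struct ent *e;

	while (scanf("%lf %d %31s %127s", &t, &client, op, fh) == 4)
		tally(fh);	/* one line per NFS request */
	for (i = 0; i < NBUCKET; i++)
		for (e = tab[i]; e != NULL; e = e->next)
			printf("%ld %s\n", e->count, e->fh);
	return 0;
}

Pipe the output through "sort -rn | head" and the hot files fall right out.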

   There is a master's student here who is working on a fairly extensive
characterization of NFS client and server loading, but (a) she should know
what previous work has been done in the area, and (b) I'm just downright
curious.  The answers to these questions will strongly influence server
purchasing decisions, and I've got some servers to plan for...

   Please reply directly to me, and I'll summarize.  Thanks.

	-Steve

--
Spoken: Steve Miller    Domain: steve at umiacs.umd.edu    UUCP: uunet!mimsy!steve
Phone: +1-301-454-1808  USPS: UMIACS, Univ. of Maryland, College Park, MD 20742


