Scale and Performance in a Distributed File System (Andrew)
J. H. Howard et al., ACM Transactions on Computer Systems, Vol. 6, No. 1 (Feb 1988), pp. 51-81.
They describe Andrew and the design decisions that went into it, and measure how performance scales with the number of servers and users.
The design of Andrew:
- On each client, a "Venus" process handles remote file requests by talking to "Vice" servers. Venus also caches files on the local disk and in memory.
- Whole files are always shipped over. Bad for large files; ones bigger than the local disk can't be accessed at all. However, it means much less network traffic and server load.
- Cache consistency is done by server callbacks; local changes are not visible to other clients until the file is closed (see the first sketch after this list). Directories are cached, but directory changes and certain metadata changes (e.g., permissions) are sent directly to the server.
- They moved from plain path names to inode-like identifiers (fids) so that clients, not servers, do pathname resolution (second sketch below).
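Roughly, the caching/callback scheme looks like this. This is a Python sketch, not the real Venus code; FakeVice, fetch_file, store_file, and break_callback are made-up stand-ins for the Vice RPC interface:

    class FakeVice:
        """Stand-in for a Vice server holding whole files keyed by fid."""
        def __init__(self):
            self.files = {}

        def fetch_file(self, fid):
            return self.files.get(fid, b"")

        def store_file(self, fid, data):
            self.files[fid] = data


    class VenusCache:
        def __init__(self, server):
            self.server = server
            self.data = {}      # fid -> whole-file contents cached locally
            self.valid = {}     # fid -> True while the server's callback promise holds

        def open(self, fid):
            # Reuse the cached copy as long as the callback is intact;
            # otherwise fetch the whole file again and re-register a callback.
            if fid not in self.data or not self.valid.get(fid, False):
                self.data[fid] = self.server.fetch_file(fid)   # whole-file transfer
                self.valid[fid] = True
            return bytearray(self.data[fid])

        def close(self, fid, contents):
            # Local changes become visible to other clients only at close,
            # when the whole file is shipped back to the server.
            self.data[fid] = bytes(contents)
            self.server.store_file(fid, bytes(contents))

        def break_callback(self, fid):
            # Called by the server when another client stores the file;
            # the next open must refetch it.
            self.valid[fid] = False

The point of callbacks is that Venus doesn't revalidate the cache on every open; the server promises to notify it of changes, which avoids the validation traffic a check-on-open scheme would cost.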
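And the fid-based lookup: the client walks the path itself against cached directories, so the server only ever sees fids. A sketch, where lookup_in_dir is a hypothetical helper that reads a cached directory:

    def resolve(path, root_fid, lookup_in_dir):
        """Map a pathname to a fid entirely on the client."""
        fid = root_fid
        for component in path.strip("/").split("/"):
            fid = lookup_in_dir(fid, component)   # reads the cached directory, no server round-trip
        return fid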
They introduce the Andrew Benchmark, which is used countless times in subsequent papers. It's basically copying a bunch of source files, stat'ing and reading them all, and compiling them.
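A rough outline of the benchmark's five phases (MakeDir, Copy, ScanDir, ReadAll, Make) in Python; SRC and DST are hypothetical paths here, and the real thing is driven by shell scripts and make:

    import os, shutil, subprocess

    SRC = "/local/andrew-src"     # source tree to copy (assumption)
    DST = "/afs/test/andrew"      # target on the file system under test (assumption)

    # MakeDir + Copy: recreate the source tree on the target file system
    # (the benchmark times directory creation and file copying as separate phases).
    shutil.copytree(SRC, DST)

    # ScanDir: stat every file in the copied tree.
    for root, _, files in os.walk(DST):
        for name in files:
            os.stat(os.path.join(root, name))

    # ReadAll: read every byte of every file.
    for root, _, files in os.walk(DST):
        for name in files:
            with open(os.path.join(root, name), "rb") as fh:
                fh.read()

    # Make: compile the copied sources.
    subprocess.run(["make", "-C", DST], check=True)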
It scales better than NFS: lower server CPU load (not to mention actual consistency semantics). But the large-file problem remains: what about database-type applications?
Other nice features: volumes/mount points (like NFS/vnodes) allow migration. Snapshots (with copy-on-write) are supported for backups and migration; a toy sketch follows.
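A toy illustration of the copy-on-write idea behind volume clones; Volume, snapshot, and write are made-up names, not the Vice interface:

    class Volume:
        def __init__(self, blocks=None):
            # block number -> data; the data entries are shared with clones
            self.blocks = dict(blocks or {})

        def snapshot(self):
            # A clone copies only the block map, not the data, so it's cheap.
            return Volume(self.blocks)

        def write(self, blockno, data):
            # Writing installs a new block in the live volume's map only;
            # earlier snapshots keep pointing at the old block.
            self.blocks[blockno] = data

The read-only clone can then be dumped for backup or shipped to another server while the live volume keeps changing.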
Umesh Shankar
Last modified: Wed Jul 11 11:52:09 PDT 2001