Funny characters in filenames

Guy Harris guy at rlgvax.UUCP
Fri Jul 29 10:27:16 AEST 1983


Well, the argument could also be made that unused generality that confuses
users who haven't learned the ropes isn't worth the cost.  The only subsets
of 0x00-0xFF I could see restricting filename characters to are:

1) 0x00-0x7F - unless you're doing something *really* obscure you don't need
the parity bit.  I've heard that some systems used it to supply file version
numbers a la TENEX, but that the long filenames in 4.2BSD obviated the need
for that.  (Version numbers are a wonderful way to fill a disk without even
trying; I remember seeing a VMS system where somebody had *43* or so versions
of one source file.  If your OS supports them it should support auto-purge as
well, so you can keep only the last N versions around.  VMS, I'm told,
supports it, but I don't know if they make it easy for the user to get at
it.)  Note that even 4.1BSD won't allow you to reference or create a file
with ('/'|0200) in its name, and 4.1cBSD won't allow you to create a file
whose name contains any character with the parity bit on.

2) printable characters, including space, only - see previous argument.
The control characters this excludes are not quite as nasty as characters
with the 0200 bit on, because you can at least type them into the shell
between quotes.

3) alphanumerics plus underscore only (i.e., like C identifiers) - well,
you certainly won't get nailed by metacharacters.  However, you really
can't leave out ".", for obvious reasons, and RCS users like us will scream
bloody murder if you eliminate ",", and....  I suspect this might be *too*
restrictive.  Restricting only *some* of the metacharacters merely protects
against being unable to delete files from *existing* shells; if you've
protected against the Bourne shell metacharacters, what about the C shell
metacharacters?  (Note that I consider space a shell metacharacter in this
sense; also note that "ed" does NOT consider space a legal character in a
filename.)
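
For concreteness, here is a minimal sketch of the three candidate subsets as
byte-level checks.  The function names are made up for illustration, and the
exact boundaries (0x20-0x7E for "printable including space", C identifier
characters for subset 3) are my reading of the list above, not anything a
particular kernel enforces.

```python
import string

# Assumed reading of subset 3): C-identifier characters only.
IDENT_CHARS = set((string.ascii_letters + string.digits + "_").encode("ascii"))


def in_seven_bit(name: bytes) -> bool:
    """Subset 1): every byte is in 0x00-0x7F, i.e. the 0200 bit is off."""
    return all(b < 0x80 for b in name)


def in_printable(name: bytes) -> bool:
    """Subset 2): printable ASCII including space, taken here as 0x20-0x7E."""
    return all(0x20 <= b <= 0x7E for b in name)


def in_identifier(name: bytes) -> bool:
    """Subset 3): alphanumerics plus underscore only."""
    return all(b in IDENT_CHARS for b in name)


def classify(name: bytes) -> dict:
    """Report which of the three candidate subsets a name fits into."""
    return {
        "7bit": in_seven_bit(name),
        "printable": in_printable(name),
        "identifier": in_identifier(name),
    }


# ('/'|0200) from the text: a slash with the high bit turned on.
SLASH_WITH_HIGH_BIT = bytes([ord("/") | 0o200])

if __name__ == "__main__":
    for sample in (b"foo.c,v", b"my file", b"main_2", SLASH_WITH_HIGH_BIT):
        print(sample, classify(sample))
```

An RCS-style name such as "foo.c,v" passes 1) and 2) but fails 3), which is
the complaint made above about "." and ",".  The byte for ('/'|0200) is 0xAF;
masking it with 0x7F gives back a plain '/'.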

I think that 1) and 2) aren't too controversial - I don't see a compelling
reason to allow the characters they exclude, and I don't know of any systems
that would be seriously inconvenienced by those restrictions.  (If anybody
does, speak up.)  3) is a different matter.  A lot of systems find various
special characters useful in file names, so I think a case can be made that
all printable ASCII characters plus space should remain legal.
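
As a rough illustration of the "between quotes" point in 2), the sketch below
uses Python's shlex.quote to show which names a Bourne-style shell would need
quoted.  The sample names are hypothetical, and shlex.quote targets POSIX
sh-style quoting only; it says nothing about C shell metacharacters.

```python
import shlex

# Hypothetical sample names: an RCS file, a name with a space, a name with
# a glob metacharacter, and a name containing a control character (tab).
NAMES = ["foo.c,v", "my file", "a*b", "tab\there"]


def quoted_forms(names):
    """Map each name to the form you would type in a POSIX-style shell."""
    return {name: shlex.quote(name) for name in names}


if __name__ == "__main__":
    for name, quoted in quoted_forms(NAMES).items():
        print(repr(name), "->", repr(quoted))
```

Names made only of characters shlex treats as safe (which includes "." and
",") come back unchanged; anything else gets wrapped in single quotes.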

	Guy Harris
	{seismo,mcnc,we13,brl-bmd,allegra}!rlgvax!guy


