Case sensitive file names

std-unix at ut-sally.UUCP
Mon Oct 20 19:13:29 AEST 1986


From: cbosgd!cbosgd.ATT.COM!mark at ucbvax.berkeley.edu (Mark Horton)
Date: Sun, 19 Oct 86 23:11:35 edt
Organization: AT&T Medical Information Systems, Columbus

>If a conforming system may be case sensitive or case insensitive,
>then a lot of programs won't be portable.  Ignore for the moment
>all existing UNIX code and consider new program development.  I
>believe that programmers on one kind of system won't bother
>with the library routines that are used to compare and/or convert
>mixed-case names to monocase.  It doesn't matter what people "ought"
>to do.  A well-known example of this effect is 4.2BSD.  The source
>code is full of variables that should be declared "long" but --
>since on the VAX "long" and "int" are identical -- are not.  In the
>same way, optional case sensitivity will spawn code that only runs
>correctly in the environment where it was written.
>
>Therefore, I believe that case sensitivity must be retained, and
>it should not be made optional.

I'm sorry, but I don't buy this argument.  It seems to be based on
the assumption that case insensitivity will be implemented by the
use of subroutines for case-insensitive operations, with a different
user interface from that available today.  I think such an implementation
is silly, even if other operating systems may do it that way.

I'm talking about file names only.  I do not advocate even considering
making all of the user interfaces in UNIX case insensitive.  While it
might have once been a good idea to design them that way, I feel it's
far too late for someone to decree that all the upper and lower case
keys in, say, vi must be equivalent.

I think it's a given that existing code won't be rewritten to use new
interfaces, even if we come up with a wonderful way to do it.  Vi still
uses raw terminfo, even though curses would have been much easier and
better.  Also, there are lots of binaries out there that can't even be
recompiled.  Any solution to this problem must be in the kernel, or possibly
in libc underneath such subroutines as open, unlink, and chmod (if you
have shared libraries or full source to recompile), or it won't work all
the time.

The obvious implementation is for the kernel code that maps a filename to
an inode number to do a case-insensitive comparison when checking each
filename element in a directory.  This would be pretty simple to add,
although issues such as speed and international variations would probably
require a clever case-insensitive comparison, possibly using a
country-specific case mapping table with some flags or other hacks to deal
with single-to-multiple glyph mappings like SS to eszett.  There might
even be a performance GAIN if creating a directory entry included
calculating an appropriate hash value, which would also be stored in the
directory and used for initial comparisons.
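
To make the lookup idea concrete, here is a rough sketch in Python rather
than kernel C.  It is only a model under stated assumptions: the directory
is a plain list of entries, and fold(), name_hash(), make_entry(), and
lookup() are hypothetical names, not existing kernel or libc routines.  The
case_map argument stands in for a country-specific mapping table, and the
hash is an arbitrary illustrative choice.

```python
# Sketch only: a directory modelled as a list of dict entries.
# All function names here are hypothetical, not real kernel interfaces.

def fold(name, case_map=None):
    """Fold a name to one case.  case_map stands in for a country-specific
    table and is applied first, so one glyph may map to several."""
    if case_map:
        name = "".join(case_map.get(ch, ch) for ch in name)
    return name.lower()

def name_hash(folded):
    """Small deterministic hash of the folded name (illustrative only)."""
    h = 0
    for byte in folded.encode("utf-8"):
        h = (h * 31 + byte) & 0xFFFF
    return h

def make_entry(name, inode, case_map=None):
    """Create a directory entry.  The name is stored exactly as given;
    only the hash is computed from the folded form."""
    return {"name": name, "inode": inode,
            "hash": name_hash(fold(name, case_map))}

def lookup(directory, wanted, case_map=None):
    """Return the inode for wanted, or None if no entry matches.
    Hashes are compared first, so most entries are rejected without
    a string comparison."""
    folded = fold(wanted, case_map)
    h = name_hash(folded)
    for entry in directory:
        if entry["hash"] != h:
            continue
        if fold(entry["name"], case_map) == folded:
            return entry["inode"]
    return None

directory = [make_entry("FooBar", 12), make_entry("Mail", 34)]
print(lookup(directory, "foobar"))
print(lookup(directory, "MAIL"))
```

Because the hash is computed once at creation time, a lookup folds only the
name being searched for plus the few entries whose hashes collide with it.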

I see no need to map everything to lower case when creating the directory
entry.  Let the entries be in mixed case; this allows more readable names.
I don't know what to do about sorting (e.g. in the shell or ls) - it might
be case sensitive or insensitive sorting, and good arguments can probably
be made for both.
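
The two sorting choices can be compared with a small Python sketch.  The
file names and both function names are made up for illustration; this does
not describe what any existing shell or ls does.

```python
# Sketch only: two possible orderings for a directory listing.

def sort_case_sensitive(names):
    """Plain code-point order: in ASCII, upper case sorts before lower."""
    return sorted(names)

def sort_case_insensitive(names):
    """Order by the folded name; the stored name breaks ties so the
    result is deterministic."""
    return sorted(names, key=lambda n: (n.lower(), n))

names = ["Makefile", "README", "foo.c", "bar.c", "Zebra"]
print(sort_case_sensitive(names))
print(sort_case_insensitive(names))
```

The first ordering groups capitalized names such as Makefile and README at
the top of a listing; the second interleaves them alphabetically.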

The behavior I'm concerned about is that, if the user types, say, "mail"
and there's a command "Mail" in the search path, it should still work.
If the file "FooBar" exists and the user cats "foobar", because somebody
read that name over the phone, it should find it.
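
As a rough model of that behavior, the Python sketch below walks a search
path and matches the command name without regard to case.  The search path
is represented as in-memory lists rather than real directories, and
find_command() and the sample paths are hypothetical, chosen only for
illustration.

```python
# Sketch only: the search path is a list of (directory, entries) pairs.

def find_command(search_path, command):
    """Return 'dir/StoredName' for the first directory, in path order,
    holding an entry that matches command ignoring case; else None."""
    wanted = command.lower()
    for dirname, entries in search_path:
        for stored in entries:
            if stored.lower() == wanted:
                # Report the name as stored, not as the user typed it.
                return dirname + "/" + stored
    return None

search_path = [("/bin", ["cat", "ls"]), ("/usr/ucb", ["Mail", "vi"])]
print(find_command(search_path, "mail"))
```

Returning the stored spelling keeps the mixed-case name visible to the user
even though the typed name differed.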

	Mark

Volume-Number: Volume 7, Number 72


