Shell programming style -- a plea for better shell scripts

Sun Feb 26 13:51:10 AEST 1984

From:            Matthew J. Weinstein <matt at ucla-locus>

Firstly:

The Csh manual says that the standard shell will be invoked UNLESS the 
file begins with a # character (for command files).  From the Csh man 
page on command files:

   ``... The shell opens this file, and saves its name for
     possible resubstitution by `$0'.  Since many systems use
     either the standard version 6 or version 7 shells whose
     shell scripts are not compatible with this shell, the shell
     will execute such a `standard' shell if the first character
     of a script is not a `#', i.e. if the script does not start
     with a comment...''

As for #!, EXEC has been hacked on 4.x to look for #! as a special magic 
number; I'm not sure that Bell Unix has that (although it may);  anyway,
the man page for exec says:

   ``To aid execution of command files of various programs, if
     the first two characters of the executable file are '#!'
     then exec attempts to read a pathname from the executable
     file and use that program as the command files command
     interpreter. For example, the following command file
     sequence would be used to begin a csh script:
          #! /bin/csh
          # This shell script computes the checksum on /dev/foobar
          #
               ...

     ...  The space (or tab) following the '#!' is mandatory, and 
     the pathname must be explicit (no paths are searched)...''

(By the way, you left out the mandatory space after #! in your last
message.)
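
To make the #! mechanism concrete, here is a minimal sketch. It assumes a
present-day POSIX-style /bin/sh and the mktemp utility, neither of which
is taken from the man pages quoted above; the file name "hello" is made up
for illustration.

```shell
# Create a small command file in a scratch directory (hypothetical name: hello).
dir=$(mktemp -d)
cat > "$dir/hello" <<'EOF'
#! /bin/sh
# The first line names the interpreter by explicit pathname.
echo "hello from $0"
EOF
chmod +x "$dir/hello"

# exec sees the '#! /bin/sh' line and hands the file to /bin/sh,
# which then reads the remaining lines as ordinary shell input.
shebang_out=$("$dir/hello")
echo "$shebang_out"
rm -rf "$dir"
```

The interpreter is given as an explicit pathname because, as the man page
says, no paths are searched when exec reads that line.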

Finally, there is no mention of # as a comment character in my sh man 
page... If it's in yours, it's probably mentioned as a Csh compatibility
hack.

Secondly:

The contention on names of programs stems from a difference in outlook
on name binding.  

A few types of name binding are available to the shell programmer:

    Static: A qualified pathname (one that contains a slash).  
   	This is sort of a ``you said it, you got it'' kind of execution.  
	If that program doesn't work or isn't there, your command fails.  
	There are two flavors of this:

	    Absolute: This is a name that begins with a slash, and names
	    	a particular object in the file hierarchy.  "/bin/sort"
		is an example of this.  Note that a name of this sort is
		not context-sensitive.

		Useful if you want to make sure that you get a
		PARTICULAR executable.

	    Relative: A partially qualified pathname.  The name is RELATIVE
    		TO your current working directory, and IS context
		sensitive. "./foo" is an example of one of these names.

		Useful if the shell script changes working directories,
		or if you are executing in a controlled environment.

	Note that the execution path mechanism is not used in this case.

    Dynamic: An unqualified pathname.  This is the kind of command
    	name most shell files utter.  It has no slashes, and the command
	is found by searching the execution path until an executable of
	the same name is found.  

	The problem is, of course, that the particular program found may
	not be the same kind of program as was originally intended.
	The user may have his own `bzork' program in his bin directory.
	When you execute what you think is /bin/bzork, what you really get 
	is the user's program instead.

	A possible solution is to alter the ``search path''
	to guarantee that the name-program binding is performed using
	a specific ordered set of domains (directories).  However, this
	may lead to unpleasant side-effects.  For example, suppose a script
	invokes an interactive program, and the user forks a shell from that
	program.  The new shell inherits the altered path, so the user
	cannot reach his own bin directory the way he expected to.

	Clearly, when a shell script may invoke an interactive program,
	changing the path (or for that matter the working directory)
	without warning is not a good idea.
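
The three kinds of binding described above can be sketched as follows.
This is only an illustration, assuming a present-day POSIX-style sh with
mktemp and printf; the scratch directories "sysbin" and "userbin" and the
two bzork scripts are invented for the example and stand in for /bin and
the user's own bin directory.

```shell
# Scratch layout (hypothetical): a "system" bzork and a user's private bzork.
top=$(mktemp -d)
mkdir "$top/sysbin" "$top/userbin"
printf '#! /bin/sh\necho system bzork\n' > "$top/sysbin/bzork"
printf '#! /bin/sh\necho user bzork\n'   > "$top/userbin/bzork"
chmod +x "$top/sysbin/bzork" "$top/userbin/bzork"

saved_path=$PATH
PATH="$top/userbin:$top/sysbin:$PATH"   # the user's bin is searched first

# Dynamic: no slash, so the search path decides; the user's copy wins.
dynamic=$(bzork)

# Static, absolute: names one particular file; the search path is not used.
absolute=$("$top/sysbin/bzork")

# Static, relative: resolved against the current working directory.
relative=$(cd "$top/userbin" && ./bzork)

# Restricting the search path inside a subshell leaves the caller's PATH alone,
# but anything started from within that subshell still inherits the change.
restricted=$(PATH="$top/sysbin:$saved_path"; bzork)

PATH=$saved_path
echo "dynamic=$dynamic absolute=$absolute relative=$relative restricted=$restricted"
rm -rf "$top"
```

Confining the altered path to a subshell limits how far the change
spreads, but it does not remove the problem noted above: an interactive
program started inside that subshell would still see the restricted path.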

In any case, this points out the fact that the semantics of commands may
vary in certain circumstances, and that the script writer should
consider his choices carefully.

My suggestions about defining names should thus be considered in the
light of the functionality required.

I think that work on the semantics of command language bindings (in Unix)
is overdue.  We use a lot of ad-hoc mechanisms, and attempt to make
up for this with ``programming style''. 

I am not sure who is doing compilable shell work, but they must have come 
across this sort of thing before.  Does anyone have any feedback on
this?

					- Matt