shell for MS-DOS (part 1)

Douglas Orr doug at umich.UUCP
Mon Aug 5 04:13:27 AEST 1985


Here is the first part of the Unix-like shell for ms-dos.  This
is just the readme file, which I am posting to both net.sources and
net.micro.pc.  I will be posting the other two files to net.sources only.

Let me know if you find any problems, or have any suggestions for
improvements (or if I have screwed up the posting ... it's happened before).

We don't have much in the way of arpa access around here, so if someone
would like to send this stuff out on the arpa net, feel free.

	-Doug

	{ihnp4,mb2c,pur-ee}!umich!textset!doug  (preferred)
	doug%textset at umich.csnet
	doug%textset%umich.csnet at csnet-relay.ARPA


#	This is a shell archive.
#	Remove everything above and including the cut line.
#	Then run the rest of the file through sh.
-----cut here-----cut here-----cut here-----cut here-----
#!/bin/sh
# shar:	Shell Archiver
#	Run the following text with /bin/sh to create:
#	readme
# This archive created: Sun Aug  4 14:04:18 1985
cat << \SHAR_EOF > readme

Ann Arbor, Michigan
August 3, 1985


The following describes some of the features of the Unix(tm)-like shell
for DOS that I have written.  It is intended as a productivity aid, not
an exact duplicate of either the C or Bourne shells.  There are some
features taken from both.  Minor details may vary.  Compatibility flames 
should be routed to /dev/null.

To get started:

The program was written using the Microsoft C compiler.  Most of the code
should be fairly portable (which is easy to say, since I haven't tried to
port any of it).  I compile it using the small memory model.  There has been
only one case where that has proven to be a problem, which I will describe
more in the "limitations" section below.  I set the stack size to be 4096
(the default is 2048).

If you have version 3.0 of the Microsoft C compiler, you should be able
to invoke "compile.bat" which will compile and link everything.  See
the section below entitled "Porting to another compiler" if you don't
have Microsoft version 3.0.  If you don't have it, I would suggest
thinking seriously about getting it.  I have been generally very
favorably impressed with their library support.  I don't currently have
anything to compare it with, in terms of speed (I am sure there are lots
of published benchmarks), but I have yet to hit a bug.  That counts
for a lot.

I am not sure how much is DOS 3.0 specific.  I think the directory stuff
is.  I suspect Microsoft 3.0 C is.  DOS 3.0 is well worth getting, in any
case.


Philosophy:

I tried to keep this program relatively small so that it would work on
the small memory model.


The program:

is invoked by    

	sh [-e]

the -e flag indicates that '\' is not to be used as a literal-next
character.

When invoked it tries to source the file /profile.

When you type a command, the shell first escapes all characters
preceded by a backslash (if not invoked with the -e option) and
all special characters enclosed within quotes.

It then performs history substitutions on expressions beginning
with '!' as follows:

	!!  is replaced with the previous command
	!<num> is replaced with the <num>th command
	!<str> is replaced with the command beginning with <str>
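
The three forms above could be resolved roughly as follows.  This is a
minimal sketch, not the shell's actual code: hist_add, hist_lookup and
HIST_MAX are hypothetical names, and choosing the most recent match for
!<str> is an assumption the description above does not confirm.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define HIST_MAX 32     /* hypothetical history size */

/* hist[0] is command number 1.  Only pointers are stored, so the
 * caller must keep the command strings alive. */
static const char *hist[HIST_MAX];
static int hist_count = 0;

/* Record a command; it is dropped when the list is full. */
void hist_add(const char *cmd)
{
    assert(cmd != NULL);
    if (hist_count < HIST_MAX)
        hist[hist_count++] = cmd;
}

/* Resolve a word beginning with '!' to a stored command, or NULL. */
const char *hist_lookup(const char *word)
{
    int i;
    size_t len;

    if (word[0] != '!' || word[1] == '\0' || hist_count == 0)
        return NULL;
    word++;
    if (word[0] == '!' && word[1] == '\0')      /* "!!": previous command */
        return hist[hist_count - 1];
    if (word[0] >= '0' && word[0] <= '9') {     /* "!<num>": by number */
        i = atoi(word);
        return (i >= 1 && i <= hist_count) ? hist[i - 1] : NULL;
    }
    len = strlen(word);                         /* "!<str>": by prefix */
    for (i = hist_count - 1; i >= 0; i--)
        if (strncmp(hist[i], word, len) == 0)
            return hist[i];
    return NULL;
}
```

A caller would replace the '!' word with the returned string, or report
an error when the result is NULL.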

It then breaks the command string into tokens.  Entities enclosed
within quotes form one token.  Space characters are used to delimit
other tokens.

It then performs wildcard substitutions as follows:

Tokens containing wildcard characters (currently, only '*') are assumed
to be file names.  The wildcards are expanded to match file names.

	'*' matches any sequence of characters

It then performs I/O redirections:

	'>word'  redirects stdout
	'>>word' redirects stdout, appending to the indicated file
	'>&word' redirects stdout and stderr
	'<word'  redirects stdin
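
Recognizing the four forms above might look like the sketch below.  The
enum and the name redir_kind are hypothetical; the sketch only
classifies a token and finds the file name, and leaves the actual
reopening of stdin, stdout or stderr to the caller.

```c
#include <assert.h>
#include <stddef.h>

enum redir { R_NONE, R_OUT, R_APPEND, R_OUT_ERR, R_IN };

/* Classify tok; on a match, *word points at the file name part. */
enum redir redir_kind(const char *tok, const char **word)
{
    assert(tok != NULL && word != NULL);
    if (tok[0] == '<') {                    /* <word  */
        *word = tok + 1;
        return R_IN;
    }
    if (tok[0] != '>') {                    /* not a redirection */
        *word = NULL;
        return R_NONE;
    }
    if (tok[1] == '>') {                    /* >>word */
        *word = tok + 2;
        return R_APPEND;
    }
    if (tok[1] == '&') {                    /* >&word */
        *word = tok + 2;
        return R_OUT_ERR;
    }
    *word = tok + 1;                        /* >word  */
    return R_OUT;
}
```

The two-character forms are tested before the plain '>' so that ">>"
and ">&" are not mistaken for a file name beginning with '>' or '&'.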

It then executes the command as follows:

The shell keeps track of what all of the commands in its path are,
as well as what built-in commands it knows about.  If the first argument
is a pathname, that command is executed.  If not, the shell checks its
internal list of commands within the PATH, and built-in commands.  To
update the list of internal commands, use the "rehash" built-in command.

If the command is not found, it is given to dos, allowing convenient
access to built-in dos functions.  For example, 

	dir '*.c'
	copy 'b:*.*'

do more or less what you would want them to.


Built-in commands -

ls [-lsRa]		performs a directory listing.  If given no arguments,
			ls lists the current directory.  The flags are:
			l - long listing,  s - summary, R - recursive
			directory listing, a - attribute (directory/executable).

cd <dir>		change the current working directory to the indicated
			directory

pushd <dir>		cd to the indicated directory, pushing the current 
			directory on the stack.  Invoking this with no arguments
			causes the current directory and the top of the stack to
			be interchanged.  Also known as pd.

popd <dir>		pop the top element off the directory stack, and cd 
			to it.

dirs			print the current directory stack

mkdir <dir> [...]	create directories with the indicated names

rmdir <dir> [...]	remove the indicated directories

set var=val [...]	set the environment variable var to the given value

source file		take commands from the given file

fgrep str file [...]	look for the given string within the given file(s).
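
The pushd/popd behavior described above can be modeled with a small
stack.  This is a sketch under assumptions: ds_pushd, ds_popd and
ds_cwd are hypothetical names, the sizes are arbitrary, and the current
directory is a plain string standing in for a real chdir call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DS_MAX 16       /* hypothetical stack depth */
#define DS_LEN 64       /* hypothetical path length limit */

static char ds[DS_MAX][DS_LEN];
static int ds_top = 0;
static char ds_cur[DS_LEN] = "/";   /* stand-in for the working directory */

/* pushd: with dir == NULL, swap the current directory and the stack top. */
int ds_pushd(const char *dir)
{
    char tmp[DS_LEN];

    assert(ds_top >= 0 && ds_top <= DS_MAX);
    if (dir == NULL) {
        if (ds_top == 0)
            return -1;
        strcpy(tmp, ds[ds_top - 1]);
        strcpy(ds[ds_top - 1], ds_cur);
        strcpy(ds_cur, tmp);
        return 0;
    }
    if (ds_top == DS_MAX || strlen(dir) >= DS_LEN)
        return -1;
    strcpy(ds[ds_top++], ds_cur);   /* remember where we were */
    strcpy(ds_cur, dir);
    return 0;
}

/* popd: return to the directory on top of the stack. */
int ds_popd(void)
{
    if (ds_top == 0)
        return -1;
    strcpy(ds_cur, ds[--ds_top]);
    return 0;
}

const char *ds_cwd(void)
{
    return ds_cur;
}
```

Both routines return -1 on an empty or full stack and leave the current
directory unchanged in that case.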



Limitations / bugs:

The goddamn dos limitation of 128 bytes of arguments really takes a lot
of the utility out of the wildcards.  That is why fgrep is built in.

I spent a little time working on making this the resident SHELL.  I
gave it up when I ran into problems getting a full segment of variable
space.  It seemed to work alright otherwise.  There are some cryptic
references in the manual sections describing the SET and COMMAND
commands, and the section describing the PSP about where this problem
comes from.  If anyone who knows more about it would like to enlighten
me, I would be very appreciative.

The implementation of history is pretty limited.  There are no modifiers,
and ^^^ is not implemented.  The parsing is not always consistent with
the c-shell's parsing.

The only wildcard character implemented is '*'.  It is not implemented
in the general case.  It only matches final destination files or
directories correctly.  Patterns such as /*/* or /*/xxx do not work.

The ls command and wild card characters use the qsort routine which requires 
adequate stack space to sort its arguments.  Very large directories could
cause problems. The largest directory I have has 85 members.  This will
not sort with a 2K stack size.  It has no problems with a 4K stack size.
Unfortunately, I don't know how to trap this stack error, so this is a
fatal error.
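
For reference, sorting a list of file names with qsort generally looks
like the sketch below.  It uses the modern ANSI prototype rather than
the older declaration style, and sort_names and cmp_names are
hypothetical names, not routines from this program.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort hands the comparison routine pointers to the array elements,
 * so each argument is really a pointer to a char pointer. */
static int cmp_names(const void *a, const void *b)
{
    return strcmp(*(const char * const *)a, *(const char * const *)b);
}

/* Sort n file names into ascending order, in place. */
void sort_names(char **names, size_t n)
{
    assert(names != NULL || n == 0);
    if (n > 1)
        qsort(names, n, sizeof names[0], cmp_names);
}
```

How much stack a given library's qsort uses depends on its
implementation, so the sketch by itself does nothing to avoid the stack
problem described above.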

The parsing of quoted strings is not quite right.  For example, you
can't say 
		set PROMPT="% "
This may be fairly simple to resolve.  In the meantime, I say
		set PROMPT=%\       (with a trailing blank)

No shell variable substitution.  Not hard to do - I just haven't gotten
around to it.  There are no local variables.

There are no shell scripts or control structures.  Oh well.

Fgrep works, but isn't great.  No useful -i options, etc.  I freely
admit that the algorithm was the one that required the absolute minimum
amount of thought.
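
The text above does not say which algorithm fgrep uses.  A brute-force
scan is one plausible reading of "minimum amount of thought", and could
be sketched as follows; line_has is a hypothetical name and this is not
the program's own code.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if str occurs anywhere in line (case-sensitive), else 0. */
int line_has(const char *line, const char *str)
{
    size_t i, j;

    assert(line != NULL && str != NULL);
    for (i = 0; ; i++) {
        /* try to match all of str starting at line[i] */
        for (j = 0; str[j] != '\0' && line[i + j] == str[j]; j++)
            ;
        if (str[j] == '\0')         /* reached the end of str: found */
            return 1;
        if (line[i] == '\0')        /* ran out of line: not found */
            return 0;
    }
}
```

A caller would read each file a line at a time and print the lines for
which this returns 1.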


Porting to another compiler:

As I say, I haven't tried porting to any other compilers, but here are
my best guesses as to trouble spots:

Data structures -

The dos routines use a union called REGS and a structure called SREGS.
REGS is a union of the byte register names (h.ah, h.al, h.bh, etc.) and
the word register names (x.ax, x.bx, etc.).  SREGS contains the segment
register names.

There is a special directive called "far" that indicates that a pointer
is to have an offset and a segment part, both of which can be assigned
to or from independently.  For example, char far * x; declares such a
pointer to char.

Library routines -

	strchr(ptr,ch)		returns char * pointing to ch within ptr or null
	char * ptr;		if not found.  also known as "index"
	int ch;

	strrchr(ptr,ch)		same as above only search backwards (rindex).

	strlwr(ptr)		convert to lower case
	char * ptr;

	strcmpi(a,b)		like strcmp, only ignore case

	char *
	strdup(ptr)		=> strcpy( malloc(strlen(ptr)+1), ptr )
	char * ptr;

	qsort(b,n,w,c)		sort n items at b each of width w using 
	char * b;		comparison routine c
	int n;
	unsigned w;
	int (* c)();

	intdos(ireg,oreg)
	REGS * ireg, * oreg;	issue a dos interrupt first loading registers 
				from ireg.  after returning, assign oreg the 
				register values.

	intdosx(ireg,oreg,sreg)  same as above, but set the segment registers
	REGS * ireg, * oreg;	 before the call, also.
	SREGS * sreg;

	stat(path,statb)	this is just like the Unix stat.  
	struct stat * statb;	I can picture lots of C libraries skipping it.

	spawn*			like a fork/exec combo
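
If a library lacks the three small string routines listed above, they
are easy to supply.  The versions below are a sketch with hypothetical
my_ prefixes so they cannot clash with anything the library does
provide; having my_strlwr return its argument is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Convert s to lower case in place and return it. */
char *my_strlwr(char *s)
{
    char *p;

    assert(s != NULL);
    for (p = s; *p != '\0'; p++)
        *p = (char)tolower((unsigned char)*p);
    return s;
}

/* Like strcmp, but ignoring case. */
int my_strcmpi(const char *a, const char *b)
{
    int ca, cb;

    do {
        ca = tolower((unsigned char)*a++);
        cb = tolower((unsigned char)*b++);
    } while (ca == cb && ca != '\0');
    return ca - cb;
}

/* Return a malloc'ed copy of s, or NULL if out of memory. */
char *my_strdup(const char *s)
{
    char *copy = malloc(strlen(s) + 1);

    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}
```

Unlike the one-line strdup definition given in the list, my_strdup
checks the malloc result before copying.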


Include files -

Microsoft was very complete about declaring the return values of routines
in include files.  They have a whole bunch of include files, and since
they go to all of this trouble, I didn't have to declare much
myself.  This will be a nuisance if you try to use some
other C compiler.  Specifics include malloc.h, stdlib.h, dos.h, etc.

File handling -

Microsoft scores again with relatively sane file handling.  Slashes
work like backslashes.  Stat works more or less normally.  Binary
files are manageable (although I don't think this actually came up
in this particular program).  A center of compiler dependency is
in the routines that do directory queries - open_dir and nxt_entry.
Look there first if you are going to a new compiler.



Extra programs:

Also included are a cp/mv  and  a more that takes file name arguments
and uses inverse video.  I guess I went a little nuts.
More has some little glitches.



	-Douglas Orr
SHAR_EOF
#	End of shell archive
exit 0


