Broff and a proposed net project

Wed Jul 13 07:53:00 AEST 1983

        The design of broff, the successor to [nt]roff, is proceeding at
a fast clip.  I have received several comments on my earlier note on
the design of the new system, and these are summarized at the end of this
submission.  One idea that was proposed by several people, and which seems
intriguing, is to have the software produced as a net project.  The purpose of
the first part of this note is to describe the idea of the net project and
to request volunteers.

        The idea of a net project is as follows.  My job is to act as moderator
("chief programmer" for you IBM types).  I have already drawn up a high level
design of the system (and in fact have coded a good part of it), but various
important subtasks have been left incomplete.  These subtasks are to be
distributed to various interested parties to complete, perhaps even giving
some tasks to more than one party to work on.  My job as moderator is to
collect these parts, choosing or combining different solutions to the same
problem, and producing the final product.

Some problems I can see.

1.      Communication.  Suppose way down in the lowest level of some
task (say font manipulation) someone discovers a serious flaw in the design
that has widespread ramifications.  How do you communicate this to all the
other people working on various aspects of the project?  (I guess that is
part of my job as moderator.)

2.      The myth of egoless programming.  I think redundancy is a good idea.
Since I have no actual control over the people producing the code (i.e., their
jobs are not on the line) I cannot control deadlines or even ensure that
a commitment to produce some part of the project will be satisfied.
Furthermore, since I'm not hiring these people (nor even meeting them face
to face) I have no way of evaluating programming skills or experience.
Thus having multiple parties working on the same project maximizes the
chance that someone will come through with a good version of the desired
code.  Nevertheless, suppose that three different groups are working on
some project.  They all spend a considerable amount of time coming up with
a solution.  They all submit solutions, and my job as moderator is to choose
the best.  It will be difficult not to have two groups quite angry with me.

        I'm sure there would be other problems with this idea, and welcome
any communication on this topic.  Nevertheless, I'm interested enough in
the idea to go forward.

        I have implemented a general framework for the new system, for now
called lroff (for "little roff", later will come broff).  The system is
crude in its capabilities but generally good, I believe, in its design.
At this point I can see the following areas requiring further work:

1.      Line and/or page layout.  This is a real biggie.  The troff
algorithm is well known to be bad.  Perhaps better algorithms, such as those
described by Knuth, can be incorporated into the system without changing the
user interface.

2.      Fonts, font description, virtual fonts, ligatures, special fonts.
This is closely related to 1, but in view of the size of 1 it would probably
be worthwhile to try to define a clean interface and divide the project up.
I have pondered the implementation of virtual fonts (see my earlier note
for a definition of this term) a little, and see some real difficulties in
realizing them.  I would like to see someone spend more time thinking about
this.

3.      Hyphenation algorithm.  Knuth gives a better (although more expensive)
algorithm than the one used by [nt]roff.  This is not as big a project as
1 or 2, but is clearly separable.

4.      Device specifications, device independence, device drivers, etc.
I have pondered this a bit, and am not sure the direction taken by ditroff
was the right one, other than for expediency's sake.  There is a world of
difference between different devices you might like to drive, and I am
not sure any concise description would be adequate.  Perhaps a better
solution is to define a clean interface, and write short drivers for each
device you would like, and then have a troff shell script choose the program
to be run in any particular case.  In any event, this is an area deserving
some more thought.

5.      Macros.  I have written a short set of MS-like macros.  One major
motivation for this project is to provide a set of tools sufficient to bring
the task of macro writing and reading within the grasp of the average
programmer.  (Anybody who has ever peeked inside the ms macro
package will know what I am talking about.)  The current package is very
simple; it could stand being expanded.

6.      Symbol table management.  This is a small task, but a necessary
one.  In lroff I just used linear table lookup for symbol table routines,
since I didn't want to spend any more time than necessary in writing
this section of code.  This should be changed.
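
As a rough illustration of item 6, one obvious replacement for linear
lookup is a hashed table with chaining.  The sketch below is only that:
the class name SymbolTable, its methods, the bucket count and the hash
function are all assumptions made for illustration, and none of it is
taken from lroff.

```python
class SymbolTable:
    """Chained hash table mapping names to values (illustrative only)."""

    def __init__(self, nbuckets=64):
        # One list per bucket; each entry is a [name, value] pair.
        self.buckets = [[] for _ in range(nbuckets)]

    def _bucket(self, name):
        # Simple multiplicative string hash (an assumption, not lroff's).
        h = 0
        for ch in name:
            h = (h * 31 + ord(ch)) & 0xFFFFFFFF
        return self.buckets[h % len(self.buckets)]

    def define(self, name, value):
        """Insert name, or overwrite its value if already present."""
        bucket = self._bucket(name)
        for entry in bucket:
            if entry[0] == name:
                entry[1] = value
                return
        bucket.append([name, value])

    def lookup(self, name, default=None):
        """Return the value bound to name, or default if undefined."""
        for key, value in self._bucket(name):
            if key == name:
                return value
        return default

    def remove(self, name):
        """Delete name; return True if it was present."""
        bucket = self._bucket(name)
        for i, entry in enumerate(bucket):
            if entry[0] == name:
                del bucket[i]
                return True
        return False
```

Only the short chain in one bucket is searched on each lookup, rather
than the whole table.  Separate tables could be kept for strings,
registers and macros, since [nt]roff treats those as distinct name spaces.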

        If anyone is interested in participating in this (historic) project,
drop me a note indicating your area of interest and I will send you a copy
of the system as it stands now, plus more detailed comments on my ideas
for extensions in the particular direction indicated.

--tim budd

        {utah-cs, cornell, taklabs, purdue, ucbvax, kpno} ! arizona ! budd

Now, to summarize the comments I have received:

        The notion of a net project was first suggested by
Ian Darwin (utcsstat!ian).

        Almost everyone commented on the most obvious limitation of [nt]roff,
which is the ridiculous 1 or 2 character name restriction.  Unfortunately,
few people seemed to realise the implication of this change (which was the
reason for my not mentioning it in my first design document).  Consider the
common string OQ, which prints an open quote mark.  This is frequently placed
immediately next to text, as in \*(OQfoo\*(CQ.  If names could be any length,
there would be no way to tell where the name OQ ends and the text foo begins.
I do think there should be some attempt to be "somewhat" upward compatible,
and thus this is somewhat of a problem.  The best solution came from Mario
Ruggiero, also of Toronto (utcsstat!mario), who suggests

        \*A             for one character names
        \*(AB           for two character names
        \*{ABCDEFG}     for N character names.
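
The three forms above can be told apart by looking at the single
character that follows \*.  Here is a minimal sketch of such a scanner.
The function names read_name and expand_strings are hypothetical, and the
choice to expand an undefined string to nothing is an assumption, not a
decision taken for lroff.

```python
def read_name(text, pos):
    """Read a string or register name starting at text[pos].

    Returns (name, next_pos), where next_pos indexes the first
    character after the name.
    """
    if pos >= len(text):
        raise ValueError("missing name")
    ch = text[pos]
    if ch == "(":                       # \*(AB : exactly two characters
        name = text[pos + 1:pos + 3]
        if len(name) != 2:
            raise ValueError("truncated two character name")
        return name, pos + 3
    if ch == "{":                       # \*{ABCDEFG} : any length
        end = text.find("}", pos + 1)
        if end == -1:
            raise ValueError("missing closing brace")
        return text[pos + 1:end], end + 1
    return ch, pos + 1                  # \*A : one character


def expand_strings(text, strings):
    """Replace every \\*name reference in text using the strings table."""
    out = []
    i = 0
    while i < len(text):
        if text.startswith("\\*", i):
            name, i = read_name(text, i + 2)
            # Assumption: an undefined string expands to nothing.
            out.append(strings.get(name, ""))
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

Because the old one and two character forms are scanned exactly as
before, existing input such as \*(OQfoo\*(CQ keeps its meaning, and only
the new brace form admits longer names.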

        Bill Tuthill (ucbvax!G:tut) suggests \n be changed to \#, so that
\n could take on its more conventional "newline" meaning.  After just
woofing about compatibility, I agree.

        watmath!idallen suggests that predefined registers have more
meaningful names, for example \#(pl for page length instead of \#(.p.  I agree.

        Ian Utting (ukc!iau) suggests the line and page breaking algorithms
be rewritten a la Knuth.  I agree.  Can we make it a truly trans-atlantic
project, Ian?
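
To give a flavour of what a paragraph-at-a-time line breaker looks like,
here is a much simplified sketch.  It is not Knuth's algorithm proper,
which works with stretchable spaces, penalties and hyphenation points.
It merely chooses all the breaks of a paragraph together so as to
minimize the sum of the squares of the space left at the end of each line
(the last line is free).  The function name break_lines and the cost
function are assumptions made for illustration.

```python
def break_lines(words, width):
    """Choose line breaks minimizing the sum of squared trailing space.

    The last line is not penalized.  Returns a list of lines (strings).
    """
    n = len(words)
    INF = float("inf")
    # best[i] = least cost of setting words[i:]
    # nxt[i]  = index of the word that starts the line after the one at i
    best = [INF] * n + [0.0]
    nxt = [n] * (n + 1)
    for i in range(n - 1, -1, -1):
        length = -1                      # length of words[i:j], single-spaced
        for j in range(i + 1, n + 1):
            length += len(words[j - 1]) + 1
            if length > width and j > i + 1:
                break                    # overfull; adding words only gets worse
            # A single word wider than the line is set alone at no cost.
            slack = max(width - length, 0)
            cost = 0 if j == n else slack ** 2
            if cost + best[j] < best[i]:
                best[i] = cost + best[j]
                nxt[i] = j
    lines, i = [], 0
    while i < n:
        lines.append(" ".join(words[i:nxt[i]]))
        i = nxt[i]
    return lines
```

For the words "aaa bb cc ddddd" and a width of 6, filling each line
greedily gives "aaa bb" / "cc" / "ddddd", leaving four spaces after "cc".
The sketch instead gives "aaa" / "bb cc" / "ddddd", which spreads the
white space more evenly.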

        Rick Zaccone (psuvax!zaccone) suggests better debugging features.
I've added a "debug" command (.db) that can be followed by a single character
modifier to type out various information (registers, strings, diversions,
etc.).  He also wanted to see a better method of controlling widows and orphans
(sounds rather Dickens-like, doesn't it?).  This is probably tied up in a better
line and page breaking algorithm.

        Guy Harris (rlgvax!guy) suggests a driver be written to produce
the ANSI X3.64 terminal escape sequences.  Sounds good.  What's that?

        Kenneth Almquist (spanky!ka) suggested that macros be radically
redesigned to be more C-like, including local variables, strings, etc.
This has apparently been done at at least one site (see below) but it is
not public.  An interesting idea, but I'm dubious that it would make
reading or writing macros any easier.

        Finally, I got a note describing a C program that recoded some of the
MS macros directly into C, and thus produces documents much faster than nroff.
This note was written a long time ago on a machine in a galaxy far far away ....
Unfortunately, I am not at liberty to describe how the note came into
my hands.