Comments on your program

Joseph S. D. Yao jsdy at hadron.UUCP
Sat Nov 30 10:10:17 AEST 1985


In article <135 at brl-tgr.ARPA> cottrell at nbs-vms.arpa (COTTRELL, JAMES) writes:
>                             ...  `Now why didn't you think before posting?'
>>                                    ...  This program was written
>> to help decode a bitnet routing table that I had been netcopy'd
>> to me and didn't get translated into ascii.  So after running dd
>> over it, the line markers had disappeared into never never land.
>> But from looking real closely at the file I could see that each
>> line was supposed to start with ROUTE.....  thus this program:
>
>It doesn't work. Suppose the sequence `ROUROUTE' occurs. The second
>`R' will not be recognized as the start of the sequence!
>
>I thought of ways to use existing tools to do the job. How about this:
>1) run thru `tr' to change all `R's to newlines. This gives you all
>possible places where a line might start. Now run an `ex' script that
>chex (wheat, corn, rice) each line begins with OUTE. If it doesn't,
>then put back the R. Then for each line that begins with an R, join
>it with the previous line. Finally, put back an R on each line.

Yes, Herron's algorithm won't work without some way of backing up.
No, Cottrell's algorithm won't work either.  It assumes that ALL
NL's have been removed, which is a possible but not necessary
interpretation of the originally stated problem.  In C, one way
to do things is:
	while ((c = my_getchar()) != EOF) {
		if (c != 'R') {
			putchar(c);
			last_put = c;
			continue;
		}
		gather 4 more
		test for ROUTE
		if so, print NL + 5 chars; last_put = 'E';
		else print the 'R'; last_put = 'R';
		     ungetchar 4 (which is why my_getchar())
	}
	if (last_put != NL)	/* almost certainly so */
		putchar(NL);
This assumes that Herron is correct in his assumption that the
word "ROUTE" was one-to-one with line starts.
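
For anyone who wants something compilable, here is a minimal sketch of the same loop, working on a buffer in memory rather than on a stream. It is not the code from either earlier posting. The function name split_on_route() and the buffer interface are made up for illustration; looking ahead in the buffer stands in for the my_getchar()/ungetchar pair. One departure from the outline above, also an assumption: the NL is skipped when the previous output character is already an NL (or nothing has been written yet), so input that kept some of its line ends does not grow blank lines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NL '\n'

/*
 * Copy n bytes from in to out, starting a new line before every "ROUTE".
 * Hypothetical helper: the caller must supply room for n + n/5 + 1 bytes
 * in out.  Returns the number of bytes written; out is not NUL-terminated.
 */
size_t split_on_route(const char *in, size_t n, char *out)
{
    static const char key[] = "ROUTE";
    const size_t klen = sizeof key - 1;
    size_t i = 0, o = 0;

    assert(in != NULL && out != NULL);

    while (i < n) {
        if (n - i >= klen && memcmp(in + i, key, klen) == 0) {
            /* Marker found: break the line unless already at a line start. */
            if (o > 0 && out[o - 1] != NL)
                out[o++] = NL;
            memcpy(out + o, key, klen);
            o += klen;
            i += klen;
        } else {
            /*
             * Not a marker: copy one character and rescan from the next
             * one.  This is what pushing back the 4 look-ahead characters
             * achieves in the stream version, so "ROUROUTE" is handled.
             */
            out[o++] = in[i++];
        }
    }

    /* Finish the last line if the input did not. */
    if (o > 0 && out[o - 1] != NL)
        out[o++] = NL;
    return o;
}
```

Called on the 8 bytes of "ROUROUTE", this writes "ROU", an NL, "ROUTE", and a final NL, so the second `R' is recognized as a line start. Empty input produces no output at all, which also differs slightly from the outline.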

Note also that Herron implies a conversion from E***** to ASCII.
If the original tape/file was blocked with fixed-length records,
then there is a dd arg to size lines (cbs=, I believe).  If
variable-length, he may have to read each record's size from the
original and replace those size fields with the E@#$%^ NL character
before dd'ing.
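
For the fixed-length case, a rough sketch of what sizing lines by record length amounts to: cut the data into cbs-byte records, drop each record's trailing blanks, and end it with an NL. The function name unblock_records() is hypothetical, and the sketch assumes the bytes have already been converted to ASCII (so padding is the ASCII space). It illustrates the idea only and is not a replacement for dd.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Turn fixed-length records of cbs bytes into newline-terminated lines.
 * Hypothetical helper: assumes ASCII input padded with spaces, and that
 * out has room for n bytes plus one newline per record.
 * Returns the number of bytes written; out is not NUL-terminated.
 */
size_t unblock_records(const char *in, size_t n, size_t cbs, char *out)
{
    size_t i = 0, o = 0;

    assert(in != NULL && out != NULL && cbs > 0);

    while (i < n) {
        /* The last record may be short if n is not a multiple of cbs. */
        size_t len = (n - i < cbs) ? n - i : cbs;
        size_t keep = len;

        /* Strip the blank padding at the end of the record. */
        while (keep > 0 && in[i + keep - 1] == ' ')
            keep--;

        memcpy(out + o, in + i, keep);
        o += keep;
        out[o++] = '\n';
        i += len;
    }
    return o;
}
```

With cbs set to 8, the 16 bytes "ROUTE A ROUTE BB" come out as the two lines "ROUTE A" and "ROUTE BB". If the records really are fixed-length, no search for ROUTE is needed at all.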
-- 

	Joe Yao		hadron!jsdy at seismo.{CSS.GOV,ARPA,UUCP}


