the adventure of sed to perl conversion

Wed Sep 14 10:35:16 AEST 1988

In article <836 at gmdzi.UUCP> tietz at gmdzi.UUCP (Christoph Tietz) writes:
: A problem with the different notion of line ends in sed and perl could
: easily be solved. s2p translates the sed expression:
: 
: 	#   second case: no '\' at line end => append '\' to line end and
: 	#		 insert .sym dependency as new line without '\'
: 	#		 at the end
: 	#
: 	s/^\(\([^ \	]*\)_Dummy\.o:.*[^\\]\)$/\1 \\\
: 	\	\	\2\.sym/
: 
: to:
: 
: 	s/^(([^ \	]*)_Dummy\.o:.*[^\\])$/$1 \\\n\	\	$2\.sym/;
: 
: This is not what was intended, because '$1' in the perl script contains the
: whole line including the newline character and the match perl performs
: allows a backslash at the end of the line because the newline character is
: matched against [^\\]. Because of this the perl substitution just inserts
: a new line that contains a space and the backslash even if the matching
: line ends with '\'. I had to change the perl command to:
: 
:     if (/^([^ \	]*)_Dummy\.o:.*[^\\]\n$/) {
:        chop; $_ .= "\\\n";
:        $atext .= "\	\	$1.sym \n";
:     }
:     # print $_ and $atext
: 
: and it worked. My first question: Why is the '\n' before the line end '$'
: necessary?

It isn't strictly necessary.  $ will match either before the \n or at the
end of the string (after the \n, in this case).  I wrote s2p before I put
the "chop" operator into perl, so I traded off an occasional error with
end of line processing for greatly increased speed.  Now that "chop" exists
I should probably rewrite s2p to use it.  At least as an option.  It's
still a little faster to leave the \n on if we can get away with it.
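
For illustration only, here is a rough sketch of what a chop-first version of
the substitution quoted above might look like. The function name
add_sym_dependency is hypothetical and this is not what s2p emits; it simply
strips the newline, applies the original pattern unchanged, and puts the
newline back. It assumes every input line ends in a newline.

```perl
use strict;
use warnings;

# Hypothetical helper: handles the "no '\' at line end" case from the sed script.
sub add_sym_dependency {
    my ($line) = @_;
    chop $line;    # assumes a trailing "\n"; later Perls offer the safer chomp
    # With the newline gone, [^\\] can only match the last visible character.
    if ($line =~ /^(([^ \t]*)_Dummy\.o:.*[^\\])$/) {
        return "$1 \\\n\t\t$2.sym\n";
    }
    return "$line\n";    # line already ends in '\', or is not a _Dummy.o rule
}

print add_sym_dependency("SIMCore_Dummy.o: SIMCore.mod\n");
```

A line that already ends in a backslash comes back unchanged, which is the
behaviour the sed script asked for.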

: What sense does the existence of '$' make if I have to use '\n' to
: anchor a match at the line end?

Ordinarily you don't have to anchor it with a \n, since most things you put
into a regular expression don't match a newline.  A negated character class
is an exception, unfortunately.  S2p should be smarter about negated
character classes, I suppose.  The \s (whitespace) will also match \n.
A dot (.) specifically does NOT match \n.
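
The cases above can be checked with a short script. This is a sketch using a
made-up makefile line; the variable names are only for illustration.

```perl
use strict;
use warnings;

my $continued = "foo.o: foo.c \\\n";    # ends in a backslash, then a newline

# Negated class: [^\\] matches the newline itself and $ then matches at the
# very end of the string, so the trailing backslash goes unnoticed.
my $naive = ($continued =~ /[^\\]$/) ? 1 : 0;

# Spelling out the newline forces [^\\] onto the last visible character.
my $explicit = ($continued =~ /[^\\]\n$/) ? 1 : 0;

# Alternative: exclude the newline from the class and let $ match before it.
my $excluded = ($continued =~ /[^\\\n]$/) ? 1 : 0;

# A dot refuses a lone newline; \s accepts it.
my $dot   = ("\n" =~ /^.$/)  ? 1 : 0;
my $space = ("\n" =~ /^\s$/) ? 1 : 0;

print "naive=$naive explicit=$explicit excluded=$excluded dot=$dot space=$space\n";
# prints: naive=1 explicit=0 excluded=0 dot=0 space=1
```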

: What possibility do I have to end a line
: other than using '\n' as the delimiter?

You can set your input line delimiter to any character you choose,
and your input line will end with that.  The primary reason perl doesn't
strip it off on input is so that "while (<INFILE>)" always evaluates the
input line as true, even if it's a blank line.
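
A small sketch of both points, assuming the delimiter is set through the $/
variable and using in-memory filehandles (a convenience of later Perls) so the
example is self-contained. The sample data is invented.

```perl
use strict;
use warnings;

my $text = "alpha\n\nbeta\n";
open(my $in, '<', \$text) or die "cannot open in-memory file: $!";

my @lines;
while (<$in>) {
    push @lines, $_;    # the blank line arrives as "\n", not as ""
}
close($in);

# The blank line is a true value only while it still carries its delimiter.
my $blank_is_true = $lines[1] ? 1 : 0;
chomp(my $stripped = $lines[1]);
my $stripped_is_true = $stripped ? 1 : 0;

# The same kind of loop with ";" as the input delimiter.
my $records = "a;b;;c";
open(my $rec, '<', \$records) or die "cannot open in-memory file: $!";
my @fields;
{
    local $/ = ";";     # restored automatically at the end of this block
    @fields = <$rec>;
}
close($rec);

print scalar(@lines), " lines, ", scalar(@fields), " records\n";
```

Each record keeps its trailing ";" just as each line keeps its "\n", so an
empty record is still the one-character string ";".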

: The next problem took me more time to solve. s2p translates:
: 
: 		/^[^ ]*\.out:/,/^[ ]*$/d
: 
: to the perl command:
: 
: 	if (/^[^ ]*\.out:/ .. /^[ ]*$/) {
: 	   # skip this input line
: 	}
: 
: The sed script is intended to clean up a makefile that contains the lines:
: 
: SIMCore_Dummy.out: SIMCore_Dummy.o UserCore.o /users/susi/vaxlib/Strings.o 
: 		$(M2C) -e SIMCore_Dummy -o \
: 		SIMCore_Dummy.out $(M2FLAGS) $(M2LINK) 
: 
: objects: 	Alias.sym Alias.o CoreTool.sym CoreTool.o Env.sym Env.o \
: 
: The sed script erases the lines from "SIMCore_Dummy.out:" up to the line
: before "objects:". The perl script erases the whole file following
: "SIMCore_Dummy.out:". The end of the range is never found. If the range
: expression is evaluated before the 'IF' statement everything works fine:
: 
:     $gotcha = /^[^ ]*\.out:/ .. /^[ ]*$/;
:     if ($gotcha) { 
:        # skip this input line
:     }
: 
: does exactly what I wanted it to do. My second question: Is this a bug or
: am I missing some semantic details?

This looks like a real bug, probably brought about by trying to optimize
the conditional.  I'll have to glare at it some.
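
For reference, this is a sketch of the intended behaviour: with the range
operator used directly in the conditional, the range should close on the
first blank line, as it does in sed. The makefile lines are abbreviated from
the ones quoted above, plus one invented leading line.

```perl
use strict;
use warnings;

my @makefile = (
    "all: objects\n",                                   # invented extra line
    "SIMCore_Dummy.out: SIMCore_Dummy.o UserCore.o\n",
    "\t\t\$(M2C) -e SIMCore_Dummy -o \\\n",
    "\t\tSIMCore_Dummy.out \$(M2FLAGS) \$(M2LINK)\n",
    "\n",
    "objects: Alias.sym Alias.o\n",
);

my @kept;
for (@makefile) {
    # The range is true from the ".out:" line through the first blank line.
    next if /^[^ ]*\.out:/ .. /^[ ]*$/;
    push @kept, $_;
}

print @kept;
```

Only the "all:" and "objects:" lines should survive; the blank line that ends
the range is dropped along with the rule, as with the sed d command.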

Larry Wall
lwall at jpl-devvax.jpl.nasa.gov