a perl question

Tom Christiansen tchrist at convex.COM
Sat Nov 11 23:19:04 AEST 1989


In article <RJK.89Nov9162936 at sawmill.uucp> rjk at sawmill.uucp (Richard Kuhns) writes:

|I'm not entirely sure that this is the newsgroup I should use, but
|I've seen a number of perl questions/answers and I don't know of a
|better newsgroup (until comp.lang.perl comes along).

|My question:  I'd dearly love to have a filter, written in perl (the
|rest of the code for this project is in perl, and I'll post it when I
|get it working), which would turn the string `B^HBO^HOL^HLD^HD' into
|`$bold_startBOLD$bold_end', where $bold_start and $bold_end are
|predefined character strings.  I have a filter that does this already
|written in C, but it seems to me I should be able to do it easier in
|perl (using regular expressions?), but I can't come up with a good way
|to do it.  /(.)\010$1/ recognizes one element of such a string (always
|the first).  s/(.)\010$1/$1/g specifically does NOT work (it only
|changes the first occurrence).

This is quite close to what you want:

    $SO = "\033[1m";
    $SE = "\033[m";

    $_ = "this string is B\010BO\010OL\010LD\010D today\n";

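    if (/(.)\010$1/) {                       # found an overstruck character
        $begin = $`;                         # text before the first overstrike
        do { s/$&/$1/; } while /(.)\010$1/;  # collapse each X^HX pair to X
        ( $end = $' ) =~ s/.(.*)/$1/;        # $' with its first character dropped
        s/^$begin/$&$SO/;                    # insert bold-start after the prefix
        s/$end$/$SE$&/;                      # insert bold-end before the suffix
    }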

    print;

I say "quite close" because if you consider the following string:

    $_ = "this string is B\010BO\010OL\010LD\010D and B\010BR\010RI\010IG\010GH\010HT\010T today\n";

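The "and" also gets emboldened, because the code inserts only one $SO and one
$SE around everything from the first overstruck run to the last.  That isn't
quite right, but this should be a good starting point.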

It would be really nice if just
    s/((.)\010$1)+/${SO}$1${SE}/g; 
would somehow work without any explicit looping, but as with your substitute, 
$1 won't be reset on each scan.  I'll forward this to the perl-users
mailing list (who are waiting on comp.lang.perl) to see whether anybody
there has any bright ideas.
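
For what it's worth, here is a minimal sketch of one loop-free way to do it,
assuming a Perl 5 interpreter, where \1 inside a pattern is a backreference
resolved while matching (unlike $1, which is interpolated before the match
starts).  The embolden() name is made up for illustration, and the two-pass
approach is just one possibility: bold each overstruck character on its own,
then delete every bold-end that is immediately followed by a bold-start so
that adjacent characters merge into one run.

```perl
use strict;
use warnings;

my $SO = "\033[1m";   # bold start, same escape sequence as above
my $SE = "\033[m";    # bold end

# Hypothetical helper: turn runs of "X, backspace, X" into $SO...$SE spans.
sub embolden {
    my ($text) = @_;

    # Pass 1: replace each overstruck character with a bolded copy.
    # \1 refers to whatever (.) captured in this same match attempt.
    $text =~ s/(.)\010\1/$SO$1$SE/g;

    # Pass 2: an end marker directly followed by a start marker means two
    # bold characters were adjacent, so drop the pair to merge the spans.
    $text =~ s/\Q$SE$SO\E//g;

    return $text;
}

print embolden("this string is B\010BO\010OL\010LD\010D and "
             . "B\010BR\010RI\010IG\010GH\010HT\010T today\n");
```

Since the space and the "and" between the two runs are never overstruck, they
stay outside the bold spans.  The sketch does assume the input never already
contains $SE immediately followed by $SO.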


--tom

    Tom Christiansen                       {uunet,uiucdcs,sun}!convex!tchrist 
    Convex Computer Corporation                            tchrist at convex.COM
		 "EMACS belongs in <sys/errno.h>: Editor too big!"


