perl 2.0 patch #14

Larry Wall lwall at jpl-devvax.JPL.NASA.GOV
Fri Sep 9 09:27:07 AEST 1988


System: perl version 2.0
Patch #: 14
Priority: MEDIUM
Subject: case insensitive search speedup
Subject: searches should now work on chars with the 8th bit set
Subject: plugged memory leak on searches compiled at run time
Subject: some patterns such as /[Cc]at/ could fail
Subject: /foo/ && s//bar/ could fail
Subject: % should now work with a negative left argument
Subject: closing a pipe now returns child process status in $?
Subject: eof() will no longer dump core when no files are opened with <>
Subject: printf no longer drops last argument after %%
Subject: printf now works more like C version in weird cases
Subject: srand always returns true now
Subject: documented necessity of seek between reads and writes
Subject: $foo = `echo $foo` now works right
Subject: backreferences weren't treated as variable length
Subject: attempted fix for machines where $* = 1 was failing
Subject: added detection of "sort" not used as keyword
Subject: evals of long strings could use up gobs of memory
Subject: $) and $| weren't properly evaluated in `` or ""
Subject: man pages for ld and cc probably not in $mansrc

Description:
	Case insensitive searches have been sped up 5 to 10 times thanks to
	some super work by Bob Best.

	Searches should now work on chars with the 8th bit set.  I had to
	double the table size for the Boyer-Moore searches and declare
	some chars as unsigned to make the pointer math work right.

	Patterns containing variables are compiled every time they are
	executed.  Unfortunately, the storage from the last pattern compilation
	was not reclaimed, resulting in memory gobbling.

	Some patterns such as /[Cc]at/ could fail, in that the optimization
	that first searched for the substring "at" failed to back off
	the correct distance to the beginning of the pattern.  This only
	happened to optimizations that couldn't be hoisted to the cmd level
	but remained at the spat level.

	Saying /foo/ && s//bar/ could fail if the pattern /foo/ was optimized
	at the spat level, since it failed to set lastspat correctly.

	The % operator should now work with a negative left argument.  Since
	some compilers punt on this, I had to catch it and calculate it
	explicitly myself, using only positive operands to %.  It does
	mod not rem, so -1 % 5 is 4.

	Closing a pipe now returns child process status in $?.  I thought
	it did this already but I was wrong.

	The eof() function will no longer dump core when no files are opened
	with <>.  Though why anyone would want eof() in that situation is
	beyond me.

	The printf operator no longer drops last argument after %%.  This
	was caused by decrementing the number of fields remaining for the
	printf even though %% didn't consume one.  printf also now works
	more like the C version when it meets strange things like % with
	an invalid format letter.

	Since the return value of srand is void on some machines and
	undocumented on others, perl's srand always returns true now.

	Most stdio packages require you to do an fseek or reach eof between
	reads and writes to the same stream.  I've now documented the
	restriction in perl, which after all uses stdio.

	$foo = `echo $foo` used to destroy $foo before getting its value.

	In patterns, backreferences weren't treated as variable length, so
	they didn't disable optimizations that depend on finding constant
	substrings.  In particular /(a*) b \1 c \1 d/ produced a constant
	substring of " b  c  d ", obviously wrong.

	I've heard of machines where the assignment $* = 1 doesn't work right.
	Nobody's sent me a fix (that I know of), so I've take a guess at
	what is going wrong.  Lemme know if it still fails.

	A number of perl 1.0 scripts had "sort" as a filehandle.  Since
	perl 2.0 added sort as a reserved word, I've added a diagnostic that
	catches most uses of "sort" as a non-keyword and complains
	appropriately.

	There's a nifty trick, when searching for lots of different strings
	that are not known till runtime, in which you build up a long string
	of commands and eval that so that the patterns are compiled only
	once.  Unfortunately, the way the tokener handled the scanning
	of quoted strings caused enormous gobs of memory to be used up and
	not returned to the free memory pool.  Throwing in a realloc at
	the appropriate place fixes this.

	References to variables $) and $| are illegal in search patterns
	because they look like the end-of-string test at the end of an
	optional pattern.  Unfortunately, they were also made illegal
	in `` and "", where they should be perfectly legal.  This has been
	remedied.

	Configure looked for manual pages in $mansrc.  Since that's usually
	local or new manual pages, ld.1 and cc.1 probably weren't there.

	Some random cleanup:
		"make realclean" now deletes perl.man
		missing " in README
		op.sprintf now tests %%
		in x2p/Makefile.SH added redirection of stderr to /dev/null
		in a2py.c walk() needed to be declared outside of main()
		spelled caesar right in manual

Fix:	From rn, say "| patch -p -N -d DIR", where DIR is your perl source
	directory.  Outside of rn, say "cd DIR; patch -p -N <thisarticle".
	If you don't have the patch program, apply the following by hand,
	or get patch (version 2.0, latest patchlevel).

	After patching:
		Configure -d
		make depend
		make
		make test
		make install

	If patch indicates that patchlevel is the wrong version, you may need
	to apply one or more previous patches, or the patch may already
	have been applied.  See the patchlevel.h file to find out what has or
	has not been applied.  In any event, don't continue with the patch.

	If you are missing previous patches they can be obtained from me:

	Larry Wall
	lwall at jpl-devvax.jpl.nasa.gov

	If you send a mail message of the following form it will greatly speed
	processing:

	Subject: Command
	@SH mailpatch PATH perl 2.0 LIST
		   ^ note the c

	where PATH is a return path FROM ME TO YOU either in Internet notation,
	or in bang notation from some well-known host, and LIST is the number
	of one or more patches you need, separated by spaces, commas, and/or
	hyphens.  Saying 35- says everything from 35 to the end.


	You can also get the patches via anonymous FTP from
	jpl-devvax.jpl.nasa.gov (128.149.8.43).

Index: patchlevel.h
Prereq: 13
1c1
< #define PATCHLEVEL 13
---
> #define PATCHLEVEL 14

Index: Configure
Prereq: 2.0.1.4
*** Configure.old	Wed Sep  7 17:16:11 1988
--- Configure	Wed Sep  7 17:16:14 1988
***************
*** 8,14 ****
  # and edit it to reflect your system.  Some packages may include samples
  # of config.h for certain machines, so you might look for one of those.)
  #
! # $Header: Configure,v 2.0.1.4 88/08/05 01:23:27 root Exp $
  #
  # Yes, you may rip this off to use in other distribution packages.
  # (Note: this Configure script was generated automatically.  Rather than
--- 8,14 ----
  # and edit it to reflect your system.  Some packages may include samples
  # of config.h for certain machines, so you might look for one of those.)
  #
! # $Header: Configure,v 2.0.1.5 88/09/07 16:28:09 lwall Locked $
  #
  # Yes, you may rip this off to use in other distribution packages.
  # (Note: this Configure script was generated automatically.  Rather than
***************
*** 163,168 ****
--- 163,169 ----
  attrlist="$attrlist ns32000 ns16000 iAPX286 mc300 mc500 mc700 sparc"
  attrlist="$attrlist nsc32000 sinix xenix venix posix ansi M_XENIX"
  attrlist="$attrlist $mc68k __STDC__ UTS M_I8086 M_I186 M_I286 M_I386"
+ attrlist="$attrlist i186"
  pth="/usr/ucb /bin /usr/bin /usr/local /usr/local/bin /usr/lbin /etc /usr/lib /lib /usr/local/lib"
  d_newshome="/usr/NeWS"
  defvoidused=7
***************
*** 776,783 ****
  *split)
      case "$split" in
      '') 
! 	if $contains '\-i' $mansrc/ld.1 >/dev/null 2>&1 || \
! 	   $contains '\-i' $mansrc/cc.1 >/dev/null 2>&1; then
  	    dflt='-i'
  	else
  	    dflt='none'
--- 777,784 ----
  *split)
      case "$split" in
      '') 
! 	if $contains '\-i' /usr/man/man1/ld.1 >/dev/null 2>&1 || \
! 	   $contains '\-i' /usr/man/man1/cc.1 >/dev/null 2>&1; then
  	    dflt='-i'
  	else
  	    dflt='none'

Index: Makefile.SH
Prereq: 2.0.1.4
*** Makefile.SH.old	Wed Sep  7 17:16:21 1988
--- Makefile.SH	Wed Sep  7 17:16:22 1988
***************
*** 25,33 ****
  
  echo "Extracting Makefile (with variable substitutions)"
  cat >Makefile <<!GROK!THIS!
! # $Header: Makefile.SH,v 2.0.1.4 88/08/03 22:00:44 root Exp $
  #
  # $Log:	Makefile.SH,v $
  # Revision 2.0.1.4  88/08/03  22:00:44  root
  # 
  # patch11: make install doesn't modify current directory any more
--- 25,36 ----
  
  echo "Extracting Makefile (with variable substitutions)"
  cat >Makefile <<!GROK!THIS!
! # $Header: Makefile.SH,v 2.0.1.5 88/09/07 16:29:26 lwall Locked $
  #
  # $Log:	Makefile.SH,v $
+ # Revision 2.0.1.5  88/09/07  16:29:26  lwall
+ # patch14: make realclean now deletes perl.man
+ # 
  # Revision 2.0.1.4  88/08/03  22:00:44  root
  # 
  # patch11: make install doesn't modify current directory any more
***************
*** 184,190 ****
  	rm -f *.o
  
  realclean:
! 	rm -f perl *.orig */*.orig *~ */*~ *.o core $(addedbyconf)
  
  # The following lint has practically everything turned on.  Unfortunately,
  # you have to wade through a lot of mumbo jumbo that can't be suppressed.
--- 187,193 ----
  	rm -f *.o
  
  realclean:
! 	rm -f perl *.orig */*.orig *~ */*~ *.o core $(addedbyconf) perl.man
  
  # The following lint has practically everything turned on.  Unfortunately,
  # you have to wade through a lot of mumbo jumbo that can't be suppressed.

Index: x2p/Makefile.SH
Prereq: 2.0.1.1
*** x2p/Makefile.SH.old	Wed Sep  7 17:19:18 1988
--- x2p/Makefile.SH	Wed Sep  7 17:19:19 1988
***************
*** 18,26 ****
  esac
  echo "Extracting x2p/Makefile (with variable substitutions)"
  cat >Makefile <<!GROK!THIS!
! # $Header: Makefile.SH,v 2.0.1.1 88/07/11 23:13:39 root Exp $
  #
  # $Log:	Makefile.SH,v $
  # Revision 2.0.1.1  88/07/11  23:13:39  root
  # patch2: now expects more shift/reduce errors
  # 
--- 18,29 ----
  esac
  echo "Extracting x2p/Makefile (with variable substitutions)"
  cat >Makefile <<!GROK!THIS!
! # $Header: Makefile.SH,v 2.0.1.2 88/09/07 17:13:30 lwall Locked $
  #
  # $Log:	Makefile.SH,v $
+ # Revision 2.0.1.2  88/09/07  17:13:30  lwall
+ # patch14: added redirection of stderr to /dev/null
+ # 
  # Revision 2.0.1.1  88/07/11  23:13:39  root
  # patch2: now expects more shift/reduce errors
  # 
***************
*** 93,99 ****
  # won't work with csh
  	export PATH || exit 1
  	- mv $(bin)/a2p $(bin)/a2p.old 2>/dev/null
! 	- mv $(bin)/s2p $(bin)/s2p.old
  	- if test `pwd` != $(bin); then cp $(public) $(bin); fi
  	cd $(bin); \
  for pub in $(public); do \
--- 96,102 ----
  # won't work with csh
  	export PATH || exit 1
  	- mv $(bin)/a2p $(bin)/a2p.old 2>/dev/null
! 	- mv $(bin)/s2p $(bin)/s2p.old 2>/dev/null
  	- if test `pwd` != $(bin); then cp $(public) $(bin); fi
  	cd $(bin); \
  for pub in $(public); do \

Index: README
*** README.old	Wed Sep  7 17:16:27 1988
--- README	Wed Sep  7 17:16:27 1988
***************
*** 66,72 ****
  
  7)  Read the manual entry before running perl.
  
! 8)  Go down to the x2p directory and do a "make depend, a "make" and a
      "make install" to create the awk to perl and sed to perl translators.
  
  9)  IMPORTANT!  Help save the world!  Communicate any problems and suggested
--- 66,72 ----
  
  7)  Read the manual entry before running perl.
  
! 8)  Go down to the x2p directory and do a "make depend", a "make" and a
      "make install" to create the awk to perl and sed to perl translators.
  
  9)  IMPORTANT!  Help save the world!  Communicate any problems and suggested

Index: x2p/a2py.c
Prereq: 2.0.1.2
*** x2p/a2py.c.old	Wed Sep  7 17:19:25 1988
--- x2p/a2py.c	Wed Sep  7 17:19:26 1988
***************
*** 1,6 ****
! /* $Header: a2py.c,v 2.0.1.2 88/08/03 22:50:05 root Exp $
   *
   * $Log:	a2py.c,v $
   * Revision 2.0.1.2  88/08/03  22:50:05  root
   * patch11: in a2p, numbers couldn't start with '.'
   * 
--- 1,9 ----
! /* $Header: a2py.c,v 2.0.1.3 88/09/07 17:15:57 lwall Locked $
   *
   * $Log:	a2py.c,v $
+  * Revision 2.0.1.3  88/09/07  17:15:57  lwall
+  * patch14: walk() needed to be declared outside of main()
+  * 
   * Revision 2.0.1.2  88/08/03  22:50:05  root
   * patch11: in a2p, numbers couldn't start with '.'
   * 
***************
*** 20,25 ****
--- 23,29 ----
  char *filename;
  
  int checkers = 0;
+ STR *walk();
  
  main(argc,argv,env)
  register int argc;
***************
*** 29,35 ****
      register STR *str;
      register char *s;
      int i;
-     STR *walk();
      STR *tmpstr;
  
      linestr = str_new(80);
--- 33,38 ----

Index: arg.c
Prereq: 2.0.1.3
*** arg.c.old	Wed Sep  7 17:16:35 1988
--- arg.c	Wed Sep  7 17:16:38 1988
***************
*** 1,6 ****
! /* $Header: arg.c,v 2.0.1.3 88/08/03 22:06:41 root Exp $
   *
   * $Log:	arg.c,v $
   * Revision 2.0.1.3  88/08/03  22:06:41  root
   * patch11: support for broken 386 compiler
   * patch11: join of null array could leave destination string non-null
--- 1,17 ----
! /* $Header: arg.c,v 2.0.1.4 88/09/07 16:46:25 lwall Locked $
   *
   * $Log:	arg.c,v $
+  * Revision 2.0.1.4  88/09/07  16:46:25  lwall
+  * patch14: case insensitive search speedup
+  * patch14: plugged memory leak on searches compiled at run time
+  * patch14: some patterns such as /[Cc]at/ could fail
+  * patch14: /foo/ && s//bar/ could fail
+  * patch14: closing a pipe now returns child process status in $?
+  * patch14: eof() will no longer dump core when no files are opened with <>
+  * patch14: printf no longer drops last argument after %%
+  * patch14: printf now works more like C version in weird cases
+  * patch14: searches should now work on chars with the 128 bit set
+  * 
   * Revision 2.0.1.3  88/08/03  22:06:41  root
   * patch11: support for broken 386 compiler
   * patch11: join of null array could leave destination string non-null
***************
*** 28,33 ****
--- 39,45 ----
  #include <errno.h>
  
  extern int errno;
+ extern char fold[];
  
  STR *
  do_match(arg,retary,sarg,ptrmaxsarg,sargoff,cushion)
***************
*** 64,69 ****
--- 76,83 ----
  	if (debug & 8)
  	    deb("2.SPAT /%s/\n",t);
  #endif
+ 	if (spat->spat_regexp)
+ 	    regfree(spat->spat_regexp);
  	spat->spat_regexp = regcomp(t,spat->spat_flags & SPAT_FOLD,1);
  	if (!*spat->spat_regexp->precomp && lastspat)
  	    spat = lastspat;
***************
*** 119,125 ****
  		    goto nope;
  		else if (spat->spat_flags & SPAT_ALL)
  		    goto yup;
! 		else if (spat->spat_regexp->regback >= 0) {
  		    ++*(long*)&spat->spat_short->str_nval;
  		    s -= spat->spat_regexp->regback;
  		    if (s < t)
--- 133,139 ----
  		    goto nope;
  		else if (spat->spat_flags & SPAT_ALL)
  		    goto yup;
! 		if (s && spat->spat_regexp->regback >= 0) {
  		    ++*(long*)&spat->spat_short->str_nval;
  		    s -= spat->spat_regexp->regback;
  		    if (s < t)
***************
*** 183,188 ****
--- 197,205 ----
  
  yup:
      ++*(long*)&spat->spat_short->str_nval;
+     lastspat = spat;
+     if (spat->spat_flags & SPAT_ONCE)
+ 	spat->spat_flags |= SPAT_USED;
      if (sawampersand) {
  	char *tmps;
  
***************
*** 216,221 ****
--- 233,240 ----
  	fatal("panic: do_subst");
      else if (spat->spat_runtime) {
  	m = str_get(eval(spat->spat_runtime,Null(STR***),-1));
+ 	if (spat->spat_regexp)
+ 	    regfree(spat->spat_regexp);
  	spat->spat_regexp = regcomp(m,spat->spat_flags & SPAT_FOLD,1);
      }
  #ifdef DEBUGGING
***************
*** 249,255 ****
  	    }
  	    else if (!(s = fbminstr(s, strend, spat->spat_short)))
  		goto nope;
! 	    else if (spat->spat_regexp->regback >= 0) {
  		++*(long*)&spat->spat_short->str_nval;
  		s -= spat->spat_regexp->regback;
  		if (s < m)
--- 268,274 ----
  	    }
  	    else if (!(s = fbminstr(s, strend, spat->spat_short)))
  		goto nope;
! 	    if (s && spat->spat_regexp->regback >= 0) {
  		++*(long*)&spat->spat_short->str_nval;
  		s -= spat->spat_regexp->regback;
  		if (s < m)
***************
*** 363,368 ****
--- 382,389 ----
  	    arg_free(spat->spat_runtime);	/* it won't change, so */
  	    spat->spat_runtime = Nullarg;	/* no point compiling again */
  	}
+ 	if (spat->spat_regexp)
+ 	    regfree(spat->spat_regexp);
  	spat->spat_regexp = regcomp(m,spat->spat_flags & SPAT_FOLD,1);
      }
  #ifdef DEBUGGING
***************
*** 725,732 ****
  	return FALSE;
      }
      if (stio->fp) {
! 	if (stio->type == '|')
! 	    retval = (pclose(stio->fp) >= 0);
  	else if (stio->type == '-')
  	    retval = TRUE;
  	else {
--- 746,756 ----
  	return FALSE;
      }
      if (stio->fp) {
! 	if (stio->type == '|') {
! 	    status = pclose(stio->fp);
! 	    retval = (status >= 0);
! 	    statusvalue = (unsigned)status & 0xffff;
! 	}
  	else if (stio->type == '-')
  	    retval = TRUE;
  	else {
***************
*** 755,762 ****
      register STIO *stio;
      int ch;
  
!     if (!stab)			/* eof() */
! 	stio = argvstab->stab_io;
      else
  	stio = stab->stab_io;
  
--- 779,790 ----
      register STIO *stio;
      int ch;
  
!     if (!stab) {			/* eof() */
! 	if (argvstab)
! 	    stio = argvstab->stab_io;
! 	else
! 	    return TRUE;
!     }
      else
  	stio = stab->stab_io;
  
***************
*** 1105,1120 ****
  	    break;		/* not enough % patterns, oh well */
  	for (t++; *sarg && *t && t != s; t++) {
  	    switch (*t) {
! 	    case '\0':
! 		t--;
! 		break;
! 	    case '%':
  		ch = *(++t);
  		*t = '\0';
  		sprintf(buf,s);
  		s = t;
  		*(t--) = ch;
  		break;
  	    case 'l':
  		dolong = TRUE;
  		break;
--- 1133,1153 ----
  	    break;		/* not enough % patterns, oh well */
  	for (t++; *sarg && *t && t != s; t++) {
  	    switch (*t) {
! 	    default:
  		ch = *(++t);
  		*t = '\0';
  		sprintf(buf,s);
  		s = t;
  		*(t--) = ch;
+ 		len++;
  		break;
+ 	    case '0': case '1': case '2': case '3': case '4':
+ 	    case '5': case '6': case '7': case '8': case '9': 
+ 	    case '.': case '#': case '-': case '+':
+ 		break;
+ 	    case '\0':
+ 		t--;
+ 		break;
  	    case 'l':
  		dolong = TRUE;
  		break;
***************
*** 1751,1757 ****
  do_study(str)
  STR *str;
  {
!     register char *s = str_get(str);
      register int pos = str->str_cur;
      register int ch;
      register int *sfirst;
--- 1784,1790 ----
  do_study(str)
  STR *str;
  {
!     register unsigned char *s = (unsigned char*)(str_get(str));
      register int pos = str->str_cur;
      register int ch;
      register int *sfirst;
***************
*** 1794,1799 ****
--- 1827,1839 ----
  	else
  	    snext[pos] = -pos;
  	sfirst[ch] = pos;
+ 
+ 	/* If there were any case insensitive searches, we must assume they
+ 	 * all are.  This speeds up insensitive searches much more than
+ 	 * it slows down sensitive ones.
+ 	 */
+ 	if (sawi)
+ 	    sfirst[fold[ch]] = pos;
      }
  
      str->str_pok |= 4;

Index: eval.c
Prereq: 2.0.1.5
*** eval.c.old	Wed Sep  7 17:16:53 1988
--- eval.c	Wed Sep  7 17:16:56 1988
***************
*** 1,6 ****
! /* $Header: eval.c,v 2.0.1.5 88/08/03 22:17:04 root Exp $
   *
   * $Log:	eval.c,v $
   * Revision 2.0.1.5  88/08/03  22:17:04  root
   * patch11: support for incompetent 386 compiler
   * patch11: support for Sun compiler that can't cast double to unsigned long.
--- 1,10 ----
! /* $Header: eval.c,v 2.0.1.6 88/09/07 16:49:52 lwall Locked $
   *
   * $Log:	eval.c,v $
+  * Revision 2.0.1.6  88/09/07  16:49:52  lwall
+  * patch14: % should now work with a negative left argument
+  * patch14: srand always returns true now
+  * 
   * Revision 2.0.1.5  88/08/03  22:17:04  root
   * patch11: support for incompetent 386 compiler
   * patch11: support for Sun compiler that can't cast double to unsigned long.
***************
*** 489,496 ****
      case O_MODULO:
      	if ((tmplong = (long) str_gnum(sarg[2])) == 0L)
      	    fatal("Illegal modulus zero");
! 	value = str_gnum(sarg[1]);
! 	value = (double)(((long)value) % tmplong);
  	goto donumset;
      case O_ADD:
  	value = str_gnum(sarg[1]);
--- 493,503 ----
      case O_MODULO:
      	if ((tmplong = (long) str_gnum(sarg[2])) == 0L)
      	    fatal("Illegal modulus zero");
! 	when = (long)str_gnum(sarg[1]);
! 	if (when >= 0)
! 	    value = (double)(when % tmplong);
! 	else
! 	    value = (double)(tmplong - (-when % tmplong));
  	goto donumset;
      case O_ADD:
  	value = str_gnum(sarg[1]);
***************
*** 1080,1087 ****
  #endif
  	goto donumset;
      case O_SRAND:
! 	value = (double)srand((int)str_gnum(sarg[1]));
! 	goto donumset;
      case O_EXP:
  	value = exp(str_gnum(sarg[1]));
  	goto donumset;
--- 1087,1095 ----
  #endif
  	goto donumset;
      case O_SRAND:
! 	srand((int)str_gnum(sarg[1]));
! 	str = &str_yes;
! 	break;
      case O_EXP:
  	value = exp(str_gnum(sarg[1]));
  	goto donumset;

Index: t/op.sprintf
Prereq: 2.0
*** t/op.sprintf.old	Wed Sep  7 17:18:53 1988
--- t/op.sprintf	Wed Sep  7 17:18:54 1988
***************
*** 1,8 ****
  #!./perl
  
! # $Header: op.sprintf,v 2.0 88/06/05 00:14:40 root Exp $
  
  print "1..1\n";
  
! $x = sprintf("%3s %-4s foo %5d%c%3.1f","hi",123,456,65,3.0999);
! if ($x eq ' hi 123  foo   456A3.1') {print "ok 1\n";} else {print "not ok 1\n";}
--- 1,8 ----
  #!./perl
  
! # $Header: op.sprintf,v 2.0.1.1 88/09/07 17:04:35 lwall Locked $
  
  print "1..1\n";
  
! $x = sprintf("%3s %-4s%%foo %5d%c%3.1f","hi",123,456,65,3.0999);
! if ($x eq ' hi 123 %foo   456A3.1') {print "ok 1\n";} else {print "not ok 1\n";}

Index: perl.h
Prereq: 2.0.1.2
*** perl.h.old	Wed Sep  7 17:17:04 1988
--- perl.h	Wed Sep  7 17:17:05 1988
***************
*** 1,6 ****
! /* $Header: perl.h,v 2.0.1.2 88/08/03 22:19:11 root Exp $
   *
   * $Log:	perl.h,v $
   * Revision 2.0.1.2  88/08/03  22:19:11  root
   * patch11: some support for crippled compilers that don't grok str_get macro
   * patch11: str_peek improperly reused a buffer
--- 1,9 ----
! /* $Header: perl.h,v 2.0.1.3 88/09/07 16:51:18 lwall Locked $
   *
   * $Log:	perl.h,v $
+  * Revision 2.0.1.3  88/09/07  16:51:18  lwall
+  * patch14: added sawi variable to optimize study when no //i found
+  * 
   * Revision 2.0.1.2  88/08/03  22:19:11  root
   * patch11: some support for crippled compilers that don't grok str_get macro
   * patch11: str_peek improperly reused a buffer
***************
*** 231,236 ****
--- 234,240 ----
  EXT bool allstabs INIT(FALSE);	/* init all customary symbols in symbol table?*/
  EXT bool sawampersand INIT(FALSE);	/* must save all match strings */
  EXT bool sawstudy INIT(FALSE);		/* do fbminstr on all strings */
+ EXT bool sawi INIT(FALSE);		/* study must assume case insensitive */
  
  #define TMPPATH "/tmp/perl-eXXXXXX"
  EXT char *e_tmpname;

Index: perl.man.1
Prereq: 2.0.1.4
*** perl.man.1.old	Wed Sep  7 17:17:16 1988
--- perl.man.1	Wed Sep  7 17:17:20 1988
***************
*** 1,7 ****
  .rn '' }`
! ''' $Header: perl.man.1,v 2.0.1.4 88/08/03 22:21:28 root Exp $
  ''' 
  ''' $Log:	perl.man.1,v $
  ''' Revision 2.0.1.4  88/08/03  22:21:28  root
  ''' patch11: random typos and clarifications
  ''' 
--- 1,10 ----
  .rn '' }`
! ''' $Header: perl.man.1,v 2.0.1.5 88/09/07 16:52:04 lwall Locked $
  ''' 
  ''' $Log:	perl.man.1,v $
+ ''' Revision 2.0.1.5  88/09/07  16:52:04  lwall
+ ''' patch14: documented setting $? by closing pipe
+ ''' 
  ''' Revision 2.0.1.4  88/08/03  22:21:28  root
  ''' patch11: random typos and clarifications
  ''' 
***************
*** 1261,1266 ****
--- 1264,1270 ----
  does not.
  Also, closing a pipe will wait for the process executing on the pipe to complete,
  in case you want to look at the output of the pipe afterwards.
+ Closing a pipe explicitly also puts the status value of the command into $?.
  Example:
  .nf
  

Index: perl.man.2
Prereq: 2.0.1.5
*** perl.man.2.old	Wed Sep  7 17:17:37 1988
--- perl.man.2	Wed Sep  7 17:17:42 1988
***************
*** 1,7 ****
  ''' Beginning of part 2
! ''' $Header: perl.man.2,v 2.0.1.5 88/08/05 01:27:31 root Exp $
  '''
  ''' $Log:	perl.man.2,v $
  ''' Revision 2.0.1.5  88/08/05  01:27:31  root
  ''' patch13: clarified goto problems
  ''' 
--- 1,13 ----
  ''' Beginning of part 2
! ''' $Header: perl.man.2,v 2.0.1.6 88/09/07 16:54:49 lwall Locked $
  '''
  ''' $Log:	perl.man.2,v $
+ ''' Revision 2.0.1.6  88/09/07  16:54:49  lwall
+ ''' patch14: spelled caesar right
+ ''' patch14: generalized $? slightly
+ ''' patch14: removed caveat about % of negative numbers
+ ''' patch14: documented necessity of seek between reads and writes
+ ''' 
  ''' Revision 2.0.1.5  88/08/05  01:27:31  root
  ''' patch13: clarified goto problems
  ''' 
***************
*** 233,239 ****
  
  	open(LOG, \'>>/usr/spool/news/twitlog\'\|);	# (log is reserved)
  
! 	open(article, "caeser <$article |"\|);		# decrypt article
  
  	open(extract, "|sort >/tmp/Tmp$$"\|);		# $$ is our process#
  
--- 239,245 ----
  
  	open(LOG, \'>>/usr/spool/news/twitlog\'\|);	# (log is reserved)
  
! 	open(article, "caesar <$article |"\|);		# decrypt article
  
  	open(extract, "|sort >/tmp/Tmp$$"\|);		# $$ is our process#
  
***************
*** 323,329 ****
  	open(FOO, "\-|") || exec \'cat\', \'\-n\', $file;
  
  .fi
! Explicitly closing the filehandle causes the parent process to wait for the
  child to finish, and returns the status value in $?.
  .Ip "ord(EXPR)" 8 3
  Returns the ascii value of the first character of EXPR.
--- 329,335 ----
  	open(FOO, "\-|") || exec \'cat\', \'\-n\', $file;
  
  .fi
! Explicitly closing any piped filehandle causes the parent process to wait for the
  child to finish, and returns the status value in $?.
  .Ip "ord(EXPR)" 8 3
  Returns the ascii value of the first character of EXPR.
***************
*** 1270,1276 ****
  running this script.
  (Mnemonic: same as shells.)
  .Ip $? 8
! The status returned by the last backtick (\`\`) command or
  .I system
  operator.
  Note that this is the status word returned by the wait() system
--- 1276,1282 ----
  running this script.
  (Mnemonic: same as shells.)
  .Ip $? 8
! The status returned by the last pipe close, backtick (\`\`) command or
  .I system
  operator.
  Note that this is the status word returned by the wait() system
***************
*** 1630,1637 ****
  .PP
  .I Perl
  is at the mercy of the C compiler's definitions of various operations
! such as % and atof().
! In particular, don't trust % on negative numbers.
  .PP
  While none of the built-in data types have any arbitrary size limits (apart
  from memory size), there are still a few arbitrary limits:
--- 1636,1646 ----
  .PP
  .I Perl
  is at the mercy of the C compiler's definitions of various operations
! such atof().
! .PP
! If your stdio requires an seek or eof between reads and writes on a particular
! stream, so does
! .IR perl .
  .PP
  While none of the built-in data types have any arbitrary size limits (apart
  from memory size), there are still a few arbitrary limits:

Index: perl.y
Prereq: 2.0.1.3
*** perl.y.old	Wed Sep  7 17:17:53 1988
--- perl.y	Wed Sep  7 17:17:55 1988
***************
*** 1,6 ****
! /* $Header: perl.y,v 2.0.1.3 88/08/03 22:25:12 root Exp $
   *
   * $Log:	perl.y,v $
   * Revision 2.0.1.3  88/08/03  22:25:12  root
   * patch11: deleted fossilized join syntax
   * patch11: fixed join('a','b')
--- 1,9 ----
! /* $Header: perl.y,v 2.0.1.4 88/09/07 16:55:41 lwall Locked $
   *
   * $Log:	perl.y,v $
+  * Revision 2.0.1.4  88/09/07  16:55:41  lwall
+  * patch14: case insensitive search speedup
+  * 
   * Revision 2.0.1.3  88/08/03  22:25:12  root
   * patch11: deleted fossilized join syntax
   * patch11: fixed join('a','b')
***************
*** 656,662 ****
  	|	FUNC2 '(' sexpr ',' expr ')'
  			{ $$ = make_op($1, 2, $3, $5, Nullarg, 0);
  			    if ($1 == O_INDEX && $$[2].arg_type == A_SINGLE)
! 				fbmcompile($$[2].arg_ptr.arg_str); }
  	|	FUNC3 '(' sexpr ',' sexpr ',' expr ')'
  			{ $$ = make_op($1, 3, $3, $5, $7, 0); }
  	|	STABFUN '(' WORD ')'
--- 659,665 ----
  	|	FUNC2 '(' sexpr ',' expr ')'
  			{ $$ = make_op($1, 2, $3, $5, Nullarg, 0);
  			    if ($1 == O_INDEX && $$[2].arg_type == A_SINGLE)
! 				fbmcompile($$[2].arg_ptr.arg_str,0); }
  	|	FUNC3 '(' sexpr ',' sexpr ',' expr ')'
  			{ $$ = make_op($1, 3, $3, $5, $7, 0); }
  	|	STABFUN '(' WORD ')'

Index: perly.c
Prereq: 2.0.1.6
*** perly.c.old	Wed Sep  7 17:18:11 1988
--- perly.c	Wed Sep  7 17:18:17 1988
***************
*** 1,6 ****
! char rcsid[] = "$Header: perly.c,v 2.0.1.6 88/08/05 01:29:43 root Exp $";
  /*
   * $Log:	perly.c,v $
   * Revision 2.0.1.6  88/08/05  01:29:43  root
   * patch13: fixed loop stack overflow on goto
   * patch13: fixed recursive subroutine storage management
--- 1,10 ----
! char rcsid[] = "$Header: perly.c,v 2.0.1.7 88/09/07 16:57:47 lwall Locked $";
  /*
   * $Log:	perly.c,v $
+  * Revision 2.0.1.7  88/09/07  16:57:47  lwall
+  * patch14: $foo = `echo $foo` now works right
+  * patch14: % should now work with a negative left argument
+  * 
   * Revision 2.0.1.6  88/08/05  01:29:43  root
   * patch13: fixed loop stack overflow on goto
   * patch13: fixed recursive subroutine storage management
***************
*** 1103,1110 ****
  	       (chld[1].arg_type == A_INDREAD && !(arg[1].arg_flags & AF_SPECIAL))
  		||
  	       (chld[1].arg_type == A_GLOB && !(arg[1].arg_flags & AF_SPECIAL))
! 		||
! 	       chld[1].arg_type == A_BACKTICK ) ) ) ) {
  	    arg[2].arg_type = chld[1].arg_type;
  	    arg[2].arg_ptr = chld[1].arg_ptr;
  	    free_arg(chld);
--- 1107,1113 ----
  	       (chld[1].arg_type == A_INDREAD && !(arg[1].arg_flags & AF_SPECIAL))
  		||
  	       (chld[1].arg_type == A_GLOB && !(arg[1].arg_flags & AF_SPECIAL))
! 	      ) ) ) ) {
  	    arg[2].arg_type = chld[1].arg_type;
  	    arg[2].arg_ptr = chld[1].arg_ptr;
  	    free_arg(chld);
***************
*** 1184,1189 ****
--- 1187,1193 ----
      register char *tmps;
      int i;
      unsigned long tmplong;
+     long tmp2;
      double exp(), log(), sqrt(), modf();
      char *crypt();
      double sin(), cos(), atan2(), pow();
***************
*** 1227,1233 ****
  	    tmplong = (long)str_gnum(s2);
  	    if (tmplong == 0L)
  		fatal("Illegal modulus of constant zero");
! 	    str_numset(str,(double)(((long)str_gnum(s1)) % tmplong));
  	    break;
  	case O_ADD:
  	    value = str_gnum(s1);
--- 1231,1241 ----
  	    tmplong = (long)str_gnum(s2);
  	    if (tmplong == 0L)
  		fatal("Illegal modulus of constant zero");
! 	    tmp2 = (long)str_gnum(s1);
! 	    if (tmp2 >= 0)
! 		str_numset(str,(double)(tmp2 % tmplong));
! 	    else
! 		str_numset(str,(double)(tmplong - (-tmp2 % tmplong)));
  	    break;
  	case O_ADD:
  	    value = str_gnum(s1);

Index: regexp.c
Prereq: 2.0.1.4
*** regexp.c.old	Wed Sep  7 17:18:30 1988
--- regexp.c	Wed Sep  7 17:18:34 1988
***************
*** 7,15 ****
   * blame Henry for some of the lack of readability.
   */
  
! /* $Header: regexp.c,v 2.0.1.4 88/08/03 22:37:26 root Exp $
   *
   * $Log:	regexp.c,v $
   * Revision 2.0.1.4  88/08/03  22:37:26  root
   * patch11: deleted regchar()
   * patch11: fixed some pointer arithmetic that didn't work on the 286
--- 7,19 ----
   * blame Henry for some of the lack of readability.
   */
  
! /* $Header: regexp.c,v 2.0.1.5 88/09/07 17:02:10 lwall Locked $
   *
   * $Log:	regexp.c,v $
+  * Revision 2.0.1.5  88/09/07  17:02:10  lwall
+  * patch14: case insensitive search speedup
+  * patch14: backreferences weren't treated as variable length
+  * 
   * Revision 2.0.1.4  88/08/03  22:37:26  root
   * patch11: deleted regchar()
   * patch11: fixed some pointer arithmetic that didn't work on the 286
***************
*** 154,160 ****
   */
  
  /* The following have no fixed length. */
! char varies[] = {BRANCH,BACK,STAR,PLUS,REF,0};
  
  /* The following always have a length of 1. */
  char simple[] = {ANY,ANYOF,ANYBUT,ALNUM,NALNUM,SPACE,NSPACE,DIGIT,NDIGIT,0};
--- 158,165 ----
   */
  
  /* The following have no fixed length. */
! char varies[] = {BRANCH,BACK,STAR,PLUS,
! 	REF+1,REF+2,REF+3,REF+4,REF+5,REF+6,REF+7,REF+8,REF+9,0};
  
  /* The following always have a length of 1. */
  char simple[] = {ANY,ANYOF,ANYBUT,ALNUM,NALNUM,SPACE,NSPACE,DIGIT,NDIGIT,0};
***************
*** 337,343 ****
  	r->regback = -1;
  	r->regstclass = Nullch;
  	scan = r->program+1;			/* First BRANCH. */
! 	if (!fold && OP(regnext(scan)) == END) {/* Only one top-level choice. */
  		scan = NEXTOPER(scan);
  
  		first = scan;
--- 342,348 ----
  	r->regback = -1;
  	r->regstclass = Nullch;
  	scan = r->program+1;			/* First BRANCH. */
! 	if (OP(regnext(scan)) == END) {/* Only one top-level choice. */
  		scan = NEXTOPER(scan);
  
  		first = scan;
***************
*** 347,354 ****
  			first = NEXTOPER(first);
  
  		/* Starting-point info. */
! 		if (OP(first) == EXACTLY)
  			r->regstart = str_make(OPERAND(first)+1);
  		else if ((exp = index(simple,OP(first))) && exp > simple)
  			r->regstclass = first;
  		else if (OP(first) == BOUND || OP(first) == NBOUND)
--- 352,362 ----
  			first = NEXTOPER(first);
  
  		/* Starting-point info. */
! 		if (OP(first) == EXACTLY) {
  			r->regstart = str_make(OPERAND(first)+1);
+ 			if (r->regstart->str_cur > !(sawstudy|fold))
+ 				fbmcompile(r->regstart,fold);
+ 		}
  		else if ((exp = index(simple,OP(first))) && exp > simple)
  			r->regstclass = first;
  		else if (OP(first) == BOUND || OP(first) == NBOUND)
***************
*** 411,418 ****
  			if (back < 0)
  				back = -1;
  			r->regback = back;
! 			if (len > !(sawstudy))
! 				fbmcompile(r->regmust);
  			*(long*)&r->regmust->str_nval = 100;
  		}
  		else
--- 419,426 ----
  			if (back < 0)
  				back = -1;
  			r->regback = back;
! 			if (len > !(sawstudy|fold))
! 				fbmcompile(r->regmust,fold);
  			*(long*)&r->regmust->str_nval = 100;
  		}
  		else

Index: stab.c
Prereq: 2.0.1.4
*** stab.c.old	Wed Sep  7 17:18:42 1988
--- stab.c	Wed Sep  7 17:18:44 1988
***************
*** 1,6 ****
! /* $Header: stab.c,v 2.0.1.4 88/08/03 22:38:51 root Exp $
   *
   * $Log:	stab.c,v $
   * Revision 2.0.1.4  88/08/03  22:38:51  root
   * patch11: added sanity check on $- going negative
   * 
--- 1,9 ----
! /* $Header: stab.c,v 2.0.1.5 88/09/07 17:03:28 lwall Locked $
   *
   * $Log:	stab.c,v $
+  * Revision 2.0.1.5  88/09/07  17:03:28  lwall
+  * patch14: attempted fix for machines where $* = 1 was failing
+  * 
   * Revision 2.0.1.4  88/08/03  22:38:51  root
   * patch11: added sanity check on $- going negative
   * 
***************
*** 286,292 ****
  	    }
  	    break;
  	case '*':
! 	    multiline = (int)str_gnum(str) != 0;
  	    break;
  	case '/':
  	    record_separator = *str_get(str);
--- 289,296 ----
  	    }
  	    break;
  	case '*':
! 	    i = (int)str_gnum(str);
! 	    multiline = (i != 0);
  	    break;
  	case '/':
  	    record_separator = *str_get(str);

Index: str.h
Prereq: 2.0.1.1
*** str.h.old	Wed Sep  7 17:18:49 1988
--- str.h	Wed Sep  7 17:18:50 1988
***************
*** 1,6 ****
! /* $Header: str.h,v 2.0.1.1 88/08/03 22:43:53 root Exp $
   *
   * $Log:	str.h,v $
   * Revision 2.0.1.1  88/08/03  22:43:53  root
   * patch11: support for botched C compilers that ungrok && outside of conditionals
   * 
--- 1,9 ----
! /* $Header: str.h,v 2.0.1.2 88/09/07 17:04:00 lwall Locked $
   *
   * $Log:	str.h,v $
+  * Revision 2.0.1.2  88/09/07  17:04:00  lwall
+  * patch14: searches should now work on chars with the 128 bit set
+  * 
   * Revision 2.0.1.1  88/08/03  22:43:53  root
   * patch11: support for botched C compilers that ungrok && outside of conditionals
   * 
***************
*** 20,27 ****
      } str_link;
      char	str_pok;	/* state of str_ptr */
      char	str_nok;	/* state of str_nval */
!     char	str_rare;	/* used by search strings */
!     char	str_prev;	/* also used by search strings */
  };
  
  #define Nullstr Null(STR*)
--- 23,30 ----
      } str_link;
      char	str_pok;	/* state of str_ptr */
      char	str_nok;	/* state of str_nval */
!     unsigned char str_rare;	/* used by search strings */
!     unsigned char str_prev;	/* also used by search strings */
  };
  
  #define Nullstr Null(STR*)

Index: toke.c
Prereq: 2.0.1.4
*** toke.c.old	Wed Sep  7 17:19:02 1988
--- toke.c	Wed Sep  7 17:19:05 1988
***************
*** 1,6 ****
! /* $Header: toke.c,v 2.0.1.4 88/08/03 22:47:39 root Exp $
   *
   * $Log:	toke.c,v $
   * Revision 2.0.1.4  88/08/03  22:47:39  root
   * patch11: unterminated literal strings blew up tokener in eval
   * 
--- 1,12 ----
! /* $Header: toke.c,v 2.0.1.5 88/09/07 17:09:52 lwall Locked $
   *
   * $Log:	toke.c,v $
+  * Revision 2.0.1.5  88/09/07  17:09:52  lwall
+  * patch14: added detection of "sort" not used as keyword
+  * patch14: case insensitive search speedup
+  * patch14: evals of long strings could use up gobs of memory
+  * patch14: $) and $| weren't properly evaluated in `` or ""
+  * 
   * Revision 2.0.1.4  88/08/03  22:47:39  root
   * patch11: unterminated literal strings blew up tokener in eval
   * 
***************
*** 591,596 ****
--- 597,605 ----
  	if (strEQ(d,"symlink"))
  	    FUN2(O_SYMLINK);
  	if (strEQ(d,"sort")) {
+ 	    while (*s && isspace(*s)) s++;
+ 	    if (*s == ';' || *s == ')')
+ 		fatal("sort is now a reserved word\n");
  	    yylval.ival = O_SORT;
  	    OPERATOR(LISTOP);
  	}
***************
*** 781,786 ****
--- 790,796 ----
  {
      register SPAT *spat = (SPAT *) safemalloc(sizeof (SPAT));
      register char *d;
+     SPAT savespat;
  
      bzero((char *)spat, sizeof(SPAT));
      spat->spat_next = spat_root;	/* link into spat list */
***************
*** 804,809 ****
--- 814,820 ----
      s++;
      if (*s == 'i') {
  	s++;
+ 	sawi = TRUE;
  	spat->spat_flags |= SPAT_FOLD;
      }
      for (d=tokenbuf; *d; d++) {
***************
*** 817,843 ****
  	    goto got_pat;		/* skip compiling for now */
  	}
      }
!     if (!(spat->spat_flags & SPAT_FOLD)) {
! 	if (*tokenbuf == '^') {
! 	    spat->spat_short = scanconst(tokenbuf+1);
! 	    if (spat->spat_short) {
! 		spat->spat_slen = strlen(spat->spat_short->str_ptr);
! 		if (spat->spat_slen == strlen(tokenbuf+1))
! 		    spat->spat_flags |= SPAT_ALL;
! 	    }
  	}
- 	else {
- 	    spat->spat_flags |= SPAT_SCANFIRST;
- 	    spat->spat_short = scanconst(tokenbuf);
- 	    if (spat->spat_short) {
- 		spat->spat_slen = strlen(spat->spat_short->str_ptr);
- 		if (spat->spat_slen == strlen(tokenbuf))
- 		    spat->spat_flags |= SPAT_ALL;
- 	    }
- 	}	
      }
!     spat->spat_regexp = regcomp(tokenbuf,spat->spat_flags & SPAT_FOLD,1);
!     hoistmust(spat);
    got_pat:
      yylval.arg = make_match(O_MATCH,stab2arg(A_STAB,defstab),spat);
      return s;
--- 828,873 ----
  	    goto got_pat;		/* skip compiling for now */
  	}
      }
!     if (spat->spat_flags & SPAT_FOLD)
! #ifdef STRUCTCOPY
! 	savespat = *spat;
! #else
! 	bcopy((char *)spat, (char *)&savespat, sizeof(SPAT));
! #endif
!     if (*tokenbuf == '^') {
! 	spat->spat_short = scanconst(tokenbuf+1);
! 	if (spat->spat_short) {
! 	    spat->spat_slen = strlen(spat->spat_short->str_ptr);
! 	    if (spat->spat_slen == strlen(tokenbuf+1))
! 		spat->spat_flags |= SPAT_ALL;
  	}
      }
!     else {
! 	spat->spat_flags |= SPAT_SCANFIRST;
! 	spat->spat_short = scanconst(tokenbuf);
! 	if (spat->spat_short) {
! 	    spat->spat_slen = strlen(spat->spat_short->str_ptr);
! 	    if (spat->spat_slen == strlen(tokenbuf))
! 		spat->spat_flags |= SPAT_ALL;
! 	}
!     }	
!     if ((spat->spat_flags & SPAT_ALL) && (spat->spat_flags & SPAT_SCANFIRST)) {
! 	fbmcompile(spat->spat_short, spat->spat_flags & SPAT_FOLD);
! 	spat->spat_regexp = regcomp(tokenbuf,spat->spat_flags & SPAT_FOLD,1);
! 		/* Note that this regexp can still be used if someone says
! 		 * something like /a/ && s//b/;  so we can't delete it.
! 		 */
!     }
!     else {
! 	if (spat->spat_flags & SPAT_FOLD)
! #ifdef STRUCTCOPY
! 	    *spat = savespat;
! #else
! 	    bcopy((char *)&savespat, (char *)spat, sizeof(SPAT));
! #endif
! 	spat->spat_regexp = regcomp(tokenbuf,spat->spat_flags & SPAT_FOLD,1);
! 	hoistmust(spat);
!     }
    got_pat:
      yylval.arg = make_match(O_MATCH,stab2arg(A_STAB,defstab),spat);
      return s;
***************
*** 878,884 ****
  	spat->spat_short = scanconst(tokenbuf);
  	if (spat->spat_short)
  	    spat->spat_slen = strlen(spat->spat_short->str_ptr);
!     }	
      d = savestr(tokenbuf);
  get_repl:
      s = scanstr(s);
--- 908,914 ----
  	spat->spat_short = scanconst(tokenbuf);
  	if (spat->spat_short)
  	    spat->spat_slen = strlen(spat->spat_short->str_ptr);
!     }
      d = savestr(tokenbuf);
  get_repl:
      s = scanstr(s);
***************
*** 902,920 ****
  	}
  	if (*s == 'i') {
  	    s++;
  	    spat->spat_flags |= SPAT_FOLD;
  	}
      }
      if (!spat->spat_runtime) {
  	spat->spat_regexp = regcomp(d, spat->spat_flags & SPAT_FOLD,1);
  	hoistmust(spat);
  	safefree(d);
      }
-     if (spat->spat_flags & SPAT_FOLD) {		/* Oops, disable optimization */
- 	str_free(spat->spat_short);
- 	spat->spat_short = Nullstr;
- 	spat->spat_slen = 0;
-     }
      yylval.arg = make_match(O_SUBST,stab2arg(A_STAB,defstab),spat);
      return s;
  }
--- 932,953 ----
  	}
  	if (*s == 'i') {
  	    s++;
+ 	    sawi = TRUE;
  	    spat->spat_flags |= SPAT_FOLD;
+ 	    if (!(spat->spat_flags & SPAT_SCANFIRST)) {
+ 		str_free(spat->spat_short);	/* anchored opt doesn't do */
+ 		spat->spat_short = Nullstr;	/* case insensitive match */
+ 		spat->spat_slen = 0;
+ 	    }
  	}
      }
+     if (spat->spat_short && (spat->spat_flags & SPAT_SCANFIRST))
+ 	fbmcompile(spat->spat_short, spat->spat_flags & SPAT_FOLD);
      if (!spat->spat_runtime) {
  	spat->spat_regexp = regcomp(d, spat->spat_flags & SPAT_FOLD,1);
  	hoistmust(spat);
  	safefree(d);
      }
      yylval.arg = make_match(O_SUBST,stab2arg(A_STAB,defstab),spat);
      return s;
  }
***************
*** 1161,1167 ****
  	    int sqstart = line;
  	    char *tmps;
  
! 	    tmpstr = str_new(strlen(s));
  	    s = str_append_till(tmpstr,s+1,term,leave);
  	    while (!*s) {	/* multiple line string? */
  		if (!rsfp || !(s = str_gets(linestr, rsfp))) {
--- 1194,1200 ----
  	    int sqstart = line;
  	    char *tmps;
  
! 	    tmpstr = str_new(0);
  	    s = str_append_till(tmpstr,s+1,term,leave);
  	    while (!*s) {	/* multiple line string? */
  		if (!rsfp || !(s = str_gets(linestr, rsfp))) {
***************
*** 1172,1177 ****
--- 1205,1214 ----
  		s = str_append_till(tmpstr,s,term,leave);
  	    }
  	    s++;
+ 	    if (tmpstr->str_cur + 5 < tmpstr->str_len) {
+ 		tmpstr->str_len = tmpstr->str_cur + 1;
+ 		tmpstr->str_ptr = saferealloc(tmpstr->str_ptr,tmpstr->str_len);
+ 	    }
  	    if (term == '\'') {
  		arg[1].arg_ptr.arg_str = tmpstr;
  		break;
***************
*** 1182,1188 ****
  		if (*s == '\\' && s[1] && isdigit(s[1]) && !isdigit(s[2]) &&
  		  !index("`\"",term) )
  		    *s = '$';		/* grandfather \digit in subst */
! 		if (*s == '$' && s[1] && s[1] != ')' && s[1] != '|') {
  		    makesingle = FALSE;	/* force interpretation */
  		}
  		else if (*s == '\\' && s[1]) {
--- 1219,1226 ----
  		if (*s == '\\' && s[1] && isdigit(s[1]) && !isdigit(s[2]) &&
  		  !index("`\"",term) )
  		    *s = '$';		/* grandfather \digit in subst */
! 		if (*s == '$' && s[1] &&
! 		  (index("`\"",term) || (s[1] != ')' && s[1] != '|'))) {
  		    makesingle = FALSE;	/* force interpretation */
  		}
  		else if (*s == '\\' && s[1]) {

Index: util.c
Prereq: 2.0.1.3
*** util.c.old	Wed Sep  7 17:19:12 1988
--- util.c	Wed Sep  7 17:19:14 1988
***************
*** 1,6 ****
! /* $Header: util.c,v 2.0.1.3 88/08/03 22:48:34 root Exp $
   *
   * $Log:	util.c,v $
   * Revision 2.0.1.3  88/08/03  22:48:34  root
   * patch11: fiddled with declarations to keep some compilers happy
   * 
--- 1,10 ----
! /* $Header: util.c,v 2.0.1.4 88/09/07 17:12:49 lwall Locked $
   *
   * $Log:	util.c,v $
+  * Revision 2.0.1.4  88/09/07  17:12:49  lwall
+  * patch14: case insensitive search speedup
+  * patch14: searches should now work on chars with the 128 bit set
+  * 
   * Revision 2.0.1.3  88/08/03  22:48:34  root
   * patch11: fiddled with declarations to keep some compilers happy
   * 
***************
*** 201,210 ****
      register int i;
      register int len = str->str_cur;
  
!     str_grow(str,len+128);
      s = str->str_ptr;
      table = s + len;
!     for (i = 1; i < 128; i++) {
  	table[i] = len;
      }
      i = 0;
--- 205,214 ----
      register int i;
      register int len = str->str_cur;
  
!     str_grow(str,len+256);
      s = str->str_ptr;
      table = s + len;
!     for (i = 1; i < 256; i++) {
  	table[i] = len;
      }
      i = 0;
***************
*** 219,224 ****
--- 223,263 ----
  }
  #endif /* NOTDEF */
  
+ unsigned char fold[] = {
+ 	0,	1,	2,	3,	4,	5,	6,	7,
+ 	8,	9,	10,	11,	12,	13,	14,	15,
+ 	16,	17,	18,	19,	20,	21,	22,	23,
+ 	24,	25,	26,	27,	28,	29,	30,	31,
+ 	32,	33,	34,	35,	36,	37,	38,	39,
+ 	40,	41,	42,	43,	44,	45,	46,	47,
+ 	48,	49,	50,	51,	52,	53,	54,	55,
+ 	56,	57,	58,	59,	60,	61,	62,	63,
+ 	64,	'a',	'b',	'c',	'd',	'e',	'f',	'g',
+ 	'h',	'i',	'j',	'k',	'l',	'm',	'n',	'o',
+ 	'p',	'q',	'r',	's',	't',	'u',	'v',	'w',
+ 	'x',	'y',	'z',	91,	92,	93,	94,	95,
+ 	96,	'A',	'B',	'C',	'D',	'E',	'F',	'G',
+ 	'H',	'I',	'J',	'K',	'L',	'M',	'N',	'O',
+ 	'P',	'Q',	'R',	'S',	'T',	'U',	'V',	'W',
+ 	'X',	'Y',	'Z',	123,	124,	125,	126,	127,
+ 	128,	129,	130,	131,	132,	133,	134,	135,
+ 	136,	137,	138,	139,	140,	141,	142,	143,
+ 	144,	145,	146,	147,	148,	149,	150,	151,
+ 	152,	153,	154,	155,	156,	157,	158,	159,
+ 	160,	161,	162,	163,	164,	165,	166,	167,
+ 	168,	169,	170,	171,	172,	173,	174,	175,
+ 	176,	177,	178,	179,	180,	181,	182,	183,
+ 	184,	185,	186,	187,	188,	189,	190,	191,
+ 	192,	193,	194,	195,	196,	197,	198,	199,
+ 	200,	201,	202,	203,	204,	205,	206,	207,
+ 	208,	209,	210,	211,	212,	213,	214,	215,
+ 	216,	217,	218,	219,	220,	221,	222,	223,	
+ 	224,	225,	226,	227,	228,	229,	230,	231,
+ 	232,	233,	234,	235,	236,	237,	238,	239,
+ 	240,	241,	242,	243,	244,	245,	246,	247,
+ 	248,	249,	250,	251,	252,	253,	254,	255
+ };
+ 
  static unsigned char freq[] = {
  	1,	2,	84,	151,	154,	155,	156,	157,
  	165,	246,	250,	3,	158,	7,	18,	29,
***************
*** 255,274 ****
  };
  
  void
! fbmcompile(str)
  STR *str;
  {
!     register char *s;
!     register char *table;
      register int i;
      register int len = str->str_cur;
      int rarest = 0;
      int frequency = 256;
  
!     str_grow(str,len+128);
      table = str->str_ptr + len;		/* actually points at final '\0' */
      s = table - 1;
!     for (i = 1; i < 128; i++) {
  	table[i] = len;
      }
      i = 0;
--- 294,314 ----
  };
  
  void
! fbmcompile(str, iflag)
  STR *str;
+ int iflag;
  {
!     register unsigned char *s;
!     register unsigned char *table;
      register int i;
      register int len = str->str_cur;
      int rarest = 0;
      int frequency = 256;
  
!     str_grow(str,len+256);
      table = str->str_ptr + len;		/* actually points at final '\0' */
      s = table - 1;
!     for (i = 1; i < 256; i++) {
  	table[i] = len;
      }
      i = 0;
***************
*** 275,293 ****
      while (s >= str->str_ptr) {
  	if (!isascii(*s))
  	    return;
! 	if (table[*s] == len)
! 	    table[*s] = i;
  	s--,i++;
      }
      str->str_pok |= 2;		/* deep magic */
  
      s = str->str_ptr;		/* deeper magic */
!     for (i = 0; i < len; i++) {
! 	if (freq[s[i]] < frequency) {
! 	    rarest = i;
! 	    frequency = freq[s[i]];
  	}
      }
      str->str_rare = s[rarest];
      str->str_prev = rarest;
  #ifdef DEBUGGING
--- 315,352 ----
      while (s >= str->str_ptr) {
  	if (!isascii(*s))
  	    return;
! 	if (table[*s] == len) {
! 	    if (iflag)
! 		table[*s] = table[fold[*s]] = i;
! 	    else
! 		table[*s] = i;
! 	}
  	s--,i++;
      }
      str->str_pok |= 2;		/* deep magic */
  
      s = str->str_ptr;		/* deeper magic */
!     if (iflag) {
! 	register int tmp, foldtmp;
! 	str->str_pok |= 8;
! 	for (i = 0; i < len; i++) {
! 	    tmp=freq[s[i]];
! 	    foldtmp=freq[fold[s[i]]];
! 	    if (tmp < frequency && foldtmp < frequency) {
! 		rarest = i;
! 		/* choose most frequent among the two */
! 		frequency = (tmp > foldtmp) ? tmp : foldtmp;
! 	    }
  	}
      }
+     else {
+ 	for (i = 0; i < len; i++) {
+ 	    if (freq[s[i]] < frequency) {
+ 		rarest = i;
+ 		frequency = freq[s[i]];
+ 	    }
+ 	}
+     }
      str->str_rare = s[rarest];
      str->str_prev = rarest;
  #ifdef DEBUGGING
***************
*** 330,345 ****
  register char *bigend;
  STR *littlestr;
  {
!     register char *s;
      register int tmp;
      register int littlelen;
!     register char *little;
!     register char *table;
!     register char *olds;
!     register char *oldlittle;
      register int min;
  
!     if (littlestr->str_pok != 3)
  	return instr(big,littlestr->str_ptr);
  
      littlelen = littlestr->str_cur;
--- 389,404 ----
  register char *bigend;
  STR *littlestr;
  {
!     register unsigned char *s;
      register int tmp;
      register int littlelen;
!     register unsigned char *little;
!     register unsigned char *table;
!     register unsigned char *olds;
!     register unsigned char *oldlittle;
      register int min;
  
!     if (littlestr->str_pok < 3)
  	return instr(big,littlestr->str_ptr);
  
      littlelen = littlestr->str_cur;
***************
*** 346,369 ****
      table = littlestr->str_ptr + littlelen;
      s = big + --littlelen;
      oldlittle = little = table - 1;
!     while (s < bigend) {
!       top:
! 	if (tmp = table[*s]) {
! 	    s += tmp;
  	}
! 	else {
! 	    tmp = littlelen;	/* less expensive than calling strncmp() */
! 	    olds = s;
! 	    while (tmp--) {
! 		if (*--s == *--little)
! 		    continue;
! 		s = olds + 1;	/* here we pay the price for failure */
! 		little = oldlittle;
! 		if (s < bigend)	/* fake up continue to outer loop */
! 		    goto top;
! 		return Nullch;
  	    }
! 	    return s;
  	}
      }
      return Nullch;
--- 405,452 ----
      table = littlestr->str_ptr + littlelen;
      s = big + --littlelen;
      oldlittle = little = table - 1;
!     if (littlestr->str_pok & 8) {	/* case insensitive? */
! 	while (s < bigend) {
! 	  top1:
! 	    if (tmp = table[*s]) {
! 		s += tmp;
! 	    }
! 	    else {
! 		tmp = littlelen;	/* less expensive than calling strncmp() */
! 		olds = s;
! 		while (tmp--) {
! 		    if (*--s == *--little || fold[*s] == *little)
! 			continue;
! 		    s = olds + 1;	/* here we pay the price for failure */
! 		    little = oldlittle;
! 		    if (s < bigend)	/* fake up continue to outer loop */
! 			goto top1;
! 		    return Nullch;
! 		}
! 		return s;
! 	    }
  	}
!     }
!     else {
! 	while (s < bigend) {
! 	  top2:
! 	    if (tmp = table[*s]) {
! 		s += tmp;
  	    }
! 	    else {
! 		tmp = littlelen;	/* less expensive than calling strncmp() */
! 		olds = s;
! 		while (tmp--) {
! 		    if (*--s == *--little)
! 			continue;
! 		    s = olds + 1;	/* here we pay the price for failure */
! 		    little = oldlittle;
! 		    if (s < bigend)	/* fake up continue to outer loop */
! 			goto top2;
! 		    return Nullch;
! 		}
! 		return s;
! 	    }
  	}
      }
      return Nullch;
***************
*** 374,385 ****
  STR *bigstr;
  STR *littlestr;
  {
!     register char *s, *x;
!     register char *big = bigstr->str_ptr;
      register int pos;
      register int previous;
      register int first;
!     register char *little;
  
      if ((pos = screamfirst[littlestr->str_rare]) < 0) 
  	return Nullch;
--- 457,468 ----
  STR *bigstr;
  STR *littlestr;
  {
!     register unsigned char *s, *x;
!     register unsigned char *big = bigstr->str_ptr;
      register int pos;
      register int previous;
      register int first;
!     register unsigned char *little;
  
      if ((pos = screamfirst[littlestr->str_rare]) < 0) 
  	return Nullch;
***************
*** 391,410 ****
  	if (!(pos += screamnext[pos]))
  	    return Nullch;
      }
!     do {
! 	if (big[pos] != first)
! 	    continue;
! 	for (x=big+pos+1,s=little; *s; /**/ ) {
! 	    if (!*x)
! 		return Nullch;
! 	    if (*s++ != *x++) {
! 		s--;
! 		break;
  	    }
! 	}
! 	if (!*s)
! 	    return big+pos;
!     } while (pos += screamnext[pos]);
      return Nullch;
  }
  
--- 474,511 ----
  	if (!(pos += screamnext[pos]))
  	    return Nullch;
      }
!     if (littlestr->str_pok & 8) {	/* case insignificant? */
! 	do {
! 	    if (big[pos] != first && big[pos] != fold[first])
! 		continue;
! 	    for (x=big+pos+1,s=little; *s; /**/ ) {
! 		if (!*x)
! 		    return Nullch;
! 		if (*s++ != *x++ && fold[*(s-1)] != *(x-1)) {
! 		    s--;
! 		    break;
! 		}
  	    }
! 	    if (!*s)
! 		return big+pos;
! 	} while (pos += screamnext[pos]);
!     }
!     else {
! 	do {
! 	    if (big[pos] != first)
! 		continue;
! 	    for (x=big+pos+1,s=little; *s; /**/ ) {
! 		if (!*x)
! 		    return Nullch;
! 		if (*s++ != *x++) {
! 		    s--;
! 		    break;
! 		}
! 	    }
! 	    if (!*s)
! 		return big+pos;
! 	} while (pos += screamnext[pos]);
!     }
      return Nullch;
  }
  



More information about the Comp.sources.bugs mailing list