Help a novice: Will "sed" do? *** SUMMARY OF RESPONSES ***

Dr. Rouben Rostamian rostamia at umbc3.UMBC.EDU
Wed Jul 19 05:36:43 AEST 1989


A couple of days ago I posted the following inquiry to this newsgroup:

>I need a command or a script that searches a text file for a given
>word or pattern and prints out all paragraphs that contain that word
>or pattern.  Paragraphs are blocks of text separated by one or
>more blank lines.
>

I have since received a large number of private replies, most of them
quite helpful, with many comments, suggestions, and solutions.
I would like to thank all the netters for their generous response and apologize
to those whose mailers I could not reach for a direct
acknowledgement.

I also received a few requests to post a summary for other interested parties, 
so here it goes:
----------------------------------------------------------------------
Among the various methods suggested, the following, 
from     jk at cbnewsh.ATT.COM (ihor.j.kinal) 
and from cey at tcgould.TN.CORNELL.EDU (John Lacey)
and from hansen at pegasus.att.com (Tony Hansen) 
and from rupley!local at megaron.arizona.edu (John Rupley)
is the easiest and most transparent:

awk < text_file 'BEGIN{RS=""} /RE/'

where RE stands for the word or regular expression to be searched for.

For instance,

awk < text_file 'BEGIN{RS=""} /This/ {print $0,"\n"} '

will print all paragraphs in the file text_file that contain the string
"This" and will output a blank line after each paragraph.
It is essential that the blank lines separating the paragraphs be
truly null; a line containing one or more blank characters (spaces or
tabs) is NOT a null line.
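
One way around the null-line restriction is to empty out any
whitespace-only lines before awk sees them.  What follows is only a
sketch, assuming a POSIX sed and awk; the file name sample.txt and its
contents are made up for illustration and were not part of any reply:

```shell
# Build a small sample file (hypothetical contents).  The separator between
# the first two paragraphs is a line holding two spaces, not a null line.
sample="${TMPDIR:-/tmp}/sample.txt"
printf 'This is one.\nStill one.\n  \nSecond here.\n\nThis is three.\n' > "$sample"

# Plain paragraph mode: the whitespace-only line does not end a record,
# so the first two paragraphs are returned as a single paragraph.
plain=$(awk 'BEGIN{RS=""} /Second/' < "$sample")

# Turn whitespace-only lines into null lines first, then search by paragraph.
fixed=$(sed 's/^[[:space:]]*$//' "$sample" | awk 'BEGIN{RS=""} /Second/')

printf '%s\n---\n%s\n' "$plain" "$fixed"
```

With the sed filter in front, only the paragraph "Second here." is
selected; without it, the first two paragraphs come back glued together.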
----------------------------------------------------------------------
Another solution, again using awk and attributed to an unknown source,
was sent to me by Gunter Steinbach <steinbac at hpl-opus.hp.com>:

#! /bin/sh
#
# pgrep - grep by paragraphs
#

pattern=$1
shift

awk 'BEGIN              {printit=0;}
                        {blankline=0;}
    /'"$pattern"'/      {printit=1;}
    # a separator is a null or all-blank line, or a troff .sp/.LP/.PP request
    /^[ \t]*$/ || /^\.sp / || /^\.[LP]P/   {
                        if (printit==1) {
                            print text;
                            print "";
                            }
                        text="";
                        printit=0;
                        blankline=1;
                        }
                       {if (blankline==0)
                            text=text $0 "\n";}
    END                 {if (printit==1) {
                            print text;
                            print "";
                            }
                        }' "$@"
 
For instance, if this script is saved as a file named pgrep, then 
pgrep STRING < text_file 
will print all paragraphs containing the string "STRING".
This method treats a null line, a line of blanks, or a troff .sp, .LP,
or .PP request as a paragraph separator. A somewhat similar solution was
also offered by campbell$bsw.com (Larry Campbell).
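
The script above splices the pattern directly into the awk program
text, so a pattern containing a slash or a quote would break it.  As a
rough sketch of the same line-by-line idea, the pattern can instead be
handed to awk through the environment.  The function name pgrep_sketch
and the variable PAT are hypothetical, this version does not look for
troff requests, and it assumes an awk with user-defined functions and
ENVIRON (any POSIX awk):

```shell
# pgrep_sketch PATTERN [FILE...]: print every paragraph containing PATTERN.
# Hypothetical helper, not taken from any of the replies.
pgrep_sketch() {
    pat=$1
    shift
    PAT=$pat awk '
        # A null or whitespace-only line ends the current paragraph.
        /^[ \t]*$/ { emit(); next }
        # Otherwise collect the line and remember whether it matched.
        { text = text $0 "\n"; if ($0 ~ ENVIRON["PAT"]) hit = 1 }
        # The last paragraph may not be followed by a separator.
        END { emit() }
        # Print the collected paragraph plus a blank line if it matched,
        # then reset the state for the next paragraph.
        function emit() {
            if (hit) printf "%s\n", text
            text = ""; hit = 0
        }' "$@"
}
```

For instance,
pgrep_sketch STRING < text_file
or
pgrep_sketch STRING text_file
should print each matching paragraph followed by a blank line.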

-- 
Rouben Rostamian
Department of Mathematics                      e-mail:
University of Maryland Baltimore County        Rostamian at umbc2.bitnet
Baltimore, MD 21228                            rostamia at umbc3.umbc.edu


