Parsing Lines for Words? 
 Parsing Lines for Words?


is messy?
Steve


Mon, 10 Nov 1997 03:00:00 GMT  
 Parsing Lines for Words?

Quote:

>I'm looking for an easier way to do this, since I am not even close to a
>PERL master yet:

I'm not either and I've been using Perl since version 1 in 1987 or 1988 ---
I don't remember exactly!

Quote:

>Let's say I have a sentence, stored in $line.
>Now, I want to pick out, say the 7th word in $line and put it into $word.
>Or maybe the 11th.  Or 3rd.  Or the last, no matter what number that is.

>Is there a really simple way to do this?  And let's say I want the
>delimiter to be something other than spaces - how do I do it then?

There are always lotsa ways to do just about anything in Perl.  The most
obvious, straightforward way here is to use "split".

You can make the regexp for splitting be anything you like.  For sentences
you might choose something like "/[^-'a-z0-9]+/i" which accepts words
including contractions and hyphenated words and things like your "3rd".  I
spent only 10 seconds thinking about that pattern, so you might find a more
suitable one.  The one I gave will break on things like "1,000,000".
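To make that concrete, here is a minimal sketch of the pattern at work.  The
sample sentence is made up for illustration; only the regexp comes from the
text above.

```perl
use strict;
use warnings;

# Hypothetical sample sentence, not from the original post.
my $line = "Don't split the well-known 3rd word, or 1,000,000 either";

# Split on runs of anything that is NOT a letter, digit, hyphen or apostrophe.
my @words = split /[^-'a-z0-9]+/i, $line;

# Contractions, hyphenated words and "3rd" survive intact,
# but "1,000,000" is broken into "1", "000", "000".
print join("|", @words), "\n";
```

If numbers with commas matter to you, the character class would need more
work; this only shows where the simple version gives out.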

Quote:
>I've accomplished this with a combination of split(), index(), and
>substr(), but it is messy and doesn't work too well.  I haven't found a way
>to just grab the last word, either.

For simplicity, assume you're splitting on whitespace:


    @line = split(" ", $line);    # words of $line, indexed from 0
    $fourthword = $line[3];
    $lastword = $line[$#line];

You don't need the intermediate array, except that the LAST element is
trickier to get at if you don't have one:

    $fourthword = (split(" ", $line))[3];
    $lastword = (reverse split(" ", $line))[0];

You can also do things like:

    ($thirdword, $fourthword, $fifthword) = (split(" ", $line))[2..4];
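List slices in Perl 5 also take negative subscripts, which count back from the
end, so the last word can be picked off without reverse.  A small sketch, with
a made-up sample line:

```perl
use strict;
use warnings;

my $line = "the quick brown fox jumps over the lazy dog";   # hypothetical sample

# -1 is the last element of the list that split returns.
my $lastword = (split(" ", $line))[-1];

# Positive and negative subscripts can be mixed in one slice.
my ($firstword, $nexttolast) = (split(" ", $line))[0, -2];

print "$firstword $nexttolast $lastword\n";   # the lazy dog
```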

Quote:
>Thanks!

You're welcome!

        pete peterson

        (508)287-7478; Home: (508)256-5829 (Chelmsford, MA)



Mon, 10 Nov 1997 03:00:00 GMT  
 Parsing Lines for Words?
I'm looking for an easier way to do this, since I am not even close to a
PERL master yet:

Let's say I have a sentence, stored in $line.
Now, I want to pick out, say the 7th word in $line and put it into $word.
Or maybe the 11th.  Or 3rd.  Or the last, no matter what number that is.

Is there a really simple way to do this?  And let's say I want the
delimiter to be something other than spaces - how do I do it then?

I've accomplished this with a combination of split(), index(), and
substr(), but it is messy and doesn't work too well.  I haven't found a way
to just grab the last word, either.

Thanks!

--
=============================================================================

   The Official Internet Raytracing Competition!  More info and pics here:
                     ftp://ftp.povray.org/pub/competition
+-------------------------------------+-------------------------------------+
|  WWW-MK3-PERL-POVRAY-RGVA-CGR-ISCA  |  E-mail me if you have spare money! |
=============================================================================



Mon, 10 Nov 1997 03:00:00 GMT  
 Parsing Lines for Words?

M> I'm looking for an easier way to do this, since I am not even close
M> to a PERL master yet:
M> Let's say I have a sentence, stored in $line.
M> Now, I want to pick out, say the 7th word in $line and put it into $word.
M> Or maybe the 11th.  Or 3rd.  Or the last, no matter what number
M> that is.

If the line is in $line, and the word number is in $index:

        $word = (split(/\s+/, $line))[$index-1];

The '-1' is because the list built using split will be indexed from 0.
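One thing to watch with the /\s+/ pattern: if the line starts with whitespace,
split returns an empty first field and every word shifts up by one.  The
special pattern " " skips leading whitespace instead.  A quick sketch with a
made-up line:

```perl
use strict;
use warnings;

my $line  = "   three leading spaces here";   # hypothetical sample
my $index = 1;                                # 1-based word number

# /\s+/ treats the leading blanks as a separator, so field 0 is empty.
my $with_pattern = (split(/\s+/, $line))[$index - 1];

# " " is special-cased by split and ignores leading whitespace.
my $with_space = (split(" ", $line))[$index - 1];

print "pattern: [$with_pattern]  space: [$with_space]\n";
```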

If you want to split it up using something other than whitespace, just
alter the match pattern in the split function.
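As a rough sketch, the two ideas can be wrapped in one routine that takes the
delimiter as an argument.  The name nth_word and the sample record are made up
for illustration; they are not from the original question.

```perl
use strict;
use warnings;

# nth_word is a hypothetical helper.  $index counts from 1;
# $delim is a compiled pattern and defaults to runs of whitespace.
sub nth_word {
    my ($line, $index, $delim) = @_;
    $delim = qr/\s+/ unless defined $delim;
    my @fields = split $delim, $line;
    return $fields[$index - 1];
}

my $record = "root:x:0:0:root:/root:/bin/sh";   # made-up colon-separated record

print nth_word($record, 7, qr/:/), "\n";        # /bin/sh
print nth_word("a line of words", 2), "\n";     # line
```

Passing the pattern as qr// keeps the delimiter choice with the caller, so the
same routine handles colons, commas or anything else a regexp can describe.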

Mark Conmy
Support Team,           School of Computer Studies,
University of Leeds,    England.



Mon, 10 Nov 1997 03:00:00 GMT  
 Parsing Lines for Words?

M> I've accomplished this with a combination of split(), index(), and
M> substr(), but it is messy and doesn't work too well.  I haven't found
M> a way to just grab the last word, either.

Sorry - forgot to answer this:

        $lastword = (reverse(split(/\s+/, $line)))[0];
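If only the last word is wanted, a single match anchored at the end of the
line is another option that avoids building the whole list.  A minimal sketch,
assuming "word" means a run of non-whitespace; the sample line is made up:

```perl
use strict;
use warnings;

my $line = "grab just the final word  ";   # hypothetical sample, trailing blanks on purpose

# Capture the last run of non-whitespace, allowing trailing whitespace before the end.
my ($lastword) = $line =~ /(\S+)\s*$/;

print "$lastword\n";   # word
```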

Mark Conmy
Support Team,        School of Computer Studies,
University of Leeds,  England.



Mon, 10 Nov 1997 03:00:00 GMT  
 Parsing Lines for Words?
Thanks to everyone who has responded to this.  Since I got so many
replies (at least 8 so far), I thought I'd post and say thanks rather
than trying to e-mail each person :)

I don't know why I didn't come up with the way to do this.  I guess it
was a mental block, because I've done things a lot more complicated than
this before.

But it's just great that so many people are willing to help out.  Thanks!




Tue, 11 Nov 1997 03:00:00 GMT  
 