A problematic twist with =~ and regex 

I'm having a problem similar to Dan's s/// substitution of spaces within
quotes.  I managed to solve *MOST* of it by gleaning code from one of
the FAQs, but my s/// has a bit of a twist:

To restate Dan's problem, I would like to split a text string -- by
spaces -- into terms or "keywords".  However, each term _may_ or _may
not_ be preceded by a keyword modifier.  Also, a single multi-word term
may be contained within quotes.  The following is the code snippet that
I am currently using.


print STDERR "Raw Input:  $text\n\n";  # Debugging only


        ((?:.?)"[^\"\\]*(?:\\.[^\"\\]*)*)"\s?  # groups the phrase i$
        | ([^\s]+)\s?                    # Ignore extended whitespac$
        | \s

}gx;


print "Terms:\n";                      # More

        print "$term\n";               # Lines        

}                                      # ---

The regex is not yet complete.  It performs the term splitting (allowing
quoted text to remain as a single term).  It keeps the single-character
keyword modifier and removes the last quote.  HOWEVER, the first quote
is NEVER removed!  How can I modify this so that the keyword modifier
remains as the first character and both quotes are removed from the
term?


Mon, 22 Nov 1999 03:00:00 GMT  


Quote:

>         ((?:.?)"[^\"\\]*(?:\\.[^\"\\]*)*)"\s?  # groups the phrase i$
>         | ([^\s]+)\s?                    # Ignore extended whitespac$
>         | \s
> }gx;

Your comments seem truncated.

Quote:
> [the] first quote is NEVER removed!  How can I modify this such that the
> keyword modifier remains as the first character, and both quotes are
> removed from the term?

I think you want this for that second line:

         .?"([^\"\\]*(?:\\.[^\"\\]*)*)"\s?  # groups the phrase i$
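
For illustration, here is a minimal sketch of what that pattern captures
when it is dropped into the rest of the posted regex.  The sample string,
the while loop and the @terms array are my own assumptions, since the
surrounding lines of the original code did not survive in the post.

```perl
use strict;
use warnings;

my $text = 'perl +"regular expression" faq';   # made-up sample input
my @terms;

while ($text =~ m{
        .?"([^\"\\]*(?:\\.[^\"\\]*)*)"\s?   # phrase only, both quotes sit outside the group
        | ([^\s]+)\s?                        # a bare word
        | \s                                 # stray whitespace
    }gx) {
    # Only one of the two groups is set on any given match.
    push @terms, grep { defined $_ } ($1, $2);
}

print "$_\n" for @terms;
```

With this input the quoted term comes back as "regular expression" with
both quotes gone, but the leading + is matched by .? outside the group
and is therefore lost.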

If you really do want to capture the optional anychar before the
first quote, without catching the first quote, you probably should
use another statement to remove the quote.
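
A rough sketch of that two-statement approach follows.  It assumes the
modifier is a + or - (written [+-]? in place of .?), and the sample
string, loop and variable names are hypothetical, so adjust them to
whatever the real code uses.

```perl
use strict;
use warnings;

my $text = 'perl +"regular expression" faq';   # made-up sample input
my @terms;

while ($text =~ m{
        ([+-]?"[^"\\]*(?:\\.[^"\\]*)*)"\s?   # modifier, opening quote and phrase
        | (\S+)\s?                            # a bare word
        | \s                                  # stray whitespace
    }gx) {
    my ($term) = grep { defined $_ } ($1, $2);
    next unless defined $term;          # the whitespace-only branch captures nothing

    # Second statement: keep the optional modifier, drop the opening quote.
    $term =~ s/^([+-]?)"/$1/;
    push @terms, $term;
}

print "$_\n" for @terms;
```

The closing quote is already excluded by the match itself, so the extra
s/// only has to deal with the opening one.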

Elijah
------
not sure that ".?" is a good idea



Mon, 22 Nov 1999 03:00:00 GMT  
 