Finding all regexps matches at once? 


Quote:
>... Well, if there is more than one
> URL on a given line, an iterative "grep"-type attempt will keep
> yielding the first URL... since it will still be a valid one. My only
> solution, at this point, is to just alter the first URL on the line and
> *hope* there aren't any others after it. This isn't reliable, though.

> Is there any way to get Perl to copy all strings matching a regexp into
> an array or something... or to make it search only the substring *after*
> the last thing it found (a la strtok(3) in C)?...

Does this do what you want? It is like a grep that ANDs its
regular expressions, rather than ORing them the way some
greps do.

$key_spec holds the set of regexps that must all match, delimited
with & like this: /xxx/&/xxxx/&/xxx/.

In my case the beginning of each line contains a DOS filename.

open(KEY, "filename") || die "Can't open filename: $!";
while (<KEY>)
        {

        }
close(KEY);     # extracts fnames from hit lines to array fn
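
The body of the loop above did not survive in this post, so what follows is only a minimal sketch of the AND-style grep it describes, not the poster's code. It assumes Perl 5, assumes no pattern contains a literal "&", and assumes the DOS filename is the first whitespace-delimited field on the line. The names parse_key_spec and and_grep are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split a spec such as "/foo/&/bar/" into compiled regexps.
sub parse_key_spec {
    my ($key_spec) = @_;
    my @patterns;
    for my $part (split /&/, $key_spec) {
        # Strip the surrounding slashes; skip anything not shaped like /.../
        next unless $part =~ m{^/(.*)/$};
        push @patterns, qr/$1/;
    }
    return @patterns;
}

# Return the leading filename of every line that matches ALL patterns.
sub and_grep {
    my ($key_spec, @lines) = @_;
    my @patterns = parse_key_spec($key_spec);
    my @fn;
    LINE: for my $line (@lines) {
        for my $re (@patterns) {
            next LINE unless $line =~ $re;    # one miss rejects the line
        }
        # Assumption: the filename is the first whitespace-delimited field.
        push @fn, $1 if $line =~ /^(\S+)/;
    }
    return @fn;
}

my @lines = (
    "REPORT.TXT perl regexp notes",
    "README.DOC perl install notes",
    "TODO.TXT   regexp cleanup",
);
my @fn = and_grep('/perl/&/regexp/', @lines);
print "@fn\n";    # prints "REPORT.TXT"
```

Only the first sample line contains both "perl" and "regexp", so only its filename lands in the result.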



Tue, 09 Sep 1997 12:39:36 GMT  
Following is a story about my adventures in writing a password-protected
http proxy server. If you don't care about that, but still want to help
me with my problem, just jump to the end of the post. Thanks!

I've just written a rudimentary HTTP proxy gateway program in Perl 4. I
had to write one because I don't know of any out there that support
password protection on the *proxy* function. This one does.

It does this by getting the person's ID/password at the outset and,
thereafter, encoding that ID/password into the query string of every
URL that the proxy relays back to the client. Still with me? Good.

Now, the problem arises when I try to find all of the URLs in an HTML
document. Right now, I search for anything that starts with "HREF=",
"SRC=", or "ACTION=", and then I grab the URL, tack the ID/password
combo on the end, and put it back in. Well, if there is more than one
URL on a given line, an iterative "grep"-type attempt will keep
yielding the first URL... since it will still be a valid one. My only
solution, at this point, is to just alter the first URL on the line and
*hope* there aren't any others after it. This isn't reliable, though.

Is there any way to get Perl to copy all strings matching a regexp into
an array or something... or to make it search only the substring *after*
the last thing it found (a la strtok(3) in C)?

I guess I could always do something icky like progressively take the
substring of the input and search for a URL at the beginning of it.
Ewww.
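
For reference, here is a small sketch of both things asked for above. It assumes Perl 5 (the question is about Perl 4, where this may not all be available), and the pattern and the sample line are made up for illustration. In list context a match with the 'g' modifier returns every capture at once; in scalar context each match resumes where the previous one stopped, which is the strtok-like behavior.

```perl
use strict;
use warnings;

my $line = '<A HREF="a.html">A</A> <IMG SRC="b.gif"> <A HREF="c.html">C</A>';

# Hypothetical pattern: an attribute name, then a double-quoted URL.
my $url_re = qr/(?:HREF|SRC|ACTION)="([^"]*)"/i;

# List context with /g: returns every captured URL at once.
my @urls = $line =~ /$url_re/g;
print join(",", @urls), "\n";    # prints "a.html,b.gif,c.html"

# Scalar context with /g: each iteration resumes after the previous match,
# and pos() reports where the next search will start.
while ($line =~ /$url_re/g) {
    printf "%s ends at offset %d\n", $1, pos($line);
}
```

Unquoted attribute values and URLs split across lines are not handled by this pattern.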
--
Joe Emenaker-CENSORED Engineer | Our infernal mailer daemon has been quite

Linux:"This isn't your father's| 4 lines. However, as you can see, I have
Oldsmo-Unix!" - DON'T finger me| figured out an elegant way to put as many as



Tue, 09 Sep 1997 11:00:21 GMT  

Quote:
>Now, the problem arises when I try to find all of the URLs in an HTML
>document. Right now, I search for anything that starts with "HREF=",
>"SRC=", or "ACTION=", and then I grab the URL, tack the ID/password
>combo on the end, and put it back in. Well, if there is more than one
>URL on a given line, an iterative "grep"-type attempt will keep
>yielding the first URL... since it will still be a valid one. My only
>solution, at this point, is to just alter the first URL on the line and
>*hope* there aren't any others after it. This isn't reliable, though.
>Is there any way to get Perl to copy all strings matching a regexp into
>an array or something... or to make it search only the substring *after*
>the last thing it found (a la strtok(3) in C)?

Try this:
while (<>) {
  # $1 is the tag up to (not including) its closing '>'
  s/(<[^>]*HREF=[^>]*)>/$1$youridpassword>/g;
}

I don't know much about URLs, so this may need further refining (could
a URL span two lines?). The main trick is the 'g' modifier, which makes the
substitution global: every match on the line is replaced, not just the first.
If the replacement string is more complicated, the 'e' modifier, which evaluates
the replacement as a Perl expression, could be used. You can find all this on
the third page of the FUNCTIONS chapter in the Camel book.
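
As a rough illustration of the 'g' and 'e' modifiers working together, here is a sketch (assuming Perl 5 and double-quoted attribute values) that rewrites every HREF, SRC and ACTION on a line. The helper names tag_url and tag_all_urls and the credential string "id=joe" are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: append the credential string to one URL,
# using "&" if the URL already has a query string and "?" otherwise.
sub tag_url {
    my ($url, $cred) = @_;
    return $url . ($url =~ /\?/ ? '&' : '?') . $cred;
}

# Rewrite every double-quoted HREF/SRC/ACTION value in a chunk of HTML.
# 'g' replaces all matches, 'i' ignores case, 'e' evaluates the
# replacement as Perl code so tag_url() can be called for each match.
sub tag_all_urls {
    my ($html, $cred) = @_;
    $html =~ s/\b(HREF|SRC|ACTION)="([^"]*)"/$1 . '="' . tag_url($2, $cred) . '"'/gie;
    return $html;
}

print tag_all_urls('<A HREF="a.html">x</A> <A HREF="b.cgi?q=1">y</A>', 'id=joe'), "\n";
# prints: <A HREF="a.html?id=joe">x</A> <A HREF="b.cgi?q=1&id=joe">y</A>
```

Putting the logic in a helper keeps the replacement expression short; the credential goes inside the quotes rather than just before the closing '>'.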
Another way, if you dislike regexps, would be to use the index() function with
its optional third argument, the position at which to start searching, so that
each new search begins after the previous hit.
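
A minimal sketch of the index() approach, assuming Perl 5; find_all is a made-up name. Each call passes the position just past the previous hit, so the same occurrence is never found twice.

```perl
use strict;
use warnings;

# Return the offset of every occurrence of $needle in $text.
sub find_all {
    my ($text, $needle) = @_;
    return () if $needle eq '';    # an empty needle would never advance
    my @offsets;
    my $pos = 0;
    while (($pos = index($text, $needle, $pos)) >= 0) {
        push @offsets, $pos;
        $pos += length $needle;    # resume the search after this hit
    }
    return @offsets;
}

my @hits = find_all('<A HREF="a"> <A HREF="b">', 'HREF=');
print "@hits\n";    # prints "3 16"
```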

                                                        Holger.
--

Holger Hellmuth at Uni Karlsruhe



Sun, 14 Sep 1997 09:52:57 GMT  
In comp.lang.perl,

:In my case the beginning of each line contains a DOS filename.
:
:open(KEY, "filename");              
:while(<KEY>)
:       {

:       }

Whoa.  There's a lurking bug in your worldview of arrays:

    Writing @foo[2] gives you an array slice,
    which is a list of length one (not a scalar), and is a fairly common
    mistake.  It may appear to work, but it's
    not really doing what you think it's doing for the reason you think
    it's doing it, which means one of these days, you'll shoot yourself
    in the foot; ponder for a moment what an assignment to @foo[2]
    will really do.

    Just always say $foo[2] and you'll be happier.
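
    To make the difference concrete, here is a small sketch (assuming Perl 5) that shows which context the right-hand side sees. The sub name context is made up, and these are not the examples from the original post, which did not survive.

```perl
use strict;
use warnings;

# Hypothetical probe: reports the context it was called in.
sub context { return wantarray ? "list" : "scalar" }

my (@foo, @bar);

{
    # Perl itself warns "Scalar value @foo[2] better written as $foo[2]";
    # the warning is silenced here only to keep the demonstration quiet.
    no warnings 'syntax';
    @foo[2] = context();    # slice assignment: right side is in list context
}
$bar[2] = context();        # element assignment: right side is in scalar context

print "$foo[2] $bar[2]\n";  # prints "list scalar"
```

    Anything that behaves differently in list and scalar context, such as reading from a filehandle, will therefore do something different on the right of a one-element slice.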

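    This may seem confusing, but try to think of it this way:  you use the
    character of the type you want back.  You could use @foo[1..3] for a
    slice of three elements of @foo, or @foo{A,B,C} for a slice
    of %foo.  This is the same as using ($foo[1], $foo[2], $foo[3]) and
    ($foo{A}, $foo{B}, $foo{C}) respectively.  In fact, you can even use
    an array of subscripts to pull out a slice, as in @foo[@bar].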
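
    The equivalence can be checked with a short sketch like this one, assuming Perl 5. The sample data is invented, and the hash keys are quoted because barewords in a slice are rejected under "use strict".

```perl
use strict;
use warnings;

my @foo = (10, 20, 30, 40);
my %foo = (A => 'apple', B => 'banana', C => 'cherry');    # a separate variable from @foo

# Array slice versus the spelled-out list of elements.
my @from_slice = @foo[1..3];
my @from_list  = ($foo[1], $foo[2], $foo[3]);
print "@from_slice | @from_list\n";    # prints "20 30 40 | 20 30 40"

# Hash slice: note the @ sigil, because a list comes back.
my @hash_slice = @foo{'A', 'B', 'C'};
my @hash_list  = ($foo{A}, $foo{B}, $foo{C});
print "@hash_slice | @hash_list\n";    # prints "apple banana cherry | apple banana cherry"
```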


--tom
--

"Our liberty depends upon the freedom of the press, and that cannot be
 limited without being lost."
        --Thomas Jefferson (1786)



Sun, 14 Sep 1997 16:45:38 GMT  
 