Put multiple matched lines into an array? 

Hi all, I'm back :-)

I'm trying to build on one of the PHP manual examples, like so:

<?php
$file = fopen ("http://www.*-*-*.com/", "r");
if (!$file) {
    echo "<p>Unable to open remote file.\n";
    exit;
}

while (!feof ($file)) {
    $line = fgets ($file, 1024);
    if (eregi ("<A HREF=/DirectoryToMatch/.*>(.*)</A>", $line, $out)) {
        $myvar1 = $out[1];
        $myvar2 = $out[2];
        $myvar3 = $out[3];
        $myvar4 = $out[4];
        $myvar5 = $out[5];
        $myvar6 = $out[6];
    }
}

fclose($file);

echo "Line 1: $myvar1<br />
Line 2: $myvar2<br />
Line 3: $myvar3<br />
Line 4: $myvar4<br />
Line 5: $myvar5<br />
Line 6: $myvar6<br />";
?>

However, all that gets echoed is the last successful match, and it is
located in $myvar1/$out[1]

My source file looks a little like this:

<TABLE>
<TR>
<TD><A HREF=/DirectoryToMatch/page1.html>Link 1</A></TD>
<TD><A HREF=/DirectoryToMatch/page2.html>Link 2</A></TD>
<TD><A HREF=/DirectoryToMatch/page3.html>Link 3</A></TD>
<TD><A HREF=/DirectoryToMatch/page4.html>Link 4</A></TD>
<TD><A HREF=/DirectoryToMatch/page5.html>Link 5</A></TD>
<TD><A HREF=/DirectoryToMatch/page6.html>Link 6</A></TD>
</TR>
</TABLE>

Although in the real file there are a lot more table cells, and rows, etc.,
I basically need to pull out the "Link 1" part from each line, and
everything between <A and A> is consistent for each, which should make the
regexp easier.

How do I match every matching line, and put each of the matching strings in
the same array so I can echo them later on?

Thanks for your counsel and advice...
--
dan rubin

webgraph: branding | usability | design
<http://www.*-*-*.com/>



Tue, 26 Apr 2005 15:00:36 GMT  

Quote:
>     if (eregi ("<A HREF=/DirectoryToMatch/.*>(.*)</A>", $line, $out)) {
> How do I match every matching line, and put each of the matching strings in
> the same array so I can echo them later on?

Try preg_match_all() instead

preg_match_all("/<A HREF=/DirectoryToMatch/.*>(.*)</A>/", $line, $out))

print_r($out); // have a look at this to see what you capture

All of the matches are stored in a multi-dimensional array, where:

$out[0]     matches for the _whole_ regexp
$out[1]     matches for the first group

so to print out all of the links captured by your (.*)

foreach ($out[1] as $link) {
    echo $link;
}
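
As a rough illustration of that array layout, here is a minimal sketch run against a single sample line. The sample string and the use of "|" as the pattern delimiter are assumptions made for the example, not something taken from the post above.

```php
<?php
// Sample line (assumed), modeled on the table cells in the question.
$line = '<TD><A HREF=/DirectoryToMatch/page1.html>Link 1</A></TD>';

// "|" is the delimiter here, so the "/" characters in the path need no escaping.
$count = preg_match_all('|<A HREF=/DirectoryToMatch/.*?>(.*?)</A>|i', $line, $out);

echo $count, "\n";                        // number of full matches found
echo htmlspecialchars($out[0][0]), "\n";  // $out[0]: the whole matched tag
echo $out[1][0], "\n";                    // $out[1]: text captured by the first group
```

Note that $out[0] contains the complete anchor tag. Printed into a web page without htmlspecialchars(), the browser renders it as a link, so it can look identical to the captured text in $out[1].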

Also, I suggest replacing (.*) with (.*?)
It is non-greedy (i.e. it will match the least amount possible), which is
good if you have more than one link on a single line. For your example,
newlines are stopping the (.*) from matching everything from the first <A
...> to the _very_ last </A>.

regards,
reggie.
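
To make the greedy versus non-greedy difference concrete, here is a small sketch. It assumes a line that holds two links, which is a variation on the sample table rather than data from the thread, and it uses "|" as the delimiter.

```php
<?php
// Two links on one line (assumed input for illustration).
$line = '<A HREF=/DirectoryToMatch/page1.html>Link 1</A> '
      . '<A HREF=/DirectoryToMatch/page2.html>Link 2</A>';

// Greedy: .* grabs as much as it can, so one match spans both tags.
preg_match_all('|<A HREF=/DirectoryToMatch/.*>(.*)</A>|', $line, $greedy);

// Non-greedy: .*? stops at the first place the rest of the pattern fits.
preg_match_all('|<A HREF=/DirectoryToMatch/.*?>(.*?)</A>|', $line, $lazy);

print_r($greedy[1]); // a single capture: "Link 2"
print_r($lazy[1]);   // two captures: "Link 1" and "Link 2"
```

With the greedy pattern the first link's text is swallowed by the leading .* and only the last link's text is captured.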



Tue, 26 Apr 2005 16:24:22 GMT  
On 11/8/02 3:24 AM, in article

<snip>

Quote:
> Also, I suggest replacing (.*) with (.*?)
> It is less greedy (i.e. it will match the least amount possible), which is
> good if you have more than one link on a single line. For your example,
> newlines are stopping the (.*) from matching everything from the first <A
> ...> to the _very_ last </A>

> regards,
> reggie.

And you said you're still learning ;-)

I owe you thanks again. Now I finally understand the "greedy" element of
regexp, and you're right, non-greedy is the better way to go.

I'm also starting to understand arrays a bit more, thanks for the detailed
run-down of the internal workings in your answer!

Cheers,
--
dan rubin

webgraph: branding | usability | design
<http://www.webgraph.com/>



Tue, 26 Apr 2005 16:30:08 GMT  

Quote:


> >     if (eregi ("<A HREF=/DirectoryToMatch/.*>(.*)</A>", $line, $out)) {

> > How do I match every matching line, and put each of the matching strings in
> > the same array so I can echo them later on?

> Try preg_match_all() instead

> preg_match_all("/<A HREF=/DirectoryToMatch/.*>(.*)</A>/", $line, $out))

argh - my regexp is a horrible mess :(

I'll try again:

preg_match_all("|<A HREF=['\"]?/DirectoryToMatch/.*?['\"]?>(.*?)</A>|i",
                $line, $out);

That's a little better.

FWIW - making a match non-greedy when appropriate makes the regexp more
efficient.

regards,
reggie.
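
A quick way to see what the revised pattern accepts is to run it over a few HREF styles. The sample markup below is invented for the test; only the pattern itself comes from the post above.

```php
<?php
// Assumed sample markup: unquoted, double-quoted and single-quoted HREFs,
// plus one link outside the target directory.
$html = "<A HREF=/DirectoryToMatch/a.html>Plain</A>\n"
      . "<a href=\"/DirectoryToMatch/b.html\">Double</a>\n"
      . "<A HREF='/DirectoryToMatch/c.html'>Single</A>\n"
      . "<A HREF=/Elsewhere/d.html>Other</A>\n";

// ['\"]? allows an optional quote on either side of the URL; the trailing
// "i" modifier makes the match case-insensitive.
$pattern = "|<A HREF=['\"]?/DirectoryToMatch/.*?['\"]?>(.*?)</A>|i";

preg_match_all($pattern, $html, $out);
print_r($out[1]); // link texts of the three matching anchors
```

The link under /Elsewhere/ is skipped because the literal directory name is part of the pattern.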



Tue, 26 Apr 2005 17:00:30 GMT  
On 11/8/02 4:00 AM, in article

Quote:

> argh - my regexp is a horrible mess :(

> I'll try again:

> preg_match_all("|<A HREF=['\"]?/DirectoryToMatch/.*?['\"]?>(.*?)</A>|i",
>               $line, $out);

> That's a little better.

> FWIW - making a match non-greedy when appropriate makes the regexp more
> efficient.

And that's certainly a good thing to do :-)

OK, I've tried both your "horrible mess" regexp and this new one, and
print_r is the only line that seems to be outputting anything (warning: long
paste):

Array
(
    [0] => Array
        (
        )

    [1] => Array
        (
        )

)

Array
(
    [0] => Array
        (
        )

    [1] => Array
        (
        )

)

Array
(
    [0] => Array
        (
        )

    [1] => Array
        (
        )

)

Array
(
    [0] => Array
        (
        )

    [1] => Array
        (
        )

)

Array
(
    [0] => Array
        (
        )

    [1] => Array
        (
        )

)

Array
(
    [0] => Array
        (
            [0] => Link 1
        )

    [1] => Array
        (
            [0] => Link 1
        )

)

Array
(
    [0] => Array
        (
            [0] => Link 2
        )

    [1] => Array
        (
            [0] => Link 2
        )

)

Array
(
    [0] => Array
        (
            [0] => Link 3
        )

    [1] => Array
        (
            [0] => Link 3
        )

)

Array
(
    [0] => Array
        (
            [0] => Link 4
        )

    [1] => Array
        (
            [0] => Link 4
        )

)

Array
(
    [0] => Array
        (
            [0] => Link 5
        )

    [1] => Array
        (
            [0] => Link 5
        )

)

Array
(
    [0] => Array
        (
            [0] => Link 6
        )

    [1] => Array
        (
            [0] => Link 6
        )

)

Array
(
    [0] => Array
        (
        )

    [1] => Array
        (
        )

)

Array
(
    [0] => Array
        (
        )

    [1] => Array
        (
        )

)

Array
(
    [0] => Array
        (
        )

    [1] => Array
        (
        )

)

Array
(
    [0] => Array
        (
        )

    [1] => Array
        (
        )

)

I know that print_r is supposed to output information in a human-readable
form, but since I'm not entirely sure what I'm reading, it doesn't do me any
good just yet :-)

And I'm a bit confused as to why this is apparently matching the items I
want it to, but they are not getting output by the following line:

foreach ($out[1] as $link) {
    echo $link;
}

FWIW, here's all the code that's contributing to this:

<?php
$file = fopen ("http://my.url.com/", "r");
if (!$file) {
    echo "<p>Unable to open remote file.</p>";
    exit;
}

while (!feof ($file)) {
    $line = fgets ($file, 4096);
    preg_match_all("|<A HREF=['\"]?/DirectoryToMatch/.*?['\"]?>(.*?)</A>|i",
                   $line, $out);

    echo "<pre>";
    print_r($out); // have a look at this to see what you capture
    echo "</pre>";

}

fclose($file);

foreach ($out[1] as $link) {
    echo $link;
}

?>

Thanks!
--
dan rubin

webgraph: branding | usability | design
<http://www.webgraph.com/>



Wed, 27 Apr 2005 00:35:56 GMT  

Quote:
> OK, I've tried both your "horrible mess" regexp and this new one, and
> print_r is the only line that seems to be outputting anything (warning: long
> paste):

<snip output>

Quote:
> while (!feof ($file)) {
>     $line = fgets ($file, 4096);
>     preg_match_all("|<A HREF=['\"]?/DirectoryToMatch/.*?['\"]?>(.*?)</A>|i",
> $line, $out);

>     echo "<pre>";
>     print_r($out); // have a look at this to see what you capture
>     echo "</pre>";
> }

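The above block is not as it should be: preg_match_all() overwrites $out on
every pass through the loop, so once the loop ends $out only holds the
matches from the last line read (none, in your case). Read the whole page
into one string first, then match once: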

$line = '';
while (! feof($file)) {
    $line .= fgets($file, 4096);
}

preg_match_all("|<A HREF=['\"]?/DirectoryToMatch/.*?['\"]?>(.*?)</A>|i",
                $line, $out);
echo  "<pre>";
print_r($out); // have a look at this to see what you capture
echo "</pre>";

This should give you better results.

regards,
reggie
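
For anyone who wants to try this without a remote URL, here is a self-contained sketch of the same idea. The in-memory stream and its sample HTML are stand-ins assumed for the example, and the second variant (appending per-line matches with array_merge) is an alternative not mentioned in the thread.

```php
<?php
// Stand-in for the remote page: an in-memory stream with sample HTML (assumed data).
$file = fopen('php://memory', 'r+');
fwrite($file, "<TABLE>\n<TR>\n"
    . "<TD><A HREF=/DirectoryToMatch/page1.html>Link 1</A></TD>\n"
    . "<TD><A HREF=/DirectoryToMatch/page2.html>Link 2</A></TD>\n"
    . "</TR>\n</TABLE>\n");
rewind($file);

// Variant 1: collect the whole document, then match once.
$html = '';
while (($chunk = fgets($file, 4096)) !== false) {
    $html .= $chunk;
}
fclose($file);

$pattern = "|<A HREF=['\"]?/DirectoryToMatch/.*?['\"]?>(.*?)</A>|i";
preg_match_all($pattern, $html, $out);
$links = $out[1];

// Variant 2: match line by line, but append to a separate array so that
// earlier results are not overwritten by later lines.
$collected = [];
foreach (explode("\n", $html) as $line) {
    if (preg_match_all($pattern, $line, $m)) {
        $collected = array_merge($collected, $m[1]);
    }
}

// Both arrays now hold every link text and can be echoed later.
foreach ($links as $link) {
    echo htmlspecialchars($link), "<br />\n";
}
```

Either variant leaves all captured strings in one flat array, which is what the original question asked for.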



Wed, 27 Apr 2005 02:21:06 GMT  
 