What's the meaning of hash contents resulting from a match 

Perlsters,

If I execute:



found words in its elements, and I can get a match count from it
saying:


If I execute:

my %result = ($html =~ /\b(one|two)\b/sig);

%result also contains the found words in its key/value pairs, but this
time the number of elements is much smaller and there seems to be no
relation with the NUMBER of found words.

Can somebody please explain to me the purpose of using this second
notation?

Thanks,
Daniel



Wed, 01 Sep 2004 21:04:07 GMT  

Quote:
> my %result = ($html =~ /\b(one|two)\b/sig);

> %result also contains the found words in its key/value pairs,
> but this time the number of elements is much smaller and there
> seems to be no relation with the NUMBER of found words.

It doesn't look like a key => value pattern, so I'm not sure
why you would use it to populate a hash...

But something like this might make sense:

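#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

# Slurp the whole DATA section into $_ in one read.
undef $/;
$_ = <DATA>;

# In list context m//g returns every capture: key, value, key, value, ...
my %props = /(\w+)=(\w+)/g;

print Dumper \%props;

__END__
one=1
two=2
three=3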

This works because each successful match under m//g captures a key/value pair.

Can you give an example of the 'no relation to the NUMBER'?

--
Steve



Thu, 02 Sep 2004 02:11:03 GMT  
I believe it is because the first example is called in list context and the second isn't.

rod.
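
For reference, here is a small sketch of how context affects m//g. As far as I can tell, assigning to an array and assigning to a hash both put the match in list context; only the way the returned list is stored differs. The sample text and the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $html = "One two three, one TWO";   # made-up sample text

# List context: m//g returns every captured word.
my @words = ($html =~ /\b(one|two)\b/ig);

# Hash assignment is list context too: the same list, read as key/value pairs.
my %pairs = ($html =~ /\b(one|two)\b/ig);

# Scalar context without /g: just true or false.
my $found = ($html =~ /\b(one|two)\b/i);

# Counting idiom: force list context, then count the elements.
my $count = () = $html =~ /\b(one|two)\b/ig;

print "words: @words\n";
print "pairs: ", scalar(keys %pairs), "\n";
print "count: $count\n";
```

With this sample the array holds four words, while the hash holds only two keys, because every second word is stored as a value.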




Fri, 03 Sep 2004 23:19:16 GMT  
Guys (Dolls?),

I know I should state my problem in more detail, but I just want to
know if there is any special Perl functionality behind assigning a
hash (instead of an array) to the results of a global match.

I did some fiddling in the meantime and I discovered that the hash
gets filled with the unique values of the alternation part of the
(case-insensitive global) regexp, e.g.:

One, one, onE, ONE, two, TWO, etc.

Maybe these are internally used by Perl for case-insensitive
searches...

Thanks for your replies,
Daniel




Sat, 04 Sep 2004 02:29:18 GMT  

Quote:

> Guys (Dolls?),

> I know I should state my problem in more detail, but I just want to
> know if there is any special Perl functionality behind assigning a
> hash (instead of an array) to the results of a global match.

> I did some fiddling in the meantime and I discovered that the hash
> gets filled with the unique values of the alternation part of the
> (case-insensitive global) regexp, e.g.:

> One, one, onE, ONE, two, TWO, etc.

In your code fragment

  my %result = ($html =~ /\b(one|two)\b/sig);

the matches from the html are returned in a list, and that list is used
as alternating keys and values to initialise the hash.

Quote:
> Maybe these are internally used by Perl for case-insensitive
> searches...

No, they are the matches captured by the parentheses in the regex.

In the debugger:

  DB<1> $html = "One four all, and all four one"


  DB<3> X list
@list = (
   0  'One'
   1  'four'
   2  'and'
   3  'four'
   4  'one'
)

  DB<5> X hash
%hash = (
   'One' => 'four'
   'and' => 'four'
   'one' => undef
)

So the list has been used to initialise the hash.
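
The same session can be sketched as a standalone script. This is only an approximation: the regex is assumed to be the one used in the counting loop (one|four|and), and the variable names simply mirror the debugger output.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;   # stable key order in the dump

my $html = "One four all, and all four one";

# List context: every captured word, in order of appearance.
my @list = ($html =~ /\b(one|four|and)\b/ig);

my %hash;
{
    # Five elements is an odd count, so Perl would warn here;
    # the last key ('one') ends up with an undef value.
    no warnings 'misc';
    %hash = @list;
}

print Dumper(\@list, \%hash);
```

Elements 0, 2 and 4 of the list become keys and elements 1 and 3 become their values, which is why the hash size says nothing about the number of matches.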

If you want to count the occurrences then maybe you should consider the use
of a loop:

  DB<6> %hash = ()

  DB<7> $hash{lc($1)}++ while $html =~ /\b(one|four|and)\b/sig

  DB<8> X hash
%hash = (
   'and' => 1
   'four' => 2
   'one' => 2
)
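
As a script, the counting loop might look like the sketch below. The names %count and $total are my own, and I left out the /s modifier because it only changes what "." matches and this pattern contains no ".".

```perl
use strict;
use warnings;

my $html = "One four all, and all four one";

# In scalar context m//g finds one match per iteration and remembers its position.
my %count;
$count{ lc $1 }++ while $html =~ /\b(one|four|and)\b/ig;

# Total number of matches across all words.
my $total = 0;
$total += $_ for values %count;

printf "%-5s %d\n", $_, $count{$_} for sort keys %count;
print "total $total\n";
```

Lower-casing the key folds "One" and "one" into a single counter.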

Hope this helps,

Mike



Sat, 04 Sep 2004 03:05:20 GMT  
Thanks for the explanations. I posted this question because I am
writing a script to extract search results from the mentioned HTML
files. I am looking for an alternative way to do:


where $pat contains the mentioned search word alternation:

one|two

It works nicely, but the performance penalty is huge, especially on large
files. When I remove one of the character ranges:

.{0,50}?

performance is what it should be (even on large files), but then I
lose the context within the text. Should I process the files line by
line instead of slurping them into $html?

Daniel
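
One possible alternative, sketched under the assumption that the goal is a fixed number of characters of context on each side of every hit: match only the words, then cut the context out with substr using the match offsets in @- and @+. The names $context and @snippets are made up, and I have not benchmarked this against the original pattern.

```perl
use strict;
use warnings;

my $html    = "aaa one bbb two ccc";   # made-up sample text
my $pat     = 'one|two';
my $context = 4;                       # 50 in the real script; small here for readability

my @snippets;
while ($html =~ /\b($pat)\b/ig) {
    my $start = $-[0];                 # offset where the match begins
    my $end   = $+[0];                 # offset just past the match
    my $from  = $start > $context ? $start - $context : 0;

    # substr stops at the end of the string if the length overshoots.
    push @snippets, substr($html, $from, ($end - $from) + $context);
}

print "$_\n" for @snippets;
```

Unlike a single pattern that consumes the surrounding text, the snippets here can overlap when two hits are close together.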


Sat, 04 Sep 2004 17:04:19 GMT  

Quote:
> Thanks for the explanations. I posted this question because I am
> writing a script to extract search results from the mentioned HTML
> files. I am looking for an alternative way to do:
[...]
> performance is what it should be (even on large files), but then I
> lose the context within the text. Should I process the files line by
> line instead of slurping them into $html?

Did you consider using HTML::Parser instead of rolling your own home-brewed
version?
See also "perldoc -q HTML": "How do I remove HTML from a string?"
This FAQ contains quite a bit of useful information on how to parse HTML and
why it may not be a good idea to do it yourself.

jue



Sun, 05 Sep 2004 00:59:17 GMT  
 