BIG TROUBLE! Please help! 

I have big problems. I am writing a script that does certain kinds of searches
on textfiles. It therefore reads these textfiles per-file into a string that
is put into an associative array, with the filename as key. These files are
between 1k and 3k and together they take up around 46k.

Things seemed to work until I tried to handle multiple matches on the string. As
soon as I introduced $&, $' and $` I got perl "Out of memory" complaints. This
cannot be true! I am running on a SUN or DEC with LOTS of memory and no
restrictions on memory usage. When I tried to figure out what was going on, I
discovered that $& and such were not working....

Try the following:

============================ test2.pl ================================
$_ = "aaabbbccc";
/bbb/;
print "$`:$&:$'\n";
$_ = "aaabbbccc";
/b*/;
print "$`:$&:$'\n";
============================ output ==================================
aaa:bbb:ccc
::aaabbbccc
======================================================================

I have even seen lots of garbage on my screen when trying to print
length( $`) and the like.

Here is another file:

============================= test.pl ================================
#! /usr/local/bin/perl

$file = 'test.pl';
$verbatim = 1;

if (open( TXTFILE, $file))
{
    undef( $/);
    $database{$file} = <TXTFILE>;
    $/ = "\n";
    close( TXTFILE);
    warn( "MESSAGE: $file initialized.\n") if ($verbatim);
}
else
{
    warn( "ERROR: Unable to read $file.\n");
}

$regexp = '(^|\b|\s)(test)(\s|\b|$)';
$*=1;

$_ = $database{$file};
($pre, $match, $post) = /$regexp/;
print( "First match \"$match\" or \"$&\"\n");
print( length( $`), ' ', length( $&), ' ', length( $'), "\n");

======================================================================

You know what? Trying to run this with "perl test.pl" gives the following
output:

MESSAGE: test.pl initialized.
First match "test" or ""
437560 437296 438320

It looks to me that combining real regular expressions with $`, $& and $'
means trouble.

Am I making some big mistake here or is there some big bug here? (Which would
mean disaster, since I'll have to abandon perl in that case).

I am running perl 4.010 on SUNOS 4.1.1 (SPARC), Ultrix 4.0 (MIPS), SUNOS 3.5
(MC68020). All except 68020 are compiled with gcc. The SUNOS 3.5 version gives
smaller numbers for length( $`) etc.
--
Gerben Wierda
Phone: (+31) 2154 84415                 Home: (+31) 85 516677
        "If you don't know where you are going, any road will take you there."
        Lewis Carroll, "Alice in Wonderland".



Tue, 29 Mar 1994 20:09:29 GMT  
 BIG TROUBLE! Please help!

| Try the following:
|
| ============================ test2.pl ================================
| $_ = "aaabbbccc";
| /bbb/;
| print "$`:$&:$'\n";
| $_ = "aaabbbccc";
| /b*/;
| print "$`:$&:$'\n";
| ============================ output ==================================
| aaa:bbb:ccc
| ::aaabbbccc
| ======================================================================
+--------
This looks correct to me.  In your second example, /b*/ means "match
ZERO or more occurrences of a 'b'".  The match did succeed; it matched
the zero part.  Then the part before the match is also null and the
entire string is after the match.

Try the pattern /b+/ to get what you want.
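
For anyone who wants to see the difference side by side, here is a small sketch. It assumes a Perl 5 interpreter, since it uses the @- and @+ offset arrays, which Perl 4 does not have. The helper name split_on_match is made up for illustration; it only rebuilds what $`, $& and $' would hold.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the
# (prematch, match, postmatch) triple for a pattern, or an empty
# list when the pattern does not match at all.
# Assumes Perl 5: @- and @+ hold the start/end offsets of the match.
sub split_on_match {
    my ($text, $pattern) = @_;
    return () unless $text =~ $pattern;
    return (
        substr($text, 0, $-[0]),               # same text as $`
        substr($text, $-[0], $+[0] - $-[0]),   # same text as $&
        substr($text, $+[0]),                  # same text as $'
    );
}

print join(':', split_on_match('aaabbbccc', qr/bbb/)), "\n";  # aaa:bbb:ccc
print join(':', split_on_match('aaabbbccc', qr/b*/)),  "\n";  # ::aaabbbccc
print join(':', split_on_match('aaabbbccc', qr/b+/)),  "\n";  # aaa:bbb:ccc
```

The /b*/ line shows the zero-length match at offset 0: nothing before it, nothing matched, and the whole string after it.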

        Jeff
--
Jefferson K. French                     Alliance Technologies, Inc.
(512) 794-0439                          Shepard Mountain Plaza
                                        6034 West Courtyard Dr., Suite 250
...!uunet!alliance!jefff                Austin, TX  78730



Wed, 30 Mar 1994 20:58:12 GMT  
 BIG TROUBLE! Please help!

:This looks correct to me.  In your second example, /b*/ means "match
:ZERO or more occurrences of a 'b'".  The match did succeed; it matched
:the zero part.  Then the part before the match is also null and the
:entire string is after the match.
:
:Try the pattern /b+/ to get what you want.

Another reason to use X+ over X* is performance.  In general, the star
operator causes much more backtracking than the plus operator.  Complex
regexps are pretty bad as it is; when you fill them with stars, you
quickly suffer exponentially slow performance.  There's nothing like
a bunch of .* 's in an expression to slow it way down, especially
with nice (...)* or (..|..)*'s.

--tom



Fri, 01 Apr 1994 02:40:36 GMT  
 BIG TROUBLE! Please help!
: I have big problems. I am writing a script that does certain kinds of searches
: on textfiles. It therefore reads these textfiles per-file into a string that
: is put into an associative array, with the filename as key. These files are
: between 1k and 3k and together they take up around 46k.
:
: Things seemed to work until I tried to handle multiple matches on the string. As
: soon as I introduced $&, $' and $` I got perl "Out of memory" complaints. This
: cannot be true! I am running on a SUN or DEC with LOTS of memory and no
: restrictions on memory usage.

This is probably the uninitialized memory bug that has been pointed out.
Try changing New() to Newz() in regcomp.c.

: When I tried to figure out what was going on, I
: discovered that $& and such were not working....
:
: Try the following:
:
: ============================ test2.pl ================================
: $_ = "aaabbbccc";
: /bbb/;
: print "$`:$&:$'\n";
: $_ = "aaabbbccc";
: /b*/;
: print "$`:$&:$'\n";
: ============================ output ==================================
: aaa:bbb:ccc
: ::aaabbbccc
: ======================================================================

This is correct behavior.  /b*/ matches the first place it can.  Since it
can match the null string, it matches before the first 'a'.  Use /b+/ instead.

: I have even seen lots of garbage on my screen when trying to print
: length( $`) and the like.

That's the Newz() thingie, I believe.

: Here is another file:
:
: ============================= test.pl ================================
: #! /usr/local/bin/perl
:
: $file = 'test.pl';
: $verbatim = 1;
:
: if (open( TXTFILE, $file))
: {
:     undef( $/);
:     $database{$file} = <TXTFILE>;
:     $/ = "\n";
:     close( TXTFILE);
:     warn( "MESSAGE: $file initialized.\n") if ($verbatim);
: }
: else
: {
:     warn( "ERROR: Unable to read $file.\n");
: }
:
: $regexp = '(^|\b|\s)(test)(\s|\b|$)';
: $*=1;
:
: $_ = $database{$file};
: ($pre, $match, $post) = /$regexp/;
: print( "First match \"$match\" or \"$&\"\n");
: print( length( $`), ' ', length( $&), ' ', length( $'), "\n");
:
: ======================================================================
:
: You know what? Trying to run this with "perl test.pl" gives the following
: output:
:
: MESSAGE: test.pl initialized.
: First match "test" or ""
: 437560 437296 438320
:
: It looks to me that combining real regular expressions with $`, $& and $'
: means trouble.
:
: Am I making some big mistake here or is there some big bug here? (Which would
: mean disaster, since I'll have to abandon perl in that case).

To quote the manual:
             If used in a context that requires an array value, a
             pattern  match  returns  an  array consisting of the
             subexpressions matched by  the  parentheses  in  the
             pattern, i.e. ($1, $2, $3...).  It does NOT actually
             set $1, $2, etc. in this case, nor does it  set  $+,
             $`,  $&  or $'.

To do what you want, you'd have to say

    if (/$regexp/) {
        ($pre, $match, $post) = ($1,$2,$3);
        print( "First match \"$match\" or \"$&\"\n");
        print( length( $`), ' ', length( $&), ' ', length( $'), "\n");
    }
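
Since the original goal was handling multiple matches on one string, the same "match in scalar context, then read the match variables" idea can be put in a loop. What follows is only a sketch under some assumptions: a Perl 5 interpreter (m//g in scalar context, plus the @- offset array), a simplified \b...\b pattern in place of the posted $regexp, and a made-up helper name find_all.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): collect every whole-word
# occurrence of $word in $text as [offset, matched text] pairs.
# Each pass of the loop is a scalar-context match, so $1 and the
# offset array @- are set for that particular match.
sub find_all {
    my ($text, $word) = @_;
    my @hits;
    while ($text =~ /\b(\Q$word\E)\b/g) {
        push @hits, [ $-[1], $1 ];   # start offset of group 1, and its text
    }
    return @hits;
}

my $text = "a test line\nanother test\nno match in testing\n";
for my $hit (find_all($text, 'test')) {
    printf "offset %d: %s\n", @$hit;
}
```

Recording offsets instead of copying $` and $' on every match keeps the loop from building a fresh copy of the surrounding text each time, which matters when the whole file sits in one string.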

Larry



Sat, 02 Apr 1994 07:05:47 GMT  
 