Bug in JScript RegExp?

Hi,

I have an HTML string that I need to manipulate using RegExps. What I
want to do is replace the content of a particular tag, as illustrated in the
script example below:

  var s = '<P> xxxxxxxxx </P><P id=myId> yyyyyyyy </P> <P> zzzzzzzzzzzz
</P>;
  var re = /(<P id=myId>).*(<\/P>);

  s = s.replace(re, "$1 nnnnnnn $3");

I would expect only the yyyyyyyy to be replaced by nnnnnnn, but actually
everything between the start tag and the *last* end tag ( yyyyyyyy </P> <P>
zzzzzzzz ) is replaced, so the resulting s becomes '<P> xxxxxxxxx </P><P
id=myId> nnnnnnn </P>'.

Either this is a bug in the RegExp engine in JScript or I do not understand
regular expressions correctly.

Any comments?

Thanks
Erik



Fri, 02 May 2003 03:00:00 GMT  



Quote:
> Hi,
>   var s = '<P> xxxxxxxxx </P><P id=myId> yyyyyyyy </P> <P> zzzzzzzzzzzz
> </P>;
>   var re = /(<P id=myId>).*(<\/P>);

>   s = s.replace(re, "$1 nnnnnnn $3");
> Either this is a bug in the RegExp engine in JScript or I do not understand
> regular expressions correctly.

The .* is greedy and eats up all the </P>'s until there aren't any more.
Use [^<\/P>]* instead of .*

Jim.



Fri, 02 May 2003 03:00:00 GMT  
Hi Erik,

Quote:
>   var s = '<P> xxxxxxxxx </P><P id=myId> yyyyyyyy </P> <P> zzzzzzzzzzzz
> </P>;

     ^^^
no closing '

Quote:
>  var re = /(<P id=myId>).*(<\/P>);

                                  ^^^
you haven't closed the expression with another /

Quote:
>   s = s.replace(re, "$1 nnnnnnn $3");

                                  ^^
$3 should read $2

The search is greedy: .* will match everything until it hits a new line
or the end of the string, and then gives back only as much as it needs
to reach the last </P>.  In IE5.5 you can change that to .*? which
makes the match non-greedy, so it stops at the first </P> it finds.  I
think pre 5.5 you would have to do something like [^<\/P>]* instead of
.* but, from what I have been trying out, that treats /P<>, >P/< etc.
the same way, because it is a set of single characters rather than the
string </P>.  I'm not sure if there is a better solution pre 5.5.
If you have 5.5, cut and paste the code below (accounting for random
line breaks) as is and it should work, it did for me :-).

var s = '<P> xxxxxxxxx </P><P id=myId> yyyyyyyy </P> <P> zzzzzzzzzzzz </P>';
var re = /(<P id=myId>).*?(<\/P>)/;
s = s.replace(re, "$1 nnnnnnn $2");
alert(s);
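
To see the difference side by side, here is a small sketch that runs both patterns on the same input. The variable names greedyResult and lazyResult are made up for illustration, and console.log stands in for alert so it can also be run outside a browser; it assumes an engine that supports .*? (JScript 5.5 or later).

```javascript
// Same input as in the thread, written on one line.
var input = '<P> xxxxxxxxx </P><P id=myId> yyyyyyyy </P> <P> zzzzzzzzzzzz </P>';

// Greedy: .* runs to the end of the line, then backtracks to the LAST </P>.
var greedyResult = input.replace(/(<P id=myId>).*(<\/P>)/, "$1 nnnnnnn $2");

// Non-greedy: .*? expands one character at a time and stops at the FIRST </P>.
var lazyResult = input.replace(/(<P id=myId>).*?(<\/P>)/, "$1 nnnnnnn $2");

console.log(greedyResult); // <P> xxxxxxxxx </P><P id=myId> nnnnnnn </P>
console.log(lazyResult);   // <P> xxxxxxxxx </P><P id=myId> nnnnnnn </P> <P> zzzzzzzzzzzz </P>
```

The greedy version swallows the third paragraph; the non-greedy one leaves it alone.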

Mark

Sent via Deja.com http://www.deja.com/
Before you buy.



Fri, 02 May 2003 03:00:00 GMT  
Hi Jim,

You are absolutely correct. It works. In this test example I'm replacing the
content of a P tag. In my "real world" project I'm working on TBODYs. I
substituted the P with TBODY, but it doesn't seem to work then:

var reRekvVal = /(TBODY id=tbodyRekvValues)([^<\/TBODY>]*)(<\/TBODY>)/

Any ideas?

Thanks
Erik






Fri, 02 May 2003 03:00:00 GMT  


Quote:
> Hi Jim,

> You are absolutely correct. It works. In this test example I'm replacing
> the content of a P tag. In my "real world" project I'm working on TBODYs.
> I substituted the P with TBODY, but it doesn't seem to work then:

> var reRekvVal = /(TBODY id=tbodyRekvValues)([^<\/TBODY>]*)(<\/TBODY>)/

> Any ideas?

Hi Erik,

Probably because [^<\/TBODY>]* is a character class: it means "match any
single character that is not one of <, /, T, B, O, D, Y or >", not "match
anything other than the phrase </TBODY>". If any of those individual
characters occurs between the tags, the class stops matching there, before
it reaches the closing tag, and the whole expression fails. See my other post.
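
A quick sketch of that failure, for anyone who wants to try it. The table markup and the variable names below are invented for illustration, and the patterns assume the opening tag is written with its < and >; the non-greedy pattern needs JScript 5.5 or later.

```javascript
// Hypothetical table body; real content will almost always contain '<' (nested tags).
var table = '<TBODY id=tbodyRekvValues><TR><TD>old</TD></TR></TBODY>';

// Character class: excludes each of the characters < / T B O D Y > individually.
var classRe = /(<TBODY id=tbodyRekvValues>)([^<\/TBODY>]*)(<\/TBODY>)/;

// Non-greedy alternative: stops at the first literal </TBODY>.
var lazyRe = /(<TBODY id=tbodyRekvValues>).*?(<\/TBODY>)/;

// The class gives up at the '<' of <TR>, so there is no match at all.
var classMatches = classRe.test(table);

// Even plain text breaks it if it contains one of the excluded letters ('T' here).
var classMatchesText = classRe.test('<TBODY id=tbodyRekvValues>Total</TBODY>');

var replaced = table.replace(lazyRe, "$1<TR><TD>new</TD></TR>$2");

console.log(classMatches);     // false
console.log(classMatchesText); // false
console.log(replaced);         // <TBODY id=tbodyRekvValues><TR><TD>new</TD></TR></TBODY>
```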

Mark




Fri, 02 May 2003 03:00:00 GMT  
Mark,

Perfect. It works 100%.

Thanks
Erik






Fri, 02 May 2003 03:00:00 GMT  
Sorry, I think I fuc... up. Your suggestion works fine.
Erik






Fri, 02 May 2003 03:00:00 GMT  


Quote:



> > Hi,
> >   var s = '<P> xxxxxxxxx </P><P id=myId> yyyyyyyy </P> <P> zzzzzzzzzzzz
> > </P>;
> >   var re = /(<P id=myId>).*(<\/P>);

> >   s = s.replace(re, "$1 nnnnnnn $3");

> > Either this is a bug in the RegExp engine in JScript or I do not
> > understand regular expressions correctly.

> The .* is greedy and eats up all the </P>'s until there aren't any more.
> Use [^<\/P>]* instead of .*

> Jim.

Not quite. [^<\/P>] means any character except '<', '/', 'P' or '>', not the
entire string '</P>'. Here are a couple of examples that would break the
rule:

  <P id=myId> Peter Piper </P>
  <P id=myId> <i>Important</i> News </P>
  <P id=myId> 3/4 of an hour </P>

If you can guarantee JScript 5.5 or JavaScript 1.5 (NN6), you can use a
non-greedy match:

  var re = /(<P id=myId>).*?(<\/P>)/;
  s = s.replace(re, "$1 nnnnnnn $2");

Otherwise, you need to use indexOf/substr or multiple regexps to break up
the string, change it and reassemble it.
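
A rough sketch of the indexOf route, in case it helps. The function name replaceInner and its parameters are my own invention, not anything built in; it uses substring rather than substr and only handles the first occurrence of the opening tag.

```javascript
// Replace whatever sits between the first openTag and the next closeTag.
// Returns the input unchanged if either tag is missing.
function replaceInner(html, openTag, closeTag, replacement) {
  var start = html.indexOf(openTag);
  if (start === -1) return html;

  var contentStart = start + openTag.length;
  var end = html.indexOf(closeTag, contentStart); // first close tag after the open tag
  if (end === -1) return html;

  // Reassemble: everything up to the content, the new content, the rest.
  return html.substring(0, contentStart) + replacement + html.substring(end);
}

var original = '<P> xxxxxxxxx </P><P id=myId> yyyyyyyy </P> <P> zzzzzzzzzzzz </P>';
var patched = replaceInner(original, '<P id=myId>', '</P>', ' nnnnnnn ');

console.log(patched); // <P> xxxxxxxxx </P><P id=myId> nnnnnnn </P> <P> zzzzzzzzzzzz </P>
```

Because no regular expression is involved, this does not depend on non-greedy support, and newlines inside the paragraph make no difference.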

Another trap--'.' doesn't match newlines and the workaround in the docs
([.\n]) doesn't work. If there's a possibility that the paragraph you want
to replace could span lines, use:

  var re = /(<P id=myId>)(?:.|\n)*?(<\/P>)/;
  s = s.replace(re, "$1 nnnnnnn $2");
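
Here is a small test of the three variants on a paragraph that spans two lines. The sample string and variable names are made up; the [\s\S] form at the end is not from the docs or this thread, just a commonly used alternative that I am assuming behaves the same on your engine.

```javascript
// Hypothetical paragraph containing a newline.
var multi = '<P id=myId> line one\nline two </P> <P> other </P>';

// '.' stops at the newline, so the closing tag is never reached.
var dotOnly = /(<P id=myId>).*?(<\/P>)/;

// Inside [...] the dot is a literal '.', so [.\n] matches only dots and newlines.
var docsForm = /(<P id=myId>)[.\n]*?(<\/P>)/;

// Alternation outside a class: any character, or a newline.
var anyChar = /(<P id=myId>)(?:.|\n)*?(<\/P>)/;

// Assumed alternative: whitespace or non-whitespace, i.e. any character at all.
var classForm = /(<P id=myId>)[\s\S]*?(<\/P>)/;

var dotOnlyMatches = dotOnly.test(multi);
var docsFormMatches = docsForm.test(multi);
var viaAnyChar = multi.replace(anyChar, "$1 nnnnnnn $2");
var viaClassForm = multi.replace(classForm, "$1 nnnnnnn $2");

console.log(dotOnlyMatches);  // false
console.log(docsFormMatches); // false
console.log(viaAnyChar);      // <P id=myId> nnnnnnn </P> <P> other </P>
console.log(viaClassForm);    // <P id=myId> nnnnnnn </P> <P> other </P>
```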

=-=-=
Steve
-=-=-



Fri, 02 May 2003 03:00:00 GMT  
 