Parse::RecDescent stops parsing. 

Can anyone help me with a problem I am having with the
Parse::RecDescent module? The problem is that with the code below, the parser
just stops parsing when (in my opinion) it should report an error. Is there a
way around this?

The code in question is:

#!/user/mvdlaan/cadbin/perl -w

use Parse::RecDescent;
use Data::Dumper;
use strict;

my $Grammar = q{
  file : <skip: '[ \t]*'> block(s) {print $text; $item[2];}
  block : while_block { $item[1]; }
    | text_block  { $item[1]; }

while_block : '#' 'while' <commit> condition /\\n/
    block(s)
    '#' 'endwhile' <commit> a_newline
    { ["While found: $item[4]", $item[6]]; }
  | <error?>

condition : { extract_bracketed($text, '(' ); }

text_block : text_line(s)

text_line : /\\n/ {"";}
  | /^[^#].*/ /\\n/ { $item[1]; }

a_newline : /\\n/

};

my $parse = new Parse::RecDescent($Grammar);
undef $/;
my $text = <DATA>;
my $root = $parse->file($text);
print Dumper($root);

__END__
#while (1)
  Hello
#endwhile
#while ERROR!!!
  World
#endwhile
#while (1)
  Goodbye
#endwhile

Any help is very much appreciated!

Cheers,

Marcel



Sun, 02 Dec 2001 03:00:00 GMT  

Marcel> my $Grammar = q{
Marcel>   file : <skip: '[ \t]*'> block(s) {print $text; $item[2];}

If your top-level item doesn't require ending at end-of-string, the
parser stops when it has seen a valid substring!

To fix this, add /\Z/ at the end of your top item there.

I got bit by the same thing.  And yes, Damian says it's documented. :)
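A minimal sketch of that fix, using a made-up grammar rather than Marcel's: the rule names `word` and `eofile` and the sample inputs are assumptions for illustration only. It assumes Parse::RecDescent is installed and relies on its default whitespace skipping.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Illustrative grammar (not the one from the original post).
# "file" succeeds only if eofile matches, i.e. nothing is left unparsed.
my $grammar = q{
    file   : word(s) eofile { $item[1] }
    word   : /[a-z]+/
    eofile : /\Z/
};

my $parser = Parse::RecDescent->new($grammar) or die "Bad grammar\n";

# Every token is a word, so the whole input is consumed.
my $good = $parser->file("abc def");

# "123" is not a word, so text remains and the top rule fails.
my $bad = $parser->file("abc 123");

print "good: ", (defined $good ? "@$good" : "undef"), "\n";
print "bad:  ", (defined $bad  ? "@$bad"  : "undef"), "\n";
```

Without `eofile` in the `file` rule, the second call would return the words matched so far instead of failing.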

--
Name: Randal L. Schwartz / Stonehenge Consulting Services (503)777-0095
Keywords: Perl training, UNIX[tm] consulting, video production, skiing, flying

Web: http://www.stonehenge.com/merlyn/ (My Home Page!)
Quote: "I'm telling you, if I could have five lines in my .sig, I would!" -- me



Sun, 02 Dec 2001 03:00:00 GMT  

Quote:
>Marcel> my $Grammar = q{
>Marcel>   file : <skip: '[ \t]*'> block(s) {print $text; $item[2];}
>If your top-level item doesn't require ending at end-of-string, the
>parser stops when it has seen a valid substring!
>To fix this, add /\Z/ at the end of your top item there.
>I got bit by the same thing.  And yes, Damian says it's documented. :)

Randal's spot on here (and much more succinct in his explanation than
I would have been :-)

Your "file:" rule tells the parser to match one or more blocks and then
print whatever text remains. It does so successfully, so the entire
parse succeeds and the error messages generated along the way
are discarded.

And yes, it is documented, though perhaps not explicitly enough:

         Error messages generated by the various `<error...>'
         directives are not displayed immediately. Instead, they are
         "queued" in a buffer and are only displayed once parsing
         ultimately fails. Moreover, `<error...>' directives that cause
         one production of a rule to fail are automatically removed
         from the message queue if another production subsequently
         causes the entire rule to succeed. This means that you can put
         `<error...>' directives wherever useful diagnosis can be done,
         and only those associated with actual parser failure will ever
         be displayed.

Since your parser doesn't "ultimately fail", the error messages aren't
ever displayed. Adding the /\Z/ requires the block(s) to run to
the end of the input (it's like a $ inside a regex). Now, if a malformed
block is found, the block(s) repetition ends and the parser checks
whether any text remains. It does, so the "file:" rule fails, which
means the entire parse fails, which means the error messages are
generated.
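The regex analogy can be tried directly in plain Perl, without the module. This is only a rough sketch: the `$blocks` pattern is a hypothetical stand-in for "block(s)" (here, lines of lowercase letters), not anything Parse::RecDescent generates.

```perl
use strict;
use warnings;

# Hypothetical stand-in for "block(s)": one or more lines of lowercase letters.
my $blocks = qr/(?:[a-z]+\n)+/;

my $input = "hello\nERROR!!!\nworld\n";

# No end anchor: the match succeeds after consuming only "hello\n".
my $prefix_ok = $input =~ /\A$blocks/ ? 1 : 0;

# With \Z: the match fails, because unparsed text remains after "hello\n".
my $whole_ok = $input =~ /\A$blocks\Z/ ? 1 : 0;

print "prefix match: $prefix_ok, whole-input match: $whole_ok\n";
```

The first match corresponds to the original "file:" rule quietly succeeding on a valid prefix; the second corresponds to the rule with /\Z/ appended.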

In self-defence, I'm now going to add a paragraph to the documentation
describing this particular problem.

Damian



Mon, 03 Dec 2001 03:00:00 GMT  
Thanks guys,

I thought it might have been something like this, and I tried it. My error
now gets caught. Unfortunately, now something seems to have occurred in the
return value when the file should succeed: dumping the $root now gives me
'undef'.

My altered grammar is:

my $Grammar = q{

file  : <skip:'[ \t]*'> block(s) eofile {$item[2]}

block  : while_block { $item[1]; }
  | text_block  { $item[1]; }

while_block : '#' 'while' <commit> condition /\\n/
    block(s)
    '#' 'endwhile' <commit> a_newline
    { ["While found: $item[4]", $item[6]]; }
  | <error?>

condition : { extract_bracketed($text, '(' ); }

text_block : text_line(s)

text_line : /\\n/ {"";}
  | /^[^#].*/ /\\n/ { $item[1]; }

a_newline : /\\n/

eofile  : /\cZ/

};

Am I returning values incorrectly? Am I missing something?

Another question, just out of curiosity: When is the new-and-improved v2.00
(or any next version) planned?

Thanks,

Marcel van der Laan




Mon, 03 Dec 2001 03:00:00 GMT  

Quote:
>Unfortunately, now something seems to have occurred in the
>return value when the file should succeed: dumping the $root now
>gives me 'undef'.

"The fault, dear Marcel, is in your grammar, not in the parse!" ;-)

Quote:
>eofile  : /\cZ/

That should be:

        eofile  : /\Z/

You want to match "end-of-string", not "end-of-file".
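The difference between the two patterns shows up in plain Perl as well. A small sketch, assuming `$text` holds what is left once every block has been consumed (the variable names are made up for the example):

```perl
use strict;
use warnings;

# What remains of the input after all blocks have been parsed.
my $text = "";

# /\cZ/ looks for a literal Control-Z character (ASCII 26).
my $ctrl_z_matches = $text =~ /\A\cZ/ ? 1 : 0;

# /\Z/ is a zero-width assertion: end of string, or just before a final newline.
my $end_matches = $text =~ /\A\Z/ ? 1 : 0;

# Control-Z only matches when that character is actually present in the data.
my $has_ctrl_z = "abc\x1A" =~ /\cZ/ ? 1 : 0;

print "cZ on empty: $ctrl_z_matches, Z on empty: $end_matches, cZ on data: $has_ctrl_z\n";
```

Since the input never contains a Control-Z, an `eofile` rule written as /\cZ/ can never match, so the `file` rule fails and the parser returns undef even for valid input.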

Quote:
>Another question, just out of curiosity: When is the new-and-improved v2.00
>(or any next version) planned?

New subversions appear whenever I accumulate enough small changes. On
average, since the module was first released there has been a new
subversion every month.

The much anticipated version 2.00 is stuck in Development Hell at present,
whilst I frantically finish my OO Perl book, prepare my Perl Conference
parsing tutorial, hold down a full-time job as a senior academic, and
answer questions about when version 2.00 is coming out :-)

My best guess is that the much-improved RecDescent won't see the light
of day until October/November.

Damian



Mon, 03 Dec 2001 03:00:00 GMT  
Hi Damian,

Quote:


>>Unfortunately, now something seems to have occurred in the
>>return value when the file should succeed: dumping the $root now
>>gives me 'undef'.

>"The fault, dear Marcel, is in your grammar, not in the parse!" ;-)

I didn't think it was in the parser for a minute! 10 seconds, max.

Quote:

>>eofile  : /\cZ/

>That should be:

> eofile  : /\Z/

I put it down to overeagerness to achieve results. I was thinking 'file
based parsing' and forgot that it was all about 'text based parsing'. My
fault.

Quote:

>You want to match "end-of-string", not "end-of-file".

>>Another question, just out of curiosity: When is the new-and-improved v2.00
>>(or any next version) planned?

>New subversions appear whenever I accumulate enough small changes. On
>average, since the module was first released there has been a new
>subversion every month.

>The much anticipated version 2.00 is stuck in Development Hell at present,
>whilst I frantically finish my OO Perl book, prepare my Perl Conference
>parsing tutorial, hold down a full-time job as a senior academic, and
>answer questions about when version 2.00 is coming out :-)

I won't keep you further...

Quote:
>My best guess is that the much-improved RecDescent won't see the light
>of day until October/November.

>Damian

Thanks for the pointers, it all works like a charm. Now, back to my
parser...

Cheers,

Marcel



Tue, 04 Dec 2001 03:00:00 GMT  
 