Author 
Message 
Bill Hail #1 / 9

DCG Limitations?
Hi. I'm very new to prolog (1 week), but I'm an OK engineer and I learn fast :). I started writing around the PRESS code from the Sterling and Shapiro book, using SWI prolog and XPCE to make a gui for an equation solver. (to help my daughter with her maths homework :) I thought to parse "proper" algebraic expressions (so "2xy" means "2 * x * y" etc.) into equivalent prolog structures, I wrote a tokenizing DFA to do the preliminary parsing but very quickly hit a brick wall wrt DCGs. My problem is this: I know that: expr(X  Y) > expr(X), [''], primary(Y). won't work, because of the direct recursion on expr(X). I know that I can rewrite this as: expr(X  Y) > primary(X), [''], expr(Y). and that will parse *but* the associativity of the operators is wrong. given [x, , y, , z] this builds a right associative (x  (y  z)), instead of the correct left associative ((x  y)  z). Thoughts anyone?  Bill Hails

Sat, 11 Jun 2005 00:38:51 GMT 


Benjamin Johnsto #2 / 9

DCG Limitations?
Quote: > I thought to parse "proper" algebraic expressions > (so "2xy" means "2 * x * y" etc.) into equivalent > prolog structures, I wrote a tokenizing DFA to do the > preliminary parsing but very quickly hit a brick wall > wrt DCGs.
In similar situations, I usually use op/3 to declare any required operators. Then I just use the builtin parser in SWI to parse the term, using read_term/2 or atom_to_term/3. Quote: > expr(X  Y) > primary(X), [''], expr(Y). > and that will parse *but* the associativity of the > operators is wrong. given [x, , y, , z] this builds > a right associative (x  (y  z)), instead of the correct > left associative ((x  y)  z). > Thoughts anyone?
A simple and quick solution (though a bit inefficient) would be to keep track of the "depth" of the recursion. If each production contains at least one terminal, then the maximum depth of the recursion is the length of the parse string. Simply 'fail' if the recursion becomes too deep. Benjamin Johnston

Sat, 11 Jun 2005 07:31:22 GMT 


Bill Hail #3 / 9

DCG Limitations?
Quote:
>> (so "2xy" means "2 * x * y" etc.) > In similar situations, I usually use op/3 to declare any required > operators. Then I just use the builtin parser in SWI to parse the term, > using read_term/2 or atom_to_term/3.
That's exactly what I was doing at first, just using the builtin math operators, but my typos kept crashing the program :) I was typing "2x" instead of 2 * x which set me off thinking about a parser. Especially nice for things like 2ysin3 => 2 * y * sin(3). Quote: >> expr(X  Y) > primary(X), [''], expr(Y). >> and that will parse *but* the associativity of the >> operators is wrong. given [x, , y, , z] this builds >> a right associative (x  (y  z)), instead of the correct >> left associative ((x  y)  z). >> Thoughts anyone? > A simple and quick solution (though a bit inefficient) would be to keep > track of the "depth" of the recursion. If each production contains at > least one terminal, then the maximum depth of the recursion is the length > of the parse string. Simply 'fail' if the recursion becomes too deep. > Benjamin Johnston
Thanks, I'll give that a try. I have one case where a production doesn't contain a terminal but I can set some reasonably high arbitrary limit and fail on that. The reason I'm still thinking this is a general problem with DCGs is that I've found two implementations of mini languages in Prolog (the one in Sterling and Shapiro and another from an archive) and both dodge the issue by either accepting the mathematecally incorrect right associativity or by making the mathematical operators nonassociative. Is anyone aware of any other implementations of algebraic parsers in Prolog?  Bill Hails

Sat, 11 Jun 2005 17:56:29 GMT 


vanno.. #4 / 9

DCG Limitations?
Quote:
> The reason I'm still thinking this is a general problem with DCGs is that > I've found two implementations of mini languages in Prolog > (the one in Sterling and Shapiro and another from an archive) and both > dodge the issue by either accepting the mathematecally incorrect right > associativity or by making the mathematical operators nonassociative. > Is anyone aware of any other implementations of algebraic parsers in > Prolog?
One other common trick is to interpret the DCG rules by a bottomup parser. For instance a leftcorner parser. This is explained e.g. in Pereira and Shieber, Prolog and Natural Language Analysis. Gj  Gertjan van Noord Alfainformatica, RUG, Postbus 716, 9700 AS Groningen vannoord at let dot rug dot nl http://www.let.rug.nl/~vannoord

Sat, 11 Jun 2005 19:47:21 GMT 


Bill Hail #5 / 9

DCG Limitations?
Quote:
>> The reason I'm still thinking this is a general problem with DCGs is that >> I've found two implementations of mini languages in Prolog >> (the one in Sterling and Shapiro and another from an archive) and both >> dodge the issue by either accepting the mathematecally incorrect right >> associativity or by making the mathematical operators nonassociative. >> Is anyone aware of any other implementations of algebraic parsers in >> Prolog? > One other common trick is to interpret the DCG rules by a bottomup > parser. For instance a leftcorner parser. This is explained e.g. in > Pereira and Shieber, Prolog and Natural Language Analysis.
I'll see if I can find that book. In the meantime I've abandoned DCGs altogether, It's actually proved quite pleasantly easy to write a parser from scratch with just basic prolog, it's not quite working yet, but essentially it's just rules like: rational(Tokens, Expr, Remaining) : int(Tokens, N, ['/'Rest]), int(Rest, D, Remaining), Expr = N / D. int([NumRremaining], Num, Remaining) : integer(Num). the point being that if I guarantee to consume at least one token for each production I have to either fail or get to the end of the string. I'm not a computer scientist, but I think that is pretty much what any bottomup parser will do. Quote: > Gj
 Bill Hails

Sun, 12 Jun 2005 02:54:53 GMT 


vanno.. #6 / 9

DCG Limitations?
Quote:
> In the meantime I've abandoned DCGs altogether, It's actually proved > quite pleasantly easy to write a parser from scratch with just basic > prolog, it's not quite working yet, but essentially it's just rules > like: > rational(Tokens, Expr, Remaining) : > int(Tokens, N, ['/'Rest]), > int(Rest, D, Remaining), > Expr = N / D. > int([NumRremaining], Num, Remaining) : > integer(Num). > the point being that if I guarantee to consume at least one token for > each production I have to either fail or get to the end of the string. > I'm not a computer scientist, but I think that is pretty much what any > bottomup parser will do.
This is a much stronger restriction than neccessary. But this is beyond comp.lang.prolog, presumably. Gj  Gertjan van Noord Alfainformatica, RUG, Postbus 716, 9700 AS Groningen vannoord at let dot rug dot nl http://www.let.rug.nl/~vannoord

Sun, 12 Jun 2005 07:46:14 GMT 


Fernando Pereir #7 / 9

DCG Limitations?
Quote:
> I'll see if I can find that book.
Easy: <http://www.mtome.com/pnladigital.html> Quote: > In the meantime I've abandoned DCGs altogether, It's actually proved > quite pleasantly easy to write a parser from scratch with just basic > prolog, it's not quite working yet, but essentially it's just rules > like: > rational(Tokens, Expr, Remaining) : > int(Tokens, N, ['/'Rest]), > int(Rest, D, Remaining), > Expr = N / D. > int([NumRremaining], Num, Remaining) : > integer(Num). > the point being that if I guarantee to consume at least one token for > each production I have to either fail or get to the end of the string. > I'm not a computer scientist, but I think that is pretty much what any > bottomup parser will do.
This is more strict that necessary, as Gertjan said. You may want to look at <http://www.ifs.org.uk/~popx/prolog/tools/files/read.pl> for a complete Prolog parser written in bottomup operator precedence style. It could be simplified and modified for your purposes. It goes together with the tokenizer <http://www.ifs.org.uk/~popx/prolog/tools/files/rdtok.gen>. Both are written in a somewhat obsolete dialect, you may need to edit here and there.  F

Mon, 13 Jun 2005 05:26:37 GMT 


Fergus Henders #8 / 9

DCG Limitations?
Quote:
>I thought to parse "proper" algebraic expressions >(so "2xy" means "2 * x * y" etc.) into equivalent >prolog structures, I wrote a tokenizing DFA to do the >preliminary parsing but very quickly hit a brick wall >wrt DCGs. ... >expr(X  Y) > expr(X), [''], primary(Y). >won't work, because of the direct recursion on expr(X).
One good way to handle this problem is to eliminate left recursion by leftfactoring the grammar. For example, instead of expr > expr, [''], primary. expr > primary. you can use expr > primary, expr2. expr2 > [''], primary, expr2. expr2 > []. The latter grammar parses the same language as the former, but has the advantage that it is not left recursive. It's easy enough to construct the same (leftassociative) parse tree as you'd expect for the original grammar, by passing in the parse tree for the lefthandside to `expr2'. Here's an example of how to do it. This is an extract taken from one of the sample programs in the Mercury distribution (samples/calculator.m). If you delete the : pred and : type declarations, the code should be valid Prolog. (Some of the names used are slightly different than in your original example: this uses `minus/2' instead of `/2' and `term' instead of `primary'.) : type expr > number(int) ; plus(expr, expr) ; minus(expr, expr) ; times(expr, expr) ; div(expr, expr). : pred expr(expr::out, list(char)::in, list(char)::out) is semidet. expr(Expr) > factor(Factor), expr2(Factor, Expr). : pred expr2(expr::in, expr::out, list(char)::in, list(char)::out) is semidet. expr2(Factor, Expr) > ( ['+'] > factor(Factor2), expr2(plus( Factor, Factor2), Expr) ; [''] > factor(Factor2), expr2(minus(Factor, Factor2), Expr) ; { Expr = Factor } ). : pred factor(expr::out, list(char)::in, list(char)::out) is semidet. factor(Factor) > term(Term), factor2(Term, Factor). : pred factor2(expr::in, expr::out, list(char)::in, list(char)::out) is semidet. factor2(Term, Factor) > ( ['*'] > term(Term2), factor2(times(Term,Term2), Factor) ; ['/'] > term(Term2), factor2(div( Term,Term2), Factor) ; { Factor = Term } ). 
The University of Melbourne  of excellence is a lethal habit" WWW: <http://www.cs.mu.oz.au/~fjh>   the last words of T. S. Garp.

Tue, 14 Jun 2005 08:18:42 GMT 


Bill Hail #9 / 9

DCG Limitations?
Quote:
>> I'll see if I can find that book. > Easy: > <http://www.mtome.com/pnladigital.html> >[...] > This is more strict that necessary, as Gertjan said. You may want to look > at <http://www.ifs.org.uk/~popx/prolog/tools/files/read.pl> for a complete > Prolog parser written in bottomup operator precedence style. It could be > simplified and modified for your purposes. It goes together with the > tokenizer <http://www.ifs.org.uk/~popx/prolog/tools/files/rdtok.gen>. Both > are written in a somewhat obsolete dialect, you may need to edit here and > there. >  F
Thanks for those links, I have a working parser now.  Bill Hails

Tue, 14 Jun 2005 22:31:06 GMT 


