portability of floats 
Author Message
 portability of floats

I need to represent arbitrary Standard Prolog terms as
implementation-independent byte strings, for transmission
between (conforming) systems and for storage in RDBMS etc.

I also seek to define the corresponding "read" and "write"
operations in Standard Prolog (although I expect to use foreign
language implementations) (NB a standard FLI might be cool).

I'm having some trouble with floats: I read (in Deransart's book)
that a conforming implementation must use some flavour of ISO/IEC
10967-1 "Language Independent Arithmetic", defined by values of
five parameters: radix, precision, smallest & largest exponent, and
whether it allows denormalised values.

Is the "IEEE 64-bit" representation exactly one such flavour?

Is there any conventional environment-enquiry for dynamically
discovering which flavour a Prolog implementation supports?

Is there more than one flavour in serious use?

Where two Prolog systems use equivalent float ADTs, I want to
transfer float values without any loss of information, respecting
the total internal ordering.

And where one system has e.g. higher float precision than another,
I guess I want to discover whether "read"ing a term involved actual
loss of precision (hence two terms, distinct in the "writing" system,
may be identical in the reading system).

NB I am not willing or able to stipulate that all applications should
distruct float comparisons.

To borrow some Java jargon, I reckon Standard Prolog needs a term
serialisation API...

Paul Singleton



Thu, 15 May 2003 03:00:00 GMT  
 portability of floats

Quote:

> And where one system has e.g. higher float precision than another,
> I guess I want to discover whether "read"ing a term involved actual
> loss of precision (hence two terms, distinct in the "writing" system,
> may be identical in the reading system).

Note that there are similar problems with other limits that some
Prologs have, e.g. the range of integers, size of atoms, arity of
structures...

Quote:
> To borrow some Java jargon, I reckon Standard Prolog needs a term
> serialisation API...

In a sense Prolog had that long before Java even existed,
in the form of the generic write_canonical/writeq and read
builtins. Since these use a human-readable format they are
unnecessarily complex, but if conversion speed is not an
important issue, they are fine.

--
 Joachim Schimpf              /             phone: +44 20 7594 8187

 London SW7 2AZ, UK         /    http://www.icparc.ic.ac.uk/eclipse



Fri, 16 May 2003 03:00:00 GMT  
 portability of floats

Quote:


> > And where one system has e.g. higher float precision than another,
> > I guess I want to discover whether "read"ing a term involved actual
> > loss of precision (hence two terms, distinct in the "writing" system,
> > may be identical in the reading system).
> Note that there are similar problems with other limits that some
> Prologs have, e.g. the range of integers, size of atoms, arity of
> structures...

Similar but different: it's possible that an external term can't be read
into some implementation, but if it can, there's no ambiguity.

But loss of precision is different: some apps just want the nearest
available float, others might be broken because e.g. two different
floats are read in as the same local value.

I think I can see what to do about the first set of problems, but I
need some help with the "float" issue.

Quote:
> > To borrow some Java jargon, I reckon Standard Prolog needs a term
> > serialisation API...
> In a sense Prolog had that long before Java even existed,
> in the form of the generic write_canonical/writeq and read
> builtins. Since these use a human-readable format they are
> unnecessarily complex, but if conversion speed is not an
> important issue, they are fine.

I wanted an external representation which

 * compactly represented recombinant terms

 * finitely represented cyclic terms

 * preserved internal order of variables within a term

 * handled "binary" atoms (arbitrary byte strings, e.g. OLE objects or
   database blobs) (I realise that, sadly, not all Standard Prolog
   implementations support these as well as SWI-Prolog does)

I think all read/write variants fail at least one of these requirements.

Paul Singleton



Sat, 17 May 2003 03:00:00 GMT  
 portability of floats

Quote:

> Similar but different: it's possible that an external term can't be read
> into some implementation, but if it can, there's no ambiguity.

> But loss of precision is different: some apps just want the nearest
> available float, others might be broken because e.g. two different
> floats are read in as the same local value.

> I think I can see what to do about the first set of problems, but I
> need some help with the "float" issue.

I'd say if you lose something while reading (whether it's the
low bits of a float or the high bits of a bignum), that's
an error, unless you've set some option to ignore it.

Quote:
> I wanted an external representation which

>  * compactly represented recombinant terms

what's that?

Quote:

>  * finitely represented cyclic terms

not Standard Prolog, though...

Quote:

>  * preserved internal order of variables within a term

I'd recommend to drop that requirement. I know why one might
want it, but (not unlike cyclic terms) it's the kind of feature
that's almost impossible to implement consistently across
a whole system.

Quote:

>  * handled "binary" atoms (arbitrary byte strings, e.g. OLE objects or
>    database blobs) (I realise that, sadly, not all Standard Prolog
>    implementations support these as well as SWI-Prolog does)

I'd expect that systems that handle them internally would
also read/writeq them properly:

[eclipse 2]: atom_codes(A, [97,98,0,99]).

A = 'ab\000c'
yes.

Quote:

> I think all read/write variants fail at least one of these requirements.

> Paul Singleton

--
 Joachim Schimpf              /             phone: +44 20 7594 8187

 London SW7 2AZ, UK         /    http://www.icparc.ic.ac.uk/eclipse


Sun, 18 May 2003 03:00:00 GMT  
 portability of floats

Quote:


> > I wanted an external representation which

> >  * compactly represented recombinant terms
> what's that?

"Recombinant" may not be the right jargon (any offers?) but if a term
contains many occurrences of (or references to) the same (large) subterm,
then writing it naively can require vastly more space than necessary, so
I check each functor (storage address, not name/arity) against those
already seen, and represent a subterm indirectly on its 2nd and subsequent
occurrence.

Quote:
> >  * finitely represented cyclic terms

> not Standard Prolog, though...

I'm just not comfortable with a spec. of a general-purpose term-writing
utility  which says "will go beserk and produce an infinite string if
passed a cyclic term: but that's not our responsibility because you
shouldn't be using them, even 'though there's no way either to detect
them or prove that your code cannot produce them..."

This is such a blatant denial-of-service weakness that, if it's possible
to serialise cyclic terms safely, and it is, I prefer to do so.

Also, I do not control the programmers whose code might use this term
I/O, and some of them might think they can exploit cyclic terms safely
and advantageously, and I believe they may be right (e.g. representing
a cyclic graph with cyclic terms allows some graph operations to be
implemented at a lower order of cost than is possible with non-cyclic
representations, and if they are competing with Java or C++ developers
to do e.g. some commercial routing optimisation then they don't need
this extra disadvantage, whatever its theological merits :-)

Quote:
> >  * preserved internal order of variables within a term

> I'd recommend to drop that requirement. I know why one might
> want it, but (not unlike cyclic terms) it's the kind of feature
> that's almost impossible to implement consistently across
> a whole system.

I found it fairly easy to implement: each written term is preceded by
a count of its distinct variables: the 'read' routine creates an array
of this many new variables (using length/2) and sorts them before using
them.  I hope I haven't overlooked something :-/

Quote:
> >  * handled "binary" atoms (arbitrary byte strings, e.g. OLE objects or
> >    database blobs) (I realise that, sadly, not all Standard Prolog
> >    implementations support these as well as SWI-Prolog does)

> I'd expect that systems that handle them internally would
> also read/writeq them properly:

> [eclipse 2]: atom_codes(A, [97,98,0,99]).

> A = 'ab\000c'
> yes.

You're right: all systems should behave this well (do they?)

Paul Singleton



Sun, 18 May 2003 03:00:00 GMT  
 portability of floats

Quote:



>> >  * finitely represented cyclic terms

>> not Standard Prolog, though...

>I'm just not comfortable with a spec. of a general-purpose term-writing
>utility  which says "will go beserk and produce an infinite string if
>passed a cyclic term: but that's not our responsibility because you
>shouldn't be using them, even 'though there's no way either to detect
>them or prove that your code cannot produce them..."

>This is such a blatant denial-of-service weakness that, if it's possible
>to serialise cyclic terms safely, and it is, I prefer to do so.

Interesting point.

But there certainly are ways to prove that your code cannot produce
cyclic terms.  One such method is to do mode analysis on your code.
For example, code which passes the Mercury mode checker, and which is
called in a way that satisfies its mode declarations, and which does
not explicitly use the `any' inst (dynamic modes), the `store' module
(which provides routines for creating cyclic data structures), or the
foreign language interface, cannot create cyclic terms.

--

                                    |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.



Mon, 19 May 2003 03:00:00 GMT  
 portability of floats


|> I found it fairly easy to implement: each written term is preceded by
|> a count of its distinct variables: the 'read' routine creates an array
|> of this many new variables (using length/2) and sorts them before using
|> them.  I hope I haven't overlooked something :-/

As everybody knows: never get involved in a land war in Asia.
And above all, never sort variables !

Perhaps you have overlooked ISO ?
Sorting variables is implementation dependent.

Bart Demoen



Fri, 23 May 2003 03:00:00 GMT  
 portability of floats

Quote:



> |> I found it fairly easy to implement: each written term is preceded by
> |> a count of its distinct variables: the 'read' routine creates an array
> |> of this many new variables (using length/2) and sorts them before using
> |> them.  I hope I haven't overlooked something :-/
> As everybody knows: never get involved in a land war in Asia.

OK, even to a mercenary like me, I reckon that's good advice :-)

Quote:
> And above all, never sort variables !

Does it follow that we should never sort non-ground terms?

Should sort/2 raise an exception if its result depends upon the
internal ordering of variables?

Quote:
> Perhaps you have overlooked ISO ?
> Sorting variables is implementation dependent.

Surely sorting anything is defined by the prevailing, implementation-
defined internal ordering of terms?

Doesn't ISO require any conforming implementation to define *some*
internal ordering of variables?

So isn't it possible in principle to serialise a non-ground term from
system A, and reconstruct it in system B so that the (new) variables
in the reconstructed term have the same total ordering (based on the
local, implementation-specific internal-order scheme) as those in the
original term?

Does the scheme I (crudely) sketched achieve that?

If anything invalidates my efforts, it isn't the implementation-defined
ordering of variables (of which I was aware, and which I reckon I
circumvent), but the leeway which ISO allows for the ordering to change
during execution...

Paul Singleton



Fri, 23 May 2003 03:00:00 GMT  
 portability of floats


Quote:
> Does it follow that we should never sort non-ground terms?

No. As long as you don't need to compare variables, it is ok to sort
non-grond terms.

Quote:
> Should sort/2 raise an exception if its result depends upon the
> internal ordering of variables?

Depends on what you mean by "should" - as in "would it be a good
thing", perhaps yes.

Quote:
> Surely sorting anything is defined by the prevailing, implementation-
> defined internal ordering of terms?

> Doesn't ISO require any conforming implementation to define *some*
> internal ordering of variables?

If I remember well, the order of Prolog terms is determined by ISO
(not by implementations), except where the order between variables
is concerned; the order of variables is implementation "dependend",
not implementation "defined" (see fine print of ISO for the
difference). Most implementors are convinced that they can't define
an order on variables, other than by "whatever the address of the
variable is at a particular moment, determines its rank" - except for
implementors who work with time stamps (explicit or simulated).

Quote:
> So isn't it possible in principle to serialise a non-ground term from
> system A, and reconstruct it in system B so that the (new) variables
> in the reconstructed term have the same total ordering (based on the
> local, implementation-specific internal-order scheme) as those in the
> original term?

I think it is impossible in an ISO standard way, and I would be very
surprised if you can do it at all without learning the intricacies of
(the release of the day of) systems A and B. But you can do something
that is better: start by ignoring the particular order on variables
that an implementation (unreliably) provides. Now BYO order, i.e.
instead of serializing a term like f(X,Y,X), serialize
g([Y,X],f(X,Y,Z)) (e,g, by using a write canonical). Reading this
term, will give you on all Prolog systems a term g(L,T) which has as
invariant:

        the variables in T are ordered (*) according to their position
        in L

(*) the BYO order

Consequently, whenever you need to sort a list of terms and you would
need variable order, you should provide this order yourself; don't
use sort/2 but sort/3 where the extra argument is a list of variables
with the BYO order. I have advocated on several occasions such a
builtin. It is easy to implement (with complexity as good as sort/2)
and can be completely specified.

Quote:
> If anything invalidates my efforts, it isn't the implementation-defined
> ordering of variables

Once more, the ISO terminology here is implementation-DEPENDEND. It means
(a.o.) that it is not required to be in the manual. Maybe PLAI people can
comment on their experience on what it means in practice :-)

Cheers

Bart Demoen



Sat, 24 May 2003 03:00:00 GMT  
 portability of floats

Quote:

> > Should sort/2 raise an exception if its result depends upon the
> > internal ordering of variables?

> Depends on what you mean by "should" - as in "would it be a good
> thing", perhaps yes.

If we want to be on the safe side, yes.

Any order that involves variables is "fragile" in the sense that
the order may get destroyed as soon as a variable gets instantiated
or unified with another variable (even one that was not involved in
the sort). So it is probably better not to rely on the order being
maintained at all.

However, the ISO standard has done something quite clever by
requiring that the variable order remain constant _during_ sorting.
Since that is guaranteed, you can still use sorting to

 - separate variables from non-variables
 - eliminate duplicates (including duplicate variables)
 - group identical variables together (when not removing duplicates)

which are probably reasonably common programming techniques.

--
 Joachim Schimpf              /             phone: +44 20 7594 8187

 London SW7 2AZ, UK         /    http://www.icparc.ic.ac.uk/eclipse



Sat, 24 May 2003 03:00:00 GMT  
 portability of floats


|> However, the ISO standard has done something quite clever by
|> requiring that the variable order remain constant _during_ sorting.

Indeed quite clever because it is cunningly deceiving: the standard
does not mention by what means you can sort (with one exception). It
merely defines (but in a rather loose way in section 7.1.6.5) what a
sorted list is. I bet that you can't read the standard as: my own
implemented quicksort is guaranteed to produce a sorted list.

As far as I know the standard, there is only one way to produce a
sorted list; it is by using setof. Which means that the only reliable
way to write sort/2 in Prolog, is something like:

        sort(In,Out) :-
                setof(X,member(X,In),Out).

You can indeed use this sort to

|>  - separate variables from non-variables
|>  - eliminate duplicates (including duplicate variables)

However, to

|>  - group identical variables together (when not removing duplicates)
the above is useless and I wouldn't trust any of your hand crafted

BTW, to remove duplicates from a list of variables, you shouldn't use
sorting at all: sorting is O(nlog(n)) and you can do it in O(n) - for
those not on the XSB mailing list, there is perhaps a small challenge
here. The sorting issue came up there as well.

Cheers

Bart Demoen



Sun, 25 May 2003 03:00:00 GMT  
 portability of floats


Quote:
>BTW, to remove duplicates from a list of variables, you shouldn't use
>sorting at all: sorting is O(nlog(n))

Sort results are also less reproducible because one depends on some
variable ordering scheme. Otherwise neglectable program changes
may change that order.

Quote:
>                                       and you can do it in O(n) - for
>those not on the XSB mailing list, there is perhaps a small challenge
>here. The sorting issue came up there as well.

O(n) in ISO-Prolog seems only possible if you assume finite lists.

With the advent of SICStus 3.8.5 the new library predicate
terms:term_variables_bag(Term,Vars) can be used to this avail.
It is O(n) in the size of Term's representation, with Vars in the
order as they occur (left-to-right) in Term.

--
Ulrich Neumerkel http://www.complang.tuwien.ac.at/ulrich/



Sun, 25 May 2003 03:00:00 GMT  
 portability of floats




...
|> >those not on the XSB mailing list, there is perhaps a small challenge
|> >here. The sorting issue came up there as well.
|>
|> O(n) in ISO-Prolog seems only possible if you assume finite lists.

For non-finite lists, I cannot use ISO Prolog.

|> With the advent of SICStus 3.8.5 the new library predicate
|> terms:term_variables_bag(Term,Vars) can be used to this avail.
|> It is O(n) in the size of Term's representation, with Vars in the
|> order as they occur (left-to-right) in Term.

But terms:term_variables_bag/2 isn't ISO either. The challenge is to do it
in pure ISO Prolog.

Cheers

Bart Demoen



Mon, 26 May 2003 15:48:07 GMT  
 portability of floats

Quote:

> BTW, to remove duplicates from a list of variables, you shouldn't use
> sorting at all: sorting is O(nlog(n)) and you can do it in O(n) - for
> those not on the XSB mailing list, there is perhaps a small challenge
> here.

Here's one solution:

no_duplicates(Vars, NoDupVars) :-
        copy_term(Vars, Flags),
        collect_vars(Vars, Flags, NoDupVars).

collect_vars([], [], []).
collect_vars([X|Xs], [Flag|Flags], Ys) :-
        ( var(Flag) ->
            Flag = seen,
            Ys = [X|Ys0],
            collect_vars(Xs, Flags, Ys0)
        ;
            collect_vars(Xs, Flags, Ys)
        ).

--
 Joachim Schimpf              /             phone: +44 20 7594 8187

 London SW7 2AZ, UK         /    http://www.icparc.ic.ac.uk/eclipse



Mon, 26 May 2003 03:00:00 GMT  
 
 [ 14 post ] 

 Relevant Pages 

1. An alternative to floating point (was Re: Floating point non-exactness)

2. MSBIN float format <-> IEEE float format

3. VAX float to IEEE float converter

4. Float/Short Float Conversion

5. Getting a floating point number from a float-object

6. vax floating pont to unix floating point

7. IBM 370 Floating point to IEEE floating point

8. Order of float and float precision contaigon defined?

9. Coercing single-float to double-float

10. Dolphin portability to GNU Smalltalk and Squeak

11. portability question

12. portability of apl

 

 
Powered by phpBB® Forum Software