Changing the Division Operator -- PEP 238, rev 1.12

Here's a new revision of PEP 238.  I've incorporated clarifications of
issues that were brought up during the discussion of rev 1.10 -- from
typos via rewording of ambiguous phrasing to the addition of new open
issues.  I've decided not to go for the "quotient and ratio"
terminology -- my rationale is in the PEP.

IF YOU WANT ME TO SEE YOUR COMMENTS, DON'T CHANGE THE SUBJECT OF YOUR
FOLLOWUP.  I cannot follow all of c.l.py, but I will follow this one
thread.  Please don't fragment the thread -- I won't see other

--Guido van Rossum (home page: http://www.*-*-*.com/ ~guido/)

PEP: 238
Title: Changing the Division Operator
Version: $Revision: 1.12 $

Status: Draft
Type: Standards Track
Created: 11-Mar-2001
Python-Version: 2.2
Post-History: 16-Mar-2001, 26-Jul-2001, 27-Jul-2001

Abstract

The current division (/) operator has an ambiguous meaning for
numerical arguments: it returns the floor of the mathematical
result of division if the arguments are ints or longs, but it
returns a reasonable approximation of the division result if the
arguments are floats or complex.  This makes expressions expecting
float or complex results error-prone when integers are not
expected but possible as inputs.

We propose to fix this by introducing different operators for
different operations: x/y to return a reasonable approximation of
the mathematical result of the division ("true division"), x//y to
return the floor ("floor division").  We call the current, mixed
meaning of x/y "classic division".
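
As a small illustration of the two proposed operators, here is a sketch that assumes an interpreter where / already means true division (Python 3, or a module using the future division statement). The helper name show_division is made up for this example.

```python
def show_division(x, y):
    """Return (true quotient, floor quotient) for x and y."""
    return x / y, x // y


print(show_division(7, 2))      # (3.5, 3)
print(show_division(7.0, 2.0))  # (3.5, 3.0)
print(show_division(-7, 2))     # (-3.5, -4): floor rounds toward minus infinity
```

Note that floor division rounds toward minus infinity, not toward zero, which is why the last quotient is -4 rather than -3.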

Because of severe backwards compatibility issues, not to mention a
major flamewar on c.l.py, we propose the following transitional
measures (starting with Python 2.2):

- Classic division will remain the default in the Python 2.x
series; true division will be standard in Python 3.0.

- The // operator will be available to request floor division
unambiguously.

- The future division statement, spelled "from __future__ import
division", will change the / operator to mean true division
throughout the module.

- A command line option will enable run-time warnings for classic
division applied to int or long arguments; another command line
option will make true division the default.

- The standard library will use the future division statement and
the // operator when appropriate, so as to completely avoid
classic division.

Motivation

The classic division operator makes it hard to write numerical
expressions that are supposed to give correct results from
arbitrary numerical inputs.  For all other operators, one can
write down a formula such as x*y**2 + z, and the calculated result
will be close to the mathematical result (within the limits of
numerical accuracy, of course) for any numerical input type (int,
long, float, or complex).  But division poses a problem: if the
expressions for both arguments happen to have an integral type, it
implements floor division rather than true division.

The problem is unique to dynamically typed languages: in a
statically typed language like C, the inputs, typically function
arguments, would be declared as double or float, and when a call
passes an integer argument, it is converted to double or float at
the time of the call.  Python doesn't have argument type
declarations, so integer arguments can easily find their way into
an expression.

The problem is particularly pernicious since ints are perfect
substitutes for floats in all other circumstances: math.sqrt(2)
returns the same value as math.sqrt(2.0), 3.14*100 and 3.14*100.0
return the same value, and so on.  Thus, the author of a numerical
routine may only use floating point numbers to test his code, and
believe that it works correctly, and a user may accidentally pass
in an integer input value and get incorrect results.

Another way to look at this is that classic division makes it
difficult to write polymorphic functions that work well with
either float or int arguments; all other operators already do the
right thing.  No algorithm that works for both ints and floats has
a need for truncating division in one case and true division in
the other.
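
To make the failure mode concrete, the sketch below emulates classic division in a helper, since a modern interpreter no longer has it. The names classic_div, mean_classic and mean_true are hypothetical, and the emulation assumes "floor for int/int, true division otherwise" as described in the Abstract.

```python
def classic_div(x, y):
    """Emulate classic '/': floor for int/int, true division otherwise."""
    if isinstance(x, int) and isinstance(y, int):
        return x // y
    return x / y


def mean_classic(values):
    # Written, and tested, with float inputs in mind.
    return classic_div(sum(values), len(values))


def mean_true(values):
    # The same routine under true division.
    return sum(values) / len(values)


print(mean_classic([1.0, 2.0]))  # 1.5
print(mean_classic([1, 2]))      # 1, silently floored
print(mean_true([1, 2]))         # 1.5
```

The routine passes every test that uses floats and still returns a wrong answer, without any error, as soon as a caller passes ints.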

The correct work-around is subtle: casting an argument to float()
is wrong if it could be a complex number; adding 0.0 to an
argument doesn't preserve the sign of the argument if it was minus
zero.  The only solution without either downside is multiplying an
argument (typically the first) by 1.0.  This leaves the value and
sign unchanged for float and complex, and turns int and long into
a float with the corresponding value.
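
The three work-arounds can be compared directly. This is only a sketch; sign_of is a made-up helper that reads the sign bit through math.copysign.

```python
import math


def sign_of(x):
    """Return -1.0 or 1.0 according to the sign bit of a float."""
    return math.copysign(1.0, x)


minus_zero = -0.0

# Adding 0.0 loses the sign of minus zero; multiplying by 1.0 keeps it.
print(sign_of(minus_zero + 0.0))  # 1.0
print(sign_of(minus_zero * 1.0))  # -1.0

# float() rejects complex values, while multiplying by 1.0 leaves them alone.
try:
    float(1 + 2j)
except TypeError as exc:
    print("float() failed:", exc)
print((1 + 2j) * 1.0)  # (1+2j)

# An int becomes a float with the same value.
print(type(3 * 1.0).__name__, 3 * 1.0)  # float 3.0
```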

It is the opinion of the authors that this is a real design bug in
Python, and that it should be fixed sooner rather than later.
Assuming Python usage will continue to grow, the cost of leaving
this bug in the language will eventually outweigh the cost of
fixing old code -- there is an upper bound to the amount of code
to be fixed, but the amount of code that might be affected by the
bug in the future is unbounded.

Another reason for this change is the desire to ultimately unify
Python's numeric model.  This is the subject of PEP 228[0] (which
is currently incomplete).  A unified numeric model removes most of
the user's need to be aware of different numerical types.  This is
good for beginners, but also takes away concerns about different
numeric behavior for advanced programmers.  (Of course, it won't
remove concerns about numerical stability and accuracy.)

In a unified numeric model, the different types (int, long, float,
complex, and possibly others, such as a new rational type) serve
mostly as storage optimizations, and to some extent to indicate
orthogonal properties such as inexactness or complexity.  In a
unified model, the integer 1 should be indistinguishable from the
floating point number 1.0 (except for its inexactness), and both
should behave the same in all numeric contexts.  Clearly, in a
unified numeric model, if a==b and c==d, a/c should equal b/d
(taking some liberties due to rounding for inexact numbers), and
since everybody agrees that 1.0/2.0 equals 0.5, 1/2 should also
equal 0.5.  Likewise, since 1//2 equals zero, 1.0//2.0 should also
equal zero.
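
The consistency property can be spot-checked on a few int/float pairs. This sketch assumes true division for / (as in Python 3) and uses small values where rounding is not an issue; the list of pairs is arbitrary.

```python
pairs = [(1, 1.0), (2, 2.0), (7, 7.0), (-3, -3.0)]

for a, b in pairs:
    for c, d in pairs:
        assert a == b and c == d
        assert a / c == b / d    # true division agrees across types
        assert a // c == b // d  # floor division agrees across types

print(1 / 2, 1.0 / 2.0)    # 0.5 0.5
print(1 // 2, 1.0 // 2.0)  # 0 0.0
```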

Variations

Aesthetically, x//y doesn't please everyone, and hence several
variations have been proposed: x div y, or div(x, y), sometimes in
combination with x mod y or mod(x, y) as an alternative spelling
for x%y.

We consider these solutions inferior, on the following grounds.

- Using x div y would introduce a new keyword.  Since div is a
popular identifier, this would break a fair amount of existing
code, unless the new keyword was only recognized under a future
division statement.  Since it is expected that the majority of
code that needs to be converted is dividing integers, this would
greatly increase the need for the future division statement.
Even with a future statement, the general sentiment against
adding new keywords unless absolutely necessary argues against
this.

- Using div(x, y) makes the conversion of old code much harder.
Replacing x/y with x//y or x div y can be done with a simple
query replace; in most cases the programmer can easily verify
that a particular module only works with integers so all
occurrences of x/y can be replaced.  (The query replace is still
needed to weed out slashes occurring in comments or string
literals.)  Replacing x/y with div(x, y) would require a much
more intelligent tool, since the extent of the expressions to
the left and right of the / must be analyzed before the
placement of the "div(" and ")" part can be decided.
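
As a rough aid for the query replace mentioned above, the standard tokenize module can list the / operators in a source string while skipping slashes inside comments and string literals. This is only a sketch, not a tool the PEP proposes; the function name find_division_operators is hypothetical, and deciding whether each hit should become // is still up to the programmer.

```python
import io
import tokenize


def find_division_operators(source):
    """Return (row, col) positions of '/' operator tokens in Python source."""
    positions = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Comments and string literals are separate token types, so a
        # slash inside them never shows up as an OP token.
        if tok.type == tokenize.OP and tok.string == "/":
            positions.append(tok.start)
    return positions


sample = "a = x / y  # not this one: a/b\ns = 'p/q'\n"
print(find_division_operators(sample))  # [(1, 6)]
```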

Alternatives

In order to reduce the amount of old code that needs to be
converted, several alternative proposals have been put forth.
Here is a brief discussion of each proposal (or category of
proposals).  If you know of an alternative that was discussed on
c.l.py that isn't mentioned here, please mail the second author.

- Let / keep its classic semantics; introduce // for true
division.  This still leaves a broken operator in the language,
and invites use of the broken behavior.  It also shuts off the
road to a unified numeric model a la PEP 228[0].

- Let int division return a special "portmanteau" type that
behaves as an integer in integer context, but like a float in a
float context.  The problem with this is that after a few
operations, the int and the float value could be miles apart,
it's unclear which value should be used in comparisons, and of
course many contexts (like conversion to string) don't have a
clear integer or float context.

- Use a directive to use specific division semantics in a module,
rather than a future statement.  This retains classic division
as a permanent wart in the language, requiring future
generations of Python programmers to be aware of the problem and
the remedies.

- Use "from __past__ import division" to use classic division
semantics in a module.  This also retains the classic division
as a permanent wart, or at least for a long time (eventually the
past division statement could raise an ImportError).

- Use a directive (or some other way) to specify the Python
version for which a specific piece of code was developed.  This
requires future Python interpreters to be able to emulate
*exactly* several previous versions of Python, and moreover to
do so for
...

Wed, 14 Jan 2004 03:48:07 GMT

Quote:
> I prefer the
>       terminology to be slightly awkward if that avoids unambiguity.

Yes, avoiding unambiguity is very important...
--
David Eppstein       UC Irvine Dept. of Information & Computer Science

Wed, 14 Jan 2004 04:14:26 GMT
there is just one unpleasant thing left over:
in the transition phase it's __div__, __truediv__, __floordiv__ and their
respective 'i' versions for augmented assignment. this solution allows the
object to "see" what the caller wants for a division type. ok, but..

classes would need a change when upgrading from 2.x to 3.x ( __truediv__
will become the same as __div__ in 3.x but it's not with "from __future__
import division" in 2.x)? what would 3.x do? search for __truediv__ and
then for __div__? that would give wrong results with old modules that only
have a __div__ (emulating the misbehaviour) and no __truediv__. (ok i
don't really believe in classes emulating some misbehaviour, see next
paragraph)

wouldn't it be better if only __div__ and __floordiv__ were needed?
classes could not provide the two different meanings of '/' but how many
classes really need this? when somebody is implementing his own number
object he will provide a true division anyway, won't he?
this would also *not* introduce a new name that's later unused/merged with
__div__. (remember: one obvious way to do it)

and now for the positive things...
I can agree with the motivation section. Now that the big picture is mostly
clear - i like it. (Can't wait for a unified numeric model, it sounds good
;-)

chris

the following two reasons have changed my mind from "oh no! - code breaks"
to "yes! let's do it":

Quote:
> PEP: 238
...
>     It is the opinion of the authors that this is a real design bug in
>     Python, and that it should be fixed sooner rather than later.
>     Assuming Python usage will continue to grow, the cost of leaving
>     this bug in the language will eventually outweigh the cost of
>     fixing old code -- there is an upper bound to the amount of code
>     to be fixed, but the amount of code that might be affected by the
>     bug in the future is unbounded.

>     Another reason for this change is the desire to ultimately unify
>     Python's numeric model.  This is the subject of PEP 228[0] (which
>     is currently incomplete).  A unified numeric model removes most of
>     the user's need to be aware of different numerical types.  This is
>     good for beginners, but also takes away concerns about different
>     numeric behavior for advanced programmers.  (Of course, it won't
>     remove concerns about numerical stability and accuracy.)

...

--

Wed, 14 Jan 2004 07:55:20 GMT

Quote:
> Here's a new revision of PEP 238.  (...)

I'm sure someone else will find something to comment on, but it looks
good to me.  No suggestions at this point.

One question though - is it anticipated that the items in the Open
section (e.g., the generation of OverflowError) need to be resolved as
part of the PEP process, or will it be postponed until implementation
and then the PEP adjusted?  I'm unlikely to be coding in cases
affected by the decision, and see arguments on both sides, so I don't
have a particular preference.

--
-- David
--
/-----------------------------------------------------------------------\

|             FitLinxx, Inc.            \  Phone: (203) 708-5192    |
/  860 C{*filter*}Street, Stamford, CT  06902   \  Fax: (203) 316-5150     \
\-----------------------------------------------------------------------/

Wed, 14 Jan 2004 08:01:55 GMT

Quote:

> Here's a new revision of PEP 238.  I've incorporated clarifications of
> issues that were brought up during the discussion of rev 1.10

Thanks.

Originally, I was uncomfortable with the change.  But now I'm starting
to warm up to it nicely.  The threads about non-programmers (e.g.,
medical technicians) getting caught by classic division were quite
compelling.  Sometimes I forget that programming itself isn't the goal.

I think my original discomfort was due to using C for so long (and
damaging my brain along the way).  But maybe C is like tobacco, and if
I quit, maybe my brain will eventually heal.  One can hope.  :-)

-Mark

Wed, 14 Jan 2004 08:02:39 GMT

|
| Semantics of Floor Division
|
|     Floor division will be implemented in all the Python numeric
|     types, and will have the semantics of
|
|         a // b == floor(a/b)
|
| [...]
|     For complex numbers, // raises an exception, since float() of a
|     complex number is not allowed.

Shouldn't this read, "since floor() of a complex number is not allowed"?
That would make more sense in context..

--
Cliff Crawford            http://www.sowrong.org/
A sign should be posted over every campus toilet:
"This flush comes to you by courtesy of capitalism."
-- Camille Paglia

Wed, 14 Jan 2004 09:31:46 GMT

Quote:

> |     For complex numbers, // raises an exception, since float() of a
> |     complex number is not allowed.

> Shouldn't this read, "since floor() of a complex number is not allowed"?
> That would make more sense in context..

Yes.  But it's the same difference: floor() of a complex doesn't work
because float() of a complex doesn't work. :-)
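
For what it's worth, this is how it looks in a current Python 3 interpreter, which is an assumption beyond the draft under discussion: the floor identity holds for ints and floats, and all three operations are rejected for complex with a TypeError.

```python
import math

# For ints and floats, a // b matches floor(a / b) in these small cases.
for a, b in [(7, 2), (-7, 2), (7.5, 2), (-7.5, 2)]:
    assert a // b == math.floor(a / b)

# complex supports neither floor() nor float(), and // is rejected as well.
attempts = (
    lambda: math.floor(1 + 2j),
    lambda: float(1 + 2j),
    lambda: (1 + 2j) // 1,
)
for attempt in attempts:
    try:
        attempt()
    except TypeError as exc:
        print("TypeError:", exc)
```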

Wed, 14 Jan 2004 10:34:57 GMT

Quote:

> there is just one unpleasant thing left over:
> in the transition phase it's __div__, __truediv__, __floordiv__ and their
> respective 'i' versions for augmented assignment. this solution allows the
> object to "see" what the caller wants for a division type. ok, but..

> classes would need a change when upgrading from 2.x to 3.x ( __truediv__
> will become the same as __div__ in 3.x but it's not with "from __future__
> import division" in 2.x)? what would 3.x do? search for __truediv__ and
> then for __div__? that would give wrong results with old modules that only
> have a __div__ (emulating the misbehaviour) and no __truediv__. (ok i
> don't really believe in classes emulating some misbehaviour, see next
> paragraph)

> wouldn't it be better if only __div__ and __floordiv__ were needed?
> classes could not provide the two different meanings of '/' but how many
> classes really need this? when somebody is implementing his own number
> object he will provide a true division anyway, won't he?
> this would also *not* introduce a new name that's later unused/merged with
> __div__. (remember: one obvious way to do it)

I have to think some more about this.  Maybe nb_true_divide should
look for __truediv__ first and then fall back to __div__.  Not so for
nb_floor_divide though: it is a new operator.
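
A rough Python-level sketch of that fallback idea follows. It is purely illustrative: true_divide and OldStyleNumber are made-up names, the real lookup would live in the C-level nb_true_divide slot, and reflected methods are ignored here.

```python
def true_divide(a, b):
    """Hypothetical lookup order: __truediv__ first, then __div__."""
    for name in ("__truediv__", "__div__"):
        method = getattr(type(a), name, None)
        if method is not None:
            result = method(a, b)
            if result is not NotImplemented:
                return result
    raise TypeError("no division method found")


class OldStyleNumber:
    """Stands in for a legacy class that only defines __div__."""

    def __init__(self, value):
        self.value = value

    def __div__(self, other):
        return OldStyleNumber(self.value / other.value)


print(true_divide(OldStyleNumber(1.0), OldStyleNumber(4.0)).value)  # 0.25
print(true_divide(3, 4))  # 0.75, found via int.__truediv__
```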

Wed, 14 Jan 2004 10:37:16 GMT

Quote:

> One question though - is it anticipated that the items in the Open
> section (e.g., the generation of OverflowError) need to be resolved as
> part of the PEP process, or will it be postponed until implementation
> and then the PEP adjusted?  I'm unlikely to be coding in cases
> affected by the decision, and see arguments on both sides, so I don't
> have a particular preference.

Generally, a PEP isn't finished until the implementation is done --
in the last stages of implementation, details of a PEP often get
corrected.  So the answer to your question is "Yes". :)

(For some PEPs, the PEP follows the implementation -- this seems to be
what's happening for type/class unification, where I'm always ahead of
the PEP in my code...)

Wed, 14 Jan 2004 10:39:24 GMT

Quote:

> I think my original discomfort was due to using C for so long (and
> damaging my brain along the way).  But maybe C is like tobacco, and if
> I quit, maybe my brain will eventually heal.  One can hope.  :-)

There's also the phenomenon Tim calls "Extreme Fear of Floating
Point".  That used to be a rational fear (no pun intended :-) but that
was long ago.

Wed, 14 Jan 2004 10:40:34 GMT

Quote:
>     The current division (/) operator has an ambiguous meaning for
>     numerical arguments: it returns the floor of the mathematical
>     result of division if the arguments are ints or longs, but it
>     returns a reasonable approximation of the division result if the
>     arguments are floats or complex.  This makes expressions expecting
>     float or complex results error-prone when integers are not
>     expected but possible as inputs.

I previously suggested at this point:
Quote:
> > and vice versa for integer expressions getting float input

which I should revise to include a key stipulation:
and vice versa for integer expressions getting float integer input

Quote:
> That's not the same argument -- expressions expecting ints are
> generally broken when they receive floats; expressions expecting floats
> are *only* broken when they receive ints IF THEY USE DIVISION.  The
> PEP cannot do anything about floats passed in where ints are expected
> -- such usage is broken.

I disagree; due to my leaving a key word out, you have missed one of
the virtues of the proposal that convinced me to support it.  An
expression that is valid for int integers (e.g., 387) is just as valid
for float integers (e.g., 387.0), which you claim to be +/- the same
thing (modulo type).  With //, such expressions will give the same
numerical result with either input,  whereas they now, with /,
generally give different numerical results.

Example: 20//6 = 3,   20.0//6.0 = 3.0 (hooray),   20.0/6.0 = 3.33... (ugh,
if I want 3==3.0)
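
A quick check of that example, assuming an interpreter where / is true division (so the int case of / no longer floors either):

```python
for i, j in [(20, 6), (20.0, 6.0)]:
    # repr() makes the int/float distinction of the results visible.
    print(repr(i // j), repr(i / j))

# Floor division gives the same numerical answer for both input types.
assert 20 // 6 == 20.0 // 6.0 == 3
# True division of 20 by 6 is roughly 3.33, never 3.
assert abs(20.0 / 6.0 - 10 / 3) < 1e-12
```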

A corollary effect: suppose I publish a discrete algorithm using
Python and someone runs it on a typical hand calculator that converts
int literals to their float equivalent and I explain that 'i//j' means
to enter 'i / j = floor' (with the flooring done by subtraction if
necessary) and the person does just that.  The answer will be correct!
------

Me:

Quote:
> > The only way I can think of to 'globally' turn on new division on my
> > own Windows computer is to rebind .py and .pyc to python.bat
> > containing '...python.exe -Dnew'.
Guido:
> Do I need to add that to the PEP?

and in response to someone else

Quote:
> What exactly would you like to see addressed in the PEP?  Should I
> spell out that you have to do a query replace of / to //, approving the
> replace if the arguments are integers?  That seems pretty trivial.

Right now you are writing to persuade and satisfy people who are
mostly experienced programmers.  Later it will remain to be read by a
wider variety of people, including beginners.  Have to? No.
Advisable? Maybe so.  Even with the rewrite, the proposal may still
have enough shock value to new readers to inhibit some people's normal
thought process.

Terry J. Reedy

Wed, 14 Jan 2004 11:27:41 GMT

Quote:
>Originally, I was uncomfortable with the change.  But now I'm starting
>to warm up to it nicely.  The threads about non-programmers (e.g.,
>medical technicians) getting caught by classic division were quite
>compelling.  Sometimes I forget that programming itself isn't the goal.

Would be a lot more compelling were they reports *from* rather than
reports *about*. There probably was once such a technician.  Until
I hear from him (or them), I give the entire sub-thread weight in
range -0 to 0.

Caught how?  Typo and correct. Happened to me more than once.
Personally. Typo and 15 minute debug. Personally. At least once.
Are you compelled? Why?

Hearsay is forbidden as evidence in formal proceedings for
good reason. Should be informally forbidden here, IMO.

No opportunity to cross examine, to establish credibility and context.

On the merits only, please.

All of which is without comment as to / or //.  My issues/concerns
have been mostly about process.  Which is why I felt it necessary
to pop in here.

Reasoned PEPs have gone a long way toward calming the waters.

Next time round, hopefully, no Queen for a Day stories.

ART

Wed, 14 Jan 2004 11:11:38 GMT

Quote:
>API Changes
>    - Overloaded operator methods:

>      __div__(), __floordiv__(), __truediv__();

>      __idiv__(), __ifloordiv__(), __itruediv__().
>    In Python 3.0, the classic division semantics will be removed; the
>    classic division APIs will become synonymous with true division.

I think you mentioned that __truediv__ and __itruediv__ will go away
when we reach 3.0; I think that would be good to include in the PEP.

I also still think that __truediv__ and __itruediv__ cause more
problems than they solve; I've thought about the issue some more,
and I hope I can now explain a little better why.

Consider an existing
    class Foo:
        def __div__(self, other):
            ...

With your proposal, this will break as soon as it's called from
a module importing the future division.

The author of the original class Foo then has the option of:

a)      simply adding __truediv__ = __div__

b)      changing __div__ to __truediv__, requiring all users
        of the class to import the future division;
        knowing that it will break again when we hit 3.0.

c)      implementing __truediv__ with different semantics
        than __div__, so that division of Foo can behave
        differently depending on whether the user imports
        the future division or not.

d)      and of course, telling me it will work again in 3.0,
        so he doesn't want to spend any effort on changing it
        (or not telling me anything, as that may be).

I'm just totally unconvinced that the advantages of (c)
outweigh both the immediate breakage, and the possibilities
for options (b) and (d).

That's why I would like to ask:
**************************************************************
**                                                          **
** Who out there has a need for (c), please make your case! **
**                                                          **
**************************************************************

If there's really a demonstrated need for this, I would suggest
letting x/y, with future division imported, fall back to __div__
if there's no __truediv__ (and perhaps the other way around after
we switch to 3.0, but I doubt that would be needed).

If there's no demonstrated need, I repeat my suggestion to simply keep
__div__/__idiv__ for x/y and use __floordiv__/__ifloordiv__ for x//y.
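
For reference, option (a) above amounts to a one-line alias. The sketch below fills in a toy body for Foo (the value attribute and the arithmetic are assumptions, not part of the original example) and runs under an interpreter where x/y dispatches to __truediv__.

```python
class Foo:
    """Toy number wrapper; only the class name comes from the example."""

    def __init__(self, value):
        self.value = value

    def __div__(self, other):
        return Foo(self.value / other.value)

    # Option (a): one extra line makes x/y work under true division too.
    __truediv__ = __div__


print((Foo(1.0) / Foo(4.0)).value)  # 0.25
```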

/Paul

Wed, 14 Jan 2004 21:31:39 GMT

Quote:

> Consider an existing
>    class Foo:
>        def __div__(self, other):
>            ...

> With your proposal, this will break as soon as it's called from
> a module importing the future division.

And that's exactly right, because we don't know if the type is trying
to "look like" a float or an int, so we don't know whether to map
__div__ to __floordiv__ or __truediv__.

This is only a problem when the user of the class uses true or floor
division, and I think it's fair that the system shouldn't attempt to
guess.

Wed, 14 Jan 2004 23:42:28 GMT

Quote:

> Would be a lot more compelling were they reports *from* rather than

Agreed.  But what struck me was that when a non-programmer came to a
Python discussion list asking whether Python was suitable for his
project, several people warned him about the pitfalls of division (and
not much else).

I don't recall whether the experienced programmers explicitly said they
had fallen into the trap.  Considering how many of them jumped right to
the subject of division, it seems plausible that they had experienced
it themselves.  And something made enough of an impression on them to
make the issue the first one they mentioned.

I suppose it could be the computer equivalent of hypochondria -- people
hear about an obscure bug and it suddenly becomes The One True Bug even
though neither they, their friends, family, etc. actually encounter the
bug.  But the comments sounded like they came more from experience than
fear.

Quote:
> Are you compelled? Why?

Because it helped me understand the motivation for the PEP.  It helped
convince me that there was indeed a problem, and that it was worthy of
a fix.

My current work and leisure programming almost never uses floating
point.  I primarily use C, a statically typed language; when a routine
expects a float as input, it gets a float, or the program doesn't
compile.  And when I last did non-trivial numeric work, it was with a
version of BASIC that only supported floats; what looked like integers
were actually floats that just happened to have all zeroes after the
decimal point.  So, I've never really had a problem where integer
division happened when I was expecting float division.

The current division operator is perfectly understandable to me.  You
give it integer inputs and you get an integer output.  You feed it
float inputs and get a float output.  I know what I'm giving it, so I
know what it will produce.  It's really two different operators (float
division and int division) that happen to have the same name.  It works
the same way in C (though the compiler knows at compile time whether it
will be float or int division).  Given my background, the behavior was
what I expected, so I didn't think it was broken.

Those other threads reminded me that Python is approachable.  People
will use it to help solve a problem in their area of expertise.  Those
people may not be as aware of the distinction between ints and floats.
And since Python is so easy to use interactively, I can see someone
typing in an integer literal and it getting to a routine that assumed a
float.  Python would behave differently than a typical pocket
calculator (where everything is a float).

If I write a routine that expects floats, and I don't have control over
what calls that routine, I'll probably check the inputs or explicitly
convert them to floats.  Would a non-programmer do that?  Maybe
(probably?) not.  They may (probably?) do less testing.  Since they
probably wouldn't get a compile or runtime error, just incorrect
results, I think it would take longer to notice the error.

-Mark
