Builtin dict should be callable, since a dict defines a function
 Builtin dict should be callable, since a dict defines a function

Quote:

> Sent: Thursday, December 19, 2002 9:00 PM

> On Thu, 19 Dec 2002 18:17:03 -0800, Erik Max Francis


> > But why would this be helpful?  A dictionary implements a mappable
> > interface; a function implements a callable interface.  They're
> > different interfaces; why would dovetailing them into the
> > same interface be beneficial?

> You can pass a reference to a dict to something that expects
> a function, e.g., a sort. You could pass it as a memoized-result
> proxy function in some context and catch KeyError to take care of
> LRU caching update, etc.

> I'm sure many uses would turn up once the obstacle was removed, and
> people thought of functions as mappings and vice versa ;-)

You can already do this.

Let's say you need a function which works like a (callable)
dictionary for retrieving values. Instead of creating a new
function or subclassing dict, pass it a reference to:

    d.__getitem__

and voila, there's your reference to a function-behaving-like-
a-dictionary.
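
For instance, here is a minimal sketch of that idea (the dict and the
names are just illustrative): the bound __getitem__ method is a
one-argument callable, so it can be handed to anything that expects a
function of one argument.

    color = {'error': 'red', 'warning': 'yellow', 'ok': 'green'}
    lookup = color.__getitem__          # function-behaving-like-a-dictionary

    print(lookup('error'))                        # red
    print(list(map(lookup, ['ok', 'warning'])))   # ['green', 'yellow']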

-gustavo



Wed, 08 Jun 2005 02:59:48 GMT  
 Builtin dict should be callable, since a dict defines a function

Quote:


>> Sent: Thursday, December 19, 2002 9:00 PM

>> On Thu, 19 Dec 2002 18:17:03 -0800, Erik Max Francis


>> > But why would this be helpful?  A dictionary implements a mappable
>> > interface; a function implements a callable interface.  They're
>> > different interfaces; why would dovetailing them into the
>> > same interface be beneficial?

>> You can pass a reference to a dict to something that expects
>> a function, e.g., a sort. You could pass it as a memoized-result
>> proxy function in some context and catch KeyError to take care of
>> LRU caching update, etc.

>> I'm sure many uses would turn up once the obstacle was removed, and
>> people thought of functions as mappings and vice versa ;-)

>You can already do this.

>Let's say you need a function which works like a (callable)
>dictionary for retrieving values. Instead of creating a new
>function or subclassing dict, pass it a reference to:

>    d.__getitem__

>and voila, there's your reference to a function-behaving-like-
>a-dictionary.

True, but it is not quite the same thing. On the surface I was not
asking for more, so your answer (and the same point made by others)
is right on as far as it goes (I was a bit chagrined not to have thought
of this obvious thing in the first place). But I still have a nagging
dissatisfaction with the answer.

If I know that a dict will have a _default_ function-style interface
as if its class were defined with __call__ = __getitem__, then I can
make an anonymous collection of function and dict objects, e.g., a list

  fdmix = [f1,f2,d1,d2]

and depend on the callable assumption for all elements.

Now what power does this give me over the following list?

  fdmix2 = [f1,f2,d1.__getitem__, d2.__getitem__]

Well, this list does not carry the same information. I have been
forced to throw away the possibility of distinguishing and acting
differently based on type _if_ that is useful for another use of
the list (e.g., passing the collection to different places).
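
To make that loss concrete (a hypothetical sketch; f1, f2, d1, d2 here
are just stand-ins for whatever functions and dicts you happen to have):

    f1, f2 = abs, len
    d1, d2 = {1: 'one'}, {2: 'two'}

    fdmix2 = [f1, f2, d1.__getitem__, d2.__getitem__]
    # every element is callable now, but the dicts themselves are no
    # longer recognizable, so later code cannot branch on type:
    print([isinstance(x, dict) for x in fdmix2])   # [False, False, False, False]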

I admit that a subclass will do what I want, but there is a performance
hit for some reason (see below).

Note that speed could make a lot of difference if you were passing a function
that processed pixels for example:

 >>> import timefuns
 >>> def noop(k): pass
 ...
 >>> d={'xx':123}
 >>> class D(dict): __call__ = dict.__getitem__
 ...
 >>> dc = D(d)
 >>> df = d.__getitem__
 >>> dc.__name__ = 'D inst'
 >>> calls = [(noop,'xx'),(dc,'xx'),(df,'xx')]
 >>> timefuns.timefuns(100000, calls)
            timing oh:  0.000015  ratio
                 noop:  0.000009   1.00
               D inst:  0.000012   1.29
          __getitem__:  0.000004   0.48

Timing overhead is time measured by t1-t0 in
   t0=time.clock()
   t1=time.clock()
where otherwise there is a call to the function as a single line
in between, in the function-timing loop.

This overhead is subtracted out to get a better relative time between
the functions. The ratios are with respect to the first function in the list.

Note that d.__getitem__ is almost twice as fast as even just
calling a noop function with the same single parameter.
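
A roughly equivalent measurement can be set up with the standard timeit
module instead of the timefuns helper (a sketch only; it relies on the
globals= argument available in newer Pythons, and the absolute numbers
will of course differ by machine and version):

    import timeit

    d = {'xx': 123}
    def noop(k): pass
    class D(dict): __call__ = dict.__getitem__
    dc = D(d)
    df = d.__getitem__

    for label, stmt in [('noop', "noop('xx')"),
                        ('D inst', "dc('xx')"),
                        ('__getitem__', "df('xx')")]:
        print(label, timeit.timeit(stmt, number=100000, globals=globals()))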

So I think there is more than one argument in favor of this simple
backwards-compatible change ;-)

Regards,
Bengt Richter



Wed, 08 Jun 2005 23:46:53 GMT  
 Builtin dict should be callable, since a dict defines a function


Quote:
> If I know that a dict will have a _default_ function-style interface
> as if its class were defined with __call__ = __getitem__, then I can
> make an anonymous collection of function and dict objects, e.g., a list

>   fdmix = [f1,f2,d1,d2]

> and depend on the callable assumption for all elements.

> Now what power does this give me over the following list?

>   fdmix2 = [f1,f2,d1.__getitem__, d2.__getitem__]

> Well, this list does not carry the same information. I have been
> forced to throw away the possibility of distinguishing and acting
> differently based on type _if_ that is useful for another use of
> the list (e.g., passing the collection to different places).

> I admit that a subclass will do what I want, but there is a performance
> hit for some reason (see below).

Since I disagree that you made the proper comparison (see below), I
disagree with the conclusion.

Quote:

> Note that speed could make a lot of difference if you were passing a function
> that processed pixels for example:

For speed, pixel processing should be done in C.


Quote:
>  >>> import timefuns
>  >>> def noop(k): pass
>  ...
>  >>> d={'xx':123}
>  >>> class D(dict): __call__ = dict.__getitem__
>  ...
>  >>> dc = D(d)
>  >>> df = d.__getitem__
>  >>> dc.__name__ = 'D inst'
>  >>> calls = [(noop,'xx'),(dc,'xx'),(df,'xx')]
>  >>> timefuns.timefuns(100000, calls)
>             timing oh:  0.000015  ratio
>                  noop:  0.000009   1.00
>                D inst:  0.000012   1.29
>           __getitem__:  0.000004   0.48

> Timing overhead is time measured by t1-t0 in
>    t0=time.clock()
>    t1=time.clock()
> where otherwise there is a call to the function as a single line
> in between, in the function-timing loop.

> This overhead is subtracted out to get a better relative time between
> the functions. The ratios are with respect to the first function in the list.

> Note that d.__getitem__ is almost twice as fast as even just
> calling a noop function with the same single parameter.

> So I think there is more than one argument in favor of this simple
> backwards-compatible change ;-)

I think there is a mis-comparison here.  You are asking that dicts be
augmented to look like your class D, so that d['a'] == d('a').
However, your 'test' above compares d('a') to d.__getitem__('a'), not
d['a'], avoiding most overhead, and finds the latter faster.  So what?
If you got the augmentation, it would run at the slower speed, due to
the need to look up .__call__().  So this result does not support your
proposal.  I could even see it as an argument against.

What you have shown (thank you) is that the exposure of internal
methods that came with new-style classes lets us speed up dict lookups
by bypassing the overhead of interpreting the d[key] syntax, which
has to start with determining the type(d) (seq or dict?).  Replacing
d[key] with d.__getitem__(key) (where d.__getitem__ is looked up just
once) implicitly 'declares' the type of d within the loop by bypassing
repeated lookup of its type.
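
The idiom being described looks roughly like this (a sketch with
invented names; which variant actually wins is exactly what the
follow-up below goes on to measure):

    d = {}
    for k in range(1000):
        d[k] = k * k

    # d[k] goes through the subscription machinery (type dispatch) each pass
    total1 = 0
    for k in range(1000):
        total1 = total1 + d[k]

    # the bound method is looked up once, then called inside the loop
    get = d.__getitem__
    total2 = 0
    for k in range(1000):
        total2 = total2 + get(k)

    assert total1 == total2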

Terry J. Reedy



Thu, 09 Jun 2005 01:27:50 GMT  
 Builtin dict should be callable, since a dict defines a function

Quote:



>> If I know that a dict will have a _default_ function-style interface
>> as if its class were defined with __call__ = __getitem__, then I can
>> make an anonymous collection of function and dict objects, e.g., a list

>>   fdmix = [f1,f2,d1,d2]

>> and depend on the callable assumption for all elements.

>> Now what power does this give me over the following list?

>>   fdmix2 = [f1,f2,d1.__getitem__, d2.__getitem__]

>> Well, this list does not carry the same information. I have been
>> forced to throw away the possibility of distinguishing and acting
>> differently based on type _if_ that is useful for another use of
>> the list (e.g., passing the collection to different places).

>> I admit that a subclass will do what I want, but there is a performance
>> hit for some reason (see below).

>Since I disagree that you made the proper comparison (see below), I
>disagree with the conclusion.

>> Note that speed could make a lot of difference if you were passing a function
>> that processed pixels for example:

>For speed, pixel processing should be done in C.

I can do that, but if I am interactively experimenting with visual effects,
it can be faster to use Python than to set up an automated build
that would let me cycle experiments as fast in C. Plus Python is more
pleasant to code in ;-)


Quote:

>>  >>> import timefuns
>>  >>> def noop(k): pass
>>  ...
>>  >>> d={'xx':123}
>>  >>> class D(dict): __call__ = dict.__getitem__
>>  ...
>>  >>> dc = D(d)
>>  >>> df = d.__getitem__
>>  >>> dc.__name__ = 'D inst'
>>  >>> calls = [(noop,'xx'),(dc,'xx'),(df,'xx')]
>>  >>> timefuns.timefuns(100000, calls)
>>             timing oh:  0.000015  ratio
>>                  noop:  0.000009   1.00
>>                D inst:  0.000012   1.29
>>           __getitem__:  0.000004   0.48

>> Timing overhead is time measured by t1-t0 in
>>    t0=time.clock()
>>    t1=time.clock()
>> where otherwise there is a call to the function as a single line
>> in between, in the function-timing loop.

>> This overhead is subtracted out to get a better relative time between
>> the functions. The ratios are with respect to the first function in the list.

>> Note that d.__getitem__ is almost twice as fast as even just
>> calling a noop function with the same single parameter.

>> So I think there is more than one argument in favor of this simple
>> backwards-compatible change ;-)

>I think there is a mis-comparison here.  You are asking that dicts be
>augmented to look like your class D, so that d['a'] == d('a').

I'm not sure what you mean by 'augmented', but the point is to be
more efficient than the class D, not adopt its inefficiencies.

Quote:
>However, your 'test' above compares d('a') to d.__getitem__('a'), not
>d['a'], avoiding most overhead, and finds the latter faster.  So what?
>If you got the augmentation, it would run at the slower speed, due to

My idea for the change must be different from what you are thinking.

Quote:
>the need to look up .__call__().  So this result does not support your
>proposal.  I could even see it as an argument against.

The part of the timing code that does the calling is

        240 SET_LINENO              41
        243 LOAD_FAST                4 (clock)
        246 CALL_FUNCTION            0
        249 STORE_FAST              12 (t0)

        252 SET_LINENO              42
        255 LOAD_FAST                7 (f)
        258 LOAD_FAST                3 (args)
        261 CALL_FUNCTION_VAR        0
        264 STORE_FAST              13 (dummy)

        267 SET_LINENO              43
        270 LOAD_FAST                4 (clock)
        273 CALL_FUNCTION            0
        276 STORE_FAST              10 (t1)

where f is bound to whatever we are testing and args is the arg tuple; I also do
a dummy store. Are you saying the C interpretation of CALL_FUNCTION_VAR
would be slower to find __call__ than __getitem__ if neither were overridden
and both pointed to the same place in C? I would think it would be hard to measure
a difference.

Quote:
>What you have shown (thank you) is that the exposure of internal
>methods that came with new-style classes lets us speed up dict lookups
>by bypassing the overhead of interpreting the d[key] syntax, which

Whoa, the syntax isn't interpreted in the timing loop unless we are
compiling source every time. There seems to be more to it (see below).

Quote:
>has to start with determining the type(d) (seq or dict?).  Replacing
>d[key] with d.__getitem__(key) (where d.__getitem__ is looked up just
>once) implicitly 'declares' the type of d within the loop by bypassing
>repeated lookup of its type.

I think that's a wrong conclusion from what I did. I just showed that
d.__getitem__ as a dict method wrapper was faster than getting the same
by dynamically looking up __call__ on the subclass instance every time.

The fair comparison would have been d.__getitem__(k) vs dc.__call__(k),
not d.__getitem__(k) vs dc(k). But d[k] is still faster than either one
for the current (2.2.2 Windows) implementation, or so it looks to me.
I tried to test it:

 >>> f1 = lambda d={1:2}:d[1]
 >>> f2 = lambda d={1:2}.__getitem__:d(1)
 >>> import dis
 >>> dis.dis(f1)
           0 SET_LINENO               1
           3 LOAD_FAST                0 (d)
           6 LOAD_CONST               1 (1)
           9 BINARY_SUBSCR
          10 RETURN_VALUE
 >>> dis.dis(f2)
           0 SET_LINENO               1
           3 LOAD_FAST                0 (d)
           6 LOAD_CONST               1 (1)
           9 CALL_FUNCTION            1
          12 RETURN_VALUE
 >>> f1.func_defaults
 ({1: 2},)
 >>> f2.func_defaults
 (<method-wrapper object at 0x007DBB50>,)
 >>> ^Z

 [ 7:50] C:\pywk\clp>timefuns.py -L "lambda d=1:d" -L "lambda d={1:2}:d[1]" -L "lambda d={1:2}.__getitem__:d(1)" -n 100000
            timing oh:  0.000013  ratio
             <lambda>:  0.000006   1.00
             <lambda>:  0.000009   1.53
             <lambda>:  0.000012   1.96

For the 1.00 reference base, I added a lambda that has a default arg d and just returns it
instead of returning d[1] or d(1),

 >>> f = lambda d=1:d
 >>> import dis
 >>> dis.dis(f)
           0 SET_LINENO               1
           3 LOAD_FAST                0 (d)
           6 RETURN_VALUE

so subtracting that out should show the d[1] vs d(1) times without the lambda
call overhead. So d[1] to d(1) comes out at roughly 1:2 (i.e., (9-6):(12-6) in the raw times, or 0.53:0.96 in the ratio column).

I.e., it appears that when you are dealing with a locally bound dictionary, d[1] is faster than
using the __getitem__ method wrapper, even if you pre-bind that to a local variable (default arg
in both cases).
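
(For anyone who wants to re-check this with only the standard library
instead of timefuns, a minimal sketch along these lines should do;
numbers will vary by machine and version:)

    import timeit
    setup = "d = {1: 2}; get = d.__getitem__"
    print(timeit.timeit('d[1]', setup=setup, number=1000000))    # subscription
    print(timeit.timeit('get(1)', setup=setup, number=1000000))  # bound method call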

It seems that BINARY_SUBSCR gets very quickly to the C PyObject_GetItem subroutine, whereas
the other route is much more involved, judging from a cursory look at ceval.c and friends.

Regards,
Bengt Richter



Thu, 09 Jun 2005 23:56:32 GMT  
 