[Python-ideas] Adding __getter__ to compliment __iter__.

Ron Adam ron3200 at gmail.com
Thu Jul 18 08:12:38 CEST 2013



On 07/17/2013 09:27 PM, Steven D'Aprano wrote:
> On 18/07/13 06:51, Ron Adam wrote:
>
>> I played around with trying to find something that would work like the
>> example Nick put up and found out that the different python types are not
>> similar enough in how they do things to make a function that takes a
>> method or other operator work well.
>
> I don't understand this paragraph.

That's what the next paragraph was explaining.  But not very well, I guess.

> Functions that take
> methods/operators/other functions work perfectly well, they're called
> second-order functions. Decorators, factory functions, map, filter, and
> functools.reduce are all good examples of this.

Yes.  In the example I was referring to, it took an operator that was used 
to select a method to combine, append, or extend a group of other objects. 
While it looked simple, I think beginners would have a lot of trouble 
using it.


>> What happens is you either end up with widely varying results depending
>> on how the methods are implemented on each type, or an error because only
>> a few methods are very common on all types.  Mostly introspection methods.
>
> Yes. If you call a function f on arbitrary objects, some of those objects
> may be appropriate arguments to f, some may not. What's your point?

This proposal may improve some of those cases.

Correct, and we can do better when it comes to moving content into 
containers.  The iter protocol is pretty good for getting content out: it's 
a single way of extracting data that is the same for a lot of types.

What is missing is the inverse of that, a single uniform way of getting 
data into containers.
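To see how uniform the output side already is, here is a quick demonstration 
(plain Python as it exists today, nothing proposed here):

```python
# iter() provides one uniform way to read items out of
# very different container types.
containers = [[1, 2], (1, 2), {1, 2}, {1: "a", 2: "b"}]

# Every container yields its contents through the same protocol.
# (Dict iteration yields keys; sorting just normalizes the order.)
results = [sorted(iter(c)) for c in containers]
print(results)   # -> [[1, 2], [1, 2], [1, 2], [1, 2]]
```

There is no equally uniform counterpart for writing: lists use extend(), 
dicts use update(), strings use +, and so on.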


>> I believe this to be stronger underlying reason why functions like reduce
>> and map were removed. And it's also a good reason not to recommend
>> functions like sum() for things other than numbers.
>
> reduce and map have not been removed. map hasn't even been moved out of the
> builtins.

I should have checked; it was discussed at one time.  I tend to write my own 
functions of that sort, usually with a comprehension or generator expression.


>> To use functions similar to that, you really have to think about what
>> will happen in each case because the gears of the functions and methods
>> are not visible in the same way a comprehension or generator expression is.
>
> I don't understand this sentence.

When you look at a generator expression, you can see what it does.


>> It's too late to change how a lot of those methods work and I'm not sure
>> it will still work very well.
>>
>> One of the most consistent protocols python has is the iterator and
>> generator protocols.  The reason they work so well is that they need to
>> interface with for-loops and nearly all containers support that.
>>
>> examples...
>>
>>>>> a = [1,2,3]
>>>>> iter(a)
>> <list_iterator object at 0x7f3bc9306e90>
>
> What point are you trying to make? Builtins have custom iterator types.
> And? That's an implementation choice. One might make different choices:

And a good choice.  We should do more of that.  :-)


> py> type(iter(set([]))) is type(iter(frozenset([])))
> True
>
> Sets and frozen sets, despite being different types, share the same
> iterator type.

No problem, that's good too.


>> And is why chain is the recommended method of joining multiple
>> containers.  This really only addresses getting stuff OUT of containers.
>>
>> PEP 448's * unpacking in comprehensions helps with the problem of putting
>> things into containers.  But that isn't the PEP's main point.
>>
>
>
> Now we come to your actual proposal:

I probably could have left out most of the above and put the proposal 
first.  It's just how it came to me.


>> What I'm thinking of is the inverse operation of an iter.  Lets call it a
>> "getter".
>> You would get a getter the same way you get an iter.
>>
>>      g = getter(obj)
>>
>> But instead of __next__ or send() methods, it would have an iter_send(),
>> or isend() method.  The isend method would takes an iter object, or an
>> object that iter() can be called on.
>>
>> The getter would return either the object it came from, or a new object
>> depending on weather or not it was created from a mutable or immutable obj.
>>
>>
>> Mutable objects...
>>
>>      g = getter(A)     # A needs a __getter__ method.
>>      A = g.isend(B)
>
> What's B? Why is it needed as an argument, since g was fully specified by A
> only. That is:

The example is moving items from B to A.  The object B is fed into isend, 
then its contents are read into A.  So the getter is an input interface 
to A, just as an iter is an output interface for the object it is created 
from.  Any iter can work with any getter, so you can transfer the contents 
of any iterable to any other iterable.  (Dictionaries would still need 
(key, value) pairs.)

In the case of immutable objects, the getter creates a new object of the 
same type as the one it came from.  So in the above, A = g.isend(B), 'A' is 
a new object containing (A+B) if A was immutable.


> g.isend(B)
> g.isend(None)
> g.isend(42)

> etc. should all return the same A, so what is the purpose of passing the
> argument?

Only g.isend(B) would work here, if B can be iterated.  The others would 
raise an exception.  The getter, from A, reads the contents of B into A's 
storage space.

In the case of ordered objects, though, it always adds to the end, so you 
might need the reversed() function to add to the beginning.  There is an 
option of telling the getter where to start when it's created, possibly by 
passing it a slice object.  Unordered types could just ignore it.


>>      A += B            # extend
>
> Since we don't know what A is, we cannot know in advance that it has an
> __iadd__ method that is the same as extend.

It doesn't need to know... and it doesn't need an __iadd__; it needs a 
__getter__.  I was just trying to show the equivalent operation.  The 
__getter__ always takes an iterator, or an object that can be iterated.

This is it.

      """A getter is an objects iterator input interface."""


> I don't really understand why I would want to do this:
>
> start with an object A
> call getter(A) to create a "getter" object g
> call g.isend() to get A back again
> call some method on A
>
> when I could just do this:
>
> start with an object A
> call some method on A

Sorry, I wasn't clear.  No need to call a method on A.

A getter could insert a lot of data into an object very fast.  It works 
like the extend method on lists, but could work on many other types and 
even transfer data from different types.  It creates a uniform way to move 
data into objects, just like there is a uniform way to get data out of objects.


> Nor do I understand what this has to do with iterators and generators, or
> why the method is called "isend" (iter_send). As far as I can tell, the
> only similarity between your getter and the built-in iter is that they are
> both functions that take a single argument.

They are both generators.  And iter uses next(g), while a getter uses 
g.send(seq).  It doesn't need to be called isend(), but in this case I 
think it's a helpful hint that it requires a sequence or an iterator of 
some type.
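To make that send()-based mechanic concrete, here is a small runnable sketch 
(the name list_getter and the driving code are mine, not part of the 
proposal): a primed generator receives data through send(), and its return 
value comes back out through StopIteration:

```python
def list_getter(target):
    # Hypothetical sketch of a list "getter": a primed generator
    # whose send() receives an iterable and extends the target.
    def g():
        seq = yield          # receives the iterable via send()
        target.extend(seq)
        return target        # carried out via StopIteration.value
    gen = g()
    next(gen)                # prime it, so send() will work
    return gen

a = [1, 2]
g = list_getter(a)
try:
    g.send([3, 4])
except StopIteration as stop:
    result = stop.value

print(result)            # -> [1, 2, 3, 4]
print(result is a)       # -> True (the mutable object itself came back)
```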


>> Mutable objects...
>>
>>      g = getter(A)
>>      C = g.isend(B)
>>
>>      C = A + B         # join
>>
>> The point, is to have something that works on many types and is as
>> consistent in how it's defined as the iter protocol.  Having a strict and
>> clear definition is a very important!
>
> This last sentence is very true. Would you like to give us a strict and
> clear definition of your getter proposal?
>
>
>> The internal implementation of a getter could do a direct copy to make it
>> faster, like slicing does, but that would be a private implementation
>> detail.
>
> A direct copy of what? A? Then why not spell it like this?

   def extend_items(A, B):
       """Add the contents of the sub-items in B to A."""
       # A must be mutable for this function to work.
       g = getter(A)           # g is an input interface to "A"
       for sub_list in B:
           g.isend(sub_list)   # inserts the contents of sub_list into A


   A = [list of very many items.]
   B = [bunch of large sub lists to add to A]
   extend_items(A, B)


You could write that using list.extend() with the same result.


But consider the same example with dictionaries.

     A = {dictionary of words with frequency counts for an index}
     B = [{dictionaries of word counts for each chapter}, {...}, {...}, ...]

B is a list of dictionaries to be added to A.

     extend_items(A, B)

In this case, it would work like the dictionary's update() method.  But we 
didn't need to change the function to get that.  The dictionary's getter 
does that part for us.


So it's exactly the same interface and the same function works on both of 
them without any special casing or testing of types.

This is the main part I'm trying to communicate.
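That behaviour can be prototyped today.  The getter() below is a hand-rolled 
stand-in using explicit type dispatch (the dispatch is illustrative only; a 
real __getter__ protocol would let each type supply its own):

```python
def getter(obj):
    # Rough stand-in for the proposed getter(): choose the
    # appropriate "fill" operation for a few mutable types.
    if isinstance(obj, list):
        fill = obj.extend
    elif isinstance(obj, (dict, set)):
        fill = obj.update
    else:
        raise TypeError("no getter for %r" % type(obj))

    def g():
        while True:
            seq = yield obj   # send() hands back the filled object
            fill(seq)
    gen = g()
    next(gen)                 # prime it, so send() will work
    return gen

def extend_items(target, batches):
    """Add the contents of each batch in `batches` to `target`."""
    g = getter(target)
    for batch in batches:
        g.send(batch)
    return target

extend_items([1], [[2, 3], [4]])              # -> [1, 2, 3, 4]
extend_items({"a": 1}, [{"b": 2}, {"c": 3}])  # -> {'a': 1, 'b': 2, 'c': 3}
extend_items({1}, [{2, 3}])                   # -> {1, 2, 3}
```

The same extend_items works on lists, dicts, and sets with no special 
casing in the function itself; only the getter differs per type.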


> A = copy.copy(A)

> instead of
>
> A = getter(A).isend(23)

23 isn't a container.  It won't work if the object can't be iterated.  So 
you would need to write that as...

   A = getter(A).isend([23])   # append 23 to A using a getter,
                               # or extend([23])

That is why the method on a getter is named 'isend' rather than just 
'send'.  It's really the same thing, but the 'isend' is a reminder that it 
needs an iterator.


A __getter__ method on a list object might be...

    def __getter__(self):
        def g():
            seq = yield
            self.extend(seq)
            return self     # returned to the caller via StopIteration
        gtr = g()
        next(gtr)           # start it, so the send method will work
        return gtr


And on a dictionary:

     def __getter__(self):
         def g():
             seq = yield
             self.update(seq)
             return self
         getter = g()
         next(getter)
         return getter


on a string:    (bytes and tuples are very much like this.)

     def __getter__(self):
         def g():
             seq = yield
             return self + seq   # a new object; self is immutable
         getter = g()
         next(getter)
         return getter

etc...  It's pretty simple, but builtin versions of these would not need to 
use the 'extend', 'update', or '__add__' methods; they could do the 
equivalent directly, bypassing the method calls.


Then what you have is an input protocol that complements the iter output 
protocol.


>> They don't replace generator expressions or comprehensions.  Those
>> generally will do something with each item.
>>
>> Functions like extend() and concat() could be implemented with
>> *getter-iters*, and work with a larger variety of objects with much less
>> work and special handling.
>>
>>       def extend(A, B):
>>          return getter(A).isend(B)
>>
>>       def concat(A, B):
>>          """ Extend A with multiple containers from B. """
>>          g = getter(A)
>>          if g.isend() is not A:
>>              raise TypeError("can't concat immutable arg, use merge()")
>>          for x in B:
>>             g.isend(x)
>>          return A
>
> How is that better than this?
>
>      def concat(A, B):
>          """ Extend A with multiple containers from B. """
>          for x in B:
>              A.extend(x)
>          return A

Try it with a dictionary, a set, tuples, or other objects that don't have 
an extend method.


> (But note, that's not the definition of concat() I would expect. I would
> expect concat to return a new object, not modify A in place.)

concat is just an example of how a getter could be used.  Writing a 
general-purpose concat function that works with different container types 
without getters isn't as easy.


>> Expecting many holes to be punched in this idea ...
>>     But hope not too many.  ;-)
>
>
> I'm afraid that to me the idea seems too incoherent to punch holes in it.

Yes, and that is why I apologized for the less-than-concise writing.  :/

Cheers,
    Ron

