Elementwise -//- first release -//- Element-wise (vectorized) function, method and operator support for iterables in python.

Nathan Rice nathan.alexander.rice at gmail.com
Wed Dec 21 13:09:01 EST 2011


>> On Tue, Dec 20, 2011 at 8:37 PM, Joshua Landau
>> <joshua.landau.ws at gmail.com> wrote:
>> > On 21 December 2011 00:24, Nathan Rice <nathan.alexander.rice at gmail.com>
>> > wrote:
>> >> efoo_res = ((efoo2.capitalize() + " little indian").split(" ")
>> >>             .apply(reversed) * 2).apply("_".join)
>> >> # note that you could do reversed(...) instead, I just like to read
>> >> # left to right
>> >> efoo_res.parent.parent.parent
>> >> # same as ((efoo2.capitalize() + " little indian").split(" ") in case
>> >> # you need to debug something and want to look at intermediate values
>> >
>> > How is any of this better than the elementwise operators ("~")? People
>> > should be able to expect len(x) to always return a number or raise an
>> > error.
>> > I know it's not part of the spec, but a lot breaks without these
>> > guarantees.
>> > When "str(x)" isn't a string, all the formatting code breaks*. And when
>> > the
>> > other suggestion ("~str(x)" or "str~(x)" or something similar) has all
>> > the
>> > benefits and none of the drawbacks, why should I use this?
>>
>> len() will always return a number or raise an error, just like the
>> type functions (bool/int/etc) return that type or raise an error.  The
>> interpreter guarantees that for you.
>
>
> The point wasn't that either way was better, but that with this
> implementation you get neither choice ("len(x)" vs "len~(x)") or
> reliability.

If len didn't have the hard coded behavior, you would have the choice
of len(x) or len(x.each).  Since a lot of code relies on len()
returning an int, I think it is fine to accept that you have to use
x.each.apply(len).  I agree that this is a case where some kind of
elementwise designation in the syntax would be better; if there was a
∀ character on the keyboard and more people knew what it meant I would
have fewer reservations.
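
To illustrate the hard-coded behavior: the interpreter checks whatever
__len__ returns and rejects anything that is not an integer, so a proxy
cannot make len(proxy) hand back a sequence of lengths.  A minimal
sketch, using a hypothetical LengthsProxy class that is not part of
elementwise:

```python
class LengthsProxy:
    """Hypothetical proxy that tries to make len() elementwise."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        # An elementwise len() would need to return a list here,
        # but the interpreter rejects any non-integer result.
        return [len(item) for item in self.items]


words = LengthsProxy(["one", "three"])

try:
    len(words)
    outcome = "len() accepted a non-integer"
except TypeError:
    outcome = "TypeError"

# The explicit spelling works because map() calls len on each element.
lengths = list(map(len, words.items))
```

This is why the explicit x.each.apply(len) spelling is needed for len,
while ordinary methods and operators can be forwarded by the proxy.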

>
> The reliability point works like this: You want to elementwise a few
> functions, that before you were doing on a single item.
> BEFORE:
> item = foreignfunc1(item)
> item = foreignfunc2(item)
> item = foreignfunc3(item)
>
> NOW (your method):
> item = ElementwiseProxy(item)
> item = foreignfunc1(item)
> item = foreignfunc2(item)
> item = foreignfunc3(item)
> item = list(item)

well, I would say it more like:

item.each.apply(foreignfunc1).apply(foreignfunc2).apply(foreignfunc3)
# I like to read left to right, what can I say?

and I wouldn't list() it right away, since it is nice and lazy.
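
A rough sketch of what "nice and lazy" means for a chain like the one
above.  The Each class and the bodies of the three foreignfunc
functions are made up for illustration; this is not the actual
ElementwiseProxy code, just the generator-chaining idea:

```python
class Each:
    """Hypothetical stand-in for an elementwise proxy; not the real class."""

    def __init__(self, iterable):
        self.iterable = iterable

    def apply(self, func):
        # Wrap the source in a generator expression; func is not called yet.
        return Each(func(item) for item in self.iterable)

    def __iter__(self):
        return iter(self.iterable)


calls = []  # records which function ran, in order

def foreignfunc1(x):
    calls.append("f1")
    return x + 1

def foreignfunc2(x):
    calls.append("f2")
    return x * 2

def foreignfunc3(x):
    calls.append("f3")
    return x - 3

chain = Each([1, 2, 3]).apply(foreignfunc1).apply(foreignfunc2).apply(foreignfunc3)
calls_before = list(calls)  # still empty: building the chain runs nothing
result = list(chain)        # iterating drives all three functions
```

Nothing is computed until the final list() call, and each element
passes through all three functions before the next element is touched.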

> You might think your method works. But what if foreignfunc is "str"? And you
> can't blacklist functions. You can't say everything works bar A, B and C.
> What if you get: "lambda x:str(x)"? You can't blacklist that. That makes the
> ElementwiseProxy version buggy and prone to unstable operation. If it's
> consistent, fine. But it's not.

I don't need to blacklist anything.  Everything that has funny
behavior like str goes through special methods on the class (that I
know of), and isn't hooked through __getattribute__, so I just handle
it somewhat normally.
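
The underlying language rule is that implicit special method lookups
(the ones done by str(), len(), operators and so on) go to the type
and skip __getattribute__ on the instance.  A small sketch with a
hypothetical Recorder class that logs every attribute lookup:

```python
class Recorder:
    """Hypothetical class that logs every attribute lookup on instances."""

    def __init__(self):
        self.seen = []

    def __getattribute__(self, name):
        # Record explicit lookups such as obj.upper or obj.__str__.
        object.__getattribute__(self, "seen").append(name)
        return object.__getattribute__(self, name)

    def __str__(self):
        return "recorder"


r = Recorder()

text = str(r)  # implicit lookup: uses type(r).__str__ directly
seen_after_str = list(object.__getattribute__(r, "seen"))

r.__str__()    # explicit lookup: goes through __getattribute__
seen_after_explicit = list(object.__getattribute__(r, "seen"))
```

So a function like "lambda x: str(x)" ends up in the same __str__
special method as a bare str(x), and there is nothing to blacklist.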

>> 1. Because everything is handled in terms of generator chains, all
>> operations on an ElementwiseProxy are evaluated lazily.  With
>> element-wise operator overloading you would need to perform each
>> operation immediately.
>
>
> I agree this can be a preferred advantage. But you still have to choose one.
> "~" could be lazy, or it could be eager. But in both implementations you
> have to choose. That said, you have map and imap, and so you could have
> ElementwiseProxy and eElementwiseProxy (eager), and you can have "~" and
> "i~". Remember that the syntax I'm using is just for PEP consistency. Some
> more creative people can find a syntax that works.

From my perspective the strength of operators is that they are
intuitive, since we use them constantly in other areas; I can make a
good guess about what X * Y or X + Y means in various contexts.  When
you introduce operators that don't have any well ingrained, standard
meaning, you just make the syntax cryptic.

>>
>> 2. As a result of #1, you can "undo" operations you perform on an
>> ElementwiseProxy with the parent property.
>
>
> Use case? If this is actually a wanted feature, "parent" could be made a
> general property of iterators.
> (x for x in foo).parent == foo
> I think that's a separate proposal that isn't intrinsic to this idea.

One quick use case:  I want to debug something big, slow and nasty.
I set breakpoints with conditions where it seems like the issue lies
from the stack trace.  Unfortunately I missed the root cause, and the
variables that would help me debug it have been garbage collected.  I
can go through the entire process again, try to set better breakpoints
and cross my fingers, or iterate over some subset of parent operations
right there.  I think I'll take the latter.
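
A sketch of the idea, assuming each lazy step keeps a reference to the
step it was built from.  The Step class is hypothetical and simplified;
it re-runs the parent operations on demand instead of caching them, and
the real parent property may work differently:

```python
class Step:
    """Hypothetical lazy step that remembers the step it was derived from."""

    def __init__(self, source, parent=None):
        self.source = source  # a re-iterable, or a zero-argument factory
        self.parent = parent

    def apply(self, func):
        # Store a factory so every iteration re-runs from this step.
        return Step(lambda: (func(item) for item in self), parent=self)

    def __iter__(self):
        source = self.source
        return iter(source() if callable(source) else source)


words = Step(["ten little indians", "nine little indians"])
result = words.apply(str.split).apply(len)

final = list(result)                    # end of the chain
intermediate = list(result.parent)      # one operation back
original = list(result.parent.parent)   # the starting data
```

From a breakpoint that only has "result" in scope, walking .parent
gives back each intermediate stage without restarting the whole run.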

>> 3. This still works if the person who created the class you're working
>> with doesn't add support for element-wise operators.  Sure, you could
>> monkey patch their code, but that can lead to other problems down the
>> line.

As I understood it, the elementwise operators in the PEP weren't
language magic, but new hooks to special methods.  If it is language
magic, that will probably have very far reaching implications in the
code.

>>
>> 4. There isn't an obvious/intuitive character for element-wise
>> versions of operators, and fewer symbols is better than more IMHO
>> (see: Perl).  Also, if you use the ElementwiseProxyMixin, you can
>> sprinkle element-wise stuff in neatly just by using "variable.each"
>> where you would use "~" in your examples.
>
>
> Obvious/intuitive is meaningless for something so potentially common. "**"
> is far more obscure ("*" is obscure enough). Why is a double star the
> exponent character? Because it works. "~" would be easy to learn and
> consistent, once a standard was agreed. And again, I'm only using that
> syntax because it's the one in the PEP. We're not going to get to Perl, I
> hope, with this addition.

I agree ** is pretty bad (xor should not have gotten first dibs on ^).
 The only saving grace for this scheme is that it is to some degree a
convention in programming.

> Explicit > Implicit. If someone passes you a "variable.each", things go
> haywire (see: first section of reply). "~" is explicit. It doesn't have
> these problems. I also think explicit looks nicer. 'Cause it is, y'know :P

Sure, I can't prevent people from doing bad things though.  People
will always find ways to shoot themselves (and potentially others
around them) in the foot.

> I'll hammer my point in again.
>
> "foo~(x)" is elementwise.
> "typeofx(foo(ElementwiseProxy(x)))" may be elementwise, depending on whether
> foo is a special case, or uses a special case, or is an obscure bug if it
> uses both a special case and a non special case.
>
> "x~.foo" is an elementwise "getattr" on a list of things.
> "ElementwiseProxy(x).foo" is elementwise or a *ramble ramble* *exceptions*
> *special cases*.

I agree that having elementwise functionality at a deeper level in
Python would avoid some of the inconsistencies (of which there are
really not very many!)

Nathan


