[Cython] array.array member renaming

Stefan Behnel stefan_ml at behnel.de
Tue Jun 4 14:27:15 CEST 2013


Nikita Nemkin, 04.06.2013 12:17:
> On Tue, 04 Jun 2013 14:47:47 +0600, Stefan Behnel wrote:
>> Nikita Nemkin, 04.06.2013 10:29:
>>> I just wanted to say that this
>>> https://github.com/cython/cython/commit/a3ace265e68ad97c24ce2b52d99d45b60b26eda2#L1L73
>>>
>>> renaming seems totally unnecessary as it makes any array code
>>> verbose and ugly. I often have to create extra local variables
>>> just to avoid endless something.data.as_ints repetition.
>>
>> Are one-shot operations on arrays really so common for you that the
>> explicit "unpacking" step matters for your code?
> 
> I use array in most places where you would normally see bare pointer and
> malloc/PyMem_Malloc. Automatic memory management FTW.
> 
> Many people would do the same if they knew about arrays
> and the special support for them that Cython provides.
> (Personally, I had discovered it by browsing standard include .pxd files)
> 
> Array class members also have "self." prepended which does not help brevity.
> So, yeah, it matters. Sure I can live with overly verbose names,
> but there is certainly room for improvement.
> 
> ATM I have 96 cases of ".data.as_XXX" in my codebase and that's after
> folding some of them using local variables
> (like "cdef int* segments = self.segments.data.as_ints").

And the local assignment also resolves the pointer indirection for "self"
here, which the C compiler can't really reason about otherwise.
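
To illustrate the point in plain C, here is a minimal sketch. The struct
and function names (IntArray, Owner, fit_indirect, fit_local) are
hypothetical stand-ins made up for this example; they are not what Cython
actually generates. The first function goes through the owner pointer on
every access, so the compiler may reload the data pointer each time unless
it can prove that nothing in the loop changes it. The second loads the
pointer once into a local, which is what an assignment like
"cdef int* segments = self.segments.data.as_ints" expresses. The bounds
check on n is an addition for safety and is not part of the loop quoted
further down.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the array object: only the data pointer. */
typedef struct {
    int *as_ints;            /* plays the role of ".data.as_ints" */
} IntArray;

/* Hypothetical stand-in for an extension type holding an array member. */
typedef struct {
    IntArray *segments;      /* plays the role of "self.segments" */
} Owner;

/* Every access goes through self->segments->as_ints. */
static int fit_indirect(const Owner *self, int start, int n, int max_width)
{
    int width = 0;
    assert(self != NULL && self->segments != NULL);
    while (start < n && width + self->segments->as_ints[start] < max_width) {
        width += self->segments->as_ints[start];
        start += 1;
    }
    return start;
}

/* The data pointer is loaded once into a local before the loop. */
static int fit_local(const Owner *self, int start, int n, int max_width)
{
    const int *segments;
    int width = 0;
    assert(self != NULL && self->segments != NULL);
    segments = self->segments->as_ints;
    while (start < n && width + segments[start] < max_width) {
        width += segments[start];
        start += 1;
    }
    return start;
}
```

Both functions return the same index for the same input. The difference is
only in how much the compiler has to prove before it can keep the pointer
in a register.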


>>> What was the reason for renaming? It would be really nice to
>>> reintroduce old names (_i, _d etc).
>>
>> IMHO, the explicit names read better and make it clear what happens.
> 
> Indexing makes it clear enough that, well, indexing happens.
> Direct array access is sort of magic anyway.
> Here is an example of unnecessary verbosity:
> 
>     while width + piDx.data.as_ints[start] < maxWidth:
>         width += piDx.data.as_ints[start]
>         start += 1

Agreed that it's more verbose than necessary, but my gut feeling is still:
if it's worth shortening, it's worth assigning. If it's not worth assigning,
it's likely not worth shortening either.

IIRC, the reason why there's a redundant ".data." bit in there is a)
because of C declaration issues and b) because we wanted to keep the
namespace impact on the Python array object interface as low as possible.


>> Also, I think the original idea was that most people shouldn't access the
>> field directly and use memory views and the buffer interface instead, at
>> least for user provided data. It might be a little different for arrays
>> that are only used internally.
> 
> When using the buffer interface, it really doesn't matter whether the user
> has passed an array, an ndarray or whatever. The buffer interface covers
> everything, so array-specific declarations are irrelevant.
> 
> But when I know that the variable is an array, the buffer declaration,
> acquisition and release code is dead weight (especially for class
> members, which can't have a buffer declaration attached to themselves,
> necessitating an extra local variable to declare a fast access view).

That's what I meant by "only used internally".

So, I do see your problem, but it's not obvious to me that it's worth doing
something about it. Especially not something as broad as duplicating the
direct access interface.

Stefan


