Negative array indices and slice()

Andrew Robinson andrew3 at r3dsolutions.com
Thu Nov 1 07:32:54 EDT 2012


On 11/01/2012 07:12 AM, Ethan Furman wrote:
> Andrew Robinson wrote:
>>   On 10/31/2012 02:20 PM, Ian Kelly wrote:
>>> On Wed, Oct 31, 2012 at 7:42 AM, Andrew Robinson  wrote:
>>>> Then; I'd note:  The non-goofy purpose of slice is to hold three
>>>> data values;  They are either numbers or None.  These *normally*
>>>> encountered values can't create a memory loop.
>>>> So, FOR AS LONG, as the object representing slice does not contain
>>>> an explicit GC pair; <snip>

A little review...
The premise of my statement here is that Tim Peters closed the bug report:

http://bugs.python.org/issue1501180
The *reason* given was that using GC was *goofy* on account of what slice() was intended to hold: None and a number.  So my first attempt at a bug fix was simply to take Tim Peters at his word... since we all assume he *isn't* a "Bloody Idiot".  Hey, isn't that a swear-word somewhere in the world?  It's not where I live, but I seem to recall... oh, well... whatever.

>> I missed something... (but then that's why we're still talking about 
>> it...)
>>
>> Reading the PEP, it notes that *only* integers (or longs) are 
>> permitted in slice syntax.
>
> Keep in mind that PEPs represent Python /at that time/ -- as Python
> moves forward, PEPs are not updated (this has gotten me a couple times).
And, since I am reading them in the order written (but in 3.0) trying to 
get the whole of Python into my mind on the journey to prep for porting 
it into a tiny chip -- I'm frustrated by not being finished yet...

> Furman, actually.  :)
:-!

>
> And my values do *not* convert to indices (at least, not automatically).
Ahhh.... (Rhetorical & sarcastic) I was wondering how you added an __index__() 
method to strings, never accessed it, and still followed the special PEP 
we are talking about, when you gave that example using unwrapped strings.

--------------------------

Hmmmm.... was that PEP the active state of Python, when Tim rejected the 
bug report?  eg: have we "moved on" into a place where the bug report 
ought to be re-issued since that PEP is now *effectively* passe, and Tim 
could thus be vindicated from being a "b... Idiot?"  (Or has he been 
given the 1st place, Python Twit award -- and his *man* the bug list 
been stripped?)

> In other words, the slice contains the strings, and my code calculates
> the offsets -- Python doesn't do it for me.
>
> ~Ethan~

I see, so the problem is that the PEP wants you to implement __index__(), 
but that is going to force you to subclass string, and add a wrapper 
interface every time you need to index something.
eg: doing something like ---   mydbclass[ MyString( 'fromColumn' ) : 
MyString( 'toColumn' ) ] and the code becomes a candy machine interface 
issue (Chapter 5, Writing Solid Code).
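
For comparison, here is a minimal sketch of the approach Ethan describes, where the slice carries plain strings and __getitem__ works out the offsets itself, so no MyString wrapper is needed. The Table class, its column names, and the _offset helper are all hypothetical, made up purely for illustration.

```python
class Table:
    """Hypothetical stand-in for 'mydbclass'; the column names are invented."""

    def __init__(self, columns, row):
        self.columns = list(columns)
        self.row = list(row)

    def _offset(self, key, default):
        # None means "open end"; strings are looked up by column name;
        # anything else is assumed to already be an integer offset.
        if key is None:
            return default
        if isinstance(key, str):
            return self.columns.index(key)
        return key

    def __getitem__(self, key):
        if isinstance(key, slice):
            # The slice object only carries the values; this code converts them.
            start = self._offset(key.start, 0)
            stop = self._offset(key.stop, len(self.row))
            return self.row[start:stop:key.step]
        return self.row[self._offset(key, 0)]


table = Table(["id", "fromColumn", "midColumn", "toColumn"], [7, "a", "b", "c"])
print(table["fromColumn":"toColumn"])   # ['a', 'b']
print(table["midColumn"])               # b
```

The trade-off is that the name-to-offset logic lives in every container that wants it, instead of in a wrapper around every key.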

My favorite line there uses no swearing .... "If they had just taken an 
extra *30* seconds thinking about their design, they could have saved 
me, and I'm sure countless others, from getting something they didn't 
want."   I laugh, if they didn't get it already -- an extra *30* seconds 
is WAAAAY too optimistic.  Try minutes at least, with a policeman glaring 
over their shoulder.

But anyhow --- The problem lies in *when* the conversion to an integer 
is to take place, not so much if it is going to happen.  Your indexes, 
no matter how disguised, eventually will become numbers; and you have a 
way that minimizes coding cruft (The very reason I started the thread, 
actually... subclassing trivially to fix candy machine interfaces leads 
to perpetual code increases -- In cPython source-code, "realloc" 
wrappers and "malloc" wrappers are found .... I've seen these wrappers 
*re*-invented in nearly every C program I've ever looked at! Talk about 
MAN-hours, wasted space, and cruft.)
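
The point about *when* the conversion happens can be checked with a small sketch. It assumes a hypothetical ColumnRef class that implements __index__() (the PEP 357 hook) and records each call; the class name and its call log are illustrative only.

```python
import operator


class ColumnRef:
    """Hypothetical index-like object; the name and call log are illustrative."""

    calls = []  # records every position that was converted to an integer

    def __init__(self, position):
        self.position = position

    def __index__(self):
        ColumnRef.calls.append(self.position)
        return self.position


s = slice(ColumnRef(1), ColumnRef(3))
assert ColumnRef.calls == []      # building the slice converts nothing

data = ["w", "x", "y", "z"]
result = data[s]                  # the list asks for integers only at this point
print(result)                     # ['x', 'y']
print(sorted(ColumnRef.calls))

# operator.index() performs the same conversion explicitly.
print(operator.index(ColumnRef(2)))   # 2
```

So the slice object itself stays a passive container, and the consumer of the slice decides when (and whether) to turn its contents into numbers.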

So; is this a reasonable summary of salient features (status quo) ?

  * Enforcing strict numerical indexes (in the slice [::] operator)
    causes much psychological angst when attempting to write clear code
    without lots of wrapper cruft.
  * PEP 357 merely added cruft with __index__(), but really solved nothing. 
    Everything __index__() does could be implemented in __getitem__ and
    usually is.
  * The slice() attributes (start, stop, step) are merely containers for
    *whatever* was passed to [::]
  * slice() is
  * slice is also a full blown object, which implements a trivial method
    to dump the contents of itself to a tuple.
  * Presently slice() allows memory leaks through reference loops that
    the GC does not collect.
  * slice(), even though an object with a constructor, does no error
    checking to deny construction of memory leaks.
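
A short sketch of the points above, assuming nothing beyond the standard library: slice() stores whatever it is given, indices() dumps the normalized contents to a tuple, and a slice can take part in a reference loop. The Node class is hypothetical and exists only so a weak reference can watch the loop. Whether the loop is reclaimed depends on whether the running interpreter's slice type takes part in GC, so the script reports the outcome instead of assuming one.

```python
import gc
import weakref

# slice() stores arbitrary objects without checking them.
s = slice("a", 2.5, None)
print(s.start, s.stop, s.step)          # a 2.5 None

# indices() returns the contents as a (start, stop, step) tuple,
# normalized for a sequence of the given length.
print(slice(1, None, 2).indices(10))    # (1, 10, 2)


class Node:
    """Hypothetical object used only so a weak reference can watch the loop."""


def loop_is_collected():
    node = Node()
    node.window = slice(node, None, None)   # node -> slice -> node
    watcher = weakref.ref(node)
    del node                                # only the loop keeps it alive now
    gc.collect()
    return watcher() is None


print("loop through slice collected:", loop_is_collected())
```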

If people would take 30 seconds to think about this.... the more details 
added -- the more comprehensive can be my understanding -- and perhaps a 
consensus reached about the problem.
Here is a list of relevant options, without respect to feasibility.

  * Don't bother to fix the bug; allow Python to crash with a subtle bug
    that often takes weeks to track down by the very small minority doing
    strange things (Equivalent to the "monkey patch" syndrome of
    D'Aprano; BTW: The longer the bug is left unfixed, the more people
    will invent "uses" for it )
  * Convert the specially coded slice() object into a normal Python
    object (essentially adds the GC);  This can be a "named" tuple, or
    an immutable object with slots..., or just adding GC to the existing
    object.
  * Work through the difficult problem of guessing all the different
    ways people will want to use slice() to represent indexes; and then
    raise exceptions IN THE SLICE CONSTRUCTOR when a usage is tried
    outside these ways. (caveat, if a reasonable idea for loops is found
    -- we *must* implement GC.)
  * Improve the GC so that memory loops aren't a problem in the first place.
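
To make the third option concrete, here is a rough sketch of a slice-like object that raises in its constructor. StrictSlice and as_slice() are hypothetical names, not part of Python, and the rule it enforces (None or anything operator.index() accepts) is only one possible guess at the permitted usages. Note that this particular rule would reject the plain-string slices Ethan uses, which is exactly the difficulty with guessing.

```python
import operator


class StrictSlice:
    """Hypothetical immutable slice replacement that validates on construction."""

    __slots__ = ("start", "stop", "step")

    def __init__(self, start=None, stop=None, step=None):
        for name, value in (("start", start), ("stop", stop), ("step", step)):
            if value is not None:
                try:
                    # Accept ints and anything implementing __index__().
                    value = operator.index(value)
                except TypeError:
                    raise TypeError(
                        f"{name} must be None or an integer, "
                        f"got {type(value).__name__}"
                    ) from None
            object.__setattr__(self, name, value)

    def __setattr__(self, name, value):
        raise AttributeError("StrictSlice is immutable")

    def as_slice(self):
        # Only None and plain integers are stored, so no loop can be built.
        return slice(self.start, self.stop, self.step)


window = StrictSlice(1, 3)
print([10, 20, 30, 40][window.as_slice()])   # [20, 30]

try:
    StrictSlice("fromColumn", "toColumn")
except TypeError as exc:
    print("rejected:", exc)
```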

And here's a list of qualms:

  * Adding bytes of memory for GC isn't worth it...
  * Removing bytes of GC from other objects when loops would be rare
    isn't worth it....
  * Doing things which violate the present *theoretical* API will cause
    *subtle* bugs...
  * Fixing something in a way that breaks a general paradigm (no GC)
    enlarges code, and fills generalizations about the language with
    complex exceptions that are hard to code for.

Did I miss anything important???

So finally, here's a list of missing information, and I'd *REALLY* 
appreciate someone helping me figure out how to profile memory usage in 
Python with an example, or link, for solving any of the following.

  * As a fraction of the used memory (at fixed points in time) of a
    typical Python program:
      o How much is used by GC?
      o How much is used by slice objects?
      o How much will be used by additional GC if added to slice objects
        (can be calculated).
      o How much is used by generic tuples?
  * As a fraction of tuples used for generic purposes in Python
      o How many contain only data elements that are basic Python types
        without ability to create loops; Strings, floats, self contained
        objects.  (I'm thinking a flag might be useful to decide when GC
        may be omitted in these cases).
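
As a possible starting point for these measurements, here is a minimal sketch using only the standard library. The summarize() helper is hypothetical. sys.getsizeof() reports a shallow per-object size (including the GC overhead when the object is managed by the GC), and gc.get_objects() lists only objects the collector currently tracks, so anything untracked is invisible to the second report. The numbers vary by interpreter version and platform, so none are shown.

```python
import gc
import sys
from collections import defaultdict


def summarize(objects):
    """Return {type name: (count, total shallow bytes)} for the given objects."""
    totals = defaultdict(lambda: [0, 0])
    for obj in objects:
        entry = totals[type(obj).__name__]
        entry[0] += 1
        entry[1] += sys.getsizeof(obj, 0)   # 0 if an object cannot report a size
    return {name: tuple(entry) for name, entry in totals.items()}


# A hand-built sample, so the counts are predictable.
sample = [(1, 2), ("a", 3.0), slice(1, 2), slice(None), [1, 2]]
report = summarize(sample)
for name, (count, size) in sorted(report.items()):
    print(name, count, size)

# Per-object cost of a slice versus a three-element tuple.
print(sys.getsizeof(slice(1, 2, 3)), sys.getsizeof((1, 2, 3)))

# A snapshot of everything the collector tracks at this moment.
tracked = summarize(gc.get_objects())
print("tracked tuples:", tracked.get("tuple"))
print("tracked slices:", tracked.get("slice"))

# gc.is_tracked() answers the "could this be skipped by GC?" question per object.
print(gc.is_tracked(1), gc.is_tracked([]))   # False True
```

Taking the snapshot at several fixed points in a real program and dividing the tuple and slice totals by the overall total would give the fractions asked for, at least for tracked objects.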
