From l.mastrodomenico at gmail.com  Sun Jul  1 00:18:53 2007
From: l.mastrodomenico at gmail.com (Lino Mastrodomenico)
Date: Sun, 1 Jul 2007 00:18:53 +0200
Subject: [Python-3000] PEP 368: Standard image protocol and class
Message-ID: <cc93256f0706301518kd9fe7a7iaf0e9bd8e2e18edd@mail.gmail.com>

Hi everyone,

I have submitted a new PEP:

    http://www.python.org/dev/peps/pep-0368/

It starts from a Pete Shinners' suggestion and from the consideration
that there are a lot of Python libraries that use image objects, but
almost all of them have implemented their own image classes,
incompatible with everyone else's (and often not very pythonic).

The PEP tries to improve the situation by defining a standard image
protocol: in practice this is a definition of how a minimal
"image-like" object should look and act in Python. Its details are
carefully chosen to allow existing image classes in Tkinter, PIL,
wxPython and pygame to implement it without breaking backward
compatibility with their existing user bases.

It also proposes the inclusion in the standard library of a fast and
efficient default implementation of the new protocol.

The PEP is long and detailed, but it's not in any way meant to be a
take-it-or-leave-it deal: I'm open to any change, even radical, to
improve it.

It isn't py3k-specific (and it has a low number), but I posted here
anyway because IMHO the main question is if and how to include this in
Python 3.0; then, if the PEP is accepted, I'll backport the new
classes to Python 2.6.

Any suggestion or criticism is welcome; I'll also solicit feedback
from external libraries developers that might be interested in
implementing the new protocol.

Regards

-- 
Lino Mastrodomenico
E-mail: l.mastrodomenico at gmail.com

From robert.kern at gmail.com  Sun Jul  1 00:33:07 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 30 Jun 2007 17:33:07 -0500
Subject: [Python-3000] PEP 368: Standard image protocol and class
In-Reply-To: <cc93256f0706301518kd9fe7a7iaf0e9bd8e2e18edd@mail.gmail.com>
References: <cc93256f0706301518kd9fe7a7iaf0e9bd8e2e18edd@mail.gmail.com>
Message-ID: <f66lnb$h1q$1@sea.gmane.org>

Lino Mastrodomenico wrote:
> Hi everyone,
> 
> I have submitted a new PEP:
> 
>     http://www.python.org/dev/peps/pep-0368/
> 
> It starts from a Pete Shinners' suggestion and from the consideration
> that there are a lot of Python libraries that use image objects, but
> almost all of them have implemented their own image classes,
> incompatible with everyone else's (and often not very pythonic).

Could you build this on top of the new buffer protocol that we're working on?

  http://www.python.org/dev/peps/pep-3118/

Enabling this kind of data sharing is precisely what the new buffer type is
intended for.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


From robert.kern at gmail.com  Sun Jul  1 00:36:08 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 30 Jun 2007 17:36:08 -0500
Subject: [Python-3000] PEP 368: Standard image protocol and class
In-Reply-To: <f66lnb$h1q$1@sea.gmane.org>
References: <cc93256f0706301518kd9fe7a7iaf0e9bd8e2e18edd@mail.gmail.com>
	<f66lnb$h1q$1@sea.gmane.org>
Message-ID: <f66lt0$h1q$2@sea.gmane.org>

Robert Kern wrote:
> Lino Mastrodomenico wrote:
>> Hi everyone,
>>
>> I have submitted a new PEP:
>>
>>     http://www.python.org/dev/peps/pep-0368/
>>
>> It starts from a Pete Shinners' suggestion and from the consideration
>> that there are a lot of Python libraries that use image objects, but
>> almost all of them have implemented their own image classes,
>> incompatible with everyone else's (and often not very pythonic).
> 
> Could you build this on top of the new buffer protocol that we're working on?
> 
>   http://www.python.org/dev/peps/pep-3118/

Never mind. I found the reference in your PEP.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


From l.mastrodomenico at gmail.com  Sun Jul  1 03:00:29 2007
From: l.mastrodomenico at gmail.com (Lino Mastrodomenico)
Date: Sun, 1 Jul 2007 03:00:29 +0200
Subject: [Python-3000] PEP 368: Standard image protocol and class
In-Reply-To: <cc93256f0706301518kd9fe7a7iaf0e9bd8e2e18edd@mail.gmail.com>
References: <cc93256f0706301518kd9fe7a7iaf0e9bd8e2e18edd@mail.gmail.com>
Message-ID: <cc93256f0706301800m20012379n84aff4ff3df88021@mail.gmail.com>

Here's the full text of the PEP's current draft, so you can comment
directly on it (thanks to Collin Winter for the suggestion):

PEP: 368
Title: Standard image protocol and class
Version: $Revision: 56133 $
Last-Modified: $Date: 2007-06-30 21:07:03 +0200 (sab, 30 giu 2007) $
Author: Lino Mastrodomenico <l.mastrodomenico at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 28-Jun-2007
Python-Version: 2.6, 3.0
Post-History:


Abstract
========

The current situation of image storage and manipulation in the Python
world is extremely fragmented: almost every library that uses image
objects has implemented its own image class, incompatible with
everyone else's and often not very pythonic.  A basic RGB image class
exists in the standard library (``Tkinter.PhotoImage``), but is pretty
much unusable, and unused, for anything except Tkinter programming.

This fragmentation not only takes up valuable space in the developers'
minds, but also makes the exchange of images between different
libraries (needed in relatively common use cases) slower and more
complex than it needs to be.

This PEP proposes to improve the situation by defining a simple and
pythonic image protocol/interface that can hopefully be accepted and
implemented by existing image classes inside and outside the standard
library *without breaking backward compatibility* with their existing
user bases.  In practice this is a definition of how a minimal
*image-like* object should look and act (in a similar way to the
``read()`` and ``write()`` methods in *file-like* objects).

The inclusion in the standard library of a class that provides basic
image manipulation functionality and implements the new protocol is
also proposed, together with a mixin class that helps add support
for the protocol to existing image classes.


Rationale
=========

A good way to have high quality modules ready for inclusion in the
Python standard library is to simply wait for natural selection among
competing external libraries to provide a clear winner with useful
functionality and a big user base.  Then the de-facto standard can be
officially sanctioned by including it in the standard library.

Unfortunately this approach hasn't worked well for the creation of a
dominant image class in the Python world: almost every third-party
library that requires an image object creates its own class
incompatible with the ones from other libraries.  This is a real
problem because it's entirely reasonable for a program to create and
manipulate an image using, e.g., PIL (the Python Imaging Library) and
then display it using wxPython or pygame.  But these libraries have
different and incompatible image classes, and the usual solution is to
manually "export" an image from the source to a (width, height,
bytes_string) tuple and "import" it creating a new instance in the
target format.  This approach *works*, but is both uglier and slower
than it needs to be.

Another "solution" that has sometimes been used is the creation of
specific adapters and/or converters from a class to another (e.g. PIL
offers the ``ImageTk`` module for converting PIL images to a class
compatible with the Tkinter one).  But this approach doesn't scale
well with the number of libraries involved and it's still annoying for
the user: if I have a perfectly good image object, why should I convert
it before passing it to the next method?  Why can't the method simply
accept my image as-is?

The problem isn't by any stretch limited to the three mentioned
libraries and probably has multiple causes, including two that IMO are
very important to understand before solving it:

* in today's computing world an image is a basic type not strictly
  tied to a specific domain.  This is why there will never be a clear
  winner between the image classes from the three libraries mentioned
  above (PIL, wxPython and pygame): they cover different domains and
  don't really compete with each other;

* the Python standard library has never provided a good image class
  that can be adopted or imitated by third-party modules.
  ``Tkinter.PhotoImage`` provides basic RGB functionality, but it's by
  far the slowest and ugliest of the bunch and it can be instantiated
  only after the Tkinter root window has been created.

This PEP tries to improve this situation in four ways:

1. It defines a simple and pythonic image protocol/interface (both on
   the Python and the C side) that can hopefully be accepted and
   implemented by existing image classes inside and outside the
   standard library *without breaking backward compatibility* with
   their existing user bases.

2. It proposes the inclusion in the standard library of three new
   classes:

   * ``ImageMixin`` provides almost everything necessary to implement
     the new protocol; its main purpose is to make it as simple as
     possible to support this interface for existing libraries, in
     some cases as simple as adding it to the list of base classes and
     doing minor additions to the constructor.

   * ``Image`` is a subclass of ``ImageMixin`` and will add a
     constructor that can resize and/or convert an image between
     different pixel formats.  This is intended to provide a fast and
     efficient default implementation of the new protocol.

   * ``ImageSize`` is a minor helper class.  See below for details.

3. ``Tkinter.PhotoImage`` will implement the new protocol (mostly
   through the ``ImageMixin`` class) and all the Tkinter methods that
   can receive an image will be modified to accept any object that
   implements the interface.  As an aside, the author of this PEP will
   collaborate with the developers of the most common external
   libraries to achieve the same goal (supporting the protocol in
   their classes and accepting any class that implements it).

4. New ``PyImage_*`` functions will be added to the CPython C API:
   they implement the C side of the protocol and accept as first
   parameter **any** object that supports it, even if it isn't an
   instance of the ``Image``/``ImageMixin`` classes.

The main effects for the end user will be a simplification of the
interchange of images between different libraries (if everything goes
well, any Python library will accept images from any other library)
and the out-of-the-box availability of the new ``Image`` class.  The
new class is intended to cover simple but common use cases like
cropping and/or resizing a photograph to the desired size and passing
it to an appropriate widget for displaying it in a window, or
darkening a texture and passing it to a 3D library.

The ``Image`` class is not intended to replace or compete with PIL,
PythonMagick or NumPy, even if it provides a (very small) subset of
the functionality of these three libraries.  In particular PIL offers
very rich image manipulation features with *dozens* of classes,
filters, transformations and file formats.  The inclusion of PIL (or
something similar) in the standard library may, or may not, be a
worthy goal but it's completely outside the scope of this PEP.


Specification
=============

The ``imageop`` module is used as the *default* location for the new
classes and objects because it has for a long time hosted functions
that provided a somewhat similar functionality, but a new module may
be created if preferred (e.g. a new "``image``" or "``media``" module;
the latter may eventually include other multimedia classes).

``MODES`` is a new module level constant: it is a set of the pixel
formats supported by the ``Image`` class.  Any image object that
implements the new protocol is guaranteed to be formatted in one of
these modes, but libraries that accept images are allowed to support
only a subset of them.

These modes are in turn also available as module level constants (e.g.
``imageop.RGB``).

The following table is a summary of the modes currently supported and
their properties:

========= =============== ========= =========== ======================
  Name       Component    Bits per  Subsampling        Valid
             names        component                    intervals
========= =============== ========= =========== ======================
L         l (lowercase L) 8         no          full range
L16       l               16        no          full range
L32       l               32        no          full range
LA        l, a            8         no          full range
LA32      l, a            16        no          full range
RGB       r, g, b         8         no          full range
RGB48     r, g, b         16        no          full range
RGBA      r, g, b, a      8         no          full range
RGBA64    r, g, b, a      16        no          full range
YV12      y, cr, cb       8         1, 2, 2     16-235, 16-240, 16-240
JPEG_YV12 y, cr, cb       8         1, 2, 2     full range
CMYK      c, m, y, k      8         no          full range
CMYK64    c, m, y, k      16        no          full range
========= =============== ========= =========== ======================

When the name of a mode ends with a number, it represents the average
number of bits per pixel.  All the other modes simply use a byte per
component per pixel.
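
As a rough illustration of the naming rule above, the average number
of bits per pixel can be derived from the bits per component and the
per-component subsampling.  The helper name
``average_bits_per_pixel`` is hypothetical and not part of the
proposal; the ``subsampling`` argument follows the tuple-of-pairs
layout described under "Mode Objects" below.

```python
from fractions import Fraction


def average_bits_per_pixel(bits_per_component, subsampling):
    """Average number of bits stored per pixel (hypothetical helper).

    `subsampling` holds one (horizontal, vertical) pair per component,
    so a component subsampled by (2, 2) contributes a quarter of a
    sample per pixel.
    """
    samples_per_pixel = sum(Fraction(1, x * y) for x, y in subsampling)
    return int(samples_per_pixel * bits_per_component)


rgb48 = average_bits_per_pixel(16, ((1, 1),) * 3)
yv12 = average_bits_per_pixel(8, ((1, 1), (2, 2), (2, 2)))
print(rgb48, yv12)  # 48 12
```

This is why ``YV12`` carries the number 12 even though each stored
sample is 8 bits wide.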

No palette modes or modes with fewer than 8 bits per component are
supported.  Welcome to the 21st century.

Here's a quick description of the modes and the rationale for their
inclusion; there are four groups of modes:

1. **grayscale** (``L*`` modes): they are heavily used in scientific
   computing (those people may also need a very high dynamic range and
   precision, hence ``L32``, the only mode with 32 bits per component)
   and sometimes it can be useful to consider a single component of a
   color image as a grayscale image (this is used by the individual
   planes of the planar images, see ``YV12`` below); the name of the
   component (``'l'``, lowercase letter L) stands for luminance, the
   second optional component (``'a'``) is the alpha value and
   represents the opacity of the pixels: alpha = 0 means full
   transparency, alpha = 255/65535 represents a fully opaque pixel;

2. **RGB\* modes**: the garden variety color images.  The optional
   alpha component has the same meaning as in grayscale modes;

3. **YCbCr**, a.k.a. YUV (``*YV12`` modes).  These modes are planar
   (i.e. the values of all the pixels for each component are stored in
   a consecutive memory area, instead of the usual arrangement where
   all the components of a pixel reside in consecutive bytes) and use
   a 1, 2, 2 (a.k.a. 4:2:0) subsampling (i.e. each pixel has its own Y
   value, but the Cb and Cr components are shared between groups of
   2x2 adjacent pixels) because this is the format that's by far the
   most common for YCbCr images.  Please note that the V (Cr) plane is
   stored before the U (Cb) plane.

   ``YV12`` is commonly used for MPEG2 (including DVDs), MPEG4 (both
   ASP/DivX and AVC/H.264) and Theora video frames.  Valid values for
   Y are in range(16, 236) (excluding 236), and valid values for Cb
   and Cr are in range(16, 241).  ``JPEG_YV12`` is similar to
   ``YV12``, but the three components can have the full range of 256
   values.  It's the native format used by almost all JPEG/JFIF files
   and by MJPEG video frames.  The "strangeness" of these two wrt all
   the other supported modes derives from the fact that they are
   widely used that way by a lot of existing libraries and
   applications; this is also the reason why they are included
   (together with the fact that they can't be losslessly converted to
   RGB, because YCbCr is a bigger color space); the funny 4:2:0 planar
   arrangement of the pixel values is relatively easy to support
   because in most cases the three planes can be considered three
   separate grayscale images;

4. **CMYK\* modes** (cyan, magenta, yellow and black) are subtractive
   color modes, used for printing color images on dead trees.
   Professional designers love to pretend that they can't live without
   them, so here they are.
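
The planar arrangement described for the ``*YV12`` modes can be
sketched as a set of byte ranges inside a single buffer.  This is only
an illustration under stated assumptions: the planes are assumed to
follow each other without padding, in the order Y, Cr (V), Cb (U), and
the function name ``yv12_plane_layout`` is made up for this example.

```python
def yv12_plane_layout(width, height):
    """Return {component: (offset, length)} in bytes for a YV12 buffer.

    Assumes 8 bits per sample, no padding, and 2x2 chroma subsampling.
    """
    if width % 2 or height % 2:
        raise ValueError("YV12 width and height must both be even")
    y_len = width * height
    chroma_len = (width // 2) * (height // 2)
    return {
        "y": (0, y_len),
        "cr": (y_len, chroma_len),               # V plane is stored first
        "cb": (y_len + chroma_len, chroma_len),  # U plane follows
    }


layout = yv12_plane_layout(4, 2)
print(layout)  # {'y': (0, 8), 'cr': (8, 2), 'cb': (10, 2)}
```

Each of the three ranges can then be treated as a separate 8-bit
grayscale image, which is what makes the layout easy to support.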


Python API
----------

See the examples_ below.

In Python 2.x, all the new classes defined here are new-style classes.


Mode Objects
''''''''''''

The mode objects offer a number of attributes and methods that can be
used for implementing generic algorithms that work on different types
of images:

``components``

    The number of components per pixel (e.g. 4 for an RGBA image).

``component_names``

    A tuple of strings; see the column "Component names" in the above
    table.

``bits_per_component``

    8, 16 or 32; see "Bits per component" in the above table.

``bytes_per_pixel``

    ``components * bits_per_component // 8``, only available for
    non-planar modes (see below).

``planar``

    Boolean; ``True`` if the image components reside each in a
    separate plane.  Currently this happens if and only if the mode
    uses subsampling.

``subsampling``

    A tuple that for each component in the mode contains a tuple of
    two integers that represent the amount of downsampling in the
    horizontal and vertical direction, respectively.  In practice it's
    ``((1, 1), (2, 2), (2, 2))`` for ``YV12`` and ``JPEG_YV12`` and
    ``((1, 1),) * components`` for everything else.

``x_divisor``

    ``max(x for x, y in subsampling)``; the width of an image that
    uses this mode must be divisible by this value.

``y_divisor``

    ``max(y for x, y in subsampling)``; the height of an image that
    uses this mode must be divisible by this value.

``intervals``

    A tuple that for each component in the mode contains a tuple of
    two integers: the minimum and maximum valid value for the
    component.  Its value is ``((16, 235), (16, 240), (16, 240))`` for
    ``YV12`` and ``((0, 2 ** bits_per_component - 1),) * components``
    for everything else.

``get_length(iterable[integer]) -> int``

    The parameter must be an iterable that contains two integers: the
    width and height of an image; it returns the number of bytes
    needed to store an image of these dimensions with this mode.

Implementation detail: the modes are instances of a subclass of
``str`` and have a value equal to their name (e.g. ``imageop.RGB ==
'RGB'``) except for ``L32`` that has value ``'I'``.  This is only
intended for backward compatibility with existing PIL users; new code
that uses the image protocol proposed here should not rely on this
detail.
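
To make the relationships between these attributes concrete, here is a
minimal sketch of how a mode object could be built as a ``str``
subclass.  It is not the proposed implementation: the class name
``Mode``, its constructor signature and the way defaults are filled in
are all assumptions made for illustration (Python 3 syntax).

```python
class Mode(str):
    """Hypothetical mode object: a str subclass carrying layout metadata."""

    def __new__(cls, name, component_names, bits_per_component,
                subsampling=None, intervals=None):
        self = super().__new__(cls, name)
        self.component_names = tuple(component_names)
        self.components = len(self.component_names)
        self.bits_per_component = bits_per_component
        self.subsampling = subsampling or ((1, 1),) * self.components
        # Planar if and only if some component is subsampled.
        self.planar = any(pair != (1, 1) for pair in self.subsampling)
        self.x_divisor = max(x for x, y in self.subsampling)
        self.y_divisor = max(y for x, y in self.subsampling)
        full_range = (0, 2 ** bits_per_component - 1)
        self.intervals = intervals or (full_range,) * self.components
        if not self.planar:
            # Only defined for non-planar modes.
            self.bytes_per_pixel = self.components * bits_per_component // 8
        return self

    def get_length(self, size):
        """Number of bytes needed to store an image of the given size."""
        width, height = size
        if width % self.x_divisor or height % self.y_divisor:
            raise ValueError("size is not divisible by the mode divisors")
        samples = sum((width // x) * (height // y)
                      for x, y in self.subsampling)
        return samples * self.bits_per_component // 8


RGBA = Mode("RGBA", "rgba", 8)
YV12 = Mode("YV12", ("y", "cr", "cb"), 8,
            subsampling=((1, 1), (2, 2), (2, 2)),
            intervals=((16, 235), (16, 240), (16, 240)))
print(RGBA == "RGBA", RGBA.bytes_per_pixel, YV12.get_length((4, 2)))
```

Because the object is a string, comparisons such as ``mode == 'RGBA'``
keep working for code written against PIL-style mode names.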


Image Protocol
''''''''''''''

Any object that supports the image protocol must provide the following
methods and attributes:

``mode``

    The format and the arrangement of the pixels in this image; it's
    one of the constants in the ``MODES`` set.

``size``

    An instance of the `ImageSize class`_; it's a named tuple of two
    integers: the width and the height of the image in pixels; both of
    them must be >= 1 and can also be accessed as the ``width`` and
    ``height`` attributes of ``size``.

``buffer``

    A sequence of integers between 0 and 255; they are the actual
    bytes used for storing the image data (i.e. modifying their values
    affects the image pixels and vice versa); the data has a
    row-major/C-contiguous order without padding and without any
    special memory alignment, even when there are more than 8 bits per
    component.  The only supported methods are ``__len__``,
    ``__getitem__``/``__setitem__`` (with both integers and slice
    indexes) and ``__iter__``; on the C side it implements the buffer
    protocol.

    This is a pretty low level interface to the image and the user is
    responsible for using the correct (native) byte order for modes
    with more than 8 bits per component and the correct value ranges
    for ``YV12`` images.  A buffer may or may not keep a reference to
    its image, but it's still safe (if useless) to use the buffer even
    after the corresponding image has been destroyed by the garbage
    collector (this will require changes to the image class of
    wxPython and possibly other libraries).  Implementation detail:
    this can be an ``array('B')``, a ``bytes()`` object or a
    specialized fixed-length type.
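
A short sketch of what the row-major, unpadded layout means in
practice for non-planar modes.  The helper ``pixel_offset`` is a
hypothetical name, and plain ``bytearray`` objects stand in for the
``buffer`` attribute; the 16-bit example uses the ``struct`` module
with native byte order, as the text requires.

```python
import struct


def pixel_offset(x, y, width, bytes_per_pixel):
    """Byte offset of pixel (x, y) in a row-major buffer without padding."""
    return (y * width + x) * bytes_per_pixel


# A tiny 3x2 RGB image: 3 bytes per pixel, initially all black.
width, height, bpp = 3, 2, 3
rgb = bytearray(width * height * bpp)
start = pixel_offset(2, 1, width, bpp)
rgb[start:start + bpp] = bytes((255, 128, 0))  # bottom-right pixel

# A 3x2 L16 image: one 16-bit component in native byte order ("=H").
l16 = bytearray(width * height * 2)
struct.pack_into("=H", l16, pixel_offset(0, 1, width, 2), 40000)
value = struct.unpack_from("=H", l16, pixel_offset(0, 1, width, 2))[0]
print(start, value)  # 15 40000
```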

``info``

    A ``dict`` object that can contain arbitrary metadata associated
    with the image (e.g. DPI, gamma, ICC profile, exposure time...);
    the interpretation of this data is beyond the scope of this PEP
    and probably depends on the library used to create and/or to save
    the image; if a method of the image returns a new image, it can
    copy or adapt metadata from its own ``info`` attribute (the
    ``ImageMixin`` implementation always creates a new image with an
    empty ``info`` dictionary).

| ``bits_per_component``
| ``bytes_per_pixel``
| ``component_names``
| ``components``
| ``intervals``
| ``planar``
| ``subsampling``

    Shortcuts for the corresponding ``mode.*`` attributes.

``map(function[, function...]) -> None``

    For every pixel in the image, maps each component through the
    corresponding function.  If only one function is passed, it is
    used repeatedly for each component.  This method modifies the
    image **in place** and is usually very fast (most of the time the
    functions are called only a small number of times, possibly only
    once for simple functions without branches), but it imposes a
    number of restrictions on the function(s) passed:

    * it must accept a single integer argument and return a number
      (``map`` will round the result to the nearest integer and clip
      it to ``range(0, 2 ** bits_per_component)``, if necessary);

    * it must *not* try to intercept any ``BaseException``,
      ``Exception`` or any unknown subclass of ``Exception`` raised by
      any operation on the argument (implementations may try to
      optimize the speed by passing funny objects, so even a simple
      ``"if n == 10:"`` may raise an exception: simply ignore it,
      ``map`` will take care of it); catching any other exception is
      fine;

    * it should be side-effect free and its result should not depend
      on values (other than the argument) that may change during a
      single invocation of ``map``.
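
One simple way an implementation might honour these rules for 8-bit
modes is a lookup table: each function is evaluated once per possible
component value, the results are rounded and clipped, and the table is
then applied to the whole buffer.  This is only one possible strategy
and not necessarily the one intended by this PEP (which hints at
passing special objects to trace the function); the name
``map_in_place`` and its signature are assumptions.

```python
def map_in_place(buffer, components, functions):
    """Apply one function per component to an interleaved 8-bit bytearray."""
    functions = tuple(functions)
    if len(functions) == 1:
        functions = functions * components  # reuse it for every component
    tables = [
        # Round to an integer, then clip to range(0, 256).
        bytes(min(max(int(round(f(v))), 0), 255) for v in range(256))
        for f in functions
    ]
    for index, table in enumerate(tables):
        # Every `components`-th byte belongs to the same component.
        buffer[index::components] = buffer[index::components].translate(table)


pixels = bytearray([10, 200, 30, 40, 250, 60])  # two RGB pixels
map_in_place(pixels, 3, (lambda v: v * 1.5,))
print(list(pixels))  # [15, 255, 45, 60, 255, 90]
```

With this approach each function is called exactly 256 times,
regardless of the image size.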

| ``rotate90() -> image``
| ``rotate180() -> image``
| ``rotate270() -> image``

    Return a copy of the image rotated 90, 180 or 270 degrees
    counterclockwise around its center.
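
The coordinate mapping behind a 90-degree counterclockwise rotation
can be sketched on a plain row-major list of pixels.  The function
below is a hypothetical stand-alone helper, not the proposed method:
it takes and returns lists instead of image objects.

```python
def rotate90(pixels, width, height):
    """Rotate a row-major list of pixels 90 degrees counterclockwise.

    Returns the rotated list and the new (width, height).
    """
    new_width, new_height = height, width
    rotated = [None] * (width * height)
    for y in range(height):
        for x in range(width):
            # The top-right corner moves to the top-left corner.
            new_x, new_y = y, width - 1 - x
            rotated[new_y * new_width + new_x] = pixels[y * width + x]
    return rotated, (new_width, new_height)


# a b c        c f
# d e f   ->   b e
#              a d
rotated, size = rotate90(list("abcdef"), 3, 2)
print("".join(rotated), size)  # cfbead (2, 3)
```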

``clip() -> None``

    Saturates invalid component values in ``YV12`` images to the
    minimum or the maximum allowed (see ``mode.intervals``); for other
    image modes this method does nothing and returns very quickly;
    libraries that save/export ``YV12`` images are encouraged to
    always call this method, since intermediate operations (e.g. the
    ``map`` method) may assign values outside the valid intervals to
    pixels.
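
The saturation performed by ``clip`` amounts to clamping every sample
of a plane to its valid interval.  A minimal sketch, assuming each
plane is available as a separate ``bytearray`` (the helper name
``clip_plane`` is hypothetical):

```python
def clip_plane(plane, low, high):
    """Clamp every byte of `plane` to the inclusive range [low, high]."""
    for index, value in enumerate(plane):
        if value < low:
            plane[index] = low
        elif value > high:
            plane[index] = high


y_plane = bytearray([0, 16, 128, 235, 255])
clip_plane(y_plane, 16, 235)  # the Y interval of YV12
print(list(y_plane))  # [16, 16, 128, 235, 235]
```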

``split() -> tuple[image]``

    Returns a tuple of ``L``, ``L16`` or ``L32`` images corresponding
    to the individual components in the image.
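
For an interleaved 8-bit image, splitting boils down to taking every
N-th byte of the buffer.  The sketch below works on raw bytes and
returns copies; whether the real method returns copies or views for
non-planar images is not specified here, so treat that choice as an
assumption.

```python
def split(buffer, components):
    """Return one bytearray per component of an interleaved 8-bit buffer."""
    return tuple(bytearray(buffer[index::components])
                 for index in range(components))


r, g, b = split(bytes([1, 2, 3, 4, 5, 6]), 3)  # two RGB pixels
print(list(r), list(g), list(b))  # [1, 4] [2, 5] [3, 6]
```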

Planar images also support attributes with the same names defined in
``component_names``: they contain grayscale (mode ``L``) images that
offer a view on the pixel values for the corresponding component; any
change to the subimages is immediately reflected on the parent image
and vice versa (their buffers refer to the same memory location).

Non-planar images offer the following additional methods:

``pixels() -> iterator[pixel]``

    Returns an iterator that iterates over all the pixels in the
    image, starting from the top line and scanning each line from left
    to right.  See below for a description of the `pixel objects`_.

``__iter__() -> iterator[line]``

    Returns an iterator that iterates over all the lines in the image,
    from top to bottom.  See below for a description of the `line
    objects`_.

``__len__() -> int``

    Returns the number of lines in the image (``size.height``).

``__getitem__(integer) -> line``

    Returns the line at the specified (y) position.

``__getitem__(tuple[integer]) -> pixel``

    The parameter must be a tuple of two integers; they are
    interpreted respectively as x and y coordinates in the image (0, 0
    is the top left corner) and a pixel object is returned.

``__getitem__(slice | tuple[integer | slice]) -> image``

    The parameter must be a slice or a tuple that contains two slices
    or an integer and a slice; the selected area of the image is
    copied and a new image is returned; ``image[x:y:z]`` is equivalent
    to ``image[:, x:y:z]``.

``__setitem__(tuple[integer], integer | iterable[integer]) -> None``

    Modifies the pixel at the specified position; ``image[x, y] =
    integer`` is a shortcut for ``image[x, y] = (integer,)`` for
    images with a single component.

``__setitem__(slice | tuple[integer | slice], image) -> None``

    Selects an area in the same way as the corresponding form of the
    ``__getitem__`` method and assigns to it a copy of the pixels from
    the image in the second argument, that must have exactly the same
    mode as this image and the same size as the specified area; the
    alpha component, if present, is simply copied and doesn't affect
    the other components of the image (i.e. no alpha compositing is
    performed).
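
The three forms of ``__getitem__`` can be told apart by the type of
the index.  The class below is a deliberately tiny, hypothetical
stand-in (single 8-bit component, non-negative indexes, lines returned
as lists and pixels as integers instead of line/pixel objects); it
only illustrates the dispatch logic.

```python
class TinyImage:
    """Hypothetical single-component 8-bit image showing index dispatch."""

    def __init__(self, width, height, data=None):
        self.width, self.height = width, height
        self.buffer = bytearray(data or width * height)

    def __getitem__(self, index):
        if isinstance(index, int):            # image[y] -> line
            start = index * self.width
            return list(self.buffer[start:start + self.width])
        if isinstance(index, slice):          # image[a:b] == image[:, a:b]
            index = (slice(None), index)
        x, y = index
        if isinstance(x, int) and isinstance(y, int):
            return self.buffer[y * self.width + x]   # image[x, y] -> pixel
        # Mixed integer/slice or two slices: copy the selected area.
        xs = range(self.width)[x] if isinstance(x, slice) else [x]
        ys = range(self.height)[y] if isinstance(y, slice) else [y]
        data = bytes(self.buffer[row * self.width + col]
                     for row in ys for col in xs)
        return TinyImage(len(xs), len(ys), data)


img = TinyImage(3, 2, bytes(range(6)))  # rows: [0, 1, 2] and [3, 4, 5]
print(img[1], img[2, 1], list(img[1:3, 0].buffer), list(img[1:].buffer))
```

Note that ``image[x, y]`` takes the column first, while a single
integer or slice selects rows.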

The ``mode``, ``size`` and ``buffer`` (including the address in memory
of the ``buffer``) never change after an image is created.

It is expected that, if PEP 3118 is accepted, all the image objects
will support the new buffer protocol, however this is beyond the scope
of this PEP.


``Image`` and ``ImageMixin`` Classes
''''''''''''''''''''''''''''''''''''

The ``ImageMixin`` class implements all the methods and attributes
described above except ``mode``, ``size``, ``buffer`` and ``info``.
``Image`` is a subclass of ``ImageMixin`` that adds support for these
four attributes and offers the following constructor (please note that
the constructor is not part of the image protocol):

``__init__(mode, size, color, source)``

    ``mode`` must be one of the constants in the ``MODES`` set,
    ``size`` is a sequence of two integers (width and height of the
    new image); ``color`` is a sequence of integers, one for each
    component of the image, used to initialize all the pixels to the
    same value; ``source`` can be a sequence of integers of the
    appropriate size and format that is copied as-is in the buffer of
    the new image or an existing image; in Python 2.x ``source`` can
    also be an instance of ``str`` and is interpreted as a sequence of
    bytes.  ``color`` and ``source`` are mutually exclusive and if
    they are both omitted the image is initialized to transparent
    black (all the bytes in the buffer have value 16 in the ``YV12``
    mode, 255 in the ``CMYK*`` modes and 0 for everything else).  If
    ``source`` is present and is an image, ``mode`` and/or ``size``
    can be omitted; if they are specified and are different from the
    source mode and/or size, the source image is converted.

    The exact algorithms used for resizing and doing color space
    conversions may differ between Python versions and
    implementations, but they always give high quality results (e.g.:
    a cubic spline interpolation can be used for upsampling and an
    antialias filter can be used for downsampling images); any
    combination of mode conversion is supported, but the algorithm
    used for conversions to and from the ``CMYK*`` modes is pretty
    naïve: if you have the exact color profiles of your devices you
    may want to use a good color management tool such as LittleCMS.
    The new image has an empty ``info`` ``dict``.
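
The default initialization rule (when both ``color`` and ``source``
are omitted) can be written down in a few lines.  The helper names
below are hypothetical, and mode names are passed as plain strings for
simplicity.

```python
def default_fill_byte(mode_name):
    """Byte value used for every byte of a freshly created image."""
    if mode_name == "YV12":
        return 16
    if mode_name.startswith("CMYK"):
        return 255
    return 0


def blank_buffer(mode_name, length):
    """Return a bytearray of `length` bytes holding the default fill."""
    return bytearray([default_fill_byte(mode_name)]) * length


print(list(blank_buffer("CMYK", 4)), list(blank_buffer("RGBA", 4)))
# [255, 255, 255, 255] [0, 0, 0, 0]
```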


Line Objects
''''''''''''

The line objects (returned, e.g., when iterating over an image)
support the following attributes and methods:

``mode``

    The mode of the image from where this line comes.

``__iter__() -> iterator[pixel]``

    Returns an iterator that iterates over all the pixels in the line,
    from left to right.  See below for a description of the `pixel
    objects`_.

``__len__() -> int``

    Returns the number of pixels in the line (the image width).

``__getitem__(integer) -> pixel``

    Returns the pixel at the specified (x) position.

``__getitem__(slice) -> image``

    The selected part of the line is copied and a new image is
    returned; the new image will always have height 1.

``__setitem__(integer, integer | iterable[integer]) -> None``

    Modifies the pixel at the specified position; ``line[x] =
    integer`` is a shortcut for ``line[x] = (integer,)`` for images
    with a single component.

``__setitem__(slice, image) -> None``

    Selects a part of the line and assigns to it a copy of the pixels
    from the image in the second argument, that must have height 1, a
    width equal to the specified slice and the same mode as this line;
    the alpha component, if present, is simply copied and doesn't
    affect the other components of the image (i.e. no alpha
    compositing is performed).


Pixel Objects
'''''''''''''

The pixel objects (returned, e.g., when iterating over a line) support
the following attributes and methods:

``mode``

    The mode of the image from where this pixel comes.

``value``

    A tuple of integers, one for each component.  Any iterable of the
    correct length can be assigned to ``value`` (it will be
    automagically converted to a tuple), but you can't assign to it an
    integer, even if the mode has only a single component: use, e.g.,
    ``pixel.l = 123`` instead.

``r, g, b, a, l, c, m, y, k``

    The integer values of each component; only those applicable for
    the current mode (in ``mode.component_names``) will be available.

| ``__iter__() -> iterator[int]``
| ``__len__() -> int``
| ``__getitem__(integer | slice) -> int | tuple[int]``
| ``__setitem__(integer | slice, integer | iterable[integer]) ->
                                                              None``

    These four methods emulate a fixed-length list of integers, one
    for each pixel component.


``ImageSize`` Class
'''''''''''''''''''

``ImageSize`` is a named tuple, a class identical to ``tuple`` except
that:

* its constructor only accepts two integers, width and height; they
  are converted in the constructor using their ``__index__()``
  methods, so all the ``ImageSize`` objects are guaranteed to contain
  only ``int`` (or possibly ``long``, in Python 2.x) instances;

* it has a ``width`` and a ``height`` property that are equivalent to
  the first and the second number in the tuple, respectively;

* the string returned by its ``__repr__`` method is
  ``'imageop.ImageSize(width=%d, height=%d)' % (width, height)``.

``ImageSize`` is not usually instantiated by end-users, but can be
used when creating a new class that implements the image protocol,
since the ``size`` attribute must be an ``ImageSize`` instance.
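
A minimal sketch of a class with the three properties listed above,
written as a direct ``tuple`` subclass in Python 3 syntax.  It is an
illustration, not the proposed implementation; ``operator.index`` is
used as the standard way to call ``__index__()``.

```python
import operator


class ImageSize(tuple):
    """Sketch of a (width, height) named tuple as described above."""

    __slots__ = ()

    def __new__(cls, width, height):
        # operator.index() calls __index__(), so floats are rejected.
        return super().__new__(
            cls, (operator.index(width), operator.index(height)))

    @property
    def width(self):
        return self[0]

    @property
    def height(self):
        return self[1]

    def __repr__(self):
        return ('imageop.ImageSize(width=%d, height=%d)'
                % (self.width, self.height))


size = ImageSize(640, 480)
print(size, size == (640, 480), size.width)
```

Since it is still a tuple, ``width, height = image.size`` keeps working
for callers that do not care about the named attributes.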


C API
-----

The available image modes are visible at the C level as ``PyImage_*``
constants of type ``PyObject *`` (e.g.: ``PyImage_RGB`` is
``imageop.RGB``).

The following functions offer a C-friendly interface to mode and image
objects (all the functions return ``NULL`` or -1 on failure):

``int PyImageMode_Check(PyObject *obj)``

    Returns true if the object ``obj`` is a valid image mode.

| ``int PyImageMode_GetComponents(PyObject *mode)``
| ``PyObject* PyImageMode_GetComponentNames(PyObject *mode)``
| ``int PyImageMode_GetBitsPerComponent(PyObject *mode)``
| ``int PyImageMode_GetBytesPerPixel(PyObject *mode)``
| ``int PyImageMode_GetPlanar(PyObject *mode)``
| ``PyObject* PyImageMode_GetSubsampling(PyObject *mode)``
| ``int PyImageMode_GetXDivisor(PyObject *mode)``
| ``int PyImageMode_GetYDivisor(PyObject *mode)``
| ``Py_ssize_t PyImageMode_GetLength(PyObject *mode, Py_ssize_t width,
                                     Py_ssize_t height)``

    These functions are equivalent to their corresponding Python
    attributes or methods.

``int PyImage_Check(PyObject *obj)``

    Returns true if the object ``obj`` is an ``Image`` object or an
    instance of a subtype of the ``Image`` type; see also
    ``PyObject_CheckImage`` below.

``int PyImage_CheckExact(PyObject *obj)``

    Returns true if the object ``obj`` is an ``Image`` object, but not
    an instance of a subtype of the ``Image`` type.

| ``PyObject* PyImage_New(PyObject *mode, Py_ssize_t width,
                          Py_ssize_t height)``

    Returns a new ``Image`` instance, initialized to transparent black
    (see ``Image.__init__`` above for the details).

| ``PyObject* PyImage_FromImage(PyObject *image, PyObject *mode,
                                Py_ssize_t width, Py_ssize_t height)``

    Returns a new ``Image`` instance, initialized with the contents of
    the ``image`` object rescaled and converted to the specified
    ``mode``, if necessary.

| ``PyObject* PyImage_FromBuffer(PyObject *buffer, PyObject *mode,
                                 Py_ssize_t width,
                                 Py_ssize_t height)``

    Returns a new ``Image`` instance, initialized with the contents of
    the ``buffer`` object.

``int PyObject_CheckImage(PyObject *obj)``

    Returns true if the object ``obj`` implements a sufficient subset
    of the image protocol to be accepted by the functions defined
    below, even if its class is not a subclass of ``ImageMixin``
    and/or ``Image``.  Currently it simply checks for the existence
    and correctness of the attributes ``mode``, ``size`` and
    ``buffer``.

| ``PyObject* PyImage_GetMode(PyObject *image)``
| ``Py_ssize_t PyImage_GetWidth(PyObject *image)``
| ``Py_ssize_t PyImage_GetHeight(PyObject *image)``
| ``int PyImage_Clip(PyObject *image)``
| ``PyObject* PyImage_Split(PyObject *image)``
| ``PyObject* PyImage_GetBuffer(PyObject *image)``
| ``int PyImage_AsBuffer(PyObject *image, const void **buffer,
                         Py_ssize_t *buffer_len)``

    These functions are equivalent to their corresponding Python
    attributes or methods; the image memory can be accessed only with
    the GIL and a reference to the image or its buffer held, and extra
    care should be taken for modes with more than 8 bits per
    component: the data is stored in native byte order and may **not**
    be aligned on 2- or 4-byte boundaries.
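
The alignment caveat applies to Python code that reads the buffer as well.  A small sketch, assuming 16-bit unsigned components: the ``struct`` format prefix ``=`` selects native byte order with standard sizes and no alignment, so it can read values that start at an odd offset.  The helper name ``read_u16_components`` is made up for this example.  (In C, the usual equivalent is to ``memcpy`` each value into a local variable rather than cast the pointer.)

```python
import struct


def read_u16_components(buffer, offset, count):
    """Read `count` unsigned 16-bit components starting at byte `offset`.

    "=" means native byte order, standard size, no alignment padding,
    so `offset` does not have to be a multiple of 2.
    """
    return struct.unpack_from("=%dH" % count, buffer, offset)


# One leading byte forces the 16-bit values onto odd offsets.
data = b"\xaa" + struct.pack("=3H", 1, 2, 65535)
components = read_u16_components(data, 1, 3)
```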


Examples
========

A few examples of common operations with the new ``Image`` class and
protocol::

    # create a new black RGB image of 6x9 pixels
    rgb_image = imageop.Image(imageop.RGB, (6, 9))

    # same as above, but initialize the image to bright red
    rgb_image = imageop.Image(imageop.RGB, (6, 9), color=(255, 0, 0))

    # convert the image to YCbCr
    yuv_image = imageop.Image(imageop.JPEG_YV12, source=rgb_image)

    # read the value of a pixel and split it into three ints
    r, g, b = rgb_image[x, y]

    # modify the magenta component of a pixel in a CMYK image
    cmyk_image[x, y].m = 13

    # modify the Y (luma) component of a pixel in a *YV12 image and
    # its corresponding subsampled Cr (red chroma)
    yuv_image.y[x, y] = 42
    yuv_image.cr[x // 2, y // 2] = 54

    # iterate over an image
    for line in rgb_image:
        for pixel in line:
            # swap red and blue, and set green to 0
            pixel.value = pixel.b, 0, pixel.r

    # find the maximum value of the red component in the image
    max_red = max(pixel.r for pixel in rgb_image.pixels())

    # count the number of colors in the image
    num_of_colors = len(set(tuple(pixel) for pixel in image.pixels()))

    # copy a block of 4x2 pixels near the upper right corner of an
    # image and paste it into the lower left corner of the same image
    image[:4, -2:] = image[-6:-2, 1:3]

    # create a copy of the image, except that the new image can have a
    # different (usually empty) info dict
    new_image = image[:]

    # create a mirrored copy of the image, with the left and right
    # sides flipped
    flipped_image = image[::-1, :]

    # downsample an image to half its original size using a fast, low
    # quality operation and a slower, high quality one:
    low_quality_image = image[::2, ::2]
    new_size = image.size.width // 2, image.size.height // 2
    high_quality_image = imageop.Image(size=new_size, source=image)

    # direct buffer access
    rgb_image[0, 0] = r, g, b
    assert tuple(rgb_image.buffer[:3]) == (r, g, b)


Backwards Compatibility
=======================

There are three areas touched by this PEP where backwards
compatibility should be considered:

* **Python 2.6**: new classes and objects are added to the ``imageop``
  module without touching the existing module contents; new methods
  and attributes will be added to ``Tkinter.PhotoImage`` and its
  ``__getitem__`` and ``__setitem__`` methods will be modified to
  accept integers, tuples and slices (currently they only accept
  strings).  All the changes provide a superset of the existing
  functionality, so no major compatibility issues are expected.

* **Python 3.0**: the legacy contents of the ``imageop`` module will
  be deleted, according to PEP 3108; everything defined in this
  proposal will work like in Python 2.x with the exception of the
  usual 2.x/3.0 differences (e.g. support for ``long`` integers and
  for interpreting ``str`` instances as sequences of bytes will be
  dropped).

* **external libraries**: the names and the semantics of the standard
  image methods and attributes are carefully chosen to allow some
  external libraries that manipulate images (including at least PIL,
  wxPython and pygame) to implement the new protocol in their image
  classes without breaking compatibility with existing code.  The only
  blatant conflicts between the image protocol and NumPy arrays are
  the value of the ``size`` attribute and the coordinates order in the
  ``image[x, y]`` expression.


Reference Implementation
========================

If this PEP is accepted, the author will provide a reference
implementation of the new classes in pure Python (that can run in
CPython, PyPy, Jython and IronPython) and a second one optimized for
speed in Python and C, suitable for inclusion in the CPython standard
library.  The author will also submit the required Tkinter patches.
All the code will be available in a version for Python 2.x and a
version for Python 3.0 (it is expected that the two versions will be
very similar and that the Python 3.0 one will probably be generated
almost completely automatically).


Acknowledgments
===============

The implementation of this PEP, if accepted, is sponsored by Google
through the Google Summer of Code program.


Copyright
=========

This document has been placed in the public domain.


-- 
Lino Mastrodomenico
E-mail: l.mastrodomenico at gmail.com

From bjourne at gmail.com  Sun Jul  1 14:34:03 2007
From: bjourne at gmail.com (BJörn Lindqvist)
Date: Sun, 1 Jul 2007 14:34:03 +0200
Subject: [Python-3000] PEP 368: Standard image protocol and class
In-Reply-To: <cc93256f0706301800m20012379n84aff4ff3df88021@mail.gmail.com>
References: <cc93256f0706301518kd9fe7a7iaf0e9bd8e2e18edd@mail.gmail.com>
	<cc93256f0706301800m20012379n84aff4ff3df88021@mail.gmail.com>
Message-ID: <740c3aec0707010534j4049efbchb2389bf61413c300@mail.gmail.com>

Cool PEP! I really love the API for the Image class. A standard Image
class would be a useful addition to the standard library.

But I cannot see how it would solve the problem with too many image
classes. The reason why PIL, PyGame and wxPython have different image
classes is because each of them uses different C functions for
manipulating said image classes. These differences bubble up through
the bindings and result in PIL exposing an Image, PyGame a Surface
and wxPython a wxImage. The result is that if you want to use a PIL
Image in, say, PyGame, you still need to convert it. If PIL stores RGB
images with 32 bpp and PyGame uses 24, then you'll have to convert it
to get it into the proper format.

The only way to get compatibility between the libraries is to create
an image library in C _and_ get those libraries to start using it.

-- 
mvh Björn

From stargaming at gmail.com  Sun Jul  1 18:01:12 2007
From: stargaming at gmail.com (Stargaming)
Date: Sun, 01 Jul 2007 18:01:12 +0200
Subject: [Python-3000] PEP 368: Standard image protocol and class
In-Reply-To: <740c3aec0707010534j4049efbchb2389bf61413c300@mail.gmail.com>
References: <cc93256f0706301518kd9fe7a7iaf0e9bd8e2e18edd@mail.gmail.com>	<cc93256f0706301800m20012379n84aff4ff3df88021@mail.gmail.com>
	<740c3aec0707010534j4049efbchb2389bf61413c300@mail.gmail.com>
Message-ID: <f68j4d$o30$1@sea.gmane.org>

BJörn Lindqvist wrote:
> Cool PEP! I really love the API for the Image class. A standard Image
> class would be a useful addition to the standard library.
> 
> But I cannot see how it would solve the problem with to many image
> classes. The reason why PIL, PyGame and wxPython has different image
> classes is because each of them use different C functions for
> manipulating said image classes. These differences bubble up through
> the bindings and results in PIL exposing an Image, PyGame a Surface
> and wxPython a wxImage. The result is that if you want to use a PIL
> Image in say PyGame, you  still need to convert it. If PIL stores RGB
> images with 32 bpp and PyGame uses 24, then you'll have to convert it
> to get it into the proper format.
> 
> The only way to get compatibility between the libraries is to create
> an image library in C _and_ get those libraries to start using it.
> 

They'll all quack the same way. (This is paraphrased in the PEP's 
abstract, as far as I read it.)


From martin at v.loewis.de  Sun Jul  1 18:55:42 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sun, 01 Jul 2007 18:55:42 +0200
Subject: [Python-3000] PEP 368: Standard image protocol and class
In-Reply-To: <f68j4d$o30$1@sea.gmane.org>
References: <cc93256f0706301518kd9fe7a7iaf0e9bd8e2e18edd@mail.gmail.com>	<cc93256f0706301800m20012379n84aff4ff3df88021@mail.gmail.com>	<740c3aec0707010534j4049efbchb2389bf61413c300@mail.gmail.com>
	<f68j4d$o30$1@sea.gmane.org>
Message-ID: <4687DC8E.6010109@v.loewis.de>

>> The only way to get compatibility between the libraries is to create
>> an image library in C _and_ get those libraries to start using it.
>>
> 
> They'll all quack the same way. (This is paraphrased in the PEP's 
> abstract, as far as I read it.)

To the Python side, yes. But to the underlying C library, some
quack, some bark.

How would you pass a Tkinter.PhotoImage to wxPython if both
supported the PEP? wxPython would likely be able to produce
objects that provide the Image interface, but I can't see how
wxPython could consume such a thing - the underlying C libraries
surely expect something completely different.

The only way I can see this work is if each library imports
Image objects by copying them, pixel for pixel, through this
interface.

Regards,
Martin

From l.mastrodomenico at gmail.com  Sun Jul  1 18:59:09 2007
From: l.mastrodomenico at gmail.com (Lino Mastrodomenico)
Date: Sun, 1 Jul 2007 18:59:09 +0200
Subject: [Python-3000] PEP 368: Standard image protocol and class
In-Reply-To: <740c3aec0707010534j4049efbchb2389bf61413c300@mail.gmail.com>
References: <cc93256f0706301518kd9fe7a7iaf0e9bd8e2e18edd@mail.gmail.com>
	<cc93256f0706301800m20012379n84aff4ff3df88021@mail.gmail.com>
	<740c3aec0707010534j4049efbchb2389bf61413c300@mail.gmail.com>
Message-ID: <cc93256f0707010959o44c77912sb989c68cf890b846@mail.gmail.com>

2007/7/1, BJörn Lindqvist <bjourne at gmail.com>:
> But I cannot see how it would solve the problem with to many image
> classes. The reason why PIL, PyGame and wxPython has different image
> classes is because each of them use different C functions for
> manipulating said image classes. These differences bubble up through
> the bindings and results in PIL exposing an Image, PyGame a Surface
> and wxPython a wxImage. The result is that if you want to use a PIL
> Image in say PyGame, you  still need to convert it.

Actually, this is not always true. :-)

For example it's entirely possible to have the *same* python RGBA
image considered as a SDL_Surface by SDL (the underlying library used
by pygame), as an ImagingMemoryInstance by the PIL C library and have
its buffer directly accepted by the OpenGL function glTexImage2D (with
a bit of care in the order of the corners passed to glTexCoord2f),
regardless of who created the image in the first place.

This works because most C/C++ libraries give the possibility of
creating a native image struct/class using an existing memory buffer
(without copying it) and they support at least a subset of the modes
currently defined, with the exact byte order, padding, etc, specified
in the PEP (usually L and at least one of RGB or RGBA).
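
To make the "same buffer, several consumers" idea concrete, here is a stdlib-only sketch.  It does not use any of the libraries named above: two ``memoryview`` objects stand in for two libraries wrapping one buffer, and the packed 3-bytes-per-pixel RGB layout is an assumption of the example:

```python
BYTES_PER_PIXEL = 3  # packed RGB with no padding (assumed layout)


def pixel_offset(width, x, y):
    """Byte offset of pixel (x, y) in a row-major packed RGB buffer."""
    return (y * width + x) * BYTES_PER_PIXEL


width, height = 4, 2
shared = bytearray(width * height * BYTES_PER_PIXEL)

# Each "library" wraps the existing memory; neither makes a copy.
library_a_view = memoryview(shared)
library_b_view = memoryview(shared)

# A write through one view...
start = pixel_offset(width, 1, 1)
library_a_view[start:start + BYTES_PER_PIXEL] = bytes((255, 128, 0))

# ...is immediately visible through the other.
pixel_seen_by_b = tuple(library_b_view[start:start + BYTES_PER_PIXEL])
```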

But you are right, the particular format specified in the PEP is not
always supported by the existing libraries, even when they support
that particular mode. Sometimes this can be fixed (e.g. PIL currently
uses by default 4 bytes per pixel for RGB images and has only
experimental support for 3 bytes per pixel, but its C library is
written by the same people that maintain the Python bindings, so they
can change it if they want) and sometimes it cannot be easily fixed
(e.g. a wxImage class will happily accept an RGB buffer as defined by
the PEP, but it has a funny memory arrangement for RGBA images that is
completely incompatible).

So I expect that each Python library that jumps on the PEP bandwagon
will have three levels of support for the modes listed:

  1) no support at all (e.g. most 3D libraries will probably never
accept CMYK images as textures); the user can explicitly convert the
image using "new_image = Image(new_mode, source=old_image)";

  2) limited support: they support a particular mode, but cannot
directly use the standard memory arrangement, so when they receive an
alien image object they convert it on the fly to their preferred byte
order and they do the reverse operation when a foreign library tries
to access the buffer property of their images (they may offer a
read-only buffer); this is not ideal, but it's better than the current
situation because it's transparent to the user and it requires only a
single memory copy/conversion instead of the two usually performed by
the current tostring/fromstring dance;

  3) full support: no conversion or memory copy ever necessary for the
exchange of images between two libraries if they both have full
support for a particular mode. Of course the Image class that I'm
writing and that I hope will be included in the stdlib, will have full
support for all the modes.
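
As a rough idea of what the on-the-fly conversion in "2)" could look like, here is a sketch that repacks 3-bytes-per-pixel RGB data into a 4-bytes-per-pixel layout and back.  The function names, the R, G, B, pad byte order and the 0xFF padding value are assumptions made for the example, not the actual layout of any particular library:

```python
def rgb_to_rgbx(rgb, pad=0xFF):
    """Repack 3-bytes-per-pixel RGB into 4 bytes per pixel (one copy)."""
    data = bytes(rgb)
    if len(data) % 3:
        raise ValueError("RGB data length must be a multiple of 3")
    pixels = len(data) // 3
    out = bytearray(pixels * 4)
    out[0::4] = data[0::3]            # red
    out[1::4] = data[1::3]            # green
    out[2::4] = data[2::3]            # blue
    out[3::4] = bytes([pad]) * pixels  # padding byte
    return out


def rgbx_to_rgb(rgbx):
    """Reverse operation: drop the fourth byte of every pixel."""
    data = bytes(rgbx)
    if len(data) % 4:
        raise ValueError("RGBX data length must be a multiple of 4")
    out = bytearray(len(data) // 4 * 3)
    out[0::3] = data[0::4]
    out[1::3] = data[1::4]
    out[2::3] = data[2::4]
    return out


standard = bytes([1, 2, 3, 4, 5, 6])   # two RGB pixels
native = rgb_to_rgbx(standard)         # what the library keeps internally
round_trip = rgbx_to_rgb(native)       # what its buffer property exposes
```

Each direction is a single pass over the data, which is the "single memory copy/conversion" mentioned above.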

Please note that the conversions in "2)" above can be avoided in some
(most?) cases if PEP 3118 is accepted, because it will become possible
to expose and discover the "native" memory arrangement of an image
without accessing its buffer property (that, in my vision, will always
offer the "standard" arrangement defined in the PEP, to simplify
things for libraries that prefer a simpler interface, even if it may
be slightly less efficient in some, hopefully rare, cases).

-- 
Lino Mastrodomenico
E-mail: l.mastrodomenico at gmail.com

From alexandre at peadrop.com  Mon Jul  2 19:46:28 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Mon, 2 Jul 2007 13:46:28 -0400
Subject: [Python-3000] StringIO/BytesIO in io.py doesn't over-seek
	properly
In-Reply-To: <acd65fa20706280737n54b8dea8l5362b8545c990236@mail.gmail.com>
References: <acd65fa20706230853w32f8895g91b7715c456900b7@mail.gmail.com>
	<ca471dc20706231052x561e7acfpf84373ea670c2974@mail.gmail.com>
	<acd65fa20706231124q4e5d5192kdc5694d52175e660@mail.gmail.com>
	<ca471dc20706231148p7cbb9953tb31099dfe68c9a32@mail.gmail.com>
	<acd65fa20706251114u60bae701ve95a84ffee27e0b2@mail.gmail.com>
	<acd65fa20706280737n54b8dea8l5362b8545c990236@mail.gmail.com>
Message-ID: <acd65fa20707021046o4349aafdxd7b895f502edd32@mail.gmail.com>

If StringIO is not allowed to over-seek, what should happen to the
current file position when it is truncated?

   >>> s = StringIO("Hello world!")
   >>> s.seek(0, 2)
   >>> s.truncate(2)
   >>> s.tell()
   ???

Truncating can either set the position to the new string size or
leave it alone.

-- Alexandre

From guido at python.org  Mon Jul  2 20:38:54 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 2 Jul 2007 11:38:54 -0700
Subject: [Python-3000] StringIO/BytesIO in io.py doesn't over-seek
	properly
In-Reply-To: <acd65fa20707021046o4349aafdxd7b895f502edd32@mail.gmail.com>
References: <acd65fa20706230853w32f8895g91b7715c456900b7@mail.gmail.com>
	<ca471dc20706231052x561e7acfpf84373ea670c2974@mail.gmail.com>
	<acd65fa20706231124q4e5d5192kdc5694d52175e660@mail.gmail.com>
	<ca471dc20706231148p7cbb9953tb31099dfe68c9a32@mail.gmail.com>
	<acd65fa20706251114u60bae701ve95a84ffee27e0b2@mail.gmail.com>
	<acd65fa20706280737n54b8dea8l5362b8545c990236@mail.gmail.com>
	<acd65fa20707021046o4349aafdxd7b895f502edd32@mail.gmail.com>
Message-ID: <ca471dc20707021138o3392bc11u9a9be3f1a6f4dda1@mail.gmail.com>

Honestly, I think truncate() should always set the current position to
the new size, even though that's not what it currently does. Or at
least it should set it to the new size if that's less than the current
position. What's the rationale (apart from "Unix defined it so") why
it currently leaves the position unchanged?

At least I think it's fine if StringIO does it this way. I think
TextIOWrapper should also do it this way, as it has the same issue
(writing null bytes is not defined for encoded files).

--Guido

On 7/2/07, Alexandre Vassalotti <alexandre at peadrop.com> wrote:
> If StringIO is not allowed to over-seek, what should happen to the
> current file position when it is truncated?
>
>    >>> s = StringIO("Hello world!")
>    >>> s.seek(0, 2)
>    >>> s.truncate(2)
>    >>> s.tell()
>    ???
>
> Truncating can either set the position to the new string size, or it
> leaves it alone.
>
> -- Alexandre


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From alexandre at peadrop.com  Mon Jul  2 20:59:28 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Mon, 2 Jul 2007 14:59:28 -0400
Subject: [Python-3000] StringIO/BytesIO in io.py doesn't over-seek
	properly
In-Reply-To: <ca471dc20707021138o3392bc11u9a9be3f1a6f4dda1@mail.gmail.com>
References: <acd65fa20706230853w32f8895g91b7715c456900b7@mail.gmail.com>
	<ca471dc20706231052x561e7acfpf84373ea670c2974@mail.gmail.com>
	<acd65fa20706231124q4e5d5192kdc5694d52175e660@mail.gmail.com>
	<ca471dc20706231148p7cbb9953tb31099dfe68c9a32@mail.gmail.com>
	<acd65fa20706251114u60bae701ve95a84ffee27e0b2@mail.gmail.com>
	<acd65fa20706280737n54b8dea8l5362b8545c990236@mail.gmail.com>
	<acd65fa20707021046o4349aafdxd7b895f502edd32@mail.gmail.com>
	<ca471dc20707021138o3392bc11u9a9be3f1a6f4dda1@mail.gmail.com>
Message-ID: <acd65fa20707021159p4291451ycebaf3c6ac51a438@mail.gmail.com>

On 7/2/07, Guido van Rossum <guido at python.org> wrote:
> Honestly, I think truncate() should always set the current position to
> the new size, even though that's not what it currently does. Or at
> least it should set it to the new size if that's less than the current
> position. What's the rationale (apart from "Unix defined it so") why
> it currently leaves the position unchanged?

No idea. I just know that truncate in the old StringIO module does set
the position to the new size if the new size is less than the current
position. And that is how I implemented it in _bytes_io and
_string_io.
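
For reference, the behaviour described above boils down to something like the following toy class.  This is a sketch only, not the code in _bytes_io or _string_io: the name ``TruncatingBuffer`` is made up, and clamping in ``seek()`` is my reading of "not allowed to over-seek":

```python
class TruncatingBuffer:
    """Toy in-memory text buffer; the position never exceeds the size."""

    def __init__(self, initial=""):
        self._data = initial
        self._pos = 0

    def getvalue(self):
        return self._data

    def tell(self):
        return self._pos

    def seek(self, offset, whence=0):
        if whence == 0:
            pos = offset
        elif whence == 1:
            pos = self._pos + offset
        elif whence == 2:
            pos = len(self._data) + offset
        else:
            raise ValueError("invalid whence: %r" % (whence,))
        if pos < 0:
            raise ValueError("negative seek position")
        # No over-seeking: clamp to the end of the data (assumption).
        self._pos = min(pos, len(self._data))
        return self._pos

    def truncate(self, size=None):
        if size is None:
            size = self._pos
        if size < 0:
            raise ValueError("negative size")
        self._data = self._data[:size]
        # Move the position back only if it now lies past the end.
        self._pos = min(self._pos, len(self._data))
        return len(self._data)


s = TruncatingBuffer("Hello world!")
s.seek(0, 2)
s.truncate(2)
position_after_truncate = s.tell()
```

With this rule the position is left untouched whenever it is still inside the truncated data.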

From rasky at develer.com  Tue Jul  3 00:51:41 2007
From: rasky at develer.com (Giovanni Bajo)
Date: Tue, 03 Jul 2007 00:51:41 +0200
Subject: [Python-3000] Announcing PEP 3136
In-Reply-To: <20070630205444.GD22221@theory.org>
References: <20070630205444.GD22221@theory.org>
Message-ID: <f6bvhu$9l3$1@sea.gmane.org>

On 30/06/2007 22.54, Matt Chisholm wrote:

> I've created and submitted a new PEP proposing support for labels in
> Python's break and continue statements.  Georg Brandl has graciously
> added it to the PEP list as PEP 3136:
> 
> http://www.python.org/dev/peps/pep-3136/
> 
> I understand that the deadline for submitting features for Python 3.0
> has passed, so this PEP targets Python 3.1.  I also expect that people
> might not want to take time off from the Python 3.0 effort to discuss
> features that are even further off in the future.
> 
> Thanks for your time, and thanks for letting me contribute an idea to
> Python.

I didn't see one simple alternative listed: move everything within a function:

def func():
    for a in a_list:
        for b in b_list:
            if condition1(a, b):
                return
            [...]
            if condition2(a, b):
                break

func()
-- 
Giovanni Bajo


From greg.ewing at canterbury.ac.nz  Tue Jul  3 01:35:25 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 03 Jul 2007 11:35:25 +1200
Subject: [Python-3000] Announcing PEP 3136
In-Reply-To: <f6bvhu$9l3$1@sea.gmane.org>
References: <20070630205444.GD22221@theory.org> <f6bvhu$9l3$1@sea.gmane.org>
Message-ID: <46898BBD.1040901@canterbury.ac.nz>

On 30/06/2007 22.54, Matt Chisholm wrote:

> I've created and submitted a new PEP proposing support for labels in
> Python's break and continue statements.
>
> http://www.python.org/dev/peps/pep-3136/

-1. Confusing nested loops are best broken out into
separate functions rather than patching over the
problem with features like this.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From ntoronto at cs.byu.edu  Tue Jul  3 09:17:05 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Tue, 03 Jul 2007 01:17:05 -0600
Subject: [Python-3000] Announcing PEP 3136
In-Reply-To: <46898BBD.1040901@canterbury.ac.nz>
References: <20070630205444.GD22221@theory.org> <f6bvhu$9l3$1@sea.gmane.org>
	<46898BBD.1040901@canterbury.ac.nz>
Message-ID: <4689F7F1.1070503@cs.byu.edu>

Greg Ewing wrote:
> On 30/06/2007 22.54, Matt Chisholm wrote:
>
>> I've created and submitted a new PEP proposing support for labels in
>> Python's break and continue statements.
>>
>> http://www.python.org/dev/peps/pep-3136/
>
> -1. Confusing nested loops are best broken out into
> separate functions rather than patching over the
> problem with features like this.

+1 (not that my vote really counts for much). Breaking logic out into 
separate functions can obscure the meaning of an algorithm that is most 
naturally implemented with nested loops.

Neil


From edin.salkovic at gmail.com  Tue Jul  3 10:11:59 2007
From: edin.salkovic at gmail.com (Edin Salkovic)
Date: Tue, 3 Jul 2007 10:11:59 +0200
Subject: [Python-3000] PEP 368: Standard image protocol and class
In-Reply-To: <cc93256f0706301800m20012379n84aff4ff3df88021@mail.gmail.com>
References: <cc93256f0706301518kd9fe7a7iaf0e9bd8e2e18edd@mail.gmail.com>
	<cc93256f0706301800m20012379n84aff4ff3df88021@mail.gmail.com>
Message-ID: <63eb7fa90707030111r2ab33606xb2f76269e9e80b1f@mail.gmail.com>

Hi Lino,

On 7/1/07, Lino Mastrodomenico <l.mastrodomenico at gmail.com> wrote:
> ``__getitem__(integer) -> line``
>
>     Returns the line at the specified (y) position.

Just some ideas to think about.

1) Have you considered adding a separate lines property to the Image protocol?

2) Does one, by default, want to iterate over lines or over pixels of
an image?  Even your example iterates  over pixels:

   # iterate over an image
   for line in rgb_image:
       for pixel in line:
           # swap red and blue, and set green to 0
           pixel.value = pixel.b, 0, pixel.r

why not just:
   # iterate over an image
   for pixel in rgb_image:
           pixel.value = pixel.b, 0, pixel.r

3) The pixels method (same for the possible lines property that I
mentioned above) should probably be a property, i.e.:
pixels -> iterator[pixel], not: pixels() -> iterator[pixel]

P.S.: You might also inform the SciPy/NumPy lists about the PEP.

Keep up the good work!,
Edin

From guido at python.org  Tue Jul  3 10:14:17 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 3 Jul 2007 10:14:17 +0200
Subject: [Python-3000] Announcing PEP 3136
In-Reply-To: <20070630205444.GD22221@theory.org>
References: <20070630205444.GD22221@theory.org>
Message-ID: <ca471dc20707030114m7fa2d74btb21c8bd1ae8023db@mail.gmail.com>

On 6/30/07, Matt Chisholm <matt-python at theory.org> wrote:
> I've created and submitted a new PEP proposing support for labels in
> Python's break and continue statements.  Georg Brandl has graciously
> added it to the PEP list as PEP 3136:
>
> http://www.python.org/dev/peps/pep-3136/

I think this is a good summary of various proposals that have been
floated in the past, plus some new ones. As a PEP, it falls short
because it doesn't pick a solution but merely offers a large menu of
possible options. Also, there is nothing about implementation yet.

However, I'm rejecting it on the basis that code so complicated as to
require this feature is very rare. In most cases there are existing
work-arounds that produce clean code, for example using 'return'.
While I'm sure there are some (rare) real cases where clarity of the
code would suffer from a refactoring that makes it possible to use
return, this is offset by two issues:

1. The complexity added to the language, permanently. This affects not
only all Python implementations, but also every source analysis tool,
plus of course all documentation for the language.

2. My expectation that the feature will be abused more than it will be
used right, leading to a net decrease in code clarity (measured across
all Python code written henceforth). Lazy programmers are everywhere,
and before you know it you have an incredible mess on your hands of
unintelligible code.

I realize this is a heavy bar to pass, and somewhat subjective. That's
okay. There is real value in having a small language. Also, as I said,
while there are no past PEPs to document it, this has been brought up
and rejected many times before.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From rasky at develer.com  Tue Jul  3 10:27:03 2007
From: rasky at develer.com (Giovanni Bajo)
Date: Tue, 03 Jul 2007 10:27:03 +0200
Subject: [Python-3000] Announcing PEP 3136
In-Reply-To: <4689F7F1.1070503@cs.byu.edu>
References: <20070630205444.GD22221@theory.org>
	<f6bvhu$9l3$1@sea.gmane.org>	<46898BBD.1040901@canterbury.ac.nz>
	<4689F7F1.1070503@cs.byu.edu>
Message-ID: <f6d18n$ag$1@sea.gmane.org>

On 03/07/2007 9.17, Neil Toronto wrote:
> Greg Ewing wrote:
>> On 30/06/2007 22.54, Matt Chisholm wrote:
>>
>>> I've created and submitted a new PEP proposing support for labels in
>>> Python's break and continue statements.
>>>
>>> http://www.python.org/dev/peps/pep-3136/
>>
>> -1. Confusing nested loops are best broken out into
>> separate functions rather than patching over the
>> problem with features like this.
>
> +1 (not that my vote really counts for much). Breaking logic out into 
> separate functions can obscure the meaning of an algorithm that is most 
> naturally implemented with nested loops.

Do you have a concrete, real-world example?
-- 
Giovanni Bajo


From ntoronto at cs.byu.edu  Tue Jul  3 11:42:09 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Tue, 03 Jul 2007 03:42:09 -0600
Subject: [Python-3000] Announcing PEP 3136
In-Reply-To: <f6d18n$ag$1@sea.gmane.org>
References: <20070630205444.GD22221@theory.org>	<f6bvhu$9l3$1@sea.gmane.org>	<46898BBD.1040901@canterbury.ac.nz>	<4689F7F1.1070503@cs.byu.edu>
	<f6d18n$ag$1@sea.gmane.org>
Message-ID: <468A19F1.7070302@cs.byu.edu>

Giovanni Bajo wrote:
> On 03/07/2007 9.17, Neil Toronto wrote:
>> Greg Ewing wrote:
>>> On 30/06/2007 22.54, Matt Chisholm wrote:
>>>
>>>> I've created and submitted a new PEP proposing support for labels in
>>>> Python's break and continue statements.
>>>>
>>>> http://www.python.org/dev/peps/pep-3136/
>>>
>>> -1. Confusing nested loops are best broken out into
>>> separate functions rather than patching over the
>>> problem with features like this.
>>
>> +1 (not that my vote really counts for much). Breaking logic out into 
>> separate functions can obscure the meaning of an algorithm that is most 
>> naturally implemented with nested loops.
>
> Do you have a concrete, real-world example?

You pragmatists and your concrete, real-world examples. :p

Anyway, sure: image processing -> binary morphological operators -> 
erode. It's a four-deep nested loop. You pass a binary bitmask (kernel) 
over a binary image, centering it on each pixel. If one bit in the image 
is off that's on in the kernel, you turn off the center pixel in the 
destination. This is the obvious break - it only takes one, so it's 
senseless to keep going in the inner two loops.

Moving the innermost two loops into a new function makes the flow of the 
algorithm less linear and therefore less clear. (Also, the function 
would never be called from anywhere else. How about an inner function? 
That's worse for understandability, IMNSHO.) Other ways of avoiding the 
inner break, such as counting hits, or overwriting the center pixel 
repeatedly, obscure the meaning of the morphological operator.
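
For anyone who wants to see the shape of the thing, here is a rough Python rendering of the erode loop in the "inner two loops moved into a function" form, where ``return`` does the job of the labeled break.  It is a sketch, not the Java code mentioned in this message; images and kernels are plain lists of 0/1 rows, and treating pixels outside the image as off is an assumption:

```python
def kernel_fits(image, kernel, cx, cy):
    """True if every 'on' kernel bit lands on an 'on' image pixel."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    height, width = len(image), len(image[0])
    for ky in range(kernel_h):
        for kx in range(kernel_w):
            if not kernel[ky][kx]:
                continue
            # Kernel is centered on (cx, cy).
            y = cy + ky - kernel_h // 2
            x = cx + kx - kernel_w // 2
            inside = 0 <= y < height and 0 <= x < width
            if not inside or not image[y][x]:
                # One miss is enough: leave both inner loops at once.
                return False
    return True


def erode(image, kernel):
    """Binary erosion of `image` by `kernel`; returns a new image."""
    result = []
    for y in range(len(image)):
        row = []
        for x in range(len(image[0])):
            row.append(1 if kernel_fits(image, kernel, x, y) else 0)
        result.append(row)
    return result


image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
kernel = [[1, 1, 1]] * 3
eroded = erode(image, kernel)
```

Whether the helper reads better or worse than a single four-deep loop with a labeled break is exactly the matter of taste under discussion.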

Granted, Python doesn't usually get used for low-level stuff like this, 
and I'd probably use Numpy array operations in the place of the inner 
two loops, which would be less efficient, but faster. But you were 
asking whether algorithms that are naturally expressed as nested loops 
with breaks exist, and this just happened to be on my hard drive, 
written in Java.

FWIW, I've read Guido's recent rejection of this PEP, but I wanted to 
take up the challenge of showing that these (admittedly rare) use cases 
do exist. A lot of them come from 2D analogues of algorithms that call 
for a break from an inner loop.

Neil


From tomerfiliba at gmail.com  Tue Jul  3 12:59:51 2007
From: tomerfiliba at gmail.com (tomer filiba)
Date: Tue, 3 Jul 2007 12:59:51 +0200
Subject: [Python-3000] the do-while pep
Message-ID: <1d85506f0707030359w45bd864cn2007e67459df18cc@mail.gmail.com>

i haven't seen this issue discussed at all, so i thought i'd bring it up --
what's the status of the pep 315 (do-while syntax)? is it getting into py3k?


-tomer

From guido at python.org  Tue Jul  3 13:42:14 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 3 Jul 2007 13:42:14 +0200
Subject: [Python-3000] the do-while pep
In-Reply-To: <1d85506f0707030359w45bd864cn2007e67459df18cc@mail.gmail.com>
References: <1d85506f0707030359w45bd864cn2007e67459df18cc@mail.gmail.com>
Message-ID: <ca471dc20707030442l30621106v8aaef7281b57b5b1@mail.gmail.com>

On 7/3/07, tomer filiba <tomerfiliba at gmail.com> wrote:
> i haven't seen this issue discussed at all, so i thought i'd bring it up --
> what's the status of the pep 315 (do-while syntax)? is it getting into py3k?

No, it wasn't even considered. It was in the deferred list and nobody
suggested we look at it for Py3k. From the message quoted in the
deferral note it doesn't look like it's an easy sell.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From john at yates-sheets.org  Tue Jul  3 14:32:19 2007
From: john at yates-sheets.org (John S. Yates, Jr.)
Date: Tue, 03 Jul 2007 08:32:19 -0400
Subject: [Python-3000] Announcing PEP 3136
In-Reply-To: <ca471dc20707030114m7fa2d74btb21c8bd1ae8023db@mail.gmail.com>
References: <20070630205444.GD22221@theory.org>
	<ca471dc20707030114m7fa2d74btb21c8bd1ae8023db@mail.gmail.com>
Message-ID: <j1gk83hfeltf7uqi0u4b5tpgbg80qg3cp0@4ax.com>

On Tue, 3 Jul 2007, "Guido van Rossum" <guido at python.org> wrote:

>However, I'm rejecting it on the basis that code so complicated to
>require this feature is very rare.

I assume that you are familiar with Donald E. Knuth's classic paper:

  "Structured Programming with go to Statements"

  http://pplab.snu.ac.kr/courses/adv_pl05/papers/p261-knuth.pdf

/john


From alexandre at peadrop.com  Tue Jul  3 17:06:20 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Tue, 3 Jul 2007 11:06:20 -0400
Subject: [Python-3000] StringIO/BytesIO in io.py doesn't over-seek
	properly
In-Reply-To: <ca471dc20707021138o3392bc11u9a9be3f1a6f4dda1@mail.gmail.com>
References: <acd65fa20706230853w32f8895g91b7715c456900b7@mail.gmail.com>
	<ca471dc20706231052x561e7acfpf84373ea670c2974@mail.gmail.com>
	<acd65fa20706231124q4e5d5192kdc5694d52175e660@mail.gmail.com>
	<ca471dc20706231148p7cbb9953tb31099dfe68c9a32@mail.gmail.com>
	<acd65fa20706251114u60bae701ve95a84ffee27e0b2@mail.gmail.com>
	<acd65fa20706280737n54b8dea8l5362b8545c990236@mail.gmail.com>
	<acd65fa20707021046o4349aafdxd7b895f502edd32@mail.gmail.com>
	<ca471dc20707021138o3392bc11u9a9be3f1a6f4dda1@mail.gmail.com>
Message-ID: <acd65fa20707030806i60b0e77dm71b394f279e2172c@mail.gmail.com>

On 7/2/07, Guido van Rossum <guido at python.org> wrote:
> Honestly, I think truncate() should always set the current position to
> the new size, even though that's not what it currently does.

Thought about that and I think that would be the best thing to do.
That would avoid making StringIO unnecessarily different from BytesIO.
And IMHO, it is less prone to bugs. If someone wants to truncate while
keeping the current position, then he will have to state his intention
explicitly by saving the value of tell() and calling seek() after
truncating.

I also find that the semantics make more sense. For example:

   >>> s = StringIO("Good bye, world")
   >>> s.truncate(10)
   >>> s.write("cruel world")
   >>> s.getvalue()
   ???

I think that should return "Good bye, cruel world", not "cruel world".
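
To make the two behaviours concrete, here is a minimal sketch using
io.StringIO. The helper names truncate_and_move and
truncate_keep_position are made up for illustration: the first emulates
the proposed semantics, the second is the explicit tell()/seek() route
for anyone who wants the old position back. Both helpers seek
explicitly, so they should not depend on what truncate() itself does
with the position.

```python
import io

def truncate_and_move(stream, size):
    """Proposed semantics: truncate, then leave the position at the new size."""
    stream.truncate(size)
    stream.seek(size)
    return size

def truncate_keep_position(stream, size):
    """Explicit alternative: truncate, then restore the previous position."""
    pos = stream.tell()
    stream.truncate(size)
    stream.seek(pos)
    return pos

# Position ends up at 10, so the write appends to "Good bye, ".
s = io.StringIO("Good bye, world")
truncate_and_move(s, 10)
s.write("cruel world")
moved = s.getvalue()

# Position stays at 0, so the write overwrites the truncated text.
t = io.StringIO("Good bye, world")
truncate_keep_position(t, 10)
t.write("cruel world")
kept = t.getvalue()
```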

So, does anyone else agree with this small semantic change of truncate()?

-- Alexandre

From p.f.moore at gmail.com  Tue Jul  3 17:13:51 2007
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 3 Jul 2007 16:13:51 +0100
Subject: [Python-3000] StringIO/BytesIO in io.py doesn't over-seek
	properly
In-Reply-To: <acd65fa20707030806i60b0e77dm71b394f279e2172c@mail.gmail.com>
References: <acd65fa20706230853w32f8895g91b7715c456900b7@mail.gmail.com>
	<ca471dc20706231052x561e7acfpf84373ea670c2974@mail.gmail.com>
	<acd65fa20706231124q4e5d5192kdc5694d52175e660@mail.gmail.com>
	<ca471dc20706231148p7cbb9953tb31099dfe68c9a32@mail.gmail.com>
	<acd65fa20706251114u60bae701ve95a84ffee27e0b2@mail.gmail.com>
	<acd65fa20706280737n54b8dea8l5362b8545c990236@mail.gmail.com>
	<acd65fa20707021046o4349aafdxd7b895f502edd32@mail.gmail.com>
	<ca471dc20707021138o3392bc11u9a9be3f1a6f4dda1@mail.gmail.com>
	<acd65fa20707030806i60b0e77dm71b394f279e2172c@mail.gmail.com>
Message-ID: <79990c6b0707030813o38b36960m7a6469722fd05444@mail.gmail.com>

On 03/07/07, Alexandre Vassalotti <alexandre at peadrop.com> wrote:
> I also find that the semantics make more sense. For example:
>
>   >>> s = StringIO("Good bye, world")
>   >>> s.truncate(10)
>   >>> s.write("cruel world")
>   >>> s.getvalue()
>   ???
>
> I think that should return "Good bye, cruel world", not "cruel world".
>
> So, does anyone else agree with this small semantic change of truncate()?

Looks reasonable to me - without checking documentation, your proposal
is what I'd expect the example to do.

Paul.

From tjreedy at udel.edu  Wed Jul  4 01:40:14 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 3 Jul 2007 19:40:14 -0400
Subject: [Python-3000] Announcing PEP 3136
References: <20070630205444.GD22221@theory.org><ca471dc20707030114m7fa2d74btb21c8bd1ae8023db@mail.gmail.com>
	<j1gk83hfeltf7uqi0u4b5tpgbg80qg3cp0@4ax.com>
Message-ID: <f6emou$a8n$1@sea.gmane.org>


"John S. Yates, Jr." <john at yates-sheets.org> wrote in message 
news:j1gk83hfeltf7uqi0u4b5tpgbg80qg3cp0 at 4ax.com...
| On Tue, 3 Jul 2007, "Guido van Rossum" <guido at python.org> wrote:
|
| >However, I'm rejecting it on the basis that code so complicated to
| >require this feature is very rare.
|
| I assume that you are familiar with Donald E. Knuth's classic paper:
|  "Structured Programming with go to Statements"
|  http://pplab.snu.ac.kr/courses/adv_pl05/papers/p261-knuth.pdf

Do you consider this  to be for or against the PEP?  Rereading it....

At least half of Knuth's goto examples are covered by Python's single level 
restricted gotos:

Example 1 (switched to 0-based arrays, not tested):

for i in range(m):
    if A[i] == x: break
else:
    i = m  # not found: the new entry lives at index m
    A[m] = x
    B[m] = 0
    m += 1
B[i] += 1

Example 5 (ditto):

i = 0 #? initial value not given
while True:
    if A[i] < x:
        if L[i] != 0:
            i = L[i]; continue
        else:
            L[i] = j; break
    else: # > x
        if R[i] != 0:
            i = R[i]; continue
        else:
            R[i] = j; break
    # dup code could be factored with LR = L or R as A[i] < or > x
A[j] = x
L[j] = R[j] = 0
j += 1

The rest are general gotos, including jumps into the middle of loops.
None are multilevel continues or breaks.
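
A runnable rendering of Example 1, offered as a sketch rather than as
Knuth's code: it assumes A and B are Python lists of exactly length m
that grow on insert (Knuth's arrays are preallocated), and count_key is
a name invented here. On the not-found path i is set to m explicitly,
so the final increment lands on the newly added slot.

```python
def count_key(A, B, m, x):
    """Search A[0:m] for x, append it if absent, and bump its counter in B.

    Returns the (possibly incremented) number of entries.
    """
    for i in range(m):
        if A[i] == x:
            break          # found: i indexes the existing entry
    else:
        i = m              # not found: the new entry goes at index m
        A.append(x)
        B.append(0)
        m += 1
    B[i] += 1
    return m

A, B, m = [], [], 0
for key in "abac":
    m = count_key(A, B, m, key)
```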

tjr


From john at yates-sheets.org  Wed Jul  4 15:41:48 2007
From: john at yates-sheets.org (John S. Yates, Jr.)
Date: Wed, 04 Jul 2007 09:41:48 -0400
Subject: [Python-3000] Announcing PEP 3136
In-Reply-To: <f6emou$a8n$1@sea.gmane.org>
References: <20070630205444.GD22221@theory.org><ca471dc20707030114m7fa2d74btb21c8bd1ae8023db@mail.gmail.com>
	<j1gk83hfeltf7uqi0u4b5tpgbg80qg3cp0@4ax.com>
	<f6emou$a8n$1@sea.gmane.org>
Message-ID: <f97n835arna3kmn156k6avn3i63q4g611h@4ax.com>

On Tue, 3 Jul 2007, "Terry Reedy" <tjreedy at udel.edu> wrote:

>Do you consider this  to be for or against the PEP?
>Rereading it....
>
>At least half Knuth's goto examples are covered
>by Python's single level restricted gotos:

In all honesty I did not reread the paper.  I posted
based on the recollection that it was the basis for
my feeling no compunction about using reviled gotos
in my C / C++ code to effect multi-level exits and
continuations.  When called to task by my peers I
invoke Knuth's name.

Let's chalk it up to the fallibility of memory over
a span of more than 30 years.  Thanks for keeping me
honest.  I am printing out the paper.  Rereading it
should help me recall the state of programming when
I was first starting out.

/john


From turnbull at sk.tsukuba.ac.jp  Wed Jul  4 18:46:08 2007
From: turnbull at sk.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Thu, 05 Jul 2007 01:46:08 +0900
Subject: [Python-3000] Announcing PEP 3136
In-Reply-To: <f97n835arna3kmn156k6avn3i63q4g611h@4ax.com>
References: <20070630205444.GD22221@theory.org>
	<ca471dc20707030114m7fa2d74btb21c8bd1ae8023db@mail.gmail.com>
	<j1gk83hfeltf7uqi0u4b5tpgbg80qg3cp0@4ax.com>
	<f6emou$a8n$1@sea.gmane.org>
	<f97n835arna3kmn156k6avn3i63q4g611h@4ax.com>
Message-ID: <876450xgkv.fsf@uwakimon.sk.tsukuba.ac.jp>

John S. Yates, Jr. writes:

 > In all honesty I did not reread the paper.

Sir, you have my thanks for this small misstep, without which you
would have undoubtedly abstained from posting that URL, and I, in
turn, would have missed a chance to read that wonderful paper.


From collinw at gmail.com  Fri Jul  6 16:03:56 2007
From: collinw at gmail.com (Collin Winter)
Date: Fri, 6 Jul 2007 16:03:56 +0200
Subject: [Python-3000] Change to class construction?
Message-ID: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>

While experimenting with porting setuptools to py3k (as of r56155), I
ran into this situation:

class C:
  a = (4, 5)
  b = [c for c in range(2) if a]

results in a "NameError: global name 'a' is not defined" error, while

class C:
  a = (4, 5)
  b = [c for c in a]

works fine. This gives the same error as above:

class C:
  a = (4, 5)
  b = [a for c in range(2)]

Both now-erroneous snippets work in 2.5.1. Was this change intentional?

Collin Winter

From g.brandl at gmx.net  Fri Jul  6 17:00:08 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 06 Jul 2007 17:00:08 +0200
Subject: [Python-3000] Change to class construction?
In-Reply-To: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
Message-ID: <f6lld0$qv3$1@sea.gmane.org>

Collin Winter schrieb:
> While experimenting with porting setuptools to py3k (as of r56155), I
> ran into this situation:
> 
> class C:
>   a = (4, 5)
>   b = [c for c in range(2) if a]
> 
> results in a "NameError: global name 'a' is not defined" error, while
> 
> class C:
>   a = (4, 5)
>   b = [c for c in a]
> 
> works fine. This gives the same error as above:
> 
> class C:
>   a = (4, 5)
>   b = [a for c in range(2)]
> 
> Both now-erroneous snippets work in 2.5.1. Was this change intentional?

It is at least intentional in the sense that in 3k it works the same as with
genexps, which give the same errors in 2.5.

What's different is that all code inside a genexp except the first iterator
(which is why the second example works) is contained in its own function
namespace.

So, an equivalent problem is:

class C:
   foo = 1
   def bar(): print(foo)
   bar()
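
The three class bodies from the original report can be checked
mechanically. A sketch, assuming a Python 3 interpreter;
run_class_body is a hypothetical helper that execs the source in a
fresh namespace and returns either C.b or the string "NameError".

```python
def run_class_body(source):
    """Exec a class definition; return C.b, or "NameError" if it fails."""
    namespace = {}
    try:
        exec(source, namespace)
    except NameError:
        return "NameError"
    return namespace["C"].b

# 'a' used in the condition: looked up from inside the hidden function.
in_condition = run_class_body(
    "class C:\n"
    "    a = (4, 5)\n"
    "    b = [c for c in range(2) if a]\n"
)

# 'a' used as the first iterable: evaluated in the class body itself.
in_first_iterable = run_class_body(
    "class C:\n"
    "    a = (4, 5)\n"
    "    b = [c for c in a]\n"
)

# 'a' used in the result expression: again inside the hidden function.
in_result_expr = run_class_body(
    "class C:\n"
    "    a = (4, 5)\n"
    "    b = [a for c in range(2)]\n"
)
```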


Georg


-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From pje at telecommunity.com  Fri Jul  6 19:25:10 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 06 Jul 2007 13:25:10 -0400
Subject: [Python-3000] Change to class construction?
In-Reply-To: <f6lld0$qv3$1@sea.gmane.org>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
	<f6lld0$qv3$1@sea.gmane.org>
Message-ID: <20070706172258.1E65B3A4046@sparrow.telecommunity.com>

At 05:00 PM 7/6/2007 +0200, Georg Brandl wrote:
>Collin Winter schrieb:
> > While experimenting with porting setuptools to py3k (as of r56155), I
> > ran into this situation:
> >
> > class C:
> >   a = (4, 5)
> >   b = [c for c in range(2) if a]
> >
> > results in a "NameError: global name 'a' is not defined" error, while
> >
> > class C:
> >   a = (4, 5)
> >   b = [c for c in a]
> >
> > works fine. This gives the same error as above:
> >
> > class C:
> >   a = (4, 5)
> >   b = [a for c in range(2)]
> >
> > Both now-erroneous snippets work in 2.5.1. Was this change intentional?
>
>It is at least intentional in the sense that in 3k it works the same as with
>genexps, which give the same errors in 2.5.

This looks like a bug to me.  A list comprehension's local scope 
should be the locals of the enclosing code, even if its loop indexes 
aren't exposed to that scope.


From guido at python.org  Sat Jul  7 00:32:15 2007
From: guido at python.org (Guido van Rossum)
Date: Sat, 7 Jul 2007 00:32:15 +0200
Subject: [Python-3000] Change to class construction?
In-Reply-To: <20070706172258.1E65B3A4046@sparrow.telecommunity.com>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
	<f6lld0$qv3$1@sea.gmane.org>
	<20070706172258.1E65B3A4046@sparrow.telecommunity.com>
Message-ID: <ca471dc20707061532k5a636a49r57ab383dad81fae3@mail.gmail.com>

On 7/6/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 05:00 PM 7/6/2007 +0200, Georg Brandl wrote:
> >Collin Winter schrieb:
> > > While experimenting with porting setuptools to py3k (as of r56155), I
> > > ran into this situation:
> > >
> > > class C:
> > >   a = (4, 5)
> > >   b = [c for c in range(2) if a]
> > >
> > > results in a "NameError: global name 'a' is not defined" error, while
> > >
> > > class C:
> > >   a = (4, 5)
> > >   b = [c for c in a]
> > >
> > > works fine. This gives the same error as above:
> > >
> > > class C:
> > >   a = (4, 5)
> > >   b = [a for c in range(2)]
> > >
> > > Both now-erroneous snippets work in 2.5.1. Was this change intentional?
> >
> >It is at least intentional in the sense that in 3k it works the same as with
> >genexps, which give the same errors in 2.5.
>
> This looks like a bug to me.  A list comprehension's local scope
> should be the locals of the enclosing code, even if its loop indexes
> aren't exposed to that scope.

It's because the class scope is not made available to the methods.
That is intentional. Georg's later example is relevant:

class C:
  a = 1
  def f(self): print(a)   # <-- raises NameError for 'a'

This is in turn intentional so that too-clever kids don't develop a
habit of referencing class variables without prefixing them with self
or C.

The OP's use case is rare enough that I don't think we should do
anything about it.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Sat Jul  7 01:36:07 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 06 Jul 2007 19:36:07 -0400
Subject: [Python-3000] Change to class construction?
In-Reply-To: <ca471dc20707061532k5a636a49r57ab383dad81fae3@mail.gmail.com>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
	<f6lld0$qv3$1@sea.gmane.org>
	<20070706172258.1E65B3A4046@sparrow.telecommunity.com>
	<ca471dc20707061532k5a636a49r57ab383dad81fae3@mail.gmail.com>
Message-ID: <20070706233354.DA07A3A4046@sparrow.telecommunity.com>

At 12:32 AM 7/7/2007 +0200, Guido van Rossum wrote:
>On 7/6/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> > At 05:00 PM 7/6/2007 +0200, Georg Brandl wrote:
> > >Collin Winter schrieb:
> > > > While experimenting with porting setuptools to py3k (as of r56155), I
> > > > ran into this situation:
> > > >
> > > > class C:
> > > >   a = (4, 5)
> > > >   b = [c for c in range(2) if a]
> > > >
> > > > results in a "NameError: global name 'a' is not defined" error, while
> > > >
> > > > class C:
> > > >   a = (4, 5)
> > > >   b = [c for c in a]
> > > >
> > > > works fine.
> >
> > This looks like a bug to me.  A list comprehension's local scope
> > should be the locals of the enclosing code, even if its loop indexes
> > aren't exposed to that scope.
>
>It's because the class scope is not made available to the methods.

The examples are in the class body, not in methods.  The code is 
statically initializing the class contents, so using C.a isn't possible.

I suppose it can be worked around by moving the static initialization 
code outside the class body; it's just not obvious why it happens.
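
Two possible shapes for that workaround, sketched under the assumption
of a Python 3 interpreter. The class names C and D are placeholders.
The first moves the initialization after the class statement; the
second keeps it in the class body by passing the class-level value
into a function explicitly, so the comprehension sees it as an
ordinary argument.

```python
class C:
    a = (4, 5)

# Runs at module level, so the attribute is reached through C.
C.b = [c for c in range(2) if C.a]

class D:
    a = (4, 5)
    # The argument 'a' is evaluated in the class body; inside the lambda
    # it is a normal local that the comprehension can close over.
    b = (lambda a: [a for c in range(2)])(a)
```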

Collin, where did you find this code in setuptools, btw?  I've been 
looking around at other packages of mine where static class 
initialization uses data structures like this, and I haven't found 
any place where anything but the "in" clause of a comprehension 
depends on class-scope variables.  So, if setuptools is the only one 
of my libraries that does this, I'd have to agree with Guido that it 
is indeed quite rare.  :)

If I had to hazard a guess, I'd guess that it's in one of the 
setuptools command classes that subclasses a distutils command, and 
proceeds to muck around with the original options in some fashion.  I 
just don't want to check all of them if you know which one it is.  :)


From guido at python.org  Sat Jul  7 01:41:16 2007
From: guido at python.org (Guido van Rossum)
Date: Sat, 7 Jul 2007 01:41:16 +0200
Subject: [Python-3000] Change to class construction?
In-Reply-To: <20070706233354.DA07A3A4046@sparrow.telecommunity.com>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
	<f6lld0$qv3$1@sea.gmane.org>
	<20070706172258.1E65B3A4046@sparrow.telecommunity.com>
	<ca471dc20707061532k5a636a49r57ab383dad81fae3@mail.gmail.com>
	<20070706233354.DA07A3A4046@sparrow.telecommunity.com>
Message-ID: <ca471dc20707061641l4dd04f07x203f225be2c32cb3@mail.gmail.com>

On 7/7/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 12:32 AM 7/7/2007 +0200, Guido van Rossum wrote:
> >On 7/6/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> > > At 05:00 PM 7/6/2007 +0200, Georg Brandl wrote:
> > > >Collin Winter schrieb:
> > > > > While experimenting with porting setuptools to py3k (as of r56155), I
> > > > > ran into this situation:
> > > > >
> > > > > class C:
> > > > >   a = (4, 5)
> > > > >   b = [c for c in range(2) if a]
> > > > >
> > > > > results in a "NameError: global name 'a' is not defined" error, while
> > > > >
> > > > > class C:
> > > > >   a = (4, 5)
> > > > >   b = [c for c in a]
> > > > >
> > > > > works fine.
> > >
> > > This looks like a bug to me.  A list comprehension's local scope
> > > should be the locals of the enclosing code, even if its loop indexes
> > > aren't exposed to that scope.
> >
> >It's because the class scope is not made available to the methods.
>
> The examples are in the class body, not in methods.  The code is
> statically initializing the class contents, so using C.a isn't possible.

Understood, but a generator expression (and hence in 3.0 also a list
comprehension) is treated the same as a method body.

> I suppose it can be worked around by moving the static initialization
> code outside the class body; it's just not obvious why it happens.
>
> Collin, where did you find this code in setuptools, btw?  I've been
> looking around at other packages of mine where static class
> initialization uses data structures like this, and I haven't found
> any place where anything but the "in" clause of a comprehension
> depends on class-scope variables.  So, if setuptools is the only one
> of my libraries that does this, I'd have to agree with Guido that it
> is indeed quite rare.  :)
>
> If I had to hazard a guess, I'd guess that it's in one of the
> setuptools command classes that subclasses a distutils command, and
> proceeds to muck around with the original options in some fashion.  I
> just don't want to check all of them if you know which one it is.  :)
>
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg.ewing at canterbury.ac.nz  Sat Jul  7 03:17:39 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 07 Jul 2007 13:17:39 +1200
Subject: [Python-3000] Change to class construction?
In-Reply-To: <20070706172258.1E65B3A4046@sparrow.telecommunity.com>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
	<f6lld0$qv3$1@sea.gmane.org>
	<20070706172258.1E65B3A4046@sparrow.telecommunity.com>
Message-ID: <468EE9B3.40902@canterbury.ac.nz>

Phillip J. Eby wrote:
> This looks like a bug to me.  A list comprehension's local scope 
> should be the locals of the enclosing code, even if its loop indexes 
> aren't exposed to that scope.

It sounds like list comprehensions are being implemented
using genexps behind the scenes now.

Is this wise? In a recent thread, I suggested that one
of the reasons for keeping the LC syntax was that it
could be faster than list(genexp). Has anyone investigated
whether any speed is being lost by making them equivalent?

--
Greg

From g.brandl at gmx.net  Sat Jul  7 08:55:12 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Sat, 07 Jul 2007 08:55:12 +0200
Subject: [Python-3000] Change to class construction?
In-Reply-To: <468EE9B3.40902@canterbury.ac.nz>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>	<f6lld0$qv3$1@sea.gmane.org>	<20070706172258.1E65B3A4046@sparrow.telecommunity.com>
	<468EE9B3.40902@canterbury.ac.nz>
Message-ID: <f6ndbn$i0v$1@sea.gmane.org>

Greg Ewing schrieb:
> Phillip J. Eby wrote:
>> This looks like a bug to me.  A list comprehension's local scope 
>> should be the locals of the enclosing code, even if its loop indexes 
>> aren't exposed to that scope.
> 
> It sounds like list comprehensions are being implemented
> using genexps behind the scenes now.

That's not true, but the implementation is somewhat similar in that
the code is executed in its own function context.

> Is this wise? In a recent thread, I suggested that one
> of the reasons for keeping the LC syntax was that it
> could be faster than list(genexp). Has anyone investigated
> whether any speed is being lost by making them equivalent?

I don't remember the details, but IIRC the new LC implementation
was not slower than the 2.x one. Nick should know more about that.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From ncoghlan at gmail.com  Sat Jul  7 16:15:54 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 08 Jul 2007 00:15:54 +1000
Subject: [Python-3000] Change to class construction?
In-Reply-To: <f6ndbn$i0v$1@sea.gmane.org>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>	<f6lld0$qv3$1@sea.gmane.org>	<20070706172258.1E65B3A4046@sparrow.telecommunity.com>	<468EE9B3.40902@canterbury.ac.nz>
	<f6ndbn$i0v$1@sea.gmane.org>
Message-ID: <468FA01A.6040707@gmail.com>

Georg Brandl wrote:
> Greg Ewing schrieb:
>> Phillip J. Eby wrote:
>>> This looks like a bug to me.  A list comprehension's local scope 
>>> should be the locals of the enclosing code, even if its loop indexes 
>>> aren't exposed to that scope.
>> It sounds like list comprehensions are being implemented
>> using genexps behind the scenes now.
> 
> That's not true, but the implementation is somewhat similar in that
> the code is executed in its own function context.

Georg is correct. A list comprehension like:

[(x * y) for x in seq1 for y in seq2]

expands to the following in 2.x (% prefixes the compiler's hidden 
variables):

   %n = []
   for x in seq1:
     for y in seq2:
       %n.append(x*y) # Special opcode, not a normal call

In py3k it expands to:

   def <anon>(outermost):
     %0 = []
     for x in outermost:
       for y in seq2:
         %0.append(x*y) # Special opcode, not a normal call
     return %0
   %n = <anon>(seq1)

Python's scoping rules are somewhat tricky - doing it this way means we 
know they are being applied the same way in list and set comprehensions 
as they are applied in generator expressions, even if it isn't quite as 
fast as the 2.x approach to comprehensions.
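
For anyone who wants to poke at the difference, the two expansions can
be approximated in ordinary Python. This is only a sketch: the real
compiler uses hidden names and a special append opcode, and _anon,
inline_result and py3k_result are names invented here.

```python
seq1 = [1, 2, 3]
seq2 = [10, 20]

# 2.x style: the loops run inline in the enclosing scope,
# so x and y stay bound there afterwards.
inline_result = []
for x in seq1:
    for y in seq2:
        inline_result.append(x * y)
leaked_x, leaked_y = x, y

# py3k style: the loops run inside a function. Only the outermost
# iterable is passed in; seq2 is looked up from inside the function.
def _anon(outermost):
    result = []
    for x in outermost:
        for y in seq2:
            result.append(x * y)
    return result

py3k_result = _anon(seq1)
```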

Another significant benefit from a maintainability point of view is that 
the 3 kinds of comprehension (list, set, genexp) now follow the same 
code path through the compiler, with only minor variations in the 
setup/cleanup code and the statement inside the innermost loop.

>> Is this wise? In a recent thread, I suggested that one
>> of the reasons for keeping the LC syntax was that it
>> could be faster than list(genexp). Has anyone investigated
>> whether any speed is being lost by making them equivalent?
> 
> I don't remember the details, but IIRC the new LC implementation
> was not slower than the 2.x one. Nick should know more about that.

Inside a function, Py3k is slower by a constant amount relative to 2.x 
(the cost of creating and calling a function object) regardless of the 
length of the resulting list/set. At module level, Py3k will typically 
be faster, as the fixed cost from the anonymous function object will be 
overtaken by the speedup from the iteration variables becoming function 
locals instead of module globals.

The Py3k comprehensions are still significantly faster than the 
equivalent generator expressions, as they still avoid suspending and 
resuming a generator for each value in the resulting sequence.
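
A rough way to measure this on a given interpreter, written for a
current Python 3 and using the standard timeit module. The function
names and the sizes are arbitrary choices for the sketch, and no
particular numbers are implied; results will vary by version and
machine.

```python
import timeit

data = list(range(1000))

def with_comprehension():
    # List comprehension: appends directly inside the loop.
    return [x * 2 for x in data]

def with_genexp():
    # Generator expression fed to list(): one resume per item.
    return list(x * 2 for x in data)

comprehension_time = timeit.timeit(with_comprehension, number=200)
genexp_time = timeit.timeit(with_genexp, number=200)
print(f"comprehension: {comprehension_time:.4f}s  "
      f"list(genexp): {genexp_time:.4f}s")
```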

The bit that makes all of this tricky isn't really hiding the iteration 
variables from the containing scope - it's making sure that the body of 
the comprehension can still see them after you have done so 
(particularly challenging if the comprehension itself contains a lambda 
expression, or another comprehension/genexp).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From tjreedy at udel.edu  Sat Jul  7 19:08:15 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 7 Jul 2007 13:08:15 -0400
Subject: [Python-3000] Change to class construction?
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>	<f6lld0$qv3$1@sea.gmane.org>	<20070706172258.1E65B3A4046@sparrow.telecommunity.com>	<468EE9B3.40902@canterbury.ac.nz><f6ndbn$i0v$1@sea.gmane.org>
	<468FA01A.6040707@gmail.com>
Message-ID: <f6oh9v$f40$1@sea.gmane.org>


"Nick Coghlan" <ncoghlan at gmail.com> wrote in message 
news:468FA01A.6040707 at gmail.com...
| Georg is correct. A list comprehension like:
|
| [(x * y) for x in seq1 for y in seq2]
|
| expands to the following in 2.x (% prefixes the compiler's hidden
| variables):
|
|   %n = []
|   for x in seq1:
|     for y in seq2:
|       %n.append(x*y) # Special opcode, not a normal call
|
| In py3k it expands to:
|
|   def <anon>(outermost):
|     %0 = []
|     for x in outermost:
|       for y in seq2:
|         %0.append(x*y) # Special opcode, not a normal call
|     return %0
|   %n = <anon>(seq1)

Why not pass both seq1 *and* seq2 to the function so both become locals? 
The difference of treatment is quite surprising.


From tjreedy at udel.edu  Sat Jul  7 19:15:55 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 7 Jul 2007 13:15:55 -0400
Subject: [Python-3000] PEP 368: Standard image protocol and class
References: <cc93256f0706301518kd9fe7a7iaf0e9bd8e2e18edd@mail.gmail.com>
	<cc93256f0706301800m20012379n84aff4ff3df88021@mail.gmail.com>
Message-ID: <f6ohob$gc1$1@sea.gmane.org>


Reference Implementation
========================

If this PEP is accepted, the author will provide a reference
implementation of the new classes in pure Python (that can run in
CPython, PyPy, Jython and IronPython) and a second one optimized for
speed in Python and C, suitable for inclusion in the CPython standard
library.  The author will also submit the required Tkinter patches.
A version for Python 2.x and a version for Python 3.0 will be
available for all the code (it is expected that the two versions will
be very similar and the Python 3.0 one will probably be generated
almost completely automatically).


Acknowledgments
===============

The implementation of this PEP, if accepted, is sponsored by Google
through the Google Summer of Code program.

****************************************************
*****************************************************

1. I think this *should* conform to the new buffer protocol.  Assume that 
it will be in 3.0.

2. I don't see how the work you promised to do for your stipend can be 
contingent on acceptance into the standard lib.  In any case, this should 
be released at the end of the summer as patches and a 3rd-party module on 
PyPI so it can be tested in practice and then proposed for the library. 
Very few new library modules get accepted before they are written ;-).


From ncoghlan at gmail.com  Sun Jul  8 07:10:16 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 08 Jul 2007 15:10:16 +1000
Subject: [Python-3000] Change to class construction?
In-Reply-To: <f6oh9v$f40$1@sea.gmane.org>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>	<f6lld0$qv3$1@sea.gmane.org>	<20070706172258.1E65B3A4046@sparrow.telecommunity.com>	<468EE9B3.40902@canterbury.ac.nz><f6ndbn$i0v$1@sea.gmane.org>	<468FA01A.6040707@gmail.com>
	<f6oh9v$f40$1@sea.gmane.org>
Message-ID: <469071B8.8030604@gmail.com>

Terry Reedy wrote:
> "Nick Coghlan" <ncoghlan at gmail.com> wrote in message 
> news:468FA01A.6040707 at gmail.com...
> | In py3k it expands to:
> |
> |   def <anon>(outermost):
> |     %0 = []
> |     for x in outermost:
> |       for y in seq2:
> |         %0.append(x*y) # Special opcode, not a normal call
> |     return %0
> |   %n = <anon>(seq1)
> 
> Why not pass both seq1 *and* seq2 to the function so both become locals? 
> The difference of treatment is quite surprising.

The inner iterable expressions can't be evaluated early, as they need to 
be re-evaluated for each pass around the outer loop (or loops). An 
example where the iterable expression for the inner loop refers to the 
iteration variable of the outer loop should make that clear:

>>> [y for x in range(4) for y in range(x)]
[0, 0, 1, 0, 1, 2]
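
Written out by hand in the style of the expansion, this looks roughly
as follows. It is a sketch only, with _anon and expanded as stand-in
names. The inner range(x) cannot be passed in as an argument because
it has to be evaluated again for every value of x.

```python
def _anon(outermost):
    result = []
    for x in outermost:
        # Inner iterable depends on x, so it is evaluated on each pass.
        for y in range(x):
            result.append(y)
    return result

expanded = _anon(range(4))
```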

The advantage of the Py3k approach is that it eliminates the current 
semantic differences between a list comprehension and list() with a 
generator expression argument, while keeping most of the performance 
benefits of the special syntax.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From collinw at gmail.com  Sun Jul  8 10:55:45 2007
From: collinw at gmail.com (Collin Winter)
Date: Sun, 8 Jul 2007 11:55:45 +0300
Subject: [Python-3000] Change to class construction?
In-Reply-To: <20070706233354.DA07A3A4046@sparrow.telecommunity.com>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
	<f6lld0$qv3$1@sea.gmane.org>
	<20070706172258.1E65B3A4046@sparrow.telecommunity.com>
	<ca471dc20707061532k5a636a49r57ab383dad81fae3@mail.gmail.com>
	<20070706233354.DA07A3A4046@sparrow.telecommunity.com>
Message-ID: <43aa6ff70707080155r820ec31t595442753817f0ae@mail.gmail.com>

On 7/7/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> Collin, where did you find this code in setuptools, btw?  I've been
> looking around at other packages of mine where static class
> initialization uses data structures like this, and I haven't found
> any place where anything but the "in" clause of a comprehension
> depends on class-scope variables.  So, if setuptools is the only one
> of my libraries that does this, I'd have to agree with Guido that it
> is indeed quite rare.  :)
>
> If I had to hazard a guess, I'd guess that it's in one of the
> setuptools command classes that subclasses a distutils command, and
> proceeds to muck around with the original options in some fashion.  I
> just don't want to check all of them if you know which one it is.  :)

Yep, it's in setuptools.command.install, lines 20-23 (setuptools v0.6c6).

Collin Winter

From ferringb at gmail.com  Sun Jul  8 16:07:42 2007
From: ferringb at gmail.com (Brian Harring)
Date: Sun, 8 Jul 2007 07:07:42 -0700
Subject: [Python-3000] Change to class construction?
In-Reply-To: <f6oh9v$f40$1@sea.gmane.org>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
	<f6lld0$qv3$1@sea.gmane.org>
	<20070706172258.1E65B3A4046@sparrow.telecommunity.com>
	<468FA01A.6040707@gmail.com> <f6oh9v$f40$1@sea.gmane.org>
Message-ID: <20070708140741.GD23765@seldon>

On Sat, Jul 07, 2007 at 01:08:15PM -0400, Terry Reedy wrote:
> 
> "Nick Coghlan" <ncoghlan at gmail.com> wrote in message 
> news:468FA01A.6040707 at gmail.com...
> | Georg is correct. A list comprehension like:
> |
> | [(x * y) for x in seq1 for y in seq2]
> |
> | expands to the following in 2.x (% prefixes the compiler's hidden
> | variables):
> |
> |   %n = []
> |   for x in seq1:
> |     for y in seq2:
> |       %n.append(x*y) # Special opcode, not a normal call
> |
> | In py3k it expands to:
> |
> |   def <anon>(outermost):
> |     %0 = []
> |     for x in outermost:
> |       for y in seq2:
> |         %0.append(x*y) # Special opcode, not a normal call
> |     return %0
> |   %n = <anon>(seq1)
> 
> Why not pass both seq1 *and* seq2 to the function so both become locals? 
> The difference of treatment is quite surprising.

I'd be curious whether there is any way to preserve the existing behaviour:

class foo:
  some_list = ('blacklist1', 'blacklist2')
  known_bad = some_list + ('blah',)
  locals().update([(attr, some_callable) for attr in some_list])

is slightly contrived, but I use similar code quite often for method 
generation, both for tests and for standard enough objects.  I realize I 
could do the same via metaclasses, but it's an extra step and not 
nearly as easy/friendly imo.

So... any way to preserve that trick under py3k?
~harring

From pje at telecommunity.com  Sun Jul  8 19:50:30 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 08 Jul 2007 13:50:30 -0400
Subject: [Python-3000] Change to class construction?
In-Reply-To: <43aa6ff70707080155r820ec31t595442753817f0ae@mail.gmail.com>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
	<f6lld0$qv3$1@sea.gmane.org>
	<20070706172258.1E65B3A4046@sparrow.telecommunity.com>
	<ca471dc20707061532k5a636a49r57ab383dad81fae3@mail.gmail.com>
	<20070706233354.DA07A3A4046@sparrow.telecommunity.com>
	<43aa6ff70707080155r820ec31t595442753817f0ae@mail.gmail.com>
Message-ID: <20070708174816.E26A33A404D@sparrow.telecommunity.com>

At 11:55 AM 7/8/2007 +0300, Collin Winter wrote:
>On 7/7/07, Phillip J. Eby <pje at telecommunity.com> wrote:
>>Collin, where did you find this code in setuptools, btw?  I've been
>>looking around at other packages of mine where static class
>>initialization uses data structures like this, and I haven't found
>>any place where anything but the "in" clause of a comprehension
>>depends on class-scope variables.  So, if setuptools is the only one
>>of my libraries that does this, I'd have to agree with Guido that it
>>is indeed quite rare.  :)
>>
>>If I had to hazard a guess, I'd guess that it's in one of the
>>setuptools command classes that subclasses a distutils command, and
>>proceeds to muck around with the original options in some fashion.  I
>>just don't want to check all of them if you know which one it is.  :)
>
>Yep, it's in setuptools.command.install, lines 20-23 (setuptools v0.6c6).

Ah.  Yeah, no big deal to change it; 'new_commands' and '_nc' don't 
need to be attributes of the class, and so could just be done before 
the 'class:' statement.  I don't know why I even bothered with the 
_nc thing there, either.


From ncoghlan at gmail.com  Mon Jul  9 13:03:15 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 09 Jul 2007 21:03:15 +1000
Subject: [Python-3000] Change to class construction?
In-Reply-To: <20070708140741.GD23765@seldon>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>	<f6lld0$qv3$1@sea.gmane.org>	<20070706172258.1E65B3A4046@sparrow.telecommunity.com>	<468FA01A.6040707@gmail.com>
	<f6oh9v$f40$1@sea.gmane.org> <20070708140741.GD23765@seldon>
Message-ID: <469215F3.90807@gmail.com>

Brian Harring wrote:
> 
> I'd be curious whether there is any way to preserve the existing behaviour:
> 
> class foo:
>   some_list = ('blacklist1', 'blacklist2')
>   known_bad = some_list + ('blah',)
>   locals().update([(attr, some_callable) for attr in some_list])
> 
> is slightly contrived, but I use similar code quite often for method 
> generation, both for tests and for standard enough objects.  I realize I 
> could do the same via metaclasses, but it's an extra step and not 
> nearly as easy/friendly imo.
> 
> So... any way to preserve that trick under py3k?

As you've written it, that trick isn't affected by the semantic change 
at all (as the expression inside the list comprehension doesn't try to 
refer to a class variable).

If 'some_callable' was actually a method of the class, then you'd need 
to use an actual for loop instead of the list comprehension:

class foo(object):
    some_list = ('blacklist1', 'blacklist2')
    def some_method(self):
        # whatever
        pass
    for attr in some_list:
        locals()[attr] = some_method

However, I will point out that setting class attributes via locals() is 
formally undefined (it happens to work in current versions of CPython, 
but there's no guarantee that will always be the case).
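
A rough sketch of the scoping difference discussed above, assuming
Python 3 semantics as they ended up in CPython. The function and class
names are made up for illustration only:

```python
def comprehension_iterable_only():
    class Foo:
        names = ('spam', 'eggs')
        # Only the outermost iterable is evaluated in the class body,
        # so referring to 'names' in the "in" clause works.
        lengths = [len(n) for n in names]
    return Foo.lengths


def comprehension_expression_uses_class_var():
    class Foo:
        names = ('spam', 'eggs')
        suffix = '_test'
        # 'suffix' is looked up from the comprehension's own scope, which
        # cannot see class-level names, so this raises NameError.
        labels = [n + suffix for n in names]
    return Foo.labels


def for_loop_workaround():
    class Foo:
        names = ('spam', 'eggs')
        suffix = '_test'
        labels = []
        # A plain for loop runs directly in the class body and sees 'suffix'.
        for n in names:
            labels.append(n + suffix)
        del n  # keep the loop variable from becoming a class attribute
    return Foo.labels
```

The second function is the case that needs rewriting; the third shows
the plain-loop form suggested above.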

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From pje at telecommunity.com  Mon Jul  9 17:07:06 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 09 Jul 2007 11:07:06 -0400
Subject: [Python-3000] Change to class construction?
In-Reply-To: <469215F3.90807@gmail.com>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
	<f6lld0$qv3$1@sea.gmane.org>
	<20070706172258.1E65B3A4046@sparrow.telecommunity.com>
	<468FA01A.6040707@gmail.com> <f6oh9v$f40$1@sea.gmane.org>
	<20070708140741.GD23765@seldon> <469215F3.90807@gmail.com>
Message-ID: <20070709150454.641193A404D@sparrow.telecommunity.com>

At 09:03 PM 7/9/2007 +1000, Nick Coghlan wrote:
>However, I will point out that setting class attributes via locals() is
>formally undefined (it happens to work in current versions of CPython,
>but there's no guarantee that will always be the case).

As of PEP 3115, it's no longer undefined for class statements.

Of course, if it were truly undefined to begin with, we wouldn't be 
so worried about how to implement the potential optimizations that 
the undefinedness theoretically implies.  :)  (i.e. optimized globals/locals)


From guido at python.org  Mon Jul  9 17:13:46 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 9 Jul 2007 18:13:46 +0300
Subject: [Python-3000] Change to class construction?
In-Reply-To: <20070709150454.641193A404D@sparrow.telecommunity.com>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
	<f6lld0$qv3$1@sea.gmane.org>
	<20070706172258.1E65B3A4046@sparrow.telecommunity.com>
	<468FA01A.6040707@gmail.com> <f6oh9v$f40$1@sea.gmane.org>
	<20070708140741.GD23765@seldon> <469215F3.90807@gmail.com>
	<20070709150454.641193A404D@sparrow.telecommunity.com>
Message-ID: <ca471dc20707090813k2271460dw9955c7aec79b2e4d@mail.gmail.com>

On 7/9/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 09:03 PM 7/9/2007 +1000, Nick Coghlan wrote:
> >However, I will point out that setting class attributes via locals() is
> >formally undefined (it happens to work in current versions of CPython,
> >but there's no guarantee that will always be the case).
>
> As of PEP 3115, it's no longer undefined for class statements.

Where does it say so? To be honest, I don't know where to find Nick's
claim in the reference manual. But I'm surprised that you read
anything about locals() into that PEP, as it doesn't mention that
function at all.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Mon Jul  9 18:03:28 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 09 Jul 2007 12:03:28 -0400
Subject: [Python-3000] Change to class construction?
In-Reply-To: <ca471dc20707090813k2271460dw9955c7aec79b2e4d@mail.gmail.com>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
	<f6lld0$qv3$1@sea.gmane.org>
	<20070706172258.1E65B3A4046@sparrow.telecommunity.com>
	<468FA01A.6040707@gmail.com> <f6oh9v$f40$1@sea.gmane.org>
	<20070708140741.GD23765@seldon> <469215F3.90807@gmail.com>
	<20070709150454.641193A404D@sparrow.telecommunity.com>
	<ca471dc20707090813k2271460dw9955c7aec79b2e4d@mail.gmail.com>
Message-ID: <20070709160115.16C323A404D@sparrow.telecommunity.com>

At 06:13 PM 7/9/2007 +0300, Guido van Rossum wrote:
>On 7/9/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> > At 09:03 PM 7/9/2007 +1000, Nick Coghlan wrote:
> > >However, I will point out that setting class attributes via locals() is
> > >formally undefined (it happens to work in current versions of CPython,
> > >but there's no guarantee that will always be the case).
> >
> > As of PEP 3115, it's no longer undefined for class statements.
>
>Where does it say so? To be honest, I don't know where to find Nick's
>claim in the reference manual.

I assume Nick is referring to:

   http://www.python.org/doc/2.2/ref/execframes.html

which says it's undefined.  I can't seem to find where this section 
went to in 2.3 and beyond, or anything that says what happens with 
non-dictionary objects, except:

   http://docs.python.org/ref/exec.html

which makes a much stronger claim:

"The built-in functions globals() and locals() return the current 
global and local dictionary, respectively"

and also states that as of 2.4, exec allows the use of any mapping 
object as the locals.  There isn't any mention of the fact that 
locals() may not be writable, which should probably be considered an error.


>But I'm surprised that you read
>anything about locals() into that PEP, as it doesn't mention that
>function at all.

Correct -- which means that either the PEP is in error, or the 
semantics of locals() must be that the actual namespace in use is returned.

My reasoning: since PEP 3115 allows an arbitrary mapping object to be 
used, there is no way that such an object can be converted to a 
read-only dictionary, and the current definition (as I understand it) 
is that locals() returns you either the actual local namespace 
object, or a "dictionary representing the ... namespace" (per the 
reference manual).

Since PEP 3115 does not require that there be any way of converting 
the arbitrary mapping object into a dictionary (or even that there be 
any pre-defined way of *reading* its contents!) there is no way that 
locals() can fulfill its existing contract *except* by returning that object.
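
A minimal sketch of the situation described above, assuming the
__prepare__ hook from PEP 3115 and the behaviour of present-day
CPython 3, where locals() in a class suite hands back the very object
that __prepare__ returned. RecordingDict and Meta are made-up names,
and nothing here is promised by the reference manual:

```python
class RecordingDict(dict):
    """A mapping that remembers the order in which names were bound."""

    def __init__(self):
        super().__init__()
        self.order = []

    def __setitem__(self, key, value):
        self.order.append(key)
        super().__setitem__(key, value)


class Meta(type):
    @classmethod
    def __prepare__(mcls, name, bases):
        # The class suite executes with this object as its local namespace.
        return RecordingDict()

    def __new__(mcls, name, bases, ns):
        cls = super().__new__(mcls, name, bases, dict(ns))
        # Ignore the dunder names the compiler adds on its own.
        cls.definition_order = [k for k in ns.order if not k.startswith('__')]
        return cls


class Foo(metaclass=Meta):
    # Observed in CPython: locals() is the RecordingDict itself ...
    same_object = isinstance(locals(), RecordingDict)
    # ... so writing through it goes via __setitem__ and binds a class member.
    locals()['injected'] = 42
```

Under those assumptions Foo.injected ends up as an ordinary class
attribute, which is the behaviour the argument above depends on.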

QED.  Well, that's the spelled-out reasoning for my intuition, 
anyway.  :)  That doesn't mean the PEP or the specification of 
locals() can't change, but it seems to me that if one or the other 
doesn't, then modifying class-suite locals() to create class members 
implicitly becomes official, since the failure for it to do so would 
become a bug in locals().  (Since it will no longer be returning a 
"dictionary representing the namespace" if it doesn't return that 
mapping object, and can't possibly return anything else that 
"represents" the namespace in any meaningful way.)


From pje at telecommunity.com  Mon Jul  9 20:44:09 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 09 Jul 2007 14:44:09 -0400
Subject: [Python-3000] A request to keep dict.setdefault() in 3.0
Message-ID: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com>

PEP 3100 suggests dict.setdefault() may be removed in Python 3, since 
it is in principle no longer necessary (due to the new defaultdict type).

However, there is another class of use cases which use setdefault for 
its limited atomic properties - the initialization of non-mutated 
data structures that are shared among threads.  (And defaultdict 
cannot achieve the same thing.)

I currently have three places where I use this, off the top of my head:

1. a "synchronized" decorator that initializes an object's __lock__ 
attribute (if not found) using ob.__dict__.setdefault('__lock__', 
allocate_lock())

2. an Aspect implementation that does almost exactly the same thing, 
so that if multiple threads ask for an Aspect that doesn't exist for 
a given object, they will not end up using different instances.

3. a configuration library that supports "write many, read once" 
configurations shared across threads.  A key may have its value 
written to any number of times, so long as it has never been 
read.  As soon as the value has been read by any thread, it becomes 
fixed and it cannot be set to any other value.  (Setting it to the 
same value has no effect.)  This is essentially a simple way of 
having a provably race-condition-free data structure -- if you have a 
race condition, you will get an error.  As a bonus, it is completely 
non-blocking and single threaded code does not pay any overhead for 
the use of the data structures.

Of course, to take advantage of setdefault's atomic properties, one 
must be using CPython, and all the dictionary keys must have __hash__ 
and __eq__ methods implemented entirely in C (recursively to their 
contents, if tuples are involved).  However, for all three of the 
above applications this latter condition is actually quite trivial to ensure.

I realize, however, that this is an "impure" usage, in that other 
Python implementations usually do not have any atomicity guarantees, 
period.  But it would save me having to write a setdefault function 
in C when porting any of the above code to 3.0.  ;-)
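
For concreteness, a rough sketch of use case 1, written for Python 3
with threading.Lock standing in for allocate_lock(). It relies on the
CPython-specific assumption stated above (setdefault on a dict with
string keys is not interleaved with another thread), and the Counter
class is purely illustrative:

```python
import threading
from functools import wraps


def synchronized(method):
    """Serialize calls to `method` per instance, creating the lock lazily."""
    @wraps(method)
    def wrapper(self, *args, **kwargs):
        # A fresh Lock is built on every call, but setdefault only stores it
        # if no '__lock__' entry exists yet, so every thread ends up with the
        # same lock object for a given instance.
        lock = self.__dict__.setdefault('__lock__', threading.Lock())
        with lock:
            return method(self, *args, **kwargs)
    return wrapper


class Counter:
    def __init__(self):
        self.value = 0

    @synchronized
    def increment(self):
        self.value += 1
```

The throwaway Lock allocated on each call is the price of never
blocking on a global lock just to find the per-object one.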


From tav at espians.com  Mon Jul  9 20:59:11 2007
From: tav at espians.com (tav)
Date: Mon, 9 Jul 2007 19:59:11 +0100
Subject: [Python-3000] A request to keep dict.setdefault() in 3.0
In-Reply-To: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com>
References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com>
Message-ID: <95d8c0810707091159k657f0fe1k5aa4eb9a5a4c96c4@mail.gmail.com>

> PEP 3100 suggests dict.setdefault() may be removed in Python 3, since
> it is in principle no longer necessary (due to the new defaultdict type).
>
> However, there is another class of use cases which use setdefault for
> its limited atomic properties - the initialization of non-mutated
> data structures that are shared among threads.  (And defaultdict
> cannot achieve the same thing.)

+1

setdefault's ability to return current value is also a very useful
functionality and has saved writing:

  if key not in dict:
    value = <compute-value>
    dict[key] = value

with the simpler:

  value = dict.setdefault(key, <compute-value>)

Is there a better way to do the above without .setdefault?

-- 
love, tav
founder and ceo, esp metanational llp

plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369

From pje at telecommunity.com  Mon Jul  9 21:17:12 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 09 Jul 2007 15:17:12 -0400
Subject: [Python-3000] A request to keep dict.setdefault() in 3.0
In-Reply-To: <95d8c0810707091159k657f0fe1k5aa4eb9a5a4c96c4@mail.gmail.com>
References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com>
	<95d8c0810707091159k657f0fe1k5aa4eb9a5a4c96c4@mail.gmail.com>
Message-ID: <20070709191500.4C4033A404D@sparrow.telecommunity.com>

At 07:59 PM 7/9/2007 +0100, tav wrote:
>>PEP 3100 suggests dict.setdefault() may be removed in Python 3, since
>>it is in principle no longer necessary (due to the new defaultdict type).
>>
>>However, there is another class of use cases which use setdefault for
>>its limited atomic properties - the initialization of non-mutated
>>data structures that are shared among threads.  (And defaultdict
>>cannot achieve the same thing.)
>
>+1
>
>setdefault's ability to return current value is also a very useful
>functionality and has saved writing:
>
>  if key not in dict:
>    value = <compute-value>
>    dict[key] = value
>
>with the simpler:
>
>  value = dict.setdefault(key, <compute-value>)
>
>Is there a better way to do the above without .setdefault?

Yes, in 2.5 there's collections.defaultdict.  Of course, that only 
works if there is a fixed mapping from keys to initial computed 
values for the entire dictionary for all time.  Oh, and if your code 
gets to create the dictionary.  :)
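
A small sketch of that limitation, with made-up variable names: a
defaultdict has one factory fixed for the whole dictionary, while
setdefault picks the default per call and works on a dict you did not
create yourself.

```python
from collections import defaultdict

# One factory for every key, chosen when the dictionary is created.
index = defaultdict(list)
index['a'].append(1)
index['b']  # a plain lookup of a missing key also inserts it

# A dict that some other code handed us: the default is chosen per call.
registry = {}
registry.setdefault('handlers', []).append('h1')
registry.setdefault('seen', set()).add('x')
```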


From barry at python.org  Mon Jul  9 21:35:50 2007
From: barry at python.org (Barry Warsaw)
Date: Mon, 9 Jul 2007 15:35:50 -0400
Subject: [Python-3000] A request to keep dict.setdefault() in 3.0
In-Reply-To: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com>
References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com>
Message-ID: <DE56AE10-C634-4185-B89D-C513FCE1F877@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Jul 9, 2007, at 2:44 PM, Phillip J. Eby wrote:

> PEP 3100 suggests dict.setdefault() may be removed in Python 3, since
> it is in principle no longer necessary (due to the new defaultdict  
> type).
>
> However, there is another class of use cases which use setdefault for
> its limited atomic properties - the initialization of non-mutated
> data structures that are shared among threads.  (And defaultdict
> cannot achieve the same thing.)

Phillip, I support any initiative to keep .setdefault() or similar  
functionality.  When this thread came up before, I wasn't against  
defaultdict, I just didn't think it covered enough of the use cases  
of .setdefault() to warrant its removal.  You describe some  
additional use cases.

However, .setdefault() is a horrible name because it's not clear from  
the name that a 'get' operation also happens.

It occurs to me that I haven't reached my stupid idea quota for the  
day, so here goes.  What if we ditched .setdefault() as a name and  
gave .get() an optional argument to also set the key's value when  
it's missing.

class dict2(dict):
     """
     >>> d = dict2()
     >>> d.setdefault('foo', []).append(7)
     >>> sorted(d.items())
     [('foo', [7])]
     >>> d.setdefault('foo', []).append(8)
     >>> sorted(d.items())
     [('foo', [7, 8])]
     >>> d.get('bar', [], set_missing=True).append(9)
     >>> sorted(d.items())
     [('bar', [9]), ('foo', [7, 8])]
     >>> d.get('bar', [], True).append(10)
     >>> sorted(d.items())
     [('bar', [9, 10]), ('foo', [7, 8])]
     """
     def get(self, key, default=None, set_missing=False):
         missing = object()
         value = super(dict2, self).get(key, missing)
         if value is not missing:
             return value
         if set_missing:
             self[key] = default
         return default

This more or less conveys that both a get and a set operation are  
happening.  It also doesn't violate the rule against letting an  
argument change the return type of a function.  Maybe it will make  
this useful functionality more palatable.

Cheers,
- -Barry


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)

iQCVAwUBRpKOGHEjvBPtnXfVAQJIxwP9Ev7aASfVOw3q1aiCZ3Pr4VsQwzmeb0SR
4xJR9VvAZVcsjL4wAaleU55vFir9fBnFkvEnMMRFOBJ49NtS6EuLt+yGkt22gadg
TSlfNK0t4oVeFT4MJ6AebaHwBL8PvILAbV5eJ6x3H0hH383rdcdtrRyFzvhKnBRy
tPqtjIZlU6Q=
=WxDp
-----END PGP SIGNATURE-----

From guido at python.org  Mon Jul  9 22:56:08 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 9 Jul 2007 23:56:08 +0300
Subject: [Python-3000] Change to class construction?
In-Reply-To: <20070709160115.16C323A404D@sparrow.telecommunity.com>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
	<f6lld0$qv3$1@sea.gmane.org>
	<20070706172258.1E65B3A4046@sparrow.telecommunity.com>
	<468FA01A.6040707@gmail.com> <f6oh9v$f40$1@sea.gmane.org>
	<20070708140741.GD23765@seldon> <469215F3.90807@gmail.com>
	<20070709150454.641193A404D@sparrow.telecommunity.com>
	<ca471dc20707090813k2271460dw9955c7aec79b2e4d@mail.gmail.com>
	<20070709160115.16C323A404D@sparrow.telecommunity.com>
Message-ID: <ca471dc20707091356t2caa808u353705ca714d96e3@mail.gmail.com>

On 7/9/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 06:13 PM 7/9/2007 +0300, Guido van Rossum wrote:
> >On 7/9/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> > > At 09:03 PM 7/9/2007 +1000, Nick Coghlan wrote:
> > > >However, I will point out that setting class attributes via locals() is
> > > >formally undefined (it happens to work in current versions of CPython,
> > > >but there's no guarantee that will always be the case).
> > >
> > > As of PEP 3115, it's no longer undefined for class statements.
> >
> > >Where does it say so? To be honest, I don't know where to find Nick's
> >claim in the reference manual.
>
> I assume Nick is referring to:
>
>    http://www.python.org/doc/2.2/ref/execframes.html
>
> which says it's undefined.  I can't seem to find where this section
> went to in 2.3 and beyond, or anything that says what happens with
> non-dictionary objects, except:
>
>    http://docs.python.org/ref/exec.html
>
> which makes a much stronger claim:
>
> "The built-in functions globals() and locals() return the current
> global and local dictionary, respectively"
>
> and also states that as of 2.4, exec allows the use of any mapping
> object as the locals.  There isn't any mention of the fact that
> locals() may not be writable, which should probably be considered an error.
>
>
> >But I'm surprised that you read
> >anything about locals() into that PEP, as it doesn't mention that
> >function at all.
>
> Correct -- which means that either the PEP is in error, or the
> semantics of locals() must be that the actual namespace in use is returned.
>
> My reasoning: since PEP 3115 allows an arbitrary mapping object to be
> used, there is no way that such an object can be converted to a
> read-only dictionary, and the current definition (as I understand it)
> is that locals() returns you either the actual local namespace
> object, or a "dictionary representing the ... namespace" (per the
> reference manual).
>
> Since PEP 3115 does not require that there be any way of converting
> the arbitrary mapping object into a dictionary (or even that there be
> any pre-defined way of *reading* its contents!) there is no way that
> locals() can fulfill its existing contract *except* by returning that object.
>
> QED.  Well, that's the spelled-out reasoning for my intuition,
> anyway.  :)  That doesn't mean the PEP or the specification of
> locals() can't change, but it seems to me that if one or the other
> doesn't, then modifying class-suite locals() to create class members
> implicitly becomes official, since the failure for it to do so would
> become a bug in locals().  (Since it will no longer be returning a
> "dictionary representing the namespace" if it doesn't return that
> mapping object, and can't possibly return anything else that
> "represents" the namespace in any meaningful way.)

Python's specification isn't as rigid as it should be, and such a
"proof" isn't worth much, especially as the reference manual hasn't
always been updated as things changed. The use of the word "mapping"
might easily be construed as implementing abc.Mapping, and then
iteration and reading the contents would be well-defined. The
weasel-words about "a dictionary representing the namespace" are meant
to cover the situation for a function's local scope, which isn't
stored in a mapping-like object at all until you use exec() or
locals(), or a few others. We could easily change this to return a
writable mapping that's not a dict at all but a "view" on the locals
just as dict.keys() returns a view on a dict. I don't see why locals()
couldn't return the object used to represent the namespace, but I
don't see that it couldn't be some view on that object either,
depending on the details of the implementation.
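
To make the function-scope case concrete, here is a small sketch of
what CPython does today. This is observed behaviour under that
assumption, not something the reference manual is claimed to
guarantee, and the names are illustrative:

```python
def function_scope():
    x = 1
    snapshot = locals()   # a dict built on request from the fast locals
    snapshot['x'] = 2     # changes the dict, not the real local variable
    return x


class ClassScope:
    # In a class suite the namespace really is a mapping, and in CPython
    # writing through locals() binds a class attribute.
    locals()['y'] = 2
```

In a function the write is lost, whereas in a class body it sticks,
which is the asymmetry the wording about "a dictionary representing
the namespace" has to cover.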

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon Jul  9 23:01:15 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 10 Jul 2007 00:01:15 +0300
Subject: [Python-3000] A request to keep dict.setdefault() in 3.0
In-Reply-To: <95d8c0810707091159k657f0fe1k5aa4eb9a5a4c96c4@mail.gmail.com>
References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com>
	<95d8c0810707091159k657f0fe1k5aa4eb9a5a4c96c4@mail.gmail.com>
Message-ID: <ca471dc20707091401w43103abfi67d222388aee81c6@mail.gmail.com>

On 7/9/07, tav <tav at espians.com> wrote:
> setdefault's ability to return current value is also a very useful
> functionality and has saved writing:
>
>   if key not in dict:
>     value = <compute-value>
>     dict[key] = value
>
> with the simpler:
>
>   value = dict.setdefault(key, <compute-value>)
>
> Is there a better way to do the above without .setdefault?

Those are not equivalent, as the form using setdefault() *always*
evaluates <compute-value> while the other form only evaluates it when
needed.
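
A quick sketch of the difference, using a made-up compute_value()
that records each call:

```python
calls = []


def compute_value():
    calls.append('called')
    return 'fresh'


cache = {'key': 'existing'}

# setdefault: the argument is evaluated before the method runs, so
# compute_value() is called even though 'key' is already present.
value = cache.setdefault('key', compute_value())
eager_calls = len(calls)

# Explicit test: compute_value() only runs when the key is missing.
if 'key' not in cache:
    cache['key'] = compute_value()
lazy_value = cache['key']
lazy_calls = len(calls) - eager_calls
```

Here eager_calls is 1 and lazy_calls is 0, although both forms leave
the dictionary unchanged.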

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From brandon at rhodesmill.org  Mon Jul  9 23:01:38 2007
From: brandon at rhodesmill.org (Brandon Craig Rhodes)
Date: Mon, 09 Jul 2007 17:01:38 -0400
Subject: [Python-3000] A request to keep dict.setdefault() in 3.0
In-Reply-To: <DE56AE10-C634-4185-B89D-C513FCE1F877@python.org> (Barry Warsaw's
	message of "Mon, 9 Jul 2007 15:35:50 -0400")
References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com>
	<DE56AE10-C634-4185-B89D-C513FCE1F877@python.org>
Message-ID: <87odilnvf1.fsf@ten22.rhodesmill.org>

Barry Warsaw <barry at python.org> writes:

> However, .setdefault() is a horrible name because it's not clear
> from the name that a 'get' operation also happens.

Agreed!  From the name, a clever but naive user would assume that
"setdefault" sets what value the dictionary returns when a key does
not exist.  On first encountering the name, one imagines:

>>> d = {}
>>> d[1]
KeyError: 1
>>> d.setdefault('missing')
>>> d[1]
'missing'

-- 
Brandon Craig Rhodes   brandon at rhodesmill.org   http://rhodesmill.org/brandon

From guido at python.org  Mon Jul  9 23:04:56 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 10 Jul 2007 00:04:56 +0300
Subject: [Python-3000] A request to keep dict.setdefault() in 3.0
In-Reply-To: <DE56AE10-C634-4185-B89D-C513FCE1F877@python.org>
References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com>
	<DE56AE10-C634-4185-B89D-C513FCE1F877@python.org>
Message-ID: <ca471dc20707091404u40c1a8eek5cb122a74edbf1b@mail.gmail.com>

On 7/9/07, Barry Warsaw <barry at python.org> wrote:
> Phillip, I support any initiative to keep .setdefault() or similar
> functionality.  When this thread came up before, I wasn't against
> defaultdict, I just didn't think it covered enough of the use cases
> of .setdefault() to warrant its removal.  You describe some
> additional use cases.
>
> However, .setdefault() is a horrible name because it's not clear from
> the name that a 'get' operation also happens.

We had a long name discussion when it was introduced. Perhaps we can
go back to the list suggested then and see if a better alternative was
overlooked?

> It occurs to me that I haven't reached my stupid idea quota for the
> day, so here goes.  What if we ditched .setdefault() as a name and
> gave .get() an optional argument to also set the key's value when
> it's missing.
>
[...]
>      def get(self, key, default=None, set_missing=False):
>          missing = object()
>          value = super(dict2, self).get(key, missing)
>          if value is not missing:
>              return value
>          if set_missing:
>              self[key] = default
>          return default
>
> This more or less conveys that both a get and a set operation are
> happening.  It also doesn't violate the rule against letting an
> argument change the return type of a function.  Maybe it will make
> this useful functionality more palatable.

But it does violate the rule that if you have a boolean flag to
indicate a "variant" of an API and in practice you'll always be
passing a constant for that flag, you're better off defining two
methods with different names. Although if the return type isn't
different, the semantics are certainly *very* different here. So I'm
strongly against this.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From python at rcn.com  Mon Jul  9 23:29:04 2007
From: python at rcn.com (Raymond Hettinger)
Date: Mon, 9 Jul 2007 14:29:04 -0700
Subject: [Python-3000] A request to keep dict.setdefault() in 3.0
References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com>
Message-ID: <002a01c7c270$42de5060$a389763f@RaymondLaptop1>

> PEP 3100 suggests dict.setdefault() may be removed in Python 3, since 
> it is in principle no longer necessary (due to the new defaultdict type).

I've forgotten.  What was the whole point of Python 3.0?
Is it to make the language fat with lots of ways to do everything?
Guys, this is your ONE chance to slim down the language and
pare away anything that is unnecessary or arcane.

The setdefault() method has too many defects to keep around.
Why would you want a method that instantiates the default on
every call even if not needed?

Let this one die.  The dict API is already heavily loaded.  Thinning
it a bit would be a nice improvement.


Raymond

From fumanchu at amor.org  Mon Jul  9 23:55:41 2007
From: fumanchu at amor.org (Robert Brewer)
Date: Mon, 9 Jul 2007 14:55:41 -0700
Subject: [Python-3000] A request to keep dict.setdefault() in 3.0
In-Reply-To: <002a01c7c270$42de5060$a389763f@RaymondLaptop1>
Message-ID: <435DF58A933BA74397B42CDEB8145A860DBCAFEA@ex9.hostedexchange.local>

Raymond Hettinger wrote:
> > PEP 3100 suggests dict.setdefault() may be removed in 
> > Python 3, since it is in principle no longer necessary
> > (due to the new defaultdict type).
> 
> I've forgotten.  What was the whole point of Python 3.0?
> Is it to make the language fat with lots of ways to do everything?
> Guys, this is your ONE chance to slim down the language and
> pare away anything that is unnecessary or arcane.
> 
> The setdefault() method has too many defects to keep around.
> Why would you want a method that instantiates the default on
> every call even if not needed?
> 
> Let this one die.  The dict API is already heavily loaded.  Thinning
> it a bit would be a nice improvement.

I have to agree, even though it means more work for me (due to my own
heavy use of setdefault for its atomicity). Perhaps a better resolution
for these use cases would be a stdlib module which would provide fast,
thread-safe collections. This would standardize, across implementations,
some of the CPython behaviors we've come to rely on. It would also
make it clear that the given type is being used specifically for its
thread-safety.
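
As a very rough sketch of what one such collection might look like:
LockedDict and get_or_create are hypothetical names, not an existing
stdlib API, and a real module would need far more than this.

```python
import threading


class LockedDict:
    """A tiny mapping whose get-or-create step is guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def get_or_create(self, key, factory):
        # The lock makes the lookup and the insertion one indivisible step
        # on any implementation, and the factory only runs when needed.
        with self._lock:
            try:
                return self._data[key]
            except KeyError:
                value = self._data[key] = factory()
                return value
```

Unlike the setdefault idiom, this does not depend on CPython
internals, at the cost of taking a lock on every access.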


Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From barry at python.org  Tue Jul 10 00:14:33 2007
From: barry at python.org (Barry Warsaw)
Date: Mon, 9 Jul 2007 18:14:33 -0400
Subject: [Python-3000] A request to keep dict.setdefault() in 3.0
In-Reply-To: <ca471dc20707091404u40c1a8eek5cb122a74edbf1b@mail.gmail.com>
References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com>
	<DE56AE10-C634-4185-B89D-C513FCE1F877@python.org>
	<ca471dc20707091404u40c1a8eek5cb122a74edbf1b@mail.gmail.com>
Message-ID: <9D661F09-FBD2-4C5D-9F90-2DDC476767D7@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Jul 9, 2007, at 5:04 PM, Guido van Rossum wrote:

> On 7/9/07, Barry Warsaw <barry at python.org> wrote:
>> Phillip, I support any initiative to keep .setdefault() or similar
>> functionality.  When this thread came up before, I wasn't against
>> defaultdict, I just didn't think it covered enough of the use cases
>> of .setdefault() to warrant its removal.  You describe some
>> additional use cases.
>>
>> However, .setdefault() is a horrible name because it's not clear from
>> the name that a 'get' operation also happens.
>
> We had a long name discussion when it was introduced. Perhaps we can
> go back to the list suggested then and see if a better alternative was
> overlooked?

Don't look here because some big dummy contradicts himself seven  
years later:

http://mail.python.org/pipermail/python-dev/2000-August/007819.html

hmm-put()-ly y'rs,
- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)

iQCVAwUBRpKzSXEjvBPtnXfVAQKRmQP8DZDYKFOhOjYvtf+OkmmgAnwWaOI5tpPv
kHHxtMGPdgEM3cXAdT0U5m04W1IUmMKBItV/JE4qGO4OdD0eFIUPaZBufVUIIg3b
230qJnamVWrzZ/uRUhgDK363Kt2NstrxKce+kX37FPy2qHUSu3RMiBpzx9NJBW8I
P3rjaqYZycg=
=cU+w
-----END PGP SIGNATURE-----

From barry at python.org  Tue Jul 10 00:17:08 2007
From: barry at python.org (Barry Warsaw)
Date: Mon, 9 Jul 2007 18:17:08 -0400
Subject: [Python-3000] A request to keep dict.setdefault() in 3.0
In-Reply-To: <002a01c7c270$42de5060$a389763f@RaymondLaptop1>
References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com>
	<002a01c7c270$42de5060$a389763f@RaymondLaptop1>
Message-ID: <E792E098-68C1-43B6-A1C0-F9D37A2C853F@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Jul 9, 2007, at 5:29 PM, Raymond Hettinger wrote:

>> PEP 3100 suggests dict.setdefault() may be removed in Python 3, since
>> it is in principle no longer necessary (due to the new defaultdict  
>> type).
>
> I've forgotten.  What was the whole point of Python 3.0?
> Is it to make the language fat with lots of ways to do everything?
> Guys, this is your ONE chance to slim down the language and
> pare away anything that is unnecessary or arcane.
>
> The setdefault() method has too many defects to keep around.
> Why would you want a method that instantiates the default on
> every call even if not needed.

Um, like .get()?

> Let this one die.  The dict API is already heavily loaded.  Thinning
> it a bit would be a nice improvement.

Unless you remove something useful.  The problem with setdefault()  
isn't what it does, it's the name.
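
To make the exchange above concrete, here is a minimal sketch showing
that both .get() and .setdefault() evaluate their default argument
before the call, whether or not the key is present, and that only
.setdefault() stores it.  The make_default() helper and the calls list
are hypothetical names used only for this illustration.

```python
calls = []

def make_default():
    # Record every time a default object is actually constructed.
    calls.append(1)
    return []

d = {'present': [1]}

# The key exists, yet the default is still built for both methods.
d.get('present', make_default())
d.setdefault('present', make_default())

# The key is missing: .get() returns the default without storing it,
# .setdefault() stores it and returns the stored object.
got = d.get('missing', make_default())
stored = d.setdefault('missing', make_default())
stored.append(7)
```

After this runs, make_default() has been called four times, and only
the list handed to .setdefault() for the missing key ended up in d.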

- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)

iQCVAwUBRpKz5HEjvBPtnXfVAQKV4gP+Ntpkcmo9Yx0d0CvPuGen1E78RLGVquhm
wtaGY2OHsQk8Fq+5DSLdTLQcqba5Ru8kToxcFG+FbKuul7xvN+yFJ4yfFzBKvp6z
CLwE+GkP6v/zC/W1hJ0zkd/0zWE4tPp5Egmug5BhZ6n2ZkwX2ExCfq2jMXf/xmsV
cmu7z3TWQXI=
=BzxB
-----END PGP SIGNATURE-----

From pje at telecommunity.com  Tue Jul 10 02:13:56 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 09 Jul 2007 20:13:56 -0400
Subject: [Python-3000] Change to class construction?
In-Reply-To: <ca471dc20707091356t2caa808u353705ca714d96e3@mail.gmail.com
 >
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
	<f6lld0$qv3$1@sea.gmane.org>
	<20070706172258.1E65B3A4046@sparrow.telecommunity.com>
	<468FA01A.6040707@gmail.com> <f6oh9v$f40$1@sea.gmane.org>
	<20070708140741.GD23765@seldon> <469215F3.90807@gmail.com>
	<20070709150454.641193A404D@sparrow.telecommunity.com>
	<ca471dc20707090813k2271460dw9955c7aec79b2e4d@mail.gmail.com>
	<20070709160115.16C323A404D@sparrow.telecommunity.com>
	<ca471dc20707091356t2caa808u353705ca714d96e3@mail.gmail.com>
Message-ID: <20070710001144.3B26A3A404D@sparrow.telecommunity.com>

At 11:56 PM 7/9/2007 +0300, Guido van Rossum wrote:
>  The use of the word "mapping"
>might easily be construed as implementing abc.Mapping, and then
>iteration and reading the contents would be well-defined.

I'm not sure which use of the word "mapping" you're talking 
about.  PEP 3115 is explicit that there are no specific requirements 
for the __prepare__()'d namespace; it just mentions some things that 
might be useful to have in such an object.

So, in order to replace it with a view or something, we'd want to 
change the PEP to explicitly document what is required.

Personally, I'd just as soon make it explicitly official that 
locals() in a class suite gives you the __prepare__()'d object, 
whatever it is.  If a given Python implementation can support PEP 
3115 in the first place, then it clearly knows what object to return.  ;-)
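
For readers unfamiliar with PEP 3115, a minimal sketch of a metaclass 
whose __prepare__() supplies the class-suite namespace.  The names 
RecordingNamespace, Recorder and Example are hypothetical; the only 
capability exercised here is item assignment on the namespace.

```python
class RecordingNamespace(dict):
    """A namespace that remembers the order in which names are bound."""
    def __init__(self):
        super().__init__()
        self.assigned = []

    def __setitem__(self, key, value):
        self.assigned.append(key)
        super().__setitem__(key, value)


class Recorder(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # The class suite executes with this object as its namespace.
        return RecordingNamespace()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, dict(namespace))
        # Keep only the user-visible names, dropping __module__ and friends.
        cls.assigned = [k for k in namespace.assigned
                        if not k.startswith('__')]
        return cls


class Example(metaclass=Recorder):
    b = 1
    a = 2
    def method(self):
        pass
```

Example.assigned then lists the names in definition order, which is 
the kind of information a plain dict namespace could not provide at 
the time of this thread.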


From pje at telecommunity.com  Tue Jul 10 02:21:44 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 09 Jul 2007 20:21:44 -0400
Subject: [Python-3000] A request to keep dict.setdefault() in 3.0
In-Reply-To: <ca471dc20707091404u40c1a8eek5cb122a74edbf1b@mail.gmail.com
 >
References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com>
	<DE56AE10-C634-4185-B89D-C513FCE1F877@python.org>
	<ca471dc20707091404u40c1a8eek5cb122a74edbf1b@mail.gmail.com>
Message-ID: <20070710001930.07D653A404D@sparrow.telecommunity.com>

At 12:04 AM 7/10/2007 +0300, Guido van Rossum wrote:
>On 7/9/07, Barry Warsaw <barry at python.org> wrote:
>>Phillip, I support any initiative to keep .setdefault() or similar
>>functionality.  When this thread came up before, I wasn't against
>>defaultdict, I just didn't think it covered enough of the use cases
>>of .setdefault() to warrant its removal.  You describe some
>>additional use cases.
>>
>>However, .setdefault() is a horrible name because it's not clear from
>>the name that a 'get' operation also happens.
>
>We had a long name discussion when it was introduced. Perhaps we can
>go back to the list suggested then and see if a better alternative was
>overlooked?

Personally, for my use cases it wouldn't matter if it didn't return a 
value, because I'm not using it to shorten the code.  So if you took 
away the return value and left the name (or changed it to something 
clearer), that'd be okay by me.

The alternative, of course, is as Robert suggested, to just write 
some library code to deal with this and similar issues.  If I have to 
import setdefault from somewhere to use it (ala the heapq.* 
functions), that's fine by me too, as long as it's still able to be 
atomic.  That approach might also address Raymond's desire to narrow 
the dictionary object API.
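
One way such a library function could look, as a sketch only: a 
module-level setdefault() that takes the mapping as its first argument 
and returns nothing, guarded by a lock so the check-and-store is atomic 
with respect to other callers of the same helper.  The function name, 
its signature and the module-level lock are assumptions made for 
illustration, not an existing standard library API.

```python
import threading

# One lock shared by all callers of the helper (an assumption of this sketch).
_lock = threading.Lock()

def setdefault(mapping, key, default):
    """Store default under key unless key is already present; return None."""
    with _lock:
        if key not in mapping:
            mapping[key] = default

registry = {}
setdefault(registry, 'handlers', [])           # key missing: stored
setdefault(registry, 'handlers', ['ignored'])  # key present: no effect
registry['handlers'].append('on_start')
```

Note that the lock only protects against other users of this helper; 
code that writes to the mapping directly bypasses it.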


From rrr at ronadam.com  Tue Jul 10 02:21:31 2007
From: rrr at ronadam.com (Ron Adam)
Date: Mon, 09 Jul 2007 19:21:31 -0500
Subject: [Python-3000] A request to keep dict.setdefault() in 3.0
In-Reply-To: <DE56AE10-C634-4185-B89D-C513FCE1F877@python.org>
References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com>
	<DE56AE10-C634-4185-B89D-C513FCE1F877@python.org>
Message-ID: <4692D10B.5040902@ronadam.com>

Barry Warsaw wrote:
> However, .setdefault() is a horrible name because it's not clear from  
> the name that a 'get' operation also happens.

The return value of .setdefault() could be changed to None, then the name 
would be correct.

And then a helper function could fill the current use case of returning the 
added object at the same time.

 >>> d = {}
 >>> def setget(setter, getter, vars):
...   setter(*vars)
...   return getter(*vars)
...
 >>> setget(d.setdefault, d.get, ('foo', [])).append(7)
 >>> d
{'foo': [7]}
 >>> setget(d.setdefault, d.get, ('foo', [])).append(8)
 >>> d
{'foo': [7, 8]}

Now if this could be made to be more general so it worked with other 
objects it might really be useful. ;-)

Cheers,
    Ron

From rrr at ronadam.com  Tue Jul 10 02:40:18 2007
From: rrr at ronadam.com (Ron Adam)
Date: Mon, 09 Jul 2007 19:40:18 -0500
Subject: [Python-3000] Change to class construction?
In-Reply-To: <ca471dc20707091356t2caa808u353705ca714d96e3@mail.gmail.com>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>	<f6lld0$qv3$1@sea.gmane.org>	<20070706172258.1E65B3A4046@sparrow.telecommunity.com>	<468FA01A.6040707@gmail.com>
	<f6oh9v$f40$1@sea.gmane.org>	<20070708140741.GD23765@seldon>
	<469215F3.90807@gmail.com>	<20070709150454.641193A404D@sparrow.telecommunity.com>	<ca471dc20707090813k2271460dw9955c7aec79b2e4d@mail.gmail.com>	<20070709160115.16C323A404D@sparrow.telecommunity.com>
	<ca471dc20707091356t2caa808u353705ca714d96e3@mail.gmail.com>
Message-ID: <4692D572.4080401@ronadam.com>


Guido van Rossum wrote:

> We could easily change this to return a
> writable mapping that's not a dict at all but a "view" on the locals
> just as dict.keys() returns a view on a dict. I don't see why locals()
> couldn't return the object used to represent the namespace, but I
> don't see that it couldn't be some view on that object either,
> depending on the details of the implementation.

This sounds great! I just recently wanted to pass a namespace to exec, but 
it refuses to accept anything but a dictionary for a local name space.

What I really want to do is pass an object as the local namespace.  And 
have the exec() use it complete with its properties intact.  Passing 
obj.__dict__ doesn't work in this case.

Cheers,
    Ron

From pje at telecommunity.com  Tue Jul 10 02:48:54 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 09 Jul 2007 20:48:54 -0400
Subject: [Python-3000] Change to class construction?
In-Reply-To: <4692D572.4080401@ronadam.com>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
	<f6lld0$qv3$1@sea.gmane.org>
	<20070706172258.1E65B3A4046@sparrow.telecommunity.com>
	<468FA01A.6040707@gmail.com> <f6oh9v$f40$1@sea.gmane.org>
	<20070708140741.GD23765@seldon> <469215F3.90807@gmail.com>
	<20070709150454.641193A404D@sparrow.telecommunity.com>
	<ca471dc20707090813k2271460dw9955c7aec79b2e4d@mail.gmail.com>
	<20070709160115.16C323A404D@sparrow.telecommunity.com>
	<ca471dc20707091356t2caa808u353705ca714d96e3@mail.gmail.com>
	<4692D572.4080401@ronadam.com>
Message-ID: <20070710004640.93BD83A40A4@sparrow.telecommunity.com>

At 07:40 PM 7/9/2007 -0500, Ron Adam wrote:

>Guido van Rossum wrote:
>
>>We could easily change this to return a
>>writable mapping that's not a dict at all but a "view" on the locals
>>just as dict.keys() returns a view on a dict. I don't see why locals()
>>couldn't return the object used to represent the namespace, but I
>>don't see that it couldn't be some view on that object either,
>>depending on the details of the implementation.
>
>This sounds great! I just recently wanted to pass a namespace to 
>exec, but it refuses to accept anything but a dictionary for a local 
>name space.

You can already do that in Python 2.4.


>What I really want to do is pass an object as the local 
>namespace.  And have the exec() use it complete with its properties 
>intact.  Passing obj.__dict__ doesn't work in this case.

You need a wrapper, e.g.:

      class AttrMap(object):
          def __init__(self, ob):
              self.ob = ob
          def __getitem__(self, key):
              try: return getattr(self.ob, key)
              except AttributeError: raise KeyError(key)
          # setitem, delitem, etc...
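
A fuller sketch of the same wrapper, written for Python 3 where exec() 
is a function, with the elided methods filled in.  The Point class and 
its norm2 property are hypothetical and exist only to show that 
properties stay intact when the wrapped object is used as the local 
namespace.

```python
class AttrMap:
    """Expose an object's attributes through the mapping protocol."""
    def __init__(self, ob):
        self.ob = ob

    def __getitem__(self, key):
        try:
            return getattr(self.ob, key)
        except AttributeError:
            raise KeyError(key) from None

    def __setitem__(self, key, value):
        setattr(self.ob, key, value)

    def __delitem__(self, key):
        try:
            delattr(self.ob, key)
        except AttributeError:
            raise KeyError(key) from None


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    @property
    def norm2(self):
        return self.x ** 2 + self.y ** 2


p = Point(3, 4)
# Names read in the executed code go through __getitem__ (so the property
# is evaluated); names bound in it go through __setitem__.
exec("total = norm2 + 1\nx = 10", {}, AttrMap(p))
```

After the call, total has become an attribute of p and the assignment 
to x has changed p.x, so the property reflects the new value.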


From greg.ewing at canterbury.ac.nz  Tue Jul 10 03:16:20 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 10 Jul 2007 13:16:20 +1200
Subject: [Python-3000] A request to keep dict.setdefault() in 3.0
In-Reply-To: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com>
References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com>
Message-ID: <4692DDE4.9090707@canterbury.ac.nz>

Phillip J. Eby wrote:
> However, there is another class of use cases which use setdefault for 
> its limited atomic properties - the initialization of non-mutated 
> data structures that are shared among threads.

Isn't it rather dangerous to rely on any built-in
Python operations to be atomic? They might happen
to be, but I don't think there's any guarantee
they will stay that way.

--
Greg

From matt-python at theory.org  Tue Jul 10 03:33:07 2007
From: matt-python at theory.org (Matt Chisholm)
Date: Mon, 9 Jul 2007 18:33:07 -0700
Subject: [Python-3000] Announcing PEP 3136
In-Reply-To: <ca471dc20707030114m7fa2d74btb21c8bd1ae8023db@mail.gmail.com>
References: <20070630205444.GD22221@theory.org>
	<ca471dc20707030114m7fa2d74btb21c8bd1ae8023db@mail.gmail.com>
Message-ID: <20070710013307.GA5495@theory.org>

On Jul  3 2007, 10:14, Guido van Rossum wrote:
>On 6/30/07, Matt Chisholm <matt-python at theory.org> wrote:
>>I've created and submitted a new PEP proposing support for labels in
>>Python's break and continue statements.  Georg Brandl has graciously
>>added it to the PEP list as PEP 3136:
>>
>>http://www.python.org/dev/peps/pep-3136/
>
>I think this is a good summary of various proposals that have been
>floated in the past, plus some new ones. As a PEP, it falls short
>because it doesn't pick a solution but merely offers a large menu of
>possible options. Also, there is nothing about implementation yet.

I was hoping the community would pick their favorite option.  And I
planned to address implementation if the PEP was well received.

>However, I'm rejecting it on the basis that code so complicated to
>require this feature is very rare. In most cases there are existing
>work-arounds that produce clean code, for example using 'return'.
>While I'm sure there are some (rare) real cases where clarity of the
>code would suffer from a refactoring that makes it possible to use
>return, this is offset by two issues:
>
>1. The complexity added to the language, permanently. This affects not
>only all Python implementations, but also every source analysis tool,
>plus of course all documentation for the language.

Not knowing anything about the implementation details, I can't argue
with that.

>2. My expectation that the feature will be abused more than it will be
>used right, leading to a net decrease in code clarity (measured across
>all Python code written henceforth). Lazy programmers are everywhere,
>and before you know it you have an incredible mess on your hands of
>unintelligible code.

Are break / continue currently abused more than they are used right,
or used to make code difficult to understand?  I am trying to come up
with an example of mis-use of labeled break or continue that is
mitigated by the absence of labels, and I can't quite think of one.
Maybe I'm being unimaginative. :)

>I realize this is a heavy bar to pass, and somewhat subjective. That's
>okay. There is real value in having a small language. Also, as I said,
>while there are no past PEPs to document it, this has been brought up
>and rejected many times before.

So, I don't quite agree, but you're the boss.  If this has been
rejected before, I don't want to waste everybody's time discussing it
again.  Should I add your justification to the PEP and change its
status?
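
The 'return' work-around mentioned in the rejection can be sketched as
follows: the nested loops are moved into a function, and 'return'
leaves both of them at once, which is what a labeled break would have
done.  The find_first() name and the grid data are hypothetical.

```python
def find_first(grid, target):
    """Return (row, col) of the first cell equal to target, or None."""
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value == target:
                return r, c   # leaves the inner and the outer loop
    return None

grid = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
position = find_first(grid, 5)
```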

-matt

P.S. Thanks to everybody who read the PEP and commented. :)


From greg.ewing at canterbury.ac.nz  Tue Jul 10 03:40:51 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 10 Jul 2007 13:40:51 +1200
Subject: [Python-3000] Announcing PEP 3136
In-Reply-To: <20070710013307.GA5495@theory.org>
References: <20070630205444.GD22221@theory.org>
	<ca471dc20707030114m7fa2d74btb21c8bd1ae8023db@mail.gmail.com>
	<20070710013307.GA5495@theory.org>
Message-ID: <4692E3A3.9010600@canterbury.ac.nz>

Matt Chisholm wrote:
> Are break / continue currently abused more than they are used right,
> or used to make code difficult to understand?

In my experience, using break and continue for anything
other than a standard loop-and-a-half makes code hard
to follow, even when there is only one loop. Labels
would not mitigate that.
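
For reference, a minimal sketch of the standard loop-and-a-half: the
loop fetches an item, tests for the exit condition in the middle, and
only then processes the item.  The read_until_blank() function is a
hypothetical example.

```python
def read_until_blank(lines):
    """Collect stripped lines up to, but not including, the first blank one."""
    collected = []
    it = iter(lines)
    while True:
        line = next(it, None)            # first half: fetch the next item
        if line is None or not line.strip():
            break                        # the single exit, in the middle
        collected.append(line.strip())   # second half: process the item
    return collected
```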

--
Greg

From fdrake at acm.org  Tue Jul 10 05:20:01 2007
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 9 Jul 2007 23:20:01 -0400
Subject: [Python-3000] A request to keep dict.setdefault() in 3.0
In-Reply-To: <4692DDE4.9090707@canterbury.ac.nz>
References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com>
	<4692DDE4.9090707@canterbury.ac.nz>
Message-ID: <200707092320.01898.fdrake@acm.org>

On Monday 09 July 2007, Greg Ewing wrote:
 > Isn't it rather dangerous to rely on any built-in
 > Python operations to be atomic? They might happen
 > to be, but I don't think there's any guarantee
 > they will stay that way.

My limited recollection is that setdefault() was all about it being atomic; 
otherwise there's no benefit to building it in C.  The documentation sadly 
omits mentioning this very important property of setdefault(), however.

If the atomicity isn't promised, then there's no benefit, and writing a helper 
in Python would be fine.  However, as we've seen in this discussion, that's 
critical to many users of the method.  Without it, most users would have to 
add a C (or whatever) function that did the same task and made the atomicity 
promise.

IMHO, it's better to have a single shared implementation with this promise; 
that makes it easier to recognize when reading unfamiliar code.
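
A small sketch of the use case under discussion, assuming CPython, 
where dict.setdefault() on a dict with string keys is carried out by C 
code without running any Python-level callbacks.  The language 
reference does not promise this, which is exactly Greg's concern.  The 
names shared, seen and worker are hypothetical.

```python
import threading

shared = {}
seen = []

def worker():
    # Each thread offers its own fresh list; only one of them is stored,
    # and every thread gets back the stored one.
    value = shared.setdefault('cache', [])
    seen.append(value)

threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

all_same = all(v is shared['cache'] for v in seen)
```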


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From rrr at ronadam.com  Tue Jul 10 06:03:04 2007
From: rrr at ronadam.com (Ron Adam)
Date: Mon, 09 Jul 2007 23:03:04 -0500
Subject: [Python-3000] Change to class construction?
In-Reply-To: <20070710004640.93BD83A40A4@sparrow.telecommunity.com>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
	<f6lld0$qv3$1@sea.gmane.org>
	<20070706172258.1E65B3A4046@sparrow.telecommunity.com>
	<468FA01A.6040707@gmail.com> <f6oh9v$f40$1@sea.gmane.org>
	<20070708140741.GD23765@seldon> <469215F3.90807@gmail.com>
	<20070709150454.641193A404D@sparrow.telecommunity.com>
	<ca471dc20707090813k2271460dw9955c7aec79b2e4d@mail.gmail.com>
	<20070709160115.16C323A404D@sparrow.telecommunity.com>
	<ca471dc20707091356t2caa808u353705ca714d96e3@mail.gmail.com>
	<4692D572.4080401@ronadam.com>
	<20070710004640.93BD83A40A4@sparrow.telecommunity.com>
Message-ID: <469304F8.2060303@ronadam.com>


Phillip J. Eby wrote:
> At 07:40 PM 7/9/2007 -0500, Ron Adam wrote:
> 
>> Guido van Rossum wrote:
>>
>>> We could easily change this to return a
>>> writable mapping that's not a dict at all but a "view" on the locals
>>> just as dict.keys() returns a view on a dict. I don't see why locals()
>>> couldn't return the object used to represent the namespace, but I
>>> don't see that it couldn't be some view on that object either,
>>> depending on the details of the implementation.
>>
>> This sounds great! I just recently wanted to pass a namespace to exec, 
>> but it refuses to accept anything but a dictionary for a local name 
>> space.
> 
> You can already do that in Python 2.4.
> 
> 
>> What I really want to do is pass an object as the local namespace.  
>> And have the exec() use it complete with its properties intact.  
>> Passing obj.__dict__ doesn't work in this case.
> 
> You need a wrapper, e.g.:
> 
>      class AttrMap(object):
>          def __init__(self, ob):
>              self.ob = ob
>          def __getitem__(self, key):
>              try: return getattr(self.ob, key)
>              except AttributeError: raise KeyError(key)
>          # setitem, delitem, etc...

Thanks, that should solve (I hope) the particular case I have.  Although 
it would have been nicer if it was in the library someplace.  Of course 
everyone says that about nearly everything.

It might be nice if locals() could receive an argument so it can be used 
with classes, possibly returning a wrapped class view such as the example 
you gave.

Regards,
    Ron


From ncoghlan at gmail.com  Tue Jul 10 11:33:04 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 10 Jul 2007 19:33:04 +1000
Subject: [Python-3000] Change to class construction?
In-Reply-To: <20070709160115.16C323A404D@sparrow.telecommunity.com>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
	<f6lld0$qv3$1@sea.gmane.org>
	<20070706172258.1E65B3A4046@sparrow.telecommunity.com>
	<468FA01A.6040707@gmail.com> <f6oh9v$f40$1@sea.gmane.org>
	<20070708140741.GD23765@seldon> <469215F3.90807@gmail.com>
	<20070709150454.641193A404D@sparrow.telecommunity.com>
	<ca471dc20707090813k2271460dw9955c7aec79b2e4d@mail.gmail.com>
	<20070709160115.16C323A404D@sparrow.telecommunity.com>
Message-ID: <46935250.8060903@gmail.com>

Phillip J. Eby wrote:
> At 06:13 PM 7/9/2007 +0300, Guido van Rossum wrote:
>> On 7/9/07, Phillip J. Eby <pje at telecommunity.com> wrote:
>> > At 09:03 PM 7/9/2007 +1000, Nick Coghlan wrote:
>> > >However, I will point out that setting class attributes via 
>> locals() is
>> > >formally undefined (it happens to work in current versions of CPython,
>> > >but there's no guarantee that will always be the case).
>> >
>> > As of PEP 3115, it's no longer undefined for class statements.
>>
>> Where does it say so? To be honest, I don't know where to find Nick's
>> claim in the reference manual.
> 
> I assume Nick is referring to:
> 
>   http://www.python.org/doc/2.2/ref/execframes.html
> 
> which says it's undefined.

I was actually referring to this warning in the library reference docs 
for the locals() function:

   """Warning: The contents of this dictionary should not be modified; 
changes may not affect the values of local variables used by the 
interpreter."""
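
What the warning means in a function body can be seen with a short 
sketch, assuming CPython.  The try_to_rebind() function is a 
hypothetical example; class suites, the subject of this thread, behave 
differently because there the namespace object itself is used.

```python
def try_to_rebind():
    x = 1
    # This writes to a dictionary, not to the real local variable.
    locals()['x'] = 2
    return x

result = try_to_rebind()
```

The function still returns 1: the write to the dictionary never 
reaches the variable the interpreter actually uses.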

Cheers,
Nick.


-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From guido at python.org  Tue Jul 10 23:14:27 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 11 Jul 2007 00:14:27 +0300
Subject: [Python-3000] Need help fixing failing Py3k Unittests in py3k-struni
Message-ID: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>

One of the most daunting tasks remaining for Python 3.0a1 (to be
released by the end of August) is fixing the remaining failing unit
tests in the py3k-struni branch
(http://svn.python.org/view/python/branches/py3k-struni/).

This is the branch where I have started the work on the
string/unicode unification. I want to promote this branch to become the
"main" Py3k branch ASAP (by renaming it to py3k), but I don't want to
do that until all unit tests pass. I've been working diligently on
this task, and I've got it down to about 50 tests that are failing on
at least one of OSX and Ubuntu (the platforms to which I have easy
access). Now I need help.

To facilitate distributing the task of getting the remaining tests to
pass, I've created a wiki page:
http://wiki.python.org/moin/Py3kStrUniTests . Please help! It's easy
to help: (1) check out the py3k-struni branch; (2) build it; (3) pick
a test and figure out why it's failing; (4) produce a fix; (5) submit
the fix to SF (or check it in, if you have submit privileges and are
confident enough).

In order to avoid duplicate work, I've come up with a simple protocol:
you mark a test in the wiki as "MINE" (with your name) when you start
looking at it. You mark it as "FIXED [IN SF]" once you fix it, adding
the patch# if the fix is in SF. If you give up, remove your lock,
adding instead a note with what you've found (even just the names of
the failing subtests is helpful).

Please help!

There are other tasks, see PEP 3100. Mail me if you're interested in
anything specifically. (Please don't ask me "do you think I could do
this" -- you know better than I whether you're capable of coding at a
specific level. If you don't understand the task, you're probably not
qualified.)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From lists at cheimes.de  Wed Jul 11 00:30:13 2007
From: lists at cheimes.de (Christian Heimes)
Date: Wed, 11 Jul 2007 00:30:13 +0200
Subject: [Python-3000] Need help fixing failing Py3k Unittests in
	py3k-struni
In-Reply-To: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>
References: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>
Message-ID: <46940875.2000606@cheimes.de>

Guido van Rossum wrote:
> Please help!

I've made a meta patch that makes debugging the bugs a lot easier. It
replaces assert_(foo == bar) and failUnless(foo == bar) with
failUnlessEqual(foo, bar). failUnlessEqual shows the value of foo and
bar when they are not equal.

http://www.python.org/sf/1751515

sed -r "s/self\.assert_\((.*)\ ==/self.failUnlessEqual\(\1,/" -i *.py
sed -r "s/self\.failUnless\((.*)\ ==/self.failUnlessEqual\(\1,/" -i *.py

By the way the ctypes unit tests are causing a segfault on my machine:
test_ctypes
Warning: could not import ctypes.test.test_numbers: unpack requires a
string argument of length 1
Segmentation fault

Ubuntu 7.04 on an i386 machine with an Intel P3.

Christian


From steven.bethard at gmail.com  Wed Jul 11 00:38:53 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Tue, 10 Jul 2007 16:38:53 -0600
Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k
	Unittests in py3k-struni
In-Reply-To: <46940875.2000606@cheimes.de>
References: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>
	<46940875.2000606@cheimes.de>
Message-ID: <d11dcfba0707101538l1d3c3244i788032054da92717@mail.gmail.com>

On 7/10/07, Christian Heimes <lists at cheimes.de> wrote:
> Guido van Rossum wrote:
> > Please help!
>
> I've made a meta patch that makes debugging the bugs a lot easier. It
> replaces assert_(foo == bar) and failUnless(foo == bar) with
> failUnlessEqual(foo, bar). failUnlessEqual shows the value of foo and
> bar when they are not equal.
>
> http://www.python.org/sf/1751515
>
> sed -r "s/self\.assert_\((.*)\ ==/self.failUnlessEqual\(\1,/" -i *.py
> sed -r "s/self\.failUnless\((.*)\ ==/self.failUnlessEqual\(\1,/" -i *.py

Some of these look questionable, e.g.:

-        self.assert_(d == self.spamle or d == self.spambe)
+        self.failUnlessEqual(d == self.spamle or d, self.spambe)
...
-        self.assert_((a == 42) is False)
+        self.failUnlessEqual((a, 42) is False)

I'd probably go with something a little more restrictive, maybe:

    r'self.assert_\(\S+ == \S+\)'

Something like that ought to have fewer false positives.
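
A sketch of an even more conservative rewrite in Python: the pattern
is anchored to the whole line and only accepts plain names, attribute
references and numbers as operands, so the two questionable cases
above are left untouched.  The PATTERN and rewrite() names are
hypothetical, and this is still a heuristic whose output needs
reviewing by hand.

```python
import re

# Whole-line match; operands limited to word characters and dots.
PATTERN = re.compile(
    r'^([ \t]*)self\.assert_\(([\w.]+) == ([\w.]+)\)[ \t]*$',
    re.MULTILINE,
)

def rewrite(source):
    """Turn simple self.assert_(a == b) lines into failUnlessEqual(a, b)."""
    return PATTERN.sub(r'\1self.failUnlessEqual(\2, \3)', source)

sample = (
    "        self.assert_(foo == bar)\n"
    "        self.assert_(d == self.spamle or d == self.spambe)\n"
    "        self.assert_((a == 42) is False)\n"
)
converted = rewrite(sample)
```

The price of the restriction is that calls, subscripts and string
literals as operands are skipped and have to be converted manually.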

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From lists at cheimes.de  Wed Jul 11 01:17:26 2007
From: lists at cheimes.de (Christian Heimes)
Date: Wed, 11 Jul 2007 01:17:26 +0200
Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k
	Unittests in py3k-struni
In-Reply-To: <d11dcfba0707101538l1d3c3244i788032054da92717@mail.gmail.com>
References: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>	<46940875.2000606@cheimes.de>
	<d11dcfba0707101538l1d3c3244i788032054da92717@mail.gmail.com>
Message-ID: <46941386.9080301@cheimes.de>

Steven Bethard wrote:
> I'd probably go with something a little more restrictive, maybe:
> 
>     r'self.assert_\(\S+ == \S+\)'
> 
> Something like that ought to have fewer false positives.

Woops! You are right. Even your pattern has caused some false positives
but I've reread the patch and removed the offending lines. I'm going to
upload another patch as soon as I have verified mine again.

Christian


From lists at cheimes.de  Wed Jul 11 03:54:49 2007
From: lists at cheimes.de (Christian Heimes)
Date: Wed, 11 Jul 2007 03:54:49 +0200
Subject: [Python-3000] Need help fixing failing Py3k Unittests in
	py3k-struni
In-Reply-To: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>
References: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>
Message-ID: <f71d9d$thn$1@sea.gmane.org>

I found a bug in the str type that may affect a lot of tests.

In the py3k-struni branch the str() constructor doesn't use __str__ when
the argument is an instance of a subclass of str. A user defined string
can't change __str__(). The __repr__ method isn't affected.

It works in Python 2.5 and in the p3yk branch.

Python 3.0x (py3k-struni:56245, Jul 10 2007, 23:34:56)
[GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> class Mystr(str):
...     def __str__(self): return 'v'
...
>>> s = Mystr('x')
>>> s
'x'
>>> str(s)
'x' # <- SHOULD RETURN 'v'
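
The expected behaviour can be written down as a small check.  This is
a sketch of what the str() constructor should do for a subclass that
overrides __str__; the variable names are hypothetical.

```python
class Mystr(str):
    def __str__(self):
        return 'v'

s = Mystr('x')
converted = str(s)   # should go through Mystr.__str__
shown = repr(s)      # __repr__ is inherited from str and is unaffected
```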

Christian


From guido at python.org  Wed Jul 11 08:48:49 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 11 Jul 2007 09:48:49 +0300
Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k
	Unittests in py3k-struni
In-Reply-To: <46941386.9080301@cheimes.de>
References: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>
	<46940875.2000606@cheimes.de>
	<d11dcfba0707101538l1d3c3244i788032054da92717@mail.gmail.com>
	<46941386.9080301@cheimes.de>
Message-ID: <ca471dc20707102348p78041da9hbd4f08b84d4abab1@mail.gmail.com>

Please use self.assertEqual() instead of self.failUnlessEqual() -- the
assertEqual() form is much more common. Otherwise, good idea!
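
To show why the conversion helps, a minimal sketch comparing the
failure message of a bare comparison with that of assertEqual().  The
Demo test case is hypothetical and fails on purpose; it is defined
inside a function so that test runners do not collect it.

```python
import unittest

def failure_messages():
    class Demo(unittest.TestCase):
        def test_bare_comparison(self):
            self.assertTrue(1 + 1 == 3)

        def test_assert_equal(self):
            self.assertEqual(1 + 1, 3)

    result = unittest.TestResult()
    unittest.defaultTestLoader.loadTestsFromTestCase(Demo).run(result)
    # Map each test method name to the last line of its traceback.
    return {test.id().rsplit('.', 1)[-1]: tb.strip().splitlines()[-1]
            for test, tb in result.failures}

messages = failure_messages()
```

The bare comparison only reports that the expression was false, while
assertEqual() reports both values.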

On 7/11/07, Christian Heimes <lists at cheimes.de> wrote:
> Steven Bethard wrote:
> > I'd probably go with something a little more restrictive, maybe:
> >
> >     r'self.assert_\(\S+ == \S+\)'
> >
> > Something like that ought to have fewer false positives.
>
> Woops! You are right. Even your pattern has caused some false positives
> but I've reread the patch and removed the offending lines. I'm going to
> upload another patch as soon as I have verified mine again.
>
> Christian
>
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From walter at livinglogic.de  Wed Jul 11 09:51:58 2007
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Wed, 11 Jul 2007 09:51:58 +0200
Subject: [Python-3000] Need help fixing failing Py3k Unittests
	in	py3k-struni
In-Reply-To: <f71d9d$thn$1@sea.gmane.org>
References: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>
	<f71d9d$thn$1@sea.gmane.org>
Message-ID: <46948C1E.5050800@livinglogic.de>

Christian Heimes wrote:

> I found a bug in the str type that may affect a lot of tests.
> 
> In the py3k-struni branch the str() constructor doesn't use __str__ when
> the argument is an instance of a subclass of str. A user defined string
> can't change __str__(). The __repr__ method isn't affected.

This hasn't been rewired yet. Behind the covers str still behaves like 
unicode, i.e. it uses __unicode__ for conversion.

Servus,
    Walter

From guido at python.org  Wed Jul 11 10:01:05 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 11 Jul 2007 11:01:05 +0300
Subject: [Python-3000] Need help fixing failing Py3k Unittests in
	py3k-struni
In-Reply-To: <46948C1E.5050800@livinglogic.de>
References: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>
	<f71d9d$thn$1@sea.gmane.org> <46948C1E.5050800@livinglogic.de>
Message-ID: <ca471dc20707110101m5f65249aj4cf1122be20c5856@mail.gmail.com>

Yeah, I'm looking into this right now. What a mess! But I'm close to a fix.

There's more that causes test_descr to fail however. Bleh, what a
terrible unit test -- it doesn't use the unittest module, and a single
failure aborts the rest of the test.

--Guido

On 7/11/07, Walter Dörwald <walter at livinglogic.de> wrote:
> Christian Heimes wrote:
>
> > I found a bug in the str type that may affect a lot of tests.
> >
> > In the py3k-struni branch the str() constructor doesn't use __str__ when
> > the argument is an instance of a subclass of str. A user defined string
> > can't change __str__(). The __repr__ method isn't affected.
>
> This hasn't been rewired yet. Behind the covers str still behaves like
> unicode, i.e. it uses __unicode__ for conversion.
>
> Servus,
>     Walter
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Wed Jul 11 11:30:58 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 11 Jul 2007 12:30:58 +0300
Subject: [Python-3000] Need help fixing failing Py3k Unittests in
	py3k-struni
In-Reply-To: <ca471dc20707110101m5f65249aj4cf1122be20c5856@mail.gmail.com>
References: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>
	<f71d9d$thn$1@sea.gmane.org> <46948C1E.5050800@livinglogic.de>
	<ca471dc20707110101m5f65249aj4cf1122be20c5856@mail.gmail.com>
Message-ID: <ca471dc20707110230r21b2938bxa6d5bffe8fd968aa@mail.gmail.com>

Fixed in subversion. Please do review r56252 to see that I did the right thing.

On 7/11/07, Guido van Rossum <guido at python.org> wrote:
> Yeah, I'm looking into this right now. What a mess! But I'm close to a fix.
>
> There's more that causes test_descr to fail however. Bleh, what a
> terrible unit test -- it doesn't use the unittest module, and a single
> failure aborts the rest of the test.
>
> --Guido
>
> On 7/11/07, Walter Dörwald <walter at livinglogic.de> wrote:
> > Christian Heimes wrote:
> >
> > > I found a bug in the str type that may affect a lot of tests.
> > >
> > > In the py3k-struni branch the str() constructor doesn't use __str__ when
> > > the argument is an instance of a subclass of str. A user defined string
> > > can't change __str__(). The __repr__ method isn't affected.
> >
> > This hasn't been rewired yet. Behind the covers str still behaves like
> > unicode, i.e. it uses __unicode__ for conversion.
> >
> > Servus,
> >     Walter
> >
>
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Wed Jul 11 13:45:04 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 11 Jul 2007 14:45:04 +0300
Subject: [Python-3000] Fwd: Your confirmation is required to leave the
	Python-3000 mailing list
In-Reply-To: <mailman.0.1184147038.21671.python-3000@python.org>
References: <mailman.0.1184147038.21671.python-3000@python.org>
Message-ID: <ca471dc20707110445n28be3d1g8d46daa701eaeab2@mail.gmail.com>

Which joker tried to unsub me?

---------- Forwarded message ----------
From: python-3000-confirm+e08ed5828...1ff418f380758543 at python.org
<python-3000-confirm+e08ed58281...51ff418f380758543 at python.org>
Date: Jul 11, 2007 12:43 PM
Subject: Your confirmation is required to leave the Python-3000 mailing list
To: guido at python.org


Mailing list removal confirmation notice for mailing list Python-3000

We have received a request for the removal of your email address,
"guido at python.org" from the python-3000 at python.org mailing list.  To
confirm that you want to be removed from this mailing list, simply
reply to this message, keeping the Subject: header intact.  Or visit
this web page:

    http://mail.python.org/mailman/confirm/python-3000/e08ed5828...8543


Or include the following line -- and only the following line -- in a
message to python-3000-request at python.org:

    confirm e08e...0758543

Note that simply sending a `reply' to this message should work from
most mail readers, since that usually leaves the Subject: line in the
right form (additional "Re:" text in the Subject: is okay).

If you do not wish to be removed from this list, please simply
disregard this message.  If you think you are being maliciously
removed from the list, or have any other questions, send them to
python-3000-owner at python.org.


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Wed Jul 11 13:46:12 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 11 Jul 2007 14:46:12 +0300
Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k
	Unittests in py3k-struni
In-Reply-To: <f728g6$6gt$1@sea.gmane.org>
References: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>
	<46940875.2000606@cheimes.de> <f728g6$6gt$1@sea.gmane.org>
Message-ID: <ca471dc20707110446je177145x8d0b0cf2ca2f8487@mail.gmail.com>

On 7/11/07, Thomas Heller <theller at ctypes.org> wrote:
> Christian Heimes schrieb:
> >
> > By the way the ctypes unit tests are causing a segfault on my machine:
> > test_ctypes
> > Warning: could not import ctypes.test.test_numbers: unpack requires a
> > string argument of length 1
> > Segmentation fault
> >
> > Ubuntu 7.04 on an i386 machine with an Intel P3.
>
> I can reproduce this.  ctypes.test.test_numbers is easy to fix, but there
> are other severe problems with ctypes.
>
> I would love to look into these, but I prefer debugging on Windows.
> However, the windows build does not work because the _fileio builtin
> module is missing from config.c.  Again, this is not so easy to fix,
> because the ftruncate function does not exist on Windows.

I don't have a Windows box; contributions to fix this situation are welcome.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From amauryfa at gmail.com  Wed Jul 11 14:27:35 2007
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Wed, 11 Jul 2007 14:27:35 +0200
Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k
	Unittests in py3k-struni
In-Reply-To: <ca471dc20707110446je177145x8d0b0cf2ca2f8487@mail.gmail.com>
References: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>
	<46940875.2000606@cheimes.de> <f728g6$6gt$1@sea.gmane.org>
	<ca471dc20707110446je177145x8d0b0cf2ca2f8487@mail.gmail.com>
Message-ID: <e27efe130707110527r58a3d1a0g7ab78d9e1707d51c@mail.gmail.com>

Thomas Heller wrote:
> I would love to look into these, but I prefer debugging on Windows.
> However, the windows build does not work because the _fileio builtin
> module is missing from config.c.  Again, this is not so easy to fix,
> because the ftruncate function does not exist on Windows.

In fileobject.c, there is a replacement for ftruncate. See the code
around the call to SetEndOfFile().

I'll try to provide a patch later today.

-- 
Amaury Forgeot d'Arc

From guido at python.org  Wed Jul 11 14:41:21 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 11 Jul 2007 15:41:21 +0300
Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k
	Unittests in py3k-struni
In-Reply-To: <e27efe130707110527r58a3d1a0g7ab78d9e1707d51c@mail.gmail.com>
References: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>
	<46940875.2000606@cheimes.de> <f728g6$6gt$1@sea.gmane.org>
	<ca471dc20707110446je177145x8d0b0cf2ca2f8487@mail.gmail.com>
	<e27efe130707110527r58a3d1a0g7ab78d9e1707d51c@mail.gmail.com>
Message-ID: <ca471dc20707110541i4a3199c1w4ea14a1c486bdd01@mail.gmail.com>

That would be great! Assign it to theller who can test it much better
than I can.

On 7/11/07, Amaury Forgeot d'Arc <amauryfa at gmail.com> wrote:
> Thomas Heller wrote:
> > I would love to look into these, but I prefer debugging on Windows.
> > However, the windows build does not work because the _fileio builtin
> > module is missing from config.c.  Again, this is not so easy to fix,
> > because the ftruncate function does not exist on Windows.
>
> In fileobject.c, there is a replacement for ftruncate. See the code
> around the call to SetEndOfFile().
>
> I'll try to provide a patch later today.
>
> --
> Amaury Forgeot d'Arc
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From theller at ctypes.org  Wed Jul 11 14:50:44 2007
From: theller at ctypes.org (Thomas Heller)
Date: Wed, 11 Jul 2007 14:50:44 +0200
Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k
	Unittests in py3k-struni
In-Reply-To: <ca471dc20707110541i4a3199c1w4ea14a1c486bdd01@mail.gmail.com>
References: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>	<46940875.2000606@cheimes.de>
	<f728g6$6gt$1@sea.gmane.org>	<ca471dc20707110446je177145x8d0b0cf2ca2f8487@mail.gmail.com>	<e27efe130707110527r58a3d1a0g7ab78d9e1707d51c@mail.gmail.com>
	<ca471dc20707110541i4a3199c1w4ea14a1c486bdd01@mail.gmail.com>
Message-ID: <f72jn4$cma$1@sea.gmane.org>

Guido van Rossum schrieb:
> That would be great! Assign it to theller who can test it much better
> than I can.
> 
> On 7/11/07, Amaury Forgeot d'Arc <amauryfa at gmail.com> wrote:
>> Thomas Heller wrote:
>> > I would love to look into these, but I prefer debugging on Windows.
>> > However, the windows build does not work because the _fileio builtin
>> > module is missing from config.c.  Again, this is not so easy to fix,
>> > because the ftruncate function does not exist on Windows.
>>
>> In fileobject.c, there is a replacement for ftruncate. See the code
>> around the call to SetEndOfFile().
>>
>> I'll try to provide a patch later today.

Awaiting your patch ;-).

The most important problem now, IMO, is that wide filenames on Windows are not
implemented; see the code starting at line 148 in _fileio.c.  This prevents
most unit tests from running because test_support cannot be imported:

C:\svn\py3k-struni\PCbuild>python  -E -tt ../lib/test/regrtest.py
Traceback (most recent call last):
  File "../lib/test/regrtest.py", line 165, in <module>
    from test import test_support
  File "C:\svn\py3k-struni\lib\test\test_support.py", line 182, in <module>
    fp = open(TESTFN, 'w+')
  File "C:\svn\py3k-struni\lib\site.py", line 412, in __new__
    return io.open(*args, **kwds)
  File "C:\svn\py3k-struni\lib\io.py", line 122, in open
    (updating and "+" or ""))
NotImplementedError: Windows wide filenames are not yet supported

Thomas


From theller at ctypes.org  Wed Jul 11 16:08:47 2007
From: theller at ctypes.org (Thomas Heller)
Date: Wed, 11 Jul 2007 16:08:47 +0200
Subject: [Python-3000] Heaptypes
Message-ID: <f72o9f$v6i$1@sea.gmane.org>

ctypes creates heaptypes with this call, in _ctypes.c, line 3986 (slightly simplified):

	result = PyObject_CallFunction((PyObject *)&ArrayType_Type,
				       "s(O){s:n,s:O}",
				       name,
				       &Array_Type,
				       "_length_",
				       length,
				       "_type_",
				       itemtype
		);

The call succeeds.  Printing the type fails with an assertion:

theller at tubu:~/devel/py3k-struni$ ./python
Python 3.0x (py3k-struni:56268M, Jul 11 2007, 15:56:43)
[GCC 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu9)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from ctypes import c_int
[54751 refs]
>>> atype = c_int * 3
[54762 refs]
>>> atype.__name__
s'c_int_Array_3'
[55278 refs]
>>> repr(atype)
python: Objects/unicodeobject.c:630: PyUnicodeUCS2_FromFormatV: Assertion `obj && ((((obj)->ob_type)->tp_flags & ((1L<<28))) != 0)' failed.
Abgebrochen
theller at tubu:~/devel/py3k-struni$

As one can see, the __name__ is a byte string (or what is this called now?).
The fix is probably to use a 'U' format character in the PyObject_CallFunction format string,
but I assume the call should have failed in the first place?  And what about the dictionary that
is constructed for the call, '{s:n,s:O}': should it use 'U' format chars as well?

Thomas


From guido at python.org  Wed Jul 11 16:15:41 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 11 Jul 2007 17:15:41 +0300
Subject: [Python-3000] Heaptypes
In-Reply-To: <f72o9f$v6i$1@sea.gmane.org>
References: <f72o9f$v6i$1@sea.gmane.org>
Message-ID: <ca471dc20707110715s4cd53401t53e9075bdc2ea1df@mail.gmail.com>

There are currently three "string" types, here shown with their repr styles:

- str = 'same as unicode in 2.x'
- bytes = b'new, mutable list of small ints'
- str8 = s'same as str in 2.x'

The s'...' notation means it's an 8-bit string (not a bytes array).
This is not supported in the syntax; it's just used on output. (Use
str8(b'...') to create one of these.) I'm still hoping to remove this
type before the release, but it appears to be still necessary so far.

I don't know enough about ...CallFunction to help you with the rest.

--Guido

On 7/11/07, Thomas Heller <theller at ctypes.org> wrote:
> ctypes creates heaptypes with this call, in _ctypes.c, line 3986 (slightly simplified):
>
>         result = PyObject_CallFunction((PyObject *)&ArrayType_Type,
>                                        "s(O){s:n,s:O}",
>                                        name,
>                                        &Array_Type,
>                                        "_length_",
>                                        length,
>                                        "_type_",
>                                        itemtype
>                 );
>
> The call succeeds.  Printing the type fails with an assertion:
>
> theller at tubu:~/devel/py3k-struni$ ./python
> Python 3.0x (py3k-struni:56268M, Jul 11 2007, 15:56:43)
> [GCC 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu9)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from ctypes import c_int
> [54751 refs]
> >>> atype = c_int * 3
> [54762 refs]
> >>> atype.__name__
> s'c_int_Array_3'
> [55278 refs]
> >>> repr(atype)
> python: Objects/unicodeobject.c:630: PyUnicodeUCS2_FromFormatV: Assertion `obj && ((((obj)->ob_type)->tp_flags & ((1L<<28))) != 0)' failed.
> Abgebrochen
> theller at tubu:~/devel/py3k-struni$
>
> As one can see, the __name__ is a byte string (or what is this called now?).
> The fix is probably to use a 'U' format character in the PyObject_CallFunction format string,
> but I assume the call should have failed in the first place?  And what about the dictionary that
> is constructed for the call, '{s:n,s:O}': should it use 'U' format chars as well?
>
> Thomas
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From theller at ctypes.org  Wed Jul 11 16:39:00 2007
From: theller at ctypes.org (Thomas Heller)
Date: Wed, 11 Jul 2007 16:39:00 +0200
Subject: [Python-3000] Heaptypes
In-Reply-To: <ca471dc20707110715s4cd53401t53e9075bdc2ea1df@mail.gmail.com>
References: <f72o9f$v6i$1@sea.gmane.org>
	<ca471dc20707110715s4cd53401t53e9075bdc2ea1df@mail.gmail.com>
Message-ID: <f72q24$61n$1@sea.gmane.org>

Guido van Rossum schrieb:
> There are currently three "string" types, here shown with their repr styles:
> 
> - str = 'same as unicode in 2.x'
> - bytes = b'new, mutable list of small ints'
> - str8 = s'same as str in 2.x'
> 
> The s'...' notation means it's an 8-bit string (not a bytes array).
> This is not supported in the syntax; it's just used on output. (Use
> str8(b'...') to create one of these.) I'm still hoping to remove this
> type before the release, but it appears to be still necessary so far.
> 
> I don't know enough about ...CallFunction to help you with the rest.

Let me explain it in other words.  This code creates a new type:

>>> ht = type("name", (object,), {})
[47054 refs]
>>> ht
<class '__main__.name'>
[47093 refs]

The '__name__' attribute is a (unicode) string:

>>> ht.__name__
'name'
[47121 refs]
>>>

But I can also create a type in this way:

>>> ht = type(str8(b"name"), (object,), {})
[47208 refs]

The __name__ attribute is a str8 instance:

>>> ht.__name__
s'name'
[47236 refs]

Printing the type triggers an assertion:

>>> ht
Assertion failed: obj && PyUnicode_Check(obj), file \svn\py3k-struni\Objects\unicodeobject.c, line 630
C:\svn\py3k-struni\PCbuild>

because parts of the code assume that the '__name__' is a (unicode) string.

Thomas


From guido at python.org  Wed Jul 11 16:47:47 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 11 Jul 2007 17:47:47 +0300
Subject: [Python-3000] Heaptypes
In-Reply-To: <f72q24$61n$1@sea.gmane.org>
References: <f72o9f$v6i$1@sea.gmane.org>
	<ca471dc20707110715s4cd53401t53e9075bdc2ea1df@mail.gmail.com>
	<f72q24$61n$1@sea.gmane.org>
Message-ID: <ca471dc20707110747o16064f8an8889467d81bb434c@mail.gmail.com>

On 7/11/07, Thomas Heller <theller at ctypes.org> wrote:
> Let me explain it in other words.  This code creates a new type:
>
> >>> ht = type("name", (object,), {})
> [47054 refs]
> >>> ht
> <class '__main__.name'>
> [47093 refs]
>
> The '__name__' attribute is a (unicode) string:
>
> >>> ht.__name__
> 'name'
> [47121 refs]
> >>>
>
> But I can also create a type in this way:
>
> >>> ht = type(str8(b"name"), (object,), {})
> [47208 refs]
>
> The __name__ attribute is a str8 instance:
>
> >>> ht.__name__
> s'name'
> [47236 refs]
>
> Printing the type triggers an assertion:
>
> >>> ht
> Assertion failed: obj && PyUnicode_Check(obj), file \svn\py3k-struni\Objects\unicodeobject.c, line 630
> C:\svn\py3k-struni\PCbuild>
>
> because parts of the code assume that the '__name__' is a (unicode) string.

Hm. I guess the creation must insist that __name__ is a unicode. Can
you fix this yourself?
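
A minimal sketch of what "insisting on a unicode __name__" looks like from
the Python side. It assumes a released Python 3 interpreter rather than the
py3k-struni branch discussed here (str8 no longer exists there, so bytes
stands in for the 8-bit string), and make_type() is a hypothetical helper
written only for illustration, not code from typeobject.c:

```python
# Assumption: released Python 3, where type() rejects a non-str name.

def make_type(name):
    """Create an empty class, insisting that the name is a (unicode) str."""
    if not isinstance(name, str):
        raise TypeError("type name must be str, not %s" % type(name).__name__)
    return type(name, (object,), {})

# A str name works and is stored unchanged.
ht = make_type("name")
print(repr(ht.__name__))

# The built-in constructor performs the same kind of check at creation time,
# so the bad name is refused up front instead of failing later in repr().
try:
    type(b"name", (object,), {})
except TypeError as exc:
    rejected = True
    print("rejected:", exc)
else:
    rejected = False
```

Rejecting the name at creation time keeps the failure next to its cause; the
assertion shown above only fires later, when the type is printed.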

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From thomas at python.org  Wed Jul 11 17:07:48 2007
From: thomas at python.org (Thomas Wouters)
Date: Wed, 11 Jul 2007 08:07:48 -0700
Subject: [Python-3000] Fwd: Your confirmation is required to leave the
	Python-3000 mailing list
In-Reply-To: <ca471dc20707110445n28be3d1g8d46daa701eaeab2@mail.gmail.com>
References: <mailman.0.1184147038.21671.python-3000@python.org>
	<ca471dc20707110445n28be3d1g8d46daa701eaeab2@mail.gmail.com>
Message-ID: <9e804ac0707110807i1c2e515dy5538aa487b711649@mail.gmail.com>

I can't find the message you forwarded in mail.python.org's logs (although I
may be looking wrong; it's hard to do such a search without the full headers
of the original) -- but it looks to me like it was a hoax, and not an actual
unsubscription request from mail.python.org.

On 7/11/07, Guido van Rossum <guido at python.org> wrote:
>
> Which joker tried to unsub me?
>
> ---------- Forwarded message ----------
> From: python-3000-confirm+e08ed5828...1ff418f380758543 at python.org
> <python-3000-confirm+e08ed58281...51ff418f380758543 at python.org>
> Date: Jul 11, 2007 12:43 PM
> Subject: Your confirmation is required to leave the Python-3000 mailing
> list
> To: guido at python.org
>
>
> Mailing list removal confirmation notice for mailing list Python-3000
>
> We have received a request for the removal of your email address,
> "guido at python.org" from the python-3000 at python.org mailing list.  To
> confirm that you want to be removed from this mailing list, simply
> reply to this message, keeping the Subject: header intact.  Or visit
> this web page:
>
>     http://mail.python.org/mailman/confirm/python-3000/e08ed5828...8543
>
>
> Or include the following line -- and only the following line -- in a
> message to python-3000-request at python.org:
>
>     confirm e08e...0758543
>
> Note that simply sending a `reply' to this message should work from
> most mail readers, since that usually leaves the Subject: line in the
> right form (additional "Re:" text in the Subject: is okay).
>
> If you do not wish to be removed from this list, please simply
> disregard this message.  If you think you are being maliciously
> removed from the list, or have any other questions, send them to
> python-3000-owner at python.org.
>
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>


-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From walter at livinglogic.de  Wed Jul 11 17:28:09 2007
From: walter at livinglogic.de (=?UTF-8?B?V2FsdGVyIETDtnJ3YWxk?=)
Date: Wed, 11 Jul 2007 17:28:09 +0200
Subject: [Python-3000] Need help fixing failing Py3k Unittests in
	py3k-struni
In-Reply-To: <ca471dc20707110230r21b2938bxa6d5bffe8fd968aa@mail.gmail.com>
References: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>	
	<f71d9d$thn$1@sea.gmane.org> <46948C1E.5050800@livinglogic.de>	
	<ca471dc20707110101m5f65249aj4cf1122be20c5856@mail.gmail.com>
	<ca471dc20707110230r21b2938bxa6d5bffe8fd968aa@mail.gmail.com>
Message-ID: <4694F709.2040304@livinglogic.de>

Guido van Rossum wrote:

> Fixed in subversion. Please do review r56252 to see that I did the right
> thing.

I haven't looked at test_descr.py but the rest looks good to me.

I guess for the final version of Py3000 type_set_name() in typeobject.c
will not downgrade unicode strings to str8, but instead upgrade str8
objects to unicode.

Also now that PyObject_Unicode() tries __unicode__ first and then tp_str
should we rename all __unicode__ methods to __str__, or will __unicode__
stay?

Servus,
   Walter

> On 7/11/07, Guido van Rossum <guido at python.org> wrote:
>> Yeah, I'm looking into this right now. What a mess! But I'm close to
>> a fix.
>>
>> There's more that causes test_descr to fail however. Bleh, what a
>> terrible unit test -- it doesn't use the unittest module, and a single
>> failure aborts the rest of the test.
>>
>> --Guido
>>
>> On 7/11/07, Walter Dörwald <walter at livinglogic.de> wrote:
>> > Christian Heimes wrote:
>> >
>> > > I found a bug in the str type that may affect a lot of tests.
>> > >
>> > > In the py3k-struni branch the str() constructor doesn't use
>> __str__ when
>> > > the argument is an instance of a subclass of str. A user defined
>> string
>> > > can't change __str__(). The __repr__ method isn't affected.
>> >
>> > This hasn't been rewired yet. Behind the covers str still behaves like
>> > unicode, i.e. it uses __unicode__ for conversion.
>> >
>> > Servus,
>> >     Walter
>> >
>>
>>
>> -- 
>> --Guido van Rossum (home page: http://www.python.org/~guido/)
>>
> 
> 


From theller at ctypes.org  Wed Jul 11 17:52:49 2007
From: theller at ctypes.org (Thomas Heller)
Date: Wed, 11 Jul 2007 17:52:49 +0200
Subject: [Python-3000] Need help fixing failing Py3k Unittests in
	py3k-struni
In-Reply-To: <4694F709.2040304@livinglogic.de>
References: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>		<f71d9d$thn$1@sea.gmane.org>
	<46948C1E.5050800@livinglogic.de>		<ca471dc20707110101m5f65249aj4cf1122be20c5856@mail.gmail.com>	<ca471dc20707110230r21b2938bxa6d5bffe8fd968aa@mail.gmail.com>
	<4694F709.2040304@livinglogic.de>
Message-ID: <f72uch$mki$1@sea.gmane.org>

Walter Dörwald schrieb:
> 
> I guess for the final version of Py3000 type_set_name() in typeobject.c
> will not downgrade unicode strings to str8, but instead upgrade str8
> objects to unicode.

I'm currently working on type_set_name, see the other message with subject 'Heaptypes'.

Thomas


From amauryfa at gmail.com  Wed Jul 11 17:53:14 2007
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Wed, 11 Jul 2007 17:53:14 +0200
Subject: [Python-3000] Fwd: Your confirmation is required to leave the
	Python-3000 mailing list
In-Reply-To: <9e804ac0707110807i1c2e515dy5538aa487b711649@mail.gmail.com>
References: <mailman.0.1184147038.21671.python-3000@python.org>
	<ca471dc20707110445n28be3d1g8d46daa701eaeab2@mail.gmail.com>
	<9e804ac0707110807i1c2e515dy5538aa487b711649@mail.gmail.com>
Message-ID: <e27efe130707110853x6eb32d6raa38c679b3bac2d4@mail.gmail.com>

Hello,

Thomas Wouters wrote:
>
> I can't find the message you forwarded in mail.python.org's logs (although I
> may be looking wrong; it's hard to do such a search without the full headers
> of the original) -- but it looks to me like it was a hoax, and not an actual
> unsubscription request from mail.python.org.
...
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe:
> http://mail.python.org/mailman/options/python-3000/amauryfa%40gmail.com
>

Every mail sent by mailman seems to contain a self-unsubscribe link,
like the one just above. On reply, the link (with *my* address) is
part of the quoted text.
Did someone click on such a link and use the web interface?

-- 
Amaury Forgeot d'Arc

From chrism at plope.com  Wed Jul 11 19:16:01 2007
From: chrism at plope.com (Chris McDonough)
Date: Wed, 11 Jul 2007 13:16:01 -0400
Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k
	Unittests in py3k-struni
In-Reply-To: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>
References: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>
Message-ID: <FB45562E-60A9-4369-837B-ECD846981ED1@plope.com>

I have a very remedial question about how to fix test failures due to  
the side effects of string-unicode integration.

The xmlrpc library uses explicit encoding to encode XML tag payloads  
to (almost always) utf8.  Tag literals are not encoded.

What would be the best way to mimic this behavior under the new  
regime?  Just use unicode everywhere and encode the entire XML body  
to utf-8 at the end?  Or deal explicitly in bytes everywhere?  Or..?
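
The first option above (unicode everywhere, one encode at the end) can be
sketched as follows. This is only an illustration under stated assumptions,
not how xmlrpclib is or should be written: build_method_call() is a made-up
helper, it handles string parameters only, and it runs on a released
Python 3 where str is the unicode type.

```python
from xml.sax.saxutils import escape


def build_method_call(method_name, string_params):
    """Assemble an XML-RPC style request as text, then encode it once."""
    parts = [
        "<?xml version='1.0' encoding='utf-8'?>",
        "<methodCall>",
        "<methodName>%s</methodName>" % escape(method_name),
        "<params>",
    ]
    for value in string_params:
        # Tag literals and payloads are both plain str at this point;
        # escape() only protects '&', '<' and '>' in the payload.
        parts.append(
            "<param><value><string>%s</string></value></param>" % escape(value)
        )
    parts.append("</params></methodCall>")
    # The only place where text becomes bytes.
    return "".join(parts).encode("utf-8")


body = build_method_call("echo", ["caf\u00e9 <1>"])
print(body)
```

With this shape the encoding decision lives in a single line, which is the
main attraction of the approach; working in bytes throughout would instead
require encoding every payload before it is inserted.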

Remedially,

- C

On Jul 10, 2007, at 5:14 PM, Guido van Rossum wrote:

> One of the most daunting tasks remaining for Python 3.0a1 (to be
> released by the end of August) is fixing the remaining failing unit
> tests in the py3k-struni branch
> (http://svn.python.org/view/python/branches/py3k-struni/).
>
> This is the branch where I have started the work on the
> string/unicode unification. I want to promote this branch to become the
> "main" Py3k branch ASAP (by renaming it to py3k), but I don't want to
> do that until all unit tests pass. I've been working diligently on
> this task, and I've got it down to about 50 tests that are failing on
> at least one of OSX and Ubuntu (the platforms to which I have easy
> access). Now I need help.
>
> To facilitate distributing the task of getting the remaining tests to
> pass, I've created a wiki page:
> http://wiki.python.org/moin/Py3kStrUniTests . Please help! It's easy
> to help: (1) check out the py3k-struni branch; (2) build it; (3) pick
> a test and figure out why it's failing; (4) produce a fix; (5) submit
> the fix to SF (or check it in, if you have submit privileges and are
> confident enough).
>
> In order to avoid duplicate work, I've come up with a simple protocol:
> you mark a test in the wiki as "MINE" (with your name) when you start
> looking at it. You mark it as "FIXED [IN SF]" once you fix it, adding
> the patch# if the fix is in SF. If you give up, remove your lock,
> adding instead a note with what you've found (even just the names of
> the failing subtests is helpful).
>
> Please help!
>
> There are other tasks, see PEP 3100. Mail me if you're interested in
> anything specifically. (Please don't ask me "do you think I could do
> this" -- you know better than I whether you're capable of coding at a
> specific level. If you don't understand the task, you're probably not
> qualified.)
>
> -- 
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>


From amauryfa at gmail.com  Wed Jul 11 20:33:46 2007
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Wed, 11 Jul 2007 20:33:46 +0200
Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k
	Unittests in py3k-struni
In-Reply-To: <f72jn4$cma$1@sea.gmane.org>
References: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>
	<46940875.2000606@cheimes.de> <f728g6$6gt$1@sea.gmane.org>
	<ca471dc20707110446je177145x8d0b0cf2ca2f8487@mail.gmail.com>
	<e27efe130707110527r58a3d1a0g7ab78d9e1707d51c@mail.gmail.com>
	<ca471dc20707110541i4a3199c1w4ea14a1c486bdd01@mail.gmail.com>
	<f72jn4$cma$1@sea.gmane.org>
Message-ID: <e27efe130707111133h329080fcocf1a3f1c5e954824@mail.gmail.com>

Hi,

Thomas Heller wrote:
> The most important problem now, IMO, is that wide filenames on Windows are not
> implemented; see the code starting at line 148 in _fileio.c.  This prevents
> most unit tests from running because test_support cannot be imported:
>
> C:\svn\py3k-struni\PCbuild>python  -E -tt ../lib/test/regrtest.py
> Traceback (most recent call last):
>   File "../lib/test/regrtest.py", line 165, in <module>
>     from test import test_support
>   File "C:\svn\py3k-struni\lib\test\test_support.py", line 182, in <module>
>     fp = open(TESTFN, 'w+')
>   File "C:\svn\py3k-struni\lib\site.py", line 412, in __new__
>     return io.open(*args, **kwds)
>   File "C:\svn\py3k-struni\lib\io.py", line 122, in open
>     (updating and "+" or ""))
> NotImplementedError: Windows wide filenames are not yet supported

The attached patch corrects this. Now open() accepts both unicode
strings and bytes objects.

-- 
Amaury Forgeot d'Arc
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fileio-1.diff
Type: application/octet-stream
Size: 1473 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-3000/attachments/20070711/5bb54244/attachment.obj 

From amauryfa at gmail.com  Wed Jul 11 21:13:31 2007
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Wed, 11 Jul 2007 21:13:31 +0200
Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k
	Unittests in py3k-struni
In-Reply-To: <f72jn4$cma$1@sea.gmane.org>
References: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>
	<46940875.2000606@cheimes.de> <f728g6$6gt$1@sea.gmane.org>
	<ca471dc20707110446je177145x8d0b0cf2ca2f8487@mail.gmail.com>
	<e27efe130707110527r58a3d1a0g7ab78d9e1707d51c@mail.gmail.com>
	<ca471dc20707110541i4a3199c1w4ea14a1c486bdd01@mail.gmail.com>
	<f72jn4$cma$1@sea.gmane.org>
Message-ID: <e27efe130707111213h7830a78fg584177883240217e@mail.gmail.com>

Re-hello,

Thomas Heller wrote:
> > On 7/11/07, Amaury Forgeot d'Arc wrote:
> >> Thomas Heller wrote:
> >> > I would love to look into these, but I prefer debugging on Windows.
> >> > However, the windows build does not work because the _fileio builtin
> >> > module is missing from config.c.  Again, this is not so easy to fix,
> >> > because the ftruncate function does not exist on Windows.
> >>
> >> In fileobject.c, there is a replacement for ftruncate. See the code
> >> around the call to SetEndOfFile().
> >>
> >> I'll try to provide a patch later today.
>
> Awaiting your patch ;-).

Ok, here it is; shamelessly copied from fileobject.c.
BTW, what is the status of this fileobject? open() doesn't seem to use
it anymore. Will file() be removed at some point?

Now test_fileio passes on Windows,
with the exception of testAbles(): since c:\dev is an existing
directory on my machine, /dev/tty is a regular file and is seekable...
Maybe skip this test on win32?

I have a couple of other corrections, found by randomly playing with
the test functions... shall I post the corrections here as well?

-- 
Amaury Forgeot d'Arc
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fileio-2.diff
Type: application/octet-stream
Size: 1681 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-3000/attachments/20070711/d18cdd5d/attachment-0001.obj 

From theller at ctypes.org  Wed Jul 11 22:07:11 2007
From: theller at ctypes.org (Thomas Heller)
Date: Wed, 11 Jul 2007 22:07:11 +0200
Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k
	Unittests in py3k-struni
In-Reply-To: <e27efe130707111213h7830a78fg584177883240217e@mail.gmail.com>
References: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>	<46940875.2000606@cheimes.de>
	<f728g6$6gt$1@sea.gmane.org>	<ca471dc20707110446je177145x8d0b0cf2ca2f8487@mail.gmail.com>	<e27efe130707110527r58a3d1a0g7ab78d9e1707d51c@mail.gmail.com>	<ca471dc20707110541i4a3199c1w4ea14a1c486bdd01@mail.gmail.com>	<f72jn4$cma$1@sea.gmane.org>
	<e27efe130707111213h7830a78fg584177883240217e@mail.gmail.com>
Message-ID: <f73d9i$emp$1@sea.gmane.org>

Amaury Forgeot d'Arc schrieb:
> Re-hello,
> 
> Thomas Heller wrote:
>> > On 7/11/07, Amaury Forgeot d'Arc wrote:
>> >> Thomas Heller wrote:
>> >> > I would love to look into these, but I prefer debugging on Windows.
>> >> > However, the windows build does not work because the _fileio builtin
>> >> > module is missing from config.c.  Again, this is not so easy to fix,
>> >> > because the ftruncate function does not exist on Windows.
>> >>
>> >> In fileobject.c, there is a replacement for ftruncate. See the code
>> >> around the call to SetEndOfFile().
>> >>
>> >> I'll try to provide a patch later today.
>>
>> Awaiting your patch ;-).
> 
> Ok, here it is; shamelessly copied from fileobject.c.

Amaury, please upload your patches to the SF bug tracker, and assign them to me.
I will (hopefully) look into them tomorrow.

> BTW, what is the status of this fileobject? open() doesn't seem to use
> it anymore. Will file() be removed at some point?
> 
> Now test_fileio passes on Windows,
> with the exception of testAbles(): since c:\dev is an existing
> directory on my machine, /dev/tty is a regular file and is seekable...
> Maybe skip this test on win32?
> 
> I have a couple of other corrections, found by randomly playing with
> the test functions... shall I post the corrections here as well?


See above: posting them to the tracker makes sure they don't get lost.

Thanks,
Thomas


From guido at python.org  Wed Jul 11 23:03:03 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 12 Jul 2007 00:03:03 +0300
Subject: [Python-3000] Need help fixing failing Py3k Unittests in
	py3k-struni
In-Reply-To: <4694F709.2040304@livinglogic.de>
References: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>
	<f71d9d$thn$1@sea.gmane.org> <46948C1E.5050800@livinglogic.de>
	<ca471dc20707110101m5f65249aj4cf1122be20c5856@mail.gmail.com>
	<ca471dc20707110230r21b2938bxa6d5bffe8fd968aa@mail.gmail.com>
	<4694F709.2040304@livinglogic.de>
Message-ID: <ca471dc20707111403u45c29d39hf12f2fcaa3b55696@mail.gmail.com>

On 7/11/07, Walter Dörwald <walter at livinglogic.de> wrote:
> I guess for the final version of Py3000 type_set_name() in typeobject.c
> will not downgrade unicode strings to str8, but instead upgrade str8
> objects to unicode.

Right, Thomas is working on this (but I have some feedback on his fix).

> Also now that PyObject_Unicode() tries __unicode__ first and then tp_str
> should we rename all __unicode__ methods to __str__, or will __unicode__
> stay?

__unicode__ should be renamed to __str__, or removed (depending on
whether the __str__ method already does the right thing).
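
A minimal sketch of what the rename amounts to, assuming a class whose
old __unicode__ method produced the text form. The class name and its
attribute are hypothetical.

```python
class Temperature:
    """Hypothetical class that used to define __unicode__."""

    def __init__(self, degrees):
        self.degrees = degrees

    # Formerly __unicode__; with str being the Unicode type, __str__ is
    # the one method that returns the text representation.
    def __str__(self):
        return "%d\u00b0C" % self.degrees


label = str(Temperature(21))
```

If a class already had a __str__ returning the same text, the old
__unicode__ can simply be deleted instead.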

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Wed Jul 11 23:05:39 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 12 Jul 2007 00:05:39 +0300
Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k
	Unittests in py3k-struni
In-Reply-To: <FB45562E-60A9-4369-837B-ECD846981ED1@plope.com>
References: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>
	<FB45562E-60A9-4369-837B-ECD846981ED1@plope.com>
Message-ID: <ca471dc20707111405q44922c5chaba514e734f86c9@mail.gmail.com>

On 7/11/07, Chris McDonough <chrism at plope.com> wrote:
> I have a very remedial question about how to fix test failures due to
> the side effects of string-unicode integration.
>
> The xmlrpc library uses explicit encoding to encode XML tag payloads
> to (almost always) utf8.  Tag literals are not encoded.
>
> What would be the best way to mimic this behavior under the new
> regime?  Just use unicode everywhere and encode the entire XML body
> to utf-8 at the end?  Or deal explicitly in bytes everywhere?  Or..?

The correct approach would be to use Unicode (i.e., str) everywhere
and encode to UTF-8 at the end. If that's too hard something's wrong
with the philosophy of using Unicode everywhere...
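
As a rough sketch of that approach (not the actual xmlrpc code; the
function name and the restriction to string parameters are assumptions
made for brevity), the document is built as text and encoded once at
the very end:

```python
from xml.sax.saxutils import escape


def build_method_call(method_name, string_params):
    """Assemble the whole document as text, then encode exactly once."""
    parts = ['<?xml version="1.0" encoding="utf-8"?>',
             "<methodCall>",
             "<methodName>%s</methodName>" % escape(method_name),
             "<params>"]
    for value in string_params:
        # escape() handles &, < and >; the payload stays text here.
        parts.append("<param><value><string>%s</string></value></param>"
                     % escape(value))
    parts.append("</params></methodCall>")
    # The only place where text becomes bytes.
    return "".join(parts).encode("utf-8")


body = build_method_call("echo", ["caf\u00e9 & <tea>"])
```

Tag literals and payloads go through the same single encode step, so
there is no per-value encoding left to get wrong.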

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Wed Jul 11 23:08:35 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 12 Jul 2007 00:08:35 +0300
Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k
	Unittests in py3k-struni
In-Reply-To: <e27efe130707111213h7830a78fg584177883240217e@mail.gmail.com>
References: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>
	<46940875.2000606@cheimes.de> <f728g6$6gt$1@sea.gmane.org>
	<ca471dc20707110446je177145x8d0b0cf2ca2f8487@mail.gmail.com>
	<e27efe130707110527r58a3d1a0g7ab78d9e1707d51c@mail.gmail.com>
	<ca471dc20707110541i4a3199c1w4ea14a1c486bdd01@mail.gmail.com>
	<f72jn4$cma$1@sea.gmane.org>
	<e27efe130707111213h7830a78fg584177883240217e@mail.gmail.com>
Message-ID: <ca471dc20707111408x138db80en9080346bcea0e3ba@mail.gmail.com>

On 7/11/07, Amaury Forgeot d'Arc <amauryfa at gmail.com> wrote:
> BTW, what is the status of this fileobject? open() doesn't seem to use
> it anymore. Will file() be removed at some point?

The 'file' builtin is already gone. (You did use the py3k-struni
branch, didn't you?) Some parts of the fileobject.c file will remain,
but the only APIs that remain in there are generic I/O APIs that work
with file-like objects (in particular io.IOBase).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From nas at arctrix.com  Thu Jul 12 01:12:45 2007
From: nas at arctrix.com (Neil Schemenauer)
Date: Wed, 11 Jul 2007 23:12:45 +0000 (UTC)
Subject: [Python-3000] Change _Py prefix for 3k?
Message-ID: <f73o5d$hsf$1@sea.gmane.org>

It's a small detail but I wonder if it's time to stop using a
leading underscore for internal APIs.  I'm not sure what would be a
good replacement, perhaps a trailing underscore.  In case people
don't remember, the _Py prefix could, theoretically, be invalid C on
some platforms.

Regards,

  Neil


From tjreedy at udel.edu  Thu Jul 12 04:01:01 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 11 Jul 2007 22:01:01 -0400
Subject: [Python-3000] PEP 3099 += no bool change?
Message-ID: <f7420u$cg1$1@sea.gmane.org>

Someone asked if Py3 would get a "real" or "pure" bool type (one not 
subclassing int).  [The usual complaints and rehash about current bool 
ensued.]  I believe (and said so) that this is a settled question.  If so, 
please add a line under Standard types
* bool will continue to subclass int.
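
For readers who have not run into it, a short illustration of what
"bool subclasses int" means in practice. The variable names are
arbitrary.

```python
# bool is a subclass of int, so True and False behave as 1 and 0.
is_subclass = issubclass(bool, int)

# Arithmetic and indexing therefore accept bools directly.
flags = [True, False, True]
true_count = sum(flags)
choice = ["no", "yes"][True]
```

This is the behaviour the proposed PEP 3099 line would declare settled.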

tjr


From joe at bitworking.org  Thu Jul 12 07:02:51 2007
From: joe at bitworking.org (Joe Gregorio)
Date: Thu, 12 Jul 2007 01:02:51 -0400
Subject: [Python-3000] test_mmap.py and OSError
Message-ID: <3f1451f50707112202n320e4b25hfadb3670129ba33a@mail.gmail.com>

I decided to try to tackle the unit tests failing on the py3k-struni
branch for mmap. It now passes all the unit tests but one, and
the problem is that I don't know what should be 'fixed'. The
code in the unit test is:

        finally:
            try:
                f.close()
            except OSError:
                pass

The problem is that the file is already closed and in Lib/io.py,
the close calls flush() and flush() raises
ValueError() if the file is already closed, but the
unit test is looking for OSError.

Should io.py raise OSError instead of ValueError?
Or should test_mmap.py be expecting ValueError?
Or is there something else that I'm completely missing?

[ The wisdom of choosing mmap as my first fiddling
  with Python internals can be debated later :) ]

   Thanks,
   -joe

-- 
Joe Gregorio        http://bitworking.org

From greg.ewing at canterbury.ac.nz  Thu Jul 12 07:26:54 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 12 Jul 2007 17:26:54 +1200
Subject: [Python-3000] test_mmap.py and OSError
In-Reply-To: <3f1451f50707112202n320e4b25hfadb3670129ba33a@mail.gmail.com>
References: <3f1451f50707112202n320e4b25hfadb3670129ba33a@mail.gmail.com>
Message-ID: <4695BB9E.2030202@canterbury.ac.nz>

Joe Gregorio wrote:
> flush() raises
> ValueError() if the file is already closed,
> 
> Should io.py raise OSError instead of ValueError?

Is it really necessary to raise anything at all?
An already-closed file is as flushed as it can
get, so why not just let it be a no-op?

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From guido at python.org  Thu Jul 12 09:02:29 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 12 Jul 2007 10:02:29 +0300
Subject: [Python-3000] test_mmap.py and OSError
In-Reply-To: <4695BB9E.2030202@canterbury.ac.nz>
References: <3f1451f50707112202n320e4b25hfadb3670129ba33a@mail.gmail.com>
	<4695BB9E.2030202@canterbury.ac.nz>
Message-ID: <ca471dc20707120002u5d49a0c9s17970b705c68a588@mail.gmail.com>

On 7/12/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Joe Gregorio wrote:
> > flush() raises
> > ValueError() if the file is already closed,
> >
> > Should io.py raise OSError instead of ValueError?
>
> Is it really necessary to raise anything at all?
> An already-closed file is as flushed as it can
> get, so why not just let it be a no-op?

I like that much better. So close() shouldn't try to flush() if it's
already closed. This means fixing io.py. (Unfortunately it's a bit of
a mess, a bit of refactoring would do it good.)

BTW whenever changing io.py, always run both test_io.py and
test_file.py, as they test slightly different sets of behavior.
(Though occasionally these tests must be adjusted too.)
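
A minimal sketch of the intended behaviour, assuming a toy class in
place of the real io.py classes (the class name and the flush counter
are invented for illustration):

```python
class SketchFile:
    """Toy stand-in for an io.py file class; not the real implementation."""

    def __init__(self):
        self.closed = False
        self.flush_calls = 0

    def flush(self):
        if self.closed:
            raise ValueError("flush of closed file")
        self.flush_calls += 1

    def close(self):
        # Closing an already-closed file is a no-op: no flush, no error.
        if self.closed:
            return
        try:
            self.flush()
        finally:
            self.closed = True


f = SketchFile()
f.close()
f.close()  # second call must not raise
```

With close() idempotent, a cleanup block that closes a file twice needs
no exception handling at all.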

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Thu Jul 12 09:04:44 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 12 Jul 2007 10:04:44 +0300
Subject: [Python-3000] Change _Py prefix for 3k?
In-Reply-To: <f73o5d$hsf$1@sea.gmane.org>
References: <f73o5d$hsf$1@sea.gmane.org>
Message-ID: <ca471dc20707120004r4ff7c3ceg475c55a0b751ecff@mail.gmail.com>

On 7/12/07, Neil Schemenauer <nas at arctrix.com> wrote:
> It's a small detail but I wonder if it's time to stop using a
> leading underscore for internal APIs.  I'm not sure what would be a
> good replacement, perhaps a trailing underscore.  In case people
> don't remember, the _Py prefix could, theoretically, be invalid C on
> some platforms.

There are lots of things we do that could theoretically be bad C. I
doubt that this particular one will ever bite us. Are there any other
reasons for such a change?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From walter at livinglogic.de  Thu Jul 12 14:16:33 2007
From: walter at livinglogic.de (Walter Dörwald)
Date: Thu, 12 Jul 2007 14:16:33 +0200
Subject: [Python-3000] Need help fixing failing Py3k Unittests in
	py3k-struni
In-Reply-To: <ca471dc20707111403u45c29d39hf12f2fcaa3b55696@mail.gmail.com>
References: <ca471dc20707101414q77168e32v12b157c2ab756394@mail.gmail.com>	
	<f71d9d$thn$1@sea.gmane.org> <46948C1E.5050800@livinglogic.de>	
	<ca471dc20707110101m5f65249aj4cf1122be20c5856@mail.gmail.com>	
	<ca471dc20707110230r21b2938bxa6d5bffe8fd968aa@mail.gmail.com>	
	<4694F709.2040304@livinglogic.de>
	<ca471dc20707111403u45c29d39hf12f2fcaa3b55696@mail.gmail.com>
Message-ID: <46961BA1.9080206@livinglogic.de>

Guido van Rossum wrote:
> On 7/11/07, Walter Dörwald <walter at livinglogic.de> wrote:
>> I guess for the final version of Py3000 type_set_name() in typeobject.c
>> will not downgrade unicode strings to str8, but instead upgrade str8
>> objects to unicode.
> 
> Right, Thomas is working on this (but I have some feedback on his fix).
> 
>> Also now that PyObject_Unicode() tries __unicode__ first and then tp_str
>> should we rename all __unicode__ methods to __str__, or will __unicode__
>> stay?
> 
> __unicode__ should be renamed to __str__, or removed (depending on
> whether the __str__ method already does the right thing).

I've dropped __unicode__ from tkinter. The only remaining __unicode__
use is in the email package (besides the tests, where IMHO __unicode__
should stay as long as it's handled by PyObject_Unicode()).
email.Header.Header defines a __unicode__ which is different from the
__str__ method. I guess Barry will know how to fix this.

Servus,
   Walter


From joe at bitworking.org  Thu Jul 12 15:54:23 2007
From: joe at bitworking.org (Joe Gregorio)
Date: Thu, 12 Jul 2007 09:54:23 -0400
Subject: [Python-3000] test_mmap.py and OSError
In-Reply-To: <ca471dc20707120002u5d49a0c9s17970b705c68a588@mail.gmail.com>
References: <3f1451f50707112202n320e4b25hfadb3670129ba33a@mail.gmail.com>
	<4695BB9E.2030202@canterbury.ac.nz>
	<ca471dc20707120002u5d49a0c9s17970b705c68a588@mail.gmail.com>
Message-ID: <3f1451f50707120654s13e81551x25df9a1dadccafb0@mail.gmail.com>

On 7/12/07, Guido van Rossum <guido at python.org> wrote:
> On 7/12/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> > Joe Gregorio wrote:
> > > flush() raises
> > > ValueError() if the file is already closed,
> > >
> > > Should io.py raise OSError instead of ValueError?
> >
> > Is it really necessary to raise anything at all?
> > An already-closed file is as flushed as it can
> > get, so why not just let it be a no-op?
>
> I like that much better. So close() shouldn't try to flush() if it's
> already closed. This means fixing io.py. (Unfortunately it's a bit of
> a mess, a bit of refactoring would do it good.)

Thanks for the guidance.

This patch fixes mmap and also changes io.py
so that close() doesn't flush if it's already closed.
I did run both test_io.py and test_file.py when checking
the changes to io.py.

http://www.python.org/sf/1752647

   Thanks,
   -joe

-- 
Joe Gregorio        http://bitworking.org

From nas at arctrix.com  Thu Jul 12 17:53:48 2007
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 12 Jul 2007 09:53:48 -0600
Subject: [Python-3000] Change _Py prefix for 3k?
In-Reply-To: <ca471dc20707120004r4ff7c3ceg475c55a0b751ecff@mail.gmail.com>
References: <f73o5d$hsf$1@sea.gmane.org>
	<ca471dc20707120004r4ff7c3ceg475c55a0b751ecff@mail.gmail.com>
Message-ID: <20070712155348.GA29907@arctrix.com>

On Thu, Jul 12, 2007 at 10:04:44AM +0300, Guido van Rossum wrote:
> There are lots of things we do that could theoretically be bad C. I
> doubt that this particular one will ever bite us. Are there any other
> reasons for such a change?

I think Python is one of the only open source projects to use a
_[A-Z] prefix on non-local symbols.  That seems more dangerous than
other non-standard stuff.  Also, it could be hard to work around if
someone runs into trouble.  My gut feeling is that it's not worth
the effort to change but I wanted it to be considered for 3k.

  Neil

From martin at v.loewis.de  Fri Jul 13 17:19:45 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 13 Jul 2007 17:19:45 +0200
Subject: [Python-3000] Heaptypes
In-Reply-To: <ca471dc20707110715s4cd53401t53e9075bdc2ea1df@mail.gmail.com>
References: <f72o9f$v6i$1@sea.gmane.org>
	<ca471dc20707110715s4cd53401t53e9075bdc2ea1df@mail.gmail.com>
Message-ID: <46979811.2050405@v.loewis.de>

> I don't know enough about ...CallFunction to help you with the rest.

I wonder whether the "s" specifier in CallFunction, BuildValue etc
should create Unicode objects, rather than str8 objects.

Regards,
Martin

From pje at telecommunity.com  Fri Jul 13 19:41:47 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 13 Jul 2007 13:41:47 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
Message-ID: <20070713173936.53C213A404D@sparrow.telecommunity.com>

At 07:39 AM 7/13/2007 +0200, Michele Simionato wrote:
>But I want to ask your opinion first, in order to understand if you 
>are willing to scale down your proposal or not. At EuroPython Guido 
>said that in private mail you made some strong argument explaining 
>why the PEP could not be simplified, but he did not say more than that

It's not an argument that the PEP can't be simplified; only that a 
simpler PEP won't accomplish my original goal for the PEP (of having 
a generic API for generic functions) vs. simply having a generic 
function implementation in the stdlib.  The first goal requires the 
second, but the second doesn't need the first, and as far as I'm 
aware, I'm the only person who really wants the first.

A simpler PEP could exist to implement the second goal only, 
implementing dynamic overloading in Python 3.0 with all of the 
non-controversial features of 3124, and using Guido's preferred API.

The holdup is that I don't have time to work on the *implementation* 
of both my version *and* this simplified version; there is little 
overlap between the two because mine is highly 
self-referential/self-bootstrapping, absolutely dependent on being 
able to modify functions in-place (a feature Guido seems near -1 on), 
and virtually impossible to scale down.

So, it is much lower on my priorities at the moment to implement the 
simplified version, because I will neither gain code reuse *nor* the 
API standardization I'd hoped for.

At the moment, my plan is to finish implementing a PEP 3124-like, 
fully extensible implementation for Python 2.x (see PEAK-Rules), then 
look at splitting 3124 into a simplified version and a separate 
extension API PEP aimed at Python 3.1 or later.  At that point, I 
will know for sure what extension API features are necessary to 
implement the more advanced features I want in PEAK-Rules.

I expect to be able to start work on this (i.e., revisiting the 
proposal) in about a month.  With luck, I will be able to carve out 
enough time to create the simpler implementation and update the PEP 
in a reasonable amount of time.

However, there is nothing stopping anyone else who wishes it from 
either making the simpler implementation or drafting the scaled-down 
PEP.  The simpler version Guido wants isn't really that different 
from his existing generic function prototype, especially if you drop 
all forms of method combination (including :next_method).  It will 
also need positional dispatching, but that's another feature that 
could perhaps wait for 3.1 as well.

In short, if you want a PEP 3124 implementation started on sooner 
than about a month from now, you need to find a volunteer or do it yourself.


>The point is that for 95% of my use cases, simplegeneric would be 
>enough, and it is already available *now*. So, if Guido was willing 
>to accept something like simplegeneric for Python 3.0, I would not 
>mind waiting for multiple dispatch in 3.1.

You'll have to ask him about that.  For what it's worth, the pkgutil 
module already contains an even simpler generic function 
implementation than simplegeneric, and is already in the stdlib 
albeit undocumented.
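
To give a feel for how small such an implementation can be, here is a
sketch of single dispatch on the type of the first argument. It is not
the pkgutil code; the names generic, register and describe are
hypothetical.

```python
def generic(default):
    """Turn `default` into a function that dispatches on type(args[0])."""
    registry = {}

    def dispatch(obj, *args, **kwargs):
        # Walk the MRO so subclasses pick up the nearest registered base.
        for cls in type(obj).__mro__:
            if cls in registry:
                return registry[cls](obj, *args, **kwargs)
        return default(obj, *args, **kwargs)

    def register(cls):
        def decorator(impl):
            registry[cls] = impl
            return impl
        return decorator

    dispatch.register = register
    return dispatch


@generic
def describe(obj):
    return "object"


@describe.register(int)
def describe_int(obj):
    return "int %d" % obj
```

There is no method combination and no dispatch on more than one
argument here, which is roughly the feature set being discussed as the
minimum.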


>The reason why I am not using simplegeneric or RuleDispatch already, 
>is that I do not want to commit in production to a technology 
>without the official approval of the BDFL, and I prefer to wait now 
>than having to change my code later.

I guess this means you never use any packages from the Cheeseshop?  :)


From michele.simionato at gmail.com  Fri Jul 13 20:37:40 2007
From: michele.simionato at gmail.com (Michele Simionato)
Date: Fri, 13 Jul 2007 18:37:40 +0000 (UTC)
Subject: [Python-3000] pep 3124 plans
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
Message-ID: <loom.20070713T201857-3@post.gmane.org>

Phillip J. Eby <pje <at> telecommunity.com> writes:
> For what it's worth, the pkgutil 
> module already contains an even simpler generic function 
> implementation than simplegeneric, and is already in the stdlib 
> albeit undocumented.

Well, that is good to know. Personally I would be content with something
at that level of sophistication (i.e. the absolute minimum). I think there
is not much experience in the community with generic functions (except for 
you) and there is no danger in waiting and in acquiring more experience
before including in the standard library a fully featured package. After
all, RuleDispatch is already out there and there is no reason for putting
everything in the stdlib. For the same reason, I am happy that Zope interfaces
will stay out of the stdlib, and that we will have the much simpler ABC
(of course one could argue that generic functions are better than ABC and
actually I think so, but still ABC are a simpler entry point for most 
programmers, more in line with how Python has worked until now, and they
will allow me to throw away a half-baked interface implementation I am
using now, which is always a good thing ;)

        Michele Simionato


From theller at ctypes.org  Fri Jul 13 21:13:39 2007
From: theller at ctypes.org (Thomas Heller)
Date: Fri, 13 Jul 2007 21:13:39 +0200
Subject: [Python-3000] pep3115 - metaclasses in python 3000
Message-ID: <f78it5$t35$1@sea.gmane.org>

playing a little with py3k...

pep3115 mentions that "__prepare__ returns a dictionary-like object
which is used to store the class member definitions during evaluation
of the class body."

It does not mention whether this dict-like object is used afterwards
as the class-dictionary of the created class or not (when the __new__
method of the metaclass is called).

The sample-code suggests that it would be used as class dict of the
newly created class (the sample code copies it into a regular dictionary
before it is passed to the type.__new__ call).
However, the actual code in the py3k-struni branch (typeobject.c) copies
the passed in dict again.

In other words, it seems impossible even with pep3115 to use a custom
subclass of dict as a type's __dict__ member, and afaik it is impossible
in Python to replace that afterwards.

Is this analysis correct?   Is that the intent of pep3115?  Or could
the code be changed so that it is possible to supply a custom type dict
with the metaclass?

Thanks,
Thomas


From pje at telecommunity.com  Fri Jul 13 23:51:52 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 13 Jul 2007 17:51:52 -0400
Subject: [Python-3000] pep3115 - metaclasses in python 3000
In-Reply-To: <f78it5$t35$1@sea.gmane.org>
References: <f78it5$t35$1@sea.gmane.org>
Message-ID: <20070713214940.9A5883A404D@sparrow.telecommunity.com>

At 09:13 PM 7/13/2007 +0200, Thomas Heller wrote:
>playing a little with py3k...
>
>pep3115 mentions that "__prepare__ returns a dictionary-like object
>which is used to store the class member definitions during evaluation
>of the class body."
>
>It does not mention whether this dict-like object is used afterwards
>as the class-dictionary of the created class or not (when the __new__
>method of the metaclass is called).
>
>The sample-code suggests that it would be used as class dict of the
>newly created class (the sample code copies it into a regular dictionary
>before it is passed to the type.__new__ call).
>However, the actual code in the py3k-struni branch (typeobject.c) copies
>the passed in dict again.
>
>In other words, it seems impossible even with pep3115 to use a custom
>subclass of dict as a type's __dict__ member, and afaik it is impossible
>in Python to replace that afterwards.
>
>Is this analysis correct?   Is that the intent of pep3115?  Or could
>the code be changed so that it is possible to supply a custom type dict
>with the metaclass?

I would suggest that we do not intend that the class __dict__ == the 
__prepare__ object, even as the default case.  Otherwise, we have to 
find everything that accesses type dictionaries and make sure they 
can work with other kinds of objects.


From talin at acm.org  Sat Jul 14 06:56:44 2007
From: talin at acm.org (Talin)
Date: Fri, 13 Jul 2007 21:56:44 -0700
Subject: [Python-3000] pep3115 - metaclasses in python 3000
In-Reply-To: <f78it5$t35$1@sea.gmane.org>
References: <f78it5$t35$1@sea.gmane.org>
Message-ID: <4698578C.3080808@acm.org>

Thomas Heller wrote:
> playing a little with py3k...
> 
> pep3115 mentions that "__prepare__ returns a dictionary-like object
> which is used to store the class member definitions during evaluation
> of the class body."
> 
> It does not mention whether this dict-like object is used afterwards
> as the class-dictionary of the created class or not (when the __new__
> method of the metaclass is called).

The intention is that it's up to the metaclass to decide. I suspect that 
most metaclasses won't want to use the dict-like object as the class 
dict, for two reasons:

1) The behavior of assigning to the class dict after class creation is 
likely to be different than the behavior of assignment during class 
creation. In particular, a typical 'dict-like' object is likely to be 
slower than a dict (it has more work to do, after all), and you don't 
want that slowness around once your class is finished initializing.

2) A 'dict-like' object doesn't have to support all of the methods of a 
real dict, whereas a class dict does. So your dict-like wrapper can be 
relatively simple.
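
A sketch of that pattern, assuming the metaclass= keyword syntax from
the PEP. The class names and the member_order attribute are invented
for the example: the dict-like object only records definition order
while the class body runs, and a plain dict is handed to type.__new__.

```python
class RecordingNamespace(dict):
    """Dict-like object used only while the class body executes."""

    def __init__(self):
        super().__init__()
        self.order = []

    def __setitem__(self, key, value):
        if key not in self:
            self.order.append(key)
        super().__setitem__(key, value)


class OrderedMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        return RecordingNamespace()

    def __new__(mcls, name, bases, namespace, **kwargs):
        # Copy into a regular dict; the recording object is not kept
        # as the class dictionary.
        cls = super().__new__(mcls, name, bases, dict(namespace))
        cls.member_order = [key for key in namespace.order
                            if not key.startswith("__")]
        return cls


class Point(metaclass=OrderedMeta):
    x = 1
    y = 2

    def norm(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5
```

The finished class keeps an ordinary class dictionary, so attribute
access after creation pays nothing for the bookkeeping done during the
class body.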

-- Talin

From lists at cheimes.de  Sat Jul 14 15:36:04 2007
From: lists at cheimes.de (Christian Heimes)
Date: Sat, 14 Jul 2007 15:36:04 +0200
Subject: [Python-3000] TextIOWrapper.write(s:str) and bytes in py3k-struni
Message-ID: <f7ajg5$pg5$1@sea.gmane.org>

Hello!

I'm having some trouble with unit tests in the py3k-struni branch. Some
tests like test_uu are failing because an io.TextIOWrapper instance's
write() method doesn't handle bytes. The method is defined as:

    def write(self, s: str):
        if self.closed:
            raise ValueError("write to closed file")
        # XXX What if we were just reading?
        b = s.encode(self._encoding)
        if isinstance(b, str):
            b = bytes(b)
        n = self.buffer.write(b)
        if "\n" in s:
            # XXX only if isatty
            self.flush()
        self._snapshot = self._decoder = None
        return len(s)

The problematic lines are the lines from s.encode() to b = bytes(b). The
behavior is more than questionable. A bytes object doesn't have an
encode() method and str's encode() method always returns bytes. IMO the
write() method should be changed to:

    def write(self, s: (str, bytes)):
        if self.closed:
            raise ValueError("write to closed file")
        # XXX What if we were just reading?
        if isinstance(s, basestring):
            b = s.encode(self._encoding)
        elif isinstance(s, bytes):
            b = s
        else:
            b = bytes(s)
        n = self.buffer.write(b)
        if b"\n" in b:
            # XXX only if isatty
            self.flush()
        self._snapshot = self._decoder = None
        return len(s)

Or write() should explicitly raise a TypeError when it is not allowed
to handle bytes.
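
The second alternative could look roughly like this. It is a sketch on
a stand-in class, not a patch to io.py; the class name is hypothetical
and the flush and snapshot handling of the real method is left out.

```python
import io


class SketchTextWriter:
    """Stand-in for a text layer on top of a binary buffer."""

    def __init__(self, buffer, encoding="utf-8"):
        self.buffer = buffer
        self.encoding = encoding

    def write(self, s):
        # Reject bytes (and anything else) up front instead of guessing.
        if not isinstance(s, str):
            raise TypeError("can't write %s to text stream"
                            % type(s).__name__)
        self.buffer.write(s.encode(self.encoding))
        return len(s)


raw = io.BytesIO()
writer = SketchTextWriter(raw)
written = writer.write("h\u00e9llo\n")
```

Callers that really have bytes would then write to the underlying
binary buffer directly.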

Christian


From guido at python.org  Sat Jul 14 16:08:31 2007
From: guido at python.org (Guido van Rossum)
Date: Sat, 14 Jul 2007 17:08:31 +0300
Subject: [Python-3000] Heaptypes
In-Reply-To: <46979811.2050405@v.loewis.de>
References: <f72o9f$v6i$1@sea.gmane.org>
	<ca471dc20707110715s4cd53401t53e9075bdc2ea1df@mail.gmail.com>
	<46979811.2050405@v.loewis.de>
Message-ID: <ca471dc20707140708n413bfe9fwc6d223f50ff44573@mail.gmail.com>

That sounds like a good idea to try. It may break some more tests but
those are all indications of places that incorrectly still require
str8.

On 7/13/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> > I don't know enough about ...CallFunction to help you with the rest.
>
> I wonder whether the "s" specifier in CallFunction, BuildValue etc
> should create Unicode objects, rather than str8 objects.
>
> Regards,
> Martin
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Sun Jul 15 16:17:00 2007
From: guido at python.org (Guido van Rossum)
Date: Sun, 15 Jul 2007 07:17:00 -0700
Subject: [Python-3000] Invalid \U escape in source code give hard-to-trace
	error
Message-ID: <ca471dc20707150717m7344c9cfh3237b78e9dcf681f@mail.gmail.com>

When a source file contains a string literal with an out-of-range \U
escape (e.g. "\U12345678"), instead of a syntax error pointing to the
offending literal, I get this, without any indication of the file or
line:

UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in
position 0-9: illegal Unicode character

This is quite hard to track down. (Both the location of the bad
literal in the source file, and the origin of the error in the parser.
:-) Can someone come up with a fix?

I note that raw escapes show a slightly different error. I also note
that the same issue exists for u"..." literals in Python 2.5.
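
Until the parser reports the location itself, a crude scan can at least
find the offending line. This is a workaround sketch, not a fix for the
parser: the helper name is made up, and it looks at raw text only, so
it will also flag matches inside raw strings or after an escaped
backslash.

```python
import re

# Eight hex digits after \U; anything above U+10FFFF is out of range.
_BIG_ESCAPE = re.compile(r"\\U([0-9a-fA-F]{8})")
MAX_CODE_POINT = 0x10FFFF


def find_bad_escapes(source):
    """Return (line number, escape text) for each out-of-range \\U escape."""
    bad = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _BIG_ESCAPE.finditer(line):
            if int(match.group(1), 16) > MAX_CODE_POINT:
                bad.append((lineno, match.group(0)))
    return bad


sample = 'a = "\\U0001F600"\nb = "\\U12345678"\n'
hits = find_bad_escapes(sample)
```

Running it over the files of a package narrows the search to a line
number, which is the piece of information the error message lacks.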

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From g.brandl at gmx.net  Sun Jul 15 23:04:19 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 15 Jul 2007 23:04:19 +0200
Subject: [Python-3000] exclusion feature for 2to3?
Message-ID: <f7e24l$92e$1@sea.gmane.org>

In order to have a codebase run in 2.x and 3.x, via automated translation by
2to3, there should be some "exclusion feature" for single lines that tells
the refactorer not to touch those lines.

For example, if you have some object that still has an iteritems() method and
keeps it, it'll have to stay the same during translation.
Same goes, e.g., for methods named next(), has_key() etc.

Most obvious would be a special comment, something like

for x in curiousobject.iteritems():  # 2to3:keep
     foo(x)

Does that make sense?

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From python3now at gmail.com  Mon Jul 16 03:14:00 2007
From: python3now at gmail.com (James Thiele)
Date: Sun, 15 Jul 2007 18:14:00 -0700
Subject: [Python-3000] exclusion feature for 2to3?
In-Reply-To: <f7e24l$92e$1@sea.gmane.org>
References: <f7e24l$92e$1@sea.gmane.org>
Message-ID: <8f01efd00707151814y600e6derb248e3dec921162c@mail.gmail.com>

It makes sense - what would you suggest to specify lines/features to exclude?


On 7/15/07, Georg Brandl <g.brandl at gmx.net> wrote:
> In order to have a codebase run in 2.x and 3.x, via automated translation by
> 2to3, there should be some "exclusion feature" for single lines that tells
> the refactorer not to touch those lines.
>
> For example, if you have some object that still has an iteritems() method and
> keeps it, it'll have to stay the same during translation.
> Same goes, e.g., for methods named next(), has_key() etc.
>
> Most obvious would be a special comment, something like
>
> for x in curiousobject.iteritems():  # 2to3:keep
>      foo(x)
>
> Does that make sense?
>
> Georg
>
> --
> Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
> Four shall be the number of spaces thou shalt indent, and the number of thy
> indenting shall be four. Eight shalt thou not indent, nor either indent thou
> two, excepting that thou then proceed to four. Tabs are right out.

From guido at python.org  Mon Jul 16 04:22:15 2007
From: guido at python.org (Guido van Rossum)
Date: Sun, 15 Jul 2007 19:22:15 -0700
Subject: [Python-3000] exclusion feature for 2to3?
In-Reply-To: <f7e24l$92e$1@sea.gmane.org>
References: <f7e24l$92e$1@sea.gmane.org>
Message-ID: <ca471dc20707151922i1894355fh5118d07aa68abb65@mail.gmail.com>

On 7/15/07, Georg Brandl <g.brandl at gmx.net> wrote:
> In order to have a codebase run in 2.x and 3.x, via automated translation by
> 2to3, there should be some "exclusion feature" for single lines that tells
> the refactorer not to touch those lines.
>
> For example, if you have some object that still has an iteritems() method and
> keeps it, it'll have to stay the same during translation.
> Same goes, e.g., for methods named next(), has_key() etc.
>
> Most obvious would be a special comment, something like
>
> for x in curiousobject.iteritems():  # 2to3:keep
>      foo(x)
>
> Does that make sense?

Absolutely. (Were you in the audience of my keynote at EuroPython? I
believe I briefly mentioned the need for such a feature there. :-)

Can't say I have a good feeling for how to implement it yet, but it
should definitely be possible. Precise syntax to be done.
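
One possible starting point, purely as a sketch: collect the line
numbers that carry the marker comment, so that a fixer could consult
the set before touching a node. The marker spelling follows the
proposal in this thread, the function name is hypothetical, and nothing
here is existing 2to3 functionality.

```python
import io
import tokenize

KEEP_MARKER = "2to3:keep"  # spelling taken from the proposal, not final


def lines_to_keep(source):
    """Return the set of line numbers whose comment contains the marker."""
    keep = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT and KEEP_MARKER in tok.string:
            keep.add(tok.start[0])
    return keep


sample = (
    "for x in curiousobject.iteritems():  # 2to3:keep\n"
    "    foo(x)\n"
    "for y in d.iteritems():\n"
    "    bar(y)\n"
)
protected = lines_to_keep(sample)
```

Using the tokenizer instead of a plain substring search means a marker
inside a string literal is not mistaken for a comment.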

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From nnorwitz at gmail.com  Mon Jul 16 08:12:26 2007
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sun, 15 Jul 2007 23:12:26 -0700
Subject: [Python-3000] Invalid \U escape in source code give
	hard-to-trace error
In-Reply-To: <ca471dc20707150717m7344c9cfh3237b78e9dcf681f@mail.gmail.com>
References: <ca471dc20707150717m7344c9cfh3237b78e9dcf681f@mail.gmail.com>
Message-ID: <ee2a432c0707152312g48030ado88ffb03e37956cdf@mail.gmail.com>

On 7/15/07, Guido van Rossum <guido at python.org> wrote:
> When a source file contains a string literal with an out-of-range \U
> escape (e.g. "\U12345678"), instead of a syntax error pointing to the
> offending literal, I get this, without any indication of the file or
> line:
>
> UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in
> position 0-9: illegal Unicode character
>
> This is quite hard to track down. (Both the location of the bad
> literal in the source file, and the origin of the error in the parser.
> :-) Can someone come up with a fix?

Take a look at the patch http://python.org/sf/1031213

That might help.  I'm not sure if it's the same problem.

I really need to dispose of a bunch of things assigned to me. :-(

n

From g.brandl at gmx.net  Mon Jul 16 13:23:29 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Mon, 16 Jul 2007 13:23:29 +0200
Subject: [Python-3000] exclusion feature for 2to3?
In-Reply-To: <ca471dc20707151922i1894355fh5118d07aa68abb65@mail.gmail.com>
References: <f7e24l$92e$1@sea.gmane.org>
	<ca471dc20707151922i1894355fh5118d07aa68abb65@mail.gmail.com>
Message-ID: <f7fkf5$a0n$1@sea.gmane.org>

Guido van Rossum schrieb:
> On 7/15/07, Georg Brandl <g.brandl at gmx.net> wrote:
>> In order to have a codebase run in 2.x and 3.x, via automated translation by
>> 2to3, there should be some "exclusion feature" for single lines that tells
>> the refactorer not to touch those lines.
>>
>> For example, if you have some object that still has an iteritems() method and
>> keeps it, it'll have to stay the same during translation.
>> Same goes, e.g., for methods named next(), has_key() etc.
>>
>> Most obvious would be a special comment, something like
>>
>> for x in curiousobject.iteritems():  # 2to3:keep
>>      foo(x)
>>
>> Does that make sense?
> 
> Absolutely. (Were you in the audience of my keynote at EuroPython? I
> believe I briefly mentioned the need for such a feature there. :-)

No, I ran the new documentation toolset through 2to3; and e.g. docutils
nodes have a has_key() that does something other than __contains__().

Good to know it's planned!
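
A small sketch of the kind of class I mean. The Node class below is a
made-up stand-in, not the real docutils code; it only shows why rewriting
x.has_key(k) into "k in x" can change behaviour.

```python
class Node:
    """Hypothetical node type: has_key() and 'in' look at different things."""

    def __init__(self, attributes, children):
        self.attributes = dict(attributes)
        self.children = list(children)

    def has_key(self, key):
        # Looks up an attribute name.
        return key in self.attributes

    def __contains__(self, item):
        # Looks for a child, not an attribute.
        return item in self.children


node = Node({"ids": ["intro"]}, ["some text"])
print(node.has_key("ids"))   # attribute lookup
print("ids" in node)         # child lookup, a different question
```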

Georg


From guido at python.org  Mon Jul 16 16:16:10 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 16 Jul 2007 07:16:10 -0700
Subject: [Python-3000] exclusion feature for 2to3?
In-Reply-To: <f7fkf5$a0n$1@sea.gmane.org>
References: <f7e24l$92e$1@sea.gmane.org>
	<ca471dc20707151922i1894355fh5118d07aa68abb65@mail.gmail.com>
	<f7fkf5$a0n$1@sea.gmane.org>
Message-ID: <ca471dc20707160716hcf583c4uca822775f19b9987@mail.gmail.com>

On 7/16/07, Georg Brandl <g.brandl at gmx.net> wrote:
> > Absolutely. (Were you in the audience of my keynote at EuroPython? I
> > believe I briefly mentioned the need for such a feature there. :-)
>
> No, I ran the new documentation toolset through 2to3; and e.g. docutils
> nodes have a has_key() that does something other than __contains__().
>
> Good to know it's planned!

Planned is a big word. Someone has to design and implement it.

BTW I hope to see more core developers from Europe at EuroPython next year!

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon Jul 16 20:29:17 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 16 Jul 2007 11:29:17 -0700
Subject: [Python-3000] test_mmap.py and OSError
In-Reply-To: <3f1451f50707120654s13e81551x25df9a1dadccafb0@mail.gmail.com>
References: <3f1451f50707112202n320e4b25hfadb3670129ba33a@mail.gmail.com>
	<4695BB9E.2030202@canterbury.ac.nz>
	<ca471dc20707120002u5d49a0c9s17970b705c68a588@mail.gmail.com>
	<3f1451f50707120654s13e81551x25df9a1dadccafb0@mail.gmail.com>
Message-ID: <ca471dc20707161129h63fa1e04g93100ba88eea8fd4@mail.gmail.com>

So, after seeing the patch and thinking this over some more, I have
changed my mind (again). Attempting to flush a closed file seems to
indicate that you're confused about whether a file is closed or not,
and that seems indicative of unclear thinking, i.e. it's likely a bug
that ought to be caught. I think the original thinking that led to
this being treated as an error in 2.x was correct.

I don't see attempts to close an already closed file the same way --
this is a state transition to a final state and it makes total sense
that you can reach that state from itself. There are good use cases
for allowing this. I don't see the use case for flushing a closed
file.
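
The distinction can be sketched with an in-memory stream. This assumes the
behaviour of the io module as it exists in current Python 3 releases, not
the io.py under discussion at the time.

```python
import io

buf = io.BytesIO(b"data")
buf.close()
buf.close()  # closing an already closed stream is allowed and does nothing

try:
    buf.flush()  # flushing a closed stream is treated as an error
    flush_error = None
except ValueError as exc:
    flush_error = exc

print(type(flush_error).__name__)
```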

--Guido

On 7/12/07, Joe Gregorio <joe at bitworking.org> wrote:
> On 7/12/07, Guido van Rossum <guido at python.org> wrote:
> > On 7/12/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> > > Joe Gregorio wrote:
> > > > flush() raises
> > > > ValueError() if the file is already closed,
> > > >
> > > > Should io.py raise OSError instead of ValueError?
> > >
> > > Is it really necessary to raise anything at all?
> > > An already-closed file is as flushed as it can
> > > get, so why not just let it be a no-op?
> >
> > I like that much better. So close() shouldn't try to flush() if it's
> > already closed. This means fixing io.py. (Unfortunately it's a bit of
> > a mess, a bit of refactoring would do it good.)
>
> Thanks for the guidance.
>
> This patch fixes mmap and also changes io.py
> so that close() doesn't flush if it's already closed.
> I did run both test_io.py and test_file.py when checking
> the changes to io.py.
>
> http://www.python.org/sf/1752647
>
>    Thanks,
>    -joe
>
> --
> Joe Gregorio        http://bitworking.org
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From joe at bitworking.org  Mon Jul 16 20:45:05 2007
From: joe at bitworking.org (Joe Gregorio)
Date: Mon, 16 Jul 2007 14:45:05 -0400
Subject: [Python-3000] test_mmap.py and OSError
In-Reply-To: <ca471dc20707161129h63fa1e04g93100ba88eea8fd4@mail.gmail.com>
References: <3f1451f50707112202n320e4b25hfadb3670129ba33a@mail.gmail.com>
	<4695BB9E.2030202@canterbury.ac.nz>
	<ca471dc20707120002u5d49a0c9s17970b705c68a588@mail.gmail.com>
	<3f1451f50707120654s13e81551x25df9a1dadccafb0@mail.gmail.com>
	<ca471dc20707161129h63fa1e04g93100ba88eea8fd4@mail.gmail.com>
Message-ID: <3f1451f50707161145i3a17541arf8c9c8595d641c39@mail.gmail.com>

On 7/16/07, Guido van Rossum <guido at python.org> wrote:
> So, after seeing the patch and thinking this over some more, I have
> changed my mind (again). Attempting to flush a closed file seems to
> indicate that you're confused about whether a file is closed or not,
> and that seems indicative of unclear thinking, i.e. it's likely a bug
> that ought to be caught. I think the original thinking that led to
> this being treated as an error in 2.x was correct.
>
> I don't see attempts to close an already closed file the same way --
> this is a state transition to a final state and it makes total sense
> that you can reach that state from itself. There are good use cases
> for allowing this. I don't see the use case for flushing a closed
> file.

Personally I like that better, it seems more consistent.

Should I change the try/except block in the mmap unit test to look for
ValueError or should the exception raised in io.py be of type OSError like
the 2.5 code expects?

test_mmap.py:108

            try:
                f.close()
            except OSError:
                pass

   Thanks,
   -joe

-- 
Joe Gregorio        http://bitworking.org

From guido at python.org  Mon Jul 16 21:36:59 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 16 Jul 2007 12:36:59 -0700
Subject: [Python-3000] test_mmap.py and OSError
In-Reply-To: <3f1451f50707161145i3a17541arf8c9c8595d641c39@mail.gmail.com>
References: <3f1451f50707112202n320e4b25hfadb3670129ba33a@mail.gmail.com>
	<4695BB9E.2030202@canterbury.ac.nz>
	<ca471dc20707120002u5d49a0c9s17970b705c68a588@mail.gmail.com>
	<3f1451f50707120654s13e81551x25df9a1dadccafb0@mail.gmail.com>
	<ca471dc20707161129h63fa1e04g93100ba88eea8fd4@mail.gmail.com>
	<3f1451f50707161145i3a17541arf8c9c8595d641c39@mail.gmail.com>
Message-ID: <ca471dc20707161236r76b28e5fx49da5831937b87ce@mail.gmail.com>

On 7/16/07, Joe Gregorio <joe at bitworking.org> wrote:
> On 7/16/07, Guido van Rossum <guido at python.org> wrote:
> > So, after seeing the patch and thinking this over some more, I have
> > changed my mind (again). Attempting to flush a closed file seems to
> > indicate that you're confused about whether a file is closed or not,
> > and that seems indicative of unclear thinking, i.e. it's likely a bug
> > that ought to be caught. I think the original thinking that led to
> > this being treated as an error in 2.x was correct.
> >
> > I don't see attempts to close an already closed file the same way --
> > this is a state transition to a final state and it makes total sense
> > that you can reach that state from itself. There are good use cases
> > for allowing this. I don't see the use case for flushing a closed
> > file.
>
> Personally I like that better, it seems more consistent.
>
> Should I change the try/except block in the mmap unit test to look for
> ValueError or should the exception raised in io.py be of type OSError like
> the 2.5 code expects?
>
> test_mmap.py:108
>
>             try:
>                 f.close()
>             except OSError:
>                 pass
>
>    Thanks,
>    -joe

I just checked in your changes, but looking at the code, I think it's
bogus either way: there should be two separate try/finally blocks
corresponding to the two 'f = open(...)' calls. I'll fix it that way.
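
Roughly the shape I have in mind, as a sketch only. The function name and
the temporary file are placeholders, not the actual test_mmap.py code.

```python
import os
import tempfile


def write_then_read(path, payload):
    # First open: its own try/finally, so this handle is always closed.
    f = open(path, "wb")
    try:
        f.write(payload)
    finally:
        f.close()

    # Second open: a separate try/finally for the second handle.
    f = open(path, "rb")
    try:
        return f.read()
    finally:
        f.close()


with tempfile.TemporaryDirectory() as tmp:
    result = write_then_read(os.path.join(tmp, "demo.bin"), b"abc")
print(result)
```

With each open() paired with its own finally, no except clause around
close() is needed at all.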

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon Jul 16 22:35:12 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 16 Jul 2007 13:35:12 -0700
Subject: [Python-3000] Invalid \U escape in source code give
	hard-to-trace error
In-Reply-To: <ee2a432c0707152312g48030ado88ffb03e37956cdf@mail.gmail.com>
References: <ca471dc20707150717m7344c9cfh3237b78e9dcf681f@mail.gmail.com>
	<ee2a432c0707152312g48030ado88ffb03e37956cdf@mail.gmail.com>
Message-ID: <ca471dc20707161335m1d41d274x3d64d3f9856d5bbc@mail.gmail.com>

Doesn't look like it's the same problem. I've assigned that one to
Martin who knows that area best of all.
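
For anyone who wants to reproduce the report quoted below, a minimal
sketch. It assumes a current Python 3, where (as far as I can tell) the
problem surfaces as a SyntaxError carrying a filename and line number
rather than a bare UnicodeDecodeError. The filename "<demo>" is a
placeholder.

```python
# Source text with an out-of-range \U escape on its third line.
bad_source = 'x = 1\ny = 2\ns = "\\U12345678"\n'

try:
    compile(bad_source, "<demo>", "exec")
    caught = None
except SyntaxError as exc:
    caught = exc

# The exception object records where the problem was found.
print(type(caught).__name__, caught.filename, caught.lineno)
```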

On 7/15/07, Neal Norwitz <nnorwitz at gmail.com> wrote:
> On 7/15/07, Guido van Rossum <guido at python.org> wrote:
> > When a source file contains a string literal with an out-of-range \U
> > escape (e.g. "\U12345678"), instead of a syntax error pointing to the
> > offending literal, I get this, without any indication of the file or
> > line:
> >
> > UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in
> > position 0-9: illegal Unicode character
> >
> > This is quite hard to track down. (Both the location of the bad
> > literal in the source file, and the origin of the error in the parser.
> > :-) Can someone come up with a fix?
>
> Take a look at the patch http://python.org/sf/1031213
>
> That might help.  I'm not sure if it's the same problem.
>
> I really need to dispose of a bunch of things assigned to me. :-(
>
> n
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon Jul 16 23:23:33 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 16 Jul 2007 14:23:33 -0700
Subject: [Python-3000] TextIOWrapper.write(s:str) and bytes in
	py3k-struni
In-Reply-To: <f7ajg5$pg5$1@sea.gmane.org>
References: <f7ajg5$pg5$1@sea.gmane.org>
Message-ID: <ca471dc20707161423p106bfe15i29eb572dda2b07c2@mail.gmail.com>

On 7/14/07, Christian Heimes <lists at cheimes.de> wrote:
> I'm having some troubles with unit tests in the py3k-struni branch. Some
> test like test_uu are failing because an io.TextIOWrapper instance's
> write() method doesn't handle bytes. The method is defined as:
>
>     def write(self, s: str):
>         if self.closed:
>             raise ValueError("write to closed file")
>         # XXX What if we were just reading?
>         b = s.encode(self._encoding)
>         if isinstance(b, str):
>             b = bytes(b)
>         n = self.buffer.write(b)
>         if "\n" in s:
>             # XXX only if isatty
>             self.flush()
>         self._snapshot = self._decoder = None
>         return len(s)
>
> The problematic lines are the lines from s.encode() to b = bytes(b). The
> behavior is more than questionable. A bytes object doesn't have an
> encode() method and str's encode() method always returns bytes. IMO the
> write() method should be changed to:
>
>     def write(self, s: (str, bytes)):
>         if self.closed:
>             raise ValueError("write to closed file")
>         # XXX What if we were just reading?
>         if isinstance(s, basestring):
>             b = s.encode(self._encoding)
>         elif isinstance(s, bytes):
>             b = s
>         else:
>             b = bytes(s)
>         n = self.buffer.write(b)
>         if b"\n" in b:
>             # XXX only if isatty
>             self.flush()
>         self._snapshot = self._decoder = None
>         return len(s)
>
> Or the write() should explicitly raise a TypeError when it is not allowed
> to handle bytes.

I came across this in your SF patch. I disagree with your desire to
let TextIOWrapper.write() handle bytes: it should *only* be passed str
objects. The uu test was failing because it was writing bytes to a
text stream.

Perhaps the error should be better; though I'm not sure I want to add
explicit type checks (as it would defeat duck typing).
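
A sketch of the intended contract, written against the io module in
current Python 3 releases (which does end up rejecting bytes with a
TypeError). The stream setup is only for illustration.

```python
import io

raw = io.BytesIO()
# newline="\n" keeps the written bytes identical on every platform.
text = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")

text.write("hello\n")  # str is what a text stream expects

try:
    text.write(b"hello\n")  # bytes belong on the underlying binary stream
    error = None
except TypeError as exc:
    error = exc

text.flush()
print(type(error).__name__, raw.getvalue())
```

Code that really has bytes in hand should write them to the binary layer
(here, raw) instead of the text wrapper.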

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue Jul 17 01:58:36 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 16 Jul 2007 16:58:36 -0700
Subject: [Python-3000] pep3115 - metaclasses in python 3000
In-Reply-To: <4698578C.3080808@acm.org>
References: <f78it5$t35$1@sea.gmane.org> <4698578C.3080808@acm.org>
Message-ID: <ca471dc20707161658l58a1d600j34c7175aac3a4b3c@mail.gmail.com>

On 7/13/07, Talin <talin at acm.org> wrote:
> Thomas Heller wrote:
> > playing a little with py3k...
> >
> > pep3115 mentions that "__prepare__ returns a dictionary-like object
> > which is used to store the class member definitions during evaluation
> > of the class body."
> >
> > It does not mention whether this dict-like object is used afterwards
> > as the class-dictionary of the created class or not (when the __new__
> > method of the metaclass is called).
>
> The intention is that it's up to the metaclass to decide. I suspect that
> most metaclasses won't want to use the dict-like object as the class
> dict, for two reasons:
>
> 1) The behavior of assigning to the class dict after class creation is
> likely to be different than the behavior of assignment during class
> creation. In particular, a typical 'dict-like' object is likely to be
> slower than a dict (it has more work to do, after all), and you don't
> want that slowness around once your class is finished initializing.
>
> 2) A 'dict-like' object doesn't have to support all of the methods of a
> real dict, whereas a class dict does. So your dict-like wrapper can be
> relatively simple.

The object returned by __prepare__() actually *is* incorporated into
the class object, unless the metaclass' __new__() passes something
else to type.__new__(). However this isn't obvious when you ask for
the class' __dict__ attribute: you always get a dict proxy.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue Jul 17 02:11:09 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 16 Jul 2007 17:11:09 -0700
Subject: [Python-3000] pep3115 - metaclasses in python 3000
In-Reply-To: <ca471dc20707161658l58a1d600j34c7175aac3a4b3c@mail.gmail.com>
References: <f78it5$t35$1@sea.gmane.org> <4698578C.3080808@acm.org>
	<ca471dc20707161658l58a1d600j34c7175aac3a4b3c@mail.gmail.com>
Message-ID: <ca471dc20707161711q49d46202s2ce6f5989dff6aca@mail.gmail.com>

On 7/16/07, Guido van Rossum <guido at python.org> wrote:
> The object returned by __prepare__() actually *is* incorporated into
> the class object, unless the metaclass' __new__() passes something
> else to type.__new__(). However this isn't obvious when you ask for
> the class' __dict__ attribute: you always get a dict proxy.

I take it back. The object is copied, for the reasons Phillip
explained. There is no way around this without writing C code, as the
only way to create a type object from Python is to call type.__new__()
-- the __new__() method of a subclass of type still must call type's
__new__() method to create the actual object.
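
A sketch that shows the copy, assuming the PEP 3115 semantics as they
shipped in Python 3. RecordingDict, Meta and the _namespace and
_definition_order attributes are made-up names for illustration.

```python
class RecordingDict(dict):
    """Dict-like namespace that remembers the order of assignments."""

    def __init__(self):
        super().__init__()
        self.order = []

    def __setitem__(self, key, value):
        self.order.append(key)
        super().__setitem__(key, value)


class Meta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        return RecordingDict()

    def __new__(mcls, name, bases, namespace, **kwds):
        # type.__new__() copies the namespace into the new class.
        cls = super().__new__(mcls, name, bases, namespace)
        cls._definition_order = [
            key for key in namespace.order if not key.startswith("__")
        ]
        cls._namespace = namespace  # kept only so the copy can be observed
        return cls


class Example(metaclass=Meta):
    a = 1
    b = 2


# Changing the original namespace afterwards does not affect the class.
Example._namespace["late"] = 3
print(Example._definition_order, hasattr(Example, "late"))
print(type(Example.__dict__).__name__)
```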

(Embarrassed, since I wrote all the code involved.)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From lists at cheimes.de  Tue Jul 17 03:22:13 2007
From: lists at cheimes.de (Christian Heimes)
Date: Tue, 17 Jul 2007 03:22:13 +0200
Subject: [Python-3000] TextIOWrapper.write(s:str) and bytes in
	py3k-struni
In-Reply-To: <ca471dc20707161423p106bfe15i29eb572dda2b07c2@mail.gmail.com>
References: <f7ajg5$pg5$1@sea.gmane.org>
	<ca471dc20707161423p106bfe15i29eb572dda2b07c2@mail.gmail.com>
Message-ID: <469C19C5.3010006@cheimes.de>

Guido van Rossum wrote:
> I came across this in your SF patch. I disagree with your desire to
> let TextIOWrapper.write() handle bytes: it should *only* be passed str
> objects. The uu test was failing because it was writing bytes to a
> text stream.
> 
> Perhaps the error should be better; though I'm not sure I want to add
> explicit type checks (as it would defeat duck typing).

Yes, duck typing is very useful but this duck doesn't quack me why it
hurts. ;) It's rather confusing at first.

What do you think about

    def write(self, s: str):
        if self.closed:
            raise ValueError("write to closed file")
        try:
            b = s.encode(self._encoding)
        except AttributeError:
            raise TypeError("str expected, got %r" % s)
        ...

or

    def write(self, s: str):
        if self.closed:
            raise ValueError("write to closed file")
        if not hasattr(s, 'encode'):
            raise TypeError("str expected, got %r" % s)
        ...

? It explains what is going wrong.

Christian


From martin at v.loewis.de  Tue Jul 17 06:52:27 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 17 Jul 2007 06:52:27 +0200
Subject: [Python-3000] Heaptypes
In-Reply-To: <ca471dc20707140708n413bfe9fwc6d223f50ff44573@mail.gmail.com>
References: <f72o9f$v6i$1@sea.gmane.org>	
	<ca471dc20707110715s4cd53401t53e9075bdc2ea1df@mail.gmail.com>	
	<46979811.2050405@v.loewis.de>
	<ca471dc20707140708n413bfe9fwc6d223f50ff44573@mail.gmail.com>
Message-ID: <469C4B0B.50605@v.loewis.de>

Guido van Rossum schrieb:
> That sounds like a good idea to try. It may break some more tests but
> those are all indications of places that incorrectly still require
> str8.
> 
>> I wonder whether the "s" specifier in CallFunction, BuildValue etc
>> should create Unicode objects, rather than str8 objects.

Done. I fixed a number of test cases that broke because of that.
In particular, bytes.__reduce__ could not easily return str8 objects
as its marshalling state anymore (and shouldn't do so, anyway).
So I made bytes a builtin type of pickle, using the S code.
As a consequence, a number of other types had to get fixed.

So in total, it adds one new failure: something in test_pickle
now complains that bytes objects are not hashable.

Regards,
Martin

From p.f.moore at gmail.com  Tue Jul 17 13:04:13 2007
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 17 Jul 2007 12:04:13 +0100
Subject: [Python-3000] TextIOWrapper.write(s:str) and bytes in
	py3k-struni
In-Reply-To: <469C19C5.3010006@cheimes.de>
References: <f7ajg5$pg5$1@sea.gmane.org>
	<ca471dc20707161423p106bfe15i29eb572dda2b07c2@mail.gmail.com>
	<469C19C5.3010006@cheimes.de>
Message-ID: <79990c6b0707170404x68b6b99cj6be77e4f8e65c82@mail.gmail.com>

On 17/07/07, Christian Heimes <lists at cheimes.de> wrote:
>    def write(self, s: str):
>        if self.closed:
>            raise ValueError("write to closed file")
>        if not hasattr(s, 'encode'):
>            raise TypeError("str expected, got %r" % s)
>        ...
>
> ? It explains what is going wrong.

Surely the error should say that the object passed needs an encode
method, rather than that it should be a str?

Paul.

From ncoghlan at gmail.com  Tue Jul 17 14:15:31 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 17 Jul 2007 22:15:31 +1000
Subject: [Python-3000] TextIOWrapper.write(s:str) and bytes
	in	py3k-struni
In-Reply-To: <469C19C5.3010006@cheimes.de>
References: <f7ajg5$pg5$1@sea.gmane.org>	<ca471dc20707161423p106bfe15i29eb572dda2b07c2@mail.gmail.com>
	<469C19C5.3010006@cheimes.de>
Message-ID: <469CB2E3.5070309@gmail.com>

Christian Heimes wrote:
> What do you think about
> 
>     def write(self, s: str):
>         if self.closed:
>             raise ValueError("write to closed file")
>         try:
>             b = s.encode(self._encoding)
>         except AttributeError:
>             raise TypeError("str expected, got %r" % s)
>         ...

The try/except here is a bit too broad - you only want to trap the 
attribute error.

That said, I'm not sure what error you could raise that would be clearer 
than complaining that the object passed in doesn't have an encode() method.
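
One way to narrow it, as a sketch only. The helper name encode_text and
the message wording are assumptions of mine, not proposed io.py code. The
point is that only the attribute lookup is guarded, so an AttributeError
raised inside encode() itself is not swallowed.

```python
def encode_text(s, encoding="utf-8"):
    """Encode s, complaining clearly if it has no encode() method."""
    try:
        encode = s.encode  # guard the lookup only, not the call
    except AttributeError:
        raise TypeError(
            "object with an encode() method expected, got %s"
            % type(s).__name__
        ) from None
    return encode(encoding)


print(encode_text("abc"))
```

Passing a bytes object then fails with a TypeError that names the missing
method rather than claiming a str was required.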

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From guido at python.org  Tue Jul 17 16:25:30 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 17 Jul 2007 07:25:30 -0700
Subject: [Python-3000] Heaptypes
In-Reply-To: <469C4B0B.50605@v.loewis.de>
References: <f72o9f$v6i$1@sea.gmane.org>
	<ca471dc20707110715s4cd53401t53e9075bdc2ea1df@mail.gmail.com>
	<46979811.2050405@v.loewis.de>
	<ca471dc20707140708n413bfe9fwc6d223f50ff44573@mail.gmail.com>
	<469C4B0B.50605@v.loewis.de>
Message-ID: <ca471dc20707170725q91bfba7p6f549a613c0c300e@mail.gmail.com>

Thanks! Can you add test_pickle to the wiki page?
(http://wiki.python.org/moin/Py3kStrUniTests)

On 7/16/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Guido van Rossum schrieb:
> > That sounds like a good idea to try. It may break some more tests but
> > those are all indications of places that incorrectly still require
> > str8.
> >
> >> I wonder whether the "s" specifier in CallFunction, BuildValue etc
> >> should create Unicode objects, rather than str8 objects.
>
> Done. I fixed a number of test cases that broke because of that.
> In particular, bytes.__reduce__ could not easily return str8 objects
> as its marshalling state anymore (and shouldn't do so, anyway).
> So I made bytes a builtin type of pickle, using the S code.
> As a consequence, a number of other types had to get fixed.
>
> So in total, it adds one new failure: something in test_pickle
> now complains that bytes objects are not hashable.
>
> Regards,
> Martin
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Tue Jul 17 22:42:54 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 17 Jul 2007 22:42:54 +0200
Subject: [Python-3000] Heaptypes
In-Reply-To: <ca471dc20707170725q91bfba7p6f549a613c0c300e@mail.gmail.com>
References: <f72o9f$v6i$1@sea.gmane.org>	
	<ca471dc20707110715s4cd53401t53e9075bdc2ea1df@mail.gmail.com>	
	<46979811.2050405@v.loewis.de>	
	<ca471dc20707140708n413bfe9fwc6d223f50ff44573@mail.gmail.com>	
	<469C4B0B.50605@v.loewis.de>
	<ca471dc20707170725q91bfba7p6f549a613c0c300e@mail.gmail.com>
Message-ID: <469D29CE.5050600@v.loewis.de>

Guido van Rossum schrieb:
> Thanks! Can you add test_pickle to the wiki page?
> (http://wiki.python.org/moin/Py3kStrUniTests)

Done!

Martin

From guido at python.org  Tue Jul 17 23:04:14 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 17 Jul 2007 14:04:14 -0700
Subject: [Python-3000] Heaptypes
In-Reply-To: <469D29CE.5050600@v.loewis.de>
References: <f72o9f$v6i$1@sea.gmane.org>
	<ca471dc20707110715s4cd53401t53e9075bdc2ea1df@mail.gmail.com>
	<46979811.2050405@v.loewis.de>
	<ca471dc20707140708n413bfe9fwc6d223f50ff44573@mail.gmail.com>
	<469C4B0B.50605@v.loewis.de>
	<ca471dc20707170725q91bfba7p6f549a613c0c300e@mail.gmail.com>
	<469D29CE.5050600@v.loewis.de>
Message-ID: <ca471dc20707171404l40c28f9cr733930031123537@mail.gmail.com>

On 7/17/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Guido van Rossum schrieb:
> > Thanks! Can you add test_pickle to the wiki page?
> > (http://wiki.python.org/moin/Py3kStrUniTests)
>
> Done!

But now I'm confused. I don't see the failure. Are you sure you
checked in what you did? In the py3k-struni branch?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue Jul 17 23:47:51 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 17 Jul 2007 14:47:51 -0700
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <loom.20070713T201857-3@post.gmane.org>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<loom.20070713T201857-3@post.gmane.org>
Message-ID: <ca471dc20707171447p68c59c44w254ee9890eb44b8f@mail.gmail.com>

On 7/13/07, Michele Simionato <michele.simionato at gmail.com> wrote:
> Phillip J. Eby <pje <at> telecommunity.com> writes:
> > For what it's worth, the pkgutil
> > module already contains an even simpler generic function
> > implementation than simplegeneric, and is already in the stdlib
> > albeit undocumented.
>
> Well, that is good to know. Personally I would be content with something
> at that level of sophistication (i.e. the absolute minimum). I think there
> is no much experience in the community with generic functions (except for
> you) and there is no danger in waiting and in acquiring more experience
> before including in the standard library a fully featured package. After
> all, RuleDispatch is already out there and there is no reason for putting
> everything in the stdlib. For the same reason, I am happy that Zope interfaces
> will stay out of the stdlib, and that we will have the much simpler ABC
> (of course one could argue that generic functions are better than ABC and
> actually I think so, but still ABC are a simpler entry point for most
> programmers, more in line with how Python has worked until now, and they
> will allow me to throw away a half-baked interface implementation I am
> using now, which is always a good thing ;)

Actually, I believe ABCs and GFs work well together, and I believe
Phillip has said so too.

Regarding the fate of PEP 3124, perhaps the right thing is to reject
the PEP, and be content with having GFs as a third party add-on? There
seems to be nothing particular about Python 3.0 as the point of
introduction of GFs anyway -- they can be introduced just as easily in
3.1 or 4.0 or any time later (or earlier, as Phillip's existing
implementation shows).

I have one remaining question for Phillip: why is your design
"absolutely dependent on being able to modify functions in-place"?
That dependency would appear to make it harder to port the design to
other Python implementations whose function objects don't behave the
same way. I can see it as a philosophically desirable feature; but I
don't understand the technical need for it.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Wed Jul 18 00:38:06 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 17 Jul 2007 18:38:06 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <ca471dc20707171447p68c59c44w254ee9890eb44b8f@mail.gmail.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<loom.20070713T201857-3@post.gmane.org>
	<ca471dc20707171447p68c59c44w254ee9890eb44b8f@mail.gmail.com>
Message-ID: <20070717223550.7B1B13A403A@sparrow.telecommunity.com>

At 02:47 PM 7/17/2007 -0700, Guido van Rossum wrote:
>I have one remaining question for Phillip: why is your design
>"absolutely dependent on being able to modify functions in-place"?
>That dependency would appear to make it harder to port the design to
>other Python implementations whose function objects don't behave the
>same way. I can see it as a philosophically desirable feature; but I
>don't understand the technical need for it.

It allows the framework to bootstrap via successive 
approximation.  Initially, the 'implies()' function is just a plain 
function, and then it later becomes a generic function.  (And of 
course it gets called in between those two points.)  The same happens 
for 'disjuncts()' and 'overrides()'.

Is it potentially possible that there's another way to do it, given 
enough restrictions on how other code uses the exported API and 
enough hackery during bootstrapping?  Perhaps, but I don't know of 
such a way.  The modification-in-place approach allows me to just 
write the functions and not care precisely when they become 
generic.  I still have to do a little extra special bootstrapping for 
implies(), because of its self-referential nature, but everything 
else I can pretty much blaze right on through with.

(By the way, AFAIK IronPython, Jython (2.2), and PyPy all support 
writable func_code attributes, so it's evidently practical to do so 
for reasonably dynamic Python implementations.)
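
A toy sketch of what "modify in place" buys, using the Python 3 spelling
__code__ for func_code. This is not the framework's actual bootstrap
code; both function bodies are made up, and the trick as written assumes
neither function closes over any variables.

```python
def implies(a, b):
    """Bootstrap version: plain equality only."""
    return a == b


# Some other module may already hold a reference during bootstrap.
early_reference = implies


def _richer_implies(a, b):
    """Hypothetical later version that also understands subclassing."""
    if isinstance(a, type) and isinstance(b, type):
        return issubclass(a, b)
    return a == b


# Swap the implementation in place; the function object stays the same,
# so every existing reference picks up the new behaviour.
implies.__code__ = _richer_implies.__code__

print(early_reference is implies, early_reference(bool, int))
```

Rebinding the name implies to a new object would not have this effect:
early_reference would keep pointing at the old plain function.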


>Regarding the fate of PEP 3124, perhaps the right thing is to reject
>the PEP, and be content with having GFs as a third party add-on?

I've also suggested simply deferring it.  I'd still like to see a 
"blessed" meta-API for generic functions at some point.

Also, as I've said, there's nothing stopping anybody from stepping up 
with a less-ambitious and less-controversial implementation based on 
your preferred API.  I just won't be able to get to it myself for a 
month or so.

(Also, nothing stops such a less-ambitious approach from being later 
folded into something more like my approach, with full extensibility 
and all the bells and whistles.  In the worst case, one could always 
make a backward compatibility layer that fakes the more limited API 
using the more general one, as long as the lesser API is a strict 
subset of the greater -- and I believe it is.)


>There seems to be nothing particular about Python 3.0 as the point of
>introduction of GFs anyway -- they can be introduced just as easily in
>3.1 or 4.0 or any time later (or earlier, as Phillip's existing
>implementation shows).

Well, the one thing that might still be relevant is the "overloading 
inside classes" rule.  That's the only bit that has any effect on 
Python 3.0 semantics vis-a-vis metaclasses, class decorators, etc.

The way things currently stand for 3.0, I actually *won't* be able to 
make a GF implementation that handles the "first argument should be 
of the containing class" rule without users having an explicit 
metaclass or class decorator that supports it.

In 2.x, I take advantage of the ability of code run inside a class 
suite to change the enclosing class' __metaclass__; in 3.0, you can't 
do this anymore since the __metaclass__ doesn't come from the class 
suite, and there isn't a replacement hook.
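
To illustrate the 3.0 side of this, a minimal sketch, assuming the
metaclass syntax as it shipped in Python 3. The class names are
placeholders.

```python
class Meta(type):
    pass


class SetInSuite:
    # In 2.x this assignment selected the metaclass; in Python 3 it is
    # just an ordinary class attribute with no special meaning.
    __metaclass__ = Meta


class SetInHeader(metaclass=Meta):
    pass


print(type(SetInSuite).__name__, type(SetInHeader).__name__)
```

Since the metaclass is fixed by the class statement's header before the
suite runs, code executed inside the suite has no hook left to change it.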


From guido at python.org  Wed Jul 18 00:53:24 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 17 Jul 2007 15:53:24 -0700
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070717223550.7B1B13A403A@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<loom.20070713T201857-3@post.gmane.org>
	<ca471dc20707171447p68c59c44w254ee9890eb44b8f@mail.gmail.com>
	<20070717223550.7B1B13A403A@sparrow.telecommunity.com>
Message-ID: <ca471dc20707171553x69ebd106n2af86d47e2f6afc3@mail.gmail.com>

On 7/17/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 02:47 PM 7/17/2007 -0700, Guido van Rossum wrote:
> >I have one remaining question for Phillip: why is your design
> >"absolutely dependent on being able to modify functions in-place"?
> >That dependency would appear to make it harder to port the design to
> >other Python implementations whose function objects don't behave the
> >same way. I can see it as a philosophical desirable feature; but I
> >don't understand the technical need for it.
>
> It allows the framework to bootstrap via successive
> approximation.  Initially, the 'implies()' function is just a plain
> function, and then it later becomes a generic function.  (And of
> course it gets called in between those two points.)  The same happens
> for 'disjuncts()' and 'overrides()'.

Why isn't it possible to mark these functions as explicitly
overloadable? I'm not sure I understand what you mean by
"bootstrapping".

> Is it potentially possible that there's another way to do it, given
> enough restrictions on how other code uses the exported API and
> enough hackery during bootstrapping?  Perhaps, but I don't know of
> such a way.  The modification-in-place approach allows me to just
> write the functions and not care precisely when they become
> generic.  I still have to do a little extra special bootstrapping for
> implies(), because of its self-referential nature, but everything
> else I can pretty much blaze right on through with.

I guess I'll have to reserve judgment until the implementation exists.

> (By the way, AFAIK IronPython, Jython (2.2), and PyPy all support
> writable func_code attributes, so it's evidently practical to do so
> for reasonably dynamic Python implementations.)

Fair enough, though I suspect that IronPython might use certain
optimizations that depend on func_code not being written. However, I
certainly don't know enough about it. Anyone familiar with IronPython
on this list care to comment?

> >Regarding the fate of PEP 3124, perhaps the right thing is to reject
> >the PEP, and be content with having GFs as a third party add-on?
>
> I've also suggested simply deferring it.  I'd still like to see a
> "blessed" meta-API for generic functions at some point.

I'll defer it. It seems you are the only one who can write such a
blessed meta-API, and I'm guessing that's the part of PEP 3124 that
was never completed.

> Also, as I've said, there's nothing stopping anybody from stepping up
> with a less-ambitious and less-controversial implementation based on
> your preferred API.  I just won't be able to get to it myself for a
> month or so.

I'm not sure anybody else cares enough to pre-empt you.

> (Also, nothing stops such a less-ambitious approach from being later
> folded into something more like my approach, with full extensibility
> and all the bells and whistles.  In the worst case, one could always
> make a backward compatibility layer that fakes the more limited API
> using the more general one, as long as the lesser API is a strict
> subset of the greater -- and I believe it is.)
>
>
> >There seems to be nothing particular about Python 3.0 as the point of
> >introduction of GFs anyway -- they can be introduced just as easily in
> >3.1 or 4.0 or any time later (or earlier, as Phillip's existing
> >implementation show).
>
> Well, the one thing that might still be relevant is the "overloading
> inside classes" rule.  That's the only bit that has any effect on
> Python 3.0 semantics vis-a-vis metaclasses, class decorators, etc.
>
> The way things currently stand for 3.0, I actually *won't* be able to
> make a GF implementation that handles the "first argument should be
> of the containing class" rule without users having an explicit
> metaclass or class decorator that supports it.
>
> In 2.x, I take advantage of the ability of code run inside a class
> suite to change the enclosing class' __metaclass__; in 3.0, you can't
> do this anymore since the __metaclass__ doesn't come from the class
> suite, and there isn't a replacement hook.

I don't understand enough of your implementation to understand this requirement.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From alexandre at peadrop.com  Wed Jul 18 01:27:54 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Tue, 17 Jul 2007 19:27:54 -0400
Subject: [Python-3000] Introspection broken for objects using Py_FindMethod()
Message-ID: <acd65fa20707171627t29f9fc03p37164f3d87d94a25@mail.gmail.com>

Hi,

Is it intentional that introspection is broken for objects using
Py_FindMethod()? For example:

   Python 3.0x (cpy_merge:56413:56414M, Jul 17 2007, 13:57:23)
   [GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
   >>> import cPickle
   >>> dir(cPickle.Unpickler(file))
   []
   >>> dir(cPickle.Pickler(file))
   ['PicklingError', snip...]

Thanks,
-- Alexandre

From guido at python.org  Wed Jul 18 01:52:16 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 17 Jul 2007 16:52:16 -0700
Subject: [Python-3000] Introspection broken for objects using
	Py_FindMethod()
In-Reply-To: <acd65fa20707171627t29f9fc03p37164f3d87d94a25@mail.gmail.com>
References: <acd65fa20707171627t29f9fc03p37164f3d87d94a25@mail.gmail.com>
Message-ID: <ca471dc20707171652w254d597bl9068abae61b64da4@mail.gmail.com>

On 7/17/07, Alexandre Vassalotti <alexandre at peadrop.com> wrote:
> Hi,
>
> Is it intentional that introspection is broken for objects using
> Py_FindMethod()? For example:
>
>    Python 3.0x (cpy_merge:56413:56414M, Jul 17 2007, 13:57:23)
>    [GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
>    >>> import cPickle
>    >>> dir(cPickle.Unpickler(file))
>    []
>    >>> dir(cPickle.Pickler(file))
>    ['PicklingError', snip...]

Yes, see a thread between me, Georg and Brett around March 7-10:

http://mail.python.org/pipermail/python-3000/2007-March/006061.html

I think the conclusion was to get rid of Py_FindMethod altogether. The
replacement isn't very hard. But it hasn't been done yet.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Wed Jul 18 02:27:02 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 17 Jul 2007 20:27:02 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <ca471dc20707171553x69ebd106n2af86d47e2f6afc3@mail.gmail.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<loom.20070713T201857-3@post.gmane.org>
	<ca471dc20707171447p68c59c44w254ee9890eb44b8f@mail.gmail.com>
	<20070717223550.7B1B13A403A@sparrow.telecommunity.com>
	<ca471dc20707171553x69ebd106n2af86d47e2f6afc3@mail.gmail.com>
Message-ID: <20070718002446.4B2763A403A@sparrow.telecommunity.com>

At 03:53 PM 7/17/2007 -0700, Guido van Rossum wrote:
>On 7/17/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> > At 02:47 PM 7/17/2007 -0700, Guido van Rossum wrote:
> > >I have one remaining question for Phillip: why is your design
> > >"absolutely dependent on being able to modify functions in-place"?
> > >That dependency would appear to make it harder to port the design to
> > >other Python implementations whose function objects don't behave the
> > >same way. I can see it as a philosophical desirable feature; but I
> > >don't understand the technical need for it.
> >
> > It allows the framework to bootstrap via successive
> > approximation.  Initially, the 'implies()' function is just a plain
> > function, and then it later becomes a generic function.  (And of
> > course it gets called in between those two points.)  The same happens
> > for 'disjuncts()' and 'overrides()'.
>
>Why isn't it possible to mark these functions as explicitly
>overloadable?

How would I ever add rules to them, if I need them to already be 
callable in order to add any rules in the first place?  :)  (In 
practice, things are even hairier, because I also sometimes need to 
call these functions *while they are already being called*, if 
there's no cache hit!)

This is partly a consequence of splitting responsibilities between 
"rule sets" and "dispatch engines".  PEAK-Rules wants to be able to 
use a simple type-tuple dispatcher (like your prototype), but also 
upgrade to fancier engines as required for specific functions, 
without changing the rules already registered for the function.  So 
it treats the set of overloads as a separate object from the engine 
that actually implements dispatching.  That way, you can upgrade the 
engine, even while keeping the rules.
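A minimal sketch of that split, assuming nothing about the real PEAK-Rules
classes: RuleSet, TypeTupleEngine and SubclassEngine are hypothetical names
invented for illustration.  The rules live in one object, and either engine
can be built from them without re-registering anything.

```python
class RuleSet:
    """Holds (types, implementation) pairs, independent of any engine."""

    def __init__(self):
        self.rules = []

    def add(self, types, func):
        self.rules.append((types, func))


class TypeTupleEngine:
    """Simple engine: exact match on the tuple of argument types."""

    def __init__(self, ruleset):
        self.table = {types: func for types, func in ruleset.rules}

    def dispatch(self, *args):
        return self.table[tuple(type(a) for a in args)](*args)


class SubclassEngine:
    """Fancier engine: first rule whose types accept the arguments."""

    def __init__(self, ruleset):
        self.rules = list(ruleset.rules)

    def dispatch(self, *args):
        for types, func in self.rules:
            if len(types) == len(args) and all(
                isinstance(a, t) for a, t in zip(args, types)
            ):
                return func(*args)
        raise TypeError("no applicable rule")


rules = RuleSet()
rules.add((int,), lambda x: "int")

simple = TypeTupleEngine(rules)    # exact types only
upgraded = SubclassEngine(rules)   # same rules, different engine
```

Here simple.dispatch(True) fails because bool is not an exact match for int,
while upgraded.dispatch(True) finds the int rule.  Both engines in this sketch
snapshot the rules at construction time; a real design would presumably keep
them in sync.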

However, to populate a rule set, you need to know the disjuncts() of 
a rule...  so you could never add the default rule to disjuncts() 
without a default rule already being there.

None of this is relevant for a design that doesn't care about having 
more than one supported implementation, though, which is why a 
reduced-in-scope implementation that's not trying to be a universal 
API can just ignore all of this.

(Heck, disjuncts() wouldn't even be needed in an implementation that 
wasn't trying to support arbitrary engine extensions, since its 
purpose is to list the "or"-ed conditions of a rule that can be 
fulfilled in more than one way.)


> > Well, the one thing that might still be relevant is the "overloading
> > inside classes" rule.  That's the only bit that has any effect on
> > Python 3.0 semantics vis-a-vis metaclasses, class decorators, etc.
> >
> > The way things currently stand for 3.0, I actually *won't* be able to
> > make a GF implementation that handles the "first argument should be
> > of the containing class" rule without users having an explicit
> > metaclass or class decorator that supports it.
> >
> > In 2.x, I take advantage of the ability of code run inside a class
> > suite to change the enclosing class' __metaclass__; in 3.0, you can't
> > do this anymore since the __metaclass__ doesn't come from the class
> > suite, and there isn't a replacement hook.
>
>I don't understand enough of your implementation to understand this 
>requirement.

This part would actually be relevant even for a scaled-down 
non-extensible implementation.

The requirement is this: overloads defined in a class need to 
implicitly treat the first argument of the overloading method as if 
it were explicitly declared "self: EnclosingClass".

In order to do this, the equivalent code in RuleDispatch currently 
sticks a temporary metaclass into the class locals(), so that it can 
defer the overload operation until after the class exists.  Then it 
adds in the class to the overload registration.

This could be handled by any other sort of mechanism that would allow 
code in a class body to register a callback to receive the created 
class.  A custom metaclass or class decorator would certainly do the 
trick, but then you have to do something like:

      @class_contains_overloads
      class Something:

          @some_function.overload
          def blah(self, ...):
              yadda()

It'd be nice to be able to skip the redundant class decorator, as 
it's not adding any useful information for the reader, and forgetting 
it will produce a bug.  So if method decorators were allowed to 
request class decorators to be added, that would be the simplest way 
to manage this.

However, if this has to wait for 3.1, it's no big deal.


From jimjjewett at gmail.com  Wed Jul 18 03:04:01 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 17 Jul 2007 21:04:01 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070718002446.4B2763A403A@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<loom.20070713T201857-3@post.gmane.org>
	<ca471dc20707171447p68c59c44w254ee9890eb44b8f@mail.gmail.com>
	<20070717223550.7B1B13A403A@sparrow.telecommunity.com>
	<ca471dc20707171553x69ebd106n2af86d47e2f6afc3@mail.gmail.com>
	<20070718002446.4B2763A403A@sparrow.telecommunity.com>
Message-ID: <fb6fbf560707171804n2d60958dq89a0726a53c16c84@mail.gmail.com>

On 7/17/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 02:47 PM 7/17/2007 -0700, Guido van Rossum wrote:
> >I have one remaining question for Phillip: why is your design
> >"absolutely dependent on being able to modify functions in-place"?

> It allows the framework to bootstrap via successive
> approximation.  Initially, the 'implies()' function is just a plain

Would it work to make the original 'implies()' something other than an
ordinary function?  I realize that you prefer being able to overload
anything, but it seems that you *could* mark the ones you'll need to
overload as part of bootstrapping.

> In 2.x, I take advantage of the ability of code run inside a class
> suite to change the enclosing class' __metaclass__; in 3.0,

What is missing from the __class__ attribute that you get from the
super PEP?  Was it that you wanted access to the class while
defining the class, before the method is ever called?

Why can't an ordinary class decorator work?  Is it because you want
the funky stuff to be conditional?  If so, is that really required?

Or are you just objecting to the fact that metaclasses like this won't
be the default?

-jJ

From greg.ewing at canterbury.ac.nz  Wed Jul 18 03:37:10 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 18 Jul 2007 13:37:10 +1200
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070717223550.7B1B13A403A@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<loom.20070713T201857-3@post.gmane.org>
	<ca471dc20707171447p68c59c44w254ee9890eb44b8f@mail.gmail.com>
	<20070717223550.7B1B13A403A@sparrow.telecommunity.com>
Message-ID: <469D6EC6.9010005@canterbury.ac.nz>

Phillip J. Eby wrote:
> It allows the framework to bootstrap via successive 
> approximation.  Initially, the 'implies()' function is just a plain 
> function, and then it later becomes a generic function.  (And of 
> course it gets called in between those two points.)  The same happens 
> for 'disjuncts()' and 'overrides()'.

But you know from the outset that these functions will
eventually become generic, so why can't they be defined
as some callable object that can have its insides
switched, if you're on a Python whose normal function
objects don't allow that?
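A minimal sketch of such a callable, assuming a hypothetical
SwitchableFunction wrapper; the two bodies given to implies() are
placeholders and not the real bootstrap logic.

```python
class SwitchableFunction:
    """Callable whose implementation can be replaced after creation."""

    def __init__(self, impl):
        self.impl = impl

    def __call__(self, *args, **kwargs):
        return self.impl(*args, **kwargs)

    def become(self, new_impl):
        # Every existing reference to this object sees the new behaviour.
        self.impl = new_impl


@SwitchableFunction
def implies(a, b):
    return a is b  # placeholder bootstrap version


early_reference = implies          # taken before the upgrade
before = early_reference(bool, int)

implies.become(lambda a, b: issubclass(a, b))  # placeholder "generic" version
after = early_reference(bool, int)
```

The object identity never changes, so code that imported implies early still
gets the upgraded behaviour, without touching any function's code object.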

--
Greg

From pje at telecommunity.com  Wed Jul 18 04:03:20 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 17 Jul 2007 22:03:20 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <fb6fbf560707171804n2d60958dq89a0726a53c16c84@mail.gmail.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<loom.20070713T201857-3@post.gmane.org>
	<ca471dc20707171447p68c59c44w254ee9890eb44b8f@mail.gmail.com>
	<20070717223550.7B1B13A403A@sparrow.telecommunity.com>
	<ca471dc20707171553x69ebd106n2af86d47e2f6afc3@mail.gmail.com>
	<20070718002446.4B2763A403A@sparrow.telecommunity.com>
	<fb6fbf560707171804n2d60958dq89a0726a53c16c84@mail.gmail.com>
Message-ID: <20070718020107.EA7123A403A@sparrow.telecommunity.com>

At 09:04 PM 7/17/2007 -0400, Jim Jewett wrote:
>On 7/17/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> > At 02:47 PM 7/17/2007 -0700, Guido van Rossum wrote:
> > >I have one remaining question for Phillip: why is your design
> > >"absolutely dependent on being able to modify functions in-place"?
>
> > It allows the framework to bootstrap via successive
> > approximation.  Initially, the 'implies()' function is just a plain
>
>Would it work to make the original 'implies()' something other than an
>ordinary function?  I realize that you prefer being able to overload
>anything, but it seems that you *could* mark the ones you'll need to
>overload as part of bootstrapping.

Fair enough.  The design is still dependent on modifying "functions" 
in place, for some value of "function".  It just never occurred to me 
to introduce a *third* type of "function", besides the two already 
being dealt with (i.e., standard functions and generic 
functions).  Even *thinking* about the idea right now is like 
fingernails on a chalkboard to me, so I can see why it didn't occur to me.  :)


> > In 2.x, I take advantage of the ability of code run inside a class
> > suite to change the enclosing class' __metaclass__; in 3.0,
>
>What is missing from the __class__ attribute that you get from the
>super PEP?  Was it that you wanted access to the class while
>defining the class, before the method is ever called?

Correct; you need access to it before the method is called, since 
it's to add an overload to a generic function.


>Why can't an ordinary class decorator work?

It can; it's just noise.


>   Is it because you want
>the funky stuff to be conditional?  If so, is that really required?

I don't understand what you mean by "funky stuff" or "conditional", here.


>Or are you just objecting to the fact that metaclasses like this won't
>be the default?

The idea is to make it so that using generic functions doesn't 
require a bunch of extra bookkeeping, like adding metaclasses or 
decorators.  Metaclasses are particularly problematic in that mixing 
multiple metaclasses is not an activity for novice wizards.

That's why I don't use that approach in today's Python: I can safely 
wizard around the problem using pseudo-metaclasses, such that the 
user's metaclasses aren't touched.

Post-PEP 3115, however, it won't be an option any more, and you'll at 
least need a boilerplate decorator for it to work, and it'll silently 
break without it, giving absolutely no clue as to the problem.


From pje at telecommunity.com  Wed Jul 18 04:05:25 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 17 Jul 2007 22:05:25 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <469D6EC6.9010005@canterbury.ac.nz>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<loom.20070713T201857-3@post.gmane.org>
	<ca471dc20707171447p68c59c44w254ee9890eb44b8f@mail.gmail.com>
	<20070717223550.7B1B13A403A@sparrow.telecommunity.com>
	<469D6EC6.9010005@canterbury.ac.nz>
Message-ID: <20070718020310.2168A3A403A@sparrow.telecommunity.com>

At 01:37 PM 7/18/2007 +1200, Greg Ewing wrote:
>Phillip J. Eby wrote:
> > It allows the framework to bootstrap via successive
> > approximation.  Initially, the 'implies()' function is just a plain
> > function, and then it later becomes a generic function.  (And of
> > course it gets called in between those two points.)  The same happens
> > for 'disjuncts()' and 'overrides()'.
>
>But you know from the outset that these functions will
>eventually become generic, so why can't they be defined
>as some callable object that can have its insides
>switched, if you're on a Python whose normal function
>objects don't allow that?

Well, phrased that way, it sounds like a justification for treating 
it as a porting strategy for such Pythons.  The library could just 
use a "copy_code(srcfunc, dstfunc)" function that's implemented 
differently on different Pythons.
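For illustration, one way copy_code() might look on an implementation with
writable code objects.  This is a sketch in Python 3 spelling, where the
attribute is __code__ rather than func_code, and it assumes both functions
are plain module-level functions with no closure variables.  The bodies of
implies() are placeholders.

```python
def copy_code(srcfunc, dstfunc):
    """Make dstfunc behave like srcfunc by replacing its code in place."""
    # CPython rejects the assignment if the number of free (closure)
    # variables differs; plain module-level functions have none.
    dstfunc.__code__ = srcfunc.__code__
    dstfunc.__defaults__ = srcfunc.__defaults__
    dstfunc.__kwdefaults__ = srcfunc.__kwdefaults__


def implies(a, b):
    return a is b  # placeholder bootstrap version


early_reference = implies
before = early_reference(bool, int)


def generic_implies(a, b):
    return issubclass(a, b)  # placeholder "generic" version


copy_code(generic_implies, implies)
after = early_reference(bool, int)
```

The copied code still runs against dstfunc's own globals, so this only works
cleanly when both functions live in the same module or use the same names.
On an implementation without writable code objects, copy_code() would need a
different strategy, such as a wrapper object.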


From martin at v.loewis.de  Wed Jul 18 04:29:14 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 18 Jul 2007 04:29:14 +0200
Subject: [Python-3000] Heaptypes
In-Reply-To: <ca471dc20707171404l40c28f9cr733930031123537@mail.gmail.com>
References: <f72o9f$v6i$1@sea.gmane.org>	
	<ca471dc20707110715s4cd53401t53e9075bdc2ea1df@mail.gmail.com>	
	<46979811.2050405@v.loewis.de>	
	<ca471dc20707140708n413bfe9fwc6d223f50ff44573@mail.gmail.com>	
	<469C4B0B.50605@v.loewis.de>	
	<ca471dc20707170725q91bfba7p6f549a613c0c300e@mail.gmail.com>	
	<469D29CE.5050600@v.loewis.de>
	<ca471dc20707171404l40c28f9cr733930031123537@mail.gmail.com>
Message-ID: <469D7AFA.5030505@v.loewis.de>

> But now I'm confused. I don't see the failure. Are you sure you
> checked in what you did? In the py3k-struni branch?

Oops, no. The commit was rejected because it was not
whitespace-normalized correctly, and I didn't notice.

Now I tried again.

Martin

From unknown_kev_cat at hotmail.com  Tue Jul 17 19:16:42 2007
From: unknown_kev_cat at hotmail.com (Joe Smith)
Date: Tue, 17 Jul 2007 13:16:42 -0400
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
Message-ID: <f7ithr$lrr$1@sea.gmane.org>

Building Py3k_struni under Cygwin I've noticed a few more tests failing than 
the wiki shows.
These are using SVN revision 56413.

Some spurious errors seem to occur if Python/ is not renamed temporarily. I 
have not included those. (This is an oddity of the cygwin '.exe' 
autohandling combined with case-insensitivity.)


Test_coding: Errors. Traceback included at end of message.
"test test_descr failed -- ['foo\u1234bar'] slots not caught"
"test test_largefile failed -- got b'z', but expected 'z'"
test_marshal: the failing tests fail with a "recursion limit exceeded" 
error.


Tracebacks:

test test_coding failed -- Traceback (most recent call last):
  File "/home/Owner/py3k-struni/Lib/test/test_coding.py", line 12, in test_bad_coding2
    self.verify_bad_module(module_name)
  File "/home/Owner/py3k-struni/Lib/test/test_coding.py", line 20, in verify_bad_module
    text = fp.read()
  File "/home/Owner/py3k-struni/Lib/io.py", line 1186, in read
    res += decoder.decode(self.buffer.read(), True)
  File "/home/Owner/py3k-struni/Lib/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 0: ordinal not in range(128)


Just a heads up. 


From martin at v.loewis.de  Wed Jul 18 05:36:05 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 18 Jul 2007 05:36:05 +0200
Subject: [Python-3000] Invalid \U escape in source code give
 hard-to-trace error
In-Reply-To: <ca471dc20707150717m7344c9cfh3237b78e9dcf681f@mail.gmail.com>
References: <ca471dc20707150717m7344c9cfh3237b78e9dcf681f@mail.gmail.com>
Message-ID: <469D8AA5.1080502@v.loewis.de>

> When a source file contains a string literal with an out-of-range \U
> escape (e.g. "\U12345678"), instead of a syntax error pointing to the
> offending literal, I get this, without any indication of the file or
> line:
> 
> UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in
> position 0-9: illegal Unicode character
> 
> This is quite hard to track down.

I think the fundamental flaw is that a codec is used to implement
the Python syntax (or, rather, lexical rules).

Not quite sure what the rationale for this design was; doing it on
the lexical level is (was) tricky because \u escapes were allowed
only for Unicode literals, and the lexer had no knowledge of the
prefix preceding a literal. (In 3k, it's still similar, because
\U escapes have no effect in bytes and raw literals).

Still, even if it is "only" handled at the parsing level, I
don't see why it needs to be a codec. Instead, implementing
escapes in the compiler would still allow for proper diagnostics
(notice that in the AST the original lexical form of the string
literal is gone).

> (Both the location of the bad
> literal in the source file, and the origin of the error in the parser.
> :-) Can someone come up with a fix?

The language definition makes it difficult to fix it where I would
consider the "proper" place, i.e. in the tokenization:

http://docs.python.org/ref/strings.html

says that escapeseq is "\" <any ASCII character>. So
"\x" is a valid shortstring.

Then it becomes fuzzy: it says that any unrecognized escape
sequences are left in the string. While that appears to be a clear
specification, it is not implemented (and has not been since Python
2.0). According to the spec, '\U12345678' is well-formed,
and denotes the same string as '\\U12345678'.

I now see the following choices:
1. Go back to implementing the spec. Stop complaining about
   invalid escapes for \x and \U, and just interpret the \
   as '\\'. In this case, the current design could be left in
   place, and the codecs would just stop raising these errors.
2. Change the spec to make it an error if \x is not followed
   by two hex digits, \u not by four hex digits, \U not by
   8, or the value denoted by the \U digits is out of range.
   In this case, I would propose to move the lexical analysis
   back into the parser, or just make an internal API that
   will raise a proper SyntaxError (it will be tricky to
   compute the column in the original source line, though).
3. Change the spec to constrain escapeseq, giving up
   the rule that uninterpreted escapes silently become
   two characters. That's difficult to write down in EBNF,
   so should be formulated through constraints in natural
   language. The lexer would have to keep track of what kind
   of literal it is processing, and reject invalid escapes
   directly on source level.
There are probably other options as well.
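To make the checks in option 2 concrete, here is a small sketch of the
validation it describes, done on the text of a literal body.  The function
name find_bad_escape is hypothetical, and this is Python-level illustration
only, not the tokenizer or codec code.

```python
_HEX_WIDTH = {"x": 2, "u": 4, "U": 8}
_HEX_DIGITS = "0123456789abcdefABCDEF"


def find_bad_escape(body):
    """Return the index of the first malformed \\x, \\u or \\U escape, or None."""
    i = 0
    while i < len(body):
        if body[i] != "\\":
            i += 1
            continue
        kind = body[i + 1:i + 2]
        width = _HEX_WIDTH.get(kind)
        if width is None:
            i += 2  # some other escape: skip the backslash and next char
            continue
        digits = body[i + 2:i + 2 + width]
        if len(digits) != width or any(c not in _HEX_DIGITS for c in digits):
            return i
        if kind == "U" and int(digits, 16) > 0x10FFFF:
            return i  # value is outside the Unicode range
        i += 2 + width
    return None
```

Because the function returns an index into the literal body, a caller that
knows where the literal starts could turn it into a column for a SyntaxError,
which is the tricky part mentioned under option 2.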

Regards,
Martin

From martin at v.loewis.de  Wed Jul 18 05:37:56 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 18 Jul 2007 05:37:56 +0200
Subject: [Python-3000] exclusion feature for 2to3?
In-Reply-To: <ca471dc20707160716hcf583c4uca822775f19b9987@mail.gmail.com>
References: <f7e24l$92e$1@sea.gmane.org>	<ca471dc20707151922i1894355fh5118d07aa68abb65@mail.gmail.com>	<f7fkf5$a0n$1@sea.gmane.org>
	<ca471dc20707160716hcf583c4uca822775f19b9987@mail.gmail.com>
Message-ID: <469D8B14.4050907@v.loewis.de>

> BTW I hope to see more core developers from Europe at EuroPython next year!

It's always difficult to get there for me, as it takes place during the
semester :-(

Regards,
Martin


From kbk at shore.net  Wed Jul 18 08:04:13 2007
From: kbk at shore.net (Kurt B. Kaiser)
Date: Wed, 18 Jul 2007 02:04:13 -0400
Subject: [Python-3000] Invalid \U escape in source code give
	hard-to-trace error
In-Reply-To: <ca471dc20707150717m7344c9cfh3237b78e9dcf681f@mail.gmail.com>
	(Guido van Rossum's message of "Sun, 15 Jul 2007 07:17:00 -0700")
References: <ca471dc20707150717m7344c9cfh3237b78e9dcf681f@mail.gmail.com>
Message-ID: <87odia5jtu.fsf@hydra.bayview.thirdcreek.com>

"Guido van Rossum" <guido at python.org> writes:

> When a source file contains a string literal with an out-of-range \U
> escape (e.g. "\U12345678"), instead of a syntax error pointing to the
> offending literal, I get this, without any indication of the file or
> line:
>
> UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in
> position 0-9: illegal Unicode character
>
> This is quite hard to track down. (Both the location of the bad
> literal in the source file, and the origin of the error in the parser.
> :-) Can someone come up with a fix?
>
> I note that raw escapes show a slightly different error. I also note
> that the same issue exists for u"..." literals in Python 2.5.

For what it's worth, I posted a patch to ast.c against the 2.6 trunk
which massages the unicode exception into a SyntaxError showing the
location.

That approach lets unicodeobject.c handle the gory details while ast.c
handles the SyntaxError generation.  It might be a solution until
something deeper along the lines of Martin's thoughts is possibly
developed.
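The patch itself is C code in ast.c; the following is only a Python-level
analogue of the same idea, with a hypothetical helper name and arguments.
The codec still does the decoding, and the caller converts its error into a
SyntaxError that carries the file and line.

```python
import codecs


def decode_literal_body(body, filename, lineno, line):
    """Decode the escapes in a literal body (bytes), reporting its location."""
    try:
        return codecs.decode(body, "unicode_escape")
    except UnicodeDecodeError as exc:
        # The column is left as None: mapping the codec's byte position
        # back to the original source line is not attempted here.
        raise SyntaxError(
            "(unicode error) %s" % exc, (filename, lineno, None, line)
        ) from None
```

With this wrapper, a body such as b"\\U12345678" raises a SyntaxError whose
filename and lineno attributes point at the offending source line, instead
of a bare UnicodeDecodeError.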

I don't think that any reference adjustments are needed, but someone
should check the patch.

www.python.org/sf/1755885

-- 
KBK

From amauryfa at gmail.com  Wed Jul 18 10:20:36 2007
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Wed, 18 Jul 2007 10:20:36 +0200
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
In-Reply-To: <f7ithr$lrr$1@sea.gmane.org>
References: <f7ithr$lrr$1@sea.gmane.org>
Message-ID: <e27efe130707180120l4bd674bcg8cda3a0c2ec8b5bf@mail.gmail.com>

Hello,

2007/7/17, Joe Smith wrote:
> Building Py3k_struni under Cygwin I've noticed a few more tests failing than
> the wiki shows.
> These are using SVN revision 56413.
>
> Some spurious errors seem to occur if Python/ is not renamed temporarily. I
> have not included those. (This is an oddity of the cygwin '.exe'
> autohandling combined with case-insensitivity)

For this, I have added a line to runtests.sh:

# Choose the Python binary.
case `uname` in
Darwin) PYTHON=./python.exe;;
CYGWIN*) PYTHON=./python.exe;;
*)      PYTHON=./python;;
esac

Hope this helps,

-- 
Amaury Forgeot d'Arc

From guido at python.org  Wed Jul 18 18:47:13 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 18 Jul 2007 09:47:13 -0700
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070718020310.2168A3A403A@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<loom.20070713T201857-3@post.gmane.org>
	<ca471dc20707171447p68c59c44w254ee9890eb44b8f@mail.gmail.com>
	<20070717223550.7B1B13A403A@sparrow.telecommunity.com>
	<469D6EC6.9010005@canterbury.ac.nz>
	<20070718020310.2168A3A403A@sparrow.telecommunity.com>
Message-ID: <ca471dc20707180947p41fdcd8k9be97b50658b7385@mail.gmail.com>

On 7/17/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 01:37 PM 7/18/2007 +1200, Greg Ewing wrote:
> >Phillip J. Eby wrote:
> > > It allows the framework to bootstrap via successive
> > > approximation.  Initially, the 'implies()' function is just a plain
> > > function, and then it later becomes a generic function.  (And of
> > > course it gets called in between those two points.)  The same happens
> > > for 'disjuncts()' and 'overrides()'.
> >
> >But you know from the outset that these functions will
> >eventually become generic, so why can't they be defined
> >as some callable object that can have its insides
> >switched, if you're on a Python whose normal function
> >objects don't allow that?
>
> Well, phrased that way, it sounds like a justification for treating
> it as a porting strategy for such Pythons.  The library could just
> use a "copy_code(srcfunc, dstfunc)" function that's implemented
> differently on different Pythons.

Sorry, but I'm still totally uncomfortable with this. While I admit
the feature exists, I really, really, really don't want it to be used
on a regular basis. As long as Phillip calls a counterproposal
"fingernails on a chalkboard", I call this unpythonic.
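
For readers following the thread, here is a minimal sketch of what a helper
like the "copy_code(srcfunc, dstfunc)" mentioned in the quoted text might do
on a Python whose function objects allow their code to be replaced. Only the
name copy_code comes from the thread; its body and the two example functions
are assumptions made for illustration, not PEP 3124's actual implementation.

```python
def copy_code(srcfunc, dstfunc):
    """Make dstfunc run srcfunc's code (hypothetical helper, body assumed)."""
    # Assigning __code__ only works when both functions have the same
    # number of free variables; both examples below have none.
    dstfunc.__code__ = srcfunc.__code__


def implies(a, b):
    # Bootstrap version: a plain function used while the framework starts up.
    return "plain"


def _generic_implies(a, b):
    # Stand-in for the later, generic version (illustrative only).
    return "generic"


early_reference = implies            # a reference taken before the switch
before = early_reference(1, 2)
copy_code(_generic_implies, implies)
after = early_reference(1, 2)
print(before, after)
```

The point of the sketch is that a reference taken before the switch still
sees the new behaviour afterwards, which is what the "successive
approximation" bootstrap described above appears to rely on.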

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Wed Jul 18 18:59:46 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 18 Jul 2007 09:59:46 -0700
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
In-Reply-To: <e27efe130707180120l4bd674bcg8cda3a0c2ec8b5bf@mail.gmail.com>
References: <f7ithr$lrr$1@sea.gmane.org>
	<e27efe130707180120l4bd674bcg8cda3a0c2ec8b5bf@mail.gmail.com>
Message-ID: <ca471dc20707180959n6a8f971dqb92982fe2fdaade5@mail.gmail.com>

On 7/18/07, Amaury Forgeot d'Arc <amauryfa at gmail.com> wrote:
> Hello,
>
> 2007/7/17, Joe Smith wrote:
> > Building Py3k_struni under Cygwin I've noticed a few more tests failing than
> > the wiki shows.
> > These are using SVN revision 56413.
> >
> > Some spurious errors seem to occur if Python/ is not remaned temporally. I
> > have not included those. (This is an oddity of the cygwin '.exe'
> > autohandling combined with case-insensitivity)
>
> For this, I have added a line to runtests.sh:
>
> # Choose the Python binary.
> case `uname` in
> Darwin) PYTHON=./python.exe;;
> CYGWIN*) PYTHON=./python.exe;;
> *)      PYTHON=./python;;
> esac

This is now committed to Subversion: (r56440).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Wed Jul 18 19:02:07 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 18 Jul 2007 10:02:07 -0700
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
In-Reply-To: <f7ithr$lrr$1@sea.gmane.org>
References: <f7ithr$lrr$1@sea.gmane.org>
Message-ID: <ca471dc20707181002w64e076aco9a509ec7e4e15b9a@mail.gmail.com>

On 7/17/07, Joe Smith <unknown_kev_cat at hotmail.com> wrote:
> Building Py3k_struni under Cygwin I've noticed a few more tests failing than
> the wiki shows.
> These are using SVN revision 56413.
>
> Some spurious errors seem to occur if Python/ is not remaned temporally. I
> have not included those. (This is an oddity of the cygwin '.exe'
> autohandling combined with case-insensitivity)
>
>
> Test_coding: Errors. Traceback included at end of message.
> "test test_descr failed -- ['foo\u1234bar'] slots not caught"
> "test test_largefile failed -- got b'z', but expected 'z'"
> test_marshal: Tests that fail are fasiling with a recursion limit exceeded
> error.
>
>
>
> Tracebacks:
>
> test test_coding failed -- Traceback (most recent call last):
>   File "/home/Owner/py3k-struni/Lib/test/test_coding.py", line 12, in
> test_bad_c
> oding2
>     self.verify_bad_module(module_name)
>   File "/home/Owner/py3k-struni/Lib/test/test_coding.py", line 20, in
> verify_bad
> _module
>     text = fp.read()
>   File "/home/Owner/py3k-struni/Lib/io.py", line 1186, in read
>     res += decoder.decode(self.buffer.read(), True)
>   File "/home/Owner/py3k-struni/Lib/encodings/ascii.py", line 26, in decode
>     return codecs.ascii_decode(input, self.errors)[0]
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 0:
> ordinal
> not in range(128)

The test_descr and test_largefile failures are reproducible on Ubuntu
and someone will eventually fix them.

I can't reproduce the test_marshal and test_coding failures; please
investigate more on CYGWIN.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Wed Jul 18 19:27:01 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 18 Jul 2007 10:27:01 -0700
Subject: [Python-3000] Invalid \U escape in source code give
	hard-to-trace error
In-Reply-To: <87k5sy5j6l.fsf@hydra.bayview.thirdcreek.com>
References: <ca471dc20707150717m7344c9cfh3237b78e9dcf681f@mail.gmail.com>
	<87k5sy5j6l.fsf@hydra.bayview.thirdcreek.com>
Message-ID: <ca471dc20707181027u182550dyaaf362fc718dd883@mail.gmail.com>

On 7/17/07, Kurt B. Kaiser <kbk at shore.net> wrote:
> For what it's worth, I posted a patch to ast.c against the 2.6 trunk
> which massages the unicode exception into a SyntaxError showing the
> location.
>
> That approach lets unicodeobject.c handle the gory details while ast.c
> handles the SyntaxError generation.  It might be a solution until
> something deeper along the lines of Martin's thoughts is possibly
> developed.
>
> I don't think that any reference adjustments are needed, but someone
> should check the patch.
>
> www.python.org/sf/1755885

Thanks! Checked in, and merged into p3yk.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Wed Jul 18 19:27:49 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 18 Jul 2007 13:27:49 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <ca471dc20707180947p41fdcd8k9be97b50658b7385@mail.gmail.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<loom.20070713T201857-3@post.gmane.org>
	<ca471dc20707171447p68c59c44w254ee9890eb44b8f@mail.gmail.com>
	<20070717223550.7B1B13A403A@sparrow.telecommunity.com>
	<469D6EC6.9010005@canterbury.ac.nz>
	<20070718020310.2168A3A403A@sparrow.telecommunity.com>
	<ca471dc20707180947p41fdcd8k9be97b50658b7385@mail.gmail.com>
Message-ID: <20070718172907.861383A40A4@sparrow.telecommunity.com>

At 09:47 AM 7/18/2007 -0700, Guido van Rossum wrote:
>Sorry, but I'm still totally uncomfortable with this. While I admit
>the feature exists, I really, really, really don't want it to be used
>on a regular basis. As long as Phillip calls a counterproposal
>"fingernails on a chalkboard", I call this unpythonic.

I didn't say I wouldn't *do* it, I just explained why I'd have never 
come up with the idea on my own.  I don't have to like something in 
order to do it, though of course it helps.  :)


From guido at python.org  Wed Jul 18 19:31:53 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 18 Jul 2007 10:31:53 -0700
Subject: [Python-3000] Invalid \U escape in source code give
	hard-to-trace error
In-Reply-To: <469D8AA5.1080502@v.loewis.de>
References: <ca471dc20707150717m7344c9cfh3237b78e9dcf681f@mail.gmail.com>
	<469D8AA5.1080502@v.loewis.de>
Message-ID: <ca471dc20707181031sa2339a4u4900de65a549c4e2@mail.gmail.com>

On 7/17/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> > When a source file contains a string literal with an out-of-range \U
> > escape (e.g. "\U12345678"), instead of a syntax error pointing to the
> > offending literal, I get this, without any indication of the file or
> > line:
> >
> > UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in
> > position 0-9: illegal Unicode character
> >
> > This is quite hard to track down.
>
> I think the fundamental flaw is that a codec is used to implement
> the Python syntax (or, rather, lexical rules).
>
> Not quite sure what the rationale for this design was; doing it on
> the lexical level is (was) tricky because \u escapes were allowed
> only for Unicode literals, and the lexer had no knowledge of the
> prefix preceding a literal. (In 3k, it's still similar, because
> \U escapes have no effect in bytes and raw literals).
>
> Still, even if it is "only" handled at the parsing level, I
> don't see why it needs to be a codec. Instead, implementing
> escapes in the compiler would still allow for proper diagnostics
> (notice that in the AST the original lexical form of the string
> literal is gone).

I guess because it was deemed useful to have a codec for this purpose
too, thereby exposing the algorithm to Python code that needs the same
functionality (e.g. the compiler package, RIP).

> > (Both the location of the bad
> > literal in the source file, and the origin of the error in the parser.
> > :-) Can someone come up with a fix?
>
> The language definition makes it difficult to fix it where I would
> consider the "proper" place, i.e. in the tokenization:
>
> http://docs.python.org/ref/strings.html
>
> says that escapeseq is "\" <any ASCII character>. So
> "\x" is a valid shortstring.
>
> Then it becomes fuzzy: It says that any unrecognized escape
> sequences are left in the string. While that appears like a clear
> specification, it is not implemented (and has not since Python
> 2.0 anymore). According to the spec, '\U12345678' is well-formed,
> and denotes the same string as '\\U12345678'.
>
> I now see the following choices:
> 1. Restore implementing the spec again. Stop complaining about
>    invalid escapes for \x and \U, and just interpret the \
>    as '\\'. In this case, the current design could be left in
>    place, and the codecs would just stop raising these errors.

Sounds like a bad idea. I think \xNN (where N is not a hex digit) once
behaved this way, and it was changed to explicitly complain instead as
a service to users.

> 2. Change the spec to make it an error if \x is not followed
>    by two hex digits, \u not by four hex digits, \U not by
>    8, or the value denoted by the \U digits is out of range.
>    In this case, I would propose to move the lexical analysis
>    back into the parser, or just make an internal API that
>    will raise a proper SyntaxError (it will be tricky to
>    compute the column in the original source line, though).

I'm all in favor of this spec change. Eventually we should change the
lexer to do this right; for now, Kurt's patch is good enough.

> 3. Change the spec to constrain escapeseq, giving up
>    the rule that uninterpreted escapes silently become
>    two characters. That's difficult to write down in EBNF,
>    so should be formulated through constraints in natural
>    language. The lexer would have to keep track of what kind
>    of literal it is processing, and reject invalid escapes
>    directly on source level.

-1

> There are probably other options as well.
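
The difference between the two error paths discussed above can be sketched
as follows. This is a small illustration that assumes a current Python 3
interpreter, where the compiler reports the bad escape as a SyntaxError; the
file name "<example>" and the two-line source string are made up for the
example.

```python
import codecs

# Calling the codec directly: the error carries no source file or line.
try:
    codecs.decode(b"\\U12345678", "unicode_escape")
    codec_error = None
except UnicodeDecodeError as exc:
    codec_error = type(exc).__name__

# Compiling source that contains the same out-of-range escape.
source = 'x = 1\ny = "\\U12345678"\n'
try:
    compile(source, "<example>", "exec")
    compile_error = None
except SyntaxError as exc:
    # The SyntaxError records which file the bad literal came from.
    compile_error = (type(exc).__name__, exc.filename)

print(codec_error)
print(compile_error)
```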

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From unknown_kev_cat at hotmail.com  Wed Jul 18 19:56:07 2007
From: unknown_kev_cat at hotmail.com (Joe Smith)
Date: Wed, 18 Jul 2007 13:56:07 -0400
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
References: <f7ithr$lrr$1@sea.gmane.org>
	<ca471dc20707181002w64e076aco9a509ec7e4e15b9a@mail.gmail.com>
Message-ID: <f7lk7q$9m6$1@sea.gmane.org>


"Guido van Rossum" <guido at python.org> wrote in message 
news:ca471dc20707181002w64e076aco9a509ec7e4e15b9a at mail.gmail.com...
> On 7/17/07, Joe Smith <unknown_kev_cat at hotmail.com> wrote:
>> Building Py3k_struni under Cygwin I've noticed a few more tests failing 
>> than
>> the wiki shows.
>> These are using SVN revision 56413.
>>
>> Some spurious errors seem to occur if Python/ is not remaned temporally. 
>> I
>> have not included those. (This is an oddity of the cygwin '.exe'
>> autohandling combined with case-insensitivity)
>>
>>
>> Test_coding: Errors. Traceback included at end of message.
>> "test test_descr failed -- ['foo\u1234bar'] slots not caught"
>> "test test_largefile failed -- got b'z', but expected 'z'"
>> test_marshal: Tests that fail are fasiling with a recursion limit 
>> exceeded
>> error.
>>
>>
>>
>> Tracebacks:
>>
>> test test_coding failed -- Traceback (most recent call last):
>>   File "/home/Owner/py3k-struni/Lib/test/test_coding.py", line 12, in
>> test_bad_c
>> oding2
>>     self.verify_bad_module(module_name)
>>   File "/home/Owner/py3k-struni/Lib/test/test_coding.py", line 20, in
>> verify_bad
>> _module
>>     text = fp.read()
>>   File "/home/Owner/py3k-struni/Lib/io.py", line 1186, in read
>>     res += decoder.decode(self.buffer.read(), True)
>>   File "/home/Owner/py3k-struni/Lib/encodings/ascii.py", line 26, in 
>> decode
>>     return codecs.ascii_decode(input, self.errors)[0]
>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 0:
>> ordinal
>> not in range(128)
>
> The test_descr and test_largefile failures are reproducible on Ubuntu
> and someone will eventually fix them.
>
> I can't reproduce the test_marshal and test_coding failures; please
> investigate more on CYGWIN.

For test_coding, apparently the module's contents are meant to be loaded,
and the test then verifies that a syntax error occurs when parsing the
module. However, on Cygwin I'm consistently getting an error on the line
that reads the file, specifically fp.read().

fp.read() appears to be trying to produce a unicode string by interpreting
the byte string as ASCII. The byte string is most certainly not valid ASCII,
so the codec throws an error. I'm guessing that Python normally chooses a
different codec for some reason, but on my Cygwin builds it is choosing
ASCII. I'm not sure why, nor am I sure how to investigate further.


Here's a fairly useless-looking traceback for test_marshal. Many of the tests
fail with nearly identical tracebacks:

#======================================================================
#ERROR: test_tuple (test.test_marshal.ContainerTestCase)
#----------------------------------------------------------------------
#Traceback (most recent call last):
#  File "/home/Owner/py3k-struni/Lib/test/test_marshal.py", line 134, in test_tuple
#    self.helper(tuple(self.d.keys()))
#  File "/home/Owner/py3k-struni/Lib/test/test_marshal.py", line 21, in helper
#    new = marshal.load(f)
#ValueError: recursion limit exceeded

For what it's worth, here is the full subtest list and status for
test_marshal:

#test_bool (test.test_marshal.IntTestCase) ... ERROR
#test_int64 (test.test_marshal.IntTestCase) ... ok
#test_ints (test.test_marshal.IntTestCase) ... ERROR
#test_floats (test.test_marshal.FloatTestCase) ... ERROR
#test_buffer (test.test_marshal.StringTestCase) ... ERROR
#test_string (test.test_marshal.StringTestCase) ... ERROR
#test_unicode (test.test_marshal.StringTestCase) ... ERROR
#test_code (test.test_marshal.CodeTestCase) ... ok
#test_dict (test.test_marshal.ContainerTestCase) ... ERROR
#test_list (test.test_marshal.ContainerTestCase) ... ERROR
#test_sets (test.test_marshal.ContainerTestCase) ... ERROR
#test_tuple (test.test_marshal.ContainerTestCase) ... ERROR
#test_exceptions (test.test_marshal.ExceptionTestCase) ... ok
#test_bug_5888452 (test.test_marshal.BugsTestCase) ... ok
#test_fuzz (test.test_marshal.BugsTestCase) ... ok
#test_loads_recursion (test.test_marshal.BugsTestCase) ... ok
#test_patch_873224 (test.test_marshal.BugsTestCase) ... ok
#test_recursion_limit (test.test_marshal.BugsTestCase) ... ok
#test_version_argument (test.test_marshal.BugsTestCase) ... ok

I'm wondering if the recursion limit on my build is getting set too low
somehow.


From guido at python.org  Wed Jul 18 20:13:24 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 18 Jul 2007 11:13:24 -0700
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
In-Reply-To: <f7lk7q$9m6$1@sea.gmane.org>
References: <f7ithr$lrr$1@sea.gmane.org>
	<ca471dc20707181002w64e076aco9a509ec7e4e15b9a@mail.gmail.com>
	<f7lk7q$9m6$1@sea.gmane.org>
Message-ID: <ca471dc20707181113m360db736h2fd079f29f71220@mail.gmail.com>

On 7/18/07, Joe Smith <unknown_kev_cat at hotmail.com> wrote:
>
> "Guido van Rossum" <guido at python.org> wrote in message
> news:ca471dc20707181002w64e076aco9a509ec7e4e15b9a at mail.gmail.com...
> > On 7/17/07, Joe Smith <unknown_kev_cat at hotmail.com> wrote:
> >> Building Py3k_struni under Cygwin I've noticed a few more tests failing
> >> than
> >> the wiki shows.
> >> These are using SVN revision 56413.
> >>
> >> Some spurious errors seem to occur if Python/ is not remaned temporally.
> >> I
> >> have not included those. (This is an oddity of the cygwin '.exe'
> >> autohandling combined with case-insensitivity)
> >>
> >>
> >> Test_coding: Errors. Traceback included at end of message.
> >> "test test_descr failed -- ['foo\u1234bar'] slots not caught"
> >> "test test_largefile failed -- got b'z', but expected 'z'"
> >> test_marshal: Tests that fail are fasiling with a recursion limit
> >> exceeded
> >> error.
> >>
> >>
> >>
> >> Tracebacks:
> >>
> >> test test_coding failed -- Traceback (most recent call last):
> >>   File "/home/Owner/py3k-struni/Lib/test/test_coding.py", line 12, in
> >> test_bad_c
> >> oding2
> >>     self.verify_bad_module(module_name)
> >>   File "/home/Owner/py3k-struni/Lib/test/test_coding.py", line 20, in
> >> verify_bad
> >> _module
> >>     text = fp.read()
> >>   File "/home/Owner/py3k-struni/Lib/io.py", line 1186, in read
> >>     res += decoder.decode(self.buffer.read(), True)
> >>   File "/home/Owner/py3k-struni/Lib/encodings/ascii.py", line 26, in
> >> decode
> >>     return codecs.ascii_decode(input, self.errors)[0]
> >> UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 0:
> >> ordinal
> >> not in range(128)
> >
> > The test_descr and test_largefile failures are reproducible on Ubuntu
> > and someone will eventually fix them.
> >
> > I can't reproduce the test_marshal and test_coding failures; please
> > investigate more on CYGWIN.
>
> For the test coding, apprently the module's contents are intended to be
> loaded, and then verified that a syntax error occurs when trying to parse
> the module. However, on cygwin i'm consistantly getting an error on the line
> that reads the file. Specificly fp.read().
>
> Fp.read() appears to be trying to export a unicode string by interpreting
> the byte string as ascii. The byte string is most certainly not valid ascii.
> So the codec throws an error. I'm guessing for some reason python normally
> chose a different codec, but on my cygwin compiles it is choosing ascii. I'm
> not sure why. Nor am I sure how to inestigate further.

The encoding defaults to the filesystem encoding or otherwise Latin-1.
There's an XXX comment in io.py, in TextIOWrapper.__init__, admitting
this is questionable. I'm guessing CYGWIN has a filesystem encoding
equal to ASCII? Is this a good idea?

Maybe the default encoding should always be UTF-8 (matching the source
code default encoding).

I can also fix it by changing test_coding.py to add encoding="utf-8"
to the open() call in verify_bad_module().
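
A minimal sketch of why passing an explicit encoding to open() sidesteps the
platform default. It assumes a file whose first byte is 0xef, as in the
traceback above; the file contents here (a UTF-8 byte order mark followed by
a comment) are invented for the example and are not the real test module.

```python
import os
import tempfile

# Hypothetical file contents: the first byte is 0xef, as in the traceback.
data = b"\xef\xbb\xbf# example module\n"

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    path = tmp.name

try:
    # Decoding as ASCII fails on the first byte, as reported on Cygwin.
    try:
        with open(path, encoding="ascii") as fp:
            fp.read()
        ascii_failed = False
    except UnicodeDecodeError:
        ascii_failed = True

    # An explicit encoding makes the read independent of the platform default.
    with open(path, encoding="utf-8") as fp:
        text = fp.read()
finally:
    os.remove(path)

print(ascii_failed, repr(text))
```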

> Heres a fairly useless loking traceback for test_marshal. Many of the tests
> fail with nearly identical tracebacks:
>
> #======================================================================
> #ERROR: test_tuple (test.test_marshal.ContainerTestCase)
> #----------------------------------------------------------------------
> #Traceback (most recent call last):
> #  File "/home/Owner/py3k-struni/Lib/test/test_marshal.py", line 134, in
> test_tuple
> #   self.helper(tuple(self.d.keys()))
> # File "/home/Owner/py3k-struni/Lib/test/test_marshal.py", line 21, in
> helper
> #   new = marshal.load(f)
> #ValueError: recursion limit exceeded
>
> For what it's worth here is the fll subtest list and status for
> test_marshal:
>
> #test_bool (test.test_marshal.IntTestCase) ... ERROR
> #test_int64 (test.test_marshal.IntTestCase) ... ok
> #test_ints (test.test_marshal.IntTestCase) ... ERROR
> #test_floats (test.test_marshal.FloatTestCase) ... ERROR
> #test_buffer (test.test_marshal.StringTestCase) ... ERROR
> #test_string (test.test_marshal.StringTestCase) ... ERROR
> #test_unicode (test.test_marshal.StringTestCase) ... ERROR
> #test_code (test.test_marshal.CodeTestCase) ... ok
> #test_dict (test.test_marshal.ContainerTestCase) ... ERROR
> #test_list (test.test_marshal.ContainerTestCase) ... ERROR
> #test_sets (test.test_marshal.ContainerTestCase) ... ERROR
> #test_tuple (test.test_marshal.ContainerTestCase) ... ERROR
> #test_exceptions (test.test_marshal.ExceptionTestCase) ... ok
> #test_bug_5888452 (test.test_marshal.BugsTestCase) ... ok
> #test_fuzz (test.test_marshal.BugsTestCase) ... ok
> #test_loads_recursion (test.test_marshal.BugsTestCase) ... ok
> #test_patch_873224 (test.test_marshal.BugsTestCase) ... ok
> #test_recursion_limit (test.test_marshal.BugsTestCase) ... ok
> #test_version_argument (test.test_marshal.BugsTestCase) ... ok
>
> I'm wondering if the recusion limit on my build is getting set too low
> somehow.

Can you find out what it is? sys.getrecursionlimit().

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From alexandre at peadrop.com  Wed Jul 18 20:27:40 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Wed, 18 Jul 2007 14:27:40 -0400
Subject: [Python-3000] Introspection broken for objects using
	Py_FindMethod()
In-Reply-To: <ca471dc20707171652w254d597bl9068abae61b64da4@mail.gmail.com>
References: <acd65fa20707171627t29f9fc03p37164f3d87d94a25@mail.gmail.com>
	<ca471dc20707171652w254d597bl9068abae61b64da4@mail.gmail.com>
Message-ID: <acd65fa20707181127w3507c064rc02e6241c24d86f2@mail.gmail.com>

On 7/17/07, Guido van Rossum <guido at python.org> wrote:
> Yes, see a thread between me, Georg and Brett around March 7-10:
>
> http://mail.python.org/pipermail/python-3000/2007-March/006061.html
>

Thanks for the pointer.

> I think the conclusion was to get rid of Py_FindMethod altogether. The
> replacement isn't very hard. But it hasn't been done yet.

Do you need some help with that? Perhaps I could try to write a
patch to replace the trivial uses of Py_FindMethod in the stdlib.
I also think it would be a good idea to document the change.

-- Alexandre

From unknown_kev_cat at hotmail.com  Wed Jul 18 20:50:14 2007
From: unknown_kev_cat at hotmail.com (Joe Smith)
Date: Wed, 18 Jul 2007 14:50:14 -0400
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
References: <f7ithr$lrr$1@sea.gmane.org><ca471dc20707181002w64e076aco9a509ec7e4e15b9a@mail.gmail.com><f7lk7q$9m6$1@sea.gmane.org>
	<ca471dc20707181113m360db736h2fd079f29f71220@mail.gmail.com>
Message-ID: <f7lnd8$l2s$1@sea.gmane.org>


"Guido van Rossum" <guido at python.org> wrote in message 
news:ca471dc20707181113m360db736h2fd079f29f71220 at mail.gmail.com...
> On 7/18/07, Joe Smith <unknown_kev_cat at hotmail.com> wrote:
>>
>> "Guido van Rossum" <guido at python.org> wrote in message
>> news:ca471dc20707181002w64e076aco9a509ec7e4e15b9a at mail.gmail.com...
>> > On 7/17/07, Joe Smith <unknown_kev_cat at hotmail.com> wrote:
>> >> Building Py3k_struni under Cygwin I've noticed a few more tests 
>> >> failing
>> >> than
>> >> the wiki shows.
>> >> These are using SVN revision 56413.
>> >>
>> >> Some spurious errors seem to occur if Python/ is not remaned 
>> >> temporally.
>> >> I
>> >> have not included those. (This is an oddity of the cygwin '.exe'
>> >> autohandling combined with case-insensitivity)
>> >>
>> >>
>> >> Test_coding: Errors. Traceback included at end of message.
>> >> "test test_descr failed -- ['foo\u1234bar'] slots not caught"
>> >> "test test_largefile failed -- got b'z', but expected 'z'"
>> >> test_marshal: Tests that fail are fasiling with a recursion limit
>> >> exceeded
>> >> error.
>> >>
>> >>
>> >>
>> >> Tracebacks:
>> >>
>> >> test test_coding failed -- Traceback (most recent call last):
>> >>   File "/home/Owner/py3k-struni/Lib/test/test_coding.py", line 12, in
>> >> test_bad_c
>> >> oding2
>> >>     self.verify_bad_module(module_name)
>> >>   File "/home/Owner/py3k-struni/Lib/test/test_coding.py", line 20, in
>> >> verify_bad
>> >> _module
>> >>     text = fp.read()
>> >>   File "/home/Owner/py3k-struni/Lib/io.py", line 1186, in read
>> >>     res += decoder.decode(self.buffer.read(), True)
>> >>   File "/home/Owner/py3k-struni/Lib/encodings/ascii.py", line 26, in
>> >> decode
>> >>     return codecs.ascii_decode(input, self.errors)[0]
>> >> UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 
>> >> 0:
>> >> ordinal
>> >> not in range(128)
>> >
>> > The test_descr and test_largefile failures are reproducible on Ubuntu
>> > and someone will eventually fix them.
>> >
>> > I can't reproduce the test_marshal and test_coding failures; please
>> > investigate more on CYGWIN.
>>
>> For the test coding, apprently the module's contents are intended to be
>> loaded, and then verified that a syntax error occurs when trying to parse
>> the module. However, on cygwin i'm consistantly getting an error on the 
>> line
>> that reads the file. Specificly fp.read().
>>
>> Fp.read() appears to be trying to export a unicode string by interpreting
>> the byte string as ascii. The byte string is most certainly not valid 
>> ascii.
>> So the codec throws an error. I'm guessing for some reason python 
>> normally
>> chose a different codec, but on my cygwin compiles it is choosing ascii. 
>> I'm
>> not sure why. Nor am I sure how to inestigate further.
>
> The encoding defaults to the filesystem encoding or otherwise Latin-1.
> There's an XXX comment in io.py, in TextIOWrapper.__init__, admitting
> this is questionable. I'm guessing CYGWIN has a filesystem encoding
> equal to ASCII? Is this a good idea?

Quite possibly. I know they have wanted to move to using the Unicode APIs
to support everything, but that is a pain because of the method that
Windows uses internally to support Unicode.

> Maybe the default encoding should always be UTF-8 (matching the source
> code default encoding).
>
> I can also fix it by changing test_coding.py to add encoding="utf-8"
> to the open() call in verify_bad_module().
>
>> Heres a fairly useless loking traceback for test_marshal. Many of the 
>> tests
>> fail with nearly identical tracebacks:
>>
>> #======================================================================
>> #ERROR: test_tuple (test.test_marshal.ContainerTestCase)
>> #----------------------------------------------------------------------
>> #Traceback (most recent call last):
>> #  File "/home/Owner/py3k-struni/Lib/test/test_marshal.py", line 134, in
>> test_tuple
>> #   self.helper(tuple(self.d.keys()))
>> # File "/home/Owner/py3k-struni/Lib/test/test_marshal.py", line 21, in
>> helper
>> #   new = marshal.load(f)
>> #ValueError: recursion limit exceeded
>>
>> For what it's worth here is the fll subtest list and status for
>> test_marshal:
>>
>> #test_bool (test.test_marshal.IntTestCase) ... ERROR
>> #test_int64 (test.test_marshal.IntTestCase) ... ok
>> #test_ints (test.test_marshal.IntTestCase) ... ERROR
>> #test_floats (test.test_marshal.FloatTestCase) ... ERROR
>> #test_buffer (test.test_marshal.StringTestCase) ... ERROR
>> #test_string (test.test_marshal.StringTestCase) ... ERROR
>> #test_unicode (test.test_marshal.StringTestCase) ... ERROR
>> #test_code (test.test_marshal.CodeTestCase) ... ok
>> #test_dict (test.test_marshal.ContainerTestCase) ... ERROR
>> #test_list (test.test_marshal.ContainerTestCase) ... ERROR
>> #test_sets (test.test_marshal.ContainerTestCase) ... ERROR
>> #test_tuple (test.test_marshal.ContainerTestCase) ... ERROR
>> #test_exceptions (test.test_marshal.ExceptionTestCase) ... ok
>> #test_bug_5888452 (test.test_marshal.BugsTestCase) ... ok
>> #test_fuzz (test.test_marshal.BugsTestCase) ... ok
>> #test_loads_recursion (test.test_marshal.BugsTestCase) ... ok
>> #test_patch_873224 (test.test_marshal.BugsTestCase) ... ok
>> #test_recursion_limit (test.test_marshal.BugsTestCase) ... ok
>> #test_version_argument (test.test_marshal.BugsTestCase) ... ok
>>
>> I'm wondering if the recusion limit on my build is getting set too low
>> somehow.
>
> Can you find out what it is? sys.getrecursionlimit().

Hmm...  It is a limit of 1000.
That is probably large enough, no?

Anyway, from some basic testing it looks like marshal is always throwing 
that error when marshal.load() is called.
However, marshal.loads() works fine.

Might this be another encoding related error? 
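
One way to narrow this down is to feed the same marshalled bytes through
both entry points and compare the results. The sketch below does not
reproduce the Cygwin failure; it only shows the comparison, using an
in-memory binary stream in place of a real file and an arbitrary sample
value chosen for the example.

```python
import io
import marshal
import sys

# Arbitrary sample value (an assumption; any marshallable object would do).
value = (1, 2.5, "text", b"bytes", [True, None])
payload = marshal.dumps(value)

# Path 1: decode straight from the bytes object.
from_bytes = marshal.loads(payload)

# Path 2: decode from a binary file-like object, as marshal.load() does.
from_file = marshal.load(io.BytesIO(payload))

print(sys.getrecursionlimit())
print(from_bytes == value, from_file == value)
```

If the two paths disagree on a given build, the problem would seem to lie in
how the data is read from the file object rather than in the marshal format
itself.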


From guido at python.org  Wed Jul 18 20:56:07 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 18 Jul 2007 11:56:07 -0700
Subject: [Python-3000] Introspection broken for objects using
	Py_FindMethod()
In-Reply-To: <acd65fa20707181127w3507c064rc02e6241c24d86f2@mail.gmail.com>
References: <acd65fa20707171627t29f9fc03p37164f3d87d94a25@mail.gmail.com>
	<ca471dc20707171652w254d597bl9068abae61b64da4@mail.gmail.com>
	<acd65fa20707181127w3507c064rc02e6241c24d86f2@mail.gmail.com>
Message-ID: <ca471dc20707181156h5c8b874coefe02a58307d9d7c@mail.gmail.com>

On 7/18/07, Alexandre Vassalotti <alexandre at peadrop.com> wrote:
> On 7/17/07, Guido van Rossum <guido at python.org> wrote:
> > Yes, see a thread between me, Georg and Brett around March 7-10:
> >
> > http://mail.python.org/pipermail/python-3000/2007-March/006061.html
> >
>
> Thanks for the pointer.
>
> > I think the conclusion was to get rid of Py_FindMethod altogether. The
> > replacement isn't very hard. But it hasn't been done yet.
>
> Do you need you some help for that? Perhaps, I could try to write a
> patch to replace the trivial use cases of Py_FindMethod in the stdlib.
> Also, I think it would be a good idea to document the change, too.

That would be great!

The Python 3000 project can use all the help it can get!

Please use the py3k-struni branch.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Wed Jul 18 20:58:17 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 18 Jul 2007 11:58:17 -0700
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
In-Reply-To: <f7lnd8$l2s$1@sea.gmane.org>
References: <f7ithr$lrr$1@sea.gmane.org>
	<ca471dc20707181002w64e076aco9a509ec7e4e15b9a@mail.gmail.com>
	<f7lk7q$9m6$1@sea.gmane.org>
	<ca471dc20707181113m360db736h2fd079f29f71220@mail.gmail.com>
	<f7lnd8$l2s$1@sea.gmane.org>
Message-ID: <ca471dc20707181158p17417c9cg37c5382d61b53fe5@mail.gmail.com>

On 7/18/07, Joe Smith <unknown_kev_cat at hotmail.com> wrote:
> >> I'm wondering if the recusion limit on my build is getting set too low
> >> somehow.
> >
> > Can you find out what it is? sys.getrecursionlimit().
>
> Hmm...  It is a limit of 1000.
> That is probably large enough, no?

Yes, that's what it is for me.

> Anyway, from some basic testing it looks like marshal is always throwing
> that error when marshal.load() is called.
> However, marshal.loads() works fine.
>
> Might this be another encoding related error?

Perhaps. Or something else. Do try to investigate.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From alexandre at peadrop.com  Wed Jul 18 22:32:57 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Wed, 18 Jul 2007 16:32:57 -0400
Subject: [Python-3000] StringIO/BytesIO in io.py doesn't over-seek
	properly
In-Reply-To: <acd65fa20707030806i60b0e77dm71b394f279e2172c@mail.gmail.com>
References: <acd65fa20706230853w32f8895g91b7715c456900b7@mail.gmail.com>
	<ca471dc20706231052x561e7acfpf84373ea670c2974@mail.gmail.com>
	<acd65fa20706231124q4e5d5192kdc5694d52175e660@mail.gmail.com>
	<ca471dc20706231148p7cbb9953tb31099dfe68c9a32@mail.gmail.com>
	<acd65fa20706251114u60bae701ve95a84ffee27e0b2@mail.gmail.com>
	<acd65fa20706280737n54b8dea8l5362b8545c990236@mail.gmail.com>
	<acd65fa20707021046o4349aafdxd7b895f502edd32@mail.gmail.com>
	<ca471dc20707021138o3392bc11u9a9be3f1a6f4dda1@mail.gmail.com>
	<acd65fa20707030806i60b0e77dm71b394f279e2172c@mail.gmail.com>
Message-ID: <acd65fa20707181332n480bf6fsa7bff17403770786@mail.gmail.com>

So, any decision on the proposed semantic change of truncate?

-- Alexandre

On 7/3/07, Alexandre Vassalotti <alexandre at peadrop.com> wrote:
> On 7/2/07, Guido van Rossum <guido at python.org> wrote:
> > Honestly, I think truncate() should always set the current position to
> > the new size, even though that's not what it currently does.
>
> Thought about that and I think that would be the best thing to do.
> That would avoid making StringIO unnecessary different from BytesIO.
> And IMHO, it is less prone to bugs. If someone wants to truncate while
> keeping the current position, then he will have to state is intention
> explicitly by saving the value of tell() and calling seek() after
> truncating.
>
> I also find the semantic make more sense too. For example:
>
>    >>> s = StringIO("Good bye, world")
>    >>> s.truncate(10)
>    >>> s.write("cruel world")
>    >>> s.getvalue()
>    ???
>
> I think that should return "Good bye, cruel world", not "cruel world".
>
> So, does anyone else agree with this small semantic change of truncate()?
>

From guido at python.org  Wed Jul 18 22:36:26 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 18 Jul 2007 13:36:26 -0700
Subject: [Python-3000] StringIO/BytesIO in io.py doesn't over-seek
	properly
In-Reply-To: <acd65fa20707181332n480bf6fsa7bff17403770786@mail.gmail.com>
References: <acd65fa20706230853w32f8895g91b7715c456900b7@mail.gmail.com>
	<ca471dc20706231052x561e7acfpf84373ea670c2974@mail.gmail.com>
	<acd65fa20706231124q4e5d5192kdc5694d52175e660@mail.gmail.com>
	<ca471dc20706231148p7cbb9953tb31099dfe68c9a32@mail.gmail.com>
	<acd65fa20706251114u60bae701ve95a84ffee27e0b2@mail.gmail.com>
	<acd65fa20706280737n54b8dea8l5362b8545c990236@mail.gmail.com>
	<acd65fa20707021046o4349aafdxd7b895f502edd32@mail.gmail.com>
	<ca471dc20707021138o3392bc11u9a9be3f1a6f4dda1@mail.gmail.com>
	<acd65fa20707030806i60b0e77dm71b394f279e2172c@mail.gmail.com>
	<acd65fa20707181332n480bf6fsa7bff17403770786@mail.gmail.com>
Message-ID: <ca471dc20707181336n294fc353vc4eefc82854a8759@mail.gmail.com>

Unless anyone cares, it should imply a seek to the indicated position
if an argument was present.
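
A minimal sketch of the two patterns under discussion, assuming Python 3's
io.StringIO. The helper names are hypothetical. The explicit seek() calls
make the result the same whether or not truncate() itself moves the
position.

```python
import io


def truncate_and_append(buf, size, text):
    """Truncate buf to size characters, then write text at the new end."""
    buf.truncate(size)
    buf.seek(size)  # explicit, so it does not matter what truncate() did
    buf.write(text)
    return buf.getvalue()


def truncate_keeping_position(buf, size):
    """Truncate buf to size but restore the position held before the call."""
    pos = buf.tell()
    buf.truncate(size)
    buf.seek(pos)
    return buf.tell()


s = io.StringIO("Good bye, world")
print(truncate_and_append(s, 10, "cruel world"))
```

With the explicit seek, the example from the quoted message yields
"Good bye, cruel world" under either truncate() semantics.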

On 7/18/07, Alexandre Vassalotti <alexandre at peadrop.com> wrote:
> So, any decision on the proposed semantic change of truncate?
>
> -- Alexandre
>
> On 7/3/07, Alexandre Vassalotti <alexandre at peadrop.com> wrote:
> > On 7/2/07, Guido van Rossum <guido at python.org> wrote:
> > > Honestly, I think truncate() should always set the current position to
> > > the new size, even though that's not what it currently does.
> >
> > Thought about that and I think that would be the best thing to do.
> > That would avoid making StringIO unnecessarily different from BytesIO.
> > And IMHO, it is less prone to bugs. If someone wants to truncate while
> > keeping the current position, then he will have to state his intention
> > explicitly by saving the value of tell() and calling seek() after
> > truncating.
> >
> > I also find that the semantics make more sense. For example:
> >
> >    >>> s = StringIO("Good bye, world")
> >    >>> s.truncate(10)
> >    >>> s.write("cruel world")
> >    >>> s.getvalue()
> >    ???
> >
> > I think that should return "Good bye, cruel world", not "cruel world".
> >
> > So, does anyone else agree with this small semantic change of truncate()?
> >


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From kbk at shore.net  Wed Jul 18 23:34:05 2007
From: kbk at shore.net (Kurt B. Kaiser)
Date: Wed, 18 Jul 2007 17:34:05 -0400
Subject: [Python-3000] Invalid \U escape in source code give
	hard-to-trace error
In-Reply-To: <ca471dc20707181027u182550dyaaf362fc718dd883@mail.gmail.com>
	(Guido van Rossum's message of "Wed, 18 Jul 2007 10:27:01 -0700")
References: <ca471dc20707150717m7344c9cfh3237b78e9dcf681f@mail.gmail.com>
	<87k5sy5j6l.fsf@hydra.bayview.thirdcreek.com>
	<ca471dc20707181027u182550dyaaf362fc718dd883@mail.gmail.com>
Message-ID: <87d4yp5rci.fsf@hydra.bayview.thirdcreek.com>

"Guido van Rossum" <guido at python.org> writes:

>> www.python.org/sf/1755885
>
> Thanks! Checked in, and merged into p3yk.

Thanks!

Unfortunately, I see there's an error from test_unicode.py, which I
neglected to re-run.  My apologies!

I've checked in a fix on the trunk and the buildbots are relatively
happy once more, it seems.

Should be caught in the next merge.

-- 
KBK

From guido at python.org  Wed Jul 18 23:42:37 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 18 Jul 2007 14:42:37 -0700
Subject: [Python-3000] Invalid \U escape in source code give
	hard-to-trace error
In-Reply-To: <87d4yp5rci.fsf@hydra.bayview.thirdcreek.com>
References: <ca471dc20707150717m7344c9cfh3237b78e9dcf681f@mail.gmail.com>
	<87k5sy5j6l.fsf@hydra.bayview.thirdcreek.com>
	<ca471dc20707181027u182550dyaaf362fc718dd883@mail.gmail.com>
	<87d4yp5rci.fsf@hydra.bayview.thirdcreek.com>
Message-ID: <ca471dc20707181442w6d82d090rd2b8341f4ee097ee@mail.gmail.com>

On 7/18/07, Kurt B. Kaiser <kbk at shore.net> wrote:
> "Guido van Rossum" <guido at python.org> writes:
>
> >> www.python.org/sf/1755885
> >
> > Thanks! Checked in, and merged into p3yk.
>
> Thanks!
>
> Unfortunately, I see there's an error from test_unicode.py, which I
> neglected to re-run.  My apologies!
>
> I've checked in a fix on the trunk and the buildbots are relatively
> happy once more, it seems.
>
> Should be caught in the next merge.

Ah, I see. I fixed it separately in the py3k-struni branch. I'll try
to remember the next time I merge.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From g.brandl at gmx.net  Wed Jul 18 23:42:56 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 18 Jul 2007 23:42:56 +0200
Subject: [Python-3000] Invalid \U escape in source code give
	hard-to-trace error
In-Reply-To: <ca471dc20707181031sa2339a4u4900de65a549c4e2@mail.gmail.com>
References: <ca471dc20707150717m7344c9cfh3237b78e9dcf681f@mail.gmail.com>	<469D8AA5.1080502@v.loewis.de>
	<ca471dc20707181031sa2339a4u4900de65a549c4e2@mail.gmail.com>
Message-ID: <f7m1gn$odp$1@sea.gmane.org>

Guido van Rossum schrieb:
> On 7/17/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> > When a source file contains a string literal with an out-of-range \U
>> > escape (e.g. "\U12345678"), instead of a syntax error pointing to the
>> > offending literal, I get this, without any indication of the file or
>> > line:
>> >
>> > UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in
>> > position 0-9: illegal Unicode character
>> >
>> > This is quite hard to track down.
>>
>> I think the fundamental flaw is that a codec is used to implement
>> the Python syntax (or, rather, lexical rules).
>>
>> Not quite sure what the rationale for this design was; doing it on
>> the lexical level is (was) tricky because \u escapes were allowed
>> only for Unicode literals, and the lexer had no knowledge of the
>> prefix preceding a literal. (In 3k, it's still similar, because
>> \U escapes have no effect in bytes and raw literals).
>>
>> Still, even if it is "only" handled at the parsing level, I
>> don't see why it needs to be a codec. Instead, implementing
>> escapes in the compiler would still allow for proper diagnostics
>> (notice that in the AST the original lexical form of the string
>> literal is gone).
> 
> I guess because it was deemed useful to have a codec for this purpose
> too, thereby exposing the algorithm to Python code that needs the same
> functionality (e.g. the compiler package, RIP).

And it still is useful. If you want to convert a string into a printable
representation, you can use repr(), but for the inverse you need this
codec. (or eval()...)
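
A short sketch of that inverse in Python 3 terms. The function names are
hypothetical, and the input to the codec is assumed to be ASCII bytes
(the codec treats other bytes as Latin-1). ast.literal_eval is shown as a
safer stand-in for eval() and is not something the thread mentions.

```python
import ast


def unescape(literal_body):
    """Decode backslash escapes in an ASCII byte string via the codec."""
    return literal_body.decode("unicode_escape")


def unescape_repr(text):
    """Invert repr() for a string without calling eval()."""
    return ast.literal_eval(text)


print(unescape(b"caf\\u00e9"))
print(unescape_repr(repr("a\nb")) == "a\nb")

# An out-of-range escape fails inside the codec, with no file or line info.
try:
    unescape(b"\\U12345678")
except UnicodeDecodeError as exc:
    print("codec error:", exc.reason)
```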

Georg


From alexandre at peadrop.com  Wed Jul 18 23:43:54 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Wed, 18 Jul 2007 17:43:54 -0400
Subject: [Python-3000] exclusion feature for 2to3?
In-Reply-To: <f7e24l$92e$1@sea.gmane.org>
References: <f7e24l$92e$1@sea.gmane.org>
Message-ID: <acd65fa20707181443y121705cevb0d36a3f816ef6d@mail.gmail.com>

On 7/15/07, Georg Brandl <g.brandl at gmx.net> wrote:
> Most obvious would be a special comment, something like
>
> for x in curiousobject.iteritems():  # 2to3:keep
>      foo(x)
>
> Does that make sense?

It would be a good idea to define a convention for these special
comments. For example, we could define something similar to C's
pragma:

  #pragma <feature> <option> ...
  or perhaps,
  #: <feature> <option> ...

So, your example would become:

  for x in curiousobject.iteritems(): #pragma 2to3 keep
     foo(x)

I expect other tools, like pdb.py and trace.py could follow this
convention as well. For example:

  def buggy_func(): #pragma pdb break
     pass

  if debug: #pragma trace ignore
     pass

The motivation for such a convention is to make it easy for
programmers to identify comments that are in fact control lines.
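
As a rough sketch of how a tool might pick up such control lines, the
standard tokenize module can be used to collect comments. The function
name find_pragmas and the exact "#pragma <feature> <option> ..." grammar
are assumptions for illustration; no tool is known to implement this.

```python
import io
import tokenize


def find_pragmas(source):
    """Yield (line_number, feature, options) for each '#pragma ...' comment."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for tok in tokens:
        if tok.type != tokenize.COMMENT:
            continue
        words = tok.string[1:].split()  # drop the leading '#'
        if len(words) >= 2 and words[0] == "pragma":
            yield tok.start[0], words[1], words[2:]


example = (
    "for x in curiousobject.iteritems(): #pragma 2to3 keep\n"
    "    foo(x)  # an ordinary comment\n"
)
print(list(find_pragmas(example)))
```

Going through the tokenizer rather than a regular expression avoids
matching "#pragma" text that appears inside a string literal.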

-- Alexandre

From g.brandl at gmx.net  Wed Jul 18 23:44:11 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 18 Jul 2007 23:44:11 +0200
Subject: [Python-3000] Introspection broken for objects using
	Py_FindMethod()
In-Reply-To: <acd65fa20707181127w3507c064rc02e6241c24d86f2@mail.gmail.com>
References: <acd65fa20707171627t29f9fc03p37164f3d87d94a25@mail.gmail.com>	<ca471dc20707171652w254d597bl9068abae61b64da4@mail.gmail.com>
	<acd65fa20707181127w3507c064rc02e6241c24d86f2@mail.gmail.com>
Message-ID: <f7m1j2$odp$2@sea.gmane.org>

Alexandre Vassalotti schrieb:
> On 7/17/07, Guido van Rossum <guido at python.org> wrote:
>> Yes, see a thread between me, Georg and Brett around March 7-10:
>>
>> http://mail.python.org/pipermail/python-3000/2007-March/006061.html
>>
> 
> Thanks for the pointer.
> 
>> I think the conclusion was to get rid of Py_FindMethod altogether. The
>> replacement isn't very hard. But it hasn't been done yet.
> 
> Do you need some help with that? Perhaps I could try to write a
> patch to replace the trivial use cases of Py_FindMethod in the stdlib.
> Also, I think it would be a good idea to document the change, too.

I once started a patch for that, but deferred it IIRC in pyexpat or
elementtree.  I'll look whether I still have it lying around somewhere.

Georg


From benji at benjiyork.com  Wed Jul 18 23:59:19 2007
From: benji at benjiyork.com (Benji York)
Date: Wed, 18 Jul 2007 17:59:19 -0400
Subject: [Python-3000] exclusion feature for 2to3?
In-Reply-To: <acd65fa20707181443y121705cevb0d36a3f816ef6d@mail.gmail.com>
References: <f7e24l$92e$1@sea.gmane.org>
	<acd65fa20707181443y121705cevb0d36a3f816ef6d@mail.gmail.com>
Message-ID: <469E8D37.9050006@benjiyork.com>

Alexandre Vassalotti wrote:
> I expect other tools, like pdb.py and trace.py could follow this
> convention as well. For example:

I used the time machine to convince the author of trace.py to use this
convention.

He didn't like your spelling, but eventually agreed to #pragma NO COVER.
-- 
Benji York
http://benjiyork.com

From guido at python.org  Thu Jul 19 01:11:56 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 18 Jul 2007 16:11:56 -0700
Subject: [Python-3000] Announcing PEP 3136
In-Reply-To: <ca471dc20707030114m7fa2d74btb21c8bd1ae8023db@mail.gmail.com>
References: <20070630205444.GD22221@theory.org>
	<ca471dc20707030114m7fa2d74btb21c8bd1ae8023db@mail.gmail.com>
Message-ID: <ca471dc20707181611t1fad2e32qaf15d4407a580915@mail.gmail.com>

(FWIW, I've formally rejected the PEP now, referring to this message.)

--Guido

On 7/3/07, Guido van Rossum <guido at python.org> wrote:
> On 6/30/07, Matt Chisholm <matt-python at theory.org> wrote:
> > I've created and submitted a new PEP proposing support for labels in
> > Python's break and continue statements.  Georg Brandl has graciously
> > added it to the PEP list as PEP 3136:
> >
> > http://www.python.org/dev/peps/pep-3136/
>
> I think this is a good summary of various proposals that have been
> floated in the past, plus some new ones. As a PEP, it falls short
> because it doesn't pick a solution but merely offers a large menu of
> possible options. Also, there is nothing about implementation yet.
>
> However, I'm rejecting it on the basis that code so complicated to
> require this feature is very rare. In most cases there are existing
> work-arounds that produce clean code, for example using 'return'.
> While I'm sure there are some (rare) real cases where clarity of the
> code would suffer from a refactoring that makes it possible to use
> return, this is offset by two issues:
>
> 1. The complexity added to the language, permanently. This affects not
> only all Python implementations, but also every source analysis tool,
> plus of course all documentation for the language.
>
> 2. My expectation that the feature will be abused more than it will be
> used right, leading to a net decrease in code clarity (measured across
> all Python code written henceforth). Lazy programmers are everywhere,
> and before you know it you have an incredible mess on your hands of
> unintelligible code.
>
> I realize this is a heavy bar to pass, and somewhat subjective. That's
> okay. There is real value in having a small language. Also, as I said,
> while there are no past PEPs to document it, this has been brought up
> and rejected many times before.
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>
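
A small illustration of the 'return' work-around mentioned in the quoted
rejection: the nested loops move into a helper function, and a single
return leaves both loops at once. The function name and data are
hypothetical.

```python
def find_first_match(rows, target):
    """Return (row, column) of the first cell equal to target, or None."""
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            if value == target:
                return r, c  # exits both loops, no labeled break needed
    return None


grid = [[1, 2], [3, 4]]
print(find_first_match(grid, 3))
```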


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg.ewing at canterbury.ac.nz  Thu Jul 19 01:51:09 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 19 Jul 2007 11:51:09 +1200
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <ca471dc20707180947p41fdcd8k9be97b50658b7385@mail.gmail.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<loom.20070713T201857-3@post.gmane.org>
	<ca471dc20707171447p68c59c44w254ee9890eb44b8f@mail.gmail.com>
	<20070717223550.7B1B13A403A@sparrow.telecommunity.com>
	<469D6EC6.9010005@canterbury.ac.nz>
	<20070718020310.2168A3A403A@sparrow.telecommunity.com>
	<ca471dc20707180947p41fdcd8k9be97b50658b7385@mail.gmail.com>
Message-ID: <469EA76D.7000204@canterbury.ac.nz>

Guido van Rossum wrote:
> Sorry, but I'm still totally uncomfortable with this. While I admit
> the feature exists, I really, really, really don't want it to be used
> on a regular basis.

As long as the objects defined by a regular def statement
aren't modifiable, it seems like it won't be possible
to support retroactive generification of functions that
haven't initially been defined as generic somehow.

So effectively you're saying that you're against this,
or willing to forego it? Not arguing one way or the
other, just seeking to clarify your position.

--
Greg


From guido at python.org  Thu Jul 19 01:57:01 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 18 Jul 2007 16:57:01 -0700
Subject: [Python-3000] Heaptypes
In-Reply-To: <469C4B0B.50605@v.loewis.de>
References: <f72o9f$v6i$1@sea.gmane.org>
	<ca471dc20707110715s4cd53401t53e9075bdc2ea1df@mail.gmail.com>
	<46979811.2050405@v.loewis.de>
	<ca471dc20707140708n413bfe9fwc6d223f50ff44573@mail.gmail.com>
	<469C4B0B.50605@v.loewis.de>
Message-ID: <ca471dc20707181657o4ccfcc7eu94134972b0b78fb5@mail.gmail.com>

On 7/16/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Guido van Rossum schrieb:
> > That sounds like a good idea to try. It may break some more tests but
> > those are all indications of places that incorrectly still require
> > str8.
> >
> >> I wonder whether the "s" specifier in CallFunction, BuildValue etc
> >> should create Unicode objects, rather than str8 objects.
>
> Done. I fixed a number of test cases that broke because of that.
> In particular, bytes.__reduce__ could not easily return str8 objects
> as its marshalling state anymore (and shouldn't do so, anyway).
> So I made bytes a builtin type of pickle, using the S code.
> As a consequence, a number of other types had to get fixed.
>
> So in total, it adds one new failure: something in test_pickle
> now complains that bytes objects are not hashable.

Now that this is checked in, I understand the problem. You are using
the same opcodes for pickling bytes and str8 -- save_bytes() is a
clone of save_string() (the latter is the callback for str8, not for
str). But you made load_string() always return bytes. The broken tests
fail because they use hardcoded pickles which use the STRING opcode to
save a str8 which is used as a dict key.

You broke backwards compatibility this way; I think that a pickle
produced by Python 2.x should be readable by Python 3.0.

Now, one could argue about whether an 8-bit string pickled in 2.x
should be returned as a Unicode string in 3.0 or as a bytes array.
There is even an argument to be made that it should be a bytes array,
since an 8-bit string in 2.x is just as likely to represent binary
data as text data, and even if it's text, we don't know the encoding.
But I think that there is a counter-argument that's stronger: the dict
{'a': 42} pickled in 2.x must unpickle as a dict with an immutable
object as key. So we should either unpickle 'a' as a (unicode) str
with value 'a', or as (8-bit) str8, as long as the latter type exists
(I haven't decided whether to keep str8 or something like it, or
whether to try to get rid of it completely).

One possibility might be to first try to decode the STRING argument as
utf-8, and if that fails to convert it to str8 instead. What do you
think? I don't understand all of the changes you made in r56438,
perhaps you can save most of them.
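
A minimal sketch of that fallback, assuming the STRING payload is already
available as raw bytes. The function name is hypothetical, and since str8
does not exist outside the branch discussed here, immutable bytes stands
in for it. This is not the pickle module's actual code.

```python
def decode_string_payload(raw):
    """Return text if raw is valid UTF-8, else the original byte string."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw  # stand-in for str8: immutable, so still hashable


# Either result can serve as a dict key, which is the requirement above.
print({decode_string_payload(b"a"): 42})
print({decode_string_payload(b"\xff\xfe"): 42})
```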

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Thu Jul 19 01:59:52 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 18 Jul 2007 16:59:52 -0700
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <469EA76D.7000204@canterbury.ac.nz>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<loom.20070713T201857-3@post.gmane.org>
	<ca471dc20707171447p68c59c44w254ee9890eb44b8f@mail.gmail.com>
	<20070717223550.7B1B13A403A@sparrow.telecommunity.com>
	<469D6EC6.9010005@canterbury.ac.nz>
	<20070718020310.2168A3A403A@sparrow.telecommunity.com>
	<ca471dc20707180947p41fdcd8k9be97b50658b7385@mail.gmail.com>
	<469EA76D.7000204@canterbury.ac.nz>
Message-ID: <ca471dc20707181659n740ba5a0va8342f833094a855@mail.gmail.com>

On 7/18/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Guido van Rossum wrote:
> > Sorry, but I'm still totally uncomfortable with this. While I admit
> > the feature exists, I really, really, really don't want it to be used
> > on a regular basis.
>
> As long as the objects defined by a regular def statement
> aren't modifiable, it seems like it won't be possible
> to support retroactive generification of functions that
> haven't initially been defined as generic somehow.
>
> So effectively you're saying that you're against this,
> or willing to forego it? Not arguing one way or the
> other, just seeking to clarify your position.

The only approach to retroactive generification that I approve of is
replacing the entire object with a wrapper of sorts, e.g.

  foo = generify(foo)

or (more likely)

  import bar
  bar.foo = generify(bar.foo)

I know this has a downside when someone else did "from bar import foo"
before the generification was applied; that is a general problem with
"from foo import bar" and should be addressed by not using that style
in cases where this matters. (It is fine for importing a submodule
from a package of course.)
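
To make the downside concrete, here is a toy sketch. The generify below
is a hypothetical wrapper that dispatches on the type of the first
argument, not the PEP 3124 API, and the module bar is simulated with
types.ModuleType.

```python
import types


def generify(func):
    """Wrap func so implementations can be registered per argument type."""
    registry = {}

    def wrapper(arg, *args, **kwargs):
        for cls in type(arg).__mro__:
            if cls in registry:
                return registry[cls](arg, *args, **kwargs)
        return func(arg, *args, **kwargs)  # fall back to the original

    def when(cls):
        def register(impl):
            registry[cls] = impl
            return impl
        return register

    wrapper.when = when
    return wrapper


bar = types.ModuleType("bar")
bar.foo = lambda x: "default"
early_foo = bar.foo          # simulates an earlier "from bar import foo"
bar.foo = generify(bar.foo)  # retroactive generification by replacement


@bar.foo.when(int)
def _foo_int(x):
    return "int"


print(bar.foo(1), bar.foo("s"), early_foo(1))
```

The name bound before the replacement still refers to the original
function, so it never sees the int overload.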

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Thu Jul 19 02:15:30 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 19 Jul 2007 02:15:30 +0200
Subject: [Python-3000] Heaptypes
In-Reply-To: <ca471dc20707181657o4ccfcc7eu94134972b0b78fb5@mail.gmail.com>
References: <f72o9f$v6i$1@sea.gmane.org>	
	<ca471dc20707110715s4cd53401t53e9075bdc2ea1df@mail.gmail.com>	
	<46979811.2050405@v.loewis.de>	
	<ca471dc20707140708n413bfe9fwc6d223f50ff44573@mail.gmail.com>	
	<469C4B0B.50605@v.loewis.de>
	<ca471dc20707181657o4ccfcc7eu94134972b0b78fb5@mail.gmail.com>
Message-ID: <469EAD22.1040603@v.loewis.de>

> You broke backwards compatibility this way; I think that a pickle
> produced by Python 2.x should be readable by Python 3.0.

It is, is it not?

> (I haven't decided whether to keep str8 or something like it, or
> whether to try to get rid of it completely).

I assumed the latter - and if it indeed goes away, it's certainly
a bug to ever return str8 from pickle, right?

> One possibility might be to first try to decode the STRING argument as
> utf-8, and if that fails to convert it to str8 instead. What do you
> think? I don't understand all of the changes you made in r56438,
> perhaps you can save most of them.

The question really is what bytes should be pickled as; that needs to
be decided before fixing the code. Should it be built-in (and if so,
using what code)? If not, it probably needs to go through __reduce__,
and if so, what should __reduce__ return for bytes object?

__reduce__ currently does (O(s#)) with (ob_type, ob_bytes, ob_size).
Now, s# creates a Unicode object, and the pickling fails to round-trip
correctly.

If __reduce__ returns a Unicode object, what encoding should be assumed?
(which then needs to be symmetric with bytes())

If __reduce__ returns a str8 object, you will have to keep str8 (or
else you cannot pickle bytes).

Regards,
Martin


From guido at python.org  Thu Jul 19 05:01:18 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 18 Jul 2007 20:01:18 -0700
Subject: [Python-3000] Heaptypes
In-Reply-To: <469EAD22.1040603@v.loewis.de>
References: <f72o9f$v6i$1@sea.gmane.org>
	<ca471dc20707110715s4cd53401t53e9075bdc2ea1df@mail.gmail.com>
	<46979811.2050405@v.loewis.de>
	<ca471dc20707140708n413bfe9fwc6d223f50ff44573@mail.gmail.com>
	<469C4B0B.50605@v.loewis.de>
	<ca471dc20707181657o4ccfcc7eu94134972b0b78fb5@mail.gmail.com>
	<469EAD22.1040603@v.loewis.de>
Message-ID: <ca471dc20707182001g241ef15cj5aacea9971e7d2b0@mail.gmail.com>

On 7/18/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> > You broke backwards compatibility this way; I think that a pickle
> > produced by Python 2.x should be readable by Python 3.0.
>
> It is, is it not?

No; {'a': 1} pickled on 2.x results in an error complaining about an
unhashable object when the pickle is read in 3.0; this is the error
you saw in test_pickle.py.

> > (I haven't decided whether to keep str8 or something like it, or
> > whether to try to get rid of it completely).
>
> I assumed the latter - and if it indeed goes away, it's certainly
> a bug to ever return str8 from pickle, right?

If indeed it goes away, it can't be returned. If it's still around, we
can argue about the desirability of returning one.

> > One possibility might be to first try to decode the STRING argument as
> > utf-8, and if that fails to convert it to str8 instead. What do you
> > think? I don't understand all of the changes you made in r56438,
> > perhaps you can save most of them.
>
> The question really is what bytes should be pickled as; that needs to
> be decided before fixing the code. Should it be built-in (and if so,
> using what code)? If not, it probably needs to go through __reduce__,
> and if so, what should __reduce__ return for bytes object?

Either a new opcode (which would make such a pickle fail hard when
unpickled with 2.5, but that's probably fine as it would fail anyway),
or some variation of what I coded before, using __reduce__.

> __reduce__ currently does (O(s#)) with (ob_type, ob_bytes, ob_size).
> Now, s# creates a Unicode object, and the pickling fails to round-trip
> correctly.

I thought that before your patch a bytes object roundtripped correctly
with all three protocols. Or maybe it got broken when s# was changed?

An additional requirement might be that if bytes are introduced in
2.6, a pickle containing bytes written by 3.0 should be readable by
2.6. Ideally, pickles not containing bytes written in 3.0 should
always be readable in 2.6 (assuming the user-defined types it
references exist).

> If __reduce__ returns a Unicode object, what encoding should be assumed?
> (which then needs to be symmetric with bytes())
>
> If __reduce__ returns a str8 object, you will have to keep str8 (or
> else you cannot pickle bytes).

When __reduce__ returns a string at all, that means it's the name of a
global. I guess that should be encoded using UTF-8, so that as long as
the name is ASCII, 2.x can unpickle it. But I'm not sure if that's
what you were asking.

Anyway, one reason this is such a mess is clearly that the pickle
protocol has no independent spec -- it's grown organically in code.
Reverse-engineering the intent of the code is a pain.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Thu Jul 19 09:06:58 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 19 Jul 2007 09:06:58 +0200
Subject: [Python-3000] Heaptypes
In-Reply-To: <ca471dc20707182001g241ef15cj5aacea9971e7d2b0@mail.gmail.com>
References: <f72o9f$v6i$1@sea.gmane.org>	
	<ca471dc20707110715s4cd53401t53e9075bdc2ea1df@mail.gmail.com>	
	<46979811.2050405@v.loewis.de>	
	<ca471dc20707140708n413bfe9fwc6d223f50ff44573@mail.gmail.com>	
	<469C4B0B.50605@v.loewis.de>	
	<ca471dc20707181657o4ccfcc7eu94134972b0b78fb5@mail.gmail.com>	
	<469EAD22.1040603@v.loewis.de>
	<ca471dc20707182001g241ef15cj5aacea9971e7d2b0@mail.gmail.com>
Message-ID: <469F0D92.7080301@v.loewis.de>

>> __reduce__ currently does (O(s#)) with (ob_type, ob_bytes, ob_size).
>> Now, s# creates a Unicode object, and the pickling fails to round-trip
>> correctly.
> 
> I thought that before your patch a bytes object roundtripped correctly
> with all three protocols. Or maybe it got broken when s# was changed?

It did, and it got. s# used to return a str8, which then was pickled
byte-for-byte. When s# started to return Unicode strings, bytes
above 128 got widened to Py_UNICODE (which is what currently
PyUnicode_FromString does), so b'\xFF' became bytes('\uFFFF').
That got pickled and unpickled; then bytes('\uFFFF') is
b'\xef\xbf\xbf' (because it applies the default encoding to
the unicode argument), and it failed to roundtrip to b'\xFF'.

It's actually not possible to generate b'\xFF' using
a unicode string argument, as the default encoding will
never return s'\xFF' (as that's not valid UTF-8).
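
The failed round trip can be reproduced with plain encode/decode calls in
Python 3. The helper name is hypothetical, and the Latin-1 mapping at the
end is shown only as one encoding that does round-trip every byte value;
the thread itself has not settled on it.

```python
def utf8_can_produce(target):
    """True if some Unicode string encodes to target under UTF-8."""
    try:
        target.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# U+FFFF encodes to three bytes, not back to the single byte 0xFF.
print("\uffff".encode("utf-8"))
print(utf8_can_produce(b"\xff"))

# Latin-1 maps code points 0..255 to bytes 0..255 one-to-one.
all_bytes = bytes(range(256))
print(all_bytes.decode("latin-1").encode("latin-1") == all_bytes)
```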

> An additional requirement might be that if bytes are introduced in
> 2.6, a pickle containing bytes written by 3.0 should be readable by
> 2.6.

Sure: whatever we decide now needs to be applied to 2.6 also.

>> If __reduce__ returns a Unicode object, what encoding should be assumed?
>> (which then needs to be symmetric with bytes())
>>
>> If __reduce__ returns a str8 object, you will have to keep str8 (or
>> else you cannot pickle bytes).
> 
> When __reduce__ returns a string at all, that means it's the name of a
> global. I guess that should be encoded using UTF-8, so that as long as
> the name is ASCII, 2.x can unpickle it. But I'm not sure if that's
> what you were asking.

No.
py> b'foo'.__reduce__()
(<type 'bytes'>, ('foo',))
py> b'\xff'.__reduce__()
(<type 'bytes'>, ('\uffff',))

It returns one string each time, as the first element of a one-element
tuple (that is then passed to the bytes() constructor on unpickling)

> Anyway, one reason this is such a mess is clearly that the pickle
> protocol has no independent spec -- it's grown organically in code.
> Reverse-engineering the intent of the code is a pain.

That's also true, but I don't see it much as a problem here. If it
had a spec, that spec would have said that b'S', b'T' and b'U'
have a str payload. That spec would break if str8 goes away, and
the spec would be changed to explain how these codes act in 2.x
and 3.x. It would not talk at all about the bytes type, and that
its __reduce__ might return different things in 2.x and 3.x
(unless bytes gets a primitive code for pickle).

Regards,
Martin

From p.f.moore at gmail.com  Thu Jul 19 10:30:35 2007
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 19 Jul 2007 09:30:35 +0100
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <ca471dc20707181659n740ba5a0va8342f833094a855@mail.gmail.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<loom.20070713T201857-3@post.gmane.org>
	<ca471dc20707171447p68c59c44w254ee9890eb44b8f@mail.gmail.com>
	<20070717223550.7B1B13A403A@sparrow.telecommunity.com>
	<469D6EC6.9010005@canterbury.ac.nz>
	<20070718020310.2168A3A403A@sparrow.telecommunity.com>
	<ca471dc20707180947p41fdcd8k9be97b50658b7385@mail.gmail.com>
	<469EA76D.7000204@canterbury.ac.nz>
	<ca471dc20707181659n740ba5a0va8342f833094a855@mail.gmail.com>
Message-ID: <79990c6b0707190130g7a5c7804kcf96b0e9956724c2@mail.gmail.com>

On 19/07/07, Guido van Rossum <guido at python.org> wrote:
> The only approach to retroactive generification that I approve of is
> replacing the entire object with a wrapper of sorts, e.g.
>
>   foo = generify(foo)

Which (again, just to clarify) means that you would require that
generic functions be introduced by a decorator?

    @generic
    def foo():
        pass

(your explicit equivalent would be for "after the fact" conversion to
a generic).

Paul

From aurelien.campeas at logilab.fr  Thu Jul 19 10:42:15 2007
From: aurelien.campeas at logilab.fr (=?iso-8859-1?Q?Aur=E9lien_Camp=E9as?=)
Date: Thu, 19 Jul 2007 10:42:15 +0200
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070713173936.53C213A404D@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
Message-ID: <20070719084215.GA18244@crater.logilab.fr>

On Fri, Jul 13, 2007 at 01:41:47PM -0400, Phillip J. Eby wrote:
> At 07:39 AM 7/13/2007 +0200, Michele Simionato wrote:
> >But I want to ask your opinion first, in order to understand if you 
> >are willing to scale down your proposal or not. At EuroPython Guido 
> >said that in private mail you made some strong argument explaining 
> >why the PEP could not be simplified, but he did not say more than that
> 
> It's not an argument that the PEP can't be simplified; only that a 
> simpler PEP won't accomplish my original goal for the PEP (of having 
> a generic API for generic functions) vs. simply having a generic 
> function implementation in the stdlib.  The first goal requires the 
> second, but the second doesn't need the first, and as far as I'm 
> aware, I'm the only person who really wants the first.

At least on this list ?
Well, you could add me to your count ... but who am I ? ;-)

> 
> A simpler PEP could exist to implement the second goal only, 
> implementing dynamic overloading in Python 3.0 with all of the 
> non-controversial features of 3124, and using Guido's preferred API.
> 
> The holdup is that I don't have time to work on the *implementation* 
> of both my version *and* this simplified version; there is little 
> overlap between the two because mine is highly 
> self-referential/self-bootstrapping, absolutely dependent on being 
> able to modify functions in-place (a feature Guido seems near -1 on), 
> and virtually impossible to scale down.
> 
> So, it is much lower on my priorities at the moment to implement the 
> simplified version, because I will neither gain code reuse *nor* the 
> API standardization I'd hoped for.
> 
> At the moment, my plan is to finish implementing a PEP 3124-like, 
> fully extensible implementation for Python 2.x (see PEAK-Rules), then 
> look at splitting 3124 into a simplified version and a separate 
> extension API PEP aimed at Python 3.1 or later.  At that point, I 
> will know for sure what extension API features are necessary to 
> implement the more advanced features I want in PEAK-Rules.
> 
> I expect to be able to start work on this (i.e., revisiting the 
> proposal) in about a month.  With luck, I will be able to carve out 
> enough time to create the simpler implementation and update the PEP 
> in a reasonable amount of time.
> 
> However, there is nothing stopping anyone else who wishes it from 
> either making the simpler implementation or drafting the scaled-down 
> PEP.  The simpler version Guido wants isn't really that different 
> from his existing generic function prototype, especially if you drop 
> all forms of method combination (including :next_method).  It will 

Maybe it's just a silly data point, but the current Zope/Plone &
assorted products codebases are riddled with ad-hoc before, after
methods and hard-coded super-calls ... I don't know what these have
become in Zope 3 but at least this shows a need. Having standard ways
to specify these methods as gfs, would be a boon. OTOH having generic
functions without the standard method combination looks a bit like a
futile exercise; these are especially useful when you build hog
frameworks such as zope and whatever sits and tries to cooperate on
top of it.

Maybe thinking about method combination as 'dynamic decoration'
(paralleling the 'generic functions'/'dynamically overloadable
functions' terminology shift) would be a friendlier way to teach
python folks about the feature ? (Since it seems to me that python
wants to absorb foreign languages' features under different names.)

I would have liked to have input on this from other people using
RuleDispatch features also (doesn't one of Django/Turbogears project
use them extensively ?). Just so the BDFL & lieutenants don't argue
too much in the direction of 'the community has no experience with
these things'. I think (wishfully ?) a sizeable, if not big, part of
the python *user* community is knowledgeable about it. These people do
not necessarily express themselves there.

My two cents,
Aurélien.

> also need positional dispatching, but that's another feature that 
> could perhaps wait for 3.1 as well.
> 
> In short, if you want a PEP 3124 implementation started on sooner 
> than about a month from now, you need to find a volunteer or do it yourself.
> 
> 
> >The point is that for 95% of my use cases, simplegeneric would be 
> >enough, and it is already available *now*. So, if Guido was willing 
> >to accept something like simplegeneric for Python 3.0, I would not 
> >mind waiting for multiple dispatch in 3.1.
> 
> You'll have to ask him about that.  For what it's worth, the pkgutil 
> module already contains an even simpler generic function 
> implementation than simplegeneric, and is already in the stdlib 
> albeit undocumented.
> 
> 
> >The reason why I am not using simplegeneric or RuleDispatch already, 
> >is that I do not want to commit in production to a technology 
> >without the official approval of the BDFL, and I prefer to wait now 
> >than having to change my code later.
> 
> I guess this means you never use any packages from the Cheeseshop?  :)
> 

From p.f.moore at gmail.com  Thu Jul 19 12:58:54 2007
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 19 Jul 2007 11:58:54 +0100
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070719084215.GA18244@crater.logilab.fr>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<20070719084215.GA18244@crater.logilab.fr>
Message-ID: <79990c6b0707190358q3b64d67ejd342a24ada9ca8fd@mail.gmail.com>

On 19/07/07, Aurélien Campéas <aurelien.campeas at logilab.fr> wrote:
> On Fri, Jul 13, 2007 at 01:41:47PM -0400, Phillip J. Eby wrote:
> > At 07:39 AM 7/13/2007 +0200, Michele Simionato wrote:
> > >But I want to ask your opinion first, in order to understand if you
> > >are willing to scale down your proposal or not. At EuroPython Guido
> > >said that in private mail you made some strong argument explaining
> > >why the PEP could not be simplified, but he did not say more than that
> >
> > It's not an argument that the PEP can't be simplified; only that a
> > simpler PEP won't accomplish my original goal for the PEP (of having
> > a generic API for generic functions) vs. simply having a generic
> > function implementation in the stdlib.  The first goal requires the
> > second, but the second doesn't need the first, and as far as I'm
> > aware, I'm the only person who really wants the first.
>
> At least on this list ?
> Well, you could add me to your count ... but who am I ? ;-)

I don't think the issue is quite as black and white as Phillip is
stating it. I personally have no immediate need for his more advanced
API, but I'd support its inclusion if that meant increasing the chance
of *any* GF API going into the core.

There really ought to be an "Open Issues" section of the PEP,
capturing the key areas where we don't have agreement. The lack of
such a section is what makes it almost impossible to follow the
discussions, insofar as how they make progress towards accepting the
PEP.

As a contribution to the discussion, may I offer the following as the
key items I believe are open:

1. The "Advanced" API - some people (including Guido?) do not see the
need for the advanced features of the PEP such as method combinations.
On the other hand, no-one has offered to write up or implement a
reduced version.

2. Functions being modifiable in-place. Technical issues with the
implementation of the advanced API are complex to code without
assuming that function objects can be modified (which Guido is
unwilling to sanction in the general case). Furthermore, the PEP
specifically states that @overload modifies existing functions
in-place.

3. All functions are generic - The PEP states that the @overload
decorator will work on any function, which requires in-place
modification. By requiring overloadable functions to be declared
somehow (for example, using a decorator) this requirement could
possibly be removed.

My apologies if I've misrepresented anyone's views. Please correct me
if I have! I hope this is of some use.

Paul.

From guido at python.org  Thu Jul 19 16:07:09 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 19 Jul 2007 07:07:09 -0700
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <79990c6b0707190130g7a5c7804kcf96b0e9956724c2@mail.gmail.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<loom.20070713T201857-3@post.gmane.org>
	<ca471dc20707171447p68c59c44w254ee9890eb44b8f@mail.gmail.com>
	<20070717223550.7B1B13A403A@sparrow.telecommunity.com>
	<469D6EC6.9010005@canterbury.ac.nz>
	<20070718020310.2168A3A403A@sparrow.telecommunity.com>
	<ca471dc20707180947p41fdcd8k9be97b50658b7385@mail.gmail.com>
	<469EA76D.7000204@canterbury.ac.nz>
	<ca471dc20707181659n740ba5a0va8342f833094a855@mail.gmail.com>
	<79990c6b0707190130g7a5c7804kcf96b0e9956724c2@mail.gmail.com>
Message-ID: <ca471dc20707190707o31ada610w2c5a7133233d5406@mail.gmail.com>

On 7/19/07, Paul Moore <p.f.moore at gmail.com> wrote:
> On 19/07/07, Guido van Rossum <guido at python.org> wrote:
> > The only approach to retroactive generification that I approve of is
> > replacing the entire object with a wrapper of sorts, e.g.
> >
> >   foo = generify(foo)
>
> Which (again, just to clarify) means that you would require that
> generic functions be introduced by a decorator?
>
>     @generic
>     def foo():
>         pass
>
> (your explicit equivalent would be for "after the fact" conversion to
> a generic).

Yes.
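
For readers following along, here is a minimal sketch of what the wrapper
approach could look like. It is only an illustration: the class name
`generic`, its `when()` method, and the dispatch-on-first-argument rule are
assumptions made for this example, not the PEP 3124 API and not the sandbox
"overloading" code. The point is that the function object is replaced by a
wrapper rather than modified in place.

```python
class generic:
    """Hypothetical wrapper: dispatches on the type of the first argument."""

    def __init__(self, default):
        self.default = default   # fallback when no registered type matches
        self.registry = {}       # maps type -> implementation

    def when(self, cls):
        """Return a decorator registering an implementation for cls."""
        def register(func):
            self.registry[cls] = func
            return func
        return register

    def __call__(self, arg, *args, **kwargs):
        # Walk the MRO so subclasses inherit implementations of base classes.
        for cls in type(arg).__mro__:
            if cls in self.registry:
                return self.registry[cls](arg, *args, **kwargs)
        return self.default(arg, *args, **kwargs)


# Declared up front with a decorator.
@generic
def describe(obj):
    return "something"


@describe.when(int)
def describe_int(obj):
    return "an integer"


# "After the fact" conversion: the name is rebound to a wrapper object.
def shout(obj):
    return str(obj).upper()


shout = generic(shout)


@shout.when(bytes)
def shout_bytes(obj):
    return obj.decode("ascii").upper()
```

Code that imported the original `shout` before the rebinding keeps the plain
function, which is the usual cost of wrapping instead of modifying in place.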

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Thu Jul 19 16:22:37 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 19 Jul 2007 07:22:37 -0700
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070719084215.GA18244@crater.logilab.fr>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<20070719084215.GA18244@crater.logilab.fr>
Message-ID: <ca471dc20707190722w53fcb1d9te591db51035de487@mail.gmail.com>

On 7/19/07, Aurélien Campéas <aurelien.campeas at logilab.fr> wrote:
> I would have liked to have input on this from other people using
> RuleDispatch features also (doesn't one of Django/Turbogears project
> use them extensively ?). Just so the BDFL & lieutenants don't argue
> too much in the direction of 'the community has no experience with
> these things'. I think (wishfully ?) a sizeable, if not big, part of
> the python *user* community is knowledgeable about it. These people do
> not necessarily express themselves there.

Thanks for posting. It's been excruciatingly hard to find anyone
besides Phillip interested in GFs or able to provide use cases. For me
they're mostly still something theoretically interesting from other
languages, like continuations. Maybe you can round up some more users?

FWIW, I think the Turbogears use you're thinking of is jsonify, a GF
for converting arbitrary Python data into JSON (JavaScript Object
Notation). But I'm not aware of it using any of the advanced features
-- it seems to be using just the basic facility of overloading on a
single argument type, which could be done with my own "overloading"
example (see the Python subversion sandbox). At least that's what I
got from skimming the docs:
http://docs.turbogears.org/1.0/JsonifyDecorator . That article claims
that TurboGears uses RuleDispatch extensively. I'd love to hear from
them about how they use the advanced features.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From alexandre at peadrop.com  Thu Jul 19 17:34:38 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Thu, 19 Jul 2007 11:34:38 -0400
Subject: [Python-3000] exclusion feature for 2to3?
In-Reply-To: <469E8D37.9050006@benjiyork.com>
References: <f7e24l$92e$1@sea.gmane.org>
	<acd65fa20707181443y121705cevb0d36a3f816ef6d@mail.gmail.com>
	<469E8D37.9050006@benjiyork.com>
Message-ID: <acd65fa20707190834y7f6034e3uba166b7eaa5066ed@mail.gmail.com>

On 7/18/07, Benji York <benji at benjiyork.com> wrote:
> Alexandre Vassalotti wrote:
> > I expect other tools, like pdb.py and trace.py could follow this
> > convention as well. For example:
>
> I used the time machine to convince the author of trace.py use this
> convention.

Uh?

> He didn't like your spelling, but eventually agreed to #pragma NO COVER.

Ah! :) Yes, that is where I got the spelling. I don't really like it
either, but I haven't found anything better.

-- Alexandre

From aurelien.campeas at logilab.fr  Thu Jul 19 17:41:42 2007
From: aurelien.campeas at logilab.fr (=?iso-8859-1?Q?Aur=E9lien_Camp=E9as?=)
Date: Thu, 19 Jul 2007 17:41:42 +0200
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <ca471dc20707190722w53fcb1d9te591db51035de487@mail.gmail.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<20070719084215.GA18244@crater.logilab.fr>
	<ca471dc20707190722w53fcb1d9te591db51035de487@mail.gmail.com>
Message-ID: <20070719154141.GD18244@crater.logilab.fr>

On Thu, Jul 19, 2007 at 07:22:37AM -0700, Guido van Rossum wrote:
> On 7/19/07, Aurélien Campéas <aurelien.campeas at logilab.fr> wrote:
>> I would have liked to have input on this from other people using
>> RuleDispatch features also (doesn't one of Django/Turbogears project
>> use them extensively ?). Just so the BDFL & lieutenants don't argue
>> too much in the direction of 'the community has no experience with
>> these things'. I think (wishfully ?) a sizeable, if not big, part of
>> the python *user* community is knowledgeable about it. These people do
>> not necessarily express themselves there.
>
> Thanks for posting. It's been excruciatingly hard to find anyone
> besides Phillip interested in GFs or able to provide use cases. For me
> they're mostly still something theoretically interesting from other
> languages, like continuations. Maybe you can round up some more
> users?

I will try.

Please note that (imho) unlike scheme's first class continuations
(which are clearly an über-powerful, hard-to-master meta-programming
feature), method combinations are just another tool for day-to-day
programming (in languages that already provide them), especially large
systems. One can certainly live without them, just like one can
program without the python 2.5 with statement. I sincerely believe
Zope has been crying out for gfs, including standard method combination,
since its inception.

Btw, I like to think of 'with' as a (static) decorator for code
blocks. Why not see before/after/around methods like a variation on
the theme of (dynamic) decoration of existing methods ? Terminology
change seems important for the Python community as it (perhaps) helps
assimilation of new concepts in the light of ones that are already
mastered. Dunno if that makes sense, yet.

>
> FWIW, I think the Turbogears use you're thinking of is jsonify, a GF
> for converting arbitrary Python data into JSON (JavaScript Object
> Notation). 

Yes and I remember well Simon Belak's presentation (and enthusiasm) at
EP 2006.

At least from http://turbogears.org/ultimate.html one sees that
generic functions are also used to:

# choose widgets for data entry (tgfastdata.formmaker)
# pick an output method for expose() (turbogears.controllers)
# choose an error handler when something goes wrong (turbogears.errorhandling)

> But I'm not aware of it using any of the advanced features
> -- it seems to be using just the basic facility of overloading on a
> single argument type, which could be done with my own "overloading"
> example (see the Python subversion sandbox). At least that's what I
> got from skimming the docs:
> http://docs.turbogears.org/1.0/JsonifyDecorator . That article claims
> that TurboGears uses RuleDispatch extensively. I'd love to hear from
> them about how they use the advanced features.

I might want to take some time next week to have a look at the source.

Anyway, thanks for leaving that door open; I felt like it was all
done.

Aurélien.

>
> -- 
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>

From pje at telecommunity.com  Thu Jul 19 17:56:17 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 19 Jul 2007 11:56:17 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <ca471dc20707190722w53fcb1d9te591db51035de487@mail.gmail.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<20070719084215.GA18244@crater.logilab.fr>
	<ca471dc20707190722w53fcb1d9te591db51035de487@mail.gmail.com>
Message-ID: <20070719160724.62AE73A403A@sparrow.telecommunity.com>

At 07:22 AM 7/19/2007 -0700, Guido van Rossum wrote:
>On 7/19/07, Aurélien Campéas <aurelien.campeas at logilab.fr> wrote:
> > I would have liked to have input on this from other people using
> > RuleDispatch features also (doesn't one of Django/Turbogears project
> > use them extensively ?). Just so the BDFL & lieutenants don't argue
> > too much in the direction of 'the community has no experience with
> > these things'. I think (wishfully ?) a sizeable, if not big, part of
> > the python *user* community is knowledgeable about it. These people do
> > not necessarily express themselves there.
>
>Thanks for posting. It's been excruciatingly hard to find anyone
>besides Phillip interested in GFs or able to provide use cases. For me
>they're mostly still something theoretically interesting from other
>languages, like continuations. Maybe you can round up some more users?

About a month ago, googling PEP 3124 turned up a handful of blog 
posts in support.  I also got a few private emails of support.  The 
blog posts weren't from anybody I know or who are past users of my 
libraries, AFAICT, and at any rate aren't the same people who emailed.

My simplegeneric package has hundreds of downloads logged at the 
Cheeseshop -- about 1/8th as many as wsgiref, if that gives you any 
idea of relative popularity.

RuleDispatch isn't on the Cheeseshop, so I don't know how many people 
are using that.  But the people that are, are very 
enthusiastic.  During the time period when RuleDispatch wasn't 
working properly on Python 2.5 yet, I got fairly regular emails 
asking when it would.  :)  RuleDispatch uses my DecoratorTools 
package, whose 1.4 version had over 8000 Cheeseshop downloads (more 
than double wsgiref), and I believe that those are mostly due to 
TurboGears' use of RuleDispatch (as well as direct use of DecoratorTools).


>FWIW, I think the Turbogears use you're thinking of is jsonify, a GF
>for converting arbitrary Python data into JSON (JavaScript Object
>Notation). But I'm not aware of it using any of the advanced features
>-- it seems to be using just the basic facility of overloading on a
>single argument type, which could be done with my own "overloading"
>example (see the Python subversion sandbox).

Actually, for that use case even simplegeneric would suffice, but at 
the time JSONify was written, it didn't exist yet.

By the way, I recently came across a use case for @around that I 
hadn't mentioned before.  I'm in the process of re-implementing 
RuleDispatch's expression features in PEAK-Rules, and as I was 
defining the rules for intersecting logical conditions, it occurred 
to me that you could define intersection in terms of 
implication.  When intersecting conditions A and B, you can return A 
if it implies B, or B if it implies A.

So I just wrote this (translated here to the PEP 3124 dialect):

@around(intersect)
def intersect_if_implies(c1:object, c2:object, nm:next_method):
     if implies(c1, c2):
         return c1
     elif implies(c2, c1):
         return c2
     return nm(c1, c2)

Because this method is @around, it is called before any ordinary 
methods are called, even if they apply to more specific types than 
'object'.  This means you only have to define intersection algorithms 
to handle conditions that don't imply each other.  (Assuming of 
course you've defined implies() relationships.)

When I realized I could do this, I was able to ditch a bunch of 
duplicated code in the individual intersect() relationships I had, 
and avoided having to write that code for the rest of the intersect() 
methods I had left to write.
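
To make the mechanism concrete, here is a toy sketch of how an "around"
method can sit in front of ordinary dispatch. Everything in it is an
assumption made for illustration: the `GenericFunction` class, its
`when()` and `around()` methods, and the modelling of conditions as
frozensets (where "A implies B" means A is a subset of B). It is not how
PEAK-Rules or the PEP 3124 reference code is implemented.

```python
class GenericFunction:
    """Toy generic function with ordinary and 'around' methods."""

    def __init__(self, name):
        self.name = name
        self.primary = {}   # type of first argument -> ordinary method
        self.arounds = []   # around methods, applicable to any arguments

    def when(self, cls):
        def register(func):
            self.primary[cls] = func
            return func
        return register

    def around(self, func):
        self.arounds.append(func)
        return func

    def _call_primary(self, *args):
        # Ordinary dispatch: most specific registered class of args[0] wins.
        for cls in type(args[0]).__mro__:
            if cls in self.primary:
                return self.primary[cls](*args)
        raise TypeError("no applicable method for %s" % self.name)

    def __call__(self, *args):
        # Wrap innermost-first, so around methods run before any ordinary
        # method and each receives the rest of the chain as its last argument.
        call = self._call_primary
        for func in reversed(self.arounds):
            call = _chain(func, call)
        return call(*args)


def _chain(func, next_method):
    def call(*args):
        return func(*args, next_method)
    return call


intersect = GenericFunction("intersect")
calls = []  # records which argument pairs reach the ordinary method


def implies(c1, c2):
    # Assumption for this sketch: a condition is the set of values it allows.
    return c1 <= c2


@intersect.when(frozenset)
def intersect_sets(c1, c2):
    calls.append((c1, c2))
    return c1 & c2


@intersect.around
def intersect_if_implies(c1, c2, next_method):
    if implies(c1, c2):
        return c1
    elif implies(c2, c1):
        return c2
    return next_method(c1, c2)
```

With this in place the ordinary `intersect_sets` method is only reached for
pairs where neither condition implies the other.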


From pje at telecommunity.com  Thu Jul 19 18:16:30 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 19 Jul 2007 12:16:30 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <79990c6b0707190358q3b64d67ejd342a24ada9ca8fd@mail.gmail.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<20070719084215.GA18244@crater.logilab.fr>
	<79990c6b0707190358q3b64d67ejd342a24ada9ca8fd@mail.gmail.com>
Message-ID: <20070719161414.8EB723A403A@sparrow.telecommunity.com>

At 11:58 AM 7/19/2007 +0100, Paul Moore wrote:
>1. The "Advanced" API - some people (including Guido?) do not see the
>need for the advanced features of the PEP such as method combinations.
>On the other hand, no-one has offered to write up or implement a
>reduced version.

Actually, two people have, if you count me.  The other one hasn't yet 
done any of the things we discussed that they could do, and it's 
still on my "to do eventually" list to take care of the rest, 
including an implementation.


>2. Functions being modifiable in-place. Technical issues with the
>implementation of the advanced API are complex to code without
>assuming that function objects can be modified (which Guido is
>unwilling to sanction in the general case). Furthermore, the PEP
>specifically states that @overload modifies existing functions
>in-place.
>
>3. All functions are generic - The PEP states that the @overload
>decorator will work on any function, which requires in-place
>modification. By requiring overloadable functions to be declared
>somehow (for example, using a decorator) this requirement could
>possibly be removed.

I've agreed to Guido's terms for this stuff, more than once, and am 
fine with having a restricted implementation that does things his 
way.  It just won't help me much with my goals for all this, unless 
we figure out a way for that to co-exist with what I want to do, and 
I haven't figured that out yet.

In the meantime, I've got other pressing projects for OSAF that are 
mostly keeping me from doing *anything* related to generic functions, 
even the stuff I *want* to do.  OSAF does use simplegeneric in parts 
of Chandler, btw, but my current work doesn't relate to those parts.

I don't have the cycles at the moment for a PEP rewrite *and* 
implementing another generic function engine besides the five I've 
already written (and the sixth one that's in progress now).  The 
original plan for PEP 3124 was to port peak.rules.core to 3.0 after 
some feature additions, but the stripped-down design calls for a 
different implementation -- especially since peak.rules.core modifies 
functions in place.

(A minor irony: one of the reasons I did it that way instead of 
creating custom objects and then optimizing them with C, was to make 
it possible for PyPy and Psyco to optimize the code.  In other words, 
it was intended to *enhance* portability to other Python platforms, 
not inhibit it!)


From tjreedy at udel.edu  Thu Jul 19 19:15:30 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 19 Jul 2007 13:15:30 -0400
Subject: [Python-3000] pep 3124 plans
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<20070719084215.GA18244@crater.logilab.fr>
	<ca471dc20707190722w53fcb1d9te591db51035de487@mail.gmail.com>
	<20070719160724.62AE73A403A@sparrow.telecommunity.com>
Message-ID: <f7o67j$du5$1@sea.gmane.org>


"Phillip J. Eby" <pje at telecommunity.com> wrote in message 
news:20070719160724.62AE73A403A at sparrow.telecommunity.com...
By the way, I recently came across a use case for @around that I
hadn't mentioned before.  I'm in the process of re-implementing
RuleDispatch's expression features in PEAK-Rules, and as I was
defining the rules for intersecting logical conditions, it occurred
to me that you could define intersection in terms of
implication.  When intersecting conditions A and B, you can return A
if it implies B, or B if it implies A.

So I just wrote this (translated here to the PEP 3124 dialect):

@around(intersect)
def intersect_if_implies(c1:object, c2:object, nm:next_method):
     if implies(c1, c2):
         return c1
     elif implies(c2, c1):
         return c2
     return nm(c1, c2)

Because this method is @around, it is called before any ordinary
methods are called, even if they apply to more specific types than
'object'.  This means you only have to define intersection algorithms
to handle conditions that don't imply each other.  (Assuming of
course you've defined implies() relationships.)

When I realized I could do this, I was able to ditch a bunch of
duplicated code in the individual intersect() relationships I had,
and avoided having to write that code for the rest of the intersect()
methods I had left to write.

=====================================
As a side note: if you have either a negate() or disjoint(), you can also 
handle a 3rd of the 4 cases object-generically:
   elif disjoint(c1, c2): return <empty> # or
   elif implies(c1, negate(c2)): return <empty> # symmetrical with
   elif implies(c2, negate(c1)): return <empty>

and then the intersection algorithms can assume non-disjointness.
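
A small runnable sketch of that case analysis, under the assumption (made
only for this example) that a condition is modelled as a frozenset of the
values it allows, so that implies() is a subset test and disjoint() is
set disjointness. The `fallback` parameter is a hypothetical stand-in for
whatever type-specific intersection algorithm would otherwise run.

```python
EMPTY = frozenset()  # the condition that nothing satisfies


def implies(c1, c2):
    # Every value allowed by c1 is also allowed by c2.
    return c1 <= c2


def disjoint(c1, c2):
    # No value satisfies both conditions.
    return c1.isdisjoint(c2)


def intersect(c1, c2, fallback):
    """Handle the generic cases first; call fallback only for partial overlap."""
    if implies(c1, c2):
        return c1
    elif implies(c2, c1):
        return c2
    elif disjoint(c1, c2):
        return EMPTY
    # Remaining case: the conditions overlap, but neither contains the other.
    return fallback(c1, c2)
```

Only the last branch needs a specific algorithm, and that algorithm may
assume its arguments are neither nested nor disjoint.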

tjr


From guido at python.org  Thu Jul 19 20:32:14 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 19 Jul 2007 11:32:14 -0700
Subject: [Python-3000] Heaptypes
In-Reply-To: <469F0D92.7080301@v.loewis.de>
References: <f72o9f$v6i$1@sea.gmane.org>
	<ca471dc20707110715s4cd53401t53e9075bdc2ea1df@mail.gmail.com>
	<46979811.2050405@v.loewis.de>
	<ca471dc20707140708n413bfe9fwc6d223f50ff44573@mail.gmail.com>
	<469C4B0B.50605@v.loewis.de>
	<ca471dc20707181657o4ccfcc7eu94134972b0b78fb5@mail.gmail.com>
	<469EAD22.1040603@v.loewis.de>
	<ca471dc20707182001g241ef15cj5aacea9971e7d2b0@mail.gmail.com>
	<469F0D92.7080301@v.loewis.de>
Message-ID: <ca471dc20707191132j7837ec90w1971bca72dac282a@mail.gmail.com>

On 7/19/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> >> __reduce__ currently does (O(s#)) with (ob_type, ob_bytes, ob_size).
> >> Now, s# creates a Unicode object, and the pickling fails to round-trip
> >> correctly.
> >
> > I thought that before your patch a bytes object roundtripped correctly
> > with all three protocols. Or maybe it got broken when s# was changed?
>
> It did, and it got. s# used to return a str8, which then was pickled
> byte-for-byte. When s# started to return Unicode strings, bytes
> above 128 got widened to Py_UNICODE (which is what currently
> PyUnicode_FromString does), so b'\xFF' became bytes('\uFFFF').

Ouch!!! This turns out to be a bug in PyUnicode_FromStringAndSize()
due to signed characters. It can even cause a segfault:

Python 3.0x (py3k-struni, Jul 18 2007, 11:01:59)
[GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> b"\x80".__reduce__()
Segmentation fault

Fixed by applying Py_CHARMASK() to all occurrences of *u in that function.
Committed revision 56460.

> That got pickled and unpickled; then bytes('\uFFFF') is
> b'\xef\xbf\xbf' (because it applies the default encoding to
> the unicode argument), and it failed to roundtrip to b'\xFF'.
>
> It's actually not possible to generate b'\xFF' using
> a unicode string argument, as the default encoding will
> never return s'\xFF' (as that's not valid UTF-8).

But you can do it using bytes('\xff', 'latin-1'). I think that's a
reasonable thing for bytes.__reduce__() to return.

> > An additional requirement might be that if bytes are introduced in
> > 2.6, a pickle containing bytes written by 3.0 should be readable by
> > 2.6.
>
> Sure: whatever we decide now needs to be applied to 2.6 also.

Right.

> >> If __reduce__ returns a Unicode object, what encoding should be assumed?
> >> (which then needs to be symmetric with bytes())
> >>
> >> If __reduce__ returns a str8 object, you will have to keep str8 (or
> >> else you cannot pickle bytes).
> >
> > When __reduce__ returns a string at all, that means it's the name of a
> > global. I guess that should be encoded using UTF-8, so that as long as
> > the name is ASCII, 2.x can unpickle it. But I'm not sure if that's
> > what you were asking.
>
> No.
> py> b'foo'.__reduce__()
> (<type 'bytes'>, ('foo',))
> py> b'\xff'.__reduce__()
> (<type 'bytes'>, ('\uffff',))
>
> It returns one string each time, as the first element of a one-element
> tuple (that is then passed to the bytes() constructor on unpickling)

I see. It returns a tuple containing a string. I was confused. Sorry.
(But the \uffff is due to the bug above.)

> > Anyway, one reason this is such a mess is clearly that the pickle
> > protocol has no independent spec -- it's grown organically in code.
> > Reverse-engineering the intent of the code is a pain.
>
> That's also true, but I don't see it much as a problem here. If it
> had a spec, that spec would have said that b'S', b'T' and b'U'
> have a str payload. That spec would break if str8 goes away, and
> the spec would be changed to explain how these codes act in 2.x
> and 3.x. It would not talk at all about the bytes type, and that
> its __reduce__ might return different things in 2.x and 3.x
> (unless bytes gets a primitive code for pickle).

How about the following? It's not perfect, but it's the best I can
think of that doesn't break any pickles.

In 3.0, when an S, T or U pickle code is encountered, the returned
value is a Unicode string decoded from the bytes using Latin-1. This
means that all S, T or U pickle codes return Unicode objects. In
those cases where this was really meant to transfer binary data, the
application running under 3.0 can fix this by calling bytes(X,
'latin-1'). If it was meant to be UTF-8-encoded text, the app can call
str(Y, 'utf-8') after that.

But 3.0 should only *generate* the S, T or U pickle codes for str8
values (as long as that type exists) or for str values containing only
7-bit ASCII bytes; for all else it should use the unicode pickle
codes.

For bytes, I propose that b"ab\xff".__reduce__() return (bytes,
("ab\xff", "latin-1")).
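
The reason Latin-1 works here is that it maps each byte value 0..255 to
the code point with the same number, so any byte string survives a
decode/encode round trip. A short sketch of the proposed shape, written
against present-day Python 3 semantics; `reduce_bytes` and `rebuild` are
hypothetical helpers for illustration, not what bytes.__reduce__
actually returns in any released version.

```python
def reduce_bytes(data):
    """Hypothetical stand-in for the proposed bytes.__reduce__() result."""
    return (bytes, (data.decode("latin-1"), "latin-1"))


def rebuild(reduced):
    """Mimic what unpickling does with a (callable, args) pair."""
    constructor, args = reduced
    return constructor(*args)


every_byte = bytes(range(256))
as_text = every_byte.decode("latin-1")

# Latin-1 is the identity mapping between byte values and code points.
assert all(ord(ch) == byte for ch, byte in zip(as_text, every_byte))

# So the proposed reduce value rebuilds the exact original bytes.
assert rebuild(reduce_bytes(b"ab\xff")) == b"ab\xff"
assert rebuild(reduce_bytes(every_byte)) == every_byte

# UTF-8, by contrast, cannot produce a lone 0xFF byte from any string.
assert "\xff".encode("utf-8") == b"\xc3\xbf"
```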

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Thu Jul 19 22:26:35 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 19 Jul 2007 22:26:35 +0200
Subject: [Python-3000] Heaptypes
In-Reply-To: <ca471dc20707191132j7837ec90w1971bca72dac282a@mail.gmail.com>
References: <f72o9f$v6i$1@sea.gmane.org>	
	<ca471dc20707110715s4cd53401t53e9075bdc2ea1df@mail.gmail.com>	
	<46979811.2050405@v.loewis.de>	
	<ca471dc20707140708n413bfe9fwc6d223f50ff44573@mail.gmail.com>	
	<469C4B0B.50605@v.loewis.de>	
	<ca471dc20707181657o4ccfcc7eu94134972b0b78fb5@mail.gmail.com>	
	<469EAD22.1040603@v.loewis.de>	
	<ca471dc20707182001g241ef15cj5aacea9971e7d2b0@mail.gmail.com>	
	<469F0D92.7080301@v.loewis.de>
	<ca471dc20707191132j7837ec90w1971bca72dac282a@mail.gmail.com>
Message-ID: <469FC8FB.2050004@v.loewis.de>

> But you can do it using bytes('\xff', 'latin-1'). I think that's a
> reasonable thing for bytes.__reduce__() to return.

That's certainly a choice. Another choice is that bytes defaults to
latin-1, rather than the system default encoding. This is roughly
equivalent, and gives a slightly more compact pickle result.

> How about the following? It's not perfect, but it's the best I can
> think of that doesn't break any pickles.
> 
> In 3.0, when an S, T or U pickle code is encountered, the returned
> value is a Unicode string decoded from the bytes using Latin-1. This
> means that all S, T or U pickle codes return Unicode objects. In
> those cases where this was really meant to transfer binary data, the
> application running under 3.0 can fix this by calling bytes(X,
> 'latin-1'). If it was meant to be UTF-8-encoded text, the app can call
> str(Y, 'utf-8') after that.

It would actually have to be Y.encode('latin-1').decode('utf-8')
(assuming Y is what you get from unpickling):

py> str('\xc3\xb6', 'utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: decoding Unicode is not supported
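
A short sketch of that recovery step, written against present-day
Python 3 string semantics. The helper name `recover_utf8_text` and the
sample value are assumptions for illustration; Y stands for a string that
an unpickler produced by decoding UTF-8 bytes as Latin-1.

```python
def recover_utf8_text(unpickled):
    """Undo a Latin-1 decode, then decode the original bytes as UTF-8."""
    return unpickled.encode("latin-1").decode("utf-8")


# The two UTF-8 bytes of U+00F6, as they would appear after Latin-1 decoding.
Y = b"\xc3\xb6".decode("latin-1")
assert Y == "\xc3\xb6"

# Re-encoding with Latin-1 restores the bytes; UTF-8 then yields the text.
assert recover_utf8_text(Y) == "\xf6"

# Calling str() with an encoding on something that is already a string
# is rejected, which is why the extra encode step is needed.
try:
    str(Y, "utf-8")
except TypeError:
    rejected = True
else:
    rejected = False
assert rejected
```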

> But 3.0 should only *generate* the S, T or U pickle codes for str8
> values (as long as that type exists) or for str values containing only
> 7-bit ASCII bytes; for all else it should use the unicode pickle
> codes.

Sounds fine to me.

> For bytes, I propose that b"ab\xff".__reduce__() return (bytes,
> ("ab\xff", "latin-1")).

See above. Unless somebody objects, I'd rather make latin-1 the
default for bytes when a string is passed (I'm uncertain myself
of how much explicit is better than implicit here).

I'll look into implementing that strategy.

Regards,
Martin


From jonathan-lists at cleverdevil.org  Thu Jul 19 22:00:53 2007
From: jonathan-lists at cleverdevil.org (Jonathan LaCour)
Date: Thu, 19 Jul 2007 16:00:53 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <ca471dc20707190722w53fcb1d9te591db51035de487@mail.gmail.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<20070719084215.GA18244@crater.logilab.fr>
	<ca471dc20707190722w53fcb1d9te591db51035de487@mail.gmail.com>
Message-ID: <B46B3E00-F0B0-40BE-A59F-561487652FBE@cleverdevil.org>

Guido van Rossum wrote:

> FWIW, I think the Turbogears use you're thinking of is jsonify,
> a GF for converting arbitrary Python data into JSON (JavaScript
> Object Notation). But I'm not aware of it using any of the
> advanced features -- it seems to be using just the basic facility
> of overloading on a single argument type, which could be done
> with my own "overloading" example (see the Python subversion
> sandbox). At least that's what I got from skimming the docs:
> http://docs.turbogears.org/1.0/JsonifyDecorator . That article claims
> that TurboGears uses RuleDispatch extensively. I'd love to hear from
> them about how they use the advanced features.

There are several places in TurboGears that we use generic functions:

TurboJSON
---------
TurboGears controllers work by returning dictionaries, which are
then passed to template engines to generate and render responses.
TurboJSON is a Buffet-compatible template plugin that jsonifies data
that is returned from a TurboGears controller.  The jsonify function
is a generic function that is used to perform the serialization,
and is commonly extended to provide custom JSON serialization in a
cross-cutting way:

    # TurboGears Controller
    class PeopleController(controllers.Controller):

        @expose('json')
        def person(self, person_id):
            person = Person.get(person_id)
            return dict(person=person)


    # generic function for JSONifying Person objects
    @jsonify.when('isinstance(obj, Person)')
    def jsonify_person(obj):
        return dict(
            name=obj.name,
            age=obj.age,
            birthdate=obj.birthdate.strftime('%Y-%m-%d')
        )

I use this feature heavily, and find it to be easy to understand once
you get used to the concept of generic functions.

Of course, we don't restrict @jsonify.when() to isinstance checking.
I've seen production code which checks the value of an object before
jsonifying it, or which checks an attribute on the object to determine
how it should be rendered in the JSON.  For example, if one of our users
has a bunch of different contacts in a contact object, but she wants
different JSON for contacts who are also leads, she can use predicate
dispatch in the @jsonify.when decorator to do that...
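
To make the contacts-versus-leads case concrete, here is a minimal sketch of
predicate dispatch. It is not RuleDispatch: PredicateFunction, Contact and
is_lead are hypothetical names made up for illustration, predicates are plain
callables rather than expression strings, and the most recently registered
matching rule wins (RuleDispatch instead picks the most specific rule).

```python
class PredicateFunction:
    """Toy generic function; the last registered matching rule wins."""

    def __init__(self, default):
        self.default = default
        self.rules = []  # list of (predicate, implementation) pairs

    def when(self, predicate):
        # Register an implementation guarded by an arbitrary predicate.
        def decorator(func):
            self.rules.append((predicate, func))
            return func
        return decorator

    def __call__(self, obj):
        # Check newer rules first so a later, narrower rule can override.
        for predicate, func in reversed(self.rules):
            if predicate(obj):
                return func(obj)
        return self.default(obj)


class Contact:  # hypothetical model class
    def __init__(self, name, is_lead=False):
        self.name = name
        self.is_lead = is_lead


@PredicateFunction
def jsonify(obj):
    return {"value": str(obj)}


@jsonify.when(lambda obj: isinstance(obj, Contact))
def jsonify_contact(obj):
    return {"name": obj.name}


@jsonify.when(lambda obj: isinstance(obj, Contact) and obj.is_lead)
def jsonify_lead(obj):
    return {"name": obj.name, "lead": True}
```

The second rule inspects an attribute value, not just the type, which is the
part that plain single-argument type overloading cannot express.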


Picking a Template Engine
-------------------------

TurboGears supports a variety of templating engines in a cross-framework
way using a standard API called Buffet.  TurboGears controllers can
specify different templating engines and different templates for a
controller method if they so desire, and we use generic functions to
implement this on the backend so that you can register multiple template
options for rendering the same controller method.

    class Root(controllers.RootController):
        @expose(template='mako:path.to.mako.template.html')
        def get_mako(self):
            return dict(...)

        @expose("actionflow.templates.tasks")
        @expose("cheetah:actionflow.templates.tasktext",
                accept_format="text/plain")
        @expose("kid:actionflow.templates.taskfeed",
               accept_format="rss")
        @expose("json", accept_format = "text/javascript",
               as_format="json")
        def task(self):
            return dict(...)

Rule dispatch gets used to check what format is requested (either in the
headers, or explicitly via a tg_format parameter) and calls the correct
rendering function in the correct way to turn the dict that's returned
into what the client asked for.  We're going to be improving this and
making it even more powerful in TurboGears 2.0.
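
The format selection described above can be sketched roughly as follows. This
is only an illustration of the idea, not TurboGears code: RENDERERS, renders()
and render() are hypothetical names, and the lookup is a plain dictionary
keyed on the requested format rather than a rule-dispatch predicate.

```python
import json

RENDERERS = {}  # maps a requested format to a rendering function


def renders(accept_format):
    # Register a rendering function for one requested format.
    def decorator(func):
        RENDERERS[accept_format] = func
        return func
    return decorator


@renders("text/html")
def render_html(data):
    items = "".join(f"<li>{k}: {v}</li>" for k, v in sorted(data.items()))
    return f"<ul>{items}</ul>"


@renders("text/javascript")
def render_json(data):
    return json.dumps(data, sort_keys=True)


def render(data, accept_format="text/html"):
    # Turn the dict returned by a controller into what the client asked for.
    try:
        renderer = RENDERERS[accept_format]
    except KeyError:
        raise ValueError(f"no renderer registered for {accept_format!r}")
    return renderer(data)
```

A controller would keep returning a dict, and the caller of render() would
pass along whatever format was found in the headers or request parameters.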


Validation and Error Handling
-----------------------------

TurboGears has a built-in framework for validating parameters that are
passed in over HTTP.  This integrates with an underlying widget system
which can be used to generate forms, called ToscaWidgets, that you can
use to validate against.  You can find good documentation and examples
here: http://docs.turbogears.org/1.0/ErrorHandling

Here is an example:

    import turbogears
    from turbogears import controllers, expose, validate, redirect
    from turbogears import exception_handler

    class Root(controllers.RootController):
        def vh(self, tg_exceptions=None):
            return dict(
                handling_value=True,
                exception=str(tg_exceptions)
            )

        def ih(self, tg_exceptions=None):
            return dict(
                handling_index=True,
                exception=str(tg_exceptions)
            )

        @expose()
        @exception_handler(vh, "isinstance(tg_exceptions, ValueError)")
        @exception_handler(ih, "isinstance(tg_exceptions, IndexError)")
        def exceptional(self, number=2):
            number = int(number)
            if number < 42:
                raise IndexError("Number too Low!")
            if number == 42:
                raise IndexError("Wise guy, eh?")
            if number > 100:
                raise Exception("This number is exceptionally high!")
            return dict(result="No errors!")


Lots of users are currently making use of this functionality in
TurboGears, and it seems to be fairly well received.  And again, you
can use predicate dispatch to register different error_handlers for
different kinds of errors.
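
For readers without TurboGears at hand, the shape of that mechanism can be
approximated in plain Python. This is a hedged sketch, not the TurboGears
implementation: the exception_handler below is a stand-in that takes an
exception class instead of a predicate string, and on_value_error and
on_index_error are hypothetical handlers.

```python
import functools


def exception_handler(handler, exc_type):
    # Route exceptions of exc_type raised by the wrapped function to handler.
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except exc_type as exc:
                return handler(exc)
        return wrapper
    return decorator


def on_value_error(exc):
    return dict(handling_value=True, exception=str(exc))


def on_index_error(exc):
    return dict(handling_index=True, exception=str(exc))


@exception_handler(on_value_error, ValueError)
@exception_handler(on_index_error, IndexError)
def exceptional(number=2):
    number = int(number)  # raises ValueError for non-numeric input
    if number < 42:
        raise IndexError("Number too Low!")
    return dict(result="No errors!")
```

Each decorator only catches its own exception type, so a ValueError from
int() passes through the IndexError wrapper and reaches the outer handler.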

I for one, as a committer on TurboGears, would absolutely love to see
a good, solid generic function capability integrated into the standard
library, and find PEP 3124 to completely cover my needs.  There are
certainly things in the PEP that I do not have a use for, but nothing in
the PEP seems to be much of a stretch to me.

Just my 2 cents (or maybe 50 cents...)

--
Jonathan LaCour
http://cleverdevil.org

From pje at telecommunity.com  Thu Jul 19 23:08:04 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 19 Jul 2007 17:08:04 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <B46B3E00-F0B0-40BE-A59F-561487652FBE@cleverdevil.org>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<20070719084215.GA18244@crater.logilab.fr>
	<ca471dc20707190722w53fcb1d9te591db51035de487@mail.gmail.com>
	<B46B3E00-F0B0-40BE-A59F-561487652FBE@cleverdevil.org>
Message-ID: <20070719210547.8D5BA3A403A@sparrow.telecommunity.com>

At 04:00 PM 7/19/2007 -0400, Jonathan LaCour wrote:
>I for one, as a committer on TurboGears, would absolutely love to see
>a good, solid generic function capability integrated into the standard
>library, and find PEP 3124 to completely cover my needs.  There are
>certainly things in the PEP that I do not have a use for, but nothing in
>the PEP seems to be much of a stretch to me.

FYI, Jonathan, the version of PEAK-Rules that's in SVN implements 
everything that's currently in PEP 3124 except the Interface bits.

It does not, however, implement RuleDispatch-style predicate 
expressions, just argument-isinstance tests.  I'd hoped to have 
predicates done this month, but it's running a couple weeks 
behind.  After it's done, I plan to throw together a 
RuleDispatch-style API over it, to make porting/testing easier, using 
something like "from peak.rules import dispatch" to get a module that 
fakes the RuleDispatch API (e.g. somefunc.when() instead of when(somefunc)).


From guido at python.org  Fri Jul 20 00:25:07 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 19 Jul 2007 15:25:07 -0700
Subject: [Python-3000] Heaptypes
In-Reply-To: <469FC8FB.2050004@v.loewis.de>
References: <f72o9f$v6i$1@sea.gmane.org> <46979811.2050405@v.loewis.de>
	<ca471dc20707140708n413bfe9fwc6d223f50ff44573@mail.gmail.com>
	<469C4B0B.50605@v.loewis.de>
	<ca471dc20707181657o4ccfcc7eu94134972b0b78fb5@mail.gmail.com>
	<469EAD22.1040603@v.loewis.de>
	<ca471dc20707182001g241ef15cj5aacea9971e7d2b0@mail.gmail.com>
	<469F0D92.7080301@v.loewis.de>
	<ca471dc20707191132j7837ec90w1971bca72dac282a@mail.gmail.com>
	<469FC8FB.2050004@v.loewis.de>
Message-ID: <ca471dc20707191525m5161b04x828e60efd17f6ffb@mail.gmail.com>

On 7/19/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> > But you can do it using bytes('\xff', 'latin-1'). I think that's a
> > reasonable thing for bytes.__reduce__() to return.
>
> That's certainly a choice. Another choice is that bytes defaults to
> latin-1, rather than the system default encoding. This is roughly
> equivalent, and gives a slightly more compact pickle result.

I don't like bytes defaulting to anything at all; that they currently
do is a transitional issue in the branch. Java used to have a default
of Latin-1 for converting bytes <--> string and it was considered a
mistake AFAIK.

I've implemented the explicit latin-1 version for now; we can change this later.

> > How about the following. it's not perfect but it's the best I can
> > think of that doesn't break any pickles.
> >
> > In 3.0, when an S, T or U pickle code is encountered, the returned
> > value is a Unicode string decoded from the bytes using Latin-1. This
> > means that all S, T or U pickle codes return Unicode objects. In
> > those cases where this was really meant to transfer binary data, the
> > application running under 3.0 can fix this by calling bytes(X,
> > 'latin-1'). If it was meant to be UTF-8-encoded text, the app can call
> > str(Y, 'utf-8') after that.
>
> It would actually have to be Y.encode('latin-1').decode('utf-8')
> (assuming Y is what you get from unpickling):

That's another way of saying it. I meant for Y to be the result of
bytes(X, 'latin-1') but that was non-obvious. Anyway I think we're in
agreement here. :-)
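
The round trip being discussed can be spelled out as a short sketch. It uses
only str/bytes calls as they exist in current Python 3; the variable names
x and y follow the X and Y above, and the sample text is an arbitrary choice.

```python
# Bytes that were really UTF-8 text inside an old 8-bit string pickle.
raw = "\u00f6".encode("utf-8")            # b'\xc3\xb6'

# Proposed unpickling rule: decode S/T/U payloads as Latin-1.
x = raw.decode("latin-1")                 # two characters, one per byte

# The application recovers the binary data...
y = bytes(x, "latin-1")
# ...and, if it was UTF-8 text all along, decodes it properly.
text = str(y, "utf-8")

# Martin's one-step spelling of the same repair.
same_text = x.encode("latin-1").decode("utf-8")

# Latin-1 maps every byte value to one code point, so nothing is lost.
all_bytes = bytes(range(256))
round_trip = all_bytes.decode("latin-1").encode("latin-1")
```

The last two lines show why Latin-1 is the safe choice here: every possible
byte string survives the detour through a Unicode string unchanged.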

> py> str('\xc3\xb6', 'utf-8')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: decoding Unicode is not supported
>
> > But 3.0 should only *generate* the S, T or U pickle codes for str8
> > values (as long as that type exists) or for str values containing only
> > 7-bit ASCII bytes; for all else it should use the unicode pickle
> > codes.
>
> Sounds fine to me.
>
> > For bytes, I propose that b"ab\xff".__reduce__() return (bytes,
> > ("ab\xff", "latin-1")).
>
> See above. Unless somebody objects, I'd rather make latin-1 the
> default for bytes when a string is passed (I'm uncertain myself
> of how much explicit is better than implicit here).

See above.

> I'll look into implementing that strategy.

How about instead you help with fixing pickling of datetime objects?
This broke when I fixed test_pickle. Rolling back your changes to
datetime pickling didn't seem to help.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Fri Jul 20 01:58:30 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 19 Jul 2007 16:58:30 -0700
Subject: [Python-3000] Heaptypes
In-Reply-To: <ca471dc20707191525m5161b04x828e60efd17f6ffb@mail.gmail.com>
References: <f72o9f$v6i$1@sea.gmane.org>
	<ca471dc20707140708n413bfe9fwc6d223f50ff44573@mail.gmail.com>
	<469C4B0B.50605@v.loewis.de>
	<ca471dc20707181657o4ccfcc7eu94134972b0b78fb5@mail.gmail.com>
	<469EAD22.1040603@v.loewis.de>
	<ca471dc20707182001g241ef15cj5aacea9971e7d2b0@mail.gmail.com>
	<469F0D92.7080301@v.loewis.de>
	<ca471dc20707191132j7837ec90w1971bca72dac282a@mail.gmail.com>
	<469FC8FB.2050004@v.loewis.de>
	<ca471dc20707191525m5161b04x828e60efd17f6ffb@mail.gmail.com>
Message-ID: <ca471dc20707191658s14d86b52x24b3a12524d9a97b@mail.gmail.com>

On 7/19/07, Guido van Rossum <guido at python.org> wrote:
> How about instead you help with fixing pickling of datetime objects?
> This broke when I fixed test_pickle. Rolling back your changes to
> datetime pickling didn't seem to help.

Never mind; this was shallow -- cPickle doesn't pickle bytes
correctly. I've decided to get rid of cPickle -- someone is writing a
replacement for the summer of code anyway. The new approach will be
that you always write "import pickle" and this transparently attempts
to use the C accelerator if it can be imported, like heapq.py and
_heapq.c.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Fri Jul 20 04:37:53 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 19 Jul 2007 22:37:53 -0400
Subject: [Python-3000] Fwd: Re:  pep 3124 plans
Message-ID: <20070720023537.681DA3A403A@sparrow.telecommunity.com>

FYI...  another TurboGears developer speaks up re: their generic function use.

>Date: Thu, 19 Jul 2007 20:17:23 -0400
>From: "Mark Ramm"
>To: "Phillip J. Eby"
>Subject: Re: [Python-3000] pep 3124 plans
>
>>FYI, Jonathan, the version of PEAK-Rules that's in SVN implements
>>everything that's currently in PEP 3124 except the Interface bits.
>>
>>It does not, however, implement RuleDispatch-style predicate
>>expressions, just argument-isinstance tests.  I'd hoped to have
>>predicates done this month, but it's running a couple weeks
>>behind.  After it's done, I plan to throw together a
>>RuleDispatch-style API over it, to make porting/testing easier, using
>>something like "from peak.rules import dispatch" to get a module that
>>fakes the RuleDispatch API (e.g. somefunc.when() instead of when(somefunc)).
>
>This is good news indeed.   TurboGears 2 is looking for rule based
>dispatch, and I'm very interested in PEAK Rules as an alternative to
>RD since you've pretty much deprecated RD.   But an RD like interface
>on PEAK-Rules will make TG2 more API compatible, and opens up the
>possibility of moving  over in the tg 1.x line.
>
>Predicate dispatch isn't really needed for some of the things in
>TurboGears, and there are a couple of places where we went overboard
>with generic functions everywhere.   But, at the same time there are
>other places where generic functions and predicate dispatch really
>make things a lot easier to understand, and it would hurt quite a bit
>to have to give it up.
>
>As the maintainer of tg2, my main interest is to have a viable,
>reasonably well supported, generic function implementation that we can
>use and rely on.
>
>I don't so much care that it's baked into the core language, or
>included in the standard library -- though I think those would be
>great things.  Generic functions helped me to think about problems in
>a new way, and have been a remarkably useful tool to have in my
>toolbox.
>
>--Mark Ramm


From unknown_kev_cat at hotmail.com  Fri Jul 20 07:19:09 2007
From: unknown_kev_cat at hotmail.com (Joe Smith)
Date: Fri, 20 Jul 2007 01:19:09 -0400
Subject: [Python-3000] pep 3124 plans
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.co m>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
Message-ID: <f7pgki$6o3$1@sea.gmane.org>


So the state of the PEP? From the rest of the posts so far,
it sounds like there is no real objection to the basic end user API as 
described in the PEP,
except for the case of retroactive generification, which GvR wants made 
explicit in the user's code, AIUI.

But there are concerns about the implementation. Overriding inside classes 
would need a new implementation, but at the moment you're not sure how to 
implement that. Also your current bootstrapping system requires in-place 
modification of some functions. You think using a third type of function could 
perhaps fix that if no cleaner solution appears, correct?

Also what has happened with the Interfaces/Adaptation/Aspects part of the 
document? How does that mesh with the ABCs?
After all, adaptable interfaces and ABCs have such similar use cases that users 
may not be sure which to use.
Or has that part been deferred for now, as the GF and method combination part 
is not dependent on those?


From unknown_kev_cat at hotmail.com  Fri Jul 20 08:20:55 2007
From: unknown_kev_cat at hotmail.com (Joe Smith)
Date: Fri, 20 Jul 2007 02:20:55 -0400
Subject: [Python-3000] PEP 368: Standard image protocol and class
References: <cc93256f0706301518kd9fe7a7iaf0e9bd8e2e18edd@mail.gmail.com><cc93256f0706301800m20012379n84aff4ff3df88021@mail.gmail.com><740c3aec0707010534j4049efbchb2389bf61413c300@mail.gmail.com>
	<cc93256f0707010959o44c77912sb989c68cf890b846@mail.gmail.com>
Message-ID: <f7pk8b$evu$1@sea.gmane.org>


"Lino Mastrodomenico" <l.mastrodomenico at gmail.com> wrote in message 
news:cc93256f0707010959o44c77912sb989c68cf890b846 at mail.gmail.com...
>2007/7/1, BJörn Lindqvist <bjourne at gmail.com>:
>> But I cannot see how it would solve the problem with too many image
>> classes. The reason why PIL, PyGame and wxPython has different image
>> classes is because each of them use different C functions for
>> manipulating said image classes. These differences bubble up through
>> the bindings and results in PIL exposing an Image, PyGame a Surface
>> and wxPython a wxImage. The result is that if you want to use a PIL
>> Image in say PyGame, you  still need to convert it.
>
>Actually, this is not always true. :-)
>
>For example it's entirely possible to have the *same* python RGBA
>image considered as a SDL_Surface by SDL (the underlying library used
>by pygame), as an ImagingMemoryInstance by the PIL C library and have
>its buffer directly accepted by the OpenGL function glTexImage2D (with
>a bit of care in the order of the corners passed to glTexCoord2f),
>independently of who created the image in the first place.
>
>This works because most C/C++ libraries give the possibility of
>creating a native image struct/class using an existing memory buffer
>(without copying it) and they support at least a subset of the modes
>currently defined, with the exact byte order, padding, etc, specified
>in the PEP (usually L and at least one of RGB or RGBA).
>
>But you are right, the particular format specified in the PEP is not
>always supported by the existing libraries, even when they support
>that particular mode. Sometimes this can be fixed (e.g. PIL currently
>uses by default 4 bytes per pixel for RGB images and has only
>experimental support for 3 bytes per pixel, but its C library is
>written by the same people that maintain the Python bindings, so they
>can change it if they want) and sometimes it cannot be easily fixed
>(e.g. a wxImage class will happily accept a RGB buffer as defined by
>the PEP, but it has a funny memory arrangement for RGBA images that is
>completely incompatible).
>
>So I expect that each Python library that jumps on the PEP bandwagon
>will have three levels of support for the modes listed:
>
>  1) no support at all (e.g. most 3D libraries will probably never
>accept CMYK images as textures); the user can explicitly convert the
>image using "new_image = Image(new_mode, source=old_image)";
>
>  2) limited support: they support a particular mode, but cannot
>directly use the standard memory arrangement, so when they receive an
>alien image object they convert it on the fly to their preferred byte
>order and they do the reverse operation when a foreign library tries
>to access the buffer property of their images (they may offer a
>read-only buffer); this is not ideal, but it's better than the current
>situation because it's transparent to the user and it requires only a
>single memory copy/conversion instead of the two usually performed by
>the current tostring/fromstring dance;
>
>  3) full support: no conversion or memory copy ever necessary for the
>exchange of images between two libraries if they both have full
>support for a particular mode. Of course the Image class that I'm
>writing and that I hope will be included in the stdlib, will have full
>support for all the modes.
>
>Please note that the conversions in "2)" above can be avoided in some
>(most?) cases if PEP 3118 is accepted, because it will become possible
>to expose and discover the "native" memory arrangement of an image
>without accessing its buffer property (that, in my vision, will always
>offer the "standard" arrangement defined in the PEP, to simplify
>things for libraries that prefer a simpler interface, even if it may
>be slightly less efficient in some, hopefully rare, cases).
>

If the maintainers of most of the large packages that do imaging are willing 
to support this,
and your code is good, I see absolutely no reason why this PEP would not be 
accepted.

It appears you worked hard to make sure that it would be possible for
the existing libraries to use the Image protocol without too much work.
(Unless they need to use "support level 2" as you described above, for some 
modes. That would add some extra work).

Will you provide an abstract base class for Image Protocol implementations to 
inherit from? (The ImageMixin could inherit from that class, just not 
providing implementations of info, buffer, mode, and size. [Hmm. If any of 
those were functions then that would prevent somebody from directly 
instantiating ImageMixin, which would be a good thing, as it was really only 
intended to be used as a base class as far as I can tell.])
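
To illustrate what I mean, here is a rough sketch using the abc module as it
exists in current Python. Only the four attribute names (mode, size, buffer,
info) come from the PEP; ImageProtocolBase and TinyImage are hypothetical
names, and nothing here is taken from the PEP's actual implementation.

```python
from abc import ABC, abstractmethod


class ImageProtocolBase(ABC):
    """Hypothetical base class: declares the protocol, implements nothing."""

    @property
    @abstractmethod
    def mode(self): ...

    @property
    @abstractmethod
    def size(self): ...

    @property
    @abstractmethod
    def buffer(self): ...

    @property
    @abstractmethod
    def info(self): ...


class TinyImage(ImageProtocolBase):
    """Smallest possible concrete class satisfying the base class."""

    def __init__(self, mode, size, data):
        self._mode = mode
        self._size = size
        self._buffer = bytearray(data)
        self._info = {}

    mode = property(lambda self: self._mode)
    size = property(lambda self: self._size)
    buffer = property(lambda self: self._buffer)
    info = property(lambda self: self._info)
```

With the four attributes declared abstract, the base class cannot be
instantiated directly, while any class that supplies all four can.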

Will the simple "Image" class have no extra functionality beyond the 
protocol's minimum requirements and the stated resizing/mode-changing 
constructors?

If an image-protocol object is passed to the Image-constructor requesting a 
mode conversion or resizing, but is already in the requested mode/size what 
happens? Is the underlying image data duplicated? Or does the new instance 
basically point to the old data?


From jcarlson at uci.edu  Fri Jul 20 10:18:01 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 20 Jul 2007 01:18:01 -0700
Subject: [Python-3000] _heapq.c, etc. (was Re:  Heaptypes)
In-Reply-To: <ca471dc20707191658s14d86b52x24b3a12524d9a97b@mail.gmail.com>
References: <ca471dc20707191525m5161b04x828e60efd17f6ffb@mail.gmail.com>
	<ca471dc20707191658s14d86b52x24b3a12524d9a97b@mail.gmail.com>
Message-ID: <20070720010804.85A7.JCARLSON@uci.edu>


"Guido van Rossum" <guido at python.org> wrote:
> On 7/19/07, Guido van Rossum <guido at python.org> wrote:
> > How about instead you help with fixing pickling of datetime objects?
> > This broke when I fixed test_pickle. Rolling back your changes to
> > datetime pickling didn't seem to help.
> 
> Never mind; this was shallow -- cPickle doesn't pickle bytes
> correctly. I've decided to get rid of cPickle -- someone is writing a
> replacement for the summer of code anyway. The new approach will be
> that you always write "import pickle" and this transparently attempts
> to use the C accelerator if it can be imported, like heapq.py and
> _heapq.c.

On a related note, since I had been supporting only Python 2.3 for quite
a while, I didn't notice the fact that Python's _heapq.c (in 2.4 at
least, I haven't tested on 2.5) only supported lists as containers, and
not a list-like object with all methods that heapq calls (which was an
issue for a pure-Python pair heap implementation I posted last December
or so).

What made it really annoying is that there was no way to tell the heapq
module not to load the C version so that I could use a generic container. 
I ended up just commenting out the C module heapq import and moving on.

I don't know if we want to make it possible to disable the loading of
certain C modules that *don't* offer all of the same features, or if we
want to limit the Python versions to what the C versions support, or
even if we want to expand the C versions to handle all cases that the
Python versions support.  While the pickle/cPickle, StringIO/cStringIO,
etc., naming can be a bit annoying, it does give me the choice whether I
want it to be fast or flexible.
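
To show what "flexible" means here, this is a sketch of heap operations that
rely only on append(), pop(), len() and item access, so they work on any
list-like container. The function names are hypothetical, and UserList just
stands in for a custom container; as far as I know the C-accelerated
heapq.heappush in CPython rejects such an object with a TypeError.

```python
from collections import UserList


def heappush_generic(heap, item):
    # Append, then sift the new item up toward the root.
    heap.append(item)
    pos = len(heap) - 1
    while pos > 0:
        parent = (pos - 1) // 2
        if heap[pos] < heap[parent]:
            heap[pos], heap[parent] = heap[parent], heap[pos]
            pos = parent
        else:
            break


def heappop_generic(heap):
    # Remove the smallest item, then sift the moved item down.
    last = heap.pop()
    if len(heap) == 0:
        return last
    smallest, heap[0] = heap[0], last
    pos, size = 0, len(heap)
    while True:
        child = 2 * pos + 1
        if child >= size:
            break
        if child + 1 < size and heap[child + 1] < heap[child]:
            child += 1
        if heap[child] < heap[pos]:
            heap[pos], heap[child] = heap[child], heap[pos]
            pos = child
        else:
            break
    return smallest


heap = UserList()  # a list-like object that is not a real list
for value in [5, 1, 4, 2, 3]:
    heappush_generic(heap, value)
drained = [heappop_generic(heap) for _ in range(5)]
```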

 - Josiah


From guido at python.org  Fri Jul 20 16:44:09 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 20 Jul 2007 07:44:09 -0700
Subject: [Python-3000] _heapq.c, etc. (was Re:  Heaptypes)
In-Reply-To: <20070720010804.85A7.JCARLSON@uci.edu>
References: <ca471dc20707191525m5161b04x828e60efd17f6ffb@mail.gmail.com>
	<ca471dc20707191658s14d86b52x24b3a12524d9a97b@mail.gmail.com>
	<20070720010804.85A7.JCARLSON@uci.edu>
Message-ID: <ca471dc20707200744r4a8efc1an444d7f4f894ff23a@mail.gmail.com>

On 7/20/07, Josiah Carlson <jcarlson at uci.edu> wrote:
>
> "Guido van Rossum" <guido at python.org> wrote:
> > On 7/19/07, Guido van Rossum <guido at python.org> wrote:
> > > How about instead you help with fixing pickling of datetime objects?
> > > This broke when I fixed test_pickle. Rolling back your changes to
> > > datetime pickling didn't seem to help.
> >
> > Never mind; this was shallow -- cPickle doesn't pickle bytes
> > correctly. I've decided to get rid of cPickle -- someone is writing a
> > replacement for the summer of code anyway. The new approach will be
> > that you always write "import pickle" and this transparently attempts
> > to use the C accelerator if it can be imported, like heapq.py and
> > _heapq.c.
>
> On a related note, since I had been supporting only Python 2.3 for quite
> a while, I didn't notice the fact that Python's _heapq.c (in 2.4 at
> least, I haven't tested on 2.5) only supported lists as containers, and
> not a list-like object with all methods that heapq calls (which was an
> issue for a pure-Python pair heap implementation I posted last December
> or so).
>
> What made it really annoying is that there was no way to tell the heapq
> module not to load the C version so that I could use a generic container.
> I ended up just commenting out the C module heapq import and moving on.
>
> I don't know if we want to make it possible to disable the loading of
> certain C modules that *don't* offer all of the same features, or if we
> want to limit the Python versions to what the C versions support, or
> even if we want to expand the C versions to handle all cases that the
> Python versions support.  While the pickle/cPickle, StringIO/cStringIO,
> etc., naming can be a bit annoying, it does give me the choice whether I
> want it to be fast or flexible.

This was an example of a performance improvement that changed the
specs of an API in an incompatible way. Breaking your code was an
unintended side effect of the speedup.

We're going to do a few more of these in Py3k, and this time breaking
the specs is the name of the game. I think going forward (post 3.0) we
should be more careful to write specs that can easily be optimized
without breaking existing usage, or writing speedups that can handle
all the argument types that the original code supported.

I definitely *don't* want to continue the old habit of having a slow
and a fast module with different names; the experience with especially
cPickle and cStringIO is that everyone believes their code is
performance critical and hence uses the C version if it exists,
thereby repeating the same idiom over and over.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Fri Jul 20 16:49:12 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 20 Jul 2007 07:49:12 -0700
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <f7pgki$6o3$1@sea.gmane.org>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
Message-ID: <ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>

On 7/19/07, Joe Smith <unknown_kev_cat at hotmail.com> wrote:
> So the state of the PEP? From the rest of the posts so far,
> it sounds like there is no real objection to the basic end user API as
> described in the PEP,

Actually I want to reserve judgment on that until the PEP is rewritten
to explain and document the underlying mechanisms. It is currently
impossible (for me, anyway) to understand how the machinery to support
the described features could be built. Without that I cannot approve
the PEP. Phillip knows this but is too busy to work on it.

> except for the case of retroactive generification, which GvR wants made
> explicit in the user's code, AIUI.
>
> But there are concerns about the implementation. Overriding inside classes
> would need a new implementation, but at the moment you're not sure how to
> implement that. Also your current bootstrapping system requires in-place
> modification of some functions. You think using a third type of function could
> perhaps fix that if no cleaner solution appears, correct?
>
> Also what has happened with the Interfaces/Adaptation/Aspects part of the
> document? How does that mesh with the ABCs?
> After all, adaptable interfaces and ABCs have such similar use cases that users
> may not be sure which to use.
> Or has that part been deferred for now, as the GF and method combination part
> is not dependent on those?

AFAIK Phillip has declared that his implementation only uses (or could
be made to only use) isinstance()/issubclass(), and the overriding of
these two used by the ABCs is actually very convenient for the GF PEP.
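
A small sketch of why that combination is convenient. It uses
functools.singledispatch from current Python as a stand-in for a generic
function, not the PEP 3124 machinery; Serializable, Invoice and describe are
hypothetical names.

```python
from abc import ABC
from functools import singledispatch


class Serializable(ABC):
    """Marker ABC; classes opt in by registration, not inheritance."""


class Invoice:  # does not inherit from Serializable
    def __init__(self, total):
        self.total = total


# Registration makes isinstance()/issubclass() answer True for Invoice.
Serializable.register(Invoice)


@singledispatch
def describe(obj):
    return "plain object"


@describe.register(Serializable)
def _describe_serializable(obj):
    return "serializable"
```

Because dispatch is decided with issubclass(), the overload written against
the ABC also applies to classes that were only registered with it.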

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From alexandre at peadrop.com  Fri Jul 20 18:21:56 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Fri, 20 Jul 2007 12:21:56 -0400
Subject: [Python-3000] StringIO/BytesIO in io.py doesn't over-seek
	properly
In-Reply-To: <ca471dc20707181336n294fc353vc4eefc82854a8759@mail.gmail.com>
References: <acd65fa20706230853w32f8895g91b7715c456900b7@mail.gmail.com>
	<acd65fa20706231124q4e5d5192kdc5694d52175e660@mail.gmail.com>
	<ca471dc20706231148p7cbb9953tb31099dfe68c9a32@mail.gmail.com>
	<acd65fa20706251114u60bae701ve95a84ffee27e0b2@mail.gmail.com>
	<acd65fa20706280737n54b8dea8l5362b8545c990236@mail.gmail.com>
	<acd65fa20707021046o4349aafdxd7b895f502edd32@mail.gmail.com>
	<ca471dc20707021138o3392bc11u9a9be3f1a6f4dda1@mail.gmail.com>
	<acd65fa20707030806i60b0e77dm71b394f279e2172c@mail.gmail.com>
	<acd65fa20707181332n480bf6fsa7bff17403770786@mail.gmail.com>
	<ca471dc20707181336n294fc353vc4eefc82854a8759@mail.gmail.com>
Message-ID: <acd65fa20707200921r55a12be8n908de344e4327607@mail.gmail.com>

How is this different from setting the position to the new size? What
should happen when someone calls truncate() with an argument greater
than the current size? Should it do a seek, or nothing?
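
For reference, the two behaviours from the quoted example can be compared
with a short sketch. It assumes io.StringIO as found in current Python 3
releases, where truncate() returns the new size and leaves the stream
position alone; the explicit seek() emulates the proposed semantics.

```python
import io

# Position is left where it was (at 0), so the write overwrites from the start.
s = io.StringIO("Good bye, world")
s.truncate(10)
s.write("cruel world")
position_kept = s.getvalue()

# Proposed semantics, emulated: move to the new size after truncating.
s2 = io.StringIO("Good bye, world")
s2.seek(s2.truncate(10))
s2.write("cruel world")
position_moved = s2.getvalue()
```

The first variant ends up with just "cruel world"; the second produces the
"Good bye, cruel world" that the example was hoping for.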

Thanks,
-- Alexandre

On 7/18/07, Guido van Rossum <guido at python.org> wrote:
> Unless anyone cares, it should imply a seek to the indicated position
> if an argument was present.
>
> On 7/18/07, Alexandre Vassalotti <alexandre at peadrop.com> wrote:
> > So, any decision on the proposed semantic change of truncate?
> >
> > On 7/3/07, Alexandre Vassalotti <alexandre at peadrop.com> wrote:
> > > On 7/2/07, Guido van Rossum <guido at python.org> wrote:
> > > > Honestly, I think truncate() should always set the current position to
> > > > the new size, even though that's not what it currently does.
> > >
> > > Thought about that and I think that would be the best thing to do.
> > > That would avoid making StringIO unnecessarily different from BytesIO.
> > > And IMHO, it is less prone to bugs. If someone wants to truncate while
> > > keeping the current position, then he will have to state his intention
> > > explicitly by saving the value of tell() and calling seek() after
> > > truncating.
> > >
> > > I also find the semantics make more sense. For example:
> > >
> > >    >>> s = StringIO("Good bye, world")
> > >    >>> s.truncate(10)
> > >    >>> s.write("cruel world")
> > >    >>> s.getvalue()
> > >    ???
> > >
> > > I think that should return "Good bye, cruel world", not "cruel world".
> > >
> > > So, does anyone else agree with this small semantic change of truncate()?
> > >

From guido at python.org  Fri Jul 20 18:51:08 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 20 Jul 2007 09:51:08 -0700
Subject: [Python-3000] StringIO/BytesIO in io.py doesn't over-seek
	properly
In-Reply-To: <acd65fa20707200921r55a12be8n908de344e4327607@mail.gmail.com>
References: <acd65fa20706230853w32f8895g91b7715c456900b7@mail.gmail.com>
	<ca471dc20706231148p7cbb9953tb31099dfe68c9a32@mail.gmail.com>
	<acd65fa20706251114u60bae701ve95a84ffee27e0b2@mail.gmail.com>
	<acd65fa20706280737n54b8dea8l5362b8545c990236@mail.gmail.com>
	<acd65fa20707021046o4349aafdxd7b895f502edd32@mail.gmail.com>
	<ca471dc20707021138o3392bc11u9a9be3f1a6f4dda1@mail.gmail.com>
	<acd65fa20707030806i60b0e77dm71b394f279e2172c@mail.gmail.com>
	<acd65fa20707181332n480bf6fsa7bff17403770786@mail.gmail.com>
	<ca471dc20707181336n294fc353vc4eefc82854a8759@mail.gmail.com>
	<acd65fa20707200921r55a12be8n908de344e4327607@mail.gmail.com>
Message-ID: <ca471dc20707200951s9989585pfd1fe19d43f6beec@mail.gmail.com>

They shouldn't, really, and I don't care too much about what happens
in that case. It may depend on whether the I/O device honors seeks
beyond EOF or not.

On 7/20/07, Alexandre Vassalotti <alexandre at peadrop.com> wrote:
> How is this different from setting the position to the new size? What
> should happen when someone calls truncate() with an argument greater
> than the current size? Should it do a seek, or nothing?
>
> Thanks,
> -- Alexandre
>
> On 7/18/07, Guido van Rossum <guido at python.org> wrote:
> > Unless anyone cares, it should imply a seek to the indicated position
> > if an argument was present.
> >
> > On 7/18/07, Alexandre Vassalotti <alexandre at peadrop.com> wrote:
> > > So, any decision on the proposed semantic change of truncate?
> > >
> > > On 7/3/07, Alexandre Vassalotti <alexandre at peadrop.com> wrote:
> > > > On 7/2/07, Guido van Rossum <guido at python.org> wrote:
> > > > > Honestly, I think truncate() should always set the current position to
> > > > > the new size, even though that's not what it currently does.
> > > >
> > > > Thought about that and I think that would be the best thing to do.
> > > > That would avoid making StringIO unnecessarily different from BytesIO.
> > > > And IMHO, it is less prone to bugs. If someone wants to truncate while
> > > > keeping the current position, then he will have to state his intention
> > > > explicitly by saving the value of tell() and calling seek() after
> > > > truncating.
> > > >
> > > > I also find the semantics make more sense. For example:
> > > >
> > > >    >>> s = StringIO("Good bye, world")
> > > >    >>> s.truncate(10)
> > > >    >>> s.write("cruel world")
> > > >    >>> s.getvalue()
> > > >    ???
> > > >
> > > > I think that should return "Good bye, cruel world", not "cruel world".
> > > >
> > > > So, does anyone else agree with this small semantic change of truncate()?
> > > >
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From unknown_kev_cat at hotmail.com  Fri Jul 20 19:15:51 2007
From: unknown_kev_cat at hotmail.com (Joe Smith)
Date: Fri, 20 Jul 2007 13:15:51 -0400
Subject: [Python-3000] pep 3124 plans
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com><20070713173936.53C213A404D@sparrow.telecommunity.com><f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
Message-ID: <f7qqka$igc$1@sea.gmane.org>


"Guido van Rossum" <guido at python.org> wrote in message 
news:ca471dc20707200749p4ed42134h453c7535c98cc73d at mail.gmail.com...
> On 7/19/07, Joe Smith <unknown_kev_cat at hotmail.com> wrote:
>> So the state of the PEP? From the rest of the posts so far,
>> it sounds like there is no real objection to the basic end user API as
>> described in the PEP,
>
> Actually I want to reserve judgment on that until the PEP is rewritten
> to explain and document the underlying mechanisms. It is currently
> impossible (for me, anyway) to understand how the machinery to support
> the described features could be built. Without that I cannot approve
> the PEP. Phillip knows this but is too busy to work on it.
>

Fair enough. However, you see nothing terribly broken with the end-user side 
of the PEP,
assuming the underlying machinery can be built in a reasonable way, 
correct?


>> except for the case of retroactive generification, which GvR wants made
>> explicit in the user's code, AIUI.
>>
>> But there are concerns about the implementation. Overriding inside classes
>> would need a new implementation, but at the moment you're not sure how to
>> implement that. Also your current bootstrapping system requires in-place
>> modifying of some functions. You think using a third type of function 
>> could
>> perhaps fix that if no cleaner solution appears, correct?
>>
>> Also what has happened with the Interfaces/Adaptation/Aspects part of the
>> document? How does that mesh with the ABCs?
>> After all, adaptable interfaces and ABCs have such similar use cases that
>> users may not be sure which to use.
>> Or has that part been deferred for now, as the GF and method combination 
>> part
>> is not dependent on those?
>
> AFAIK Phillip has declared that his implementation only uses (or could
> be made to only use) isinstance()/issubclass(), and the overriding of
> these two used by the ABCs is actually very convenient for the GF PEP.
>

Ok, but what about the potential for confusion between @abc.abstractmethod 
and @overloading.abstract?
They are similar, but the ABC one appears to block instantiation of a 
class that contains (or whose ancestors contain) an abstractmethod that has 
not been overridden by inheritance. On the other hand, the interfaces in PEP 
3124 work quite differently. Implementations of the abstract functions can 
be provided by GFs. As such, an interface can be used even if there are no 
classes implementing it.

Yet despite those differences, the common use cases for interfaces seem 
pretty much identical to the common use cases of ABCs, which I fear will be 
a problem, as the end user may not be able to easily decide which to use. 
(My personal thoughts would be to use ABCs normally, and use the PEP 3124 
interfaces only as adapters.) 
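
To make the ABC half of that comparison concrete, here is a small sketch
using the abc module as it later shipped in the standard library. The class
names and the can_instantiate() helper are hypothetical. The PEP 3124
@overloading.abstract side is not shown.

```python
from abc import ABC, abstractmethod

class Stack(ABC):
    @abstractmethod
    def push(self, item):
        """Subclasses must override this before they can be instantiated."""

class Incomplete(Stack):
    # Inherits push() without overriding it, so it is still abstract.
    pass

class ListStack(Stack):
    def __init__(self):
        self.items = []

    def push(self, item):
        self.items.append(item)

def can_instantiate(cls):
    """Return True if cls() succeeds, False if it raises TypeError."""
    try:
        cls()
    except TypeError:
        return False
    return True
```

Only ListStack can be instantiated here; Stack and Incomplete both raise
TypeError because push() is still abstract in them.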


From guido at python.org  Fri Jul 20 19:30:41 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 20 Jul 2007 10:30:41 -0700
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <f7qqka$igc$1@sea.gmane.org>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<f7qqka$igc$1@sea.gmane.org>
Message-ID: <ca471dc20707201030s49a02240veab2c125f75ab68d@mail.gmail.com>

On 7/20/07, Joe Smith <unknown_kev_cat at hotmail.com> wrote:
> "Guido van Rossum" <guido at python.org> wrote in message
> news:ca471dc20707200749p4ed42134h453c7535c98cc73d at mail.gmail.com...
> > On 7/19/07, Joe Smith <unknown_kev_cat at hotmail.com> wrote:
> >> So the state of the PEP? From the rest of the posts so far,
> >> it sounds like there is no real objection to the basic end user API as
> >> described in the PEP,
> >
> > Actually I want to reserve judgment on that until the PEP is rewritten
> > to explain and document the underlying mechanisms. It is currently
> > impossible (for me, anyway) to understand how the machinery to support
> > the described features could be built. Without that I cannot approve
> > the PEP. Phillip knows this but is too busy to work on it.
>
> Fair enough. However, you see nothing terribly broken with the end-user side
> of the PEP,
> assuming the underlying machinery can be built in a reasonable way,
> correct?

Not at all true. How can I be in agreement with an incomplete PEP? I
don't want to reject the PEP only because it's incomplete, but a good
understanding of the interaction between the simple end-user API and
machinery is essential for acceptance.

> >> except for the case of retroactive generification, which GvR wants made
> >> explicit in the user's code, AIUI.
> >>
> >> But there are concerns about the implementation. Overriding inside classes
> >> would need a new implementation, but at the moment you're not sure how to
> >> implement that. Also your current bootstrapping system requires in-place
> >> modifying of some functions. You think using a third type of function
> >> could
> >> perhaps fix that if no cleaner solution appears, correct?
> >>
> >> Also what has happened with the Interfaces/Adaptation/Aspects part of the
> >> document? How does that mesh with the ABCs?
> >> After all, adaptable interfaces and ABCs have such similar use cases that
> >> users may not be sure which to use.
> >> Or has that part been deferred for now, as the GF and method combination
> >> part
> >> is not dependent on those?
> >
> > AFAIK Phillip has declared that his implementation only uses (or could
> > be made to only use) isinstance()/issubclass(), and the overriding of
> > these two used by the ABCs is actually very convenient for the GF PEP.
> >
>
> Ok, but what about the potential for confusion between @abc.abstractmethod
> and @overloading.abstract?
> They are similar, but the ABC one appears to block instantiation of a
> class that contains (or whose ancestors contain) an abstractmethod that has
> not been overridden by inheritance. On the other hand, the interfaces in PEP
> 3124 work quite differently. Implementations of the abstract functions can
> be provided by GFs. As such, an interface can be used even if there are no
> classes implementing it.

You're right, there are conflicting ideas here. A quick read of the
"Interfaces and Adaptation" section doesn't make me think that I'd
like to use it instead of PEP 3119 though; the mechanism is more
powerful (it lets you convert a list to an IStack whose pop method
calls the list's append method) but also more verbose (you have to
make declarations about each individual method).

> Yet despite those differences, the common use cases for interfaces seem
> pretty much identical to the common use cases of ABCs, which I fear will be
> a problem, as the end user may not be able to easily decide which to use.
> (My personal thoughts would be to use ABCs normally, and use the PEP 3124
> interfaces only as adapters.)

Agreed.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Fri Jul 20 19:45:54 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 20 Jul 2007 13:45:54 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
Message-ID: <20070720174706.AE5773A40A8@sparrow.telecommunity.com>

At 07:49 AM 7/20/2007 -0700, Guido van Rossum wrote:
>On 7/19/07, Joe Smith <unknown_kev_cat at hotmail.com> wrote:
> > So the state of the PEP? From the rest of the posts so far,
> > it sounds like there is no real objection to the basic end user API as
> > described in the PEP,
>
>Actually I want to reserve judgment on that until the PEP is rewritten
>to explain and document the underlying mechanisms. It is currently
>impossible (for me, anyway) to understand how the machinery to support
>the described features could be built. Without that I cannot approve
>the PEP. Phillip knows this but is too busy to work on it.

Actually, I was under the impression you didn't want the API 
described in the PEP, and wanted the following changes in addition to 
dropping method combination, aspects, and interfaces:

* :next_method as a keyword-only argument

* @somegeneric.overload as the standard decorator (w/no @overload or @when)

* advance declaration of a function as overloadable (which is also 
required by the previous change and by your preference not to modify 
functions in-place)

Also, I didn't know you wanted an explanation of how the underlying 
mechanisms work in general.  I thought the only piece you were 
looking for more explanation of was the method combination machinery 
-- which would be moot if we're scaling back the API as described by the above.

Just to be sure I'm clear as to what you want, is that the only 
mechanism you're unclear on, or is the whole thing unclear?  The 
whole thing was inspired by your overloading prototype, I've just 
made all the concrete bits of it more... "generic".

That is, instead of using issubclass or other explicit relationship 
tests between overload signatures, I use a generic function 
implies().  Instead of simply storing a method added as an overload, 
I use a "combine_actions()" generic function to combine it with any 
method that's already there (possibly including a method type for "No 
Method Found").  Instead of simply finding the most-specific matching 
signature on cache misses, I use combine_actions() to combine *all 
applicable* actions (i.e., all those that the calling signature implies()).

The combine_actions() function uses another generic function, 
overrides(), to compare method priorities.  overrides() is defined so 
that Around beats Before beats After beats regular methods beats no 
method found.  The overrides() of two methods of the same type is 
determined by which signature implies() the other, without also being 
implied *by* the other.

If there is no overrides() order between two methods, you get an 
AmbiguousMethod combining the two -- which can be overridden by any 
method whose signature implies() everything in the AmbiguousMethod.
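
As a rough illustration of the ordering rules just described (not the
PEP 3124 implementation): in this sketch implies() and overrides() are
plain functions over tuples of types rather than generic functions, and
most_specific() and the registry list are hypothetical names. Method
combination (Around/Before/After) is left out entirely.

```python
class AmbiguousMethod(Exception):
    """Raised when no single applicable signature overrides all others."""

def implies(sig1, sig2):
    """True if every argument tuple matching sig1 also matches sig2."""
    return len(sig1) == len(sig2) and all(
        issubclass(a, b) for a, b in zip(sig1, sig2)
    )

def overrides(sig1, sig2):
    """sig1 is strictly more specific: it implies sig2 but not vice versa."""
    return implies(sig1, sig2) and not implies(sig2, sig1)

def most_specific(registry, call_sig):
    """Pick the one applicable function whose signature overrides the rest."""
    applicable = [(sig, func) for sig, func in registry if implies(call_sig, sig)]
    if not applicable:
        raise LookupError("no method found for %r" % (call_sig,))
    winners = [
        func
        for sig, func in applicable
        if all(sig is other or overrides(sig, other) for other, _ in applicable)
    ]
    if len(winners) != 1:
        raise AmbiguousMethod(call_sig)
    return winners[0]

# Hypothetical registrations: (signature, function) pairs.
registry = [
    ((object,), lambda x: "object"),
    ((int,), lambda x: "int"),
]
chosen = most_specific(registry, (bool,))(True)

ambiguous_registry = [
    ((int, object), lambda a, b: "int-first"),
    ((object, int), lambda a, b: "int-second"),
]
try:
    most_specific(ambiguous_registry, (int, int))
    ambiguous = False
except AmbiguousMethod:
    ambiguous = True
```

A call with a bool picks the int overload, since (int,) implies (object,)
but not the reverse; the (int, int) call has no overrides() order between
the two candidates, so it is reported as ambiguous.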

All this is pretty much the same as in your prototype, except that 
it's done by adding these rules to the generic functions, rather than 
by hardcoding them.  That's why it's bigger than your prototype, but 
also why it's extensible in terms of adding new method types or ways 
to specify signatures.

I then also added the ability to attach different dispatchers to a 
function, so that you could replace the simple "tuple of types" 
matching with more sophisticated engines like RuleDispatch's, while 
still retaining the ability to use the same method combinations and 
existing overloads registered for a function.

That is, it lets you keep the same API for defining overloads and 
method combinations as the basic implementation, while allowing the 
actual overload targets and dispatching mechanisms to vary.

That's pretty much it except for Aspects and Interfaces.  I've ended 
up making my Aspect implementation available separately in the 
ObjectRoles cheeseshop package, renaming them Roles instead of Aspects.

(And yes, I will add all the above explanation to the PEP.)


>AFAIK Phillip has declared that his implementation only uses (or could
>be made to only use) isinstance()/issubclass(), and the overriding of
>these two used by the ABCs is actually very convenient for the GF PEP.

Yep.  The overload of "implies(c1:type, c2:type)" is 
"issubclass".  "isinstance()" isn't used, since that would render 
your type-tuple caching strategy unusable.


From guido at python.org  Fri Jul 20 19:52:14 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 20 Jul 2007 10:52:14 -0700
Subject: [Python-3000] uuid creation not thread-safe?
Message-ID: <ca471dc20707201052p68883fc5l3efd8ecc5cfd497f@mail.gmail.com>

I discovered what appears to be a thread-unsafety in uuid.py. This is
in the trunk as well as in 3.x; I'm using the trunk here for easy
reference. There's some code around line 395:

    import ctypes, ctypes.util
    _buffer = ctypes.create_string_buffer(16)

This creates a *global* buffer which is used as the output parameter
to later calls to _uuid_generate_random() and _uuid_generate_time().
For example, around line 481, in uuid1():

        _uuid_generate_time(_buffer)
        return UUID(bytes=_buffer.raw)

Clearly if two threads do this simultaneously they are overwriting
_buffer in unpredictable order. There are a few other occurrences of
this too.

I find it somewhat disturbing that what seems a fairly innocent
function that doesn't *appear* to have global state is nevertheless
not thread-safe. Would it be wise to fix this, e.g. by allocating a
fresh output buffer inside uuid1() and other callers?
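
A sketch of what "a fresh output buffer inside uuid1()" could look like.
This is not the actual uuid.py code: _fill_uuid() is a hypothetical
stand-in for the ctypes-wrapped C function (it just copies 16 random bytes
into the buffer), and uuid_from_fresh_buffer() is an illustrative name.

```python
import ctypes
import threading
import uuid

def _fill_uuid(buf):
    # Hypothetical stand-in for _uuid_generate_time(buf): writes 16 bytes
    # into the caller-supplied buffer.
    buf.raw = uuid.uuid4().bytes

def uuid_from_fresh_buffer():
    # Allocate the output buffer per call, so concurrent callers never
    # share (and overwrite) the same 16 bytes.
    buf = ctypes.create_string_buffer(16)
    _fill_uuid(buf)
    return uuid.UUID(bytes=buf.raw)

# Call it from several threads and collect the results.
results = []
lock = threading.Lock()

def worker():
    u = uuid_from_fresh_buffer()
    with lock:
        results.append(u)

threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The cost is one small allocation per call; in exchange the function has no
module-level mutable state.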

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bwinton at latte.ca  Fri Jul 20 20:07:36 2007
From: bwinton at latte.ca (Blake Winton)
Date: Fri, 20 Jul 2007 14:07:36 -0400
Subject: [Python-3000] _heapq.c, etc. (was Re:  Heaptypes)
In-Reply-To: <ca471dc20707200744r4a8efc1an444d7f4f894ff23a@mail.gmail.com>
References: <ca471dc20707191525m5161b04x828e60efd17f6ffb@mail.gmail.com>	<ca471dc20707191658s14d86b52x24b3a12524d9a97b@mail.gmail.com>	<20070720010804.85A7.JCARLSON@uci.edu>
	<ca471dc20707200744r4a8efc1an444d7f4f894ff23a@mail.gmail.com>
Message-ID: <46A0F9E8.8010404@latte.ca>

Guido van Rossum wrote:
>> While the pickle/cPickle, StringIO/cStringIO, etc., naming can be
 >> a bit annoying, it does give me the choice whether I want it to be
 >> fast or flexible.
> I definitely *don't* want to continue the old habit of having a slow
> and a fast module with different names; the experience with especially
> cPickle and cStringIO is that everyone believes their code is
> performance critical and hence uses the C version if it exists,
> thereby repeating the same idiom over and over.

Until they need to turn Unicode strings into file-like objects, at which 
point they go back to StringIO.  (Why yes, I was recently bitten by that 
particular "restriction".  :)

Later,
Blake.

From guido at python.org  Fri Jul 20 20:25:40 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 20 Jul 2007 11:25:40 -0700
Subject: [Python-3000] _heapq.c, etc. (was Re: Heaptypes)
In-Reply-To: <46A0F9E8.8010404@latte.ca>
References: <ca471dc20707191525m5161b04x828e60efd17f6ffb@mail.gmail.com>
	<ca471dc20707191658s14d86b52x24b3a12524d9a97b@mail.gmail.com>
	<20070720010804.85A7.JCARLSON@uci.edu>
	<ca471dc20707200744r4a8efc1an444d7f4f894ff23a@mail.gmail.com>
	<46A0F9E8.8010404@latte.ca>
Message-ID: <ca471dc20707201125j3b391fdy67be2a44e0bb4ef1@mail.gmail.com>

On 7/20/07, Blake Winton <bwinton at latte.ca> wrote:
> Guido van Rossum wrote:
> >> While the pickle/cPickle, StringIO/cStringIO, etc., naming can be
>  >> a bit annoying, it does give me the choice whether I want it to be
>  >> fast or flexible.
> > I definitely *don't* want to continue the old habit of having a slow
> > and a fast module with different names; the experience with especially
> > cPickle and cStringIO is that everyone believes their code is
> > performance critical and hence uses the C version if it exists,
> > thereby repeating the same idiom over and over.
>
> Until they need to turn Unicode strings into file-like objects, at which
> point they go back to StringIO.  (Why yes, I was recently bitten by that
> particular "restriction".  :)

Py3k will have separate BytesIO and StringIO classes (both in the io
module). The accelerations, if any, will be transparent. Subclasses or
usage depending on implementation details however are not supported.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Sat Jul 21 02:17:12 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 20 Jul 2007 17:17:12 -0700
Subject: [Python-3000] Need help fixing tests in str/unicode branch
Message-ID: <ca471dc20707201717x457f07d2pd841608db5168c2d@mail.gmail.com>

Thanks to all who helped fixing tests in the str/unicode branch! We're
down to about 35 failing tests. I still need help -- especially since
we're now getting into territory that I don't know all that well, for
example the email package or XML support.

The list of unit tests that need help is still on the wiki:
http://wiki.python.org/moin/Py3kStrUniTests

Instructions on how to help and how to avoid duplicate work are also
there. Please help!

Thanks to all those who already fixed one or more tests!

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg.ewing at canterbury.ac.nz  Sat Jul 21 03:57:23 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 21 Jul 2007 13:57:23 +1200
Subject: [Python-3000] PEP 368: Standard image protocol and class
In-Reply-To: <f7pk8b$evu$1@sea.gmane.org>
References: <cc93256f0706301518kd9fe7a7iaf0e9bd8e2e18edd@mail.gmail.com>
	<cc93256f0706301800m20012379n84aff4ff3df88021@mail.gmail.com>
	<740c3aec0707010534j4049efbchb2389bf61413c300@mail.gmail.com>
	<cc93256f0707010959o44c77912sb989c68cf890b846@mail.gmail.com>
	<f7pk8b$evu$1@sea.gmane.org>
Message-ID: <46A16803.1020200@canterbury.ac.nz>

Joe Smith wrote:
> If the maintainers of most of the large packages that do imaging are willing 
> to support this,
> and your code is good, I see absolutely no reason why this PEP would not be 
> accepted.

Something that bothers me about it a little is that
the core Python/C API seems like the wrong place to put
PyImage_* functions.

--
Greg

From greg.ewing at canterbury.ac.nz  Sat Jul 21 04:01:42 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 21 Jul 2007 14:01:42 +1200
Subject: [Python-3000] _heapq.c, etc. (was Re:  Heaptypes)
In-Reply-To: <20070720010804.85A7.JCARLSON@uci.edu>
References: <ca471dc20707191525m5161b04x828e60efd17f6ffb@mail.gmail.com>
	<ca471dc20707191658s14d86b52x24b3a12524d9a97b@mail.gmail.com>
	<20070720010804.85A7.JCARLSON@uci.edu>
Message-ID: <46A16906.7010005@canterbury.ac.nz>

Josiah Carlson wrote:
> What made it really annoying is that there was no way to tell the heapq
> module not to load the C version so that I could use a generic container. 

I would say that all such dual-implementation modules should
make the specific implementations available under different
names, using some convention such as _c_heapq/_p_heapq.

--
Greg

From joe at bitworking.org  Sat Jul 21 06:12:51 2007
From: joe at bitworking.org (Joe Gregorio)
Date: Sat, 21 Jul 2007 00:12:51 -0400
Subject: [Python-3000] str/unicode tests: pyexpat.c and read(n)
Message-ID: <3f1451f50707202112ye61385fifb4b2307f7fdf536@mail.gmail.com>

Should xml.parsers.expat.XMLParser.ParseFile(file) operate on
both text and binary streams?

If it should operate on text streams then an
issue arises from "read(n)" meaning different
things for text and binary streams. If the stream passed in
is "text" then read(n) will read
'n' unicode characters, but pyexpat.c allocates
a buffer of 2048 bytes and calls read(2048) which could
obviously return more than 2048 bytes.

The simplest solution in the case of a text stream
is to be safe and convert that into read(2048/4)
to accommodate the worst case scenario.
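
A small sketch of the mismatch, assuming the text would be encoded as
UTF-8 before being handed to the parser (an assumption, not something
pyexpat.c is confirmed to do). BUF_SIZE and read_chunk_as_utf8() are
hypothetical names.

```python
import io

BUF_SIZE = 2048  # size in bytes of the fixed buffer described above

def read_chunk_as_utf8(stream, nbytes):
    """Read at most nbytes // 4 characters so the UTF-8 form fits in nbytes."""
    text = stream.read(nbytes // 4)
    return text.encode("utf-8")

# A text stream made of characters that each need 4 bytes in UTF-8.
text_stream = io.StringIO("\U0001F600" * 4096)

# Naive: read(2048) returns 2048 characters, far more than 2048 bytes.
naive = text_stream.read(BUF_SIZE).encode("utf-8")

# Conservative: read(2048 // 4) characters, at most 2048 bytes.
text_stream.seek(0)
safe = read_chunk_as_utf8(text_stream, BUF_SIZE)
```

Dividing by 4 works for UTF-8 because no code point needs more than four
bytes, but it wastes most of the buffer on ASCII-heavy input.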

Has this come up before and is there a better solution?

   Thanks,
   -joe


On 7/20/07, Guido van Rossum <guido at python.org> wrote:
> Thanks to all who helped fixing tests in the str/unicode branch! We're
> down to about 35 failing tests. I still need help -- especially since
> we're now getting into territory that I don't know all that well, for
> example the email package or XML support.
>
> The list of unit tests that need help is still on the wiki:
> http://wiki.python.org/moin/Py3kStrUniTests
>
> Instructions on how to help and how to avoid duplicate work are also
> there. Please help!
>
> Thanks to all those who already fixed one or more tests!
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)


-- 
Joe Gregorio        http://bitworking.org

From fdrake at acm.org  Sat Jul 21 06:25:10 2007
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Sat, 21 Jul 2007 00:25:10 -0400
Subject: [Python-3000] str/unicode tests: pyexpat.c and read(n)
In-Reply-To: <3f1451f50707202112ye61385fifb4b2307f7fdf536@mail.gmail.com>
References: <3f1451f50707202112ye61385fifb4b2307f7fdf536@mail.gmail.com>
Message-ID: <200707210025.11031.fdrake@acm.org>

On Saturday 21 July 2007, Joe Gregorio wrote:
 > Should xml.parsers.expat.XMLParser.ParseFile(file) operate on
 > both text and binary streams?

No.  XML is a serialization of a markup language containing Unicode characters 
into an encoded stream.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From talin at acm.org  Sat Jul 21 07:55:24 2007
From: talin at acm.org (Talin)
Date: Fri, 20 Jul 2007 22:55:24 -0700
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070720174706.AE5773A40A8@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>	<20070713173936.53C213A404D@sparrow.telecommunity.com>	<f7pgki$6o3$1@sea.gmane.org>	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
Message-ID: <46A19FCC.7070609@acm.org>

Phillip J. Eby wrote:
> At 07:49 AM 7/20/2007 -0700, Guido van Rossum wrote:
>> On 7/19/07, Joe Smith <unknown_kev_cat at hotmail.com> wrote:
>>> So the state of the PEP? From the rest of the posts so far,
>>> it sounds like there is no real objection to the basic end user API as
>>> described in the PEP,
>> Actually I want to reserve judgment on that until the PEP is rewritten
>> to explain and document the underlying mechanisms. It is currently
>> impossible (for me, anyway) to understand how the machinery to support
>> the described features could be built. Without that I cannot approve
>> the PEP. Phillip knows this but is too busy to work on it.
> 
> Actually, I was under the impression you didn't want the API 
> described in the PEP, and wanted the following changes in addition to 
> dropping method combination, aspects, and interfaces:

I'd like to clarify these requirements a little bit:

On the issue of method combination, aspects, and interfaces: Guido has 
not made a pronouncement on whether these things may or may not be 
accepted at some time in the future. What he has said is that he doesn't 
*yet* understand the use case for them, and that these should be 
separate PEPs so that we can argue their merits independently. What he's 
strongly against (if my understanding is correct) is a "package deal" 
where he is forced to accept all of the features, or none.

I get the sense that the need for some of these advanced features 
becomes apparent only after having worked with generics for a while. If 
that's the case, then the best hope for including them in the stdlib is 
to get an implementation of generics into the hands of lots of Python 
programmers so that they can become familiar with them.

> * :next_method as a keyword-only argument
> 
> * @somegeneric.overload as the standard decorator (w/no @overload or @when)

You mentioned earlier that there was a design reason for preferring 
@overload and @when vs. the earlier RuleDispatch syntax, but the 
explanation you gave wasn't very clear (to me anyway).

(I personally prefer the @somegeneric.overload, but that's purely an 
aesthetic value judgement - if there's a strong architectural advantage 
of the other syntax, I'd like to hear it.)

> * advance declaration of a function as overloadable (which is also 
> required by the previous change and by your preference not to modify 
> functions in-place)

Right. There are two reasons that I think that post-hoc overloading runs 
into problems. The first, as you mentioned, is that it's difficult to 
implement without some kind of trickery.

The second reason - this is my opinion - is that it too closely resembles 
the mythical "comefrom" statement (the opposite of "goto"). The 
"comefrom" statement is intended to be a joke - the worst possible 
language feature from the standpoint of being able to manually trace the 
flow of execution of a program.

I do think that there are use cases for being able to 'decorate' (in the 
broader sense) the execution of a function, in an aspect-like way; but I 
also think that such power should not be used casually, and places where 
it's used should stick out in a way that makes them visually obvious and 
searchable.

> Also, I didn't know you wanted an explanation of how the underlying 
> mechanisms work in general.  I thought the only piece you were 
> looking for more explanation of was the method combination machinery 
> -- which would be moot if we're scaling back the API as described by the above.
> 
> Just to be sure I'm clear as to what you want, is that the only 
> mechanism you're unclear on, or is the whole thing unclear?  The 
> whole thing was inspired by your overloading prototype, I've just 
> made all the concrete bits of it more... "generic".

It seems to me that PEPs should only be required to explain their 
mechanisms if there's some doubt or controversy about the 
implementation. It seems to me that this PEP pushes the bounds of what 
is efficiently doable, so some extra explanation is required.

One issue that hasn't been satisfactorily resolved is the handling of 
the 'self' parameter. At least, let me give my explanation of what I 
think the issue is and see if we're on the same page:

Overloading a class method requires special treatment of the 'self' 
parameter because there's an implicit constraint on what types of 
objects can be passed as 'self': for any method defined in any class, 
the 'self' parameter must be an instance of the class (or a subclass) in 
which the method is defined. Now, this would be trivial if we required 
the programmer to explicitly declare the type of 'self', but this 
violates DRY and has the potential to cause mischief if the programmer 
forgets to update the method signature when they change the class.

In order to avoid this syntactical redundancy, there is a desire to be 
able to automatically detect the type of the class in which the overload 
is declared.

This is hard to do, because the "overload" machinery is handled by a 
function decorator, which runs before the class is actually constructed. 
Various methods for deducing the class have been proposed, but they have 
all so far been somewhat problematic, especially in light of "new-style" 
metaclasses.

I can think of only two approaches for solving this cleanly.

The first is that the overload decorator should be given some C-code 
help. Now, I recognize that part of your goal was to make the initial 
prototype a "pure Python" implementation in order to make life easier 
for Jython/IronPython and friends. That is certainly laudable. However, 
if the C-code help is a relatively small function that can be 
reimplemented for the other interpreters, then the impact on portability 
will be small.

The other approach is to somehow defer the work until after the class is 
fully constructed. The question then is when will the work be done - in 
other words, where should the decorator hook its fixup callback?

Even assuming we had some sort of hook that would be triggered when a 
class has finished construction, then the question is what about 
non-member generic functions? Since they are not contained in a class 
body, this hypothetical hook will never be called, and thus the methods 
won't be "finished". (A way around this would be to say that the only 
thing that the class-construction hook does is to add the additional 
type information for 'self', and the method is otherwise finished and 
ready to go as soon as the decorator is completed.)

If it turns out that there's no way to get a callback when the class has 
finished being built, then we may have to defer finishing the 
construction until the first time the generic function is called. This 
wouldn't be too bad, considering that there's a bunch of other stuff 
that is lazily calculated on first call anyway, from what I understand.
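
For illustration only, here is one way the "defer until the class exists"
idea can be sketched. It relies on the __set_name__ descriptor hook, which
was added to Python later (3.6) and was not available when this was
written; overload_method and Stack are hypothetical names, and no actual
dispatch is performed.

```python
class overload_method:
    """Hypothetical decorator that learns its owning class after the fact."""

    def __init__(self, func):
        self.func = func
        self.owner = None  # unknown while the class body is still executing

    def __set_name__(self, owner, name):
        # Called once the class object has been created: this is the point
        # where the implicit type of 'self' could be added to the signature.
        self.owner = owner

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        # Behave like an ordinary method when accessed on an instance.
        return self.func.__get__(obj, objtype)

class Stack:
    @overload_method
    def push(self, item):
        return ("pushed", item)

owner = Stack.__dict__["push"].owner
call_result = Stack().push(1)
```

Functions defined outside a class never receive this callback, which
matches the caveat above about non-member generic functions.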

> That is, instead of using issubclass or other explicit relationship 
> tests between overload signatures, I use a generic function 
> implies().  Instead of simply storing a method added as an overload, 
> I use a "combine_actions()" generic function to combine it with any 
> method that's already there (possibly including a method type for "No 
> Method Found").  Instead of simply finding the most-specific matching 
> signature on cache misses, I use combine_actions() to combine *all 
> applicable* actions (i.e., all those that the calling signature implies()).
> 
> The combine_actions() function uses another generic function, 
> overrides(), to compare method priorities.  overrides() is defined so 
> that Around beats Before beats After beats regular methods beats no 
> method found.  The overrides() of two methods of the same type is 
> determined by which signature implies() the other, without also being 
> implied *by* the other.
> 
> If there is no overrides() order between two methods, you get an 
> AmbiguousMethod combining the two -- which can be overridden by any 
> method whose signature implies() everything in the AmbiguousMethod.
> 
> All this is pretty much the same as in your prototype, except that 
> it's done by adding these rules to the generic functions, rather than 
> by hardcoding them.  That's why it's bigger than your prototype, but 
> also why it's extensible in terms of adding new method types or ways 
> to specify signatures.
> 
> I then also added the ability to attach different dispatchers to a 
> function, so that you could replace the simple "tuple of types" 
> matching with more sophisticated engines like RuleDispatch's, while 
> still retaining the ability to use the same method combinations and 
> existing overloads registered for a function.
> 
> That is, it lets you keep the same API for defining overloads and 
> method combinations as the basic implementation, while allowing the 
> actual overload targets and dispatching mechanisms to vary.
> 
> That's pretty much it except for Aspects and Interfaces.  I've ended 
> up making my Aspect implementation available separately in the 
> ObjectRoles cheeseshop package, renaming them Roles instead of Aspects.
> 
> (And yes, I will add all the above explanation to the PEP.)
> 
> 
>> AFAIK Phillip has declared that his implementation only uses (or could
>> be made to only use) isinstance()/issubclass(), and the overriding of
>> these two used by the ABCs is actually very convenient for the GF PEP.
> 
> Yep.  The overload of "implies(c1:type, c2:type)" is 
> "issubclass".  "isinstance()" isn't used, since that would render 
> your type-tuple caching strategy unusable.
> 

From unknown_kev_cat at hotmail.com  Sat Jul 21 09:20:47 2007
From: unknown_kev_cat at hotmail.com (Joe Smith)
Date: Sat, 21 Jul 2007 03:20:47 -0400
Subject: [Python-3000] pep 3124 plans
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>	<20070713173936.53C213A404D@sparrow.telecommunity.com>	<f7pgki$6o3$1@sea.gmane.org>	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com><20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
Message-ID: <f7sc4k$bht$1@sea.gmane.org>


"Talin" <talin at acm.org> wrote in message news:46A19FCC.7070609 at acm.org...
> Phillip J. Eby wrote:
>> At 07:49 AM 7/20/2007 -0700, Guido van Rossum wrote:
>>> On 7/19/07, Joe Smith <unknown_kev_cat at hotmail.com> wrote:
>>>> So the state of the PEP? From the rest of the posts so far,
>>>> it sounds like there is no real objection to the basic end user API as
>>>> described in the PEP,
>>> Actually I want to reserve judgment on that until the PEP is rewritten
>>> to explain and document the underlying mechanisms. It is currently
>>> impossible (for me, anyway) to understand how the machinery to support
>>> the described features could be built. Without that I cannot approve
>>> the PEP. Phillip knows this but is too busy to work on it.
>>
>> Actually, I was under the impression you didn't want the API
>> described in the PEP, and wanted the following changes in addition to
>> dropping method combination, aspects, and interfaces:
>
> I'd like to clarify these requirements a little bit:
>
> On the issue of method combination, aspects, and interfaces: Guido has
> not made a pronouncement on whether these things may or may not be
> accepted at some time in the future. What he has said is that he doesn't
> *yet* understand the use case for them, and that these should be
> separate PEPs so that we can argue their merits independently. What he's
> strongly against (if my understanding is correct) is a "package deal"
> where he is forced to accept all of the features, or none.
>
> I get the sense that the need for some of these advanced features
> becomes apparent only after having worked with generics for a while. If
> that's the case, then the best hope for including them in the stdlib is
> to get an implementation of generics into the hands of lots of Python
> programmers so that they can become familiar with them.
>


Well, perhaps I can explain a few things. First of all, it is important to
note that generic functions don't do much that cannot already be done, but
sometimes using generic functions can make things easier to read and
maintain.

For the purposes of talking about this, we will consider a simple function
of one argument. The most basic type of generic function dispatch is one
that dispatches based on object type. Now clearly, one could achieve the
same basic effect by doing type-checking in the body and putting what would
be the contents of the generic function inside the body of an if or switch
statement. But let's say there are 15 possible types, each of which needs to
be handled differently. In that case, something like generic functions makes
the code far more readable.
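
To make the comparison concrete, here is a minimal sketch of both styles. It
uses functools.singledispatch, a later stdlib addition, purely as a stand-in
for single-argument type dispatch; it is not the API proposed in PEP 3124,
and the function names are made up for illustration.

```python
from functools import singledispatch


def describe_with_ifs(obj):
    # The hand-written alternative: one function, one branch per type.
    if isinstance(obj, int):
        return "an integer: %d" % obj
    elif isinstance(obj, list):
        return "a list of %d items" % len(obj)
    else:
        return "something else: %r" % (obj,)


@singledispatch
def describe(obj):
    # Fallback used when no more specific overload is registered.
    return "something else: %r" % (obj,)


@describe.register(int)
def _(obj):
    return "an integer: %d" % obj


@describe.register(list)
def _(obj):
    return "a list of %d items" % len(obj)
```

With two types the if-chain is still readable; with 15, each overload living
in its own small function (possibly in a different module) is where the
generic function starts to pay off.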

One of the nice features of Eby's proposal is that more complicated
dispatching systems can be added. Perhaps some application needs a
dispatching engine that can dispatch based on the value of an object's
member. Perhaps the user wants an overload specifically for any product
object whose price property equals 0. With Eby's system, adding a dispatch
engine that supports that is not difficult.
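
As a rough illustration of value-based dispatch, assuming nothing about how
Eby's engines actually work: the sketch below tries predicates in
registration order and takes the first match, which is much cruder than a
real engine that picks the most specific rule. PredicateFunction, Product,
and price_label are hypothetical names.

```python
class PredicateFunction:
    """Toy dispatcher: first registered predicate that matches wins."""

    def __init__(self, default):
        self._default = default
        self._rules = []  # (predicate, implementation) in registration order

    def when(self, predicate):
        def decorator(func):
            self._rules.append((predicate, func))
            return func
        return decorator

    def __call__(self, *args, **kwargs):
        for predicate, func in self._rules:
            if predicate(*args, **kwargs):
                return func(*args, **kwargs)
        return self._default(*args, **kwargs)


class Product:
    def __init__(self, name, price):
        self.name = name
        self.price = price


@PredicateFunction
def price_label(product):
    return "%s: $%.2f" % (product.name, product.price)


# Overload that applies only to products whose price equals 0.
@price_label.when(lambda product: product.price == 0)
def free_label(product):
    return "%s: free" % product.name
```

Callers just call price_label(product); which body runs depends on the
product's price rather than its type.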

But realize that generic functions are a type of method combination.
Basically the alternatives are combined together. Sure, they remain separate
functions in Python memory, but to a caller, it looks like a single method.

Since some of the support framework will be the same for both, it seems
logical to propose a full method combination system at the same time.

What are the use cases for method combination? Well, let's say you are using
a third-party library. One of the functions you want to use works OK, except
that when it is operating on some specific type of object (one of your
design, perhaps), it does not clean up properly for that object, because it
was not aware of the specifics of that type of object. Perhaps it leaves a
file handle open. One could use an after method to perform the cleanup. Now,
one may argue that you could also just replace the function with a wrapper
function that calls the original and then does the cleanup. However, what if
more than one such fix were needed? What if there were many? Then it would be
nice to be able to use a mechanism not unlike the generic function system
that could keep track of all of them and combine them.

Before methods are useful for things like adding extra bounds checking to an 
existing function.
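
A minimal sketch of the before/after idea, under the assumption that hooks
apply to every call (a real method combination system would also select
hooks by argument type). CombinedFunction, Resource, and process are
hypothetical stand-ins, not anything from the PEP.

```python
class CombinedFunction:
    """Wrap a primary function and run registered hooks around it."""

    def __init__(self, primary):
        self._primary = primary
        self._before = []
        self._after = []

    def before(self, hook):
        self._before.append(hook)
        return hook

    def after(self, hook):
        self._after.append(hook)
        return hook

    def __call__(self, *args, **kwargs):
        for hook in self._before:
            hook(*args, **kwargs)
        result = self._primary(*args, **kwargs)
        for hook in self._after:
            hook(*args, **kwargs)
        return result


class Resource:
    def __init__(self, size):
        self.size = size
        self.is_open = False


@CombinedFunction
def process(resource):
    # Stand-in for the third-party function: it opens the resource
    # and never closes it again.
    resource.is_open = True
    return resource.size * 2


@process.before
def check_bounds(resource):
    # Extra bounds checking added from outside the library.
    if resource.size < 0:
        raise ValueError("size must be non-negative")


@process.after
def clean_up(resource):
    # Cleanup the original function did not know it had to do.
    resource.is_open = False
```

Any number of packages can register further hooks without wrapping or
replacing process itself, which is the bookkeeping a hand-rolled wrapper
chain makes awkward.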

For what it's worth, I've worked with a system that had something related to
the before and after methods, and found it worked well.

As you can hopefully see so far, the name of the game is to combine code from
different places, perhaps written by different people, and present it to the
user as one cohesive method. That is what generic functions do. That is what
method combination does. It seems to me to be a good idea to implement them
together to ensure they work together properly.

The effects of this can be wonderful. A package could convert some of a
framework's functions to generics, to allow them to handle the new objects
the package provides. It might also need to add some before and after
methods to ensure that, to the user of the module, it looks like the
framework was designed to support the module in question, when in fact it
was not. The idea is that the package can basically make the needed changes
so that everything just works, all without having to duplicate any code from
the framework itself.
See the benefits? (The framework mentioned could be a major framework like
ZOPE, just an average package, or even a simple module.)


Now on to the interfaces/adaptation part of the PEP. I would rather see that 
system primarily used as an adaptation system. It seems very well designed 
for that purpose.

To me, interfaces are a way for a class to tell other code that it has a
certain set of properties and methods which act in a specific fashion. While
Eby's proposal can do that, ABCs seem like a nicer way to do that in my
opinion.

However, an adaptor provides a means to use a single interface to interact
with objects that provide similar functionality but natively have different
interfaces. Eby's described system sounds ideal for that purpose.

That said, I think it can be reasonably spun off into a separate PEP. It is 
very much dependent on an implementation of generic functions, but AFAICT 
the rest of the PEP does not depend on it.


Please feel free to correct me if I made any mistakes in the above analysis.


From martin at v.loewis.de  Sat Jul 21 11:38:07 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 21 Jul 2007 11:38:07 +0200
Subject: [Python-3000] _heapq.c, etc. (was Re:  Heaptypes)
In-Reply-To: <46A16906.7010005@canterbury.ac.nz>
References: <ca471dc20707191525m5161b04x828e60efd17f6ffb@mail.gmail.com>	<ca471dc20707191658s14d86b52x24b3a12524d9a97b@mail.gmail.com>	<20070720010804.85A7.JCARLSON@uci.edu>
	<46A16906.7010005@canterbury.ac.nz>
Message-ID: <46A1D3FF.4020000@v.loewis.de>

Greg Ewing schrieb:
> Josiah Carlson wrote:
>> What made it really annoying is that there was no way to tell the heapq
>> module not to load the C version so that I could use a generic container. 
> 
> I would say that all such dual-implementation modules should
> make the specific implementations available under different
> names, using some convention such as _c_heapq/_p_heapq.

You mean, like prefixing it with c, e.g. StringIO vs. cStringIO,
pickle vs. cPickle?

Regards,
Martin

From greg.ewing at canterbury.ac.nz  Sat Jul 21 11:56:10 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 21 Jul 2007 21:56:10 +1200
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <46A19FCC.7070609@acm.org>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
Message-ID: <46A1D83A.2080308@canterbury.ac.nz>

Talin wrote:
> Overloading a class method requires special treatment of the 'self' 
> parameter because there's an implicit constraint on what types of 
> objects can be passed as 'self'

Hang on a minute. Is it really necessary for the GF
machinery to concern itself with this? By the time
you get to the (possibly overloaded) method object,
dispatching on 'self' has already been done. So the
GF machinery can just ignore 'self' and dispatch on
the rest of the arguments -- can't it?
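
A small sketch of that split, using functools.singledispatchmethod from the
modern stdlib as an assumed stand-in (it is not the PEP's machinery):
ordinary attribute lookup picks the method for 'self', and the dispatcher
only looks at the first remaining argument. Renderer is a hypothetical
class.

```python
from functools import singledispatchmethod


class Renderer:
    @singledispatchmethod
    def render(self, value):
        # Fallback when no overload matches the type of 'value'.
        return "generic: %r" % (value,)

    @render.register(int)
    def _(self, value):
        return "int: %d" % value

    @render.register(list)
    def _(self, value):
        return "list of %d" % len(value)


renderer = Renderer()
```

Here the type of 'self' never appears in a registration; only the type of
'value' does.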

--
Greg

From greg.ewing at canterbury.ac.nz  Sat Jul 21 12:03:56 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 21 Jul 2007 22:03:56 +1200
Subject: [Python-3000] _heapq.c, etc. (was Re:  Heaptypes)
In-Reply-To: <46A1D3FF.4020000@v.loewis.de>
References: <ca471dc20707191525m5161b04x828e60efd17f6ffb@mail.gmail.com>
	<ca471dc20707191658s14d86b52x24b3a12524d9a97b@mail.gmail.com>
	<20070720010804.85A7.JCARLSON@uci.edu>
	<46A16906.7010005@canterbury.ac.nz> <46A1D3FF.4020000@v.loewis.de>
Message-ID: <46A1DA0C.5010107@canterbury.ac.nz>

Martin v. Löwis wrote:
> You mean, like prefixing it with c, e.g. StringIO vs. cStringIO,
> pickle vs. cPickle?

Yes, but with an official scheme for deriving the names
from the main package name, and also an understanding
that these are implementation details to be used only
when really necessary (hence the leading underscores).

Considering Guido's comment about people gratuitously
using the C versions, perhaps only the Python version
should be made available as an official alternative.
It's unlikely that people will gratuitously choose what
they perceive to be a *slower* version of the module. :-)

--
Greg

From dalke at dalkescientific.com  Sat Jul 21 16:23:45 2007
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Sat, 21 Jul 2007 16:23:45 +0200
Subject: [Python-3000] removing exception .args
Message-ID: <9A9F27CC-D660-4C09-8D8C-5C4DDD66D2E6@dalkescientific.com>

Posting here an expansion of a short discussion I had
after Guido's keynote at EuroPython.  In this email
I propose eliminating the ".args" attribute from the
Exception type.  It's not useful, and supporting it
correctly is complicated enough that it's often not
supported correctly.


In Python 2 the base Exception class works like this

 >>> x = Exception("spam", "was", "here")
 >>> x[0]
'spam'
 >>> x.args
('spam', 'was', 'here')
 >>>

In Py3K the [0] index lookup disappears.  This is a
good thing.  Positional lookup like this is rarely useful.

The .args attribute remains.  I don't see the need for
it and propose that it be removed in Py3K.

Why?  The "args" attribute is not useful.  People making
non-trivial Exception subclasses often forget to call
__init__ on the parent exception, and using attribute
lookups is much better than using an index lookup.  That's
the experience of the stat call.

Having support for a single object (almost always a
string) passed into the exception is pragmatically useful,
so I think the base exception class should look like

class Exception(object):
   msg = None
   def __init__(self, msg):
     self.msg = msg
   def __str__(self):
     if self.msg is None:
       return "%s()" % (self.__class__.__name__,)
     else:
       return "%s(%r)" % (self.__class__.__name__, self.msg)

**

The rest of this email is here because I'm detail oriented
and want to present evidence to back up my assertion.

Because a number of subclasses should call the base
__init__ but don't, generic error reporting software
can't use the "args protocol" for anything.  Pretty much
the only thing a generic error report mechanism (like
traceback and logging) can do is call str() on the exception.


Here are some examples to show that some exceptions in the
standard library don't do a good job of calling the base
class __init__.

   (in HTMLParser.py)

class HTMLParseError(Exception):
     """Exception raised for all parse errors."""

     def __init__(self, msg, position=(None, None)):
         assert msg
         self.msg = msg
         self.lineno = position[0]
         self.offset = position[1]

    (in calendar.py)

# Exceptions raised for bad input
class IllegalMonthError(ValueError):
     def __init__(self, month):
         self.month = month
     def __str__(self):
         return "bad month number %r; must be 1-12" % self.month


    (in doctest.py)

class DocTestFailure(Exception):
     ...
     def __init__(self, test, example, got):
         self.test = test
         self.example = example
         self.got = got

     def __str__(self):
         return str(self.test)


Eyeballing the numbers, I think about 1/3rd of the
standard library Exception subclasses with an __init__
forget to call the base class and forget to set
.args and .msg.


For better readability and maintainability, complex
exceptions with multiple parameters should make those
parameters accessible via attributes, and not expect
clients to reach into the args list by position.  All
three classes I just listed defined a new __init__
so that the parameters were available by name.


Here's an exception which does the right thing under
Python2.  By that I mean that it fully implements
the exception API and it makes the parameters available
as named attributes.  It also protects against
subclasses which forget to call GetoptError.__init__
by defining class attributes.

    (from getopt.py )

class GetoptError(Exception):
     opt = ''
     msg = ''
     def __init__(self, msg, opt=''):
         self.msg = msg
         self.opt = opt
         Exception.__init__(self, msg, opt)

     def __str__(self):
         return self.msg

This is correct, but cumbersome.  Why should we
encourage all non-trivial subclasses to look like this?


Historically there has been a problem with the existing
".args".  The base class implementation of __str__ required
that that attribute be present.  This changed some time
between 2.3 and 2.5.

This change invalidated comments like this in httplib.py

class HTTPException(Exception):
     # Subclasses that define an __init__ must call Exception.__init__
     # or define self.args.  Otherwise, str() will fail.
     pass

which later on hacks around not calling __init__ by doing this

class UnknownProtocol(HTTPException):
     def __init__(self, version):
         self.args = version,
         self.version = version


One last existing example to point out.  urllib2.py uses

class URLError(IOError):
     # URLError is a sub-type of IOError, but it doesn't share any of
     # the implementation.  need to override __init__ and __str__.
     # It sets self.args for compatibility with other EnvironmentError
     # subclasses, but args doesn't have the typical format with  
errno in
     # slot 0 and strerror in slot 1.  This may be better than nothing.
     def __init__(self, reason):
         self.args = reason,
         self.reason = reason

     def __str__(self):
         return '<urlopen error %s>' % self.reason

Again, a hack. This time a hack because EnvironmentError
wants an errno and an errorstring.

 >>> EnvironmentError(2,"This is an error message","sp")
EnvironmentError(2, 'This is an error message')
 >>> err = EnvironmentError(2,"This is an error message","sp")
 >>> err.errno
2
 >>> err.strerror
'This is an error message'
 >>> err.filename
'sp'
 >>>

(Note the small bug; the filename is not shown in str(err) )


In closing, given an arbitrary exception, the only thing you can
hope might work is str(exception).  There's a decent chance that
.args and even .msg aren't present.  Generic exception handling
code cannot expect those attributes to exist, and handlers for
specific types should use named attributes rather than the less
readable/less maintainable positional lookups.

Python3K is allowed to be non-backwards compatible.  I
propose getting rid of this useless feature.

				Andrew
				dalke at dalkescientific.com


From g.brandl at gmx.net  Sat Jul 21 17:08:37 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Sat, 21 Jul 2007 17:08:37 +0200
Subject: [Python-3000] removing exception .args
In-Reply-To: <9A9F27CC-D660-4C09-8D8C-5C4DDD66D2E6@dalkescientific.com>
References: <9A9F27CC-D660-4C09-8D8C-5C4DDD66D2E6@dalkescientific.com>
Message-ID: <f7t7hk$f3c$1@sea.gmane.org>

Andrew Dalke schrieb:
> Posting here an expansion of a short discussion I had
> after Guido's keynote at EuroPython.  In this email
> I propose eliminating the ".args" attribute from the
> Exception type.  It's not useful, and supporting it
> correctly is complicated enough that it's often not
> supported correctly
> 
> 
> 
> In Python 2 the base Exception class works like this
> 
>  >>> x = Exception("spam", "was", "here")
>  >>> x[0]
> 'spam'
>  >>> x.args
> ('spam', 'was', 'here')
>  >>>
> 
> In Py3K the [0] index lookup disappears.  This is a
> good thing.  Positional lookup like this is rarely useful.
> 
> The .args attribute remains.  I don't see the need for
> it and propose that it be removed in Py3K.

Hm, I always found it useful to just do

class MyCustomError(Exception):
     pass

and give it arbitrary arguments that I can access from outside,
without writing __init__ method stuff.
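
For illustration, that pattern looks like this under Python 3; the argument
values are made up.

```python
class MyCustomError(Exception):
    pass


try:
    raise MyCustomError("disk full", 28, "/var/log")
except MyCustomError as exc:
    # Everything passed to the constructor is available positionally.
    captured = exc.args

print(captured)
```

No __init__ is needed; the cost is that callers have to know which position
holds which value.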

Georg


-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From guido at python.org  Sat Jul 21 17:16:12 2007
From: guido at python.org (Guido van Rossum)
Date: Sat, 21 Jul 2007 08:16:12 -0700
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <f7sc4k$bht$1@sea.gmane.org>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org> <f7sc4k$bht$1@sea.gmane.org>
Message-ID: <ca471dc20707210816r4d663cdaqcef7e9f28c150a75@mail.gmail.com>

On 7/21/07, Joe Smith <unknown_kev_cat at hotmail.com> wrote:
> One of the nice features of Eby's proposal is that more complicated
> dispatching systems can be added. Perhaps some application needs a
> dispatching engine that can dispatch based on the value of an object's
> member. Perhaps the user wants an overload specifically for any product object
> whose price property equals 0. With Eby's system adding a dispatch engine
> that supports that is not difficult.

This is true. However it comes at a cost. Whenever I see an API that
takes a string which is then parsed by the called function as a Python
expression (perhaps constrained to a subset of Python) I cringe,
especially if the common use is to pass a literal. There are just so
many issues with that... It's not colorized by the editor, it's not
syntax-checked by either the editor or the Python parser, it requires
one to build yet another parser...

This is why I don't like the ...when("isinstance(obj, list)") syntax
from (I think) RuleDispatch, and I'm glad it's not in the PEP. I'm
unclear however on how you would do this otherwise -- is overloading
implies() the best approach?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Sat Jul 21 17:21:54 2007
From: guido at python.org (Guido van Rossum)
Date: Sat, 21 Jul 2007 08:21:54 -0700
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <46A19FCC.7070609@acm.org>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
Message-ID: <ca471dc20707210821s160c88dy36c82e2184348afc@mail.gmail.com>

On 7/20/07, Talin <talin at acm.org> wrote:
> On the issue of method combination, aspects, and interfaces: Guido has
> not made a pronouncement on whether these things may or may not be
> accepted at some time in the future. What he has said is that he doesn't
> *yet* understand the use case for them, and that these should be
> separate PEPs so that we can argue their merits independently. What he's
> strongly against (if my understanding is correct) is a "package deal"
> where he is forced to accept all of the features, or none.

I'm mellowing out on this a bit -- I'm no longer requesting a separate
PEP with all the advanced features (I understand Phillip's argument
that that second PEP will just be an easy rejection target). I do want
to understand the motivation and implementation for each of the
advanced features, so we can have a reasonable discussion about
whether a particular feature is really worth adding or can easily be
added later by/for the few users who really need it.

> It seems to me that PEPs should only be required to explain their
> mechanisms if there's some doubt or controversy about the
> implementation.

But referring to my sandbox/overloading implementation is *not*
acceptable; I want whatever that does (not much) spelled out in the
PEP for posterity.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Sat Jul 21 17:31:13 2007
From: guido at python.org (Guido van Rossum)
Date: Sat, 21 Jul 2007 08:31:13 -0700
Subject: [Python-3000] removing exception .args
In-Reply-To: <f7t7hk$f3c$1@sea.gmane.org>
References: <9A9F27CC-D660-4C09-8D8C-5C4DDD66D2E6@dalkescientific.com>
	<f7t7hk$f3c$1@sea.gmane.org>
Message-ID: <ca471dc20707210831s1e304d30m77fe5412b66edebe@mail.gmail.com>

On 7/21/07, Georg Brandl <g.brandl at gmx.net> wrote:
> Andrew Dalke schrieb:
> > Posting here an expansion of a short discussion I had
> > after Guido's keynote at EuroPython.  In this email
> > I propose eliminating the ".args" attribute from the
> > Exception type.  It's not useful, and supporting it
> > correctly is complicated enough that it's often not
> > supported correctly
> >
> >
> >
> > In Python 2 the base Exception class works like this
> >
> >  >>> x = Exception("spam", "was", "here")
> >  >>> x[0]
> > 'spam'
> >  >>> x.args
> > ('spam', 'was', 'here')
> >  >>>
> >
> > In Py3K the [0] index lookup disappears.  This is a
> > good thing.  Positional lookup like this is rarely useful.
> >
> > The .args attribute remains.  I don't see the need for
> > it and propose that it be removed in Py3K.
>
> Hm, I always found it useful to just do
>
> class MyCustomError(Exception):
>      pass
>
> and give it arbitrary arguments to it without writing __init__
> method stuff that I can access from outside.

Right. Also, the fact that there is no *guarantee* that e.args
contains *all* the arguments passed to the constructor doesn't mean
that e.args isn't useful. It's useful for many standard exceptions. I
also happen to think that it's well-defined: it is whatever is passed
to Exception.__init__(), whether called directly or from an overriding
__init__() method.

Given the amount of code that currently uses it, I think removing it
would also be a major undertaking, as we would have to invent names
for everything that's currently accessed via e.args[i]. (I know
there's a lot of code that uses it, because converting the stdlib from
e[i] to e.args[i] was a major pain.)

So -1 on removing e.args. I'd be okay with a recommendation not to
rely on it and to define explicitly named attributes for everything
one cares for.
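
A sketch of what such a recommendation might look like in practice, modelled
on the calendar.py class quoted earlier in the thread but not copied from
the stdlib: define the named attribute and still pass the value up to the
base __init__ so that e.args stays populated.

```python
class IllegalMonthError(ValueError):
    def __init__(self, month):
        # Keep e.args meaningful for generic code...
        ValueError.__init__(self, month)
        # ...and give specific handlers a readable name.
        self.month = month

    def __str__(self):
        return "bad month number %r; must be 1-12" % (self.month,)


err = IllegalMonthError(13)
```

Handlers that know the exception type read err.month; generic code can still
fall back on err.args or str(err).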

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From unknown_kev_cat at hotmail.com  Sat Jul 21 19:07:12 2007
From: unknown_kev_cat at hotmail.com (Joe Smith)
Date: Sat, 21 Jul 2007 13:07:12 -0400
Subject: [Python-3000] pep 3124 plans
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com><20070713173936.53C213A404D@sparrow.telecommunity.com><f7pgki$6o3$1@sea.gmane.org><ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com><20070720174706.AE5773A40A8@sparrow.telecommunity.com><46A19FCC.7070609@acm.org>
	<f7sc4k$bht$1@sea.gmane.org>
	<ca471dc20707210816r4d663cdaqcef7e9f28c150a75@mail.gmail.com>
Message-ID: <f7teg3$1lh$1@sea.gmane.org>


"Guido van Rossum" <guido at python.org> wrote in message 
news:ca471dc20707210816r4d663cdaqcef7e9f28c150a75 at mail.gmail.com...
> On 7/21/07, Joe Smith <unknown_kev_cat at hotmail.com> wrote:
>> One of the nice features of Eby's proposal is that more complicated
>> dispatching systems can be added. Perhaps some application needs a
>> dispatching engine that can dispatch based on the value of an object's
>> member. Perhaps the user wants an overload specifically for any product
>> object
>> whose price property equals 0. With Eby's system adding a dispatch engine
>> that supports that is not difficult.
>
> This is true. However it comes at a cost. Whenever I see an API that
> takes a string which is then parsed by the called function as a Python
> expression (perhaps constrained to a subset of Python) I cringe,
> especially if the common use is to pass a literal. There are just so
> many issues with that... It's not colorized by the editor, it's not
> syntax-checked by either the editor or the Python parser, it requires
> one to build yet another parser...
>
> This is why I don't like the ...when("isinstance(obj, list)") syntax
> from (I think) RuleDispatch, and I'm glad it's not in the PEP. I'm
> unclear however on how you would do this otherwise -- is overloading
> implies() the best approach?

First of all, if I understand the PEP correctly, that should be:
when(funcname,"isinstance(obj, list)"), where funcname is the name of the
function to be overloaded.


Whatever dispatch engine that is, is it not possible to do something more
like
when(funcname,{isinstance,{obj,list}})? (I used list syntax here as an
example only; other syntaxes could work.) I'm not sure if the 'obj' part is
referring to a variable that would be in scope at the when declaration. That
might have to be quoted as a string.

Regardless though, I'm pretty sure dispatch engines can use things other
than interpreted strings.
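
A rough sketch of that idea, with the criterion written as nested tuples
instead of a string to be parsed. The evaluate() helper and the convention
that string operands name call arguments are assumptions made up for this
example; they are not part of the PEP or of RuleDispatch.

```python
def evaluate(criterion, namespace):
    """Evaluate a (function, operands) criterion against named arguments."""
    func, operands = criterion
    # String operands are looked up by name; anything else is used as-is.
    args = [namespace[o] if isinstance(o, str) else o for o in operands]
    return func(*args)


# Counterpart of the string "isinstance(obj, list)".
is_list = (isinstance, ("obj", list))
```

Only 'obj' needs quoting, since it names an argument that does not exist
yet when the criterion is written; isinstance and list are ordinary objects
the editor and the parser can check.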


From foom at fuhm.net  Sat Jul 21 19:17:55 2007
From: foom at fuhm.net (James Y Knight)
Date: Sat, 21 Jul 2007 13:17:55 -0400
Subject: [Python-3000] str/unicode tests: pyexpat.c and read(n)
In-Reply-To: <200707210025.11031.fdrake@acm.org>
References: <3f1451f50707202112ye61385fifb4b2307f7fdf536@mail.gmail.com>
	<200707210025.11031.fdrake@acm.org>
Message-ID: <59C0A7B2-B334-4984-AA8E-CA024B73553B@fuhm.net>


On Jul 21, 2007, at 12:25 AM, Fred L. Drake, Jr. wrote:

> On Saturday 21 July 2007, Joe Gregorio wrote:
>> Should xml.parsers.expat.XMLParser.ParseFile(file) operate on
>> both text and binary streams?
>
> No.  XML is a serialization of a markup language containing Unicode
> characters into an encoded stream.

Well...there are many reasons why it is useful to be able to parse an
already-decoded unicode stream into XML, and to serialize XML into a  
unicode string. For example, if combining into a larger unicode  
document, or parsing from a literal string in the source code.

Sure, normally XML is serialized to bytes, but it is also  
serializable to unicode, and that's a useful feature to have (if  
implementable).

James

From pje at telecommunity.com  Sat Jul 21 19:33:08 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat, 21 Jul 2007 13:33:08 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <ca471dc20707210816r4d663cdaqcef7e9f28c150a75@mail.gmail.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org> <f7sc4k$bht$1@sea.gmane.org>
	<ca471dc20707210816r4d663cdaqcef7e9f28c150a75@mail.gmail.com>
Message-ID: <20070721173204.C1B913A40D7@sparrow.telecommunity.com>

At 08:16 AM 7/21/2007 -0700, Guido van Rossum wrote:
>On 7/21/07, Joe Smith <unknown_kev_cat at hotmail.com> wrote:
> > One of the nice features of Eby's proposal is that more complicated
> > dispatching systems can be added. Perhaps some application needs a
> > dispatching engine that can dispatch based on the value of an object's
> > member. Perhaps the user wants an overload specifically for any
> > product object
> > whose price property equals 0. With Eby's system adding a dispatch engine
> > that supports that is not difficult.
>
>This is true. However it comes at a cost. Whenever I see an API that
>takes a string which is then parsed by the called function as a Python
>expression (perhaps constrained to a subset of Python) I cringe,
>especially if the common use is to pass a literal. There are just so
>many issues with that... It's not colorized by the editor, it's not
>syntax-checked by either the editor or the Python parser,

Note that it's been previously proposed to add an AST literal syntax 
for "quoting" code to get around this, but such metasyntactic 
features were rejected for 3.0.

There are other applications for such a syntax besides generic 
functions: there exist today Python ORMs that translate Python 
generator expressions to SQL queries.  Today, they work by 
decompiling bytecode, precisely to avoid some of the issues you 
mention.  However, an AST literal syntax would actually work better 
for that, IMO, just as it would for generic functions.


>  it requires
>one to build yet another parser...

Well, for Python and subsets thereof, it suffices to use the stdlib 
for that.  My implementations use the tuple-formatted ASTs from the 
'parser' module.


>This is why I don't like the ...when("isinstance(obj, list)") syntax
>from (I think) RuleDispatch, and I'm glad it's not in the PEP. I'm
>unclear however on how you would do this otherwise -- is overloading
>implies() the best approach?

It's one approach.  However, the idea of "@when" and other decorators 
in the PEP taking a second argument is so that you can pass in 
objects of your own design.  These objects can implement disjuncts() 
to support or-ed conditions, and can request an upgrade to a 
different dispatching engine.

Of course, the ability to pass in such objects means you could pass 
in something like "Expr('some python expression')"...  which was of 
course one thing I planned to use it for.


From fdrake at acm.org  Sat Jul 21 19:36:59 2007
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Sat, 21 Jul 2007 13:36:59 -0400
Subject: [Python-3000] str/unicode tests: pyexpat.c and read(n)
In-Reply-To: <59C0A7B2-B334-4984-AA8E-CA024B73553B@fuhm.net>
References: <3f1451f50707202112ye61385fifb4b2307f7fdf536@mail.gmail.com>
	<200707210025.11031.fdrake@acm.org>
	<59C0A7B2-B334-4984-AA8E-CA024B73553B@fuhm.net>
Message-ID: <200707211336.59820.fdrake@acm.org>

On Saturday 21 July 2007, James Y Knight wrote:
 > Well...there's many reasons why it is useful to be able to parse an
 > already-decoded unicode stream into XML, and to serialize XML into a
 > unicode string. For example, if combining into a larger unicode
 > document, or parsing from a literal string in the source code.

Yes, but that doesn't mean it's the XML parser's job to take multiple input 
types.  It could easily be supported by creating a wrapper object that 
converts unicode to bytes objects, so the underlying C parser still gets 
bytes.  Such a wrapper could easily be part of xml.parsers.expat if desired, 
but I'd like to avoid adding lots of stuff to the pyexpat C code.
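
A rough sketch of that kind of wrapper, written against the xml.parsers.expat API as it exists in current Python 3. The function name parse_text is hypothetical; the idea is only that the text is encoded once and the C parser is told explicitly which encoding it is getting:

```python
import xml.parsers.expat


def parse_text(text):
    """Parse XML held in a str by handing expat UTF-8 bytes.

    Returns the element names in document order.
    """
    names = []
    # Passing an encoding to ParserCreate overrides whatever the XML
    # declaration inside the text claims.
    parser = xml.parsers.expat.ParserCreate("utf-8")
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(text.encode("utf-8"), True)
    return names


names = parse_text('<?xml version="1.0"?><doc><item/><item/></doc>')
```

The underlying parser still only ever sees bytes, so nothing has to change in the C code for this to work.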

Avoiding complexifying the C code is a good thing.  ;-)


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From talin at acm.org  Sat Jul 21 19:36:05 2007
From: talin at acm.org (Talin)
Date: Sat, 21 Jul 2007 10:36:05 -0700
Subject: [Python-3000] str/unicode tests: pyexpat.c and read(n)
In-Reply-To: <59C0A7B2-B334-4984-AA8E-CA024B73553B@fuhm.net>
References: <3f1451f50707202112ye61385fifb4b2307f7fdf536@mail.gmail.com>	<200707210025.11031.fdrake@acm.org>
	<59C0A7B2-B334-4984-AA8E-CA024B73553B@fuhm.net>
Message-ID: <46A24405.1050102@acm.org>

James Y Knight wrote:
> On Jul 21, 2007, at 12:25 AM, Fred L. Drake, Jr. wrote:
> 
>> On Saturday 21 July 2007, Joe Gregorio wrote:
>>> Should xml.parsers.expat.XMLParser.ParseFile(file) operate on
>>> both text and binary streams?
>> No.  XML is a serialization of a markup language containing Unicode
>> characters into an encoded stream.
> 
> Well...there's many reasons why it is useful to be able to parse an  
> already-decoded unicode stream into XML, and to serialize XML into a  
> unicode string. For example, if combining into a larger unicode  
> document, or parsing from a literal string in the source code.
> 
> Sure, normally XML is serialized to bytes, but it is also  
> serializable to unicode, and that's a useful feature to have (if  
> implementable).

The general use case for XML is reading or writing a document, where 
"document" means a bytestream from either a file or a socket.

The question is whether it would also be useful to parse Python strings 
that contain XML markup, or format an XML document into a Python string.

Some care needs to be taken here, because XML has its own way of 
specifying the character encoding. For example, suppose I have a python 
string that contains the characters:

    '<?xml version="1.0" encoding="utf-8" ?>'

Well, the problem with this is that the encoding *isn't* UTF-8. Python 
3000 strings are internally encoded as UTF-16 (although generally it 
tries to hide that fact from you so most of the time you don't have to 
care.)

Suppose then that you write this string out to a file (perhaps after 
combining it with other strings.) If I happen to write the file as 
UTF-8, then everything is fine, but if I happen to pick some other 
encoding that doesn't match the encoding attribute in the prologue then 
we have the potential for confusion.

This matters because there are lots of people who write XML documents 
with print statements (and many of them forget to handle things like 
escaping of entities and such.)

This also matters because the Python XML parsing libraries are mostly 
based on expat, which is C code that doesn't have any special knowledge 
of Python strings - it only works on the encodings that it can detect, 
or which you tell it to use.

So if you wanted to directly parse a Python string as XML, you would 
probably have to treat it as a byte array and override the encoding 
detection, telling it explicitly to use UTF-16.

> James

From ncoghlan at gmail.com  Sat Jul 21 20:02:57 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 22 Jul 2007 04:02:57 +1000
Subject: [Python-3000] removing exception .args
In-Reply-To: <9A9F27CC-D660-4C09-8D8C-5C4DDD66D2E6@dalkescientific.com>
References: <9A9F27CC-D660-4C09-8D8C-5C4DDD66D2E6@dalkescientific.com>
Message-ID: <46A24A51.1090101@gmail.com>

Andrew Dalke wrote:
> Having support for a single object (almost always a
> string) passed into the exception is pragmatically useful,
> so I think the base exception class should look like
> 
> class Exception(object):
>    msg = None
>    def __init__(self, msg):
>      self.msg = msg
>    def __str__(self):
>      if self.msg is None:
>        return "%s()" % (self.__class__.__name__,)
>      else:
>        return "%s(%r)" % (self.__class__.__name__, self.msg)
> 
> **

Went there, didn't like it, left again. See PEP 352, especially the 
section on the late (unlamented) BaseException.message.

> The rest of this email is because I'm detail oriented
> and present evidence to back up my assertion.
> 
> There are a number of subclasses which should but don't
> call the base __init__, generic error reporting software
> can't use the "args protocol" for anything.  Pretty much
> the only thing a generic error report mechanism (like
> traceback and logging) can do is call str() on the exception.

As of Python 2.5, you can rely on the attribute being present, as it is 
provided automatically by BaseException:

.>>> class MyException(Exception):
...   def __init__(self):
...     pass
...
.>>> MyException().args
()

Of course, as Guido pointed out, args will be empty unless the exception 
sets it directly or via BaseException.__init__.

> This is correct, but cumbersome.  Why should we
> encourage all non-trivial subclasses to look like this?

If you want to avoid requiring that subclasses call your __init__ 
method, you can actually do that by putting any essential initialisation 
into the __new__ method instead. Then the requirement is merely to call 
the parent __new__ method if you override __new__, and you have to do 
something like that in order to create the class instance in the first 
place.

To rewrite the example from getopt using this technique:

class GetoptError(Exception):
     def __new__(cls, msg, opt=''):
         self = super(GetoptError, cls).__new__(cls, msg, opt)
         self.msg = msg
         self.opt = opt
         return self

     def __str__(self):
         return self.msg

I actually find using __new__ this way to be a useful practice in 
general for setting up class invariants in base classes, as it's easy to 
forget to call __init__ on the base class, but forgetting to call 
__new__ takes some serious effort. Putting the essential parts in 
__new__ means never having to include the instruction that "you must 
call this class's __init__ method when subclassing and overriding 
__init__" into any API documentation I write.
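
As a sketch of what this buys you, here is the same idea in current Python 3 syntax, with a subclass that deliberately forgets to call the base __init__. The ForgetfulError class is made up for the illustration:

```python
class GetoptError(Exception):
    def __new__(cls, msg, opt=""):
        # Essential state is set here, before any __init__ runs.
        self = super().__new__(cls, msg, opt)
        self.msg = msg
        self.opt = opt
        return self

    def __str__(self):
        return self.msg


class ForgetfulError(GetoptError):
    def __init__(self, msg, opt=""):
        pass  # never calls the base class __init__


err = ForgetfulError("option -x not recognized", "x")
```

Both msg and opt are still set on err, because the subclass never overrode __new__.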

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From talin at acm.org  Sat Jul 21 20:04:55 2007
From: talin at acm.org (Talin)
Date: Sat, 21 Jul 2007 11:04:55 -0700
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <f7sc4k$bht$1@sea.gmane.org>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>	<20070713173936.53C213A404D@sparrow.telecommunity.com>	<f7pgki$6o3$1@sea.gmane.org>	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com><20070720174706.AE5773A40A8@sparrow.telecommunity.com>	<46A19FCC.7070609@acm.org>
	<f7sc4k$bht$1@sea.gmane.org>
Message-ID: <46A24AC7.3050505@acm.org>

Joe Smith wrote:

> The effects of this can be wonderful. A package could convert some of a 
> framework's functions to generics, to allow them to handle the new objects 
> the package provides. It might also need to add some before and after 
> methods to ensure that, to the user of the module, it looks like the framework 
> was designed to support the module in question, when in fact, it was not. 
> The idea being the package can basically make the needed changes so that 
> everything just works. All without having to duplicate any code from the 
> framework itself.
> See the benefits? (The framework mentioned could be a major framework like 
> ZOPE, just an average package, or even a simple module.)

When considering the decision to include a new feature into the 
language, one has to consider the costs as well as the benefits. You've 
made an impassioned argument showing all the wonderful power and 
expressiveness of these various features. However, power and 
expressiveness are not the only factors that should be considered.

To give an analogy, think back 20-25 years ago, when there was still a 
vocal contingent of programmers who were in favor of self-modifying 
assembly code. Expert hackers would show the amazing power of this 
technique, all of the wonderfully clever tricks that you could accomplish.

(I remember this because I was writing games back then, and 
self-modifying code was the only way you could write 6502 assembly code 
that was actually efficient. Since the 6502 had no 16-bit index 
registers, the only way to have efficient arrays larger than 256 bytes 
was to calculate the address and then modify the 16-bit address field of 
the subsequent instruction.)

At the same time, however, this clever technique came at a cost: 
Programs that were very difficult to debug or even understand. Many 
people spoke out against it, and for a time it seemed that the technique 
was a dying art.

Today we have the best of both worlds: We still have self-modifying 
code, but nowadays we call it JIT: Just-In-Time compilation. Instead of 
a free-for-all where a programmer can modify any arbitrary memory 
address, instead the power of run-time code generation is safely 
sandboxed inside of a JIT compiler component that is very competent at 
hiding the grisly details from the programmer.

Now, don't think that I am directly comparing method combination to 
self-modifying assembly code. I'm not saying that such things are 
inherently dangerous and should be avoided.

Rather, what I am trying to point out is the *thought process* that 
should be applied to any new feature.

Python is a "small" language in the sense that it's easy to hold the 
entire syntax in your head, and lots of people want to keep it that way. 
This does not mean that we can't move forward with new features. But it 
means that each feature needs to be judged and weighed as to how much it 
affects that "mental smallness" of the language.

Generic functions are favored because they have the potential to 
*shrink* certain kinds of problems. I don't mean in the sense of 
requiring the programmer to type fewer keystrokes, but in the sense of 
shrinking how much brainpower it takes to think about the problem.

But even then it took Guido several months (according to a posting he 
made some time ago) of thinking about generics before he reached his 
"Aha" moment with regards to completely grokking the concept. This focus 
on practicality rather than rocket science is exactly why Guido's a good 
gatekeeper in these matters - if he doesn't understand why it's 
important or useful, it probably means that lots of other Python 
developers won't either.

-- Talin

From pje at telecommunity.com  Sat Jul 21 20:16:57 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat, 21 Jul 2007 14:16:57 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <46A19FCC.7070609@acm.org>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
Message-ID: <20070721181442.48FB03A403A@sparrow.telecommunity.com>

At 10:55 PM 7/20/2007 -0700, Talin wrote:
>You mentioned earlier that there was a design reason for preferring
>@overload and @when vs. the earlier RuleDispatch syntax, but the
>explanation you gave wasn't very clear (to me anyway).
>
>(I personally prefer the @somegeneric.overload, but that's purely an
>aesthetic value judgement - if there's a strong architectural advantage
>of the other syntax, I'd like to hear it.)

You can't add new method combinations that way.  If method combining 
is a function, not a method, then you can add as many new method 
types as you like.  If you have to use @somegeneric.before and 
@somegeneric.after, you can't decide on your own to add @somegeneric.debug.

However, if it's @before(somegeneric...), then you can add @debug and 
@authorize and @discount and whatever else you need for your 
application, without needing to monkeypatch them in.
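
To make that concrete, here is a toy sketch. It is not the PEP 3124 API: the Generic class, the before_hooks list and the debug_log list are all invented for the example. It only shows why function-style decorators leave room for application-defined ones:

```python
class Generic:
    """Toy stand-in for a generic function (hypothetical, not PEP 3124)."""

    def __init__(self, func):
        self.primary = func
        self.before_hooks = []

    def __call__(self, *args, **kwargs):
        for hook in self.before_hooks:
            hook(*args, **kwargs)
        return self.primary(*args, **kwargs)


def before(gf):
    """Function-style decorator: run the decorated function before gf."""
    def register(func):
        gf.before_hooks.append(func)
        return func
    return register


# An application can define its own combinator without touching Generic.
debug_log = []


def debug(gf):
    def register(func):
        def traced(*args, **kwargs):
            debug_log.append((func.__name__, args))
            return func(*args, **kwargs)
        gf.before_hooks.append(traced)
        return func
    return register


@Generic
def ship(order):
    return "shipped %s" % order


@before(ship)
def check_balance(order):
    if order == "unpaid":
        raise ValueError("outstanding balance")


@debug(ship)
def note_order(order):
    pass
```

Had before been a method on the generic function object, adding debug would have meant patching that object's class.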

To me, TOOOWTDI means that all (or nearly all) the decorators should 
follow the same pattern.


>Right. There are two reasons that I think that post-hoc overloading runs
>into problems. The first, as you mentioned, is that it's difficult to
>implement without some kind of trickery.

Well, that depends on what you define as "trickery", but clearly 
Guido feels that being able to overload an existing function without 
having to go through the code of every possible client is indeed "trickery".

IMO, however, going through the code of the clients is an 
unreasonable and unscalable task that goes against the whole point of 
the exercise: to "assert qualified statements over oblivious code" 
(one common definition of aspect-oriented programming).  If I have to 
go through all the code that might have imported the function and 
stored it somewhere, that's hardly oblivious.  It creates an 
opportunity for invisible, *import sequence-dependent* bugs, that can 
be reintroduced any time somebody changes an import statement!

So the irony, IMO, of avoiding this "trickery" is that it makes the 
practice error-prone, thereby providing a self-fulfilling 
justification for avoiding its use.  (Whereas, if the "trickery" were 
allowed, it would be much safer to actually use it.)

All that having been said, I'm still willing to make an 
implementation that does it Guido's way.  I just don't agree that the 
restriction is justified.  But more on that below.


>The second reason - this is my opinion - is that it too much resembles
>the mythical "comefrom" statement (the opposite of "goto"). The
>"comefrom" statement is intended to be a joke - the worst possible
>language feature from the standpoint of being able to manually trace the
>flow of execution of a program.

Well, I've worked with people who dislike OO for exactly the same 
reason, since they feel they can never know whether a method might 
have been overridden in a subclass.  Seriously!

However, for the specific use cases *I* have in mind, you'd be using 
oblivious extension to implement customer-specific business rules, 
layered atop a core framework.  You don't want to waste time 
declaring *everything* overloadable, any more than you declare 
classes to be subclassable!  You just need to be able to write the 
customer's rules in one place.  So if you're trying to follow 
something manually, you're going to look at that customer's business 
rule modules in order to know about the exceptional control flow.

I don't think that's really comparable to the joke implementation of 
"come from".  In any system, the more the computer does for you, the 
harder it will be for you to mentally emulate what the computer's 
doing, step-by-step.  That's simply the nature of the beast.

However, in the case of rule-based declarative abstractions, you're 
getting closer to something that's *easier* for the brain to 
model.  Our brains run by pattern recognition, with more-specific 
patterns taking precedence, so this is an easier model for your brain 
to follow than step-by-step computation anyway.  Certainly, it's an 
easier model for your software customers to provide you with in the 
first place.

I.e., customers usually don't give you a step-by-step, "well, first I 
check if the customer has an outstanding balance before I ship them 
anything."  They say, "Don't ship stuff to people with an outstanding balance."

And guess what?  Viewed formally, that's a "come from" statement.

So the most straightforward expression of typical business rules and 
requirements, is going to consist of a list of come-froms.  So coding 
them that way actually gets us more verifiable requirements, and a 
simpler mental model to *produce* the code in the first place.


>One issue that hasn't been satisfactorily resolved is the handling of
>the 'self' parameter. At least, let me give my explanation of what I
>think the issue is and see if we're on the same page:
>
>Overloading a class method requires special treatment of the 'self'
>parameter because there's an implicit constraint on what types of
>objects can be passed as 'self': for any method defined in any class,
>the 'self' parameter must be an instance of the class (or a subclass) in
>which the method is defined. Now, this would be trivial if we required
>the programmer to explicitly declare the type of 'self', but this
>violates DRY and has the potential to cause mischief if the programmer
>forgets to update the method signature when they change the class.

Well, actually that never occurred to me, because obviously you can't 
do that (refer to the class before it's finished being defined).  :)


>If it turns out that there's no way to get a callback when the class has
>finished being built, then we may have to defer finishing the
>construction until the first time the generic function is called. This
>wouldn't be too bad, considering that there's a bunch of other stuff
>that is lazily calculated on first call anyway, from what I understand.

Actually, this isn't anywhere near as complicated as all the stuff I 
just snipped from the above.  :)  All that matters is whether the 
decorator is invoked in the body of a class.  If it is, it needs a 
callback to finish the job.  If it isn't, it can immediately go ahead 
with what it's doing.
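
For illustration only: one way to get such a callback in a much later Python (3.6 and up) is the __set_name__ hook, which did not exist when this was written and is not what RuleDispatch did. The names in_class_rule and pending are hypothetical:

```python
pending = []


class in_class_rule:
    """Decorator that finishes its work once the enclosing class exists."""

    def __init__(self, func):
        self.func = func

    def __set_name__(self, owner, name):
        # Invoked during class creation, after the class body has run;
        # this is the "callback" that can finish the registration.
        pending.append((owner.__name__, name))

    def __get__(self, obj, objtype=None):
        # Behave like the plain function for attribute access.
        return self.func.__get__(obj, objtype)


class Account:
    @in_class_rule
    def close(self):
        return "closed"
```

Used outside a class body, __set_name__ is never called, so a real implementation would still need a way to register immediately in that case.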

Note that this was implemented in RuleDispatch literally years ago; 
it's only the loss of __metaclass__ that presents a problem for a 
Py3K implementation.


From dalke at dalkescientific.com  Sat Jul 21 23:16:35 2007
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Sat, 21 Jul 2007 23:16:35 +0200
Subject: [Python-3000] removing exception .args
In-Reply-To: <46A24A51.1090101@gmail.com>
References: <9A9F27CC-D660-4C09-8D8C-5C4DDD66D2E6@dalkescientific.com>
	<46A24A51.1090101@gmail.com>
Message-ID: <590554CB-D940-4424-8CD5-154F73732DE1@dalkescientific.com>

The main statement I have is, excepting backwards compatibility,
nothing would care if .args was removed in 3.0, and those which
currently used .args were changed to use attributes instead.

Please show/advise me otherwise.

> Andrew Dalke wrote:
>> so I think the base exception class should look like
>> class Exception(object):
>>    msg = None
>>    def __init__(self, msg):
>>      self.msg = msg
>>    def __str__(self):
>>      if self.msg is None:
>>        return "%s()" % (self.__class__.__name__,)
>>      else:
>>        return "%s(%r)" % (self.__class__.__name__, self.msg)

On Jul 21, 2007, at 8:02 PM, Nick Coghlan wrote:
> Went there, didn't like it, left again. See PEP 352,

Sure, fine.  The "pragmatic" thing I care about is allowing a
single argument to be passed in the base exception class, which
in turn is used in the __str__ / __repr__.  If it's called
"message" or "msg" or stored in .args as a single element
tuple, I don't care.

For example, this would also be fine to me.

class Exception(object):
    __obj = object()
    def __init__(self, msg):
      self.__obj = msg
    def __repr__(self):
      if self.__obj is Exception.__obj:
        return "%s()" % (self.__class__.__name__,)
      else:
        return "%s(%r)" % (self.__class__.__name__, self.__obj)


> especially the section on the late (unlamented) BaseException.message.

I'm more hoping for this part of the "retracted ideas" section:

     ... and consider a more long-term transition strategy in
     Python 3.0 to remove multiple-argument support in
     BaseException in preference of accepting only a single argument.

That section also says that removing 'args' during the transition
is hard.  I can believe it.  But Python 3 can be non-backwards
compatible.

> As of Python 2.5, you can rely on the attribute being present,
> as it is provided automatically by BaseException:

Yes, I know that.

Is it useful?  Is having an autogenerated, empty .args useful?

Why?  What code would break? (excepting backwards compatibility
for code that expects to extract information via position instead
of attribute)

As far as I can tell, it's not useful.  And that's why it
should be deleted.

If it were useful, then explain why 'filename' isn't in the
args list for IOError, as in

 >>> import os
 >>> err = IOError(2, os.strerror(2), "/path/to/nowhere")
 >>> err.args
(2, 'No such file or directory')
 >>> repr(err)
"IOError(2, 'No such file or directory')"
 >>> err.errno
2
 >>> err.strerror
'No such file or directory'
 >>> err.filename
'/path/to/nowhere'
 >>>


Answer: it's a bug.  But it's a bug that no one really
cares about.  Its lack affects no one.  And removing 'args'
would affect .. no one.  Excepting code which currently
expects to get fields [0], [1], ... when the original
exception should have defined attributes instead.
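
For anyone checking this on a current Python 3, where IOError is an alias of OSError, a short sketch of the same asymmetry. The message string is passed in literally so the result does not depend on the platform's strerror text:

```python
import errno

err = OSError(errno.ENOENT, "No such file or directory", "/path/to/nowhere")

# The third constructor argument is reachable only as an attribute.
positional = err.args
by_name = (err.errno, err.strerror, err.filename)
```

Here positional holds only the first two constructor arguments, while by_name holds all three.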

> If you want to avoid requiring that subclasses call your
> __init__ method, you can actually do that by putting any
> essential initialisation into the __new__ method instead.

That wasn't my point.  My point is that many non-trivial
exception classes don't currently call the base class
__init__ nor set the .args attribute.  That one class I
showed was an example of defensive programming - knowing
that there's a decent chance that derived classes won't
call the __init__.

There should be no reason to be this defensive.  Most
other classes are not.  That getopt example was a second-order
effect and should not be a driving case for any future
direction.

The real problem isn't that .args wasn't initialized.
The real problem is that .args shouldn't need to exist.

(In personal email I did a followup on why I think __new__
should not be used for this case, or for the more generally
advocated case of "setting up class invariants in the base
class."  I felt that that was a distracting tangent.)


				Andrew
				dalke at dalkescientific.com


From greg.ewing at canterbury.ac.nz  Sun Jul 22 02:26:24 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 22 Jul 2007 12:26:24 +1200
Subject: [Python-3000] removing exception .args
In-Reply-To: <f7t7hk$f3c$1@sea.gmane.org>
References: <9A9F27CC-D660-4C09-8D8C-5C4DDD66D2E6@dalkescientific.com>
	<f7t7hk$f3c$1@sea.gmane.org>
Message-ID: <46A2A430.1090405@canterbury.ac.nz>

Georg Brandl wrote:
> Hm, I always found it useful to just do
> 
> class MyCustomError(Exception):
>      pass
> 
> and give it arbitrary arguments that I can access from outside,
> without writing any __init__ method stuff.

Maybe

   class Exception(object):

     def __init__(self, msg = None, **kwds):
       self.msg = msg
       self.__dict__.update(kwds)

Then you'd have to pass your extra args as keyword args,
but you could still avoid having an __init__ if you wanted.
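
A runnable sketch of that idea, using a separately named base class (KeywordError, a made-up name) rather than replacing the built-in Exception:

```python
class KeywordError(Exception):
    """Base class that stores extra keyword arguments as attributes."""

    def __init__(self, msg=None, **kwds):
        Exception.__init__(self, msg)
        self.msg = msg
        self.__dict__.update(kwds)


class MyCustomError(KeywordError):
    pass  # no __init__ needed in the subclass


err = MyCustomError("disk full", device="/dev/sda1", free_bytes=0)
```

The extra values are then available as err.device and err.free_bytes, by name rather than by position.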

--
Greg

From brett at python.org  Sun Jul 22 02:46:21 2007
From: brett at python.org (Brett Cannon)
Date: Sat, 21 Jul 2007 17:46:21 -0700
Subject: [Python-3000] removing exception .args
In-Reply-To: <9A9F27CC-D660-4C09-8D8C-5C4DDD66D2E6@dalkescientific.com>
References: <9A9F27CC-D660-4C09-8D8C-5C4DDD66D2E6@dalkescientific.com>
Message-ID: <bbaeab100707211746r5a4cb2a2rafb066bb24260206@mail.gmail.com>

On 7/21/07, Andrew Dalke <dalke at dalkescientific.com> wrote:
> Posting here an expansion of a short discussion I had
> after Guido's keynote at EuroPython.  In this email
> I propose eliminating the ".args" attribute from the
> Exception type.  It's not useful, and supporting it
> correctly is complicated enough that it's often not
> supported correctly
>

This was originally proposed in PEP 352.  This was the reason for the
existence of the 'message' attribute as introduced in Python 2.5.  At
PyCon 2007 I actually removed 'args' (see the p3yk_no_args_on_exc
branch in svn: http://svn.python.org/view/python/branches/p3yk_no_args_on_exc/).

But after making everyone at PyCon suffer through my swearing and
frustration and talking with python-dev (so the discussion should be
in the python-dev/python-3000 archives), the decision was made to not
remove it (which is why 'message' is deprecated in Python 2.6).  This was
because the removal at the C level is very painful.  There are many
places within the code where a tuple is passed to various C functions
that expect that tuple to be treated as multiple arguments to the
exception constructor.

But changing the semantics of a C function has already been labeled a
no-no.  So one would have to remove the C functions that construct
exceptions with arguments and use a new one that only expects a single
argument so as not to have unexpected semantics.  That sucks because
those functions are all over.

In the branch I just stuck the tuple into the 'message' attribute, but
that caused its own issues as output was now a little funky since
everything was considered a tuple, including single arguments.

So while I totally understand the desire to ditch 'args' and just have
'message', doing so thoroughly and in any reasonable way that is not
painful is not easy thanks to the C API.

-Brett

From greg.ewing at canterbury.ac.nz  Sun Jul 22 02:47:36 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 22 Jul 2007 12:47:36 +1200
Subject: [Python-3000] removing exception .args
In-Reply-To: <46A24A51.1090101@gmail.com>
References: <9A9F27CC-D660-4C09-8D8C-5C4DDD66D2E6@dalkescientific.com>
	<46A24A51.1090101@gmail.com>
Message-ID: <46A2A928.9070705@canterbury.ac.nz>

Nick Coghlan wrote:
> Putting the essential parts in 
> __new__ means never having to include the instruction that "you must 
> call this class's __init__ method when subclassing and overriding 
> __init__" into any API documentation I write.

I always assume that I *do* have to call the base __init__
if I override it, unless something explicitly says that
I don't. And I assume other people follow the same rule,
so I don't feel obliged to spell it out when I document
my own classes.

--
Greg

From dalke at dalkescientific.com  Sun Jul 22 03:11:46 2007
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Sun, 22 Jul 2007 03:11:46 +0200
Subject: [Python-3000] removing exception .args
Message-ID: <D14DF1E2-0737-4D5B-8133-A08FD2F6BA61@dalkescientific.com>

Brett:
> This was originally proposed in PEP 352.

> So while I totally understand the desire to ditch 'args' and just have
> 'message', doing so thoroughly and in any reasonable way that is not
> painful is not easy thanks to the C API.

*sigh*

I read through the back python 3k list postings on this.
I see this topic is pending further input.

     which is why I am asking if people are still supportive
     of this?

I can offer nothing there as I don't dwell in the depths
of the C API.


Does ".args" need to be visible to Python code?
That would hide the problem, yes?


I've been reading the docs, and found the clause related to
IOError not having the filename in the args tuple.

     When an EnvironmentError exception is instantiated with a
     3-tuple, the first two items are available as above, while
     the third item is available on the filename attribute.
     However, for backwards compatibility, the args attribute
     contains only a 2-tuple of the first two constructor arguments.


At the very least, could this be fixed?


				Andrew
				dalke at dalkescientific.com


From greg.ewing at canterbury.ac.nz  Sun Jul 22 03:09:05 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 22 Jul 2007 13:09:05 +1200
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070721181442.48FB03A403A@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
Message-ID: <46A2AE31.2080105@canterbury.ac.nz>

Phillip J. Eby wrote:
> I.e., customers usually don't give you a step-by-step, "well, first I 
> check if the customer has an outstanding balance before I ship them 
> anything."  They say, "Don't ship stuff to people with an outstanding balance."

In my experience, customers often give you a vague,
incomplete and even contradictory set of rules. It
takes a lot of careful thought to refine them into
something complete and coherent, and it requires
considering all the rules together to see how they
interact with each other.

The GF approach encourages scattering the rules
over different parts of the program, and I can't
see how that helps with this process.

--
Greg

From dalke at dalkescientific.com  Sun Jul 22 03:16:02 2007
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Sun, 22 Jul 2007 03:16:02 +0200
Subject: [Python-3000] removing exception .args
In-Reply-To: <D14DF1E2-0737-4D5B-8133-A08FD2F6BA61@dalkescientific.com>
References: <D14DF1E2-0737-4D5B-8133-A08FD2F6BA61@dalkescientific.com>
Message-ID: <261196CB-F8BB-4727-96B5-EDDAEA12E54B@dalkescientific.com>

Andrew Dalke:
> Does ".args" need to be visible to Python code?
> That would hide the problem, yes?

I see I'm not getting all messages on this thread.
Looked at the archive and saw:

Guido:
> So -1 on removing e.args. I'd be okay with a recommendation not to
> rely on it and to define explicitly named attributes for everything
> one cares for.

Okay.  Sounds like the best that can happen.


				Andrew
				dalke at dalkescientific.com


From greg.ewing at canterbury.ac.nz  Sun Jul 22 03:28:27 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 22 Jul 2007 13:28:27 +1200
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070721181442.48FB03A403A@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
Message-ID: <46A2B2BB.9070305@canterbury.ac.nz>

Phillip J. Eby wrote:
> Well, I've worked with people who dislike OO for exactly the same 
> reason, since they feel they can never know whether a method might 
> have been overridden in a subclass.

I think there's a considerable difference in degree here,
though. When you call a method, you know you're delegating
responsibility to the object for carrying out that operation.
And you know you're delegating it to that object and no
other, so given the run-time type you can find the code
that gets called fairly easily.

With GFs that require overloadable functions to be declared
as such, you know when you call one that you're delegating
to something. But it's a lot less clear what you're
delegating to or where. Any or all of the arguments could
be determining which piece of code gets called, and the
code could be in a much wider variety of places, not
necessarily even near any of the classes involved.

If any function can be overloaded, then *any* call could
potentially be delegating somewhere, increasing the range
of possible behaviours even more.

--
Greg

From greg.ewing at canterbury.ac.nz  Sun Jul 22 03:46:24 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 22 Jul 2007 13:46:24 +1200
Subject: [Python-3000] removing exception .args
In-Reply-To: <D14DF1E2-0737-4D5B-8133-A08FD2F6BA61@dalkescientific.com>
References: <D14DF1E2-0737-4D5B-8133-A08FD2F6BA61@dalkescientific.com>
Message-ID: <46A2B6F0.9080903@canterbury.ac.nz>

Andrew Dalke wrote:

>      However, for backwards compatibility, the args attribute
>      contains only a 2-tuple of the first two constructor arguments.

This is a good reason for having named attributes instead
of a tuple -- it's extensible without requiring these sorts
of hacks.

As for the C function problem -- are these functions
instantiating some known exception class? If so, why
can't that class be given an __init__ that accepts the
appropriate arguments positionally and stores them as
attributes (or passes them on as keywords args as per
my earlier suggestion)?
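
A minimal sketch of the suggestion above, assuming a hypothetical
exception class (ParseFailure and its attribute names are made up for
illustration and do not come from any of the C functions discussed in
this thread). The __init__ takes its arguments positionally, stores
them under names, and still hands them to the base class so that
.args stays populated for older code:

```python
class ParseFailure(Exception):
    """Hypothetical exception class, used only for illustration."""

    def __init__(self, message, filename=None, lineno=None):
        # Pass the values on so that exc.args keeps working.
        super().__init__(message, filename, lineno)
        # Named attributes are what callers are encouraged to rely on.
        self.message = message
        self.filename = filename
        self.lineno = lineno


try:
    # A C function could raise this by calling the class positionally.
    raise ParseFailure("unexpected token", "spam.cfg", 12)
except ParseFailure as exc:
    caught = exc

print(caught.filename, caught.lineno)
```

Adding a fourth named attribute later would not change what the
first three mean, which is the extensibility argument made above.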

--
Greg

From pje at telecommunity.com  Sun Jul 22 03:58:49 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat, 21 Jul 2007 21:58:49 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <46A2B2BB.9070305@canterbury.ac.nz>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
	<46A2B2BB.9070305@canterbury.ac.nz>
Message-ID: <20070722015630.8F34C3A403A@sparrow.telecommunity.com>

At 01:28 PM 7/22/2007 +1200, Greg Ewing wrote:
>Phillip J. Eby wrote:
> > Well, I've worked with people who dislike OO for exactly the same
> > reason, since they feel they can never know whether a method might
> > have been overridden in a subclass.
>
>I think there's a considerable difference in degree here,
>though. When you call a method, you know you're delegating
>responsibility to the object for carrying out that operation.
>And you know you're delegating it to that object and no
>other, so given the run-time type

Well, if you're looking at *run-time*, then you can equally well dump 
out the runtime contents of a generic function, complete with 
modules, filenames, and line numbers of every method.  In the 
peak.rules.core case, that operation would look something like:

     from peak.rules.core import rules_for
     print(list(rules_for(somefunc)))

Although you'd probably want nicer formatting.  But that wouldn't be 
hard to add.


>If any function can be overloaded, then *any* call could
>potentially be delegating somewhere, increasing the range
>of possible behaviours even more.

That's exactly true of today's Python, and always has been.  Heck, 
somebody can change a class' __bases__ at runtime, or change the 
class of an object on the fly.
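
As a small illustration of the two kinds of dynamism just mentioned
(the class names here are invented for the example), ordinary classes
can have their bases replaced, and an instance can be given a
different class, while the program runs:

```python
class English:
    def greet(self):
        return "hello"


class French:
    def greet(self):
        return "bonjour"


class Speaker(English):
    pass


speaker = Speaker()
before = speaker.greet()          # resolved through English

# Replace the base class of Speaker at run time.
Speaker.__bases__ = (French,)
after_rebase = speaker.greet()    # now resolved through French

# Change the class of an existing object on the fly.
speaker.__class__ = English
after_reclass = speaker.greet()   # now an English instance
```

Nothing at the call site speaker.greet() hints that either change
has happened.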

I don't think that anybody's saying that unrestricted use of dynamism 
is good, or that it can't be abused.  However, the potential for 
abuse is no different.  If anything, generic functions allow more 
*structured* dynamism, because two different modules can safely add 
methods to a function, instead of being tempted to reimplement and 
monkeypatch it.
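
A rough sketch of that "structured dynamism", using
functools.singledispatch as a stand-in. This is an assumption made for
the sake of a runnable example: singledispatch is a later standard
library facility that dispatches on the first argument only, and it is
neither the PEP 3124 API nor peak.rules. The function and the two
"modules" are hypothetical:

```python
from functools import singledispatch


@singledispatch
def describe(obj):
    """Fallback used when no more specific method is registered."""
    return "something"


# Imagine this registration living in one third-party module ...
@describe.register(int)
def _describe_int(obj):
    return "an integer"


# ... and this one in another, unrelated module.
@describe.register(list)
def _describe_list(obj):
    return "a list of %d items" % len(obj)


# Neither module replaced or wrapped describe(); both only added to it.
print(describe(3), describe([1, 2]), describe(2.5))

# The registered types can be listed at run time.
print(sorted(t.__name__ for t in describe.registry))
```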


From pje at telecommunity.com  Sun Jul 22 04:06:40 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat, 21 Jul 2007 22:06:40 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <46A2AE31.2080105@canterbury.ac.nz>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
	<46A2AE31.2080105@canterbury.ac.nz>
Message-ID: <20070722020422.5AAAC3A403A@sparrow.telecommunity.com>

At 01:09 PM 7/22/2007 +1200, Greg Ewing wrote:
>Phillip J. Eby wrote:
> > I.e., customers usually don't give you a step-by-step, "well, first I
> > check if the customer has an outstanding balance before I ship them
> > anything."  They say, "Don't ship stuff to people with an 
> outstanding balance."
>
>In my experience, customers often give you a vague,
>incomplete and even contradictory set of rules. It
>takes a lot of careful thought to refine them into
>something complete and coherent, and it requires
>considering all the rules together to see how they
>interact with each other.

Which is why it's good to be able to group those rules *together* -- 
especially grouping one customer's rules separately from 
another's.  Putting them both into your core code would make the 
system harder to understand, and harder to distinguish the rules 
applying to that customer.


>The GF approach encourages scattering the rules
>over different parts of the program,

You seem to be saying that the ability to put things in different 
places encourages disorganization.

I claim the contrary: being able to put GF methods in different 
places means that you are able to put things in a *more* logical 
organization than is possible with only classes.

Yes, it certainly *enables* you to be more disorganized, if that's 
what you wish.  But why would you *do* that?  It makes no sense.  It 
seems to me that by that argument, we shouldn't have modules, because 
people might put a class and its subclass in two different 
modules.  But that's a *feature*, because it lets you organize things 
according to other dimensions that might be more important to 
understanding the program, than the inheritance relationship between classes.


From ncoghlan at gmail.com  Sun Jul 22 05:26:11 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 22 Jul 2007 13:26:11 +1000
Subject: [Python-3000] removing exception .args
In-Reply-To: <46A2A928.9070705@canterbury.ac.nz>
References: <9A9F27CC-D660-4C09-8D8C-5C4DDD66D2E6@dalkescientific.com>	<46A24A51.1090101@gmail.com>
	<46A2A928.9070705@canterbury.ac.nz>
Message-ID: <46A2CE53.9070701@gmail.com>

Greg Ewing wrote:
> Nick Coghlan wrote:
>> Putting the essential parts in 
>> __new__ means never having to include the instruction that "you must 
>> call this class's __init__ method when subclassing and overriding 
>> __init__" into any API documentation I write.
> 
> I always assume that I *do* have to call the base __init__
> if I override it, unless something explicitly says that
> I don't. And I assume other people follow the same rule,
> so I don't feel obliged to spell it out when I document
> my own classes.

Andrew actually pointed out a flaw in my suggestion - if the person 
subclassing wants to change the constructor signature, they end up 
needing to override both __new__ and __init__, rather than just __init__. 
So the implementation trick is exposed more than I thought, and the idea 
is far less useful outside of tightly controlled class hierarchies 
(which is where I've personally used it).
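
A short sketch of the flaw described above, with hypothetical class
names. The base class does its essential setup in __new__; a subclass
that changes the constructor signature by overriding only __init__
fails, because __new__ still receives the new arguments:

```python
class Tracked:
    """Base class that does its essential setup in __new__."""

    def __new__(cls, name):
        self = super().__new__(cls)
        self.name = name          # set even if __init__ is overridden
        return self


class OnlyInit(Tracked):
    # New signature, but __new__ is inherited unchanged.
    def __init__(self, name, extra):
        self.extra = extra


try:
    OnlyInit("a", 1)
    failed = False
except TypeError:
    # Tracked.__new__ was called with an argument it does not accept.
    failed = True


class Both(Tracked):
    # Changing the signature means overriding __new__ as well.
    def __new__(cls, name, extra):
        return super().__new__(cls, name)

    def __init__(self, name, extra):
        self.extra = extra


obj = Both("a", 1)
```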

/end 
tangent-that-I'd-regret-bringing-up-except-for-the-fact-that-I-learnt-something

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From joe at bitworking.org  Sun Jul 22 06:35:14 2007
From: joe at bitworking.org (Joe Gregorio)
Date: Sun, 22 Jul 2007 00:35:14 -0400
Subject: [Python-3000] pyexpat: returns_unicode str/unicode branch
Message-ID: <3f1451f50707212135l56f90d56p4088957d12ab36cd@mail.gmail.com>

On 7/21/07, Fred L. Drake, Jr. <fdrake at acm.org> wrote:
> On Saturday 21 July 2007, Joe Gregorio wrote:
>  > Should xml.parsers.expat.XMLParser.ParseFile(file) operate on
>  > both text and binary streams?
>
> No.  XML is a serialization of a markup language containing Unicode
> characters into an encoded stream.

Along the same lines, since all strings are now unicode,
should "returns_unicode" be dropped from xmlparser objects?
That is, the handler functions will always be passed unicode
strings and not utf-8 bytes.

   Thanks,
   -joe

-- 
Joe Gregorio        http://bitworking.org

From martin at v.loewis.de  Sun Jul 22 09:56:26 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 22 Jul 2007 09:56:26 +0200
Subject: [Python-3000] pyexpat: returns_unicode str/unicode branch
In-Reply-To: <3f1451f50707212135l56f90d56p4088957d12ab36cd@mail.gmail.com>
References: <3f1451f50707212135l56f90d56p4088957d12ab36cd@mail.gmail.com>
Message-ID: <46A30DAA.3040204@v.loewis.de>

> Along the same lines, since all strings are now unicode,
> should "returns_unicode" be dropped from xmlparser objects?
> That is, the handler functions will always be passed unicode
> strings and not utf-8 bytes.

Sure.

Martin

From martin at v.loewis.de  Sun Jul 22 10:00:18 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 22 Jul 2007 10:00:18 +0200
Subject: [Python-3000] str/unicode tests: pyexpat.c and read(n)
In-Reply-To: <59C0A7B2-B334-4984-AA8E-CA024B73553B@fuhm.net>
References: <3f1451f50707202112ye61385fifb4b2307f7fdf536@mail.gmail.com>	<200707210025.11031.fdrake@acm.org>
	<59C0A7B2-B334-4984-AA8E-CA024B73553B@fuhm.net>
Message-ID: <46A30E92.5040400@v.loewis.de>

> Sure, normally XML is serialized to bytes, but it is also  
> serializable to unicode, and that's a useful feature to have (if  
> implementable).

It's not reasonably implementable; users who have use cases
will have to encode as UTF-8 first.

Regards,
Martin

From fdrake at acm.org  Sun Jul 22 15:50:51 2007
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Sun, 22 Jul 2007 09:50:51 -0400
Subject: [Python-3000] pyexpat: returns_unicode str/unicode branch
In-Reply-To: <3f1451f50707212135l56f90d56p4088957d12ab36cd@mail.gmail.com>
References: <3f1451f50707212135l56f90d56p4088957d12ab36cd@mail.gmail.com>
Message-ID: <200707220950.52076.fdrake@acm.org>

On Sunday 22 July 2007, Joe Gregorio wrote:
 > Along the same lines, since all strings are now unicode,
 > should "returns_unicode" be dropped from xmlparser objects?
 > That is, the handler functions will always be passed unicode
 > strings and not utf-8 bytes.

Yes.  This was always a backward-compatibility point, but it's been a long 
time since the default was to return UTF-8.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From guido at python.org  Sun Jul 22 17:43:54 2007
From: guido at python.org (Guido van Rossum)
Date: Sun, 22 Jul 2007 08:43:54 -0700
Subject: [Python-3000] str/unicode tests: pyexpat.c and read(n)
In-Reply-To: <46A30E92.5040400@v.loewis.de>
References: <3f1451f50707202112ye61385fifb4b2307f7fdf536@mail.gmail.com>
	<200707210025.11031.fdrake@acm.org>
	<59C0A7B2-B334-4984-AA8E-CA024B73553B@fuhm.net>
	<46A30E92.5040400@v.loewis.de>
Message-ID: <ca471dc20707220843i345c0fdcld852fb9f26a97b04@mail.gmail.com>

On 7/22/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> > Sure, normally XML is serialized to bytes, but it is also
> > serializable to unicode, and that's a useful feature to have (if
> > implementable).
>
> It's not reasonably implementable; users who have use cases
> will have to encode as UTF-8 first.

Now I'm confused. Are we proposing that all our XML APIs read and
write encoded bytes, or are we proposing that they read and write
Unicode strings, leaving the encoding/decoding to the I/O stream? I
thought the latter was preferred but now it looks like you're arguing
for the former?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From fdrake at acm.org  Sun Jul 22 17:56:34 2007
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Sun, 22 Jul 2007 11:56:34 -0400
Subject: [Python-3000] str/unicode tests: pyexpat.c and read(n)
In-Reply-To: <ca471dc20707220843i345c0fdcld852fb9f26a97b04@mail.gmail.com>
References: <3f1451f50707202112ye61385fifb4b2307f7fdf536@mail.gmail.com>
	<46A30E92.5040400@v.loewis.de>
	<ca471dc20707220843i345c0fdcld852fb9f26a97b04@mail.gmail.com>
Message-ID: <200707221156.34992.fdrake@acm.org>

On Sunday 22 July 2007, Guido van Rossum wrote:
 > Now I'm confused. Are we proposing that all our XML APIs read and
 > write encoded bytes, or are we proposing that they read and write
 > Unicode strings, leaving the encoding/decoding to the I/O stream? I
 > thought the latter was preferred but now it looks like you're arguing
 > for the former?

XML should always be read as bytes, and the output of serialization should be 
bytes (the Py3k "bytes" type, or some immutable flavor of the same).

The APIs that present data parsed from XML, and that accept input that should 
be serialized in XML, should use Unicode strings (the Py3k "str" type).
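
The split described above can be sketched with
xml.etree.ElementTree, chosen here only because it is a convenient
standard library module for a runnable example; the document content
is made up. Bytes go in, str is what the API presents, and bytes come
out again on serialization:

```python
import xml.etree.ElementTree as ET

# Input: an encoded byte stream (UTF-8 here).
raw = b'<?xml version="1.0" encoding="utf-8"?><city>Z\xc3\xbcrich</city>'

root = ET.fromstring(raw)
city = root.text                 # parsed data is a str

# Data handed to the API for serialization is also a str.
root.text = "Köln"

# Asking for a concrete encoding yields bytes.
out = ET.tostring(root, encoding="utf-8")

print(type(city).__name__, type(out).__name__)
```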


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From martin at v.loewis.de  Sun Jul 22 18:30:26 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 22 Jul 2007 18:30:26 +0200
Subject: [Python-3000] str/unicode tests: pyexpat.c and read(n)
In-Reply-To: <ca471dc20707220843i345c0fdcld852fb9f26a97b04@mail.gmail.com>
References: <3f1451f50707202112ye61385fifb4b2307f7fdf536@mail.gmail.com>	
	<200707210025.11031.fdrake@acm.org>	
	<59C0A7B2-B334-4984-AA8E-CA024B73553B@fuhm.net>	
	<46A30E92.5040400@v.loewis.de>
	<ca471dc20707220843i345c0fdcld852fb9f26a97b04@mail.gmail.com>
Message-ID: <46A38622.1010505@v.loewis.de>

Guido van Rossum schrieb:
> On 7/22/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> > Sure, normally XML is serialized to bytes, but it is also
>> > serializable to unicode, and that's a useful feature to have (if
>> > implementable).
>>
>> It's not reasonably implementable; users who have use cases
>> will have to encode as UTF-8 first.
> 
> Now I'm confused. Are we proposing that all our XML APIs read and
> write encoded bytes, or are we proposing that they read and write
> Unicode strings, leaving the encoding/decoding to the I/O stream? 

Unicode strings in both cases.

I was not talking about writing at all; pyexpat only does reading
(aka parsing). It returns Unicode strings, but processes bytes.

> I
> thought the latter was preferred but now it looks like you're arguing
> for the former?

The XML parser input stream should be byte-oriented. XML has its own
notion of input encoding (expressed in the XML declaration, <?xml...);
it's the job of the parser to figure it out. Having the user provide
a character-oriented stream to the parser is both inconvenient and
error-prone: the application would have to figure out the encoding
itself first.
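
A minimal sketch of that behaviour with xml.parsers.expat as it
exists in current Python 3 (the helper name and the sample document
are invented for the example). The caller passes bytes and never
names the encoding; the parser reads it from the XML declaration and
hands str objects to the handler:

```python
import xml.parsers.expat


def character_data(document: bytes) -> str:
    """Return all character data in the document as one str."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    # The handler may be called several times, so collect the pieces.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(document, True)
    return "".join(chunks)


# The bytes are ISO-8859-1, as stated in the declaration itself.
latin1_doc = (
    '<?xml version="1.0" encoding="iso-8859-1"?><name>Löwis</name>'
).encode("iso-8859-1")

text = character_data(latin1_doc)
print(text)
```

Had the application decoded the stream itself, it would first have
needed to find and interpret the declaration, which is the
inconvenience pointed out above.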

Regards,
Martin

From unknown_kev_cat at hotmail.com  Sun Jul 22 21:51:51 2007
From: unknown_kev_cat at hotmail.com (Joe Smith)
Date: Sun, 22 Jul 2007 15:51:51 -0400
Subject: [Python-3000] PEP 368: Standard image protocol and class
References: <cc93256f0706301518kd9fe7a7iaf0e9bd8e2e18edd@mail.gmail.com><cc93256f0706301800m20012379n84aff4ff3df88021@mail.gmail.com><740c3aec0707010534j4049efbchb2389bf61413c300@mail.gmail.com><cc93256f0707010959o44c77912sb989c68cf890b846@mail.gmail.com><f7pk8b$evu$1@sea.gmane.org>
	<46A16803.1020200@canterbury.ac.nz>
Message-ID: <f80cgs$4h4$1@sea.gmane.org>


"Greg Ewing" <greg.ewing at canterbury.ac.nz> wrote in message 
news:46A16803.1020200 at canterbury.ac.nz...
> Joe Smith wrote:
>> If the maintainers of most of the large packages that do imaging are 
>> willing
>> to support this,
>> and your code is good, I see absolutely no reason why this PEP would not 
>> be
>> accepted.
>
> Something that bothers me about it a little is that
> the core Python/C API seems like the wrong place to put
> PyImage_* functions.
>

The document mentions delivering a version of the code that uses Python and 
C. That would be an extension module, correct? Couldn't those functions be 
in the C extension? The docs for 2.5 state that extension modules can 
provide a C API that other modules can use. (I'm assuming that has not 
changed.)
That should work. After all, any extension that needs those functions will 
likely be importing the Image module on the Python side anyway, which would 
require that the C extension for the Image module be loaded.

Or am I missing something?
If I am not missing anything, this sounds like a minor implementation 
issue. 


From greg.ewing at canterbury.ac.nz  Mon Jul 23 01:47:39 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 23 Jul 2007 11:47:39 +1200
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070722015630.8F34C3A403A@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
	<46A2B2BB.9070305@canterbury.ac.nz>
	<20070722015630.8F34C3A403A@sparrow.telecommunity.com>
Message-ID: <46A3EC9B.4020507@canterbury.ac.nz>

Phillip J. Eby wrote:
> Well, if you're looking at *run-time*, then you can equally well dump 
> out the runtime contents of a generic function,

I'm not talking about doing this *at* run time. I'm
talking about reasoning about what the program will
do, based on your knowledge of what the run-time
type will be.

With a normal method call, you can take an assumed
run-time type, start at one end and follow things
through step by step. That's not so easy with
generic functions, for two reasons: (1) all of the
arguments can potentially influence where the
control flow goes, and (2) the overloading code
can be anywhere in the program, not confined to
the classes involved.

I'm not saying this makes GFs impossible to use,
but they do make the programmer's world considerably
more complicated. You can't just brush these concerns
off as being no worse than what OO already provides.

> I don't think that anybody's saying that unrestricted use of dynamism is 
> good, or that it can't be abused.  However, the potential for abuse is 
> no different.

I'm not talking about abuse. I'm only talking about using
GFs the way they're meant to be used. There's more to
think about in the presence of GFs even without any
abuse.

--
Greg


From greg.ewing at canterbury.ac.nz  Mon Jul 23 01:48:07 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 23 Jul 2007 11:48:07 +1200
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070722020422.5AAAC3A403A@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
	<46A2AE31.2080105@canterbury.ac.nz>
	<20070722020422.5AAAC3A403A@sparrow.telecommunity.com>
Message-ID: <46A3ECB7.9070504@canterbury.ac.nz>

Phillip J. Eby wrote:
> You seem to be saying that the ability to put things in different places 
> encourages disorganization.

No. What I'm saying is that there are conflicting organisational
requirements here.

If the things being put in different places were independent
and able to be reasoned about in isolation, everything would
be fine. But they're not independent, because different
overloadings of the same GF can interact, sometimes in
subtle ways, and reasoning about their interactions is
facilitated by being able to see all the relevant rules
together.

Even if the rules don't, in fact, interact, it can be hard
to convince yourself of this without being sure that you
simultaneously know what all the rules are at some point
in time.

--
Greg

From greg.ewing at canterbury.ac.nz  Mon Jul 23 01:59:35 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 23 Jul 2007 11:59:35 +1200
Subject: [Python-3000] str/unicode tests: pyexpat.c and read(n)
In-Reply-To: <ca471dc20707220843i345c0fdcld852fb9f26a97b04@mail.gmail.com>
References: <3f1451f50707202112ye61385fifb4b2307f7fdf536@mail.gmail.com>
	<200707210025.11031.fdrake@acm.org>
	<59C0A7B2-B334-4984-AA8E-CA024B73553B@fuhm.net>
	<46A30E92.5040400@v.loewis.de>
	<ca471dc20707220843i345c0fdcld852fb9f26a97b04@mail.gmail.com>
Message-ID: <46A3EF67.8020003@canterbury.ac.nz>

Guido van Rossum wrote:
> Now I'm confused. Are we proposing that all our XML APIs read and
> write encoded bytes, or are we proposing that they read and write
> Unicode strings, leaving the encoding/decoding to the I/O stream?

The design of XML seems a bit braindamaged here, with the
encoding specification being *inside* the XML itself,
rather than being something specified externally. It's
a bit like a self-opening letter that works by having
a letter opener sealed inside the envelope. You can
open it, but you have to open it first...

If this part of the XML spec is to be taken literally, it
would seem that we're forced to treat XML as bytes and
not text... despite that XML is supposed to be a text
format... aaargh!!!

It might make sense to have an XML parser that took
a unicode string containing the body of an XML message
with the encoding line stripped off.

--
Greg

From talin at acm.org  Mon Jul 23 02:13:47 2007
From: talin at acm.org (Talin)
Date: Sun, 22 Jul 2007 17:13:47 -0700
Subject: [Python-3000] str/unicode tests: pyexpat.c and read(n)
In-Reply-To: <46A3EF67.8020003@canterbury.ac.nz>
References: <3f1451f50707202112ye61385fifb4b2307f7fdf536@mail.gmail.com>	<200707210025.11031.fdrake@acm.org>	<59C0A7B2-B334-4984-AA8E-CA024B73553B@fuhm.net>	<46A30E92.5040400@v.loewis.de>	<ca471dc20707220843i345c0fdcld852fb9f26a97b04@mail.gmail.com>
	<46A3EF67.8020003@canterbury.ac.nz>
Message-ID: <46A3F2BB.7060408@acm.org>

Greg Ewing wrote:
> Guido van Rossum wrote:
>> Now I'm confused. Are we proposing that all our XML APIs read and
>> write encoded bytes, or are we proposing that they read and write
>> Unicode strings, leaving the encoding/decoding to the I/O stream?
> 
> The design of XML seems a bit braindamaged here, with the
> encoding specification being *inside* the XML itself,
> rather than being something specified externally. It's
> a bit like a self-opening letter that works by having
> a letter opener sealed inside the envelope. You can
> open it, but you have to open it first...

All of the popular XML parsers have self-bootstrapping code that handles 
detection of the encoding, including auto-detection when no encoding is 
specified.

So basically - don't worry about it, it's taken care of.

-- Talin

From pje at telecommunity.com  Mon Jul 23 02:48:54 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 22 Jul 2007 20:48:54 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <46A3EC9B.4020507@canterbury.ac.nz>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
	<46A2B2BB.9070305@canterbury.ac.nz>
	<20070722015630.8F34C3A403A@sparrow.telecommunity.com>
	<46A3EC9B.4020507@canterbury.ac.nz>
Message-ID: <20070723004703.C3A903A40A9@sparrow.telecommunity.com>

At 11:47 AM 7/23/2007 +1200, Greg Ewing wrote:
>With a normal method call, you can take an assumed
>run-time type, start at one end and follow things
>through step by step. That's not so easy with
>generic functions, for two reasons: (1) all of the
>arguments can potentially influence where the
>control flow goes, and (2) the overloading code
>can be anywhere in the program, not confined to
>the classes involved.

In order to follow things through with normal method calls, you have 
to know where a class is in the program, implying that you either 
search for it, or have read enough of the program to figure it out.

Which of these two things is different with generic functions?

(Meanwhile, if you are "starting at one end" and "follow things 
through step-by-step", then you are going to step right through all 
the method definitions, regardless of whether they're standard 
methods or GF methods.)


>I'm not saying this makes GFs impossible to use,
>but they do make the programmer's world considerably
>more complicated.

Since they make my world simpler, I'd have to disagree with such a 
blanket statement.  (I imagine the other developers who are using 
them would similarly disagree.)

If your argument is that it might make it more difficult for you to 
know what's going on in a poorly-organized program, or make it easier 
to write a poorly-organized program, I might agree with you.

But I disagree in the general case, because if you're going to be 
grepping for 'foo', it doesn't matter whether it's a method name or a 
generic function name -- you're still going to find all the definitions.


>You can't just brush these concerns
>off as being no worse than what OO already provides.

Actually I can, and just did.  Grep (or whatever global search tool 
your editor provides) is your friend.  It ain't perfect, but it's 
just as much required (and equally imperfect) for global analysis of 
a traditionally-OO program.


From pje at telecommunity.com  Mon Jul 23 03:10:09 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 22 Jul 2007 21:10:09 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <46A3ECB7.9070504@canterbury.ac.nz>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
	<46A2AE31.2080105@canterbury.ac.nz>
	<20070722020422.5AAAC3A403A@sparrow.telecommunity.com>
	<46A3ECB7.9070504@canterbury.ac.nz>
Message-ID: <20070723010750.E27693A40A9@sparrow.telecommunity.com>

At 11:48 AM 7/23/2007 +1200, Greg Ewing wrote:
>Phillip J. Eby wrote:
> > You seem to be saying that the ability to put things in different places
> > encourages disorganization.
>
>No. What I'm saying is that there are conflicting organisational
>requirements here.
>
>If the things being put in different places were independent
>and able to be reasoned about in isolation, everything would
>be fine.

And that is in fact the *normal* case, even in GF use.  You seem to 
be arguing that possible == probable, when it simply ain't so.


>  But they're not independent, because different
>overloadings of the same GF can interact, sometimes in
>subtle ways, and reasoning about their interactions is
>facilitated by being able to see all the relevant rules
>together.

Yeah, and a program *can* be full of monkeypatching and change 
classes' __bases__ at runtime, but most people don't write their code 
that way, most of the time.  The whole point of GF's is that they 
make things *simpler*, because you can usually avoid the sort of 
awkwardness that accompanies trying to do those things *without* 
GF's.  (E.g. adapters, registries, and the like -- which are just as 
hard to analyze statically.)

Consider, too, that merely combining super() with multiple 
inheritance can produce very surprising results in today's 
Python.  You cannot statically predict what method super() is going 
to call by looking at the code of the class that calls it.  (Because 
a subclass can effectively insert bases between the class and its 
explicit bases.)
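
The classic diamond shows this (class names are illustrative only).
Reading Left on its own suggests that its super() call goes to Base,
yet in an instance of a subclass that also inherits from Right, the
same call lands in Right first:

```python
class Base:
    def describe(self):
        return ["Base"]


class Left(Base):
    def describe(self):
        # Looks like it must call Base.describe() ...
        return ["Left"] + super().describe()


class Right(Base):
    def describe(self):
        return ["Right"] + super().describe()


class Both(Left, Right):
    pass


alone = Left().describe()      # ["Left", "Base"]
combined = Both().describe()   # ["Left", "Right", "Base"]

# The method resolution order explains the difference.
order = [cls.__name__ for cls in Both.__mro__]
```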

In other words, if you want to know what's going on in a Python 
program today with regard to today's method combination next_method() 
feature (which we call super()), you already have to grep for *all* 
the method definitions.

And this little bit of extra complexity doesn't even have a method 
combination decorator to call out that subtlety to you; you have to 
look in the method *body*.  Even next_method has to at least be 
listed in the argument list.  :)


>Even if the rules don't, in fact, interact, it can be hard
>to convince yourself of this without being sure that you
>simultaneously know what all the rules are at some point
>in time.

Well, as I said before, you can always run the program and dump out 
the entire list, complete with filenames and line numbers if you're 
so inclined.  That's certainly what I'd do, were I investigating some 
code I was unfamiliar with.  And fancier tools could certainly be 
created, if they were needed.

Python already has each and every one of the things you're 
complaining about, as binary operators depend on multiple argument 
values (and you have to know *both* types in order to work out the 
result), the method being called by super() can't be statically 
predicted any more than next_method() can, and you already have to 
use grep if you're going after global understanding of a large program.
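
For the binary operator point, a small sketch with two hypothetical
unit classes. Which code runs for a + b depends on the types of both
operands: the left operand declines, and the right operand's
reflected method produces the result:

```python
class Metres:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Metres):
            return Metres(self.value + other.value)
        # Decline, so Python tries the right operand's __radd__.
        return NotImplemented


class Feet:
    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        if isinstance(other, Metres):
            return Metres(other.value + self.value * 0.3048)
        return NotImplemented


same = Metres(1) + Metres(2)     # handled by Metres.__add__
mixed = Metres(1) + Feet(10)     # handled by Feet.__radd__
```

Reading Metres alone is not enough to know what Metres(1) + x does.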

If anything, generic functions give you *better* tools to work with, 
as there is no trivial way to fire up a program and say, "show me all 
the classes that have a foo() method."  (You could probably write 
something to find them using object.__subclasses__, though, at least 
for new-style types.)
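
Such a helper might look roughly like the following. This is only a
sketch: the function name is invented, it reports classes that define
the method directly rather than inherit it, and it can only see
classes that have already been imported:

```python
def classes_defining(method_name, root=object):
    """Return subclasses of root (root included) that define method_name."""
    found = []
    seen = set()
    stack = [root]
    while stack:
        cls = stack.pop()
        if cls in seen:
            continue
        seen.add(cls)
        if method_name in vars(cls):
            found.append(cls)
        # type.__subclasses__(cls) also works when cls is a metaclass.
        stack.extend(type.__subclasses__(cls))
    return found


class Root:
    pass


class A(Root):
    def foo(self):
        return "A.foo"


class B(Root):
    pass


class C(B):
    def foo(self):
        return "C.foo"


found = classes_defining("foo", Root)
print(sorted(cls.__name__ for cls in found))
```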


From talin at acm.org  Mon Jul 23 09:07:51 2007
From: talin at acm.org (Talin)
Date: Mon, 23 Jul 2007 00:07:51 -0700
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070723010750.E27693A40A9@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>	<20070713173936.53C213A404D@sparrow.telecommunity.com>	<f7pgki$6o3$1@sea.gmane.org>	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>	<46A19FCC.7070609@acm.org>	<20070721181442.48FB03A403A@sparrow.telecommunity.com>	<46A2AE31.2080105@canterbury.ac.nz>	<20070722020422.5AAAC3A403A@sparrow.telecommunity.com>	<46A3ECB7.9070504@canterbury.ac.nz>
	<20070723010750.E27693A40A9@sparrow.telecommunity.com>
Message-ID: <46A453C7.9070407@acm.org>

Phillip J. Eby wrote:

> If anything, generic functions give you *better* tools to work with, 
> as there is no trivial way to fire up a program and say, "show me all 
> the classes that have a foo() method."  (You could probably write 
> something to find them using object.__subclasses__, though, at least 
> for new-style types.)

I'm glad we're having this conversation - this is the kind of thing I 
want to hear more of. The intention of my posts is not to argue against 
GFs, but to challenge the proponents of GFs to explain themselves better.

However, GFs are relatively non-controversial compared to method 
combinations and some of the other "advanced" stuff. Getting some kind 
of GF support into 3.0 is a near certainty at this point, if I have 
judged the situation rightly. So you need not waste too much ink 
defending them. I would focus more on the stuff that's built on top of GFs.

-- Talin

From ncoghlan at gmail.com  Mon Jul 23 15:09:53 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 23 Jul 2007 23:09:53 +1000
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <46A3EC9B.4020507@canterbury.ac.nz>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>	<20070713173936.53C213A404D@sparrow.telecommunity.com>	<f7pgki$6o3$1@sea.gmane.org>	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>	<46A19FCC.7070609@acm.org>	<20070721181442.48FB03A403A@sparrow.telecommunity.com>	<46A2B2BB.9070305@canterbury.ac.nz>	<20070722015630.8F34C3A403A@sparrow.telecommunity.com>
	<46A3EC9B.4020507@canterbury.ac.nz>
Message-ID: <46A4A8A1.2020705@gmail.com>

Greg Ewing wrote:
> Phillip J. Eby wrote:
>> I don't think that anybody's saying that unrestricted use of dynamism is 
>> good, or that it can't be abused.  However, the potential for abuse is 
>> no different.
> 
> I'm not talking about abuse. I'm only talking about using
> GFs the way they're meant to be used. There's more to
> think about in the presence of GFs even without any
> abuse.

GF's are already used all the time in Python - they're just called magic 
methods.

So I'll assume you're happy with the idea that if you want to analyse 
the expression:

   d[a+b]

statically in current Python, you need to look for __add__ and __radd__ 
methods on both 'a' and 'b' (assuming you know their types), and 
__hash__ and __eq__ methods on whatever type is returned from that 
operation, and then a __getitem__ method on the type of 'd' (again, 
assuming you already know it). In all cases, the methods might not 
actually be on those particular types, but on one of their parent types. 
And if there are any invocations of super() in any of the method 
implementations, then you need to take the MRO into account as well.

Of course, most of the time you wouldn't bother with that level of 
analysis unless you had reason to believe something was going wrong with 
that expression. Otherwise, you would assume that all of the magic 
methods involved were performing as expected.

So what's different if we change that expression to use GF's instead?:

   get_mapping_item(d, binary_add(a, b))

Well, nothing really, except that instead of looking for the magic 
methods referred to above, we are instead looking for all overloads of 
get_mapping_item and binary_add.
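
A minimal sketch of that equivalence, assuming the standard library's
operator.getitem and operator.add as stand-ins for the hypothetical
get_mapping_item and binary_add, and a made-up Meters class:

```python
import operator


class Meters:
    """Hypothetical value type, used only to exercise the magic methods."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        return NotImplemented

    def __radd__(self, other):
        # Reached for "1 + Meters(2)" after int.__add__ returns NotImplemented.
        if isinstance(other, (int, float)):
            return Meters(other + self.value)
        return NotImplemented

    def __eq__(self, other):
        return isinstance(other, Meters) and self.value == other.value

    def __hash__(self):
        return hash(("Meters", self.value))


d = {Meters(3): "three metres"}
a, b = 1, Meters(2)

via_syntax = d[a + b]                                     # magic-method spelling
via_functions = operator.getitem(d, operator.add(a, b))  # function spelling
```

Both spellings go through the same __radd__, __hash__, __eq__ and
__getitem__ lookups; only the surface syntax differs.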

And the big benefit here is that whatever techniques you come up with 
for searching for those overloads will work for *any* GF implemented 
using the same tools, whereas the search for magic methods only works in 
some cases. For example, what would you need to search for to figure out 
what code copy.copy, copy.deepcopy, pickle.dumps or pickle.loads invoke 
for a given type? It's significantly more complicated than just looking 
for single magic methods.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From pje at telecommunity.com  Mon Jul 23 17:32:50 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 23 Jul 2007 11:32:50 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <46A453C7.9070407@acm.org>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
	<46A2AE31.2080105@canterbury.ac.nz>
	<20070722020422.5AAAC3A403A@sparrow.telecommunity.com>
	<46A3ECB7.9070504@canterbury.ac.nz>
	<20070723010750.E27693A40A9@sparrow.telecommunity.com>
	<46A453C7.9070407@acm.org>
Message-ID: <20070723153031.D00273A403D@sparrow.telecommunity.com>

At 12:07 AM 7/23/2007 -0700, Talin wrote:
>Phillip J. Eby wrote:
> > If anything, generic functions give you *better* tools to work with,
> > as there is no trivial way to fire up a program and say, "show me all
> > the classes that have a foo() method."  (You could probably write
> > something to find them using object.__subclasses__, though, at least
> > for new-style types.)
>
>I'm glad we're having this conversation - this is the kind of thing I
>want to hear more of. The intention of my posts is not to argue against
>GFs, but to challenge the proponents of GFs to explain themselves better.
>
>However, GFs are relatively non-controversial compared to method
>combinations and some of the other "advanced" stuff.

Well, as I just pointed out (and Greg has in the past, whether 
meaning to or not), method combination is pretty much isomorphic to 
method overriding and calling super()...  except that it's easier to 
say what you really mean, instead of having to work around the fact 
that there's only one native precedence.

For example, one pattern that sometimes comes up in writing methods 
is that you have a base class that always wants to do something 
*after* the subclass version of the method is called.  To implement 
that without method combination, you have to split the method into 
two parts, one of which gets called by the other, and then tell 
everybody writing subclasses to only override the second method.

With method combination and a generic function, you simply declare an 
@after method for the base type, and it'll get called after the 
normal methods for any subclasses.
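
A rough sketch of the two approaches. The GenericWithAfter class and its
when()/after() decorators are hypothetical names invented for this
illustration; they are not the PEP 3124 or peak.rules API, and the toy
dispatches on the first argument only:

```python
# 1. Without method combination: split the method into two parts.
class Resource:
    def __init__(self):
        self.log = []

    def close(self):               # subclasses are told not to override this
        self._close()
        self.log.append("released")

    def _close(self):              # ...and to override this instead
        self.log.append("closed")


class File(Resource):
    def _close(self):
        self.log.append("flushed and closed")


# 2. With a toy generic function that supports "after" methods.
class GenericWithAfter:
    def __init__(self):
        self.primary = {}          # class -> normal method
        self.after_hooks = {}      # class -> method run afterwards

    def when(self, cls):
        def register(func):
            self.primary[cls] = func
            return func
        return register

    def after(self, cls):
        def register(func):
            self.after_hooks[cls] = func
            return func
        return register

    def __call__(self, obj, *args):
        mro = type(obj).__mro__
        for cls in mro:            # most specific normal method wins
            if cls in self.primary:
                result = self.primary[cls](obj, *args)
                break
        else:
            raise TypeError("no method for %s" % type(obj).__name__)
        for cls in reversed(mro):  # then every applicable "after" hook
            if cls in self.after_hooks:
                self.after_hooks[cls](obj, *args)
        return result


close = GenericWithAfter()
events = []


@close.when(Resource)
def _close_resource(obj):
    events.append("closed")


@close.when(File)
def _close_file(obj):
    events.append("flushed and closed")


@close.after(Resource)
def _release(obj):
    events.append("released")


f = File()
f.close()        # approach 1
close(File())    # approach 2
```

In the second form the base type states "run this afterwards" directly,
and nobody writing a File method has to know about a two-method protocol.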


From pje at telecommunity.com  Mon Jul 23 17:34:27 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 23 Jul 2007 11:34:27 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <46A4A8A1.2020705@gmail.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
	<46A2B2BB.9070305@canterbury.ac.nz>
	<20070722015630.8F34C3A403A@sparrow.telecommunity.com>
	<46A3EC9B.4020507@canterbury.ac.nz> <46A4A8A1.2020705@gmail.com>
Message-ID: <20070723153210.EA5ED3A403D@sparrow.telecommunity.com>

At 11:09 PM 7/23/2007 +1000, Nick Coghlan wrote:
>And the big benefit here is that whatever techniques you come up with
>for searching for those overloads will work for *any* GF implemented
>using the same tools,

By the way, this is one of the reasons why it would be good to have a 
relatively uniform API for generic functions in Python.


From joe at bitworking.org  Mon Jul 23 18:29:41 2007
From: joe at bitworking.org (Joe Gregorio)
Date: Mon, 23 Jul 2007 12:29:41 -0400
Subject: [Python-3000] str/uni - test_pyexpat.py
Message-ID: <3f1451f50707230929q586015ady464d09be3205c4bb@mail.gmail.com>

I've submitted the following patch to fix test_pyexpat.py:

    http://www.python.org/sf/1759016

Part of the fix was to remove the 'returns_unicode' attribute.
Should the updates to the documentation be added to this
patch or submitted as a separate patch?

   Thanks,
   -joe

-- 
Joe Gregorio        http://bitworking.org

From guido at python.org  Mon Jul 23 19:43:45 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 23 Jul 2007 10:43:45 -0700
Subject: [Python-3000] str/uni - test_pyexpat.py
In-Reply-To: <3f1451f50707230929q586015ady464d09be3205c4bb@mail.gmail.com>
References: <3f1451f50707230929q586015ady464d09be3205c4bb@mail.gmail.com>
Message-ID: <ca471dc20707231043k2137d85bp771bf074f943165d@mail.gmail.com>

On 7/23/07, Joe Gregorio <joe at bitworking.org> wrote:
> I've submitted the following patch to fix test_pyexpat.py:
>
>     http://www.python.org/sf/1759016
>
> Part of the fix was to remove the 'returns_unicode' attribute.
> Should the updates to the documentation be added to this
> patch or submitted as a separate patch?

Thanks! I've submitted this as r56512. An all-in-one patch is fine.

Since I'm not an expat expert, could someone else check the code?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg.ewing at canterbury.ac.nz  Tue Jul 24 01:58:10 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 24 Jul 2007 11:58:10 +1200
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070723004703.C3A903A40A9@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
	<46A2B2BB.9070305@canterbury.ac.nz>
	<20070722015630.8F34C3A403A@sparrow.telecommunity.com>
	<46A3EC9B.4020507@canterbury.ac.nz>
	<20070723004703.C3A903A40A9@sparrow.telecommunity.com>
Message-ID: <46A54092.8030606@canterbury.ac.nz>

Phillip J. Eby wrote:
> In order to follow things through with normal method calls, you have to 
> know where a class is in the program, implying that you either search 
> for it, or have read enough of the program to figure it out.
> 
> Which of these two things is different with generic functions?

A class is defined in just one place, or a limited number
of places if it has base classes.

It also provides a convenient mental chunk under which to
group all the operations that it implements. With GFs, there
is no such obvious mental grouping.

> if you're going to be 
> grepping for 'foo', it doesn't matter whether it's a method name or a 
> generic function name -- you're still going to find all the definitions.

No, you're going to find every function whose name is 'foo',
whether it's a method of the particular GF you have in mind
or not. A considerably smarter tool than grep would be needed.

> Since they make my world simpler, 

Are you talking about code that you've written yourself here,
or do you find they make code written by others easier to
understand as well?

I'd have to disagree with such a blanket statement.

 > Grep (or whatever global search tool your
> editor provides) is your friend.  It ain't perfect, but it's just as 
> much required (and equally imperfect) for global analysis of a 
> traditionally-OO program.

Most of the time I find that I don't need to perform global
analysis of a traditionally-OO paradigm. The conceptual
encapsulation provided by classes makes that unnecessary.
GF breaks that encapsulation, or at least to my mind it
seems to, and that makes me uncomfortable.

--
Greg


From pje at telecommunity.com  Tue Jul 24 02:51:09 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 23 Jul 2007 20:51:09 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <46A54092.8030606@canterbury.ac.nz>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
	<46A2B2BB.9070305@canterbury.ac.nz>
	<20070722015630.8F34C3A403A@sparrow.telecommunity.com>
	<46A3EC9B.4020507@canterbury.ac.nz>
	<20070723004703.C3A903A40A9@sparrow.telecommunity.com>
	<46A54092.8030606@canterbury.ac.nz>
Message-ID: <20070724004850.2F2343A403D@sparrow.telecommunity.com>

At 11:58 AM 7/24/2007 +1200, Greg Ewing wrote:
>Phillip J. Eby wrote:
> > In order to follow things through with normal method calls, you have to
> > know where a class is in the program, implying that you either search
> > for it, or have read enough of the program to figure it out.
> >
> > Which of these two things is different with generic functions?
>
>A class is defined in just one place, or a limited number
>of places if it has base classes.

...and may be subclassed in an unlimited number of places.

A generic function is defined in just one place, with a limited 
number of "generic" methods typically adjoining it, and may be 
extended in an unlimited number of places.

Where's the difference?


>It also provides a convenient mental chunk under which to
>group all the operations that it implements. With GFs, there
>is no such obvious mental grouping.

The function itself is the grouping, in the same way that Python's 
operator.* functions are, or its built-in generics like len() and 
iter().  len() encapsulates the concept of "sequence", just as iter() 
encapsulates "iterable", and operator.add encapsulates "addition".

These are conceptual categories that can't be defined by classes, 
except by conventions like ABCs -- and ISTR the ABCs PEP ran into 
trouble dealing with n-ary operators where n>1.


> > if you're going to be
> > grepping for 'foo', it doesn't matter whether it's a method name or a
> > generic function name -- you're still going to find all the definitions.
>
>No, you're going to find every function whose name is 'foo',
>whether it's a method of the particular GF you have in mind
>or not.

And this doesn't apply to normal methods?  Come on.  This is far 
*more* likely to be a problem with normal methods than it is with 
generic functions.  For one thing, you can isolate your search to 
modules that import the function being overridden -- something you 
can't do with normal methods.


> > Since they make my world simpler,
>
>Are you talking about code that you've written yourself here,
>or do you find they make code written by others easier to
>understand as well?

Yes, I find code written using generics to be generally easier to 
understand, because it's possible to grasp a generic operator without 
needing to understand all the classes it can be applied to.

For example, the generic function operator.add in Python defines the 
concept of addition, without me needing to understand all possible 
types that might be added together.

And since all non-trivial Python code already uses generic functions, 
I find that they do in fact make all Python code simpler to 
understand.  Indeed, they're a significant contributor to Python's 
ease-of-use.  PEP 3124 seeks to expand that ease by allowing people 
to easily add their own generic functions, without needing to use 
workarounds like interfaces and adapters.


>I'd have to disagree with such a blanket statement.

The thing that you seem to keep missing in your analysis is that 
Python already *has* generic functions in the language specification, 
and has had them for what, 10, 15 years?  If any of these problems 
you're talking about actually existed, I think we'd already know about them.

Or are you arguing that functions like len() and iter() make programs 
harder to understand in all the same ways that you're saying that 
adding a standard GF library will?


>  > Grep (or whatever global search tool your
> > editor provides) is your friend.  It ain't perfect, but it's just as
> > much required (and equally imperfect) for global analysis of a
> > traditionally-OO program.
>
>Most of the time I find that I don't need to perform global
>analysis of a traditionally-OO paradigm. The conceptual
>encapsulation provided by classes makes that unnecessary.
>GF breaks that encapsulation, or at least to my mind it
>seems to, and that makes me uncomfortable.

That's because you're ignoring the GFs (and operators implemented as 
GF's) that you use all day long in even the most trivial of Python 
programs, let alone ones that use pickle or copy or pprint.  Even 
computing a sum such as 2+2 involves a generic function in Python!

All PEP 3124 proposes to do is have a standard API for 
programmatically adding methods to generic functions, irrespective of 
how those functions are internally implemented.  Its decorators are 
to generic functions what 'setattr()' is to objects: i.e., a generic 
function for manipulating their contents.

It doesn't really "add generic functions to Python", because Python 
already had them.
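
As an illustration of what a uniform registration API buys, here is a
small sketch using functools.singledispatch. That decorator was added to
the standard library later (Python 3.4) and only dispatches on the first
argument, so it is a stand-in for the idea, not the API PEP 3124
describes. The describe function and its overloads are made up:

```python
from functools import singledispatch


@singledispatch
def describe(obj):
    return "some object"


@describe.register(int)
def _describe_int(obj):
    return "an integer"


@describe.register(list)
def _describe_list(obj):
    return "a list of %d items" % len(obj)


# Because registration goes through one API, the running program can be
# asked which overloads exist, for any function built with the same tool.
overloads = {cls.__name__: impl.__name__
             for cls, impl in describe.registry.items()}

# dispatch() reports which implementation a given type would use.
chosen_for_bool = describe.dispatch(bool).__name__
```

The same two lines of introspection work for every function defined this
way, which is the kind of uniformity argued for in this thread.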


From greg.ewing at canterbury.ac.nz  Tue Jul 24 02:54:38 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 24 Jul 2007 12:54:38 +1200
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070723010750.E27693A40A9@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
	<46A2AE31.2080105@canterbury.ac.nz>
	<20070722020422.5AAAC3A403A@sparrow.telecommunity.com>
	<46A3ECB7.9070504@canterbury.ac.nz>
	<20070723010750.E27693A40A9@sparrow.telecommunity.com>
Message-ID: <46A54DCE.8050205@canterbury.ac.nz>

Phillip J. Eby wrote:
> And that is in fact the *normal* case, even in GF use.  You seem to be 
> arguing that possible == probable, when it simply ain't so.

No, I'm saying that it's hard to convince myself that I'm
not going to fall into one of the possible traps, even if
it's an improbable one.

When adding an overload to a GF, what methodology can I
follow to ensure that my overload doesn't interact in an
unfortunate way with another one somewhere else, perhaps
one not written by me? If the only answer to that is
"grep the entire program for things that might be other
overloadings of this GF", that doesn't do much to allay
my misgivings.

> Yeah, and a program *can* be full of monkeypatching and change classes' 
> __bases__ at runtime, but most people don't write their code that way, 
> most of the time.

The difference is that we're talking about a system
specifically *designed* for carrying out monkeypatching.
I don't care what you call it, it still looks like
monkeypatching to me. The fundamental reason that
we think monkeypatching is a bad idea is still there --
something done by one part of the program can affect
the behaviour of another part with no obvious connection.

> The whole point of GF's is that they make things 
> *simpler*,  because you can usually avoid the sort of awkwardness that
> accompanies trying to do those things *without* GF's.  (E.g. adapters, 
> registries, and the like -- which are just as hard to analyze statically.)

Yes, but as far as I can see, GFs don't make these things
much *easier* to analyse statically. Registries are awkward
because of that difficulty, not because they're hard
to implement.

 > Consider, too, that merely combining super() with multiple inheritance
> can produce very surprising results in today's Python.

Yes, which is largely why I've personally never used super(),
and regard it as a misfeature. I wouldn't mind if it went
away completely.

> Well, as I said before, you can always run the program and dump out the 
> entire list, complete with filenames and line numbers if you're so 
> inclined.

Even once I've got such a list, I've then got to examine it
carefully and try to nut out the implications of all the
type relationships, before/after/around/discount/etc method
combinations, and whathaveyou.

Yes, I know you already get some of this with multiple
inheritance -- which is why I use it very rarely and very
carefully. Also the complexities tend to be confined to the
class doing the multiple inheriting and only need to be
considered by the author of that class, not everyone who
uses it.

And what if the program doesn't exist yet, because I'm
still thinking about how to write it? Or it exists but
isn't yet in a state where it can be run successfully?

 > binary operators depend on multiple argument values (and you
> have to know *both* types in order to work out the result)

Yes, that can be a bit more complex, but at least the method
that gets called has to belong to one class or the other.
Also it's easier to follow nowadays with the auto-coercion
system being phased out -- the left operand gets first say,
and if it doesn't care, the right operand gets its say.

--
Greg

From pje at telecommunity.com  Tue Jul 24 03:39:17 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 23 Jul 2007 21:39:17 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <46A54DCE.8050205@canterbury.ac.nz>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
	<46A2AE31.2080105@canterbury.ac.nz>
	<20070722020422.5AAAC3A403A@sparrow.telecommunity.com>
	<46A3ECB7.9070504@canterbury.ac.nz>
	<20070723010750.E27693A40A9@sparrow.telecommunity.com>
	<46A54DCE.8050205@canterbury.ac.nz>
Message-ID: <20070724013722.B2F5B3A403D@sparrow.telecommunity.com>

At 12:54 PM 7/24/2007 +1200, Greg Ewing wrote:
>Phillip J. Eby wrote:
> > And that is in fact the *normal* case, even in GF use.  You seem to be
> > arguing that possible == probable, when it simply ain't so.
>
>No, I'm saying that it's hard to convince myself that I'm
>not going to fall into one of the possible traps, even if
>it's an improbable one.
>
>When adding an overload to a GF, what methodology can I
>follow to ensure that my overload doesn't interact in an
>unfortunate way with another one somewhere else, perhaps
>one not written by me?

What methodology can you follow that ensures that same thing when 
overriding a method in a subclass?


> > Yeah, and a program *can* be full of monkeypatching and change classes'
> > __bases__ at runtime, but most people don't write their code that way,
> > most of the time.
>
>The difference is that we're talking about a system
>specifically *designed* for carrying out monkeypatching.
>I don't care what you call it, it still looks like
>monkeypatching to me.

You're not looking very hard, then.  Is this excerpt from 
peak.rules.core monkeypatching?

def implies(s1,s2):
     """Is s2 always true if s1 is true?"""
     return s1==s2

from types import ClassType

when(implies, (type,      type)     )(issubclass)
when(implies, (ClassType, ClassType))(issubclass)
when(implies, (type,      ClassType))(issubclass)

when(implies, (bool, bool))(lambda c1, c2: c2 or not c1)
when(implies, (bool, object))(lambda c1, c2: not c1)
when(implies, (object, bool))(lambda c1, c2: c2)

To me, this looks like a straightforward explanation of the 
implication rules between new-style and classic classes and boolean 
values.  In fact, it seems much more straightforward to me, than 
writing out a big if-then tree whose *intent* I would have to discern 
from comments or the structure of the tree itself.

And if I had to discern the intent from the structure of the if tree, 
I would have no way of knowing whether the if's as written were in 
fact *correct*.  I could mistake a bug for the author's intention in that case.

This is just one of the ways in which generic functions can be a 
superior tool for code understanding -- even in the complete absence 
of anything that can be described as "monkeypatching".
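
For readers without peak.rules at hand, the rules above can be
approximated in Python 3 with a toy dispatcher. The Generic class below
is a hypothetical sketch written for this illustration, not the
peak.rules implementation, and the ClassType rules are left out because
classic classes no longer exist in Python 3:

```python
class Generic:
    """Toy multiple-dispatch function (not the peak.rules API)."""

    def __init__(self, default):
        self.default = default
        self.rules = []            # (signature, implementation) pairs

    def when(self, *signature):
        def register(func):
            self.rules.append((signature, func))
            return func
        return register

    def __call__(self, *args):
        matches = [
            (sig, func) for sig, func in self.rules
            if len(sig) == len(args)
            and all(isinstance(a, t) for a, t in zip(args, sig))
        ]
        if not matches:
            return self.default(*args)
        # Pick the rule whose signature is at least as specific as every
        # other matching rule, position by position.
        for sig, func in matches:
            if all(all(issubclass(s, o) for s, o in zip(sig, other))
                   for other, _ in matches):
                return func(*args)
        raise TypeError("ambiguous overloads for %r" % (args,))


@Generic
def implies(s1, s2):
    """Is s2 always true if s1 is true?"""
    return s1 == s2


implies.when(type, type)(issubclass)
implies.when(bool, bool)(lambda c1, c2: c2 or not c1)
implies.when(bool, object)(lambda c1, c2: not c1)
implies.when(object, bool)(lambda c1, c2: c2)
```

Each rule can be read and checked on its own, and the order in which the
rules are registered does not affect which one is chosen.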

In truth, every interface or abstract base class is just another way 
of specifying a generic function.  When you say that objects 
implementing a certain interface or protocol must have a 'foo' 
method, then any subclass may add a new *actual* implementation of 
'foo' -- which is no different from adding a method to a generic 
function for a new type.


>  The fundamental reason that
>we think monkeypatching is a bad idea is still there --
>something done by one part of the program can affect
>the behaviour of another part with no obvious connection.

It's FUD to try to associate monkeypatching with GF's.  Generic 
functions have *none* of the bad effects of monkeypatching.

Monkeypatching is bad because:

1. It's hard to see

2. Can't be safely composed (i.e. multiple monkeypatches) without 
introducing dependency order at best and bugs at worst

GF method additions are highly visible, and are safely composable, 
since more-specific methods override each other, and only truly 
independent methods can "float" as to execution order.
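
A small sketch of the composability claim, again using the later
functools.singledispatch as a stand-in (single dispatch only; the Animal
and Dog classes and the speak function are made up). The overloads are
registered in both possible orders and the result is compared:

```python
from functools import singledispatch


class Animal:
    pass


class Dog(Animal):
    pass


def build(registration_order):
    """Create a fresh generic function, registering overloads in the given order."""

    @singledispatch
    def speak(obj):
        return "..."

    implementations = {
        Animal: lambda obj: "generic animal noise",
        Dog: lambda obj: "woof",
    }
    for cls in registration_order:
        speak.register(cls, implementations[cls])
    return speak


base_first = build([Animal, Dog])
derived_first = build([Dog, Animal])

results = (base_first(Dog()), derived_first(Dog()),
           base_first(Animal()), derived_first(Animal()))
```

Which method runs is decided by the type relationship, not by which
registration happened last, so independent modules can each add methods
without coordinating import order.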


> > The whole point of GF's is that they make things
> > *simpler*,  because you can usually avoid the sort of awkwardness that
> > accompanies trying to do those things *without* GF's.  (E.g. adapters,
> > registries, and the like -- which are just as hard to analyze statically.)
>
>Yes, but as far as I can see, GFs don't make these things
>much *easier* to analyse statically. Registries are awkward
>because of that difficulty, not because they're hard
>to implement.
>...
>Yes, which is largely why I've personally never used super(),
>and regard it as a misfeature. I wouldn't mind if it went
>away completely.
>...
>Even once I've got such a list, I've then got to examine it
>carefully and try to nut out the implications of all the
>type relationships, before/after/around/discount/etc method
>combinations, and whathaveyou.
>...
>Yes, I know you already get some of this with multiple
>inheritance -- which is why I use it very rarely and very
>carefully. Also the complexities tend to be confined to the
>class doing the multiple inheriting and only need to be
>considered by the author of that class, not everyone who
>uses it.

Okay, well I guess the above statements all put you squarely in the 
"OO is too scary" category, so I'm not sure there's much else I can 
say that'd be useful.

Keep in mind, however, that without a *standard* way of doing GF's, 
you will have to figure out *each* library or program's ad-hoc 
workarounds, instead of simply getting to know One Obvious Way of doing it.


>And what if the program doesn't exist yet, because I'm
>still thinking about how to write it? Or it exists but
>isn't yet in a state where it can be run successfully?

I don't understand what you're asking, here.


>  > binary operators depend on multiple argument values (and you
> > have to know *both* types in order to work out the result)
>
>Yes, that can be a bit more complex, but at least the method
>that gets called has to belong to one class or the other.
>Also it's easier to follow nowadays with the auto-coercion
>system being phased out -- the left operand gets first say,
>and if it doesn't care, the right operand gets its say.

Oh really?  Are you sure about that?  I was under the impression that 
under certain circumstances, if one object is "more specific" than 
the other (i.e., one is an instance of a subclass of the other's 
type), then that one gets first say.
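
A quick check of that rule in current Python 3, using two made-up
classes. When the right operand is an instance of a subclass of the left
operand's type and overrides the reflected method, the right operand's
__radd__ is tried before the left operand's __add__:

```python
class Base:
    def __add__(self, other):
        return "Base.__add__"

    def __radd__(self, other):
        return "Base.__radd__"


class Derived(Base):
    def __radd__(self, other):      # overrides the reflected method
        return "Derived.__radd__"


same_type = Base() + Base()         # left operand goes first
subclass_on_right = Base() + Derived()   # subclass on the right goes first
subclass_on_left = Derived() + Base()    # inherited __add__ on the left
```

So the "left operand gets first say" summary holds only when the right
operand is not a more specific type with its own reflected method.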


From guido at python.org  Tue Jul 24 04:57:37 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 23 Jul 2007 19:57:37 -0700
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070724004850.2F2343A403D@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
	<46A2B2BB.9070305@canterbury.ac.nz>
	<20070722015630.8F34C3A403A@sparrow.telecommunity.com>
	<46A3EC9B.4020507@canterbury.ac.nz>
	<20070723004703.C3A903A40A9@sparrow.telecommunity.com>
	<46A54092.8030606@canterbury.ac.nz>
	<20070724004850.2F2343A403D@sparrow.telecommunity.com>
Message-ID: <ca471dc20707231957n2e58258v7b86b904803890dd@mail.gmail.com>

On 7/23/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 11:58 AM 7/24/2007 +1200, Greg Ewing wrote:
> >A class is defined in just one place, or a limited number
> >of places if it has base classes.
>
> ...and may be subclassed in an unlimited number of places.
>
> A generic function is defined in just one place, with a limited
> number of "generic" methods typically adjoining it, and may be
> extended in an unlimited number of places.
>
> Where's the difference?

Phillip, you seem to be dead set on providing a mathematical proof
that the two are equivalent. Unfortunately, my gut tells me otherwise,
and it doesn't want to listen to mathematical proofs. It's like proofs
of God's (non-)existence. They don't work unless you're already in
agreement with the outcome.

Fact is, many people, including me, are uncomfortable with the idea
that a GF can be overridden *anywhere*. I am not letting that get in
the way of acknowledging the value of GFs, but I don't think it's
worth trying to take this fear away by attempting to prove that it is
irrational. Irrationality, as the name implies, is not susceptible to
rational argument.

I could come up with several reasons why it's not the same at all, but
I'm not going to bother, because it'll just encourage you to deny it
even harder. I think the argument (from both sides) is irrelevant;
you're wasting your valuable time and energy that would be much better
directed towards updating the PEP and writing an implementation.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Tue Jul 24 05:42:23 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 23 Jul 2007 23:42:23 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <ca471dc20707231957n2e58258v7b86b904803890dd@mail.gmail.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
	<46A2B2BB.9070305@canterbury.ac.nz>
	<20070722015630.8F34C3A403A@sparrow.telecommunity.com>
	<46A3EC9B.4020507@canterbury.ac.nz>
	<20070723004703.C3A903A40A9@sparrow.telecommunity.com>
	<46A54092.8030606@canterbury.ac.nz>
	<20070724004850.2F2343A403D@sparrow.telecommunity.com>
	<ca471dc20707231957n2e58258v7b86b904803890dd@mail.gmail.com>
Message-ID: <20070724034006.9B23E3A403D@sparrow.telecommunity.com>

At 07:57 PM 7/23/2007 -0700, Guido van Rossum wrote:
>On 7/23/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> > At 11:58 AM 7/24/2007 +1200, Greg Ewing wrote:
> > >A class is defined in just one place, or a limited number
> > >of places if it has base classes.
> >
> > ...and may be subclassed in an unlimited number of places.
> >
> > A generic function is defined in just one place, with a limited
> > number of "generic" methods typically adjoining it, and may be
> > extended in an unlimited number of places.
> >
> > Where's the difference?
>
>Phillip, you seem to be dead set on providing a mathematical proof
>that the two are equivalent.

Actually, I don't consider them equivalent; I consider each to have 
its own benefits and drawbacks.  For example, GF declarations are 
more verbose than traditional methods, both at definition and call time.

I just don't see how the things Greg is describing aren't equally 
applicable to traditional methods.


>I could come up with several reasons why it's not the same at all,

I'm genuinely curious as to what those are.  If you have the chance 
to send them to me privately, I'll use them only to improve the PEP 
-- and I won't reply here or privately.  :)


From talin at acm.org  Tue Jul 24 06:44:00 2007
From: talin at acm.org (Talin)
Date: Mon, 23 Jul 2007 21:44:00 -0700
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070724034006.9B23E3A403D@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>	<46A19FCC.7070609@acm.org>	<20070721181442.48FB03A403A@sparrow.telecommunity.com>	<46A2B2BB.9070305@canterbury.ac.nz>	<20070722015630.8F34C3A403A@sparrow.telecommunity.com>	<46A3EC9B.4020507@canterbury.ac.nz>	<20070723004703.C3A903A40A9@sparrow.telecommunity.com>	<46A54092.8030606@canterbury.ac.nz>	<20070724004850.2F2343A403D@sparrow.telecommunity.com>	<ca471dc20707231957n2e58258v7b86b904803890dd@mail.gmail.com>
	<20070724034006.9B23E3A403D@sparrow.telecommunity.com>
Message-ID: <46A58390.8050408@acm.org>

Phillip J. Eby wrote:

> I just don't see how the things Greg is describing aren't equally 
> applicable to traditional methods.

I wasn't going to get into this, but - since you asked :)

The short form of the argument is that being able to overload any 
function as a generic function retroactively changes the implicit 
contract of what that function is.

I agree with you that the problem of tracing down all of the places 
where a GF could dispatch to is analogous to tracing down all the places 
where a subclass could override a method.

I would argue, though, that the "subclass analogy" that you have raised 
(which is a good one) corresponds most closely to the "explicit 
overload" GF design. In other words, when I create a base class, I know 
at the time I am writing it that because it is a class, its methods may 
be overloaded by someone later, and this knowledge is something that I 
factor in to the design of the class as I am writing it.

(This foreknowledge is even more relevant in languages like C++ and Java 
where you can explicitly control on a per-method basis whether it is 
overridable or not. Regardless of what you think of these languages, I 
think we can all agree that programmers depend on the ability of the 
'virtual' or 'final' keywords to control what subclass writers are able 
to do.)

So I would say that writing a subclass is exactly like explicitly 
declaring a generic function: At the time I write the function, I know 
that people may come along later and overload that function, and I 
factor that knowledge into the design of the function as I am writing.

By extension, I claim that your analogy breaks down when we start 
talking about adding overloads to a function that was not originally 
declared as generic. The reason is because in this case, the original 
author of the function did not expect that someone would be able to come 
along and overload it later.

The ability to overload has always been part of the implicit contract of 
creating a class. It has never been part of the implicit contract of 
writing a function or method. So essentially, you are going back to all 
the functions that have ever been written and changing that implicit 
contract retroactively.
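
To make the distinction concrete, here is a minimal sketch of the 
"explicit overload" design. It uses functools.singledispatch from later 
Python releases as a stand-in, not the PEP 3124 API, and the names 
describe and plain are hypothetical.

```python
from functools import singledispatch


@singledispatch
def describe(obj):
    """Explicitly declared generic: the author opted in to overloading."""
    return "object"


@describe.register(int)
def _describe_int(obj):
    # An overload added later, possibly in another module.
    return "int"


def plain(obj):
    """An ordinary function: nothing here invites later overloads."""
    return "plain"
```

In this sketch only describe carries a register hook, so a reader can 
tell from the definition alone which functions may have overloads 
elsewhere.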

(I'm not claiming that this can never be done, I'm explaining why you 
are getting this reaction from Greg and Guido.)

In the case of __magic__ overloads, they too are explicitly declared. 
Only in this case, the explicit declaration is either in the wrapper 
function (such as len(x)), or in some cases the 'declaration' is hidden 
inside the Python interpreter, but everyone knows about it (an example 
being __init__). More broadly, everyone knows in advance that a method 
having a name of the __magic__ form is intended to be a specialization 
of a general pattern.

Now, it's not that hard, for a given function, to use grep to trace down 
the possible GFs that may be overloading that specific function.

But that's only if you have foreknowledge of which functions are 
overloaded and which aren't. There are thousands of functions in a 
typical program (well, more accurately there are thousands of *methods*, 
and relatively few global functions). Suppose that 5% of them are 
overloaded, but you have no idea which 5% of them are. Trying to search 
for each of them to see what overloads there are is an N^2 problem, and 
very different, I would claim, than the situation with subclassing.

(Although admittedly, this problem is really only acute when we talk 
about non-instance-method functions, since the implicit constraints on 
the 'self' parameter already limit the search space for possible 
overloads of instance methods. Although with adaption and bound methods, 
anything can act like an instance method, so I would guess all bets are 
off...)

Now, it may be interesting to compare the implicit overloading with C++ 
overloaded methods. C++ also allows any function to be overloaded without 
explicitly declaring "overloadability", although the overload resolution 
happens in the compiler rather than in the runtime.

But note, however, that this overloading is also carefully hemmed in, 
because only overloads that are actually in scope at the time of the 
call will actually take effect. So again, the search space for finding 
overloads is less than global, and you only need look in header files 
and scopes that are visible to the calling site, which will typically be 
a small fraction of the total source code for an application.

So I hope that explains why overloading regular functions is perceived 
by some people to be of a different order than overloading class methods.

-- Talin

From ncoghlan at gmail.com  Tue Jul 24 14:25:57 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 24 Jul 2007 22:25:57 +1000
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070724013722.B2F5B3A403D@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>	<20070713173936.53C213A404D@sparrow.telecommunity.com>	<f7pgki$6o3$1@sea.gmane.org>	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>	<46A19FCC.7070609@acm.org>	<20070721181442.48FB03A403A@sparrow.telecommunity.com>	<46A2AE31.2080105@canterbury.ac.nz>	<20070722020422.5AAAC3A403A@sparrow.telecommunity.com>	<46A3ECB7.9070504@canterbury.ac.nz>	<20070723010750.E27693A40A9@sparrow.telecommunity.com>	<46A54DCE.8050205@canterbury.ac.nz>
	<20070724013722.B2F5B3A403D@sparrow.telecommunity.com>
Message-ID: <46A5EFD5.80008@gmail.com>

Phillip J. Eby wrote:
> At 12:54 PM 7/24/2007 +1200, Greg Ewing wrote:
>>> binary operators depend on multiple argument values (and you
>>> have to know *both* types in order to work out the result)
>> Yes, that can be a bit more complex, but at least the method
>> that gets called has to belong to one class or the other.
>> Also it's easier to follow nowadays with the auto-coercion
>> system being phased out -- the left operand gets first say,
>> and if it doesn't care, the right operand gets its say.
> 
> Oh really?  Are you sure about that?  I was under the impression that 
> under certain circumstances, if one object is "more specific" than 
> the other (i.e., one is an instance of a subclass of the other's 
> type), then that one gets first say.

Yep, and that feature stays even with __coerce__ going away. Otherwise 
subclasses would have a hell of a time getting their __r*__ methods to 
be invoked instead of the base classes.
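
A small sketch of that rule, with hypothetical classes Base and Derived: 
when the right operand is an instance of a subclass of the left 
operand's type and overrides the reflected method, its __radd__ is 
tried first.

```python
class Base:
    def __add__(self, other):
        return "Base.__add__"

    def __radd__(self, other):
        return "Base.__radd__"


class Derived(Base):
    # Overriding the reflected method is what gives the subclass first say.
    def __radd__(self, other):
        return "Derived.__radd__"


subclass_on_right = Base() + Derived()   # Derived.__radd__ is tried first
same_type = Base() + Base()              # ordinary left-operand dispatch
subclass_on_left = Derived() + Base()    # inherited __add__ handles it
```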

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From alexandre at peadrop.com  Tue Jul 24 23:03:39 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Tue, 24 Jul 2007 17:03:39 -0400
Subject: [Python-3000] _heapq.c, etc. (was Re: Heaptypes)
In-Reply-To: <ca471dc20707200744r4a8efc1an444d7f4f894ff23a@mail.gmail.com>
References: <ca471dc20707191525m5161b04x828e60efd17f6ffb@mail.gmail.com>
	<ca471dc20707191658s14d86b52x24b3a12524d9a97b@mail.gmail.com>
	<20070720010804.85A7.JCARLSON@uci.edu>
	<ca471dc20707200744r4a8efc1an444d7f4f894ff23a@mail.gmail.com>
Message-ID: <acd65fa20707241403k132e0f8el4b76407afe3c8ef1@mail.gmail.com>

On 7/20/07, Guido van Rossum <guido at python.org> wrote:
> I definitely *don't* want to continue the old habit of having a slow
> and a fast module with different names; the experience with especially
> cPickle and cStringIO is that everyone believes their code is
> performance critical and hence uses the C version if it exists,
> thereby repeating the same idiom over and over.

Actually, I have been surprised myself that the C version of StringIO
isn't always faster than the Python one. I have a testcase where using
StringIO, instead of cStringIO, is ~20% faster.

-- Alexandre

From alexandre at peadrop.com  Tue Jul 24 23:11:22 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Tue, 24 Jul 2007 17:11:22 -0400
Subject: [Python-3000] _heapq.c, etc. (was Re: Heaptypes)
In-Reply-To: <46A1DA0C.5010107@canterbury.ac.nz>
References: <ca471dc20707191525m5161b04x828e60efd17f6ffb@mail.gmail.com>
	<ca471dc20707191658s14d86b52x24b3a12524d9a97b@mail.gmail.com>
	<20070720010804.85A7.JCARLSON@uci.edu>
	<46A16906.7010005@canterbury.ac.nz> <46A1D3FF.4020000@v.loewis.de>
	<46A1DA0C.5010107@canterbury.ac.nz>
Message-ID: <acd65fa20707241411p50a68a4ayd803ca63b15f6c84@mail.gmail.com>

I am not sure if an official naming scheme is really necessary. For
StringIO and BytesIO, I simply added a leading underscore to the Python
implementations and rename them if the C implementations aren't
available. So, the Python versions remain available for testing, or if
someone needs them.

-- Alexandre

On 7/21/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Martin v. Löwis wrote:
> > You mean, like prefixing it with c, e.g. StringIO vs. cStringIO,
> > pickle vs. cPickle?
>
> Yes, but with an official scheme for deriving the names
> from the main package name, and also an understanding
> that these are implementation details to be used only
> when really necessary (hence the leading underscores).
>
> Considering Guido's comment about people gratuitously
> using the C versions, perhaps only the Python version
> should be made available as an official alternative.
> It's unlikely that people will gratuitously choose what
> they perceive to be a *slower* version of the module. :-)

From pje at telecommunity.com  Tue Jul 24 23:56:28 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 24 Jul 2007 17:56:28 -0400
Subject: [Python-3000] New section for PEP 3124
Message-ID: <20070724220011.4F63D3A40B2@sparrow.telecommunity.com>

Taking the recent threads here, and Guido's comments off-list, I've 
attempted to put together a coherent response as a new section for 
the PEP, which I've checked in and included a copy of here.  If I 
have misrepresented anyone's argument, or if you spot something where 
you have a question or need a clarification, please let me know.  Thanks.


Overloading Usage Patterns
==========================

In discussion on the Python-3000 list, the proposed feature of allowing
arbitrary functions to be overloaded has been somewhat controversial,
with some people expressing concern that this would make programs more
difficult to understand.

The general thrust of this argument is that one cannot rely on what a
function does, if it can be changed from anywhere in the program at any
time.  Even though in principle this can already happen through
monkeypatching or code substitution, it is considered poor practice to
do so.

However, providing support for overloading any function (or so the
argument goes), is implicitly blessing such changes as being an
acceptable practice.

This argument appears to make sense in theory, but it is almost entirely
mooted in practice for two reasons.

First, people are generally not perverse, defining a function to do one
thing in one place, and then summarily defining it to do the opposite
somewhere else!  The principal reasons to extend the behavior of a
function that has *not* been specifically made generic are to:

* Add special cases not contemplated by the original function's author,
   such as support for additional types.

* Be notified of an action in order to cause some related operation to
   be performed, either before the original operation is performed,
   after it, or both.  This can include general-purpose operations like
   adding logging, timing, or tracing, as well as application-specific
   behavior.

None of these reasons for adding overloads imply any change to the
intended default or overall behavior of the existing function, however.
Just as a base class method may be overridden by a subclass for these
same two reasons, so too may a function be overloaded to provide for
such enhancements.
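
As a rough illustration of the second use case, the sketch below runs 
hooks before and after an operation without changing what the operation 
itself does. It is a hand-rolled stand-in, not the API proposed in this 
PEP: the hookable decorator and the save function are hypothetical, and 
unlike universal overloading this version requires the function's 
author to opt in.

```python
import functools


def hookable(func):
    """Wrap func so callers can attach before/after observers."""
    before, after = [], []

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        for hook in before:
            hook(*args, **kwargs)
        result = func(*args, **kwargs)
        for hook in after:
            hook(result, *args, **kwargs)
        return result

    wrapper.before = before.append
    wrapper.after = after.append
    return wrapper


log = []


@hookable
def save(record):
    log.append("save %s" % record)
    return True


# Observers are notified; the default behavior of save() is untouched.
save.before(lambda record: log.append("about to save %s" % record))
save.after(lambda result, record: log.append("saved ok=%s" % result))
save("r1")
```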

In other words, universal overloading does not equal *arbitrary*
overloading, in the sense that we need not expect people to randomly
redefine the behavior of existing functions in illogical or
unpredictable ways.  If they did so, it would be no less of a bad
practice than any other way of writing illogical or unpredictable code!

However, to distinguish bad practice from good, it is perhaps necessary
to clarify further what good practice for defining overloads *is*.  And
that brings us to the second reason why generic functions do not
necessarily make programs harder to understand: overloading patterns in
actual programs tend to follow very predictable patterns.  (Both in
Python and in languages that have no *non*-generic functions.)

If a module is defining a new generic operation, it will usually also
define any required overloads for existing types in the same place.
Likewise, if a module is defining a new type, then it will usually
define overloads there for any generic functions that it knows or cares
about.

As a result, the vast majority of overloads can be found adjacent to
either the function being overloaded, or to a newly-defined type for
which the overload is adding support.  Thus, overloads are highly-
discoverable in the common case, as you are either looking at the
function or the type, or both.
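
A sketch of the two common placements, using functools.singledispatch 
from later Python releases as a stand-in for the PEP's own API (to_text 
and Money are hypothetical names):

```python
from functools import singledispatch


# Placement 1: the module defining the generic operation also defines
# the overloads for existing types, right next to it.
@singledispatch
def to_text(obj):
    return repr(obj)


@to_text.register(list)
def _list_to_text(obj):
    return ", ".join(to_text(item) for item in obj)


# Placement 2: a module defining a new type registers its overload
# next to the type definition.
class Money:
    def __init__(self, cents):
        self.cents = cents


@to_text.register(Money)
def _money_to_text(obj):
    return "$%d.%02d" % divmod(obj.cents, 100)


rendered = to_text([1, Money(250)])
```

With this stand-in, the registered types can also be listed at run time 
through to_text.registry.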

It is only in rather infrequent cases that one will have overloads in a
module that contains neither the function nor the type(s) for which the
overload is added.  This would be the case if, say, a third-party
created a bridge of support between one library's types and another
library's generic function(s).  In such a case, however, best practice
suggests prominently advertising this, especially by way of the module
name.

For example, PyProtocols defines such bridge support for working with
Zope interfaces and legacy Twisted interfaces, using modules called
``protocols.twisted_support`` and ``protocols.zope_support``.  (These
bridges are done with interface adapters, rather than generic functions,
but the basic principle is the same.)

In short, understanding programs in the presence of universal
overloading need not be any more difficult, given that the vast majority
of overloads will either be adjacent to a function, or the definition of
a type that is passed to that function.

And, in the absence of incompetence or deliberate intention to be
obscure, the few overloads that are not adjacent to the relevant type(s)
or function(s), will generally not need to be understood or known about
outside the scope where those overloads are defined.  (Except in the
"support modules" case, where best practice suggests naming them
accordingly.)


From guido at python.org  Wed Jul 25 00:16:46 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 24 Jul 2007 15:16:46 -0700
Subject: [Python-3000] New section for PEP 3124
In-Reply-To: <20070724220011.4F63D3A40B2@sparrow.telecommunity.com>
References: <20070724220011.4F63D3A40B2@sparrow.telecommunity.com>
Message-ID: <ca471dc20707241516m6af5329ax1dada6129718d058@mail.gmail.com>

I'm confused why you spend so much time refuting the argument, given
that you've already agreed to implement explicit decoration. Did I
misread that? As I tried to indicate with my "gut feelings" argument
this is not something that's up to rational argument. Also, the
paragraph starting with "As a result, the vast majority of overloads
can be found adjacent to..." sounds like it isn't a big loss to
require explicit decoration. So I'm sticking with it.

On 7/24/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> Taking the recent threads here, and Guido's comments off-list, I've
> attempted to put together a coherent response as a new section for
> the PEP, which I've checked in and included a copy of here.  If I
> have misrepresented anyone's argument, or if you spot something where
> you have a question or need a clarification, please let me know.  Thanks.
>
>
> Overloading Usage Patterns
> ==========================
>
> In discussion on the Python-3000 list, the proposed feature of allowing
> arbitrary functions to be overloaded has been somewhat controversial,
> with some people expressing concern that this would make programs more
> difficult to understand.
>
> The general thrust of this argument is that one cannot rely on what a
> function does, if it can be changed from anywhere in the program at any
> time.  Even though in principle this can already happen through
> monkeypatching or code substitution, it is considered poor practice to
> do so.
>
> However, providing support for overloading any function (or so the
> argument goes), is implicitly blessing such changes as being an
> acceptable practice.
>
> This argument appears to make sense in theory, but it is almost entirely
> mooted in practice for two reasons.
>
> First, people are generally not perverse, defining a function to do one
> thing in one place, and then summarily defining it to do the opposite
> somewhere else!  The principal reasons to extend the behavior of a
> function that has *not* been specifically made generic are to:
>
> * Add special cases not contemplated by the original function's author,
>    such as support for additional types.
>
> * Be notified of an action in order to cause some related operation to
>    be performed, either before the original operation is performed,
>    after it, or both.  This can include general-purpose operations like
>    adding logging, timing, or tracing, as well as application-specific
>    behavior.
>
> None of these reasons for adding overloads imply any change to the
> intended default or overall behavior of the existing function, however.
> Just as a base class method may be overridden by a subclass for these
> same two reasons, so too may a function be overloaded to provide for
> such enhancements.
>
> In other words, universal overloading does not equal *arbitrary*
> overloading, in the sense that we need not expect people to randomly
> redefine the behavior of existing functions in illogical or
> unpredictable ways.  If they did so, it would be no less of a bad
> practice than any other way of writing illogical or unpredictable code!
>
> However, to distinguish bad practice from good, it is perhaps necessary
> to clarify further what good practice for defining overloads *is*.  And
> that brings us to the second reason why generic functions do not
> necessarily make programs harder to understand: overloading patterns in
> actual programs tend to follow very predictable patterns.  (Both in
> Python and in languages that have no *non*-generic functions.)
>
> If a module is defining a new generic operation, it will usually also
> define any required overloads for existing types in the same place.
> Likewise, if a module is defining a new type, then it will usually
> define overloads there for any generic functions that it knows or cares
> about.
>
> As a result, the vast majority of overloads can be found adjacent to
> either the function being overloaded, or to a newly-defined type for
> which the overload is adding support.  Thus, overloads are highly-
> discoverable in the common case, as you are either looking at the
> function or the type, or both.
>
> It is only in rather infrequent cases that one will have overloads in a
> module that contains neither the function nor the type(s) for which the
> overload is added.  This would be the case if, say, a third-party
> created a bridge of support between one library's types and another
> library's generic function(s).  In such a case, however, best practice
> suggests prominently advertising this, especially by way of the module
> name.
>
> For example, PyProtocols defines such bridge support for working with
> Zope interfaces and legacy Twisted interfaces, using modules called
> ``protocols.twisted_support`` and ``protocols.zope_support``.  (These
> bridges are done with interface adapters, rather than generic functions,
> but the basic principle is the same.)
>
> In short, understanding programs in the presence of universal
> overloading need not be any more difficult, given that the vast majority
> of overloads will either be adjacent to a function, or the definition of
> a type that is passed to that function.
>
> And, in the absence of incompetence or deliberate intention to be
> obscure, the few overloads that are not adjacent to the relevant type(s)
> or function(s), will generally not need to be understood or known about
> outside the scope where those overloads are defined.  (Except in the
> "support modules" case, where best practice suggests naming them
> accordingly.)
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Wed Jul 25 00:30:38 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 24 Jul 2007 15:30:38 -0700
Subject: [Python-3000] [Python-Dev] Py3k: error during 'make install' in
	py3k-struni ?
In-Reply-To: <e7ba66e40707241524v2ad90c51hcb3e63f0a8ea9b08@mail.gmail.com>
References: <e7ba66e40707241524v2ad90c51hcb3e63f0a8ea9b08@mail.gmail.com>
Message-ID: <ca471dc20707241530r4957c856ved9bfa5a9a023c6@mail.gmail.com>

Yeah, that particular test is not yet working. (Fixes are welcome --
see http://wiki.python.org/moin/Py3kStrUniTests for how to help.)

I believe I rigged "make install" to continue after this error -- did
the rest of the install complete?

FWIW, a better place to discuss Py3k bleeding edge stuff is
python-3000 at python.org. Sign up at the usual place. (I've CC'ed that
list now -- please remove python-dev from followups.)

--Guido

On 7/24/07, Lisandro Dalcin <dalcinl at gmail.com> wrote:
> After checking out the py3k-struni branch, 'make install' issued this:
>
> Compiling /usr/local/python/3.0/lib/python3.0/test/test_tarfile.py ...
> *** SyntaxError: ('expected string, bytes found',
> ('/usr/local/python/3.0/lib/python3.0/test/test_tarfile.py', 0, 0,
> None))
>
> If this is expected to fail, please forget this.
>
> --
> Lisandro Dalcín
> ---------------
> Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
> Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
> Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
> PTLC - Güemes 3450, (3000) Santa Fe, Argentina
> Tel/Fax: +54-(0)342-451.1594


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Wed Jul 25 01:01:06 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 24 Jul 2007 19:01:06 -0400
Subject: [Python-3000] New section for PEP 3124
In-Reply-To: <ca471dc20707241516m6af5329ax1dada6129718d058@mail.gmail.com>
References: <20070724220011.4F63D3A40B2@sparrow.telecommunity.com>
	<ca471dc20707241516m6af5329ax1dada6129718d058@mail.gmail.com>
Message-ID: <20070724225847.804E13A40A7@sparrow.telecommunity.com>

At 03:16 PM 7/24/2007 -0700, Guido van Rossum wrote:
>I'm confused why you spend so much time refuting the argument,

The purpose was to capture the arguments on both sides for posterity 
as part of the PEP.


>  Also, the
>paragraph starting with "As a result, the vast majority of overloads
>can be found adjacent to..." sounds like it isn't a big loss to
>require explicit decoration.

Perhaps these two bits should have been closer together, then:

>On 7/24/07, Phillip J. Eby <pje at telecommunity.com> wrote:
>>The principal reasons to extend the behavior of a
>>function that has *not* been specifically made generic are to:
>>
>>* Add special cases not contemplated by the original function's author,
>>    such as support for additional types.
>>
>>* Be notified of an action in order to cause some related operation to
>>    be performed, either before the original operation is performed,
>>    after it, or both.  This can include general-purpose operations like
>>    adding logging, timing, or tracing, as well as application-specific
>>    behavior.
>>...
>>As a result, the vast majority of overloads can be found adjacent to
>>either the function being overloaded, or to a *newly-defined type for
>>which the overload is adding support*

Emphasis added to the last bit -- you can't add support for a 
newly-defined type to a previously-existing function that was not 
declared generic, unless arbitrary overloads are allowed.

For example, epydoc and pydoc contain functions that inspect the type 
of their arguments in order to decide what to do with them.  While it's 
arguable that in a GF world, the authors *should* have made those 
functions overloadable, it isn't reasonable to expect everyone to 
rewrite their code to make everything overloadable, nor to correctly 
anticipate every function for which extension might be needed.
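
A made-up miniature of that style of function (summarize and Matrix are 
hypothetical, not taken from epydoc or pydoc):

```python
def summarize(obj):
    """Type-inspecting function in the style described above."""
    if isinstance(obj, str):
        return "text of length %d" % len(obj)
    if isinstance(obj, (list, tuple)):
        return "sequence of %d items" % len(obj)
    return "unknown object"


class Matrix:
    """A newly-defined type the original author never anticipated."""

    def __init__(self, rows):
        self.rows = rows


# Without editing summarize() itself, Matrix falls through to the default.
fallback = summarize(Matrix([[1, 2], [3, 4]]))
```

Supporting Matrix here means either editing summarize() or being able 
to overload a function that was never declared generic.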


>As I tried to indicate with my "gut feelings" argument
>this is not something that's up to rational argument.

Of course...  but the purpose was to document the experiences upon 
which *my* gut feelings are based, since that aspect of the PEP was 
not previously dealt with adequately.

In retrospect, the new section is weak mainly because it's phrased as 
a defense to a critique, rather than being written as a motivation 
for the proposed feature.  So much for the attempt at a quick fix.  :)


From dalcinl at gmail.com  Wed Jul 25 01:14:03 2007
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Tue, 24 Jul 2007 20:14:03 -0300
Subject: [Python-3000] [Python-Dev] Py3k: error during 'make install' in
	py3k-struni ?
In-Reply-To: <ca471dc20707241530r4957c856ved9bfa5a9a023c6@mail.gmail.com>
References: <e7ba66e40707241524v2ad90c51hcb3e63f0a8ea9b08@mail.gmail.com>
	<ca471dc20707241530r4957c856ved9bfa5a9a023c6@mail.gmail.com>
Message-ID: <e7ba66e40707241614g2a5180b3o871f2bd73d57a695@mail.gmail.com>

On 7/24/07, Guido van Rossum <guido at python.org> wrote:
> I believe I rigged "make install" to continue after this error -- did
> the rest of the install complete?

Yes, it continued fine. BTW, are you interested in my sending the output
of the Python test suite? I'm on a Fedora Core 6 box.

I could build my wrappers for MPI without problems (they were working
against p3yk branch, but I was warned that development has moved to
py3k-struni).

However, I am having trouble with 'pickle', but perhaps this is only
my fault; I just imported pickle instead of cPickle (and all this in a
C extension module). I am using that because cPickle seems not to be
available in py3k-struni.


-- 
Lisandro Dalcín

From lists at cheimes.de  Wed Jul 25 01:28:24 2007
From: lists at cheimes.de (Christian Heimes)
Date: Wed, 25 Jul 2007 01:28:24 +0200
Subject: [Python-3000] Py3k: error during 'make install' in py3k-struni ?
In-Reply-To: <ca471dc20707241530r4957c856ved9bfa5a9a023c6@mail.gmail.com>
References: <e7ba66e40707241524v2ad90c51hcb3e63f0a8ea9b08@mail.gmail.com>
	<ca471dc20707241530r4957c856ved9bfa5a9a023c6@mail.gmail.com>
Message-ID: <f861uv$55h$1@sea.gmane.org>

Guido van Rossum wrote:
> On 7/24/07, Lisandro Dalcin <dalcinl at gmail.com> wrote:
>> After checking out the py3k-struni branch, 'make install' issued this:
>>
>> Compiling /usr/local/python/3.0/lib/python3.0/test/test_tarfile.py ...
>> *** SyntaxError: ('expected string, bytes found',
>> ('/usr/local/python/3.0/lib/python3.0/test/test_tarfile.py', 0, 0,
>> None))
>>
>> If this is expected to fail, please forget this.

It should not fail, but we know that it is failing. The module isn't
easy to fix either. I spent about an hour on tarfile.py without any
luck. It's a beast and seems to be rather old style code from the Python
1.x days.

Christian


From lists at cheimes.de  Wed Jul 25 01:33:09 2007
From: lists at cheimes.de (Christian Heimes)
Date: Wed, 25 Jul 2007 01:33:09 +0200
Subject: [Python-3000] [Python-Dev] Py3k: error during 'make install' in
 py3k-struni ?
In-Reply-To: <e7ba66e40707241614g2a5180b3o871f2bd73d57a695@mail.gmail.com>
References: <e7ba66e40707241524v2ad90c51hcb3e63f0a8ea9b08@mail.gmail.com>	<ca471dc20707241530r4957c856ved9bfa5a9a023c6@mail.gmail.com>
	<e7ba66e40707241614g2a5180b3o871f2bd73d57a695@mail.gmail.com>
Message-ID: <f8627t$6a7$1@sea.gmane.org>

Lisandro Dalcin wrote:
> However, I am having trouble with 'pickle', but perhaps this is only
> my fault, i just imported pickle instead of cPickle (and all this in a
> C extension module). I am using that because cPickle seems to be not
> available in the py3k-struni.

The pickle module is broken as well. The cPickle module won't be
available in Python 3000. The C optimizations of the cPickle module are
going to be integrated into the pickle module during a Google Summer of
Code project. The new pickle code will be subclass-able (cPickle
couldn't be subclassed) but will have optimized C code to speed up
pickling and unpickling.
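
For a sense of what subclassing buys you, here is a sketch written 
against the pickle module as it exists in released Python 3 versions, 
not the 2007 branch. The Tag, TagPickler and TagUnpickler names are 
hypothetical; persistent_id and persistent_load are the documented 
hooks.

```python
import io
import pickle


class Tag:
    def __init__(self, name):
        self.name = name


class TagPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Store Tag instances as an external reference, not by value.
        if isinstance(obj, Tag):
            return ("Tag", obj.name)
        return None  # everything else is pickled normally


class TagUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        kind, name = pid
        if kind == "Tag":
            return Tag(name)
        raise pickle.UnpicklingError("unsupported persistent id")


buf = io.BytesIO()
TagPickler(buf).dump([1, Tag("x")])
buf.seek(0)
restored = TagUnpickler(buf).load()
```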

Christian


From greg.ewing at canterbury.ac.nz  Wed Jul 25 03:43:04 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 25 Jul 2007 13:43:04 +1200
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070724004850.2F2343A403D@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
	<46A2B2BB.9070305@canterbury.ac.nz>
	<20070722015630.8F34C3A403A@sparrow.telecommunity.com>
	<46A3EC9B.4020507@canterbury.ac.nz>
	<20070723004703.C3A903A40A9@sparrow.telecommunity.com>
	<46A54092.8030606@canterbury.ac.nz>
	<20070724004850.2F2343A403D@sparrow.telecommunity.com>
Message-ID: <46A6AAA8.8080607@canterbury.ac.nz>

Phillip J. Eby wrote:
> ...and may be subclassed in an unlimited number of places.
> 
> A generic function is defined in just one place, with a limited number 
> of "generic" methods typically adjoining it, and may be extended in an 
> unlimited number of places.
> 
> Where's the difference?

With GFs, even if you assume a particular runtime type,
it can *still* be extended in an unlimited number of places.

>> It also provides a convenient mental chunk under which to
>> group all the operations that it implements.
> 
> The function itself is the grouping, in the same way that Python's 
> operator.* functions are, or its built-in generics like len() and 
> iter().

But they're just syntactic sugar for calling methods of the
objects involved, so those objects' classes have full control
of what happens. If len() were a GF in your sense, the code
implementing it for a given type could appear anywhere.
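
A tiny sketch of what I mean, with a hypothetical Deck class: the only 
place to look for what len() does to a Deck is the Deck class itself.

```python
class Deck:
    def __init__(self, cards):
        self._cards = list(cards)

    def __len__(self):
        # len(deck) ends up here; the class controls the result.
        return len(self._cards)

    def __iter__(self):
        # iter(deck) likewise delegates to the class.
        return iter(self._cards)


deck = Deck("abc")
size = len(deck)
cards = list(iter(deck))
```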

>> No, you're going to find every function whose name is 'foo',
>> whether it's a method of the particular GF you have in mind
>> or not.
>
> And this doesn't apply to normal methods?

Yes, it does to some extent, and that can be a nuisance. But
in the first instance I'm not going to grep the whole program,
just the file where the class is defined. If I don't find it
there, I'll move on to the file defining its base class, etc.
The first definition I find will be the relevant one.

In other words, I have a search *path* through a structure
that's reflected in the layout of the source files. GFs
destroy that structure.
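
That search path is the same one the interpreter follows, so it can be 
written down. A minimal sketch, with hypothetical classes and a 
hypothetical defining_class helper:

```python
def defining_class(cls, name):
    """Return the first class on cls's MRO that defines name, or None."""
    for klass in cls.__mro__:
        if name in vars(klass):
            return klass
    return None


class Shape:
    def area(self):
        return 0

    def describe(self):
        return "shape"


class Polygon(Shape):
    def describe(self):
        return "polygon"


class Square(Polygon):
    def area(self):
        return 1
```

Here defining_class(Square, "describe") stops at Polygon, which is the 
file I would end up reading by hand.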

(Multiple inheritance can mess this up a bit, but that just
means multiple inheritance has problems, not that GFs are
good.)

> For one thing, you can isolate your search to modules that
> import the function being overridden

But I'll still get an unordered set of results that I'll
have to sort through to find the most relevant method.

> The thing that you seem to keep missing in your analysis is that Python 
> already *has* generic functions in the language specification,

But only in a very restricted way -- so restricted that
I've never even thought of them as GFs, but as just another
way to write a method call.

--
Greg

From greg.ewing at canterbury.ac.nz  Wed Jul 25 04:39:06 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 25 Jul 2007 14:39:06 +1200
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070724013722.B2F5B3A403D@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
	<46A2AE31.2080105@canterbury.ac.nz>
	<20070722020422.5AAAC3A403A@sparrow.telecommunity.com>
	<46A3ECB7.9070504@canterbury.ac.nz>
	<20070723010750.E27693A40A9@sparrow.telecommunity.com>
	<46A54DCE.8050205@canterbury.ac.nz>
	<20070724013722.B2F5B3A403D@sparrow.telecommunity.com>
Message-ID: <46A6B7CA.3010308@canterbury.ac.nz>

Phillip J. Eby wrote:
> At 12:54 PM 7/24/2007 +1200, Greg Ewing wrote:
>> When adding an overload to a GF, what methodology can I
>> follow to ensure that my overload doesn't interact in an
>> unfortunate way with another one somewhere else, perhaps
>> one not written by me?
> 
> What methodology can you follow that ensures that same thing when 
> overriding a method in a subclass?

It's dead simple -- my method always wins.

This is true even in the presence of multiple inheritance.
Problems only arise there if I use super() to make an
inherited method call (so I don't do that) or if other
people multiply inherit from me -- in which case it's
their problem, not mine.

> Okay, well I guess the above statements all put you squarely in the "OO 
> is too scary" category,

Certainly not -- I don't find OO scary at all. I wouldn't
say that I find GFs "scary" either, only that I would use
them cautiously and sparingly. I don't agree that there is
no difference between the traditional OO model and GFs.
With GFs there is less static structure that you can rely
on.

>> And what if the program doesn't exist yet, because I'm
>> still thinking about how to write it? Or it exists but
>> isn't yet in a state where it can be run successfully?
> 
> I don't understand what you're asking, here.

Don't you think it's important to be able to reason about
the way a program will behave while you're in the process
of designing it? If you haven't written runnable code yet,
you can't run it to get a list of method overrides.

> I was under the impression that 
> under certain circumstances, if one object is "more specific" than the 
> other (i.e., one is an instance of a subclass of the other's type), then 
> that one gets first say.

You may be right. But the fact remains that the method called
will be a method of one class or the other -- it can't be some
function defined in an arbitrary place.

--
Greg

From greg.ewing at canterbury.ac.nz  Wed Jul 25 06:06:04 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 25 Jul 2007 16:06:04 +1200
Subject: [Python-3000] New section for PEP 3124
In-Reply-To: <20070724225847.804E13A40A7@sparrow.telecommunity.com>
References: <20070724220011.4F63D3A40B2@sparrow.telecommunity.com>
	<ca471dc20707241516m6af5329ax1dada6129718d058@mail.gmail.com>
	<20070724225847.804E13A40A7@sparrow.telecommunity.com>
Message-ID: <46A6CC2C.9040506@canterbury.ac.nz>

Phillip J. Eby wrote:
> At 03:16 PM 7/24/2007 -0700, Guido van Rossum wrote:
> 
>>I'm confused why you spend so much time refuting the argument,
> 
> The purpose was to capture the arguments on both sides for posterity 
> as part of the PEP.

I don't think you need to spend so many words on the
argument itself -- a one-paragraph summary would be
enough.

The parts outlining recommended practice for overloading
look useful, though. This is the sort of thing I was
after with my "What methodology can I follow?" question.

But I would phrase it in an "It is recommended that..."
kind of way rather than making assertions about what
"can be found" in code (that doesn't exist yet in
Python).

> For example, epydoc and pydoc contain functions that inspect the type 
> of their arguments in order to decide what to do with them.  While it's 
> arguable that in a GF world, the authors *should* have made those 
> functions overloadable, it isn't reasonable to expect everyone to 
> rewrite their code to make everything overloadable, nor to correctly 
> anticipate every function for which extension might be needed.

However, given the existence of GFs, someone writing
something like pydoc, and coming to a point where he is
about to write an if-else statement that switches on
a type, perhaps ought to at least suspect that it might
be a good idea to use a GF instead?
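
A minimal sketch of the contrast being drawn here. It uses
functools.singledispatch from later versions of the standard library
as a stand-in for a generic function; this is an assumption made for
illustration and is not the API proposed in PEP 3124. The function
names are hypothetical.

```python
from functools import singledispatch


def render_by_type_switch(obj):
    # Closed form: supporting a new type means editing this function.
    if isinstance(obj, int):
        return "int: %d" % obj
    elif isinstance(obj, str):
        return "str: %r" % obj
    else:
        return "unknown"


@singledispatch
def render(obj):
    # Fallback used when no more specific implementation is registered.
    return "unknown"


@render.register(int)
def _render_int(obj):
    return "int: %d" % obj


@render.register(str)
def _render_str(obj):
    return "str: %r" % obj


# A later extension, added without touching render() itself.
@render.register(list)
def _render_list(obj):
    return "list of %d" % len(obj)


print(render_by_type_switch([1, 2]))
print(render([1, 2]))
```

The type switch has no way to learn about lists short of being
rewritten, whereas the generic version accepts the extra case from
outside.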

--
Greg

From guido at python.org  Wed Jul 25 06:53:37 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 24 Jul 2007 21:53:37 -0700
Subject: [Python-3000] [Python-Dev] Py3k: error during 'make install' in
	py3k-struni ?
In-Reply-To: <f8627t$6a7$1@sea.gmane.org>
References: <e7ba66e40707241524v2ad90c51hcb3e63f0a8ea9b08@mail.gmail.com>
	<ca471dc20707241530r4957c856ved9bfa5a9a023c6@mail.gmail.com>
	<e7ba66e40707241614g2a5180b3o871f2bd73d57a695@mail.gmail.com>
	<f8627t$6a7$1@sea.gmane.org>
Message-ID: <ca471dc20707242153p748e17eeh81d346097159b583@mail.gmail.com>

What's broken about pickle on the struni branch? It passes all its tests.

On 7/24/07, Christian Heimes <lists at cheimes.de> wrote:
> Lisandro Dalcin wrote:
> > However, I am having trouble with 'pickle', but perhaps this is only
> > my fault, i just imported pickle instead of cPickle (and all this in a
> > C extension module). I am using that because cPickle seems to be not
> > available in the py3k-struni.
>
> The pickle module is broken as well. The cPickle module won't be
> available in Python 3000. The C optimizations of the cPickle module are
> going to be integrated into the pickle module during a Google Summer of
> Code project. The new pickle code will be subclass-able (cPickle
> couldn't be subclassed) but will have optimized C code to speed up
> pickling and unpickling.
>
> Christian
>
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From talin at acm.org  Wed Jul 25 06:54:33 2007
From: talin at acm.org (Talin)
Date: Tue, 24 Jul 2007 21:54:33 -0700
Subject: [Python-3000] Latest revision of PEP 3101
Message-ID: <46A6D789.9060502@acm.org>

You can find it in the usual place:

     http://www.python.org/dev/peps/pep-3101/

There are no changes to public APIs, the only changes are to the 
extension mechanism for custom formatting classes. Also, I've edited a 
lot of the text in order to improve the clarity of explanations and cut 
out excess verbiage.

Comments are welcome as usual.

-- Talin

From guido at python.org  Wed Jul 25 06:55:10 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 24 Jul 2007 21:55:10 -0700
Subject: [Python-3000] Py3k: error during 'make install' in py3k-struni ?
In-Reply-To: <f861uv$55h$1@sea.gmane.org>
References: <e7ba66e40707241524v2ad90c51hcb3e63f0a8ea9b08@mail.gmail.com>
	<ca471dc20707241530r4957c856ved9bfa5a9a023c6@mail.gmail.com>
	<f861uv$55h$1@sea.gmane.org>
Message-ID: <ca471dc20707242155p665b8bd4ie0a065cbe31c09ba@mail.gmail.com>

Tarfile is not from the 1.x days. But you're right, it's hairy. It
also changes too much (e.g. between 2.4.1 and 2.4.3 a refactoring
happened that also caused a new bug. The code has evolved quite a bit
since then and is still evolving... ;-( )

On 7/24/07, Christian Heimes <lists at cheimes.de> wrote:
> Guido van Rossum wrote:
> > On 7/24/07, Lisandro Dalcin <dalcinl at gmail.com> wrote:
> >> After checking out the py3k-struni branch, 'make install' issued this:
> >>
> >> Compiling /usr/local/python/3.0/lib/python3.0/test/test_tarfile.py ...
> >> *** SyntaxError: ('expected string, bytes found',
> >> ('/usr/local/python/3.0/lib/python3.0/test/test_tarfile.py', 0, 0,
> >> None))
> >>
> >> If this is expected to fail, please forget this.
>
> It should not fail, but we know that it is failing. The module isn't
> easy to fix either. I spent about an hour on tarfile.py without any
> luck. It's a beast and seems to be rather old style code from the Python
> 1.x days.
>
> Christian
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jyasskin at gmail.com  Wed Jul 25 07:18:00 2007
From: jyasskin at gmail.com (Jeffrey Yasskin)
Date: Tue, 24 Jul 2007 22:18:00 -0700
Subject: [Python-3000] struni and the Apple four-character-codes
Message-ID: <5d44f72f0707242218i71554da4x1d6924715f016562@mail.gmail.com>

I'm looking through a couple of the OS X tests and have run into the
question of what to do with four-character codes. (For those of you
who are unfamiliar with these, Apple, around the dawn of time, decided
that C constants like 'TEXT' (yes, those are single quotes) would
compile to the uint32_t 0x54455854 (or maybe the other-endian version
of that) so they could use these as cheap-but-readable type
identifiers.) In Python 2, these are represented as 'str' instances,
which PyMac_GetOSType() in Python/mactoolboxglue.c converts to the
native int format. For Python 3, right now they're str8's, but str8 is
theoretically supposed to go away. Because they're binary constants
displayed as ASCII, not unicode text, I initially thought that 'bytes'
was the appropriate type. Unfortunately, bytes is mutable, and I think
it makes sense to hash these constants (and some code in aepack.py
does).

So, I'm stuck and wanted to ask the list for input. I see 5 options:
 1) Make these str instances so they're immutable and just rely on
convention and runtime errors to keep them in ascii.
 2) Make them bytes, and cast them to something else when you want to
make them keys in a dict.
 3) Keep them str8 and give up on getting rid of it.
 4) Make bytes immutable, add a 'buffer' type which acts like the
current bytes type, and make these codes instances of bytes. [probably
impossible this late in the game]
 5) Make a new hashable class for these codes which converts them to
and from ints and bytes and becomes the general argument type for the
apple platform interface. [Cleanest, but lots of work that I'm not
volunteering to do]

Thoughts?
Jeffrey

From talin at acm.org  Wed Jul 25 07:29:21 2007
From: talin at acm.org (Talin)
Date: Tue, 24 Jul 2007 22:29:21 -0700
Subject: [Python-3000] struni and the Apple four-character-codes
In-Reply-To: <5d44f72f0707242218i71554da4x1d6924715f016562@mail.gmail.com>
References: <5d44f72f0707242218i71554da4x1d6924715f016562@mail.gmail.com>
Message-ID: <46A6DFB1.8050908@acm.org>

Jeffrey Yasskin wrote:
> I'm looking through a couple of the OS X tests and have run into the
> question of what to do with four-character codes. (For those of you
> who are unfamiliar with these, Apple, around the dawn of time, decided
> that C constants like 'TEXT' (yes, those are single quotes) would
> compile to the uint32_t 0x54455854 (or maybe the other-endian version
> of that) so they could use these as cheap-but-readable type
> identifiers.) In Python 2, these are represented as 'str' instances,
> which PyMac_GetOSType() in Python/mactoolboxglue.c converts to the
> native int format. For Python 3, right now they're str8's, but str8 is
> theoretically supposed to go away. Because they're binary constants
> displayed as ASCII, not unicode text, I initially thought that 'bytes'
> was the appropriate type. Unfortunately, bytes is mutable, and I think
> it makes sense to hash these constants (and some code in aepack.py
> does).
> 
> So, I'm stuck and wanted to ask the list for input. I see 5 options:
>  1) Make these str instances so they're immutable and just rely on
> convention and runtime errors to keep them in ascii.
>  2) Make them bytes, and cast them to something else when you want to
> make them keys in a dict.
>  3) Keep them str8 and give up on getting rid of it.
>  4) Make bytes immutable, add a 'buffer' type which acts like the
> current bytes type, and make these codes instances of bytes. [probably
> impossible this late in the game]
>  5) Make a new hashable class for these codes which converts them to
> and from ints and bytes and becomes the general argument type for the
> apple platform interface. [Cleanest, but lots of work that I'm not
> volunteering to do]
> 
> Thoughts?
> Jeffrey

Yeah. I like the idea of converting them to integers, but I don't think 
you need a special hash table class for that. Instead, create a wrapper 
class for the four character codes:

    TextId = FourCharId("TEXT")
    i = int(TextId) # Integer value
    s = str(TextId) # String representation
    some_map[TextId] = "Some Text" # Can use as dict key

The wrapper class is an immutable class that handles conversion to 
integer form in the constructor, hashing, and has a __str__ and __repr__ 
method that produces the original input string. Then you can use that as 
a key to a regular dict.
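
A minimal sketch of what such a wrapper could look like. Everything
beyond the four usage lines above is an assumption: the big-endian
packing follows the 'TEXT' -> 0x54455854 example from the original
question, and the repr shown here uses a conventional constructor-like
form rather than the bare input string.

```python
import struct


class FourCharId:
    """Hypothetical immutable wrapper for a four-character code."""

    __slots__ = ("_code", "_value")

    def __init__(self, code):
        raw = code.encode("ascii")
        if len(raw) != 4:
            raise ValueError("expected exactly four ASCII characters")
        # Assumption: big-endian packing, so "TEXT" becomes 0x54455854.
        object.__setattr__(self, "_code", code)
        object.__setattr__(self, "_value", struct.unpack(">I", raw)[0])

    def __setattr__(self, name, value):
        # Reject rebinding so instances stay safe to use as dict keys.
        raise AttributeError("FourCharId is immutable")

    def __int__(self):
        return self._value

    def __str__(self):
        return self._code

    def __repr__(self):
        return "FourCharId(%r)" % self._code

    def __eq__(self, other):
        return isinstance(other, FourCharId) and self._value == other._value

    def __hash__(self):
        return hash(self._value)


TextId = FourCharId("TEXT")
some_map = {TextId: "Some Text"}
print(hex(int(TextId)), str(TextId), some_map[FourCharId("TEXT")])
```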

-- Talin


From guido at python.org  Wed Jul 25 07:44:17 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 24 Jul 2007 22:44:17 -0700
Subject: [Python-3000] struni and the Apple four-character-codes
In-Reply-To: <46A6DFB1.8050908@acm.org>
References: <5d44f72f0707242218i71554da4x1d6924715f016562@mail.gmail.com>
	<46A6DFB1.8050908@acm.org>
Message-ID: <ca471dc20707242244x2c423cfdmc22cde053a8563a9@mail.gmail.com>

Make them bytes literals (some code already does this), and convert
them to integers when they're needed to be used as hash keys. (Does
this happen a lot? I haven't seen it yet.)

I would endorse an API to create an int from a bytes array (of
arbitrary length) and vice versa -- that would be a useful way to
marshal long integers, too. There's probably already a C API to do
something like that.
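
For readers on later releases: Python 3.2 and newer provide
int.from_bytes() and int.to_bytes(), which did not exist when this
message was written. The sketch below assumes big-endian byte order
for the four-character code, matching the 0x54455854 example; the
variable names are hypothetical.

```python
code = b"TEXT"

# bytes -> int, usable as a dict key
value = int.from_bytes(code, "big")
table = {value: "plain text"}

# int -> bytes, recovering the original four characters
round_trip = value.to_bytes(4, "big")

# The same pair of calls marshals an integer of arbitrary size.
big = 2 ** 100 + 7
raw = big.to_bytes((big.bit_length() + 7) // 8, "big")
restored = int.from_bytes(raw, "big")

print(hex(value), round_trip, restored == big)
```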

--Guido

On 7/24/07, Talin <talin at acm.org> wrote:
> Jeffrey Yasskin wrote:
> > I'm looking through a couple of the OS X tests and have run into the
> > question of what to do with four-character codes. (For those of you
> > who are unfamiliar with these, Apple, around the dawn of time, decided
> > that C constants like 'TEXT' (yes, those are single quotes) would
> > compile to the uint32_t 0x54455854 (or maybe the other-endian version
> > of that) so they could use these as cheap-but-readable type
> > identifiers.) In Python 2, these are represented as 'str' instances,
> > which PyMac_GetOSType() in Python/mactoolboxglue.c converts to the
> > native int format. For Python 3, right now they're str8's, but str8 is
> > theoretically supposed to go away. Because they're binary constants
> > displayed as ASCII, not unicode text, I initially thought that 'bytes'
> > was the appropriate type. Unfortunately, bytes is mutable, and I think
> > it makes sense to hash these constants (and some code in aepack.py
> > does).
> >
> > So, I'm stuck and wanted to ask the list for input. I see 5 options:
> >  1) Make these str instances so they're immutable and just rely on
> > convention and runtime errors to keep them in ascii.
> >  2) Make them bytes, and cast them to something else when you want to
> > make them keys in a dict.
> >  3) Keep them str8 and give up on getting rid of it.
> >  4) Make bytes immutable, add a 'buffer' type which acts like the
> > current bytes type, and make these codes instances of bytes. [probably
> > impossible this late in the game]
> >  5) Make a new hashable class for these codes which converts them to
> > and from ints and bytes and becomes the general argument type for the
> > apple platform interface. [Cleanest, but lots of work that I'm not
> > volunteering to do]
> >
> > Thoughts?
> > Jeffrey
>
> Yeah. I like the idea of converting them to integers, but I don't think
> you need a special hash table class for that. Instead, create a wrapper
> class for the four character codes:
>
>     TextId = FourCharId("TEXT")
>     i = int(TextId) # Integer value
>     s = str(TextId) # String representation
>     some_map[TextId] = "Some Text" # Can use as dict key
>
> The wrapper class is an immutable class that handles conversion to
> integer form in the constructor, hashing, and has a __str__ and __repr__
> method that produces the original input string. Then you can use that as
> a key to a regular dict.
>
> -- Talin
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From lists at cheimes.de  Wed Jul 25 10:54:38 2007
From: lists at cheimes.de (Christian Heimes)
Date: Wed, 25 Jul 2007 10:54:38 +0200
Subject: [Python-3000] [Python-Dev] Py3k: error during 'make install' in
 py3k-struni ?
In-Reply-To: <ca471dc20707242153p748e17eeh81d346097159b583@mail.gmail.com>
References: <e7ba66e40707241524v2ad90c51hcb3e63f0a8ea9b08@mail.gmail.com>	<ca471dc20707241530r4957c856ved9bfa5a9a023c6@mail.gmail.com>	<e7ba66e40707241614g2a5180b3o871f2bd73d57a695@mail.gmail.com>	<f8627t$6a7$1@sea.gmane.org>
	<ca471dc20707242153p748e17eeh81d346097159b583@mail.gmail.com>
Message-ID: <46A70FCE.9030508@cheimes.de>

Guido van Rossum wrote:
> What's broken about pickle on the struni branch? It passes all its tests.
> 

My brain ... :(
I had some old code lying around. After svn revert + svn up all pickle
tests are passing. *blush*

Christian


From jan.grant at bristol.ac.uk  Wed Jul 25 11:03:08 2007
From: jan.grant at bristol.ac.uk (Jan Grant)
Date: Wed, 25 Jul 2007 10:03:08 +0100 (BST)
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <46A58390.8050408@acm.org>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
	<46A2B2BB.9070305@canterbury.ac.nz>
	<20070722015630.8F34C3A403A@sparrow.telecommunity.com>
	<46A3EC9B.4020507@canterbury.ac.nz>
	<20070723004703.C3A903A40A9@sparrow.telecommunity.com>
	<46A54092.8030606@canterbury.ac.nz>
	<20070724004850.2F2343A403D@sparrow.telecommunity.com>
	<ca471dc20707231957n2e58258v7b86b904803890dd@mail.gmail.com>
	<20070724034006.9B23E3A403D@sparrow.telecommunity.com>
	<46A58390.8050408@acm.org>
Message-ID: <20070725095353.I54289@tribble.ilrt.bris.ac.uk>

On Mon, 23 Jul 2007, Talin wrote:

> Phillip J. Eby wrote:
> 
> > I just don't see that the things Greg is describing aren't equally 
> > applicable to traditional methods.
> 
> I wasn't going to get into this, but - since you asked :)
> 
> The short form of the argument is that being able to overload any 
> function as a generic function retroactively changes the implicit 
> contract of what that function is.

I don't think this is really true in programs written with good taste - 
ie, it's no more true than in the OO case.

In the OO case, one might consider the class of an object to be closely 
associated with a contract describing its intended semantics (its type). 
If a function takes a parameter and is written expecting that it is 
passed an argument of type B (for some class B), then by subclassing B 
into a derived class, D, you _ought_ to be able to pass an instance of D 
to the same function which should be able to use it, regardless.

That's what subclassing _means_: if D is a subclass of B, then all 
instances of D should behave appropriately and according to the intended 
semantics of B when used as a B.

Of course, it's perfectly possible to abuse subclassing to acquire 
implementation rather than the type/contract, but well-written* OO 
programs at least draw a clear distinction between those uses if they do 
it at all.

So, when you look at an OO program that makes extensive use of 
subclassing, you typically have a notion of what method calls should do 
at a broad semantic level because that notion is part of the contract 
implicit in the type.


Exactly the same is true with GFs. Yes, you can overload "add()" to mean 
"subtract" or "remove a random file" or "close all database connections" 
in certain cases. That's painfully flying in the face of the intended 
semantics of the function you're overloading; so, don't do that.


Cheers,
jan

* Excuse the unavoidably emotive terminology like "well-written". I know 
there are other views - I'm just arguing this one.

-- 
jan grant, ISYS, University of Bristol. http://www.bris.ac.uk/
Tel +44 (0)117 3317661   http://ioctl.org/jan/
Spreadsheet through network. Oh yeah.

From ronaldoussoren at mac.com  Wed Jul 25 12:04:49 2007
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Wed, 25 Jul 2007 03:04:49 -0700
Subject: [Python-3000] struni and the Apple four-character-codes
In-Reply-To: <5d44f72f0707242218i71554da4x1d6924715f016562@mail.gmail.com>
References: <5d44f72f0707242218i71554da4x1d6924715f016562@mail.gmail.com>
Message-ID: <4EFA40DF-0113-1000-ED2D-BFFF4499DECF-Webmail-10021@mac.com>

I've CC-ed Jack Jansen as he has maintained the Mac libraries for ages (from way before OS9 was shiny and new).
 

On Wednesday, July 25, 2007, at 07:18AM, "Jeffrey Yasskin" <jyasskin at gmail.com> wrote:
>I'm looking through a couple of the OS X tests and have run into the
>question of what to do with four-character codes. (For those of you
>who are unfamiliar with these, Apple, around the dawn of time, decided
>that C constants like 'TEXT' (yes, those are single quotes) would
>compile to the uint32_t 0x54455854 (or maybe the other-endian version
>of that) so they could use these as cheap-but-readable type

AFAIK they are always converted as big-endian values.

>identifiers.) In Python 2, these are represented as 'str' instances,
>which PyMac_GetOSType() in Python/mactoolboxglue.c converts to the
>native int format. For Python 3, right now they're str8's, but str8 is
>theoretically supposed to go away. Because they're binary constants
>displayed as ASCII, not unicode text, I initially thought that 'bytes'
>was the appropriate type. Unfortunately, bytes is mutable, and I think
>it makes sense to hash these constants (and some code in aepack.py
>does).
>
>So, I'm stuck and wanted to ask the list for input. I see 5 options:
> 1) Make these str instances so they're immutable and just rely on
>convention and runtime errors to keep them in ascii.
> 2) Make them bytes, and cast them to something else when you want to
>make them keys in a dict.
> 3) Keep them str8 and give up on getting rid of it.
> 4) Make bytes immutable, add a 'buffer' type which acts like the
>current bytes type, and make these codes instances of bytes. [probably
>impossible this late in the game]
> 5) Make a new hashable class for these codes which converts them to
>and from ints and bytes and becomes the general argument type for the
>apple platform interface. [Cleanest, but lots of work that I'm not
>volunteering to do]

A 6th option is a subclass of int. Its constructor would accept a string containing the 4CC and the repr/str methods would return the string representation of the code.  IMHO this is the cleanest representation of 4CCs in Python because those codes are basically a "neat" way to enter integer literals in C.

This would also solve a problem that PyObjC users sometimes run into: Several C/Objective-C APIs return a dictionary where one  of the values is an integer and where one would commonly use 4CCs to write down literals. This currently causes unexpected failures but would do the right thing with this option.
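
A minimal sketch of this option, under stated assumptions: the class
name FourCC is hypothetical, and the big-endian conversion follows the
remark above rather than any existing Mac library code.

```python
import struct


class FourCC(int):
    """Hypothetical int subclass constructed from a four-character code."""

    def __new__(cls, code):
        raw = code.encode("ascii")
        if len(raw) != 4:
            raise ValueError("expected exactly four ASCII characters")
        # Assumption: the code is interpreted as a big-endian uint32.
        return super().__new__(cls, struct.unpack(">I", raw)[0])

    def __str__(self):
        # Turn the integer back into its four ASCII characters.
        return struct.pack(">I", self).decode("ascii")

    def __repr__(self):
        return "FourCC(%r)" % str(self)


text = FourCC("TEXT")

# Hashing and equality are inherited from int, so a dictionary keyed by
# plain integers can be queried with a FourCC and vice versa.
attrs = {0x54455854: "plain text"}
print(repr(text), text == 0x54455854, attrs[text])
```

Because only the string conversions are overridden, a value coming
back from a C API as a plain integer compares and hashes the same as
the FourCC written in the Python source.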

Ronald


From benji at benjiyork.com  Wed Jul 25 14:45:29 2007
From: benji at benjiyork.com (Benji York)
Date: Wed, 25 Jul 2007 08:45:29 -0400
Subject: [Python-3000] New section for PEP 3124
In-Reply-To: <20070724225847.804E13A40A7@sparrow.telecommunity.com>
References: <20070724220011.4F63D3A40B2@sparrow.telecommunity.com>	<ca471dc20707241516m6af5329ax1dada6129718d058@mail.gmail.com>
	<20070724225847.804E13A40A7@sparrow.telecommunity.com>
Message-ID: <46A745E9.1050803@benjiyork.com>

Phillip J. Eby wrote:
> For example, epydoc and pydoc contain functions that inspect the type 
> of their arguments in order to decide what to with them.  While it's 
> arguable that in a GF world, the authors *should* have made those 
> functions overloadable, it isn't reasonable to expect everyone to 
> rewrite their code to make everything overloadable, nor to correctly 
> anticipate every function for which extension might be needed.

That makes me wonder.  Will it be possible to monkeypatch a function to 
make it overloadable?  I don't know if people will really do that, but 
at least the irony would be worth it.
-- 
Benji York
http://benjiyork.com

From martin at v.loewis.de  Wed Jul 25 19:26:45 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 25 Jul 2007 19:26:45 +0200
Subject: [Python-3000] Py3k: error during 'make install' in py3k-struni ?
In-Reply-To: <f861uv$55h$1@sea.gmane.org>
References: <e7ba66e40707241524v2ad90c51hcb3e63f0a8ea9b08@mail.gmail.com>	<ca471dc20707241530r4957c856ved9bfa5a9a023c6@mail.gmail.com>
	<f861uv$55h$1@sea.gmane.org>
Message-ID: <46A787D5.2030000@v.loewis.de>

>>> After checking out the py3k-struni branch, 'make install' issued this:
>>>
>>> Compiling /usr/local/python/3.0/lib/python3.0/test/test_tarfile.py ...
>>> *** SyntaxError: ('expected string, bytes found',
>>> ('/usr/local/python/3.0/lib/python3.0/test/test_tarfile.py', 0, 0,
>>> None))
>>>
>>> If this is expected to fail, please forget this.
> 
> It should not fail, but we know that it is failing. The module isn't
> easy to fix either. I spent about an hour on tarfile.py without any
> luck. It's a beast and seems to be rather old style code from the Python
> 1.x days.

Don't despair, the white knight in shining armor might not be too far
away to save you from the dragon :-)

Seriously, Lars Gustäbel (CC'ed) has always been quite helpful in
fixing whatever problems arise with the tarfile module.

Lars, do you have a chance to look at porting the
module to 3k/struni?

Regards,
Martin

From lars at gustaebel.de  Thu Jul 26 00:54:41 2007
From: lars at gustaebel.de (Lars =?iso-8859-15?Q?Gust=E4bel?=)
Date: Thu, 26 Jul 2007 00:54:41 +0200
Subject: [Python-3000] Py3k: error during 'make install' in py3k-struni	?
In-Reply-To: <46A787D5.2030000@v.loewis.de>
References: <e7ba66e40707241524v2ad90c51hcb3e63f0a8ea9b08@mail.gmail.com>
	<ca471dc20707241530r4957c856ved9bfa5a9a023c6@mail.gmail.com>
	<f861uv$55h$1@sea.gmane.org> <46A787D5.2030000@v.loewis.de>
Message-ID: <20070725225441.GA18002@core.g33x.de>

On Wed, Jul 25, 2007 at 07:26:45PM +0200, "Martin v. Löwis" wrote:
> >>> After checking out the py3k-struni branch, 'make install' issued this:
> >>>
> >>> Compiling /usr/local/python/3.0/lib/python3.0/test/test_tarfile.py ...
> >>> *** SyntaxError: ('expected string, bytes found',
> >>> ('/usr/local/python/3.0/lib/python3.0/test/test_tarfile.py', 0, 0,
> >>> None))
> >>>
> >>> If this is expected to fail, please forget this.
> > 
> > It should not fail, but we know that it is failing. The module isn't
> > easy to fix either. I spent about an hour on tarfile.py without any
> > luck. It's a beast and seems to be rather old style code from the Python
> > 1.x days.
> 
> Don't despair, the white knight in shining armor might not be too far
> away to save you from the dragon :-)

Yea, I heard that call :-)

> Seriously, Lars Gustäbel (CC'ed) has always been quite helpful in
> fixing whatever problems arise with the tarfile module.
> 
> Lars, do you have a chance to look at porting the
> module to 3k/struni?

I just took a quick look at it, but I could not reproduce the
above error message. However, it is obvious that tarfile.py is
completely unusable in py3k-struni and it is my job to fix it,
which seems to me far from trivial at the moment. I have to
catch up with py3k development as well, so I am not able to
estimate when the job will be done.

-- 
Lars Gustäbel
lars at gustaebel.de

The world is a tragedy to those who feel, but a comedy
to those who think.
(Horace Walpole)

From greg.ewing at canterbury.ac.nz  Thu Jul 26 04:07:55 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 26 Jul 2007 14:07:55 +1200
Subject: [Python-3000] struni and the Apple four-character-codes
In-Reply-To: <5d44f72f0707242218i71554da4x1d6924715f016562@mail.gmail.com>
References: <5d44f72f0707242218i71554da4x1d6924715f016562@mail.gmail.com>
Message-ID: <46A801FB.7020300@canterbury.ac.nz>

Jeffrey Yasskin wrote:
> Apple, around the dawn of time, decided
> that C constants like 'TEXT' (yes, those are single quotes) would
> compile to the uint32_t 0x54455854

They weren't C constants originally, they were Pascal
constants, and it made sense at the time given the way
the Pascal compiler they were using handled string
literals. They also worked okay as multi-character char
literals in the early C compilers used on the Mac.
It's unfortunate that gcc gets persnickety about those.

> I initially thought that 'bytes'
> was the appropriate type. Unfortunately, bytes is mutable, and I think
> it makes sense to hash these constants (and some code in aepack.py
> does).

Is this another indication that we should have an
immutable version of the bytes type?

--
Greg

From ronaldoussoren at mac.com  Thu Jul 26 07:52:34 2007
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Thu, 26 Jul 2007 07:52:34 +0200
Subject: [Python-3000] struni and the Apple four-character-codes
In-Reply-To: <46A801FB.7020300@canterbury.ac.nz>
References: <5d44f72f0707242218i71554da4x1d6924715f016562@mail.gmail.com>
	<46A801FB.7020300@canterbury.ac.nz>
Message-ID: <3B03B169-1321-48BC-A14D-7B32592BE159@mac.com>


On 26 Jul, 2007, at 4:07, Greg Ewing wrote:
>
>> I initially thought that 'bytes'
>> was the appropriate type. Unfortunately, bytes is mutable, and I  
>> think
>> it makes sense to hash these constants (and some code in aepack.py
>> does).
>
> Is this another indication that we should have an
> immutable version of the bytes type?

No. Four-character-constants are *not* strings or byte arrays, they  
are integer literals.

Ronald

From dalcinl at gmail.com  Thu Jul 26 19:06:47 2007
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Thu, 26 Jul 2007 14:06:47 -0300
Subject: [Python-3000] interaction between locals, builtins and except clause
Message-ID: <e7ba66e40707261006s71826ce8p73194134992697f3@mail.gmail.com>

Porting to Py3K, I modified a function like the following, using a
trick to keep it working in Py2.x.

    def __iter__(self):
        if self == _mpi.INFO_NULL:
            return
        try:    range = xrange
        except: pass
        nkeys = _mpi.info_get_nkeys(self)
        for nthkey in range(nkeys):
            yield _mpi.info_get_nthkey(self, nthkey)

However, I've got in my unittests (running with py3k)

ERROR: testPyMethods (__main__.TestInfo)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests/unittest/test_info.py", line 123, in testPyMethods
    for key in INFO:
  File "/u/dalcinl/lib/python/mpi4py/MPI.py", line 937, in __iter__
    for nthkey in range(nkeys):
UnboundLocalError: local variable 'range' referenced before assignment


I am not completely sure if this is expected (it is, regarding
implementation, but perhaps not regarding Python as a language), so
I post this for your consideration.


-- 
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From eopadoan at altavix.com  Thu Jul 26 19:27:11 2007
From: eopadoan at altavix.com (Eduardo "EdCrypt" O. Padoan)
Date: Thu, 26 Jul 2007 14:27:11 -0300
Subject: [Python-3000] interaction between locals,
	builtins and except clause
In-Reply-To: <e7ba66e40707261006s71826ce8p73194134992697f3@mail.gmail.com>
References: <e7ba66e40707261006s71826ce8p73194134992697f3@mail.gmail.com>
Message-ID: <dea92f560707261027k2462a9beu10b5c4b3398a458d@mail.gmail.com>

On 7/26/07, Lisandro Dalcin <dalcinl at gmail.com> wrote:
> While porting to Py3K, I modified a function like the following, using a
> trick to keep it working in Py2.x.
>
>     def __iter__(self):
>         if self == _mpi.INFO_NULL:
>             return
>         try:    range = xrange
>         except: pass
>         nkeys = _mpi.info_get_nkeys(self)
>         for nthkey in range(nkeys):
>             yield _mpi.info_get_nthkey(self, nthkey)
>
> However, I've got in my unittests (running with py3k)
>
> ERROR: testPyMethods (__main__.TestInfo)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "tests/unittest/test_info.py", line 123, in testPyMethods
>     for key in INFO:
>   File "/u/dalcinl/lib/python/mpi4py/MPI.py", line 937, in __iter__
>     for nthkey in range(nkeys):
> UnboundLocalError: local variable 'range' referenced before assignment
>
>
> I am not completely sure if this is expected (it is, regarding the
> implementation, but perhaps not regarding Python as a language), so
> I post this for your consideration.
>

Python treats range as local because you assign to it inside the
function, even if an error occurred during the assignment. Use
'global range' at the top of the function.
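
A minimal sketch of this behaviour on Python 3 (the function names
broken and fixed are made up for illustration). A name that is assigned
anywhere in a function body is compiled as a local for the whole body,
so a failed assignment leaves it unbound instead of falling back to the
builtin:

```python
def broken(n):
    # 'range' is assigned below, so it is local for the whole function body.
    try:
        range = xrange      # NameError on Python 3: xrange does not exist
    except NameError:
        pass                # the local 'range' was never bound
    return list(range(n))   # UnboundLocalError: the builtin is shadowed


def fixed(n):
    # Bind a differently named local so the builtin 'range' stays visible.
    try:
        _range = xrange
    except NameError:
        _range = range
    return list(_range(n))


try:
    broken(3)
except UnboundLocalError as exc:
    print("broken:", exc)

print("fixed:", fixed(3))
```

Renaming the local avoids both the shadowing and the need for a
'global' declaration.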

But, as I understand, you are trying to target both Python 2.x and 3
with the same code, using tricks like this one. I think that, even if
you succeed, the resulting code will be quite unmaintainable.

-- 
EduardoOPadoan (eopadoan->altavix::com)
Bookmarks: http://del.icio.us/edcrypt

From dalcinl at gmail.com  Thu Jul 26 20:34:51 2007
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Thu, 26 Jul 2007 15:34:51 -0300
Subject: [Python-3000] interaction between locals,
	builtins and except clause
In-Reply-To: <dea92f560707261027k2462a9beu10b5c4b3398a458d@mail.gmail.com>
References: <e7ba66e40707261006s71826ce8p73194134992697f3@mail.gmail.com>
	<dea92f560707261027k2462a9beu10b5c4b3398a458d@mail.gmail.com>
Message-ID: <e7ba66e40707261134x6389bb6bi17d31f0cce323a96@mail.gmail.com>

On 7/26/07, Eduardo EdCrypt O. Padoan <eopadoan at altavix.com> wrote:
> Python treats range as local because you assign to it inside the
> function, even if an error occurred during the assignment. Use
> 'global range' at the top of the function.

Yes, I understand all that. I just wanted to know if the result of
this locals + except + globals interaction was right, even in the case
of errors. Now I know that it is OK. Thanks!

> But, as I understand, you are trying to target both Python 2.x and 3
> with the same code, using tricks like this one. I think that, even if
> you succeed, the resulting code will be quite unmaintainable.

Well, my code is not so complex on the Python side (I'm still
supporting Python 2.3). And I do not want to put things in globals,
just for maintenance reasons. In the end, I've used the following
trick:

try: _range = xrange
except NameError: _range = range
for i in _range(n): pass

I think it should work in any 2x and 3K. Is this right? Perhaps this
trick could be used for some automated conversion tool targeting
backward compatibility with 2.x series.

Regards, and thanks again for your clarification.

-- 
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From greg.ewing at canterbury.ac.nz  Fri Jul 27 02:12:10 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 27 Jul 2007 12:12:10 +1200
Subject: [Python-3000] struni and the Apple four-character-codes
In-Reply-To: <3B03B169-1321-48BC-A14D-7B32592BE159@mac.com>
References: <5d44f72f0707242218i71554da4x1d6924715f016562@mail.gmail.com>
	<46A801FB.7020300@canterbury.ac.nz>
	<3B03B169-1321-48BC-A14D-7B32592BE159@mac.com>
Message-ID: <46A9385A.2060508@canterbury.ac.nz>

Ronald Oussoren wrote:
> No. Four-character-constants are *not* strings or byte arrays, they  are 
> integer literals.

Well, in Pascal they were character arrays -- it
was only when they switched to C that they became
ints. Conceptually they're still the same thing.
Python isn't C, and doesn't have to be bound by
C's limitations.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From aholkner at cs.rmit.edu.au  Fri Jul 27 02:45:35 2007
From: aholkner at cs.rmit.edu.au (Alex Holkner)
Date: Fri, 27 Jul 2007 10:45:35 +1000
Subject: [Python-3000] struni and the Apple four-character-codes
In-Reply-To: <46A9385A.2060508@canterbury.ac.nz>
References: <5d44f72f0707242218i71554da4x1d6924715f016562@mail.gmail.com>	<46A801FB.7020300@canterbury.ac.nz>	<3B03B169-1321-48BC-A14D-7B32592BE159@mac.com>
	<46A9385A.2060508@canterbury.ac.nz>
Message-ID: <46A9402F.1040807@cs.rmit.edu.au>

Greg Ewing wrote:
> Ronald Oussoren wrote:
>> No. Four-character-constants are *not* strings or byte arrays, they  are 
>> integer literals.
> 
> Well, in Pascal they were character arrays -- it
> was only when they switched to C that they became
> ints. Conceptually they're still the same thing.
> Python isn't C, and doesn't have to be bound by
> C's limitations.

Regardless of what the situation was in Pascal's time, they are 
currently integers.

The order of bytes in the array would need to be adjusted depending on 
the machine endianness to be correct.
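
As a small illustration of the byte-order point (a sketch using the
standard struct module; the value 1413830740 is the integer form of
'TEXT'), the same integer packs to different byte sequences depending
on the byte order chosen:

```python
import struct

code = 1413830740                    # integer value of the 4CC 'TEXT'

big = struct.pack(">I", code)        # most significant byte first
little = struct.pack("<I", code)     # least significant byte first

print(big)      # b'TEXT'
print(little)   # b'TXET'
```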

The C argument passing convention is different for byte arrays than for 
integers (presumably the most common use of these constants is to use 
them with Apple libraries).

Different constants within the same enumeration are sometimes specified 
as decimal integers, and sometimes as character constants.  For example, 
the QTNewGWorldFromPtr function uses an enumeration which includes 
k32BGRAPixelFormat, defined as 'BGRA', and k32ARGBPixelFormat, defined 
as 0x20.

Providing a convenience str() method may be handy, but the internal 
representation must be integer.

Alex.


From jyasskin at gmail.com  Fri Jul 27 05:38:45 2007
From: jyasskin at gmail.com (Jeffrey Yasskin)
Date: Thu, 26 Jul 2007 20:38:45 -0700
Subject: [Python-3000] struni and the Apple four-character-codes
In-Reply-To: <4EFA40DF-0113-1000-ED2D-BFFF4499DECF-Webmail-10021@mac.com>
References: <5d44f72f0707242218i71554da4x1d6924715f016562@mail.gmail.com>
	<4EFA40DF-0113-1000-ED2D-BFFF4499DECF-Webmail-10021@mac.com>
Message-ID: <5d44f72f0707262038j7a2dd0dcued85d9ef6d014236@mail.gmail.com>

I've sent the patch as http://python.org/sf/1761465 using Guido's
suggestion of using bytes, but I do philosophically prefer Talin's and
Ronald's suggestions.

On 7/25/07, Ronald Oussoren <ronaldoussoren at mac.com> wrote:
> I've CC-ed Jack Jansen as he has maintained the Mac libraries for ages (from way before OS9 was shiny and new).

Did you mean to add him to this thread?

> On Wednesday, July 25, 2007, at 07:18AM, "Jeffrey Yasskin" <jyasskin at gmail.com> wrote:
> > 5) Make a new hashable class for these codes which converts them to
> >and from ints and bytes and becomes the general argument type for the
> >apple platform interface. [Cleanest, but lots of work that I'm not
> >volunteering to do]
>
> A 6th option is a subclass of int. Its constructor would accept a string containing the 4CC and the repr/str method would return the string representation of the code.  IMHO this is the cleanest representation of 4CCs in Python because those codes are basically a "neat" way to enter integer literals in C.

Naïve question: How does that differ from option (5)? Just the
isinstance() behavior?

I said this would take a lot of work because I think the new type
needs to be implemented in C to be returned from PyMac_GetOSType(),
and it seemed like a bigger API change than just switching to bytes,
but it turns out that switching to bytes isn't particularly trivial
either when you have to cast for every use in a dict, so maybe the new
type would be easier.

> This would also solve a problem that PyObjC users sometimes run into: Several C/Objective-C APIs return a dictionary where one  of the values is an integer and where one would commonly use 4CCs to write down literals. This currently causes unexpected failures but would do the right thing with this option.

I don't think that option (6) by itself solves that particular
problem. If you call str() on one of those ints, you'd just get a
number, which is different from what would happen if you call str() on
the 4CC type. It might help though by handling comparisons correctly.

On 7/26/07, Alex Holkner <aholkner at cs.rmit.edu.au> wrote:
> Providing a convenience str() method may be handy, but the internal
> representation must be integer.

Where are you getting "must"? In current python, they're 'str'
instances, not ints. The C interface between python and apple code
converts, of course, but python can do whatever makes the most sense
to us.

-- 
Namasté,
Jeffrey Yasskin

From guido at python.org  Fri Jul 27 07:07:52 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 26 Jul 2007 22:07:52 -0700
Subject: [Python-3000] struni and the Apple four-character-codes
In-Reply-To: <5d44f72f0707262038j7a2dd0dcued85d9ef6d014236@mail.gmail.com>
References: <5d44f72f0707242218i71554da4x1d6924715f016562@mail.gmail.com>
	<4EFA40DF-0113-1000-ED2D-BFFF4499DECF-Webmail-10021@mac.com>
	<5d44f72f0707262038j7a2dd0dcued85d9ef6d014236@mail.gmail.com>
Message-ID: <ca471dc20707262207k6c5844e3t1bc051e0f70ee5ca@mail.gmail.com>

On 7/26/07, Jeffrey Yasskin <jyasskin at gmail.com> wrote:
> I've sent the patch as http://python.org/sf/1761465 using Guido's
> suggestion of using bytes, but I do philosophically prefer Talin's and
> Ronald's suggestions.

I've checked in what you submitted; at this point I take whatever I
can get if it makes unit tests pass. :-)

I'm not so sure that the "philosophically optimal" solution is all
that practical. After all we could have done that before, but we
didn't -- we used strings, because that's the most convenient way to
spell them in Python code, and (nearly) all APIs that take or return
these are C code which can do whatever it wants.

We could use Unicode strings where in the past we used 8-bit strings,
but that would be somewhat nasty when there's ever one of these codes
that's not pure ASCII -- we'd have to worry about encoding them
properly. So I'm happy with byte strings and the occasional helper to
convert these to strings or ints when using them as keys. (Personally
I'd like to use strings for the keys since {'TEXT': 'stuff'} is a lot
clearer than {1413830740: 'stuff'} when encountered in a debugging
session.)
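
The kind of "occasional helper" meant here could look roughly like the
following sketch. The names fourcc_to_int and int_to_fourcc are
hypothetical, big-endian order is assumed, and the code relies on the
bytes methods of released Python 3 rather than on anything in the
checked-in patch:

```python
def fourcc_to_int(code):
    """Interpret four bytes as a big-endian unsigned integer."""
    if len(code) != 4:
        raise ValueError("expected exactly four bytes")
    return int.from_bytes(code, "big")


def int_to_fourcc(value):
    """Inverse of fourcc_to_int()."""
    return value.to_bytes(4, "big")


by_int = {fourcc_to_int(b"TEXT"): "stuff"}
by_str = {b"TEXT".decode("ascii"): "stuff"}

print(by_int)   # {1413830740: 'stuff'}
print(by_str)   # {'TEXT': 'stuff'}
```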

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jyasskin at gmail.com  Fri Jul 27 07:27:42 2007
From: jyasskin at gmail.com (Jeffrey Yasskin)
Date: Thu, 26 Jul 2007 22:27:42 -0700
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070723153031.D00273A403D@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
	<46A2AE31.2080105@canterbury.ac.nz>
	<20070722020422.5AAAC3A403A@sparrow.telecommunity.com>
	<46A3ECB7.9070504@canterbury.ac.nz>
	<20070723010750.E27693A40A9@sparrow.telecommunity.com>
	<46A453C7.9070407@acm.org>
	<20070723153031.D00273A403D@sparrow.telecommunity.com>
Message-ID: <5d44f72f0707262227o6fcf8471ja6654910c7ee07e0@mail.gmail.com>

On 7/23/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> For example, one pattern that sometimes comes up in writing methods
> is that you have a base class that always wants to do something
> *after* the subclass version of the method is called.  To implement
> that without method combination, you have to split the method into
> two parts, one of which gets called by the other, and then tell
> everybody writing subclasses to only override the second method.
>
> With method combination and a generic function, you simply declare an
> @after method for the base type, and it'll get called after the
> normal methods for any subclasses.

I've totally wanted to do that, so your email gave me a surge of hope,
but I think the generic function approach is actually worse here
(unless I'm totally misunderstanding). I think this would look like:

class MyBase:
    @generic
    def mymethod(self):
        default_stuff(self)
    @after(mymethod)
    def later(self):
        more_stuff(self)

class MyDerived(MyBase):
    mymethod = MyBase.mymethod
    @overload
    def mymethod(self):
        other_stuff(self)

And if MyDerived just overrides mymethod normally, it replaces the
@after part too.

So instead of telling people to override this other method (with the
benefit that immigrants from other languages are already used to this
inconvenience), you have to tell them to stick two extra lines in
front of their overrides. If they forget, the penalty is the same.
What's the benefit from generic functions here?
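
For comparison, the conventional two-method split described in the
quoted paragraph might look like this in plain Python (a sketch only;
the class and method names are invented for illustration):

```python
class Base:
    def mymethod(self):
        # Public entry point: subclasses are asked not to override this.
        result = self._do_mymethod()
        self._after_mymethod()      # always runs after the subclass part
        return result

    def _do_mymethod(self):
        return "default stuff"

    def _after_mymethod(self):
        self.log = getattr(self, "log", []) + ["more stuff"]


class Derived(Base):
    def _do_mymethod(self):         # the only method subclasses override
        return "other stuff"


d = Derived()
print(d.mymethod())   # other stuff
print(d.log)          # ['more stuff']
```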

-- 
Namasté,
Jeffrey Yasskin

From ronaldoussoren at mac.com  Fri Jul 27 08:11:28 2007
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Fri, 27 Jul 2007 08:11:28 +0200
Subject: [Python-3000] struni and the Apple four-character-codes
In-Reply-To: <5d44f72f0707262038j7a2dd0dcued85d9ef6d014236@mail.gmail.com>
References: <5d44f72f0707242218i71554da4x1d6924715f016562@mail.gmail.com>
	<4EFA40DF-0113-1000-ED2D-BFFF4499DECF-Webmail-10021@mac.com>
	<5d44f72f0707262038j7a2dd0dcued85d9ef6d014236@mail.gmail.com>
Message-ID: <B707418E-B504-40D8-9EFF-1B1FB6216EFE@mac.com>


On 27 Jul, 2007, at 5:38, Jeffrey Yasskin wrote:

> I've sent the patch as http://python.org/sf/1761465 using Guido's
> suggestion of using bytes, but I do philosophically prefer Talin's and
> Ronald's suggestions.
>
> On 7/25/07, Ronald Oussoren <ronaldoussoren at mac.com> wrote:
>> I've CC-ed Jack Jansen as he has maintained the Mac libraries for  
>> ages (from way before OS9 was shiny and new).
>
> Did you mean to add him to this thread?

Yes, but I obviously failed to actually add his address to the CC
list :-(.
>
>
>> On Wednesday, July 25, 2007, at 07:18AM, "Jeffrey Yasskin" <jyasskin at gmail.com> wrote:
>> > 5) Make a new hashable class for these codes which converts them to
>> > and from ints and bytes and becomes the general argument type for the
>> > apple platform interface. [Cleanest, but lots of work that I'm not
>> > volunteering to do]
>>
>> A 6th option is a subclass of int. Its constructor would accept a
>> string containing the 4CC and the repr/str method would return the
>> string representation of the code.  IMHO this is the cleanest
>> representation of 4CCs in Python because those codes are basically a
>> "neat" way to enter integer literals in C.
>
> Naïve question: How does that differ from option (5)? Just the
> isinstance() behavior?

That's the only change, but it is an important one. To reiterate:
4-character-codes in C are numeric literals and it would be best if
Python reflected that fact to avoid surprises. 4-character-codes are
definitely not arrays of bytes.
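
A rough sketch of what such an int subclass could look like in pure
Python. The class name FourCC is hypothetical, and the big-endian byte
order and the use of the mac_roman codec are assumptions, not part of
any existing Mac module:

```python
class FourCC(int):
    """Hypothetical int subclass for Apple four-character codes."""

    def __new__(cls, code):
        if isinstance(code, str):
            code = code.encode("mac_roman")
        if isinstance(code, (bytes, bytearray)):
            if len(code) != 4:
                raise ValueError("a four-character code needs exactly 4 bytes")
            code = int.from_bytes(code, "big")
        return super().__new__(cls, code)

    def __str__(self):
        return self.to_bytes(4, "big").decode("mac_roman")

    def __repr__(self):
        return "FourCC(%r)" % str(self)


text = FourCC("TEXT")
print(repr(text))                    # FourCC('TEXT')
print(text == 1413830740)            # True
print({1413830740: "stuff"}[text])   # stuff
```

Because the instances compare and hash like plain ints, a dictionary
keyed by ordinary integers can be indexed with a FourCC and vice versa.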

One example of an API that returns a dictionary where some keys refer
to values that are commonly encoded using 4-character-codes is
-[NSFileManager fileAttributesAtPath:traverseLink].

>
>
> I said this would take a lot of work because I think the new type
> needs to be implemented in C to be returned from PyMac_GetOSType(),
> and it seemed like a bigger API change than just switching to bytes,
> but it turns out that switching to bytes isn't particularly trivial
> either when you have to cast for every use in a dict, so maybe the new
> type would be easier.

The new type would be easier and the API change isn't too bad. I don't  
think you'd have to implement this type in C, there just needs to be a  
hook to tell the C code about this type.

>
>
>> This would also solve a problem that PyObjC users sometimes run  
>> into: Several C/Objective-C APIs return a dictionary where one  of  
>> the values is an integer and where one would commonly use 4CCs to  
>> write down literals. This currently causes unexpected failures but  
>> would do the right thing with this option.
>
> I don't think that option (6) by itself solves that particular
> problem. If you call str() on one of those ints, you'd just get a
> number, which is different from what would happen if you call str() on
> the 4CC type. It might help though by handling comparisons correctly.

That's what I meant by "the right thing": code would just work except  
for not printing a nice human-readable value. As you don't have to do  
that a lot anyway that's not really a problem.

Ronald

From jyasskin at gmail.com  Fri Jul 27 08:21:36 2007
From: jyasskin at gmail.com (Jeffrey Yasskin)
Date: Thu, 26 Jul 2007 23:21:36 -0700
Subject: [Python-3000] struni and the Apple four-character-codes
In-Reply-To: <ca471dc20707262207k6c5844e3t1bc051e0f70ee5ca@mail.gmail.com>
References: <5d44f72f0707242218i71554da4x1d6924715f016562@mail.gmail.com>
	<4EFA40DF-0113-1000-ED2D-BFFF4499DECF-Webmail-10021@mac.com>
	<5d44f72f0707262038j7a2dd0dcued85d9ef6d014236@mail.gmail.com>
	<ca471dc20707262207k6c5844e3t1bc051e0f70ee5ca@mail.gmail.com>
Message-ID: <5d44f72f0707262321o553347c9j72ac55e195107f9b@mail.gmail.com>

On 7/26/07, Guido van Rossum <guido at python.org> wrote:
> (Personally
> I'd like to use strings for the keys since {'TEXT': 'stuff'} is a lot
> clearer than {1413830740: 'stuff'} when encountered in a debugging
> session.)

Good argument. You now have a patch that uses str() instead of b2i().

From ncoghlan at gmail.com  Fri Jul 27 12:20:09 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 27 Jul 2007 20:20:09 +1000
Subject: [Python-3000] interaction between locals,
 builtins and except clause
In-Reply-To: <e7ba66e40707261134x6389bb6bi17d31f0cce323a96@mail.gmail.com>
References: <e7ba66e40707261006s71826ce8p73194134992697f3@mail.gmail.com>	<dea92f560707261027k2462a9beu10b5c4b3398a458d@mail.gmail.com>
	<e7ba66e40707261134x6389bb6bi17d31f0cce323a96@mail.gmail.com>
Message-ID: <46A9C6D9.3050405@gmail.com>

Lisandro Dalcin wrote:
> I think it should work in any 2x and 3K. Is this right? Perhaps this
> trick could be used for some automated conversion tool targeting
> backward compatibility with 2.x series.

The backwards compatible version looks like this:

     def __iter__(self):
          if self == _mpi.INFO_NULL:
              return
          nkeys = _mpi.info_get_nkeys(self)
          for nthkey in xrange(nkeys):
              yield _mpi.info_get_nthkey(self, nthkey)

The 2to3 converter will automatically convert the xrange() call to a 
range() call for the Py3k version.

If you want to persist in trying to get the same code running on both 
Py3k and 2.x without using the 2->3 converter, then I suggest 
segregating it all into a compatibility module and doing:

   from py3k_compat import _range

The try/except code to determine how to set _range would then occur only 
once, regardless of the number of places where you used it.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From guido at python.org  Fri Jul 27 14:55:00 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 27 Jul 2007 05:55:00 -0700
Subject: [Python-3000] struni and the Apple four-character-codes
In-Reply-To: <5d44f72f0707262321o553347c9j72ac55e195107f9b@mail.gmail.com>
References: <5d44f72f0707242218i71554da4x1d6924715f016562@mail.gmail.com>
	<4EFA40DF-0113-1000-ED2D-BFFF4499DECF-Webmail-10021@mac.com>
	<5d44f72f0707262038j7a2dd0dcued85d9ef6d014236@mail.gmail.com>
	<ca471dc20707262207k6c5844e3t1bc051e0f70ee5ca@mail.gmail.com>
	<5d44f72f0707262321o553347c9j72ac55e195107f9b@mail.gmail.com>
Message-ID: <ca471dc20707270555y4f271cd0j53b999b7d1f827cf@mail.gmail.com>

On 7/26/07, Jeffrey Yasskin <jyasskin at gmail.com> wrote:
> On 7/26/07, Guido van Rossum <guido at python.org> wrote:
> > (Personally
> > I'd like to use strings for the keys since {'TEXT': 'stuff'} is a lot
> > clearer than {1413830740: 'stuff'} when encountered in a debugging
> > session.)
>
> Good argument. You now have a patch that uses str() instead of b2i().

Hmm... That only works as long as the bytes are ASCII. Is that a
problem for aepack? Or are all its 4CCs chosen from a well-known set
that's all-ASCII?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Fri Jul 27 17:25:18 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 27 Jul 2007 08:25:18 -0700
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <5d44f72f0707262227o6fcf8471ja6654910c7ee07e0@mail.gmail.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
	<46A2AE31.2080105@canterbury.ac.nz>
	<20070722020422.5AAAC3A403A@sparrow.telecommunity.com>
	<46A3ECB7.9070504@canterbury.ac.nz>
	<20070723010750.E27693A40A9@sparrow.telecommunity.com>
	<46A453C7.9070407@acm.org>
	<20070723153031.D00273A403D@sparrow.telecommunity.com>
	<5d44f72f0707262227o6fcf8471ja6654910c7ee07e0@mail.gmail.com>
Message-ID: <ca471dc20707270825j3e53c11dyb2064468f3665c14@mail.gmail.com>

On 7/26/07, Jeffrey Yasskin <jyasskin at gmail.com> wrote:
> On 7/23/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> > For example, one pattern that sometimes comes up in writing methods
> > is that you have a base class that always wants to do something
> > *after* the subclass version of the method is called.  To implement
> > that without method combination, you have to split the method into
> > two parts, one of which gets called by the other, and then tell
> > everybody writing subclasses to only override the second method.
> >
> > With method combination and a generic function, you simply declare an
> > @after method for the base type, and it'll get called after the
> > normal methods for any subclasses.
>
> I've totally wanted to do that, so your email gave me a surge of hope,
> but I think the generic function approach is actually worse here
> (unless I'm totally misunderstanding). I think this would look like:
>
> class MyBase:
>     @generic
>     def mymethod(self):
>         default_stuff(self)
>     @after(mymethod)
>     def later(self):
>         more_stuff(self)
>
> class MyDerived(MyBase):
>     mymethod = MyBase.mymethod
>     @overload
>     def mymethod(self):
>         other_stuff(self)
>
> And if MyDerived just overrides mymethod normally, it replaces the
> @after part too.
>
> So instead of telling people to override this other method (with the
> benefit that immigrants from other languages are already used to this
> inconvenience), you have to tell them to stick two extra lines in
> front of their overrides. If they forget, the penalty is the same.
> What's the benefit from generic functions here?

The more I think about this example (and the one in the PEP from which
it's derived), the more I think this part is a frontal collision
between two paradigms, and needs a lot more thought put into it. The
need to say "mymethod = MyBase.mymethod" in the subclass, and the
subtle disasters that happen if this is forgotten, and the rules that
guide what code is called in what order when a subclass method is
called with a type signature for which a better match exists in the
base class, not to mention the combination of super() with
next_method, all make me think that this part of the PEP is not ready
for public consumption just yet.

Basic GFs, great. Before/after/around, good. Other method
combinations, fine. But GFs in classes and subclassing? Not until we
have a much better design.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bob at redivi.com  Fri Jul 27 17:47:24 2007
From: bob at redivi.com (Bob Ippolito)
Date: Fri, 27 Jul 2007 08:47:24 -0700
Subject: [Python-3000] struni and the Apple four-character-codes
In-Reply-To: <ca471dc20707270555y4f271cd0j53b999b7d1f827cf@mail.gmail.com>
References: <5d44f72f0707242218i71554da4x1d6924715f016562@mail.gmail.com>
	<4EFA40DF-0113-1000-ED2D-BFFF4499DECF-Webmail-10021@mac.com>
	<5d44f72f0707262038j7a2dd0dcued85d9ef6d014236@mail.gmail.com>
	<ca471dc20707262207k6c5844e3t1bc051e0f70ee5ca@mail.gmail.com>
	<5d44f72f0707262321o553347c9j72ac55e195107f9b@mail.gmail.com>
	<ca471dc20707270555y4f271cd0j53b999b7d1f827cf@mail.gmail.com>
Message-ID: <6a36e7290707270847k556e6eb5mda4aeb83fa919499@mail.gmail.com>

On 7/27/07, Guido van Rossum <guido at python.org> wrote:
> On 7/26/07, Jeffrey Yasskin <jyasskin at gmail.com> wrote:
> > On 7/26/07, Guido van Rossum <guido at python.org> wrote:
> > > (Personally
> > > I'd like to use strings for the keys since {'TEXT': 'stuff'} is a lot
> > > clearer than {1413830740: 'stuff'} when encountered in a debugging
> > > session.)
> >
> > Good argument. You now have a patch that uses str() instead of b2i().
>
> Hmm... That only works as long as the bytes are ASCII. Is that a
> problem for aepack? Or are all its 4CCs chosen from a well-known set
> that's all-ASCII?

4CCs are not all ASCII, they're Mac OS Roman. This is why in some of
the C header files the constants were turned into integers.

-bob

From dalcinl at gmail.com  Fri Jul 27 17:49:14 2007
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Fri, 27 Jul 2007 12:49:14 -0300
Subject: [Python-3000] docstring for dict.values
Message-ID: <e7ba66e40707270849i5b640a8fxdff08c1f776947cc@mail.gmail.com>

Why does the docstring for 'dict.values' say "a set-like object ..."?

>>> list(dict(a=1,b=1,c=1).values())
[1, 1, 1]

-- 
Lisandro Dalcín

From guido at python.org  Fri Jul 27 18:19:24 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 27 Jul 2007 09:19:24 -0700
Subject: [Python-3000] struni and the Apple four-character-codes
In-Reply-To: <6a36e7290707270847k556e6eb5mda4aeb83fa919499@mail.gmail.com>
References: <5d44f72f0707242218i71554da4x1d6924715f016562@mail.gmail.com>
	<4EFA40DF-0113-1000-ED2D-BFFF4499DECF-Webmail-10021@mac.com>
	<5d44f72f0707262038j7a2dd0dcued85d9ef6d014236@mail.gmail.com>
	<ca471dc20707262207k6c5844e3t1bc051e0f70ee5ca@mail.gmail.com>
	<5d44f72f0707262321o553347c9j72ac55e195107f9b@mail.gmail.com>
	<ca471dc20707270555y4f271cd0j53b999b7d1f827cf@mail.gmail.com>
	<6a36e7290707270847k556e6eb5mda4aeb83fa919499@mail.gmail.com>
Message-ID: <ca471dc20707270919n30c4788eldf2d444ab15378b9@mail.gmail.com>

On 7/27/07, Bob Ippolito <bob at redivi.com> wrote:
> 4CCs are not all ASCII, they're Mac OS Roman. This is why in some of
> the C header files the constants turned into integers.

Good to know! We should use that when converting them to Unicode.
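
A short sketch of the difference, using Python's standard mac_roman
codec. The byte string b"\xa9nam" is only an illustrative code with a
non-ASCII first byte, not one taken from aepack:

```python
raw = b"\xa9nam"                     # a code whose first byte is not ASCII

try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    print("ascii failed:", exc.reason)

text = raw.decode("mac_roman")       # 0xA9 is the copyright sign in Mac OS Roman
print(text == "\u00a9nam")           # True
print(text.encode("mac_roman") == raw)   # True
```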

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Fri Jul 27 18:20:30 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 27 Jul 2007 12:20:30 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <ca471dc20707270825j3e53c11dyb2064468f3665c14@mail.gmail.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
	<46A2AE31.2080105@canterbury.ac.nz>
	<20070722020422.5AAAC3A403A@sparrow.telecommunity.com>
	<46A3ECB7.9070504@canterbury.ac.nz>
	<20070723010750.E27693A40A9@sparrow.telecommunity.com>
	<46A453C7.9070407@acm.org>
	<20070723153031.D00273A403D@sparrow.telecommunity.com>
	<5d44f72f0707262227o6fcf8471ja6654910c7ee07e0@mail.gmail.com>
	<ca471dc20707270825j3e53c11dyb2064468f3665c14@mail.gmail.com>
Message-ID: <20070727162212.60E2F3A40E6@sparrow.telecommunity.com>

At 08:25 AM 7/27/2007 -0700, Guido van Rossum wrote:
>Basic GFs, great. Before/after/around, good. Other method
>combinations, fine. But GFs in classes and subclassing? Not until we
>have a much better design.

Sounds reasonable to me.  The only time I actually use them in 
classes myself is to override existing generic functions that live 
outside the class, like ones from an Interface or a standalone generic.

The main reason I included GFs-in-classes examples in the PEP is 
because of the "dynamic overloading" meme.  In C++, Java, etc., you 
can use overloading in methods, so I wanted to show how you could do 
that, if you wanted to.

I suspect that the simplest way to fix this in Py3K is with an 
"overloading" metaclass, as it would not even require any 
decorators.  That is, you could provide a custom dictionary that 
records every definition of a function with the same name.  The 
actual metaclass creation process would check for a method of the 
same name in a base class, and if it's generic (or the current class 
added more than one method), put a generic method in.
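
The "custom dictionary" part of this idea can be sketched with the
__prepare__ hook as it exists in released Python 3. The sketch below
only records repeated definitions; it does not build a generic function
or dispatch on anything, and the names _RecordingNamespace and
RecordingMeta are invented for illustration:

```python
class _RecordingNamespace(dict):
    """Class-body namespace that keeps every function bound to a name."""

    def __init__(self):
        super().__init__()
        self.definitions = {}   # name -> list of functions, in definition order

    def __setitem__(self, name, value):
        if callable(value) and not name.startswith("__"):
            self.definitions.setdefault(name, []).append(value)
        super().__setitem__(name, value)


class RecordingMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # The class body is executed with this mapping as its namespace.
        return _RecordingNamespace()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, dict(namespace), **kwargs)
        cls._definitions = namespace.definitions
        return cls


class Example(metaclass=RecordingMeta):
    def area(self, shape):
        return "first"

    def area(self, shape, scale):   # rebinding is recorded, not lost
        return "second"


print(len(Example._definitions["area"]))   # 2
print(Example().area(None, 1))             # second
```

A real overloading metaclass would replace the last definition with a
dispatcher built from the recorded list; that step is left out here.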

With a little bit of work, you could probably determine whether you 
could get away with dropping the genericness in a subclass; 
specifically, if all the subclass-defined methods are "more specific" 
than all base class methods, then there's no need for them to be in 
the same generic function, unless they make next_method calls.  Thus, 
you'll end up with normal methods except where absolutely necessary.

Such a metaclass would make method overloads look pretty much the 
same as in OO languages with static overloading.  The only remaining 
hole at that point would be reconciling super() and next_method.  If 
you're using this metaclass, super() is only meaningful if you're not 
in the same generic function as is used in your base, while 
next_method() is only meaningful if you *are*.

I don't know of any quick way to fix that, but I'll give it some thought. 


From guido at python.org  Fri Jul 27 18:33:32 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 27 Jul 2007 09:33:32 -0700
Subject: [Python-3000] docstring for dict.values
In-Reply-To: <e7ba66e40707270849i5b640a8fxdff08c1f776947cc@mail.gmail.com>
References: <e7ba66e40707270849i5b640a8fxdff08c1f776947cc@mail.gmail.com>
Message-ID: <ca471dc20707270933x6e156629pce0a88d1f67138b0@mail.gmail.com>

On 7/27/07, Lisandro Dalcin <dalcinl at gmail.com> wrote:
> Why does the docstring for 'dict.values' say "a set-like object ..."?
>
> >>> list(dict(a=1,b=1,c=1).values())
> [1, 1, 1]

Oops, that's a bug! Thanks for reporting.

Committed revision 56584.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From dalcinl at gmail.com  Fri Jul 27 18:51:46 2007
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Fri, 27 Jul 2007 13:51:46 -0300
Subject: [Python-3000] docstring for dict.values
In-Reply-To: <ca471dc20707270933x6e156629pce0a88d1f67138b0@mail.gmail.com>
References: <e7ba66e40707270849i5b640a8fxdff08c1f776947cc@mail.gmail.com>
	<ca471dc20707270933x6e156629pce0a88d1f67138b0@mail.gmail.com>
Message-ID: <e7ba66e40707270951q12f36de1o85534d7dc77620b1@mail.gmail.com>

It seems the same applies to dict.items() ...

>>> set(dict(a=[]).items())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'


On 7/27/07, Guido van Rossum <guido at python.org> wrote:
> On 7/27/07, Lisandro Dalcin <dalcinl at gmail.com> wrote:
> > Why does the docstring for 'dict.values' say "a set-like object ..."?
> >
> > >>> list(dict(a=1,b=1,c=1).values())
> > [1, 1, 1]
>
> Oops, that's a bug! Thanks for reporting.

-- 
Lisandro Dalcín

From guido at python.org  Fri Jul 27 18:55:27 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 27 Jul 2007 09:55:27 -0700
Subject: [Python-3000] docstring for dict.values
In-Reply-To: <e7ba66e40707270951q12f36de1o85534d7dc77620b1@mail.gmail.com>
References: <e7ba66e40707270849i5b640a8fxdff08c1f776947cc@mail.gmail.com>
	<ca471dc20707270933x6e156629pce0a88d1f67138b0@mail.gmail.com>
	<e7ba66e40707270951q12f36de1o85534d7dc77620b1@mail.gmail.com>
Message-ID: <ca471dc20707270955m5b68b09cie0253fd9af703ff1@mail.gmail.com>

That's a totally different issue. The result of .items() is a set. But
if it contains an unhashable object you can't convert it to a regular
set.
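
A small sketch of that distinction, assuming the py3k dict views behave
as described here (the names `d` and `pairs` are illustrative only): the
items view supports set operations directly, but building a regular set
from it requires every (key, value) pair to be hashable.

```python
# Illustrative only: 'd' and 'pairs' are made-up names.
d = dict(a=1, b=2)
pairs = d.items()

# The items view is set-like, so set operations work on it directly.
common = pairs & {("a", 1), ("z", 26)}

# With an unhashable value, converting the view to a regular set fails.
try:
    set(dict(a=[]).items())
    converted = True
except TypeError:
    converted = False
```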

--Guido

On 7/27/07, Lisandro Dalcin <dalcinl at gmail.com> wrote:
> It seems the same applies to dict.items() ...
>
> $ set(dict(a=[]).items())
> >>> set(dict(a=[]).items())
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: unhashable type: 'list'
>
>
> On 7/27/07, Guido van Rossum <guido at python.org> wrote:
> > On 7/27/07, Lisandro Dalcin <dalcinl at gmail.com> wrote:
> > > Why does the docstring for 'dict.values' say "a set-like object ..."?
> > >
> > > >>> list(dict(a=1,b=1,c=1).values())
> > > [1, 1, 1]
> >
> > Oops, that's a bug! Thanks for reporting.
>
> --
> Lisandro Dalcín
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ncoghlan at gmail.com  Fri Jul 27 18:56:00 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 28 Jul 2007 02:56:00 +1000
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070727162212.60E2F3A40E6@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>	<46A19FCC.7070609@acm.org>	<20070721181442.48FB03A403A@sparrow.telecommunity.com>	<46A2AE31.2080105@canterbury.ac.nz>	<20070722020422.5AAAC3A403A@sparrow.telecommunity.com>	<46A3ECB7.9070504@canterbury.ac.nz>	<20070723010750.E27693A40A9@sparrow.telecommunity.com>	<46A453C7.9070407@acm.org>	<20070723153031.D00273A403D@sparrow.telecommunity.com>	<5d44f72f0707262227o6fcf8471ja6654910c7ee07e0@mail.gmail.com>	<ca471dc20707270825j3e53c11dyb2064468f3665c14@mail.gmail.com>
	<20070727162212.60E2F3A40E6@sparrow.telecommunity.com>
Message-ID: <46AA23A0.7090807@gmail.com>

Phillip J. Eby wrote:
> I don't know of any quick way to fix that, but I'll give it some thought. 

In the meantime, do we want the standard metaclass to complain when it 
finds generic functions in class bodies, or to automatically treat them 
as static methods?

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From greg.ewing at canterbury.ac.nz  Sat Jul 28 03:19:44 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 28 Jul 2007 13:19:44 +1200
Subject: [Python-3000] struni and the Apple four-character-codes
In-Reply-To: <ca471dc20707262207k6c5844e3t1bc051e0f70ee5ca@mail.gmail.com>
References: <5d44f72f0707242218i71554da4x1d6924715f016562@mail.gmail.com>
	<4EFA40DF-0113-1000-ED2D-BFFF4499DECF-Webmail-10021@mac.com>
	<5d44f72f0707262038j7a2dd0dcued85d9ef6d014236@mail.gmail.com>
	<ca471dc20707262207k6c5844e3t1bc051e0f70ee5ca@mail.gmail.com>
Message-ID: <46AA99B0.5070105@canterbury.ac.nz>

Guido van Rossum wrote:
> We could use Unicode strings where in the past we used 8-bit strings,
> but that would be somewhat nasty when there's ever one of these codes
> that's not pure ASCII

Since this is a Mac-specific thing (and Classic-originated at
that), I think you can be pretty sure that any non-ASCII value
is to be interpreted according to the Macintosh character set,
if it's meant to be a character at all.

So I would suggest using the Macintosh encoding when converting
these to and from unicode.

--
Greg

From greg.ewing at canterbury.ac.nz  Sat Jul 28 03:41:49 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 28 Jul 2007 13:41:49 +1200
Subject: [Python-3000] struni and the Apple four-character-codes
In-Reply-To: <B707418E-B504-40D8-9EFF-1B1FB6216EFE@mac.com>
References: <5d44f72f0707242218i71554da4x1d6924715f016562@mail.gmail.com>
	<4EFA40DF-0113-1000-ED2D-BFFF4499DECF-Webmail-10021@mac.com>
	<5d44f72f0707262038j7a2dd0dcued85d9ef6d014236@mail.gmail.com>
	<B707418E-B504-40D8-9EFF-1B1FB6216EFE@mac.com>
Message-ID: <46AA9EDD.4010405@canterbury.ac.nz>

Ronald Oussoren wrote:
> To reiterate: 4-character-codes in C are numeric literals

I'm still not convinced about that. The major use of 4-char
codes is in data structures stored on disk. I'd be surprised
if they're really stored in the opposite order on little
endian architectures, since then you wouldn't be able to
use a file system written from a PPC on an Intel or vice
versa.

It's much more likely that the C macros used to handle
4-char codes change depending on the architecture, so that
the order in memory stays the same.

So I stand by my opinion that *conceptually* they're still
4-character arrays, and the fact that they're declared as
ints in C is just a kludge to work around limitations of C.

> One example of an API that returns a dictionary where some keys refer  
> to values that are commonly encoded using 4-character-codes is - 
> [NSFileManager fileAttributesAtPath:traverseLink].

Blarg. Well, I think Cocoa is braindamaged in the way it
handles this. It should convert them to/from some friendlier
type automatically.

Note that if you use a specialised type for this in Python,
it still won't help with APIs like this that munge them in
with other types polymorphically. You'll still have to do
an explicit conversion in your Python code. So it doesn't
really matter whether the representation in Python is a
unicode string, byte string or something special.

--
Greg

From martin at v.loewis.de  Sat Jul 28 11:10:34 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 28 Jul 2007 11:10:34 +0200
Subject: [Python-3000] struni and the Apple four-character-codes
In-Reply-To: <46AA99B0.5070105@canterbury.ac.nz>
References: <5d44f72f0707242218i71554da4x1d6924715f016562@mail.gmail.com>	<4EFA40DF-0113-1000-ED2D-BFFF4499DECF-Webmail-10021@mac.com>	<5d44f72f0707262038j7a2dd0dcued85d9ef6d014236@mail.gmail.com>	<ca471dc20707262207k6c5844e3t1bc051e0f70ee5ca@mail.gmail.com>
	<46AA99B0.5070105@canterbury.ac.nz>
Message-ID: <46AB080A.7030208@v.loewis.de>

> Since this is a Mac-specific thing (and Classic-originated at
> that), I think you can be pretty sure that any non-ASCII value
> is to be interpreted according to the Macintosh character set,
> if it's meant to be a character at all.

Please understand that there is no such thing as "the Macintosh
character set".

Somebody else already gave the correct answer: these codes are
commonly interpreted as MacRoman.
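
As a rough illustration, assuming Python's "mac_roman" codec is the
encoding meant here (the four-byte value below is made up, not a real
registered code), a code with a non-ASCII byte round-trips through
text like this:

```python
# Hypothetical four-character code containing one non-ASCII byte.
raw_code = b"caf\x8e"

as_text = raw_code.decode("mac_roman")     # bytes -> str
round_trip = as_text.encode("mac_roman")   # str -> bytes

# The same bytes read as Latin-1 give a different string, which is why
# the choice of encoding has to be explicit.
as_latin1 = raw_code.decode("latin-1")
```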

Regards,
Martin

From tomerfiliba at gmail.com  Sat Jul 28 17:06:50 2007
From: tomerfiliba at gmail.com (tomer filiba)
Date: Sat, 28 Jul 2007 17:06:50 +0200
Subject: [Python-3000] optimizing [x]range
Message-ID: <1d85506f0707280806n1764151cx4961a0573dda435e@mail.gmail.com>

currently, testing for "x in xrange(y)" is an O(n) operation.

since xrange objects (which would become range in py3k) are not real lists,
there's no reason that __contains__ be an O(n). it can easily be made into
an O(1) operation. here's a demo code (it should be trivial to implement
this in CPython)


class xxrange(object):
    def __init__(self, *args):
        if len(args) == 1:
            self.start, self.stop, self.step = (0, args[0], 1)
        elif len(args) == 2:
            self.start, self.stop, self.step = (args[0], args[1], 1)
        elif len(args) == 3:
            self.start, self.stop, self.step = args
        else:
            raise TypeError("invalid number of args")

    def __iter__(self):
        i = self.start
        while i < self.stop:
            yield i
            i += self.step

    def __contains__(self, num):
        # stop is exclusive, so num == self.stop is not a member
        if num < self.start or num >= self.stop:
            return False
        return (num - self.start) % self.step == 0


print list(xxrange(7))            # [0, 1, 2, 3, 4, 5, 6]
print list(xxrange(0, 7, 2))      # [0, 2, 4, 6]
print list(xxrange(1, 7, 2))      # [1, 3, 5]
print 98 in xxrange(100)          # True
print 98 in xxrange(0, 100, 2)    # True
print 99 in xxrange(0, 100, 2)    # False
print 98 in xxrange(1, 100, 2)    # False
print 99 in xxrange(1, 100, 2)    # True
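
The containment test can also be sketched as a standalone function.
This is only an illustration under stated assumptions: it uses Python 3
syntax, `in_arith_range` is a hypothetical name, and it is not the
proposed C implementation. It treats stop as exclusive and also covers
negative steps, which the demo class does not.

```python
def in_arith_range(num, start, stop, step=1):
    """O(1) membership test for start, start+step, ... (stop exclusive)."""
    if step == 0:
        raise ValueError("step must not be zero")
    if step > 0:
        in_bounds = start <= num < stop
    else:
        # Counting down: members lie in (stop, start].
        in_bounds = stop < num <= start
    # A member must also sit on the step grid anchored at start.
    return in_bounds and (num - start) % step == 0


# Cross-check against the built-in range for a few shapes.
shapes = [(0, 100, 1), (0, 100, 2), (1, 100, 2), (10, 0, -3)]
agrees = all(
    in_arith_range(n, *shape) == (n in range(*shape))
    for shape in shapes
    for n in range(-5, 105)
)
```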


-tomer

From guido at python.org  Sun Jul 29 00:04:02 2007
From: guido at python.org (Guido van Rossum)
Date: Sat, 28 Jul 2007 15:04:02 -0700
Subject: [Python-3000] optimizing [x]range
In-Reply-To: <1d85506f0707280806n1764151cx4961a0573dda435e@mail.gmail.com>
References: <1d85506f0707280806n1764151cx4961a0573dda435e@mail.gmail.com>
Message-ID: <ca471dc20707281504g704ff836m8aeaa966559483d3@mail.gmail.com>

Do we really need another way to spell a <= x < b? Have you got a
real-world use case in mind for the version with step > 1?

I'm at most lukewarm; I'd be willing to look at a patch to the C code
in the py3k-struni branch, plus unit tests though.

--Guido

On 7/28/07, tomer filiba <tomerfiliba at gmail.com> wrote:
> currently, testing for "x in xrange(y)" is an O(n) operation.
>
> since xrange objects (which would become range in py3k) are not real lists,
> there's no reason that __contains__ be an O(n). it can easily be made into
> an O(1) operation. here's a demo code (it should be trivial to implement
> this in CPython)
>
>
> class xxrange(object):
>     def __init__(self, *args):
>         if len(args) == 1:
>             self.start , self.stop, self.step = (0, args[0], 1)
>         elif len(args) == 2:
>             self.start, self.stop, self.step = (args[0], args[1], 1)
>         elif len(args) == 3:
>             self.start, self.stop, self.step = args
>         else:
>             raise TypeError("invalid number of args")
>
>     def __iter__(self):
>         i = self.start
>         while i < self.stop:
>             yield i
>             i += self.step
>
>     def __contains__(self, num):
>         if num < self.start or num >= self.stop:
>             return False
>         return (num - self.start) % self.step == 0
>
>
> print list(xxrange(7))            # [0, 1, 2, 3, 4, 5, 6]
> print list(xxrange(0, 7, 2))      # [0, 2, 4, 6]
> print list(xxrange(1, 7, 2))      # [1, 3, 5]
> print 98 in xxrange(100)          # True
> print 98 in xxrange(0, 100, 2)    # True
> print 99 in xxrange(0, 100, 2)    # False
> print 98 in xxrange(1, 100, 2)    # False
> print 99 in xxrange(1, 100, 2)    # True
>
>
>
> -tomer
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jdahlin at async.com.br  Sun Jul 29 02:32:37 2007
From: jdahlin at async.com.br (Johan Dahlin)
Date: Sat, 28 Jul 2007 21:32:37 -0300
Subject: [Python-3000] optimizing [x]range
In-Reply-To: <ca471dc20707281504g704ff836m8aeaa966559483d3@mail.gmail.com>
References: <1d85506f0707280806n1764151cx4961a0573dda435e@mail.gmail.com>
	<ca471dc20707281504g704ff836m8aeaa966559483d3@mail.gmail.com>
Message-ID: <46ABE025.4050204@async.com.br>

Guido van Rossum wrote:
> Do we really need another way to spell a <= x < b?

FWIW, I'd say yes; I sometimes find it a bit difficult to remember
how the operator should be placed, there are several possible ways
of making a mistake, eg;

   a > x < b
   a < x > b
   a < x < b
   a > x > b

Now, the range syntax seems a bit strange at first, but I find it easier 
to parse:

   if x in range(a, b)

There's no way to incorrectly parse that, it's immediately known that 
the programmer tries to see whether x is in a specific range.

It seems to be used quite widely already;

http://google.com/codesearch?hl=en&q=+%5E.*if%5Cs%2B.*%5Cs%2Bin%5Cs%2Brange%5C(.*%24&start=10&sa=N

Johan


From lcaamano at gmail.com  Sun Jul 29 03:04:29 2007
From: lcaamano at gmail.com (lcaamano)
Date: Sun, 29 Jul 2007 01:04:29 -0000
Subject: [Python-3000] uuid creation not thread-safe?
In-Reply-To: <ca471dc20707201052p68883fc5l3efd8ecc5cfd497f@mail.gmail.com>
References: <ca471dc20707201052p68883fc5l3efd8ecc5cfd497f@mail.gmail.com>
Message-ID: <1185671069.839769.274620@z28g2000prd.googlegroups.com>


On Jul 20, 1:52 pm, "Guido van Rossum" <gu... at python.org> wrote:
> I discovered what appears to be a thread-unsafety in uuid.py. This is
> in the trunk as well as in 3.x; I'm using the trunk here for easy
> reference. There's some code around line 395:
>
>     import ctypes, ctypes.util
>     _buffer = ctypes.create_string_buffer(16)
>
> This creates a *global* buffer which is used as the output parameter
> to later calls to _uuid_generate_random() and _uuid_generate_time().
> For example, around line 481, in uuid1():
>
>         _uuid_generate_time(_buffer)
>         return UUID(bytes=_buffer.raw)
>
> Clearly if two threads do this simultaneously they are overwriting
> _buffer in unpredictable order. There are a few other occurrences of
> this too.
>
> I find it somewhat disturbing that what seems a fairly innocent
> function that doesn't *appear* to have global state is nevertheless
> not thread-safe. Would it be wise to fix this, e.g. by allocating a
> fresh output buffer inside uuid1() and other callers?
>


I didn't find any reply to this, which is odd, so forgive me if it's
old news.

I agree with you that it's not thread safe and that a local buffer in
the stack should fix it.
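
In Python terms, the per-call buffer idea might look roughly like the
sketch below. It is an assumption-laden illustration, not the actual
uuid.py fix: `_generate_into` is a hypothetical stand-in for the
libuuid call (it just copies 16 random bytes into the buffer), and
`uuid_from_fresh_buffer` is a made-up name.

```python
import ctypes
import os
import threading
import uuid


def _generate_into(buf):
    # Hypothetical stand-in for _uuid_generate_random(buf):
    # fill the 16-byte output buffer supplied by the caller.
    ctypes.memmove(buf, os.urandom(16), 16)


def uuid_from_fresh_buffer():
    # Allocate the output buffer per call instead of sharing a
    # module-level one, so concurrent callers cannot overwrite
    # each other's bytes.
    buf = ctypes.create_string_buffer(16)
    _generate_into(buf)
    return uuid.UUID(bytes=buf.raw)


results = []


def worker():
    for _ in range(100):
        results.append(uuid_from_fresh_buffer())


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The cost is one small allocation per call, which seems a reasonable
price for not having hidden global state.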

Just for reference, the thread-safe uuid extension we've been using
since python 2.1, which I don't recall where we borrowed from, uses a
local buffer in the stack.  It looks like this:

-----begin uuid.c--------------

static char uuid__doc__ [] =
"DCE compatible Universally Unique Identifier module";

#include "Python.h"
#include <uuid/uuid.h>

static char uuidgen__doc__ [] =
"Create a new DCE compatible UUID value";

static PyObject *
uuidgen(void)
{
    uuid_t out;
    char buf[48];

    uuid_generate(out);
    uuid_unparse(out, buf);
    return PyString_FromString(buf);
}

static PyMethodDef uuid_methods[] = {
    {"uuidgen", uuidgen, 0, uuidgen__doc__},
    {NULL,      NULL}        /* Sentinel */
};

DL_EXPORT(void)
inituuid(void)
{
    Py_InitModule4("uuid",
               uuid_methods,
               uuid__doc__,
               (PyObject *)NULL,
               PYTHON_API_VERSION);
}

-----end uuid.c--------------


It also seems that using uuid_generate()/uuid_unparse() should be
faster than using uuid_generate_random() and then creating a python
object to call its __str__ method.  If so, it would be nice if the
uuid.py module also provided equivalent fast versions that returned
strings instead of objects.


--
Luis P Caamano
Atlanta, GA, USA


From jyasskin at gmail.com  Sun Jul 29 03:28:08 2007
From: jyasskin at gmail.com (Jeffrey Yasskin)
Date: Sat, 28 Jul 2007 18:28:08 -0700
Subject: [Python-3000] struni and the Apple four-character-codes
In-Reply-To: <6a36e7290707270847k556e6eb5mda4aeb83fa919499@mail.gmail.com>
References: <5d44f72f0707242218i71554da4x1d6924715f016562@mail.gmail.com>
	<4EFA40DF-0113-1000-ED2D-BFFF4499DECF-Webmail-10021@mac.com>
	<5d44f72f0707262038j7a2dd0dcued85d9ef6d014236@mail.gmail.com>
	<ca471dc20707262207k6c5844e3t1bc051e0f70ee5ca@mail.gmail.com>
	<5d44f72f0707262321o553347c9j72ac55e195107f9b@mail.gmail.com>
	<ca471dc20707270555y4f271cd0j53b999b7d1f827cf@mail.gmail.com>
	<6a36e7290707270847k556e6eb5mda4aeb83fa919499@mail.gmail.com>
Message-ID: <5d44f72f0707281828l50394be3o8baf18080426ecc8@mail.gmail.com>

On 7/27/07, Bob Ippolito <bob at redivi.com> wrote:
> On 7/27/07, Guido van Rossum <guido at python.org> wrote:
> > On 7/26/07, Jeffrey Yasskin <jyasskin at gmail.com> wrote:
> > > On 7/26/07, Guido van Rossum <guido at python.org> wrote:
> > > > (Personally
> > > > I'd like to use strings for the keys since {'TEXT': 'stuff'} is a lot
> > > > clearer than {1413830740: 'stuff'} when encountered in a debugging
> > > > session.)
> > >
> > > Good argument. You now have a patch that uses str() instead of b2i().
> >
> > Hmm... That only works as long as the bytes are ASCII. Is that a
> > problem for aepack? Or are all its 4CCs chosen from a well-known set
> > that's all-ASCII?
>
> 4CCs are not all ASCII, they're Mac OS Roman. This is why in some of
> the C header files the constants turned into integers.

Good point; my second patch is wrong. I'm satisfied that b2i is
correct, even if it's not ideal from either a debugging or a "what are
4CCs really" perspective, so I don't intend to do any more work on it.
Would one of the mac enthusiasts like to take over from here?
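
For readers following along, the integer form of a four-character code
can be sketched as below. This is only an illustration, assuming
big-endian packing; `fourcc_to_int` and `int_to_fourcc` are
hypothetical names and not the b2i helper from the patch.

```python
def fourcc_to_int(code):
    """Pack a 4-byte code into a big-endian unsigned integer."""
    if len(code) != 4:
        raise ValueError("expected exactly 4 bytes")
    return int.from_bytes(code, "big")


def int_to_fourcc(value):
    """Unpack a big-endian unsigned integer back into 4 bytes."""
    return value.to_bytes(4, "big")


text_code = fourcc_to_int(b"TEXT")   # 1413830740, the key quoted above
restored = int_to_fourcc(text_code)  # b"TEXT"
```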

From joe at bitworking.org  Sun Jul 29 03:47:05 2007
From: joe at bitworking.org (Joe Gregorio)
Date: Sat, 28 Jul 2007 18:47:05 -0700
Subject: [Python-3000] base64 - bytes and strings
Message-ID: <3f1451f50707281847q2171f82fu2e48f2297214f591@mail.gmail.com>

I just submitted a patch to fix test_urllib2 and test_cookielib. In the
process of fixing them I came across something that looks like an
inconsistency in the base64 module.

Right now the base64 module uses bytes for everything. That is,
a value passed to b64encode() must be bytes, and the
base64 encoded response is also in bytes.

Shouldn't it operate more like expat, where the data to be
encoded is bytes and the encoded form is a string?
It seems more natural if the encoded value is a string since
base64 encoding is a way of encoding data
so that it fits in US-ASCII.

   Thanks,
   -joe

-- 
Joe Gregorio        http://bitworking.org

From ncoghlan at gmail.com  Sun Jul 29 04:40:18 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 29 Jul 2007 12:40:18 +1000
Subject: [Python-3000] base64 - bytes and strings
In-Reply-To: <3f1451f50707281847q2171f82fu2e48f2297214f591@mail.gmail.com>
References: <3f1451f50707281847q2171f82fu2e48f2297214f591@mail.gmail.com>
Message-ID: <46ABFE12.5000101@gmail.com>

Joe Gregorio wrote:
> Shouldn't it operate more like expat, where the data to be
> encoded is bytes and the encoded form is a string?
> It seems more natural if the encoded value is a string since
> base64 encoding is a way of encoding data
> so that it fits in US-ASCII.

Py3k strings are unicode, so returning a string would mean you just have 
to encode it again using the ascii codec to get the bytes to put on the 
wire. Since the base64 module already knows that it is producing ASCII, 
it makes more sense to consider it as a byte->byte encoding.
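
A short sketch of the byte->byte view, assuming the bytes-based base64
API discussed in this thread (variable names are illustrative): the
encoded form is already ASCII bytes ready for the wire, and a text
version costs one explicit decode when it is really needed.

```python
import base64

payload = b"\x00\xffbinary data"

encoded = base64.b64encode(payload)   # bytes in, bytes out
decoded = base64.b64decode(encoded)   # and back again

# Only needed when the result must be embedded in text (e.g. XML).
as_text = encoded.decode("ascii")
```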

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From martin at v.loewis.de  Sun Jul 29 06:58:32 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 29 Jul 2007 06:58:32 +0200
Subject: [Python-3000] base64 - bytes and strings
In-Reply-To: <3f1451f50707281847q2171f82fu2e48f2297214f591@mail.gmail.com>
References: <3f1451f50707281847q2171f82fu2e48f2297214f591@mail.gmail.com>
Message-ID: <46AC1E78.7050900@v.loewis.de>

> It seems more natural if the encoded value is a string since
> base64 encoding is a way of encoding data
> so that it fits in US-ASCII.

There have been long debates about this specific question in
the past. The point that proponents of "base64 encoding should
yield strings" miss is that US-ASCII is *both* a character set,
and an encoding. So if data "is in US-ASCII", it's not all that
clear whether the focus is on it being character data, or bytes.

base64 is used "on the wire" most of the time (except when it
gets embedded into XML); from that point of view, it's more
natural that encoding yields bytes.

Regards,
Martin

From g.brandl at gmx.net  Sun Jul 29 07:41:58 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 29 Jul 2007 07:41:58 +0200
Subject: [Python-3000] optimizing [x]range
In-Reply-To: <46ABE025.4050204@async.com.br>
References: <1d85506f0707280806n1764151cx4961a0573dda435e@mail.gmail.com>	<ca471dc20707281504g704ff836m8aeaa966559483d3@mail.gmail.com>
	<46ABE025.4050204@async.com.br>
Message-ID: <f8h9b2$h4u$1@sea.gmane.org>

Johan Dahlin schrieb:
> Guido van Rossum wrote:
>> Do we really need another way to spell a <= x < b?
> 
> FWIW, I'd say yes; I sometimes find it a bit difficult to remember
> how the operator should be placed, there are several possible ways
> of making a mistake, eg;
> 
>    a > x < b
>    a < x > b
>    a < x < b
>    a > x > b
> 
> Now, the range syntax seems a bit strange at first, but I find it easier 
> to parse:
> 
>    if x in range(a, b)
> 
> There's no way to incorrectly parse that, it's immediately known that 
> the programmer tries to see whether x is in a specific range.

What about floats?

Currently, "3.5 in range(5)" is False, while "0 <= 3.5 < 5" is True.
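
Spelled out as a snippet (Python 3 syntax assumed, names illustrative),
the two spellings disagree for a non-integer value:

```python
x = 3.5

in_range = x in range(5)     # range only contains the integers 0..4
in_interval = 0 <= x < 5     # the comparison chain accepts any number
```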

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From skip at pobox.com  Sun Jul 29 14:18:49 2007
From: skip at pobox.com (skip at pobox.com)
Date: Sun, 29 Jul 2007 07:18:49 -0500
Subject: [Python-3000] optimizing [x]range
In-Reply-To: <46ABE025.4050204@async.com.br>
References: <1d85506f0707280806n1764151cx4961a0573dda435e@mail.gmail.com>
	<ca471dc20707281504g704ff836m8aeaa966559483d3@mail.gmail.com>
	<46ABE025.4050204@async.com.br>
Message-ID: <18092.34217.512107.677855@montanaro.dyndns.org>


    Johan> FWIW, I'd say yes; I sometimes find it a bit difficult to
    Johan> remember how the operator should be placed, there are several
    Johan> possible ways of making a mistake, eg;

    Johan>    a > x < b
    Johan>    a < x > b
    Johan>    a < x < b
    Johan>    a > x > b

If the two angles face the same way it's correct.  It's hard to see how it
could be any other way.

    Johan> Now, the range syntax seems a bit strange at first, but I find it easier 
    Johan> to parse:

    Johan>    if x in range(a, b)

You can't spell

    a <= x <= b

or

    a < x < b

without remembering to add or subtract 1 from the appropriate endpoint

    if x in range(a, b+1)
    if x in range(a+1, b)

That would seem to me to be more error-prone than confusion about

    a < x < b
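
A quick check of the endpoint arithmetic, assuming integer values only
(the bounds 3 and 7 are arbitrary): the closed interval needs b+1 as
the stop, and the open interval needs a+1 as the start.

```python
a, b = 3, 7
candidates = range(a - 2, b + 3)

# Closed interval: a <= x <= b
closed = [x for x in candidates if a <= x <= b]
closed_via_range = [x for x in candidates if x in range(a, b + 1)]

# Open interval: a < x < b
open_ = [x for x in candidates if a < x < b]
open_via_range = [x for x in candidates if x in range(a + 1, b)]
```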

Skip

From tomerfiliba at gmail.com  Sun Jul 29 14:48:21 2007
From: tomerfiliba at gmail.com (tomer filiba)
Date: Sun, 29 Jul 2007 12:48:21 -0000
Subject: [Python-3000] optimizing [x]range
In-Reply-To: <ca471dc20707281504g704ff836m8aeaa966559483d3@mail.gmail.com>
References: <1d85506f0707280806n1764151cx4961a0573dda435e@mail.gmail.com>
	<ca471dc20707281504g704ff836m8aeaa966559483d3@mail.gmail.com>
Message-ID: <1185713301.934213.186010@q75g2000hsh.googlegroups.com>

i understand there is not much need for using ranges instead of
intervals (a < x < b), but:

1) it's already supported. you CAN use x in range(100), so
why not optimize it? there's no justification to keep it an
O(N) operation (you're not trying to punish anyone :).
it just calls for adding a __contains__ slot to range objects.
the cost is very minimal.

2) ranges are more like set-builder notation, i.e.
evens = {2*n | n in N}
which can be written as
evens = range(0, maxint, 2)
odds = range(1, maxint, 2)
you cannot phrase "x in odds" in "a <= x < b" notation.
sure, just use modulo, but then it just gets ugly.

if containment in range (== xrange) were a cheap, O(1) operation,
there's no reason not to use it when it suits well.


-tomer

On Jul 29, 12:04 am, "Guido van Rossum" <gu... at python.org> wrote:
> Do we really need another way to spell a <= x < b? Have you got a
> real-world use case in mind for the version with step > 1?
>
> I'm at most lukewarm; I'd be willing to look at a patch to the C code
> in the py3k-struni branch, plus unit tests though.
>
> --Guido
>


From guido at python.org  Sun Jul 29 19:33:34 2007
From: guido at python.org (Guido van Rossum)
Date: Sun, 29 Jul 2007 10:33:34 -0700
Subject: [Python-3000] optimizing [x]range
In-Reply-To: <46ABE025.4050204@async.com.br>
References: <1d85506f0707280806n1764151cx4961a0573dda435e@mail.gmail.com>
	<ca471dc20707281504g704ff836m8aeaa966559483d3@mail.gmail.com>
	<46ABE025.4050204@async.com.br>
Message-ID: <ca471dc20707291033v1bec4607pad990db38f82eda8@mail.gmail.com>

On 7/28/07, Johan Dahlin <jdahlin at async.com.br> wrote:
> Guido van Rossum wrote:
> > Do we really need another way to spell a <= x < b?
>
> FWIW, I'd say yes; I sometimes find it a bit difficult to remember
> how the operator should be placed, there are several possible ways
> of making a mistake, eg;
>
>    a > x < b
>    a < x > b
>    a < x < b
>    a > x > b

Were you drunk at the time? :-)

> Now, the range syntax seems a bit strange at first, but I find it easier
> to parse:
>
>    if x in range(a, b)
>
> There's no way to incorrectly parse that, it's immediately known that
> the programmer tries to see whether x is in a specific range.
>
> It seems to be used quite widely already;
>
> http://google.com/codesearch?hl=en&q=+%5E.*if%5Cs%2B.*%5Cs%2Bin%5Cs%2Brange%5C(.*%24&start=10&sa=N

Sorry, 50 hits is not "quite widely".

Did you find *any* examples using a step > 1?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Sun Jul 29 19:37:53 2007
From: guido at python.org (Guido van Rossum)
Date: Sun, 29 Jul 2007 10:37:53 -0700
Subject: [Python-3000] optimizing [x]range
In-Reply-To: <1185713301.934213.186010@q75g2000hsh.googlegroups.com>
References: <1d85506f0707280806n1764151cx4961a0573dda435e@mail.gmail.com>
	<ca471dc20707281504g704ff836m8aeaa966559483d3@mail.gmail.com>
	<1185713301.934213.186010@q75g2000hsh.googlegroups.com>
Message-ID: <ca471dc20707291037w63378621qe5683d9076518083@mail.gmail.com>

On 7/29/07, tomer filiba <tomerfiliba at gmail.com> wrote:
> i understand there is not much need for using ranges instead of
> intervals (a < x < b), but:
>
> 1) it's already supported. you CAN use x in range(100), so
> why not optimize it? there's no justification to keep it an
> O(N) operation (you're not trying to punish anyone :).
> it just calls for adding a __contains__ slot to range objects.
> the cost is very minimal.

Don't forget the *cost* in terms of code bloat. Plus, I asked for a
patch. Where is it? This is not Santa Claus's email address. You're
expected to contribute more than a wish.

> 2) ranges are more like set-builder notation, i.e.
> evens = {2*n | n in N}
> which can be written as
> evens = range(0, maxint, 2)
> odds = range(1, maxint, 2)
> you cannot phrase "x in odds" in "a <= x < b" notation.
> sure, just use modulo, but then it just gets ugly.

Um, your range "solution" would break for examples like 2**100 in
evens (it's hard to think of a more even number than that. :-)

Typically one would write a predicate that tested for modulo.
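
Such a predicate could be as small as the sketch below (the function
names are illustrative); it works for arbitrarily large integers
because it never materialises or walks a range.

```python
def is_even(n):
    """True for any integer divisible by two, however large."""
    return n % 2 == 0


def is_odd(n):
    """True for any integer not divisible by two (negatives included)."""
    return n % 2 == 1


big_is_even = is_even(2**100)
```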

> if containment in range (== xrange) were a cheap, O(1) operation,
> there's no reason not to use it when it suits well.

But I still see no reason to make it O(1).

> -tomer
>
> On Jul 29, 12:04 am, "Guido van Rossum" <gu... at python.org> wrote:
> > Do we really need another way to spell a <= x < b? Have you got a
> > real-world use case in mind for the version with step > 1?
> >
> > I'm at most lukewarm; I'd be willing to look at a patch to the C code
> > in the py3k-struni branch, plus unit tests though.
> >
> > --Guido
> >
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tomerfiliba at gmail.com  Sun Jul 29 20:06:47 2007
From: tomerfiliba at gmail.com (tomer filiba)
Date: Sun, 29 Jul 2007 20:06:47 +0200
Subject: [Python-3000] optimizing [x]range
In-Reply-To: <ca471dc20707291037w63378621qe5683d9076518083@mail.gmail.com>
References: <1d85506f0707280806n1764151cx4961a0573dda435e@mail.gmail.com>
	<ca471dc20707281504g704ff836m8aeaa966559483d3@mail.gmail.com>
	<1185713301.934213.186010@q75g2000hsh.googlegroups.com>
	<ca471dc20707291037w63378621qe5683d9076518083@mail.gmail.com>
Message-ID: <1d85506f0707291106w4c48cbb4s5ecac3415722cb2b@mail.gmail.com>

On 7/29/07, Guido van Rossum <guido at python.org> wrote:
> Don't forget the *cost* in terms of code bloat. Plus, I asked for a
> patch. Where is it? This is not Santa Claus's email address. You're
> expected to contribute more than a wish.

first of all, that's not the politest way to put it, especially since i have
submitted some patches before. second, i've already given a 3-line
implementation in python. it would only take two minutes to convert
it to C, save the unit tests. third, i'm busy over my head studying for
my exams. fourth, due to lack of public interest, i might as well
withdraw this.

> Um, your range "solution" would break for examples like 2**100 in
> evens (it's hard to think of a more even number than that. :-)

there's no reason why (x)range shouldn't support longs too.
after all, it only tests for modulo internally (*unlike* how it works
today, which will never finish).

besides, that's not the point. i'm only saying there's no reason that
testing for containment in range objects (which are no longer lists),
should be O(N), when it can easily be made O(1) in under 10 lines of C
code.

> But I still see no reason to make it O(1).

as you wish.
*goes back in time and withdraws the proposal*


-tomer

From martin at v.loewis.de  Sun Jul 29 21:03:06 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 29 Jul 2007 21:03:06 +0200
Subject: [Python-3000] optimizing [x]range
In-Reply-To: <1d85506f0707291106w4c48cbb4s5ecac3415722cb2b@mail.gmail.com>
References: <1d85506f0707280806n1764151cx4961a0573dda435e@mail.gmail.com>	<ca471dc20707281504g704ff836m8aeaa966559483d3@mail.gmail.com>	<1185713301.934213.186010@q75g2000hsh.googlegroups.com>	<ca471dc20707291037w63378621qe5683d9076518083@mail.gmail.com>
	<1d85506f0707291106w4c48cbb4s5ecac3415722cb2b@mail.gmail.com>
Message-ID: <46ACE46A.5080102@v.loewis.de>

>> Don't forget the *cost* in terms of code bloat. Plus, I asked for a
>> patch. Where is it? This is not Santa Claus's email address. You're
>> expected to contribute more than a wish.
> 
> first of all, that's not the politest way to put it, especially since i have
> submitted some patches before. second, i've already given a 3-line
> implementation in python. it would only take two minutes to convert
> it to C, save the unit tests. third, i'm busy over my head studying for
> my exams. fourth, due to lack of public interest, i might as well
> withdraw this.

It's not that *you* were asked to contribute it. Guido just pointed
out that, without a patch, it won't get implemented. More so if
the patch is as trivial as you expect it to be. We *all* are
under time pressure - I am busy giving exams, for example.

So to your original question "why not optimize it?", there is
a very simple answer: there is no ready implementation available.

> besides, that's not the point. i'm only saying there's no reason that
> testing for containment in range objects (which are no longer lists),
> should be O(N), when it can easily be made O(1) in under 10 lines of C
> code.

Nothing is easy. Neal Norwitz was working on implementing xrange with
longs, and it took an entire week. The patch is still sitting on SF
somewhere.

Regards,
Martin

From unknown_kev_cat at hotmail.com  Sun Jul 29 22:11:12 2007
From: unknown_kev_cat at hotmail.com (Joe Smith)
Date: Sun, 29 Jul 2007 16:11:12 -0400
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
References: <f7ithr$lrr$1@sea.gmane.org><ca471dc20707181002w64e076aco9a509ec7e4e15b9a@mail.gmail.com><f7lk7q$9m6$1@sea.gmane.org><ca471dc20707181113m360db736h2fd079f29f71220@mail.gmail.com><f7lnd8$l2s$1@sea.gmane.org>
	<ca471dc20707181158p17417c9cg37c5382d61b53fe5@mail.gmail.com>
Message-ID: <f8is94$tgh$1@sea.gmane.org>


"Guido van Rossum" <guido at python.org> wrote in message 
news:ca471dc20707181158p17417c9cg37c5382d61b53fe5 at mail.gmail.com...
> On 7/18/07, Joe Smith <unknown_kev_cat at hotmail.com> wrote:
>> >> I'm wondering if the recursion limit on my build is getting set too low
>> >> somehow.
>> >
>> > Can you find out what it is? sys.getrecursionlimit().
>>
>> Hmm...  It is a limit of 1000.
>> That is probably large enough, no?
>
> Yes, that's what it is for me.
>
>> Anyway, from some basic testing it looks like marshal is always throwing
>> that error when marshal.load() is called.
>> However, marshal.loads() works fine.
>>
>> Might this be another encoding related error?
>
> Perhaps. Or something else. Do try to investigate.
>

What I have found is that (on CYGWIN) all of marshal seems to work fine
except for marshal.load().
marshal.dump()'s output can be read by 2.5's marshal.load() without problem.
3k's marshal.load() will not load the data from 3k's marshal.dump() or
2.5's marshal.dump().

It turns out to be a fault due to an uninitialized value on a RFILE.
Specifically, the following patch (part of marshal_load in marshal.c) fixes
things.

-----BEGIN PATCH-----
Index: Python/marshal.c
===================================================================
--- Python/marshal.c    (revision 56620)
+++ Python/marshal.c    (working copy)
@@ -1181,6 +1181,7 @@
                return NULL;
        }
        rf.strings = PyList_New(0);
+       rf.depth=0;
        result = read_object(&rf);
        Py_DECREF(rf.strings);
        Py_DECREF(data);
-----END PATCH-----

I'll submit the patch to sourceforge if needed, although the fact that all
the other loading methods do set rf.depth=0 (including
PyMarshal_ReadObjectFromFile) indicates to me that this is definitely the
correct patch. Looks like that line was accidentally forgotten.
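
For anyone who wants to check their own build, here is a small sketch
that exercises both code paths: the in-memory loads() path that worked
and the file-based load() path that failed. The helper names and the
sample value are made up for illustration:

```python
import marshal
import tempfile


def roundtrip_in_memory(value):
    """dumps()/loads(): the path that worked in the report above."""
    return marshal.loads(marshal.dumps(value))


def roundtrip_via_file(value):
    """dump()/load() through a real binary file: the path that failed."""
    with tempfile.TemporaryFile() as f:
        marshal.dump(value, f)
        f.seek(0)
        return marshal.load(f)


sample = {"answer": 42, "items": [1, 2, 3], "name": "spam"}
same_in_memory = roundtrip_in_memory(sample) == sample
same_via_file = roundtrip_via_file(sample) == sample
```

On a build with the fix applied, both comparisons should be true.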


From unknown_kev_cat at hotmail.com  Mon Jul 30 01:34:22 2007
From: unknown_kev_cat at hotmail.com (Joe Smith)
Date: Sun, 29 Jul 2007 19:34:22 -0400
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
References: <f7ithr$lrr$1@sea.gmane.org><ca471dc20707181002w64e076aco9a509ec7e4e15b9a@mail.gmail.com><f7lk7q$9m6$1@sea.gmane.org><ca471dc20707181113m360db736h2fd079f29f71220@mail.gmail.com><f7lnd8$l2s$1@sea.gmane.org><ca471dc20707181158p17417c9cg37c5382d61b53fe5@mail.gmail.com>
	<f8is94$tgh$1@sea.gmane.org>
Message-ID: <f8j862$st0$1@sea.gmane.org>


"Joe Smith" <unknown_kev_cat at hotmail.com> wrote in message 
news:f8is94$tgh$1 at sea.gmane.org...
>
> "Guido van Rossum" <guido at python.org> wrote in message
> news:ca471dc20707181158p17417c9cg37c5382d61b53fe5 at mail.gmail.com...
>> On 7/18/07, Joe Smith <unknown_kev_cat at hotmail.com> wrote:
>>> >> I'm wondering if the recursion limit on my build is getting set too low
>>> >> somehow.
>>> >
>>> > Can you find out what it is? sys.getrecursionlimit().
>>>
>>> Hmm...  It is a limit of 1000.
>>> That is probably large enough, no?
>>
>> Yes, that's what it is for me.
>>
>>> Anyway, from some basic testing it looks like marshal is always throwing
>>> that error when marshal.load() is called.
>>> However, marshal.loads() works fine.
>>>
>>> Might this be another encoding related error?
>>
>> Perhaps. Or something else. Do try to investigate.
>>
>
> What I have found is that (on CYGWIN) all of marshal seems to work fine
> except for marshal.load().
> marshal.dump()'s output can be read by 2.5's marshal.load() without 
> problem.
> 3k's marshal.load() will not
> load the data from 3k's marshal.dump or 2.5's marshal.dump()
>
> It turns out to be a fault due to an uninitialized value on a RFILE.
> Specifically, the following patch (part of marshal_load in marshal.c) fixes
> things.
>
> -----BEGIN PATCH-----
> Index: Python/marshal.c
> ===================================================================
> --- Python/marshal.c    (revision 56620)
> +++ Python/marshal.c    (working copy)
> @@ -1181,6 +1181,7 @@
>                return NULL;
>        }
>        rf.strings = PyList_New(0);
> +       rf.depth=0;
>        result = read_object(&rf);
>        Py_DECREF(rf.strings);
>        Py_DECREF(data);
> -----END PATCH-----
>
> I'll submit the patch to sourceforge if needed, although the fact that all
> the other loading methods
> do set rf.depth=0 (including PyMarshal_ReadObjectFromFile) indicates to me
> that this is definitely the correct patch.
> Looks like that line was accidentally forgotten.


With that patch, things on CYGWIN are getting close to matching the other 
platforms.

There are still some problems with the 'Python' directory for example. This 
is because of a change in the internals of Cygwin.
Cygwin does have "managed mounts" which allow for case sensitivity. 
Compiling Python inside a managed mount eliminates those issues.
So it is not a terribly big deal.

If I patch io.py to default to "utf-8" rather than using the filesystem 
encoding (ascii), that fixes a few more things. (test_coding.py and 
test_minidom.py)


Then there are only 2 test failures remaining that are not listed on the 
wiki. One of them is a very minor issue in test_platform.py.
The other is a more complicated problem with test_mailbox.py

First the test_platform problem.
sys.executable lacks the ".exe" suffix.  In order for libc_ver to work it 
would need to be passed the exe suffix.
The cygwin-specific hack in test_platform.py does not work when using
a managed mount because a managed mount is case sensitive, so
isdir(executable) returns false.
(Using libc_ver with no arguments also fails for the same basic reason,
although there is no cygwin hack in that case.)   That said, using
libc_ver on cygwin would not be meaningful because cygwin uses newlib
instead of libc/glibc.


The mailbox.py problem seems troubling: I'm getting exceptions of type
"IOError: [Errno 13] Permission denied" on "./@test" (aka
test_support.TESTFN).

This is true for all tests after TestMbox's run of test_add(). All of
TestMailDir works fine. TestMbox's test_add() works fine, but all the
remaining tests that use "./@test" fail. Sounds like something is not
getting cleaned up correctly. That said, no "@test" file or directory is left
behind after the end of the test.

(For what it's worth, Cygwin's python 2.5 (as installed on my system) fails
2 of the tests in its version of test_mailbox.py, both with "IOError:
[Errno 13] Permission denied").


From guido at python.org  Mon Jul 30 02:09:37 2007
From: guido at python.org (Guido van Rossum)
Date: Sun, 29 Jul 2007 17:09:37 -0700
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
In-Reply-To: <f8is94$tgh$1@sea.gmane.org>
References: <f7ithr$lrr$1@sea.gmane.org>
	<ca471dc20707181002w64e076aco9a509ec7e4e15b9a@mail.gmail.com>
	<f7lk7q$9m6$1@sea.gmane.org>
	<ca471dc20707181113m360db736h2fd079f29f71220@mail.gmail.com>
	<f7lnd8$l2s$1@sea.gmane.org>
	<ca471dc20707181158p17417c9cg37c5382d61b53fe5@mail.gmail.com>
	<f8is94$tgh$1@sea.gmane.org>
Message-ID: <ca471dc20707291709y68e8c301qa0845fb9dab1874a@mail.gmail.com>

On 7/29/07, Joe Smith <unknown_kev_cat at hotmail.com> wrote:
> What I have found is that (on CYGWIN) all of marshal seems to work fine
> except for marshal.load().
> marshal.dump()'s output can be read by 2.5's marshal.load() without problem.
> 3k's marshal.load() will not
> load the data from 3k's marshal.dump or 2.5's marshal.dump()
>
> It turns out to be a fault due to an uninitialized value on a RFILE.
> Specifically, the following patch (part of marshal_load in marshal.c) fixes
> things.
>
> -----BEGIN PATCH-----
> Index: Python/marshal.c
> ===================================================================
> --- Python/marshal.c    (revision 56620)
> +++ Python/marshal.c    (working copy)
> @@ -1181,6 +1181,7 @@
>                 return NULL;
>         }
>         rf.strings = PyList_New(0);
> +       rf.depth=0;
>         result = read_object(&rf);
>         Py_DECREF(rf.strings);
>         Py_DECREF(data);
> -----END PATCH-----
>
> I'll submit the patch to sourceforge if needed, although the fact that all
> the other loading methods
> do set rf.depth=0 (including PyMarshal_ReadObjectFromFile) indicates to me
> that this is definitely the correct patch.
> Looks like that line was accidentally forgotten.

Thanks! Looks like that line was accidentally dropped -- perhaps as a
result of a merge. It was in all previous versions.

Anyway, I've added it back.

Committed revision 56623.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg.ewing at canterbury.ac.nz  Mon Jul 30 02:08:22 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 30 Jul 2007 12:08:22 +1200
Subject: [Python-3000] base64 - bytes and strings
In-Reply-To: <46ABFE12.5000101@gmail.com>
References: <3f1451f50707281847q2171f82fu2e48f2297214f591@mail.gmail.com>
	<46ABFE12.5000101@gmail.com>
Message-ID: <46AD2BF6.5080507@canterbury.ac.nz>

Nick Coghlan wrote:
> Py3k strings are unicode, so returning a string would mean you just have 
> to encode it again using the ascii codec to get the bytes to put on the 
> wire.

I still believe that producing a string is conceptually
the right thing to do. The point of base64 is to encode
binary data as text, not binary data as binary data.

If I ever had a reason to use base64, it would be because
I had a "wire" that would accept text but not binary data,
e.g. a file open in text mode, or some other text that I
wanted to embed it in. Getting bytes in that situation
would force me to make an *extra* conversion.

--
Greg

From greg.ewing at canterbury.ac.nz  Mon Jul 30 02:22:56 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 30 Jul 2007 12:22:56 +1200
Subject: [Python-3000] base64 - bytes and strings
In-Reply-To: <46AC1E78.7050900@v.loewis.de>
References: <3f1451f50707281847q2171f82fu2e48f2297214f591@mail.gmail.com>
	<46AC1E78.7050900@v.loewis.de>
Message-ID: <46AD2F60.9050907@canterbury.ac.nz>

Martin v. Löwis wrote:
> The point that proponents of "base64 encoding should
> yield strings" miss is that US-ASCII is *both* a character set,
> and an encoding.

Last time we discussed this, I went and looked at the
RFC where base64 is defined. According to my reading of
it, nowhere does it say that base64 output must be
encoded as US-ASCII, nor any other particular encoding.

It *does* say that the characters used were chosen because
they are present in a number of different character sets
in use at the time, and explicitly mentions EBCDIC as one
of those character sets.

To me this quite clearly says that base64 is defined at
the level of characters, not encodings.

--
Greg

From guido at python.org  Mon Jul 30 02:27:20 2007
From: guido at python.org (Guido van Rossum)
Date: Sun, 29 Jul 2007 17:27:20 -0700
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
In-Reply-To: <f8j862$st0$1@sea.gmane.org>
References: <f7ithr$lrr$1@sea.gmane.org>
	<ca471dc20707181002w64e076aco9a509ec7e4e15b9a@mail.gmail.com>
	<f7lk7q$9m6$1@sea.gmane.org>
	<ca471dc20707181113m360db736h2fd079f29f71220@mail.gmail.com>
	<f7lnd8$l2s$1@sea.gmane.org>
	<ca471dc20707181158p17417c9cg37c5382d61b53fe5@mail.gmail.com>
	<f8is94$tgh$1@sea.gmane.org> <f8j862$st0$1@sea.gmane.org>
Message-ID: <ca471dc20707291727q6438b6eax5eaadbb6c712ef29@mail.gmail.com>

On 7/29/07, Joe Smith <unknown_kev_cat at hotmail.com> wrote:
> There are still some problems with the 'Python' directory for example. This
> is because of a change in the internals of Cygwin.
> Cygwin does have "managed mounts" which allow for case sensitivity.
> Compiling Python inside a managed mount eliminates those issues.
> So it is not a terribly big deal.
>
> If I patch io.py to default to "utf-8" rather than using the filesystem
> encoding (ascii), that fixes a few more things. (test_coding.py and
> test_minidom.py)

How come the filesystem encoding is set to ASCII?

> Then there are only 2 test failures remaining that are not listed on the
> wiki. One of them is a very minor issue in test_platform.py.
> The other is a more complicated problem with test_mailbox.py

Please do add these to the wiki, so we won't forget them. If you want
CYGWIN to work, existing CYGWIN users will have to contribute patches.

> First the test_platform problem.
> sys.executable lacks the ".exe" suffix.  In order for libc_ver to work it
> would need to be passed the exe suffix.
> The cygwin-specific hack in test_platform.py does not work when using
> a managed mount because a managed mount is case sensitive, so
> isdir(executable) returns false.
> (Using libc_ver with no arguments also fails for the same basic reason.
> (although there is no cygwin hack in that case.))   That said, using
> libc_ver on cygwin would not be meaningful because cygwin uses newlib
> instead of libc/glibc.
>
>
>
>
> The mailbox.py problem seems troubling, I'm getting exceptions of type
> "IOError: [Errno 13] Permission denied" on "./@test" (aka.
> test_support.TESTFN) .
>
> This is true for all tests after TestMbox's run of test_add(). All of
> TestMailDir works fine. TestMbox's test_add() works fine, but all the
> remaining tests that use "./@test" fail. Sounds like something is not
> getting cleaned up correctly. That said no "@test" file or directory is left
> behind after the end of the test.

Sounds like test_add() changes the perms on the file.

> (For what it's worth, Cygwin's python 2.5 (as installed on my system) fails
> 2 of the tests in its version of test_mailbox.py, both with "IOError:
> [Errno 13] Permission denied").

OK, so the fix may need to be backported -- or perhaps (if it's easier
to find and fix in 2.5) forward ported.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg.ewing at canterbury.ac.nz  Mon Jul 30 02:29:14 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 30 Jul 2007 12:29:14 +1200
Subject: [Python-3000] optimizing [x]range
In-Reply-To: <18092.34217.512107.677855@montanaro.dyndns.org>
References: <1d85506f0707280806n1764151cx4961a0573dda435e@mail.gmail.com>
	<ca471dc20707281504g704ff836m8aeaa966559483d3@mail.gmail.com>
	<46ABE025.4050204@async.com.br>
	<18092.34217.512107.677855@montanaro.dyndns.org>
Message-ID: <46AD30DA.6050405@canterbury.ac.nz>

skip at pobox.com wrote:

> You can't spell
> 
>     a <= x <= b
> 
> or
> 
>     a < x < b
> 
> without remembering to add or subtract 1 from the appropriate endpoint

I think the use cases for this are where you're trying to
express a range-like condition, i.e. 'a <= x < b'. Then you
have to make sure you get the right relations in the right
places, which is the same kind of burden as remembering to
add or subtract 1 in the right places.

--
Greg

From guido at python.org  Mon Jul 30 02:43:19 2007
From: guido at python.org (Guido van Rossum)
Date: Sun, 29 Jul 2007 17:43:19 -0700
Subject: [Python-3000] base64 - bytes and strings
In-Reply-To: <46AD2F60.9050907@canterbury.ac.nz>
References: <3f1451f50707281847q2171f82fu2e48f2297214f591@mail.gmail.com>
	<46AC1E78.7050900@v.loewis.de> <46AD2F60.9050907@canterbury.ac.nz>
Message-ID: <ca471dc20707291743l5eefd72cx28ca281c451e15ba@mail.gmail.com>

On 7/29/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Martin v. Löwis wrote:
> > The point that proponents of "base64 encoding should
> > yield strings" miss is that US-ASCII is *both* a character set,
> > and an encoding.
>
> Last time we discussed this, I went and looked at the
> RFC where base64 is defined. According to my reading of
> it, nowhere does it say that base64 output must be
> encoded as US-ASCII, nor any other particular encoding.
>
> It *does* say that the characters used were chosen because
> they are present in a number of different character sets
> in use at the time, and explicitly mentions EBCDIC as one
> of those character sets.
>
> To me this quite clearly says that base64 is defined at
> the level of characters, not encodings.

I think it's all beside the point. We should look at the use cases. I
recall finding out once that a Java base64 implementation was much
slower than Python's -- turns out that the Java version was converting
everything to Strings; then we needed to convert back to bytes in
order to output them. My suspicion is that in the end using bytes is
more efficient *and* more convenient; it might take some looking
through the email package to confirm or refute this. (The email
package hasn't been converted to work in the struni branch; that
should happen first. Whoever does that might well be the one who tells
us how they want their base64 APIs.)

An alternative might be to provide both string- and bytes-based APIs,
although that doesn't help with deciding what the default one (the one
that uses the same names as 2.x) should do.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From talin at acm.org  Mon Jul 30 03:21:13 2007
From: talin at acm.org (Talin)
Date: Sun, 29 Jul 2007 18:21:13 -0700
Subject: [Python-3000] base64 - bytes and strings
In-Reply-To: <ca471dc20707291743l5eefd72cx28ca281c451e15ba@mail.gmail.com>
References: <3f1451f50707281847q2171f82fu2e48f2297214f591@mail.gmail.com>	<46AC1E78.7050900@v.loewis.de>
	<46AD2F60.9050907@canterbury.ac.nz>
	<ca471dc20707291743l5eefd72cx28ca281c451e15ba@mail.gmail.com>
Message-ID: <46AD3D09.9060006@acm.org>

Guido van Rossum wrote:
> On 7/29/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>> Martin v. Löwis wrote:
>>> The point that proponents of "base64 encoding should
>>> yield strings" miss is that US-ASCII is *both* a character set,
>>> and an encoding.
>> Last time we discussed this, I went and looked at the
>> RFC where base64 is defined. According to my reading of
>> it, nowhere does it say that base64 output must be
>> encoded as US-ASCII, nor any other particular encoding.
>>
>> It *does* say that the characters used were chosen because
>> they are present in a number of different character sets
>> in use at the time, and explicitly mentions EBCDIC as one
>> of those character sets.
>>
>> To me this quite clearly says that base64 is defined at
>> the level of characters, not encodings.
> 
> I think it's all beside the point. We should look at the use cases. I
> recall finding out once that a Java base64 implementation was much
> slower than Python's -- turns out that the Java version was converting
> everything to Strings; then we needed to convert back to bytes in
> order to output them. My suspicion is that in the end using bytes is
> more efficient *and* more convenient; it might take some looking
> through the email package to confirm or refute this. (The email
> package hasn't been converted to work in the struni branch; that
> should happen first. Whoever does that might well be the one who tells
> us how they want their base64 APIs.)
> 
> An alternative might be to provide both string- and bytes-based APIs,
> although that doesn't help with deciding what the default one (the one
> that uses the same names as 2.x) should do.

One has to be careful when comparing performance with Java, because you 
need to specify whether you are using the "old" API or the "new" one. 
(It seems that almost everything in Java has an old and new API.)

I just recently did some work in Java with base64 encoding, or more 
specifically, URL-safe encoding. The library I was working with both 
consumed and produced arrays of bytes. I think that this is the correct 
way to do it.

In my specific use case, I was dealing with encrypted bytes, where the 
encrypter also produced and consumed bytes, so it made sense that the 
character encoder did the same. But even in the case where no encryption 
is involved, I think dealing with bytes is right.

I believe that converting a Unicode string to a base64 encoded form is 
necessarily a 2-step process. Step 1 is to convert from unicode 
characters to bytes, using an appropriate character encoding (UTF-8, 
UTF-16, and so on), and step 2 is to encode the bytes in base64. The 
resulting encoded byte array is actually an ASCII-encoded string, 
although it's more convenient in most cases to represent it as a byte 
array than as a string object, since it's likely in most cases that you 
are about to send it over the wire. So in other words, although it makes
sense to think about the conversion as (string -> bytes -> string), the
actual objects being generated are (string -> bytes -> bytes).

The fact that 2 steps are needed is evident from the fact that there are
actually two encodings involved, and these two encodings are mostly 
independent. So for example, one could just as easily base64-encode a 
UTF-16 encoded string as opposed to a UTF-8 encoded string. So the fact 
that you can vary one encoding without changing the other would seem to 
argue for the notion that they are distinct and independent.

Nor can you collapse to a single encoding step - you can't go directly 
from an internal unicode string to base64, since a unicode string is an 
array of code units which range from 0 to 0xffff, and base64 can't encode
a number larger than 255.

Now, you *could* do both steps in a single function. However, you still 
have to choose what the intermediate encoding form is, even if you never 
actually see it. Usually this will be UTF-8.
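
A short sketch of the two steps, with the intermediate encoding made an
explicit parameter. The function name text_to_base64 is made up for
illustration, and base64.b64encode is the standard-library call in
current Python, which may differ from whatever the struni branch ends
up exposing:

```python
import base64


def text_to_base64(text, encoding="utf-8"):
    """Two-step conversion: str -> bytes (character encoding) -> base64 bytes."""
    raw = text.encode(encoding)     # step 1: pick a character encoding
    return base64.b64encode(raw)    # step 2: bytes in, ASCII-only bytes out


# Varying step 1 changes the result without touching step 2.
via_utf8 = text_to_base64("naïve")
via_utf16 = text_to_base64("naïve", encoding="utf-16-le")
```

Decoding has to reverse both steps with the same intermediate encoding,
which is why that choice cannot be hidden entirely.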

-- Talin

From skip at pobox.com  Mon Jul 30 04:17:24 2007
From: skip at pobox.com (skip at pobox.com)
Date: Sun, 29 Jul 2007 21:17:24 -0500
Subject: [Python-3000] optimizing [x]range
In-Reply-To: <46AD30DA.6050405@canterbury.ac.nz>
References: <1d85506f0707280806n1764151cx4961a0573dda435e@mail.gmail.com>
	<ca471dc20707281504g704ff836m8aeaa966559483d3@mail.gmail.com>
	<46ABE025.4050204@async.com.br>
	<18092.34217.512107.677855@montanaro.dyndns.org>
	<46AD30DA.6050405@canterbury.ac.nz>
Message-ID: <18093.18996.944540.279864@montanaro.dyndns.org>


    Greg> I think the use cases for this are where you're trying to express
    Greg> a range-like condition, i.e. 'a <= x < b'. Then you have to make
    Greg> sure you get the right relations in the right places, which is the
    Greg> same kind of burden as remembering to add or subtract 1 in the
    Greg> right places.

I think it's easier to learn that 'a <= x < b' is logically equivalent to
'a <= x and x < b' than to infer that 'x in range(a, b)' means the same
thing.  In fact, due to short-circuit semantics they actually don't mean quite
the same thing since b might not get evaluated in the cascading comparison
case.  Given that I find the cascading comparisons clearer I see no reason
to optimize the "in range(...)" case.
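
A small sketch of that difference. The traced() helper is made up for
illustration; it records which bound expressions actually get evaluated:

```python
def traced(log, name, value):
    """Record that a bound expression was evaluated, then return its value."""
    log.append(name)
    return value


x = 0

# Cascading comparison: 'a <= x' is already false, so b is never evaluated.
chained_log = []
chained = traced(chained_log, "a", 5) <= x < traced(chained_log, "b", 10)

# Range form: both arguments are evaluated before the containment test runs.
range_log = []
ranged = x in range(traced(range_log, "a", 5), traced(range_log, "b", 10))
```

Both expressions give the same truth value here; only the set of
evaluated operands differs.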

Skip

From rrr at ronadam.com  Mon Jul 30 04:54:04 2007
From: rrr at ronadam.com (Ron Adam)
Date: Sun, 29 Jul 2007 21:54:04 -0500
Subject: [Python-3000] base64 - bytes and strings
In-Reply-To: <46AD2BF6.5080507@canterbury.ac.nz>
References: <3f1451f50707281847q2171f82fu2e48f2297214f591@mail.gmail.com>	<46ABFE12.5000101@gmail.com>
	<46AD2BF6.5080507@canterbury.ac.nz>
Message-ID: <46AD52CC.4060403@ronadam.com>


Greg Ewing wrote:
> Nick Coghlan wrote:
>> Py3k strings are unicode, so returning a string would mean you just have 
>> to encode it again using the ascii codec to get the bytes to put on the 
>> wire.
> 
> I still believe that producing a string is conceptually
> the right thing to do. The point of base64 is to encode
> binary data as text, not binary data as binary data.
> 
> If I ever had a reason to use base64, it would be because
> I had a "wire" that would accept text but not binary data,
> e.g. a file open in text mode, or some other text that I
> wanted to embed it in. Getting bytes in that situation
> would force me to make an *extra* conversion.

Not extra; you just need to make sure your binary data is in the correct
range of values the text device you are sending to can handle.  As long as 
it is, it should just work.  That is the primary purpose of the base64 
encoding.  Keep in mind you are sending byte "characters", not integers.

So it would work like the following I think, with the application having 
responsibility of doing the object to bytes conversion and back, instead of 
the base64 encoder being limited to only strings.

   OUTPUT:
     convert object to bytes -> encode_64 to bytes -> bytes to output

   INPUT:
     bytes from input* -> decode_64 to bytes -> convert bytes to object

     *Reads text "characters" into bytes instance.

By refusing to guess what the object is, we also create an opportunity to 
manipulate the results or source further in a bytes instance without doing 
multiple (or needless) conversions to and from strings.
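
A rough sketch of that pipeline, assuming JSON as the application's
object-to-bytes conversion. That choice, and the names to_wire and
from_wire, are only for illustration:

```python
import base64
import json


def to_wire(obj):
    """OUTPUT: convert object to bytes -> encode_64 to bytes."""
    payload = json.dumps(obj).encode("utf-8")   # application's job
    return base64.b64encode(payload)            # bytes in, bytes out


def from_wire(wire_bytes):
    """INPUT: decode_64 to bytes -> convert bytes to object."""
    payload = base64.b64decode(wire_bytes)
    return json.loads(payload.decode("utf-8"))  # application's job


message = {"subject": "base64", "values": [1, 2, 3]}
wire = to_wire(message)
restored = from_wire(wire)
```

The base64 layer never has to know what the object was.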

Cheers,
    Ron


From tjreedy at udel.edu  Mon Jul 30 05:02:02 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 29 Jul 2007 23:02:02 -0400
Subject: [Python-3000] base64 - bytes and strings
References: <3f1451f50707281847q2171f82fu2e48f2297214f591@mail.gmail.com><46ABFE12.5000101@gmail.com>
	<46AD2BF6.5080507@canterbury.ac.nz>
Message-ID: <f8jkb8$mre$1@sea.gmane.org>


"Greg Ewing" <greg.ewing at canterbury.ac.nz> wrote in message 
news:46AD2BF6.5080507 at canterbury.ac.nz...
| Nick Coghlan wrote:
| > Py3k strings are unicode, so returning a string would mean you just have
| > to encode it again using the ascii codec to get the bytes to put on the
| > wire.
|
| I still believe that producing a string is conceptually
| the right thing to do. The point of base64 is to encode
| binary data as text, not binary data as binary data.

On the contrary, to me, the point of base64 is to encode bytes into a
subset of bytes more or less guaranteed to not get mangled during
transport.
That these safe bytes correspond to ascii chars (which, yes, is why they are
safe) does not, to me, make the resulting quasi-random sequence 'text'.

tjr


From talin at acm.org  Mon Jul 30 05:41:08 2007
From: talin at acm.org (Talin)
Date: Sun, 29 Jul 2007 20:41:08 -0700
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070727162212.60E2F3A40E6@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>	<46A19FCC.7070609@acm.org>	<20070721181442.48FB03A403A@sparrow.telecommunity.com>	<46A2AE31.2080105@canterbury.ac.nz>	<20070722020422.5AAAC3A403A@sparrow.telecommunity.com>	<46A3ECB7.9070504@canterbury.ac.nz>	<20070723010750.E27693A40A9@sparrow.telecommunity.com>	<46A453C7.9070407@acm.org>	<20070723153031.D00273A403D@sparrow.telecommunity.com>	<5d44f72f0707262227o6fcf8471ja6654910c7ee07e0@mail.gmail.com>	<ca471dc20707270825j3e53c11dyb2064468f3665c14@mail.gmail.com>
	<20070727162212.60E2F3A40E6@sparrow.telecommunity.com>
Message-ID: <46AD5DD4.8000509@acm.org>

Phillip J. Eby wrote:
> At 08:25 AM 7/27/2007 -0700, Guido van Rossum wrote:
>> Basic GFs, great. Before/after/around, good. Other method
>> combinations, fine. But GFs in classes and subclassing? Not until we
>> have a much better design.
> 
> Sounds reasonable to me.  The only time I actually use them in 
> classes myself is to override existing generic functions that live 
> outside the class, like ones from an Interface or a standalone generic.

I've been thinking about this quite a bit over the last week, and in 
particular thinking about the kinds of use cases that I would want to 
use GFs for.

One idea that I had a while back, but rejected as simply too much of a 
kludge, was to say that for GFs that are also methods, we would use 
regular Python method dispatching on the first argument, followed by GF 
overload dispatching on the subsequent arguments.

The reason that this is a kludge is that now the first argument behaves 
differently than the others.

(Pay no attention to the specific syntax here.)

    class A:
       @overload
       def method(self, x:object):
          ...

    class B(A):
       @overload
       def method(self, x:int):
          ...

    b = B()
    b.method("test")  # Method not found

With regular GFs, this example works because there is a method that 
satisfies the constraints - the one in A. But since the first argument 
dominates all of the decision, by the time we get to B, the overloads in
A are no longer accessible. It's as if each subclass is in its own
little GF world.

However, even though this is clumsy from a theoretical standpoint, from 
a practical standpoint it may not be all that bad. Most of the time, 
when I want to declare a GF that is also a method, I'm just using the 
class as a namespace to hold all this stuff, and I really don't care 
much about whether subclasses can extend it or not. I'm not using the 
type of 'self' to select different implementations in this case.

And in the case where I really do want to do dynamic dispatch on *all* 
of the arguments, including the first one, then it's more likely that I 
will declare the GF as a global function, instead of as a method.

An example of this would be an AST walker: I have a class which walks an 
AST and does various transformations on it, such as constant folding or 
algebraic simplification. Since I have some state that I'd need to carry 
around, I'll put that in a class, and then I'll have class methods which 
are dynamically dispatched on the *second* argument which is the node type:

    class ASTWalker:
       @overload
       def foldConstants(self, node:InfixOperator):
          ...

       @overload
       def foldConstants(self, node:UnaryOperator):
          ...

       @overload
       def foldConstants(self, node:ConstantInteger):
          ...

       @overload
       def foldConstants(self, node:ConstantString):
          ...

In this case, I have no need to subclass the class, and I'm only doing 
dynamic dispatching on the second argument.
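
For comparison, here is a runnable sketch of the same shape using
functools.singledispatchmethod, which was added to the standard library
much later (Python 3.8) and is not the @overload decorator from
PEP 3124. It dispatches on the type of the first argument after self.
The node classes are hypothetical stand-ins, and only two node types
are handled:

```python
from dataclasses import dataclass
from functools import singledispatchmethod


@dataclass
class ConstantInteger:
    value: int


@dataclass
class InfixOperator:
    op: str
    left: object
    right: object


class ASTWalker:
    @singledispatchmethod
    def fold_constants(self, node):
        # Fallback when no implementation is registered for the node type.
        raise TypeError(f"unsupported node type: {type(node).__name__}")

    @fold_constants.register
    def _(self, node: ConstantInteger):
        return node

    @fold_constants.register
    def _(self, node: InfixOperator):
        left = self.fold_constants(node.left)
        right = self.fold_constants(node.right)
        if (node.op == "+" and isinstance(left, ConstantInteger)
                and isinstance(right, ConstantInteger)):
            return ConstantInteger(left.value + right.value)
        return InfixOperator(node.op, left, right)


tree = InfixOperator("+", ConstantInteger(1),
                     InfixOperator("+", ConstantInteger(2), ConstantInteger(3)))
folded = ASTWalker().fold_constants(tree)
```

Ordinary method lookup still picks the method by the type of self, and
only the node argument goes through the dispatch table, which matches
the split described above.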

So basically what I would propose is that we simply say that we don't 
mix normal overloading and multi-method dispatch until PJE comes up with 
his better solution.

-- Talin


From martin at v.loewis.de  Mon Jul 30 06:27:13 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 30 Jul 2007 06:27:13 +0200
Subject: [Python-3000] base64 - bytes and strings
In-Reply-To: <46AD3D09.9060006@acm.org>
References: <3f1451f50707281847q2171f82fu2e48f2297214f591@mail.gmail.com>	<46AC1E78.7050900@v.loewis.de>	<46AD2F60.9050907@canterbury.ac.nz>	<ca471dc20707291743l5eefd72cx28ca281c451e15ba@mail.gmail.com>
	<46AD3D09.9060006@acm.org>
Message-ID: <46AD68A1.8030403@v.loewis.de>

> I believe that converting a Unicode string to a base64 encoded form is 
> necessarily a 2-step process.

I think that part is undebated. What is debated is whether
base64.encodestring (which accepts bytes) should *produce*
(unicode) strings, which would then have to be encoded as
us-ascii. That would make a process of going from unicode
to base64 bytes a three-step process:

   tosend = base64.encodestring(data.encode("utf-8")).encode("ascii")

Currently, you can spare the last step if you do want bytes,
and need to specify .decode("ascii") if you want strings.
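
A minimal sketch of the bytes-returning behaviour, written against
the base64 module as it exists in current Python 3. It uses
base64.b64encode rather than encodestring (encodestring was later
removed; encodebytes is the closest replacement and additionally
inserts newlines). The sample text is an arbitrary assumption:

```python
import base64

text = "h\u00e9llo"                  # a (unicode) string
raw = text.encode("utf-8")           # step 1: str -> bytes
encoded = base64.b64encode(raw)      # step 2: bytes -> base64 bytes
as_str = encoded.decode("ascii")     # only needed if you want a str

# Round trip back to the original text.
round_trip = base64.b64decode(encoded).decode("utf-8")
```

With this design the common "send it over the wire" case stops after
step 2; the .decode("ascii") step is paid only by callers who want
text.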

Regards,
Martin

From martin at v.loewis.de  Mon Jul 30 06:51:51 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 30 Jul 2007 06:51:51 +0200
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
In-Reply-To: <ca471dc20707291727q6438b6eax5eaadbb6c712ef29@mail.gmail.com>
References: <f7ithr$lrr$1@sea.gmane.org>	<ca471dc20707181002w64e076aco9a509ec7e4e15b9a@mail.gmail.com>	<f7lk7q$9m6$1@sea.gmane.org>	<ca471dc20707181113m360db736h2fd079f29f71220@mail.gmail.com>	<f7lnd8$l2s$1@sea.gmane.org>	<ca471dc20707181158p17417c9cg37c5382d61b53fe5@mail.gmail.com>	<f8is94$tgh$1@sea.gmane.org>
	<f8j862$st0$1@sea.gmane.org>
	<ca471dc20707291727q6438b6eax5eaadbb6c712ef29@mail.gmail.com>
Message-ID: <46AD6E67.50407@v.loewis.de>

>> If I patch io.py to default to "utf-8" rather than using the filesystem
>> encoding (ascii), that fixes a few more things. (test_coding.py and
>> test_minidom.py)
> 
> How come the filesystem decoding is set to ASCII?

I guess there are two problems: a) MS_WINDOWS isn't defined, and the
relevant code in bltinmodule.c doesn't special-case cygwin, and b)
setlocale is defined on Cygwin, but doesn't work.

>> (For what it's worth, Cygwin's python 2.5 (as installed on my system) fails
>> 2 of the tests in its version of test_mailbox.py, both with "IOError:
>> [Errno 13] Permission denied").

I found that in many cases, this is a virus scanner or the indexing
service interfering. They open the file, and then the test suite cannot
delete it.

Regards,
Martin

From jyasskin at gmail.com  Mon Jul 30 07:56:27 2007
From: jyasskin at gmail.com (Jeffrey Yasskin)
Date: Sun, 29 Jul 2007 22:56:27 -0700
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <46AD5DD4.8000509@acm.org>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070722020422.5AAAC3A403A@sparrow.telecommunity.com>
	<46A3ECB7.9070504@canterbury.ac.nz>
	<20070723010750.E27693A40A9@sparrow.telecommunity.com>
	<46A453C7.9070407@acm.org>
	<20070723153031.D00273A403D@sparrow.telecommunity.com>
	<5d44f72f0707262227o6fcf8471ja6654910c7ee07e0@mail.gmail.com>
	<ca471dc20707270825j3e53c11dyb2064468f3665c14@mail.gmail.com>
	<20070727162212.60E2F3A40E6@sparrow.telecommunity.com>
	<46AD5DD4.8000509@acm.org>
Message-ID: <5d44f72f0707292256t2e4a8d25y976637456618eeaa@mail.gmail.com>

On 7/29/07, Talin <talin at acm.org> wrote:
> Phillip J. Eby wrote:
> > At 08:25 AM 7/27/2007 -0700, Guido van Rossum wrote:
> >> Basic GFs, great. Before/after/around, good. Other method
> >> combinations, fine. But GFs in classes and subclassing? Not until we
> >> have a much better design.
> >
> > Sounds reasonable to me.  The only time I actually use them in
> > classes myself is to override existing generic functions that live
> > outside the class, like ones from an Interface or a standalone generic.
>
> I've been thinking about this quite a bit over the last week, and in
> particular thinking about the kinds of use cases that I would want to
> use GFs for.
>
> One idea that I had a while back, but rejected as simply too much of a
> kludge, was to say that for GFs that are also methods, we would use
> regular Python method dispatching on the first argument, followed by GF
> overload dispatching on the subsequent arguments.
>
> The reason that this is a kludge is that now the first argument behaves
> differently than the others.
>
> (Pay no attention to the specific syntax here.)
>
>     class A:
>        @overload
>        def method(self, x:object):
>           ...
>
>     class B(A):
>        @overload
>        def method(self, x:int):
>           ...
>
>     b = B()
>     b.method("test") // Method not found
>
> With regular GFs, this example works because there is a method that
> satisfies the constraints - the one in A. But since the first argument
> dominates all of the decision, by the time we get to B, the overloads in
> A are no longer accessible. It's as if each subclass is in its own
> little GF world.
>
> However, even though this is clumsy from a theoretical standpoint, from
> a practical standpoint it may not be all that bad. Most of the time,
> when I want to declare a GF that is also a method, I'm just using the
> class as a namespace to hold all this stuff, and I really don't care
> much about whether subclasses can extend it or not. I'm not using the
> type of 'self' to select different implementations in this case.

FWIW, this dispatching on self before overloading on the rest of the
arguments is what C++ does, and I think also what Java does. To get
the parent class's methods to participate in overloading, you have to
say
  using the_parent::method;
which looks pretty similar to Phillip's
  method = the_parent.method
except that using can appear anywhere within a class, while the method
assignment looks like it needs to appear first.

Unfortunately, this seems to surprise people, although I don't have
any experience about whether an alternative would be better or worse.
A lot of times, I write:

  class Parent {
    virtual int method(int i, string s) = 0;
    int method(Bar b) { return method(b.i, b.s); }
    int method(Quux q, Foo f) { return method(q.i, q.t + f.x); }
    // Note that the non-virtual methods forward to the virtual one.
    // Although the visibility would be the same if they were virtual too.
  };

  class Child : public Parent {
    virtual int method(int i, string s) { return do_something(i, s); }
  };

and am then surprised that
  Child c;
  c.method(Bar(...));
fails to compile. (Because I forgot the using declaration in Child. Again.)

So the possibility is practically clumsy, but there's a precedent for it.
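
For comparison, a sketch of the same hiding effect in Python, assuming
functools.singledispatchmethod (Python 3.8+) as a stand-in for the
proposed @overload. The class names are hypothetical. "Hiding" loses
the parent's overloads the way Child does above; "Forwarding" gets
them back by delegating its fallback to the parent, which is roughly
the job the using declaration does in C++:

```python
from functools import singledispatchmethod


class A:
    @singledispatchmethod
    def method(self, x):
        return "A.method(object)"


class Hiding(A):
    # Redefining 'method' replaces A's overload set for Hiding instances.
    @singledispatchmethod
    def method(self, x):
        raise TypeError("no overload of method for " + type(x).__name__)

    @method.register(int)
    def _(self, x):
        return "Hiding.method(int)"


class Forwarding(A):
    # The fallback hands anything unrecognized back to the parent class.
    @singledispatchmethod
    def method(self, x):
        return super().method(x)

    @method.register(int)
    def _(self, x):
        return "Forwarding.method(int)"
```

Hiding().method("test") raises TypeError even though A could have
handled the call, while Forwarding().method("test") reaches A.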

-- 
Namasté,
Jeffrey Yasskin

From hasan.diwan at gmail.com  Mon Jul 30 08:40:29 2007
From: hasan.diwan at gmail.com (Hasan Diwan)
Date: Sun, 29 Jul 2007 23:40:29 -0700
Subject: [Python-3000] test_asyncore fails intermittently on Darwin
In-Reply-To: <2cda2fc90707292338pff060c1i810737dcf6d5df54@mail.gmail.com>
References: <2cda2fc90707261505tdd9a0f1t861b5801c37ad11e@mail.gmail.com>
	<1d36917a0707261618oac94f20l98f464a2ab1edc4e@mail.gmail.com>
	<2cda2fc90707292338pff060c1i810737dcf6d5df54@mail.gmail.com>
Message-ID: <2cda2fc90707292340k7eb11f2w82003e6f705438c3@mail.gmail.com>

The issue seems to be in the socket.py close method. It needs to sleep
socket.SO_REUSEADDR seconds before returning. Yes, it is a simple fix
in python, but the socket code is C. I found some code in socket.py
and made the changes. Patch is available at
http://sourceforge.net/tracker/index.php?func=detail&aid=1763387&group_id=5470&atid=305470
-- enjoy your week.
-- 
Cheers,
Hasan Diwan <hasan.diwan at gmail.com>

From jdahlin at async.com.br  Sun Jul 29 19:46:55 2007
From: jdahlin at async.com.br (Johan Dahlin)
Date: Sun, 29 Jul 2007 14:46:55 -0300
Subject: [Python-3000] optimizing [x]range
In-Reply-To: <ca471dc20707291033v1bec4607pad990db38f82eda8@mail.gmail.com>
References: <1d85506f0707280806n1764151cx4961a0573dda435e@mail.gmail.com>	<ca471dc20707281504g704ff836m8aeaa966559483d3@mail.gmail.com>	<46ABE025.4050204@async.com.br>
	<ca471dc20707291033v1bec4607pad990db38f82eda8@mail.gmail.com>
Message-ID: <46ACD28F.6030808@async.com.br>

Guido van Rossum wrote:
[..]
>
>  Were you drunk at the time? :-)
No, I just remember that I made that mistake several times.

[..]
> > It seems to be used quite widely already;
> >
> > http://google.com/codesearch?hl=en&q=+%5E.*if%5Cs%2B.*%5Cs%2Bin%5Cs%2Brange%5C(.*%24&start=10&sa=N
> >
>
>  Sorry, 50 hits is not "quite widely".

Not everything is known to google's code search. I'm just saying that
there's code out there that uses this syntax.

>
>  Did you find *any* examples using a step > 1?

No, I didn't. I'm not arguing for that use case either, I'm mainly
interested in the use case where step == 1.
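
To make the step == 1 case concrete, a small sketch of the check an
optimized containment test boils down to for integers. The helper
name in_unit_range is hypothetical and only illustrates the idea; it
is not taken from any patch:

```python
def in_unit_range(x, start, stop):
    """Constant-time equivalent of 'x in range(start, stop)' for ints."""
    return isinstance(x, int) and start <= x < stop


# Spot check against the built-in for a handful of integers.
samples = [-1, 0, 5, 9, 10, 11]
agrees = all(in_unit_range(x, 0, 10) == (x in range(0, 10)) for x in samples)
```

No list is built and no iteration happens, so the cost does not depend
on the size of the range.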

Johan

From Jack.Jansen at cwi.nl  Sun Jul 29 21:32:42 2007
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Sun, 29 Jul 2007 21:32:42 +0200
Subject: [Python-3000] struni and the Apple four-character-codes
In-Reply-To: <B707418E-B504-40D8-9EFF-1B1FB6216EFE@mac.com>
References: <5d44f72f0707242218i71554da4x1d6924715f016562@mail.gmail.com>
	<4EFA40DF-0113-1000-ED2D-BFFF4499DECF-Webmail-10021@mac.com>
	<5d44f72f0707262038j7a2dd0dcued85d9ef6d014236@mail.gmail.com>
	<B707418E-B504-40D8-9EFF-1B1FB6216EFE@mac.com>
Message-ID: <1719694F-6FCA-4A64-8A42-5BC60C3A793C@cwi.nl>

One minor point (that may already have been addressed, I've not seen  
the whole discussion): note that 4CCs not only occur on the Mac but  
also in various other contexts: AIFF files use 4CCs to define chunk  
types, MP4 files use them for a gazillion different things (media  
types, codec types, etc). Actually, codec types are generally defined  
by their 4CC, and sometimes these even get to be used as their  
mainstream name (divx and xvid).

It may be worthwhile to add generalized support somewhere to handle  
converting 4CCs from readable to binary representation. And, of  
course, the world being as it is some formats (Mac OSTypes, for  
example, and probably quicktime/mp4 as well, but I'm not sure)  
represent 4CCs in big-endian order, others (AIFF) in little-endian.
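
A minimal sketch of what such a conversion helper could look like.
The function names are hypothetical, and the byte order is left to
the caller since it depends on the file format:

```python
def fourcc_to_int(code, byteorder="big"):
    """Pack a readable four-character code into a 32-bit integer."""
    data = code.encode("ascii")
    if len(data) != 4:
        raise ValueError("a four-character code must be exactly 4 bytes")
    return int.from_bytes(data, byteorder)


def int_to_fourcc(value, byteorder="big"):
    """Unpack a 32-bit integer back into its readable form."""
    return value.to_bytes(4, byteorder).decode("ascii")


big = fourcc_to_int("divx", "big")
little = fourcc_to_int("divx", "little")
```

The same four characters give two different integers depending on
the byte order, so the format's convention has to be known up front.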
--
Jack Jansen, <Jack.Jansen at cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma  
Goldman


From skip at pobox.com  Mon Jul 30 15:13:18 2007
From: skip at pobox.com (skip at pobox.com)
Date: Mon, 30 Jul 2007 08:13:18 -0500
Subject: [Python-3000] io library/PEP 3116 bits
Message-ID: <18093.58350.892824.688493@montanaro.dyndns.org>


I was looking at PEP 3116 to try and figure out what the newline keyword
argument was for (it was mentioned in a couple replies to some checkin
comments and I see it in io.py).  It's not really mentioned in the PEP as
far as I could tell other than this:

    Some new features include universal newlines and character set encoding
    and decoding.

The io.open() docstring has this to say:

      newline: optional newlines specifier; must be None, '\n' or '\r\n';
               specifies the line ending expected on input and written on
               output.  If None, use universal newlines on input and
               use os.linesep on output.

Shouldn't '\r' be provided as an option for Macs?  Also, shouldn't the "U"
mode flag be discarded (2to3 could maybe do this)?  Is this particular bit
of backwards compatibility all that necessary?

The other thing I wanted to comment on is the default value for n in the
various read methods.  In some places it's -1 (why not zero? *), but in
other places it's None, with presumably the same meaning.  Shouldn't this be
consistent across all read methods?  The couple read methods mentioned in
PEP 3116 only mention n=-1 as a default.

Skip

(*) A few days ago at work I saw someone check in a piece of code with

    f.read(-1)

That looked so strange to me I had to look up its meaning.  I don't think I
had ever seen someone explicitly call read with a -1 arg.

S

From bwinton at latte.ca  Mon Jul 30 15:41:21 2007
From: bwinton at latte.ca (Blake Winton)
Date: Mon, 30 Jul 2007 09:41:21 -0400
Subject: [Python-3000] base64 - bytes and strings
In-Reply-To: <46AD68A1.8030403@v.loewis.de>
References: <3f1451f50707281847q2171f82fu2e48f2297214f591@mail.gmail.com>	<46AC1E78.7050900@v.loewis.de>	<46AD2F60.9050907@canterbury.ac.nz>	<ca471dc20707291743l5eefd72cx28ca281c451e15ba@mail.gmail.com>	<46AD3D09.9060006@acm.org>
	<46AD68A1.8030403@v.loewis.de>
Message-ID: <46ADEA81.8000609@latte.ca>

Martin v. Löwis wrote:
> The debate is whether base64.encodestring (which accepts bytes)
> should *produce* (unicode) strings, which would then have to be
> encoded as us-ascii. That would make a process of going from
> unicode to base64 bytes a three-step process:
> 
>    tosend = base64.encodestring(data.encode("utf-8")).encode("ascii")
> 
> Currently, you can spare the last step if you do want bytes,
> and need to specify .decode("ascii") if you want strings.

As a vote for keeping it, does anyone really want to encode the 
base64-ed data as something other than "ascii"?

I mean, does it make any sense to write:
tosend = base64.encodestring(data.encode("utf-8")).encode("UTF-16")
?  Even if you could, I believe the resulting string would be 
un-processable by any other base-64 decoding tool.

Later,
Blake.

From unknown_kev_cat at hotmail.com  Mon Jul 30 18:12:17 2007
From: unknown_kev_cat at hotmail.com (Joe Smith)
Date: Mon, 30 Jul 2007 12:12:17 -0400
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
References: <f7ithr$lrr$1@sea.gmane.org>	<ca471dc20707181002w64e076aco9a509ec7e4e15b9a@mail.gmail.com>	<f7lk7q$9m6$1@sea.gmane.org>	<ca471dc20707181113m360db736h2fd079f29f71220@mail.gmail.com>	<f7lnd8$l2s$1@sea.gmane.org>	<ca471dc20707181158p17417c9cg37c5382d61b53fe5@mail.gmail.com>	<f8is94$tgh$1@sea.gmane.org><f8j862$st0$1@sea.gmane.org><ca471dc20707291727q6438b6eax5eaadbb6c712ef29@mail.gmail.com>
	<46AD6E67.50407@v.loewis.de>
Message-ID: <f8l2l4$2d7$1@sea.gmane.org>


"Martin v. Löwis" <martin at v.loewis.de> wrote in message 
news:46AD6E67.50407 at v.loewis.de...
>>> If I patch io.py to default to "utf-8" rather than using the filesystem
>>> encoding (ascii), that fixes a few more things. (test_coding.py and
>>> test_minidom.py)
>>
>> How come the filesystem decoding is set to ASCII?
>
> I guess there are two problems: a) MS_WINDOWS isn't defined, and the
> relevant code in bltinmodule.c doesn't special-case cygwin, and b)
> setlocale is defined on Cygwin, but doesn't work.

Cygwin's setlocale function only supports the "C" locale.
I am a bit surprised that ASCII is returned rather than the system's
default encoding.
(I believe that should be Latin-1 on my system).

>>> (For what it's worth, Cygwin's python 2.5 (as installed on my system)
>>> fails 2 of the tests in its version of test_mailbox.py, both with "IOError:
>>> [Errno 13] Permission denied").
>
> I found that in many cases, this is a virus scanner or the indexing
> service interfering. They open the file, and then the test suite cannot
> delete it.

Good guesses, but the indexing service is turned off, and I am not running 
any virus scanning software.

The failures for the 2.5 test come from lines that look like:
"for line in f:". The failures in the 3k tests come from the lines that 
attempt to open the file.

It does seem likely though that something is going wrong with the
deletion, effectively delaying it, which is triggering the errors.
But I'm not sure what.


But actually looking closely I was mistaken. Only one test failed under 2.5. 
That was TestMH's test_pack.
It may be a fluke. Or perhaps the Cygwin Python maintainer understood that 
failure and decided it was nothing to worry about.

GvR's thoughts of changing file permissions do not seem to work, as test_add 
certainly does not look to be changing file permissions.


From guido at python.org  Mon Jul 30 19:09:27 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 30 Jul 2007 10:09:27 -0700
Subject: [Python-3000] struni and the Apple four-character-codes
In-Reply-To: <1719694F-6FCA-4A64-8A42-5BC60C3A793C@cwi.nl>
References: <5d44f72f0707242218i71554da4x1d6924715f016562@mail.gmail.com>
	<4EFA40DF-0113-1000-ED2D-BFFF4499DECF-Webmail-10021@mac.com>
	<5d44f72f0707262038j7a2dd0dcued85d9ef6d014236@mail.gmail.com>
	<B707418E-B504-40D8-9EFF-1B1FB6216EFE@mac.com>
	<1719694F-6FCA-4A64-8A42-5BC60C3A793C@cwi.nl>
Message-ID: <ca471dc20707301009k487f8775iaa5f8561c84bf09d@mail.gmail.com>

On 7/29/07, Jack Jansen <Jack.Jansen at cwi.nl> wrote:
> One minor point (that may already have been addressed, I've not seen
> the whole discussion): note that 4CCs not only occur on the Mac but
> also in various other contexts: AIFF files use 4CCs to define chunk
> types, MP4 files use them for a gazillion different things (media
> types, codec types, etc). Actually, codec types are generally defined
> by their 4CC, and sometimes these even get to be used as their
> mainstream name (divx and xvid).
>
> It may be worthwhile to add generalized support somewhere to handle
> converting 4CCs from readable to binary representation. And, of
> course, the world being as it is some formats (Mac OSTypes, for
> example, and probably quicktime/mp4 as well, but I'm not sure)
> represent 4CCs in big-endian order, others (AIFF) in little-endian.

And some support both, detecting the byte order from the 4CC. (This
excludes palindromic 4CCs. :-)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon Jul 30 19:20:50 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 30 Jul 2007 10:20:50 -0700
Subject: [Python-3000] io library/PEP 3116 bits
In-Reply-To: <18093.58350.892824.688493@montanaro.dyndns.org>
References: <18093.58350.892824.688493@montanaro.dyndns.org>
Message-ID: <ca471dc20707301020h49f89131k944fe32036708628@mail.gmail.com>

On 7/30/07, skip at pobox.com <skip at pobox.com> wrote:
> I was looking at PEP 3116 to try and figure out what the newline keyword
> argument was for (it was mentioned in a couple replies to some checkin
> comments and I see it in io.py).  It's not really mentioned in the PEP as
> far as I could tell other than this:
>
>     Some new features include universal newlines and character set encoding
>     and decoding.
>
> The io.open() docstring has this to say:
>
>       newline: optional newlines specifier; must be None, '\n' or '\r\n';
>                specifies the line ending expected on input and written on
>                output.  If None, use universal newlines on input and
>                use os.linesep on output.
>
> Shouldn't '\r' be provided as an option for Macs?  Also, shouldn't the "U"
> mode flag be discarded (2to3 could maybe do this)?  Is this particular bit
> of backwards compatibility all that necessary?

I don't think \r needs to be supported -- OSX uses \n; Python 3.0
isn't going to be ported to MacOS 9. We discussed this before; I
promised I'd add \r support if anyone can find a current use case for
it. So far none have been reported.

Regarding dropping 'U': agreed. But since the fixer hasn't been
written yet it hasn't been dropped yet. We need help for little
niggling details like this!

> The other thing I wanted to comment on is the default value for n in the
> various read methods.  In some places it's -1 (why not zero? *), but in
> other places it's None, with presumably the same meaning.  Shouldn't this be
> consistent across all read methods?  The couple read methods mentioned in
> PEP 3116 only mention n=-1 as a default.
>
> Skip
>
> (*) A few days ago at work I saw someone check in a piece of code with
>
>     f.read(-1)
>
> That looked so strange to me I had to look up its meaning.  I don't think I
> had ever seen someone explicitly call read with a -1 arg.

read(0) means to read zero bytes. It always returns an empty string
(or byte array). There are plenty of end cases where this is useful.

read(), read(None) and read(-1) are all synonyms, meaning "read until
EOF". The reason there are three spellings is mostly historic; because
there are so many different file-like objects and not all of them
implemented this consistently. Since the argument is an integer, it's
the easiest to use -1 as the default; but since some classes used None
as the default instead, some people started *passing* None, and then
the need was born to support both.

Arguably this was a bad idea, and we should add a new API readall()
(one of the implementations already has this, and read(-1) calls it).
Then the 2to3 fixer will have to recognize this. I welcome patches!
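
The spellings above can be checked against an in-memory stream. This
is a small sketch using io.BytesIO as it behaves in current Python 3;
the sample bytes are arbitrary:

```python
import io

buf = io.BytesIO(b"spam and eggs")
empty = buf.read(0)      # reads nothing; the position does not move
head = buf.read(4)       # the first four bytes
rest = buf.read(-1)      # everything from here up to EOF

buf.seek(0)
all_default = buf.read()       # no argument
buf.seek(0)
all_none = buf.read(None)      # explicit None
buf.seek(0)
all_minus_one = buf.read(-1)   # explicit -1
```

All three "read until EOF" spellings return the same bytes, and
read(0) is the only one that returns an empty result on a non-empty
stream.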

But right now, getting the number of failing unit tests in the
py3k-struni branch down to zero is more important. To help, see
http://wiki.python.org/moin/Py3kStrUniTests.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon Jul 30 19:24:35 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 30 Jul 2007 10:24:35 -0700
Subject: [Python-3000] base64 - bytes and strings
In-Reply-To: <46ADEA81.8000609@latte.ca>
References: <3f1451f50707281847q2171f82fu2e48f2297214f591@mail.gmail.com>
	<46AC1E78.7050900@v.loewis.de> <46AD2F60.9050907@canterbury.ac.nz>
	<ca471dc20707291743l5eefd72cx28ca281c451e15ba@mail.gmail.com>
	<46AD3D09.9060006@acm.org> <46AD68A1.8030403@v.loewis.de>
	<46ADEA81.8000609@latte.ca>
Message-ID: <ca471dc20707301024q4b5a82cdjeba1dc7461efc12c@mail.gmail.com>

On 7/30/07, Blake Winton <bwinton at latte.ca> wrote:
> Martin v. Löwis wrote:
> > The debate is whether base64.encodestring (which accepts bytes)
> > should *produce* (unicode) strings, which would then have to be
> > encoded as us-ascii. That would make a process of going from
> > unicode to base64 bytes a three-step process:
> >
> >    tosend = base64.encodestring(data.encode("utf-8")).encode("ascii")
> >
> > Currently, you can spare the last step if you do want bytes,
> > and need to specify .decode("ascii") if you want strings.
>
> As a vote for keeping it, does anyone really want to encode the
> base64-ed data as something other than "ascii"?
>
> I mean, does it make any sense to write:
> tosend = base64.encodestring(data.encode("utf-8")).encode("UTF-16")
> ?  Even if you could, I believe the resulting string would be
> un-processable by any other base-64 decoding tool.

I think you're missing the point, the point being that the most common
use needs bytes, so returning bytes is the most useful API design.

And to answer your rhetorical question: yes, there are other
conceivable encodings for base64; in particular EBCDIC.
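
A quick sketch of that last point, assuming Python's "cp500" codec as
one representative EBCDIC code page (other EBCDIC variants exist).
The same base64 text maps to different bytes under each encoding:

```python
import base64

b64_text = base64.b64encode(b"ABC").decode("ascii")  # base64 as text
as_ascii = b64_text.encode("ascii")
as_utf16 = b64_text.encode("utf-16")
as_ebcdic = b64_text.encode("cp500")   # cp500: one EBCDIC code page

# Each byte form decodes back to the same base64 text.
same_text = (as_utf16.decode("utf-16") == b64_text
             and as_ebcdic.decode("cp500") == b64_text)
```

The base64 alphabet is a set of characters, not of byte values, so
the byte representation is a separate choice; ASCII is simply the
overwhelmingly common one.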

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon Jul 30 19:27:35 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 30 Jul 2007 10:27:35 -0700
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
In-Reply-To: <46AD6E67.50407@v.loewis.de>
References: <f7ithr$lrr$1@sea.gmane.org>
	<ca471dc20707181002w64e076aco9a509ec7e4e15b9a@mail.gmail.com>
	<f7lk7q$9m6$1@sea.gmane.org>
	<ca471dc20707181113m360db736h2fd079f29f71220@mail.gmail.com>
	<f7lnd8$l2s$1@sea.gmane.org>
	<ca471dc20707181158p17417c9cg37c5382d61b53fe5@mail.gmail.com>
	<f8is94$tgh$1@sea.gmane.org> <f8j862$st0$1@sea.gmane.org>
	<ca471dc20707291727q6438b6eax5eaadbb6c712ef29@mail.gmail.com>
	<46AD6E67.50407@v.loewis.de>
Message-ID: <ca471dc20707301027m71b91ffbnbfb31a9075e66d8b@mail.gmail.com>

On 7/29/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> >> (For what it's worth, Cygwin's python 2.5 (as installed on my system) fails
> >> 2 of the tests in its version of test_mailbox.py, both with "IOError:
> >> [Errno 13] Permission denied").
>
> I found that in many cases, this is a virus scanner or the indexing
> service interfering. They open the file, and then the test suite cannot
> delete it.

Oh darn. I remember running into that in a completely different
context. What's the solution? Turn off the virus scanner? Wait until
it's done? I guess we could add something to test_support.unlink()
that checks for windows or cygwin, and when it gets this error on
cleanup, waits half a second and tries again, looping for a few
seconds before giving up completely. Currently the unlink() call
ignores all errors, so then the subsequent open() call gets the error.
Unit tests that still call os.unlink() or os.remove() instead of
test_support.unlink() should be updated anyway.
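
A rough sketch of the retry idea, not the actual test_support code.
The helper name unlink_with_retry and its timeout and delay defaults
are assumptions, and for simplicity it retries on every platform
instead of checking for Windows or Cygwin first:

```python
import os
import time


def unlink_with_retry(path, timeout=3.0, delay=0.5):
    """Remove 'path', retrying while another process holds it open.

    Returns True once the file is gone, False if we gave up.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            os.unlink(path)
            return True
        except FileNotFoundError:
            return True  # already gone, nothing left to clean up
        except PermissionError:
            # Typically a scanner or indexer still has the file open.
            if time.monotonic() >= deadline:
                return False
            time.sleep(delay)
```

Returning False instead of silently ignoring the error lets the
caller report the cleanup failure at the point where it happened,
rather than at a later open() call.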

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon Jul 30 20:07:50 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 30 Jul 2007 11:07:50 -0700
Subject: [Python-3000] io library/PEP 3116 bits
In-Reply-To: <DBADB8D9-9941-43C2-8B79-9440E1B00DAC@PageDNA.com>
References: <18093.58350.892824.688493@montanaro.dyndns.org>
	<ca471dc20707301020h49f89131k944fe32036708628@mail.gmail.com>
	<DBADB8D9-9941-43C2-8B79-9440E1B00DAC@PageDNA.com>
Message-ID: <ca471dc20707301107i60789370lfce119eb8ecb44d7@mail.gmail.com>

On 7/30/07, Tony Lownds <tony at pagedna.com> wrote:
> On Jul 30, 2007, at 10:20 AM, Guido van Rossum wrote:
> > I don't think \r needs to be supported -- OSX uses \n; Python 3.0
> > isn't going to be ported to MacOS 9. We discussed this before; I
> > promised I'd add \r support if anyone can find a current use case for
> > it. So far none have been reported.
>
> I routinely work with OS X created files that use \r newlines. The most
> common ones are Excel (when exporting to text) and Adobe Illustrator
> EPS files.

Fair enough. We'll have to support \r then. I'll update the PEP; a
patch for the code would be most welcome.
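
For reference, a sketch of how \r-terminated data reads under the two
settings, written against the io module as it exists in current
Python 3 (where newline='\r' is accepted). The sample data is an
arbitrary assumption:

```python
import io

data = b"first\rsecond\rthird"

# newline=None: universal newlines, '\r' is translated to '\n'.
universal = io.TextIOWrapper(io.BytesIO(data), encoding="ascii",
                             newline=None)
translated = universal.readlines()

# newline='\r': lines end at '\r' and are returned untranslated.
classic_mac = io.TextIOWrapper(io.BytesIO(data), encoding="ascii",
                               newline="\r")
untranslated = classic_mac.readlines()
```

Reading such files already works with universal newlines; an explicit
'\r' setting mostly matters for writing them back out, or for keeping
the original terminators intact.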

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jimjjewett at gmail.com  Mon Jul 30 20:20:14 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 30 Jul 2007 14:20:14 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070721181442.48FB03A403A@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
Message-ID: <fb6fbf560707301120w2421cbc4s519a59a93163c937@mail.gmail.com>

On 7/21/07, Phillip J. Eby <pje at telecommunity.com> wrote:

>... If you have to use @somegeneric.before and
> @somegeneric.after, you can't decide on your own to add
> @somegeneric.debug.

> However, if it's @before(somegeneric...), then you can add
> @debug and @authorize and @discount and whatever else
> you need for your
> application, without needing to monkeypatch them in.

I honestly don't see any difference here.  @somegeneric.method implies
that somegeneric is an existing object, and even that it already has
rules for combining .before and .after; it can just as easily have a
rule for combining arbitrary methods.

If you're saying that @discount could include its own combination
rules, then each method needs to repeat the boilerplate to pick apart
the current decision tree.  The only compensating "advantage" I see is
that the decision tree could be changed arbitrarily from anywhere,
even as "good practice."  (Since my new @thumpit decorator would take
the generic as an argument, you won't see the name of the generic in
my file; you might never see it if there was iteration involved.)

> Our brains run by pattern recognition, with more-specific
> patterns taking precedence, so this is an easier model for your
> brain to follow than step-by-step computation anyway.

Only if you are confident that you have all the patterns enumerated.

I realize that subclasses are theoretically just as arbitrary, but
they aren't in practice.  Base classes are almost always named
directly, rather than indirectly through a variable.  Subclassing
(normally) affects only the first dimension, so you don't have a
cartesian product to mentally resolve.

You can certainly say now that configuration specialization should be
in one place, and that dispatching on parameter patterns like

(*            # ignored
, :int        # actual int subclass
, :Container  # meets the Container ABC
, 4<val<17.3  # value-specific rule
)

is a bad idea -- but whenever I look at an application from the
outside, well-organized configuration data is a rare exception.

> At 10:55 PM 7/20/2007 -0700, Talin wrote:
> >If it turns out that there's no way to get a callback when the
> >class has finished being built,

Could you clarify why the __class__ attribute being used by super is
not sufficient?

-jJ

From tony at PageDNA.com  Mon Jul 30 19:50:16 2007
From: tony at PageDNA.com (Tony Lownds)
Date: Mon, 30 Jul 2007 10:50:16 -0700
Subject: [Python-3000] io library/PEP 3116 bits
In-Reply-To: <ca471dc20707301020h49f89131k944fe32036708628@mail.gmail.com>
References: <18093.58350.892824.688493@montanaro.dyndns.org>
	<ca471dc20707301020h49f89131k944fe32036708628@mail.gmail.com>
Message-ID: <DBADB8D9-9941-43C2-8B79-9440E1B00DAC@PageDNA.com>


On Jul 30, 2007, at 10:20 AM, Guido van Rossum wrote:
> I don't think \r needs to be supported -- OSX uses \n; Python 3.0
> isn't going to be ported to MacOS 9. We discussed this before; I
> promised I'd add \r support if anyone can find a current use case for
> it. So far none have been reported.

I routinely work with OS X created files that use \r newlines. The most
common ones are Excel (when exporting to text) and Adobe Illustrator
EPS files.

-Tony


From jimjjewett at gmail.com  Mon Jul 30 21:00:46 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 30 Jul 2007 15:00:46 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <5d44f72f0707292256t2e4a8d25y976637456618eeaa@mail.gmail.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<46A3ECB7.9070504@canterbury.ac.nz>
	<20070723010750.E27693A40A9@sparrow.telecommunity.com>
	<46A453C7.9070407@acm.org>
	<20070723153031.D00273A403D@sparrow.telecommunity.com>
	<5d44f72f0707262227o6fcf8471ja6654910c7ee07e0@mail.gmail.com>
	<ca471dc20707270825j3e53c11dyb2064468f3665c14@mail.gmail.com>
	<20070727162212.60E2F3A40E6@sparrow.telecommunity.com>
	<46AD5DD4.8000509@acm.org>
	<5d44f72f0707292256t2e4a8d25y976637456618eeaa@mail.gmail.com>
Message-ID: <fb6fbf560707301200o1065b65p5516689848dd579d@mail.gmail.com>

On 7/30/07, Jeffrey Yasskin <jyasskin at gmail.com> wrote:
> On 7/29/07, Talin <talin at acm.org> wrote:
> > Phillip J. Eby wrote:
> > > At 08:25 AM 7/27/2007 -0700, Guido van Rossum wrote:
> > >> ... But GFs in classes and subclassing? Not until we
> > >> have a much better design.

> > > The only time I actually use them in
> > > classes myself is to override existing generic functions
> > > that live outside the class

Why are you overriding, instead of just specializing?

Why not define the @overload operator so that it just registers the
specialization with the base class?


> >     class A:
> >        @overload
> >        def method1(self, x:object):
> >           ...

Should this register with a "global" generic method, so that

    method1(first_arg:A, x:object)

forwards to

    A.method1(first_arg, x)

> >     class B(A):
> >        @overload
> >        def method(self, x:int):
> >           ...

and this would register with A.method1 (or the global method1,
depending on the previous answer) for the pattern

    method1(first_arg:B, x:int)

> >     b = B()
> >     b.method("test") // Method not found

Instead, this would skip back to A.method1(self, "test") -- and I
think the @overload decorator is sufficient warning.  (I do wonder
whether that is magical enough to call it @__overload__)

-jJ

From pje at telecommunity.com  Mon Jul 30 21:45:33 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 30 Jul 2007 15:45:33 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <fb6fbf560707301120w2421cbc4s519a59a93163c937@mail.gmail.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<20070713173936.53C213A404D@sparrow.telecommunity.com>
	<f7pgki$6o3$1@sea.gmane.org>
	<ca471dc20707200749p4ed42134h453c7535c98cc73d@mail.gmail.com>
	<20070720174706.AE5773A40A8@sparrow.telecommunity.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
	<fb6fbf560707301120w2421cbc4s519a59a93163c937@mail.gmail.com>
Message-ID: <20070730194510.08C733A40AA@sparrow.telecommunity.com>

At 02:20 PM 7/30/2007 -0400, Jim Jewett wrote:
>On 7/21/07, Phillip J. Eby <pje at telecommunity.com> wrote:
>
> >... If you have to use @somegeneric.before and
> > @somegeneric.after, you can't decide on your own to add
> > @somegeneric.debug.
>
> > However, if it's @before(somegeneric...), then you can add
> > @debug and @authorize and @discount and whatever else
> > you need for your
> > application, without needing to monkeypatch them in.
>
>I honestly don't see any difference here.  @somegeneric.method implies
>that somegeneric is an existing object, and even that it already has
>rules for combining .before and .after; it can just as easily have a
>rule for combining arbitrary methods.

I don't understand what you're saying or how it relates to what I said above.

If you define a new kind of method qualifier (e.g. @discount), then 
all existing generic functions aren't suddenly going to grow a 
'.discount' attribute.  That's what the above discussion is about -- 
how you *access* qualifier decorators.


>If you're saying that @discount could include its own combination
>rules, then each method needs to repeat the boilerplate to pick apart
>the current decision tree.

Still don't understand you.  Method combination is done with a 
generic function called "combine_actions" which takes two arbitrary 
"method" objects and returns a new "method" representing their 
combination.  There is no boilerplate or picking anything apart.


>   The only compensating "advantage" I see is
>that the decision tree could be changed arbitrarily from anywhere,
>even as "good practice."  (Since my new @thumpit decorator would take
>the generic as an argument, you won't see the name of the generic in
>my file; you might never see it there was iteration involved.)

Decision trees are generated from a flat collection of rules; they're 
not directly manipulated.  In the default implementation (based on 
Guido's prototype), the "tree" is just a big dictionary mapping 
tuples of types to "method" objects created by combining all the 
methods whose signatures are implied by that tuple of types.  It's 
also sparse, in that it doesn't contain type combinations that 
haven't been looked up yet.  So there isn't really any tree that you 
could "change" here.

There's just a collection of rules, where a rule consists of a 
predicate, a definition order, a "body" (function), and a method 
factory.  A predicate is a collection of possible signatures (e.g. 
the sequence of applicable types) -- i.e., an OR of ANDs.

To actually build a tree, rules are turned into a set of "cases", 
where each case consists of one signature from the rule's predicate, 
plus a method instance created using the signature, body, and 
definition order.  (Not all methods care about definition order, just 
ones like before/after.)

In the default engine (loosely based on Guido's prototype), these 
cases are merged by using combine_actions() on any cases with the 
same signature, and stored in a dictionary called the 
"registry".  The registry is built up incrementally as you add methods.

When you call the function, a type tuple is built and looked up in 
the cache.  If nothing is found in the cache, we loop over the 
*entire* registry, and build up a derived method, like this (actual 
code excerpt):

     try:
         f = cache[types]
     except KeyError:
         # guard against re-entrancy looking for the same thing...
         action = cache[types] = self.rules.default_action
         for sig in self.registry:
             if sig==types or implies(types, sig):
                 action = combine_actions(action, self.registry[sig])
         f = cache[types] = action
     return f(*args)

The 'self.rules.default_action' is to method objects what zero is to 
numbers -- the start of the summing.  Ordinarily, the default action 
is a NoMethodFound object -- a perfectly valid "method" 
implementation whose behavior is to raise an error.  All other method 
types have higher combination precedence than NoMethodFound, so it 
always sinks to the end of any combination of methods.

The relevant generic functions here are implies(), combine_actions(), 
and overrides() -- where combine_actions() calls overrides() to find 
out which action should override the other, and then returns 
overriding_action.override(overridden_action).

The overrides() relationship of two actions of the same type (e.g. 
two Around methods), is defined by the implies() relationship of the 
action signatures.  For Before/After methods, the definition order is 
used to resolve any ambiguity in the implies().

The .override() of a method is usually a new instance of the same 
method type, but with a "tail" that points to the overridden method, 
so that next_method will do the right thing.
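
The lookup loop and the override/tail mechanism described above can be 
sketched in a few dozen lines.  This is only a minimal illustration, 
not the actual implementation: the names Method, GenericFunction and 
when() are hypothetical, bodies receive next_method as an explicit 
first argument, only one kind of method is modelled, and ambiguous 
or duplicate signatures simply raise instead of being merged.

```python
class NoMethodFound:
    """Default action: a valid "method" whose behaviour is to raise."""

    def __call__(self, *args):
        raise TypeError("no applicable method for %r" % (args,))


def implies(types, sig):
    """True if every type in `types` is a subclass of its peer in `sig`."""
    return len(types) == len(sig) and all(
        issubclass(t, s) for t, s in zip(types, sig))


class Method:
    """A primary method; `tail` is the action reached via next_method."""

    def __init__(self, sig, body, tail=None):
        self.sig, self.body, self.tail = sig, body, tail

    def override(self, other):
        # Return a new instance whose tail chains to the overridden action.
        tail = other if self.tail is None else combine_actions(self.tail, other)
        return Method(self.sig, self.body, tail)

    def __call__(self, *args):
        return self.body(self.tail, *args)


def overrides(a, b):
    """Everything outranks NoMethodFound; otherwise compare signatures."""
    if isinstance(b, NoMethodFound):
        return not isinstance(a, NoMethodFound)
    if isinstance(a, NoMethodFound):
        return False
    return a.sig != b.sig and implies(a.sig, b.sig)


def combine_actions(a, b):
    if overrides(a, b):
        return a.override(b)
    if overrides(b, a):
        return b.override(a)
    raise TypeError("ambiguous or duplicate methods")  # simplification


class GenericFunction:
    def __init__(self):
        self.registry = {}   # signature -> Method
        self.cache = {}      # type tuple -> combined action
        self.default_action = NoMethodFound()

    def when(self, *sig):
        def decorate(body):
            self.registry[sig] = Method(sig, body)  # replaces, never merges
            self.cache.clear()
            return body
        return decorate

    def __call__(self, *args):
        types = tuple(type(arg) for arg in args)
        try:
            f = self.cache[types]
        except KeyError:
            action = self.default_action
            for sig, method in self.registry.items():
                if sig == types or implies(types, sig):
                    action = combine_actions(action, method)
            f = self.cache[types] = action
        return f(*args)


describe = GenericFunction()

@describe.when(object)
def _(next_method, x):
    return "object"

@describe.when(int)
def _(next_method, x):
    return "int, then " + next_method(x)
```

Calling describe(3) runs the int method first and reaches the object 
method through its tail; a call with no applicable signature ends up 
at NoMethodFound, which raises.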

There are more details than this, of course, but the point is that 
method combination is 100% orthogonal to the dispatch tree 
mechanism.  You can build any kind of dispatch engine you want, just 
by using combine_actions to combine the actions.  The action types 
themselves only need to know how to .override() a lower precedence 
method and .merge() with a same-precedence method.  And there needs 
to be an overrides() relationship defined between all pairs of method 
types, but in my current version of the implementation, overrides() 
is automatically transitive for any type-level relationship.

So if you define a type that overrides Around, then it also overrides 
anything that Around overrides.  So, for the most part you just say 
what types you want to override (and/or be overridden by), and maybe 
add a rule for how to compare two methods of your type (if the 
default of comparing by the implies() of signatures isn't sufficient).

The way that generic functions make this incredible orthogonality and 
flexibility possible is itself an argument for generic functions, 
IMO.  Certainly, it's a hell of an argument for implementing generic 
functions in terms of other generic functions, which is why I did 
it.  It beats the crap out of my previous implementation approaches, 
which had way too much coupling between method combination and 
tree-building and rules and cases and whatnot.

Separating these ideas into different functional/conceptual domains 
makes the whole thing easier to understand -- as long as you're not 
locked into procedural-implementation thinking.  If you want to think 
step-by-step, it's potentially a vast increase in complication.  On 
the other hand, it's like thinking about reference counting while 
writing Python code.  Sure, you need to drop down to that level every 
now and then, but it's a waste of time to think about it 90% of the 
time.  Being able to have a class of things that you *don't* think 
about is what makes Python a higher-level language than the C it's 
implemented with.

In the same way, generic functions are a higher-level version of OO 
-- you get to think in terms of a domain's abstract operations, like 
implication, overriding, and combination in this example.

The domain abstractions are not an "interface", nor are they methods 
or object types.  They're more like "concepts", except that the term 
"concept" has been abused to refer to much lower-level things that 
can attach to only one object within an operation.

The concept of implication is that there are imply-ers and imply-ees 
-- a role for each argument, each of which is an implicit interface 
or abstract object type.

In traditional OO and even interfaces, there are considerable limits 
to your ability to specify such partial interfaces and the 
relationships between them, forcing you to choose arbitrary and 
implementation-defined organization to put them in.  You then have to 
force-fit objects to have the right methods, because you didn't 
define an x.is_implied_by(y) relationship, only a x.implies(y) relationship.

Thing is, a *relationship* doesn't belong to one side or the other -- 
it's a *relationship*.  A third, independent thing.  Like a GF method.

In any program, these relationships already exist, and you still have 
to understand them.  They're just forced into whatever pattern the 
designer chose or had thrust upon them to make them fit the 
at-best-binary nature of OO methods, instead of called out as 
explicit relationships, following the form of the problem domain.


>I realize that subclasses are theoretically just as arbitrary, but
>they aren't in practice.

Right -- and neither are generic functions in normal usage.  The only 
reason you think that subclasses aren't arbitrary is because you're 
used to the ways that things get force-fitted into those 
relationships.  Whereas, with GF's, the program can simply model the 
application domain relationships, and you're going to know what 
patterns will follow because they'll reflect the application domain.

For example, if you see implies() and combine_actions() and 
overrides(), are you going to have any problems knowing when you see 
a type, whether these GF's might have methods for that type?  You'll 
know when to *look* for such a method, because you know what roles 
the arguments play in each GF.  If the type might play such a role, 
then you'll want to know *how* it plays that role in connection with 
specific collaborators or circumstances -- and you'll know what 
method implementations to look for.

It's ridiculously simple in practice, even though it sounds hard in 
theory.  That's the very problem in fact -- in neither subclassing 
nor GF's can you solve such problems *in theory*.  You can only solve 
them in *practice*, because it's only in the context of a specific 
program that you have any domain knowledge to apply -- i.e., 
knowledge about what general kinds of things the program is supposed 
to do and what general kinds of things it does them with.

If you have that general knowledge, it's just as easy to handle one 
organization as the other -- but the GF-based version gives you the 
option of having a module that defines lots of basic "kinds of things 
it's supposed to do" up front, so that you have an idea of how to 
understand the "things it does them with" when you encounter them.


>You can certainly say now that configuration specialization should be
>in one place, and that dispatching on parameter patterns like
>
>(*            # ignored
>, :int        # actual int subclass
>, :Container  # meets the Container ABC
>, 4<val<17.3  # value-specific rule
>)
>
>is a bad idea

But I *don't* say that.  What I say is that in practice, there are 
only a few natural places to *put* such a definition:

* near the definition of Container (or int, but that's a builtin in this case)

* near the definition of the generic function being overloaded

* in a "concern-based" grouping, e.g. an appropriate module that 
groups together matters for some application-domain concept.  (For 
example, an "ordering_policy" module might contain overrides for a 
variety of generic functions that relate to inventory, shipping, and 
billing, within the context of placing orders.)

* in an application-designated catchall location

Which of these locations is "best" depends on the overall size of the 
program.  A one-module program is certainly small enough to not need 
to pick one.  As a system gets bigger, some of the other usage 
patterns become more applicable.


>-- but whenever I look at an application from the
>outside, well-organized configuration data is a rare exception.

That may be -- but one enormous advantage of generic functions is 
that you can always relocate your method definitions to a different 
module or different part of the same module without affecting the 
meaning of the program, as long as all the destination modules are 
imported by the time you execute any of the functions.

In other words, if a program is messy, you can clean it up -- heck, 
it's potentially safer to do with an automatic refactoring tool, than 
other types of refactorings in Python.  (e.g., changing the signature 
of a 'foo()' method is difficult to do safely because you don't 
necessarily know whether two arbitrary methods *named* 'foo' are 
semantically the same, whereas generic functions are objects, not names.)
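
As a rough illustration of the "objects, not names" point, here is a 
sketch using functools.singledispatch, a single-dispatch generic 
function that was added to the standard library later and is *not* 
the PEP 3124 API.  The classes and the shipping_cost function are 
made up for the example.  The two register() blocks could live in any 
module that gets imported before the first call, without changing 
what the program means.

```python
from functools import singledispatch


class Book:
    pass


class Piano:
    pass


@singledispatch
def shipping_cost(item):
    # Fallback when no more specific method has been registered.
    raise NotImplementedError("no shipping rule for %r" % type(item))


# Each registration refers to the generic function *object*, so it can be
# moved to another module without any name-based ambiguity.
@shipping_cost.register(Book)
def _(item):
    return 5


@shipping_cost.register(Piano)
def _(item):
    return 500
```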


From pje at telecommunity.com  Mon Jul 30 22:10:08 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 30 Jul 2007 16:10:08 -0400
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070727162212.60E2F3A40E6@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
	<46A2AE31.2080105@canterbury.ac.nz>
	<20070722020422.5AAAC3A403A@sparrow.telecommunity.com>
	<46A3ECB7.9070504@canterbury.ac.nz>
	<20070723010750.E27693A40A9@sparrow.telecommunity.com>
	<46A453C7.9070407@acm.org>
	<20070723153031.D00273A403D@sparrow.telecommunity.com>
	<5d44f72f0707262227o6fcf8471ja6654910c7ee07e0@mail.gmail.com>
	<ca471dc20707270825j3e53c11dyb2064468f3665c14@mail.gmail.com>
	<20070727162212.60E2F3A40E6@sparrow.telecommunity.com>
Message-ID: <20070730201511.C14ED3A406B@sparrow.telecommunity.com>

At 12:20 PM 7/27/2007 -0400, Phillip J. Eby wrote:
>At 08:25 AM 7/27/2007 -0700, Guido van Rossum wrote:
> >Basic GFs, great. Before/after/around, good. Other method
> >combinations, fine. But GFs in classes and subclassing? Not until we
> >have a much better design.
>
>Sounds reasonable to me.  The only time I actually use them in
>classes myself is to override existing generic functions that live
>outside the class, like ones from an Interface or a standalone generic.
>
>The main reason I included GFs-in-classes examples in the PEP is
>because of the "dynamic overloading" meme.  In C++, Java, etc., you
>can use overloading in methods, so I wanted to show how you could do
>that, if you wanted to.
>
>I suspect that the simplest way to fix this in Py3K is with an
>"overloading" metaclass, as it would not even require any
>decorators.  That is, you could provide a custom dictionary that
>records every definition of a function with the same name.  The
>actual metaclass creation process would check for a method of the
>same name in a base class, and if it's generic (or the current class
>added more than one method), put a generic method in.
>
>With a little bit of work, you could probably determine whether you
>could get away with dropping the genericness in a subclass;
>specifically, if all the subclass-defined methods are "more specific"
>than all base class methods, then there's no need for them to be in
>the same generic function, unless they make next_method calls.  Thus,
>you'll end up with normal methods except where absolutely necessary.
>
>Such a metaclass would make method overloads look pretty much the
>same as in OO languages with static overloading.  The only remaining
>hole at that point would be reconciling super() and next_method.  If
>you're using this metaclass, super() is only meaningful if you're not
>in the same generic function as is used in your base, while
>next_method() is only meaningful if you *are*.
>
>I don't know of any quick way to fix that, but I'll give it some thought.

I think I see how to resolve next_method() and super() now: if you 
create a new GF in a subclass, you just define its default_action to 
be something that calls super().  Then, you just use next_method() 
instead of super().

Currently the default default_action is a NoMethodFound action, but 
replacing it for a given GF is a piece of cake.  So, an "overloading" 
metaclass could be written that would:

1. Use __prepare__ to catch multiple function assignments to the same 
name, converting them to overloads

2. Decide whether to combine those overloads with an existing generic 
in the base classes, or to create a new generic and chain it with a 
super() default action.

3. Automatically make the class object part of the overload 
registrations for 'self'.
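
A rough sketch of those three steps follows, heavily simplified.  It 
assumes dispatch on only the first argument after 'self', uses 
isinstance() plus MRO length as a stand-in for a real "most specific" 
rule, and falls back to super() when nothing matches.  The names 
OverloadDict, OverloadingMeta and make_dispatcher are hypothetical.

```python
import inspect


class OverloadDict(dict):
    """Class-suite namespace that remembers every function bound to a name."""

    def __init__(self):
        super().__init__()
        self.overloads = {}

    def __setitem__(self, name, value):
        if inspect.isfunction(value):
            self.overloads.setdefault(name, []).append(value)
        super().__setitem__(name, value)


def make_dispatcher(cls, name, functions):
    table = []
    for fn in functions:
        params = list(inspect.signature(fn).parameters.values())
        annotation = (params[1].annotation if len(params) > 1
                      else inspect.Parameter.empty)
        if annotation is inspect.Parameter.empty:
            annotation = object
        table.append((annotation, fn))

    def dispatcher(self, x):
        matches = [(t, fn) for t, fn in table if isinstance(x, t)]
        if matches:
            # Crude specificity rule: the longest MRO wins.
            _, fn = max(matches, key=lambda m: len(m[0].__mro__))
            return fn(self, x)
        # Default action: chain to the base classes, as with super().
        return getattr(super(cls, self), name)(x)

    dispatcher.is_generic = True
    dispatcher.__name__ = name
    return dispatcher


class OverloadingMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        return OverloadDict()

    def __new__(mcls, name, bases, namespace, **kwds):
        cls = super().__new__(mcls, name, bases, dict(namespace), **kwds)
        for attr, functions in namespace.overloads.items():
            inherited = any(
                getattr(getattr(base, attr, None), "is_generic", False)
                for base in bases)
            if len(functions) > 1 or inherited:
                setattr(cls, attr, make_dispatcher(cls, attr, functions))
        return cls


class A(metaclass=OverloadingMeta):
    def show(self, x: int):
        return "A int"

    def show(self, x: str):
        return "A str"


class B(A):
    def show(self, x: bool):
        return "B bool"
```

Here B().show(True) is handled in B, while B().show(3) and 
B().show("s") fall through to the generic in A.  Unlike the proposal 
above, this sketch always creates a new dispatcher in the subclass 
rather than deciding whether to merge with the base class generic.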

The principal downside to this approach is that only one metaclass 
can provide a __prepare__ dictionary, which means it's even more 
difficult to combine metaclasses than it is in today's Python -- 
which means I want to give a little more thought to PEP 3115, to see 
if there is any way to at least emulate the "derived metaclass rule" 
for __prepare__, that Python currently enforces for the base classes.

In other words, a class' metaclass has to be a derivative of all its 
bases' metaclasses; ISTM that a __prepare__ namespace needs to be a 
derivative in some sense of all its bases' __prepare__ results.  This 
probably isn't enforceable, but the pattern should be documented such 
that e.g. the overloading metaclass' __prepare__ would return a 
mapping that delegates operations to the mapping returned by its 
super()'s __prepare__, and the actual class creation would be 
similarly chained.  PEP 3115 probably needs a section to explain 
these issues and recommend best practices for implementing 
__prepare__ and class creation on that basis.  I'll write something 
up after I've thought this through some more.
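
For concreteness, one possible shape of that delegation pattern is 
sketched below.  It is an assumption about how such chaining could 
look, not anything PEP 3115 specifies: each metaclass wraps the 
mapping returned by its super()'s __prepare__, and unwraps it again 
before passing it up the chain in __new__.  All class names here are 
hypothetical.

```python
from collections.abc import MutableMapping


class Delegating(MutableMapping):
    """Namespace wrapper that forwards all storage to an inner mapping."""

    def __init__(self, inner):
        self.inner = inner

    def __getitem__(self, key):
        return self.inner[key]

    def __setitem__(self, key, value):
        self.inner[key] = value

    def __delitem__(self, key):
        del self.inner[key]

    def __iter__(self):
        return iter(self.inner)

    def __len__(self):
        return len(self.inner)


class Ordering(Delegating):
    def __init__(self, inner):
        super().__init__(inner)
        self.order = []

    def __setitem__(self, key, value):
        if key not in self.order:
            self.order.append(key)
        super().__setitem__(key, value)


class Counting(Delegating):
    def __init__(self, inner):
        super().__init__(inner)
        self.counts = {}

    def __setitem__(self, key, value):
        self.counts[key] = self.counts.get(key, 0) + 1
        super().__setitem__(key, value)


class OrderMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        return Ordering(super().__prepare__(name, bases, **kwds))

    def __new__(mcls, name, bases, namespace, **kwds):
        # Unwrap one layer and let the next metaclass handle the rest.
        cls = super().__new__(mcls, name, bases, namespace.inner, **kwds)
        cls.definition_order = [
            key for key in namespace.order if not key.startswith("__")]
        return cls


class CountMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        return Counting(super().__prepare__(name, bases, **kwds))

    def __new__(mcls, name, bases, namespace, **kwds):
        cls = super().__new__(mcls, name, bases, namespace.inner, **kwds)
        cls.assignment_counts = dict(namespace.counts)
        return cls


class CombinedMeta(OrderMeta, CountMeta):
    """Derived metaclass: both namespaces take part via super() chaining."""


class Example(metaclass=CombinedMeta):
    a = 1
    b = 2
    a = 3
```

With this arrangement the combined metaclass needs no code of its 
own, which is roughly what the "derived metaclass rule" gives you for 
ordinary metaclass methods.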

But I think this wraps up the overall question of *how* to integrate 
methods and GFs in a way that supports a more C++/Java-like 
overloading style (i.e., no decorators on individual overloads within 
a class).  The main drawback is that it's a silent error if you leave 
off the metaclass.

Another option of course would be to make this part of the default 
metaclass, but that would bring in the issue of needing a standard 
API (and default implementation) for GF's.

In the meantime, though, it's nice to see a practical application for 
PEP 3115 -- i.e., implementing transparent Java-style 
overloading.  It's absolutely not possible in 2.x without decorators, 
both because of the lack of argument annotations and the lack of a 
__prepare__-controlled class-suite namespace.


From martin at v.loewis.de  Mon Jul 30 23:39:40 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 30 Jul 2007 23:39:40 +0200
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
In-Reply-To: <ca471dc20707301027m71b91ffbnbfb31a9075e66d8b@mail.gmail.com>
References: <f7ithr$lrr$1@sea.gmane.org>	
	<ca471dc20707181002w64e076aco9a509ec7e4e15b9a@mail.gmail.com>	
	<f7lk7q$9m6$1@sea.gmane.org>	
	<ca471dc20707181113m360db736h2fd079f29f71220@mail.gmail.com>	
	<f7lnd8$l2s$1@sea.gmane.org>	
	<ca471dc20707181158p17417c9cg37c5382d61b53fe5@mail.gmail.com>	
	<f8is94$tgh$1@sea.gmane.org> <f8j862$st0$1@sea.gmane.org>	
	<ca471dc20707291727q6438b6eax5eaadbb6c712ef29@mail.gmail.com>	
	<46AD6E67.50407@v.loewis.de>
	<ca471dc20707301027m71b91ffbnbfb31a9075e66d8b@mail.gmail.com>
Message-ID: <46AE5A9C.5000103@v.loewis.de>

Guido van Rossum schrieb:
>> I found that in many cases, this is a virus scanner or the indexing
>> service interfering. They open the file, and then the test suite cannot
>> delete it.
> 
> Oh darn. I remember running into that in a completely different
> context. What's the solution? Turn off the virus scanner? Wait until
> it's done? 

I never found the time to properly research the official solution.

Looking at the DeleteFile documentation, the problem is slightly
different, still: "The DeleteFile function marks a file for deletion on
close. Therefore, the file deletion does not occur until the last handle
to the file is closed. Subsequent calls to CreateFile to open the file
fail with ERROR_ACCESS_DENIED."

So it is not the DeleteFile that fails, but the subsequent attempt
to create a new file in the same place.

For the test suite, the solution would be to always use a fresh file
name for temporary files. Of course, it is then more important that
all files created actually do get removed in the fixture.
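
A minimal sketch of that approach for a unittest-style test, assuming 
the standard tempfile module is acceptable in the test suite (the 
class and method names are made up):

```python
import os
import tempfile
import unittest


class FreshTempFileTest(unittest.TestCase):
    def setUp(self):
        # mkstemp picks a name that did not exist before, so a file from an
        # earlier test that is still pending deletion cannot get in the way.
        fd, self.path = tempfile.mkstemp(prefix="test_")
        os.close(fd)
        self.addCleanup(self.remove_file)

    def remove_file(self):
        try:
            os.remove(self.path)
        except FileNotFoundError:
            pass

    def test_write_and_read(self):
        with open(self.path, "w") as f:
            f.write("data")
        with open(self.path) as f:
            self.assertEqual(f.read(), "data")
```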

Regards,
Martin

From greg.ewing at canterbury.ac.nz  Tue Jul 31 03:18:41 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 31 Jul 2007 13:18:41 +1200
Subject: [Python-3000] optimizing [x]range
In-Reply-To: <18093.18996.944540.279864@montanaro.dyndns.org>
References: <1d85506f0707280806n1764151cx4961a0573dda435e@mail.gmail.com>
	<ca471dc20707281504g704ff836m8aeaa966559483d3@mail.gmail.com>
	<46ABE025.4050204@async.com.br>
	<18092.34217.512107.677855@montanaro.dyndns.org>
	<46AD30DA.6050405@canterbury.ac.nz>
	<18093.18996.944540.279864@montanaro.dyndns.org>
Message-ID: <46AE8DF1.1030409@canterbury.ac.nz>

skip at pobox.com wrote:
> Given that I find the cascading comparisons clearer I see no reason
> to optimize the "in range(...)" case.

The sort of thing I have in mind is where I have a sequence
that I want to frequently iterate over the indices of, so
I do

   r = xrange(len(myseq))

so I can write

   for i in r:
     ...

Having done that, if I want to test whether some index j
is within the range of indices for this sequence, it
seems natural to write

   if j in r:
     ...

Given the context, I think this is a very Obvious Way To
Do It, and it's surprising that it isn't as efficient as it
looks like it should be.
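
The efficient test I have in mind needs no iteration at all.  A sketch 
in pure Python, for integer arguments only (in_range is a made-up 
helper name, not a proposed API):

```python
def in_range(j, start, stop, step=1):
    """Constant-time equivalent of `j in range(start, stop, step)` for ints."""
    if step == 0:
        raise ValueError("step must not be zero")
    if step > 0:
        if not (start <= j < stop):
            return False
    else:
        if not (stop < j <= start):
            return False
    # j is inside the bounds; it is a member only if it lies on the stride.
    return (j - start) % step == 0
```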

--
Greg

From greg.ewing at canterbury.ac.nz  Tue Jul 31 03:26:28 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 31 Jul 2007 13:26:28 +1200
Subject: [Python-3000] base64 - bytes and strings
In-Reply-To: <46AD3D09.9060006@acm.org>
References: <3f1451f50707281847q2171f82fu2e48f2297214f591@mail.gmail.com>
	<46AC1E78.7050900@v.loewis.de> <46AD2F60.9050907@canterbury.ac.nz>
	<ca471dc20707291743l5eefd72cx28ca281c451e15ba@mail.gmail.com>
	<46AD3D09.9060006@acm.org>
Message-ID: <46AE8FC4.8080602@canterbury.ac.nz>

Talin wrote:
> I believe that converting a Unicode string to a base64 encoded form is 
> necessarily a 2-step process.

Well, yes, but only because base64 itself takes arbitrary
binary data as input, not Unicode strings. Encoding *anything*
other than binary data as base64 is going to require an
extra step in that sense.

> So the fact 
> that you can vary one encoding without changing the other would seem to 
> argue for the notion that they are distinct and independent.

I would say that the first encoding is outside the scope of
base64 and therefore irrelevant to this discussion.
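
To make the two steps concrete, here is a small sketch using the 
base64 module as it exists in Python 3 today (the sample string is 
arbitrary).  The choice of text encoding happens entirely before 
base64 gets involved:

```python
import base64

text = "na\u00efve caf\u00e9"

# Step 1 (outside base64): pick a text encoding to obtain binary data.
as_utf8 = text.encode("utf-8")
as_latin1 = text.encode("latin-1")

# Step 2 (base64 proper): binary data in, base64 data out.
b64_utf8 = base64.b64encode(as_utf8)
b64_latin1 = base64.b64encode(as_latin1)

# Decoding reverses the steps in the opposite order.
round_trip = base64.b64decode(b64_utf8).decode("utf-8")
```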

--
Greg


From greg.ewing at canterbury.ac.nz  Tue Jul 31 03:33:43 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 31 Jul 2007 13:33:43 +1200
Subject: [Python-3000] base64 - bytes and strings
In-Reply-To: <f8jkb8$mre$1@sea.gmane.org>
References: <3f1451f50707281847q2171f82fu2e48f2297214f591@mail.gmail.com>
	<46ABFE12.5000101@gmail.com> <46AD2BF6.5080507@canterbury.ac.nz>
	<f8jkb8$mre$1@sea.gmane.org>
Message-ID: <46AE9177.4010109@canterbury.ac.nz>

Terry Reedy wrote:
> On the contrary, to me, the point of base64 is to encode bytes into a 
> subset of bytes more or less guaranteed to not get mangled during 
> transport.

Yes, and the way it goes about it is to map the binary
data to a sequence of characters, the reasoning being
that most such channels can at least encode those characters
somehow, because they're designed for the purpose of sending
text.

> That these safe bytes correspond to ascii chars

They only correspond to ASCII character *codes* when the
channel in question is designed to transmit text encoded
in ASCII. If the channel were designed to transmit text
encoded in EBCDIC or some other way, then ASCII codes would
likely get mangled just as badly as raw binary data.

--
Greg

From stephen at xemacs.org  Tue Jul 31 03:54:23 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 31 Jul 2007 10:54:23 +0900
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
In-Reply-To: <f8l2l4$2d7$1@sea.gmane.org>
References: <f7ithr$lrr$1@sea.gmane.org>
	<ca471dc20707181002w64e076aco9a509ec7e4e15b9a@mail.gmail.com>
	<f7lk7q$9m6$1@sea.gmane.org>
	<ca471dc20707181113m360db736h2fd079f29f71220@mail.gmail.com>
	<f7lnd8$l2s$1@sea.gmane.org>
	<ca471dc20707181158p17417c9cg37c5382d61b53fe5@mail.gmail.com>
	<f8is94$tgh$1@sea.gmane.org> <f8j862$st0$1@sea.gmane.org>
	<ca471dc20707291727q6438b6eax5eaadbb6c712ef29@mail.gmail.com>
	<46AD6E67.50407@v.loewis.de> <f8l2l4$2d7$1@sea.gmane.org>
Message-ID: <87myxde3sg.fsf@uwakimon.sk.tsukuba.ac.jp>

Joe Smith writes:

 > Cygwin's setlocale function only supports the "C" locale.
 > I am a bit surprised that ASCII is returned rather than the system's default 
 > encoding.

If I understand the situation correctly, you shouldn't be.  The C
locale is defined to use ASCII.


From greg.ewing at canterbury.ac.nz  Tue Jul 31 03:38:07 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 31 Jul 2007 13:38:07 +1200
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <46AD5DD4.8000509@acm.org>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
	<46A2AE31.2080105@canterbury.ac.nz>
	<20070722020422.5AAAC3A403A@sparrow.telecommunity.com>
	<46A3ECB7.9070504@canterbury.ac.nz>
	<20070723010750.E27693A40A9@sparrow.telecommunity.com>
	<46A453C7.9070407@acm.org>
	<20070723153031.D00273A403D@sparrow.telecommunity.com>
	<5d44f72f0707262227o6fcf8471ja6654910c7ee07e0@mail.gmail.com>
	<ca471dc20707270825j3e53c11dyb2064468f3665c14@mail.gmail.com>
	<20070727162212.60E2F3A40E6@sparrow.telecommunity.com>
	<46AD5DD4.8000509@acm.org>
Message-ID: <46AE927F.8080102@canterbury.ac.nz>

Talin wrote:
> So basically what I would propose is that we simply say that we don't 
> mix normal overloading and multi-method dispatch until PJE comes up with 
> his better solution.

Maybe this should be enforced, i.e. only allow global functions
and class or static methods to be GFs, not regular methods.

--
Greg

From greg.ewing at canterbury.ac.nz  Tue Jul 31 03:45:33 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 31 Jul 2007 13:45:33 +1200
Subject: [Python-3000] test_asyncore fails intermittently on Darwin
In-Reply-To: <2cda2fc90707292340k7eb11f2w82003e6f705438c3@mail.gmail.com>
References: <2cda2fc90707261505tdd9a0f1t861b5801c37ad11e@mail.gmail.com>
	<1d36917a0707261618oac94f20l98f464a2ab1edc4e@mail.gmail.com>
	<2cda2fc90707292338pff060c1i810737dcf6d5df54@mail.gmail.com>
	<2cda2fc90707292340k7eb11f2w82003e6f705438c3@mail.gmail.com>
Message-ID: <46AE943D.1040105@canterbury.ac.nz>

Hasan Diwan wrote:
> The issue seems to be in the socket.py close method. It needs to sleep
> socket.SO_REUSEADDR seconds before returning.

WHAT??? socket.SO_REUSEADDR is a flag that you pass when
creating a socket to tell it to re-use an existing address,
not something to be used as a timeout value, as far as
I know.
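
For reference, a sketch of how the flag is normally used: it is set as 
a socket option before bind(), and its numeric value has nothing to do 
with time.  The make_listener name is made up for the example.

```python
import socket


def make_listener(host="127.0.0.1", port=0):
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_REUSEADDR is an option *name*; the value 1 switches it on.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(1)
    return sock
```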

--
Greg

From greg.ewing at canterbury.ac.nz  Tue Jul 31 03:58:01 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 31 Jul 2007 13:58:01 +1200
Subject: [Python-3000] io library/PEP 3116 bits
In-Reply-To: <18093.58350.892824.688493@montanaro.dyndns.org>
References: <18093.58350.892824.688493@montanaro.dyndns.org>
Message-ID: <46AE9729.6050507@canterbury.ac.nz>

skip at pobox.com wrote:
> The other thing I wanted to comment on is the default value for n in the
> various read methods.  In some places it's -1 (why not zero? *),

Maybe because reading 0 bytes already has a well-defined
(if not particularly useful) meaning?

You probably wouldn't use it explicitly, but it could
arise as the result of a calculation, and it would then
need to be special-cased if it had a reserved meaning.
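
A small sketch of what I mean, using io.BytesIO from the io module as 
it exists today (read_remaining is a made-up helper):

```python
import io


def read_remaining(stream, total, consumed):
    # The count is computed, so it can legitimately come out as 0.
    return stream.read(total - consumed)


stream = io.BytesIO(b"abcdef")
nothing = read_remaining(stream, 6, 6)   # read(0): empty result, no movement
position = stream.tell()
everything = stream.read(-1)             # read(-1): read up to end of stream
```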

> (*) A few days ago at work I saw someone check in a piece of code with
> 
>     f.read(-1)

That does look strange. Maybe the result of someone
reading the docs and failing to notice that there was
an easier spelling.

--
Greg

From talin at acm.org  Tue Jul 31 04:06:48 2007
From: talin at acm.org (Talin)
Date: Mon, 30 Jul 2007 19:06:48 -0700
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070730201511.C14ED3A406B@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>	<46A19FCC.7070609@acm.org>	<20070721181442.48FB03A403A@sparrow.telecommunity.com>	<46A2AE31.2080105@canterbury.ac.nz>	<20070722020422.5AAAC3A403A@sparrow.telecommunity.com>	<46A3ECB7.9070504@canterbury.ac.nz>	<20070723010750.E27693A40A9@sparrow.telecommunity.com>	<46A453C7.9070407@acm.org>	<20070723153031.D00273A403D@sparrow.telecommunity.com>	<5d44f72f0707262227o6fcf8471ja6654910c7ee07e0@mail.gmail.com>	<ca471dc20707270825j3e53c11dyb2064468f3665c14@mail.gmail.com>	<20070727162212.60E2F3A40E6@sparrow.telecommunity.com>
	<20070730201511.C14ED3A406B@sparrow.telecommunity.com>
Message-ID: <46AE9938.6070802@acm.org>

Phillip J. Eby wrote:

> The principal downside to this approach is that only one metaclass 
> can provide a __prepare__ dictionary, which means it's even more 
> difficult to combine metaclasses than it is in today's Python -- 
> which means I want to give a little more thought to PEP 3115, to see 
> if there is any way to at least emulate the "derived metaclass rule" 
> for __prepare__, that Python currently enforces for the base classes.

I would love any improvements to PEP 3115 that you can think of.

-- Talin


From greg.ewing at canterbury.ac.nz  Tue Jul 31 04:19:44 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 31 Jul 2007 14:19:44 +1200
Subject: [Python-3000] io library/PEP 3116 bits
In-Reply-To: <ca471dc20707301020h49f89131k944fe32036708628@mail.gmail.com>
References: <18093.58350.892824.688493@montanaro.dyndns.org>
	<ca471dc20707301020h49f89131k944fe32036708628@mail.gmail.com>
Message-ID: <46AE9C40.8040003@canterbury.ac.nz>

Guido van Rossum wrote:
> I don't think \r needs to be supported -- OSX uses \n;

Not always. It's still possible to come across situations
where dealing with \r is necessary, when using Classic
applications or OSX ports of them. I think it would be
premature to drop support for \r at this stage.
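
For illustration, this is how universal newline handling treats a lone 
\r in the io module as it exists today (a sketch; the sample data is 
made up):

```python
import io

raw = io.BytesIO(b"first\rsecond\r\nthird\n")

# newline=None enables universal newlines: \r, \r\n and \n all end a line
# and are translated to \n on input.
text = io.TextIOWrapper(raw, encoding="ascii", newline=None)
lines = text.readlines()
```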

--
Greg


From guido at python.org  Tue Jul 31 05:41:36 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 30 Jul 2007 20:41:36 -0700
Subject: [Python-3000] optimizing [x]range
In-Reply-To: <46AE8DF1.1030409@canterbury.ac.nz>
References: <1d85506f0707280806n1764151cx4961a0573dda435e@mail.gmail.com>
	<ca471dc20707281504g704ff836m8aeaa966559483d3@mail.gmail.com>
	<46ABE025.4050204@async.com.br>
	<18092.34217.512107.677855@montanaro.dyndns.org>
	<46AD30DA.6050405@canterbury.ac.nz>
	<18093.18996.944540.279864@montanaro.dyndns.org>
	<46AE8DF1.1030409@canterbury.ac.nz>
Message-ID: <ca471dc20707302041j7c834590xf0d315a2f3d3baaa@mail.gmail.com>

On 7/30/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> The sort of thing I have in mind is where I have a sequence
> that I want to frequently iterate over the indices of, so
> I do
>
>    r = xrange(len(myseq))
>
> so I can write
>
>    for i in r:
>      ...
>
> Having done that, if I want to test whether some index j
> is within the range of indices for this sequence, it
> seems natural to write
>
>    if j in r:
>      ...
>
> Given the context, I think this is a very Obvious Way To
> Do It, and it's surprising that it isn't as efficient as it
> looks like it should be.

Fair enough. So maybe *you* can contribute a patch?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From python at rcn.com  Tue Jul 31 06:09:47 2007
From: python at rcn.com (Raymond Hettinger)
Date: Mon, 30 Jul 2007 21:09:47 -0700
Subject: [Python-3000] optimizing [x]range
References: <1d85506f0707280806n1764151cx4961a0573dda435e@mail.gmail.com><ca471dc20707281504g704ff836m8aeaa966559483d3@mail.gmail.com><46ABE025.4050204@async.com.br><18092.34217.512107.677855@montanaro.dyndns.org><46AD30DA.6050405@canterbury.ac.nz><18093.18996.944540.279864@montanaro.dyndns.org><46AE8DF1.1030409@canterbury.ac.nz>
	<ca471dc20707302041j7c834590xf0d315a2f3d3baaa@mail.gmail.com>
Message-ID: <00b701c7d328$a5cd0d60$f101a8c0@RaymondLaptop1>


>> Having done that, if I want to test whether some index j
>> is within the range of indices for this sequence, it
>> seems natural to write
>>
>>    if j in r:
>>      ...
> 
> Fair enough. So maybe *you* can contribute a patch?

And maybe we can do the same for xrange() in Py2.6


Raymond

From greg.ewing at canterbury.ac.nz  Tue Jul 31 06:29:47 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 31 Jul 2007 16:29:47 +1200
Subject: [Python-3000] base64 - bytes and strings
In-Reply-To: <46AD52CC.4060403@ronadam.com>
References: <3f1451f50707281847q2171f82fu2e48f2297214f591@mail.gmail.com>
	<46ABFE12.5000101@gmail.com> <46AD2BF6.5080507@canterbury.ac.nz>
	<46AD52CC.4060403@ronadam.com>
Message-ID: <46AEBABB.5030902@canterbury.ac.nz>

Ron Adam wrote:
> Not extra, you just need to make sure your binary data is in the correct 
> range of values the text device you are sending to can handle.

Does this mean that Py3k text streams will accept byte arrays
in their write() methods, and that byte arrays can be concatenated
with unicode strings and otherwise used in any context expecting
a text string, as long as all their elements are in the ASCII range?

If that's true, then some of my objection is mitigated.

--
Greg

From foom at fuhm.net  Tue Jul 31 07:37:40 2007
From: foom at fuhm.net (James Y Knight)
Date: Tue, 31 Jul 2007 01:37:40 -0400
Subject: [Python-3000] base64 - bytes and strings
In-Reply-To: <ca471dc20707301024q4b5a82cdjeba1dc7461efc12c@mail.gmail.com>
References: <3f1451f50707281847q2171f82fu2e48f2297214f591@mail.gmail.com>
	<46AC1E78.7050900@v.loewis.de> <46AD2F60.9050907@canterbury.ac.nz>
	<ca471dc20707291743l5eefd72cx28ca281c451e15ba@mail.gmail.com>
	<46AD3D09.9060006@acm.org> <46AD68A1.8030403@v.loewis.de>
	<46ADEA81.8000609@latte.ca>
	<ca471dc20707301024q4b5a82cdjeba1dc7461efc12c@mail.gmail.com>
Message-ID: <92FDE9DA-D6C1-47CE-807E-ACA0544C7CEE@fuhm.net>

On Jul 30, 2007, at 1:24 PM, Guido van Rossum wrote:
> I think you're missing the point, the point being that the most common
> use needs bytes, so returning bytes is the most useful API design.

I'd say that encoding binary data in XML is at least in the running  
for most common use of base64. And for that use case, you'll need it  
as a text string, I think?
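
To make that use case concrete, here is a minimal sketch, assuming the
bytes-returning API Guido describes and the xml.etree.ElementTree
module; the element name "blob" and the sample payload are only
illustrative. The single extra step for XML is an ASCII decode:

```python
import base64
import xml.etree.ElementTree as ET

payload = bytes([0, 255, 16, 32])       # arbitrary binary data

# b64encode() hands back bytes; XML text nodes want a text string,
# so the result is decoded as ASCII before being stored.
encoded = base64.b64encode(payload)
elem = ET.Element("blob")               # hypothetical element name
elem.text = encoded.decode("ascii")

xml_text = ET.tostring(elem, encoding="unicode")

# Round trip: parse the XML again and decode the text node.
restored = base64.b64decode(ET.fromstring(xml_text).text)
```

So the XML case costs one .decode("ascii") call, while callers that
write to sockets or files can use the bytes result directly.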

James

From martin at v.loewis.de  Tue Jul 31 08:07:18 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 31 Jul 2007 08:07:18 +0200
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
In-Reply-To: <87myxde3sg.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <f7ithr$lrr$1@sea.gmane.org>	<ca471dc20707181002w64e076aco9a509ec7e4e15b9a@mail.gmail.com>	<f7lk7q$9m6$1@sea.gmane.org>	<ca471dc20707181113m360db736h2fd079f29f71220@mail.gmail.com>	<f7lnd8$l2s$1@sea.gmane.org>	<ca471dc20707181158p17417c9cg37c5382d61b53fe5@mail.gmail.com>	<f8is94$tgh$1@sea.gmane.org>
	<f8j862$st0$1@sea.gmane.org>	<ca471dc20707291727q6438b6eax5eaadbb6c712ef29@mail.gmail.com>	<46AD6E67.50407@v.loewis.de>
	<f8l2l4$2d7$1@sea.gmane.org>
	<87myxde3sg.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <46AED196.4050506@v.loewis.de>

>  > Cygwin's setlocale function only supports the "C" locale.
>  > I am a bit surprised that ASCII is returned rather than the system's default 
>  > encoding.
> 
> If I understand the situation correctly, you shouldn't be.  The C
> locale is defined to use ASCII.

I think you don't. I'm certain that standard C doesn't define the C
locale to be ASCII, and I believe POSIX doesn't, either. What they do
define is that the "basic execution character set" must be in it
(or some such). However, in absence of better knowledge, assuming
ASCII is the best choice that the library can make.

Regards,
Martin

From ncoghlan at gmail.com  Tue Jul 31 11:40:12 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 31 Jul 2007 19:40:12 +1000
Subject: [Python-3000] pep 3124 plans
In-Reply-To: <20070730201511.C14ED3A406B@sparrow.telecommunity.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>	<46A19FCC.7070609@acm.org>	<20070721181442.48FB03A403A@sparrow.telecommunity.com>	<46A2AE31.2080105@canterbury.ac.nz>	<20070722020422.5AAAC3A403A@sparrow.telecommunity.com>	<46A3ECB7.9070504@canterbury.ac.nz>	<20070723010750.E27693A40A9@sparrow.telecommunity.com>	<46A453C7.9070407@acm.org>	<20070723153031.D00273A403D@sparrow.telecommunity.com>	<5d44f72f0707262227o6fcf8471ja6654910c7ee07e0@mail.gmail.com>	<ca471dc20707270825j3e53c11dyb2064468f3665c14@mail.gmail.com>	<20070727162212.60E2F3A40E6@sparrow.telecommunity.com>
	<20070730201511.C14ED3A406B@sparrow.telecommunity.com>
Message-ID: <46AF037C.9050902@gmail.com>

Phillip J. Eby wrote:
> In other words, a class' metaclass has to be a derivative of all its 
> bases' metaclasses; ISTM that a __prepare__ namespace needs to be a 
> derivative in some sense of all its bases' __prepare__ results.  This 
> probably isn't enforceable, but the pattern should be documented such 
> that e.g. the overloading metaclass' __prepare__ would return a 
> mapping that delegates operations to the mapping returned by its 
> super()'s __prepare__, and the actual class creation would be 
> similarly chained.  PEP 3115 probably needs a section to explain 
> these issues and recommend best practices for implementing 
> __prepare__ and class creation on that basis.  I'll write something 
> up after I've thought this through some more.

A variant of the metaclass rule specific to __prepare__ might look 
something like:
   A class's metaclass providing the __prepare__ method must be a 
subclass of all of the class's base classes providing __prepare__ methods.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From skip at pobox.com  Tue Jul 31 12:22:56 2007
From: skip at pobox.com (skip at pobox.com)
Date: Tue, 31 Jul 2007 05:22:56 -0500
Subject: [Python-3000] optimizing [x]range
In-Reply-To: <ca471dc20707302041j7c834590xf0d315a2f3d3baaa@mail.gmail.com>
References: <1d85506f0707280806n1764151cx4961a0573dda435e@mail.gmail.com>
	<ca471dc20707281504g704ff836m8aeaa966559483d3@mail.gmail.com>
	<46ABE025.4050204@async.com.br>
	<18092.34217.512107.677855@montanaro.dyndns.org>
	<46AD30DA.6050405@canterbury.ac.nz>
	<18093.18996.944540.279864@montanaro.dyndns.org>
	<46AE8DF1.1030409@canterbury.ac.nz>
	<ca471dc20707302041j7c834590xf0d315a2f3d3baaa@mail.gmail.com>
Message-ID: <18095.3456.693480.981533@montanaro.dyndns.org>


    >> if j in r:
    >> ...
    >> 
    >> Given the context, I think this is a very Obvious Way To Do It, and
    >> it's surprising that it isn't as efficient as it looks like it should
    >> be.

    Guido> Fair enough. So maybe *you* can contribute a patch?

Given the nature of this discussion and who you're asking to provide a
patch, I'd rather see a patch for this:

    Python 3.0x (py3k-struni:56553M, Jul 26 2007, 13:34:26) 
    [GCC 4.0.1 (Apple Computer, Inc. build 5367)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> for 0 <= i < 10 by 3:
    ...    print(i)
    ...
    0
    3
    6
    9

:-)  (Yes, I know the language is frozen at this point.)

Also, bringing it back more on-topic, what should the value of this
expression be?

    4 in range(0, 10, 3)

That is, are we treating range() as a set or an interval?  Maybe I missed
earlier messages in this thread where this was discussed, but part of the
discussion focused on this construct

    0 <= 4 < 10

where there was no option to provide a step size.  Also, this particular
notation screams out interval, not set, to me.

Skip

From pje at telecommunity.com  Tue Jul 31 18:26:35 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 31 Jul 2007 12:26:35 -0400
Subject: [Python-3000]  PEP 3115 chaining rules (was Re: pep 3124 plans)
In-Reply-To: <46AF037C.9050902@gmail.com>
References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com>
	<46A19FCC.7070609@acm.org>
	<20070721181442.48FB03A403A@sparrow.telecommunity.com>
	<46A2AE31.2080105@canterbury.ac.nz>
	<20070722020422.5AAAC3A403A@sparrow.telecommunity.com>
	<46A3ECB7.9070504@canterbury.ac.nz>
	<20070723010750.E27693A40A9@sparrow.telecommunity.com>
	<46A453C7.9070407@acm.org>
	<20070723153031.D00273A403D@sparrow.telecommunity.com>
	<5d44f72f0707262227o6fcf8471ja6654910c7ee07e0@mail.gmail.com>
	<ca471dc20707270825j3e53c11dyb2064468f3665c14@mail.gmail.com>
	<20070727162212.60E2F3A40E6@sparrow.telecommunity.com>
	<20070730201511.C14ED3A406B@sparrow.telecommunity.com>
	<46AF037C.9050902@gmail.com>
Message-ID: <20070731162912.3E2C53A40A7@sparrow.telecommunity.com>

At 07:40 PM 7/31/2007 +1000, Nick Coghlan wrote:
>Phillip J. Eby wrote:
>>In other words, a class' metaclass has to be a derivative of all 
>>its bases' metaclasses; ISTM that a __prepare__ namespace needs to 
>>be a derivative in some sense of all its bases' __prepare__ 
>>results.  This probably isn't enforceable, but the pattern should 
>>be documented such that e.g. the overloading metaclass' __prepare__ 
>>would return a mapping that delegates operations to the mapping 
>>returned by its super()'s __prepare__, and the actual class 
>>creation would be similarly chained.  PEP 3115 probably needs a 
>>section to explain these issues and recommend best practices for 
>>implementing __prepare__ and class creation on that basis.  I'll 
>>write something up after I've thought this through some more.
>
>A variant of the metaclass rule specific to __prepare__ might look 
>something like:
>   A class's metaclass providing the __prepare__ method must be a 
> subclass of all of the class's base classes providing __prepare__ methods.

That doesn't really work; among other things, it would require 
everything to be a dict subclass, since type.__prepare__() will 
presumably return a dict.  Therefore, it really does need to be 
delegation instead of inheritance, or it becomes very difficult to 
provide any "interesting" properties.

So let's say that your super().__prepare__() is your "delegate".  And 
we recommend that any write operations you receive, you should also 
invoke on your delegate, and that you delegate any read operations 
you can't handle (i.e., key not found) to your delegate as well.

And of course, this requirement is recursive -- i.e., all metaclasses 
that define a __prepare__() should follow it, in order to be fully 
co-operative.
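
A rough sketch of that delegation pattern, assuming the PEP 3115
syntax (metaclass keyword, classmethod __prepare__). The names
DelegatingNamespace, OrderedMeta and _definition_order are made up for
illustration; recording definition order stands in for whatever
"interesting" property a real metaclass would want:

```python
from collections.abc import MutableMapping


class DelegatingNamespace(MutableMapping):
    """Class-body namespace that chains to the super() __prepare__ result."""

    def __init__(self, delegate):
        self._delegate = delegate   # mapping returned by super().__prepare__()
        self._local = {}
        self.order = []             # names in the order they were first bound

    def __getitem__(self, key):
        try:
            return self._local[key]
        except KeyError:
            # Reads we cannot handle go to the delegate.
            return self._delegate[key]

    def __setitem__(self, key, value):
        if key not in self._local:
            self.order.append(key)
        self._local[key] = value
        # Every write is also invoked on the delegate.
        self._delegate[key] = value

    def __delitem__(self, key):
        if key in self._local:
            del self._local[key]
            self.order.remove(key)
        del self._delegate[key]

    def __iter__(self):
        # All local keys were forwarded, so the delegate sees everything.
        return iter(self._delegate)

    def __len__(self):
        return len(self._delegate)


class OrderedMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        return DelegatingNamespace(super().__prepare__(name, bases, **kwargs))

    def __new__(mcls, name, bases, namespace, **kwargs):
        # Class creation is chained the same way, via super().
        cls = super().__new__(mcls, name, bases, dict(namespace), **kwargs)
        cls._definition_order = [
            key for key in namespace.order if not key.startswith("__")
        ]
        return cls


class Example(metaclass=OrderedMeta):
    b = 1
    a = 2

    def method(self):
        pass
```

A metaclass deriving from OrderedMeta can wrap this namespace in its
own delegating mapping in exactly the same way, which is the recursive
part of the recommendation.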

Actually, speaking of co-operative metaclasses, I wonder if it's time 
to finally implement automatic metaclass mixing in 3.x?  Python 
currently requires you to mix base classes' metaclasses, but doesn't 
provide any assistance in doing so.

For 2.x, I wrote a function that can be called to automatically 
generate a mixed metaclass for a type; perhaps we should include 
something like it in the stdlib, so if you get a mixed metaclasses 
error, the error message itself can suggest using 'metaclass=mixed' 
or whatever we call it.


From unknown_kev_cat at hotmail.com  Tue Jul 31 19:21:41 2007
From: unknown_kev_cat at hotmail.com (Joe Smith)
Date: Tue, 31 Jul 2007 13:21:41 -0400
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
References: <f7ithr$lrr$1@sea.gmane.org>	<ca471dc20707181002w64e076aco9a509ec7e4e15b9a@mail.gmail.com>	<f7lk7q$9m6$1@sea.gmane.org>	<ca471dc20707181113m360db736h2fd079f29f71220@mail.gmail.com>	<f7lnd8$l2s$1@sea.gmane.org>	<ca471dc20707181158p17417c9cg37c5382d61b53fe5@mail.gmail.com>	<f8is94$tgh$1@sea.gmane.org>
	<f8j862$st0$1@sea.gmane.org>	<ca471dc20707291727q6438b6eax5eaadbb6c712ef29@mail.gmail.com>	<46AD6E67.50407@v.loewis.de><ca471dc20707301027m71b91ffbnbfb31a9075e66d8b@mail.gmail.com>
	<46AE5A9C.5000103@v.loewis.de>
Message-ID: <f8nr3b$s25$1@sea.gmane.org>


"Martin v. Löwis" <martin at v.loewis.de> wrote in message 
news:46AE5A9C.5000103 at v.loewis.de...
> Guido van Rossum schrieb:
>>> I found that in many cases, this is a virus scanner or the indexing
>>> service interfering. They open the file, and then the test suite cannot
>>> delete it.
>>
>> Oh darn. I remember running into that in a completely different
>> context. What's the solution? Turn off the virus scanner? Wait until
>> it's done?
>
> I never found the time to properly research the official solution.
>
> Looking at the DeleteFile documentation, the problem is slightly
> different, still: "The DeleteFile function marks a file for deletion on
> close. Therefore, the file deletion does not occur until the last handle
> to the file is closed. Subsequent calls to CreateFile to open the file
> fail with ERROR_ACCESS_DENIED."
>
> So it is not the DeleteFile that fails, but the subsequent attempt
> to create a new file in the same place.
>
> For the test suite, the solution would be to always use a fresh file
> name for temporary files. Of course, it is then more important that
> all files created actually do get removed in the fixture.

Hmm... The documentation for Cygwin's unlink() implies that it should 
function the same as a POSIX unlink() except perhaps if a non-Cygwin process 
has an open handle for it without the correct attributes. I see nothing on 
my system that would have done that. (No indexing service or virus scanner) 
So that implies that at the time Python is trying to create the file, it 
still has an open handle for it. Either that, or something besides Python is 
opening the file without my knowledge.


From guido at python.org  Tue Jul 31 20:06:58 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 31 Jul 2007 11:06:58 -0700
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
In-Reply-To: <f8nr3b$s25$1@sea.gmane.org>
References: <f7ithr$lrr$1@sea.gmane.org> <f7lnd8$l2s$1@sea.gmane.org>
	<ca471dc20707181158p17417c9cg37c5382d61b53fe5@mail.gmail.com>
	<f8is94$tgh$1@sea.gmane.org> <f8j862$st0$1@sea.gmane.org>
	<ca471dc20707291727q6438b6eax5eaadbb6c712ef29@mail.gmail.com>
	<46AD6E67.50407@v.loewis.de>
	<ca471dc20707301027m71b91ffbnbfb31a9075e66d8b@mail.gmail.com>
	<46AE5A9C.5000103@v.loewis.de> <f8nr3b$s25$1@sea.gmane.org>
Message-ID: <ca471dc20707311106v696b3c7an67939bd802b81176@mail.gmail.com>

On 7/31/07, Joe Smith <unknown_kev_cat at hotmail.com> wrote:
> Hmm... The documentation for Cygwin's unlink() implies that it should
> function the same as a POSIX unlink() except perhaps if a non-Cygwin process
> has an open handle for it without the correct attributes. I see nothing on
> my system that would have done that. (No indexing service or virus scanner)
> So that implies that at the time Python is trying to create the file, it
> still has an open handle for it. Either that, or something besides Python is
> opening the file without my knowledge.

Regular Windows typically won't let you remove a file when you still
have it open. Is this also a restriction on CYGWIN? I don't know
anything about CYGWIN but I could imagine that they allow unlink() to
succeed when there's still a file descriptor referencing it, and that
they will delete the file when you close it. But if that fd is never
closed the file is probably in a weird state. Anyway, before we start
speculating more, you probably need to find a source of more CYGWIN
expertise elsewhere -- it's rather thin here.

Rewriting those tests to use a more random temporary file might also
be an option, as long as you make sure to clean up (use try/finally or
setUp/tearDown).
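
As a sketch of the setUp/tearDown variant, using tempfile.mkstemp() to
get a fresh name for every test (the class name, test name and prefix
are made up for illustration):

```python
import os
import tempfile
import unittest


class FreshTempFileTest(unittest.TestCase):
    def setUp(self):
        # mkstemp() creates a new, uniquely named file on every call, so a
        # name still held open by another process is never reused.
        fd, self.path = tempfile.mkstemp(prefix="py3k_test_")
        os.close(fd)

    def tearDown(self):
        # Clean up even if the test body failed.
        if os.path.exists(self.path):
            os.remove(self.path)

    def test_write_and_read_back(self):
        with open(self.path, "w") as f:
            f.write("spam")
        with open(self.path) as f:
            self.assertEqual(f.read(), "spam")
```

With a fresh name per test, a delayed deletion of an earlier file can
no longer collide with the creation of the next one.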

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue Jul 31 20:11:35 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 31 Jul 2007 11:11:35 -0700
Subject: [Python-3000] optimizing [x]range
In-Reply-To: <18095.3456.693480.981533@montanaro.dyndns.org>
References: <1d85506f0707280806n1764151cx4961a0573dda435e@mail.gmail.com>
	<ca471dc20707281504g704ff836m8aeaa966559483d3@mail.gmail.com>
	<46ABE025.4050204@async.com.br>
	<18092.34217.512107.677855@montanaro.dyndns.org>
	<46AD30DA.6050405@canterbury.ac.nz>
	<18093.18996.944540.279864@montanaro.dyndns.org>
	<46AE8DF1.1030409@canterbury.ac.nz>
	<ca471dc20707302041j7c834590xf0d315a2f3d3baaa@mail.gmail.com>
	<18095.3456.693480.981533@montanaro.dyndns.org>
Message-ID: <ca471dc20707311111w8bf7b09qed98e72ca3f7707b@mail.gmail.com>

On 7/31/07, skip at pobox.com <skip at pobox.com> wrote:
> Also, bringing it back more on-topic, what should the value of this
> expression be?
>
>     4 in range(0, 10, 3)
>
> That is, are we treating range() as a set or an interval?  Maybe I missed
> earlier messages in this thread where this was discussed, but part of the
> discussion focused on this construct
>
>     0 <= 4 < 10
>
> where there was no option to provide a step size.  Also, this particular
> notation screams out interval, not set, to me.

You missed it -- it should definitely be equivalent to

    4 in list(range(0, 10, 3))

i.e.

    4 in [0, 4, 8]

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue Jul 31 20:13:38 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 31 Jul 2007 11:13:38 -0700
Subject: [Python-3000] base64 - bytes and strings
In-Reply-To: <46AEBABB.5030902@canterbury.ac.nz>
References: <3f1451f50707281847q2171f82fu2e48f2297214f591@mail.gmail.com>
	<46ABFE12.5000101@gmail.com> <46AD2BF6.5080507@canterbury.ac.nz>
	<46AD52CC.4060403@ronadam.com> <46AEBABB.5030902@canterbury.ac.nz>
Message-ID: <ca471dc20707311113g342a30b0h23ee7e2b8f5f630f@mail.gmail.com>

On 7/30/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Does this mean that Py3k text streams will accept byte arrays
> in their write() methods, and that byte arrays can be concatenated
> with unicode strings and otherwise used in any context expecting
> a text string, as long as all their elements are in the ASCII range?

No, that is not the intention (even if some of that may accidentally
be supported in the current pre-alpha branch).
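
For illustration, this is how the intended separation looks in Python 3
as later released (not necessarily in the pre-alpha branch discussed
here); the helper name raises_type_error is made up:

```python
import io


def raises_type_error(func):
    """Return True if calling func() raises TypeError."""
    try:
        func()
    except TypeError:
        return True
    return False


data = b"abc"                           # bytes, all in the ASCII range

# Concatenating text and bytes is rejected, even for pure-ASCII bytes.
concat_rejected = raises_type_error(lambda: "text" + data)

# A text stream's write() only accepts str.
write_rejected = raises_type_error(lambda: io.StringIO().write(data))

# The supported route is an explicit decode.
combined = "text" + data.decode("ascii")
```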

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From dalcinl at gmail.com  Tue Jul 31 20:18:37 2007
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Tue, 31 Jul 2007 15:18:37 -0300
Subject: [Python-3000] optimizing [x]range
In-Reply-To: <18095.3456.693480.981533@montanaro.dyndns.org>
References: <1d85506f0707280806n1764151cx4961a0573dda435e@mail.gmail.com>
	<ca471dc20707281504g704ff836m8aeaa966559483d3@mail.gmail.com>
	<46ABE025.4050204@async.com.br>
	<18092.34217.512107.677855@montanaro.dyndns.org>
	<46AD30DA.6050405@canterbury.ac.nz>
	<18093.18996.944540.279864@montanaro.dyndns.org>
	<46AE8DF1.1030409@canterbury.ac.nz>
	<ca471dc20707302041j7c834590xf0d315a2f3d3baaa@mail.gmail.com>
	<18095.3456.693480.981533@montanaro.dyndns.org>
Message-ID: <e7ba66e40707311118l2e5d6b93k82bde7e7a3ea30c2@mail.gmail.com>

On 7/31/07, skip at pobox.com <skip at pobox.com> wrote:
> Also, bringing it back more on-topic, what should the value of this
> expression be?
>     4 in range(0, 10, 3)
> That is, are we treating range() as a set or an interval?

IMHO, 'range' is like a set of integers, not an interval.  For me,
'x in range(...)' should return the same as 'x in list(range(...))'.


-- 
Lisandro Dalcín

From skip at pobox.com  Tue Jul 31 20:22:04 2007
From: skip at pobox.com (skip at pobox.com)
Date: Tue, 31 Jul 2007 13:22:04 -0500
Subject: [Python-3000] optimizing [x]range
In-Reply-To: <ca471dc20707311111w8bf7b09qed98e72ca3f7707b@mail.gmail.com>
References: <1d85506f0707280806n1764151cx4961a0573dda435e@mail.gmail.com>
	<ca471dc20707281504g704ff836m8aeaa966559483d3@mail.gmail.com>
	<46ABE025.4050204@async.com.br>
	<18092.34217.512107.677855@montanaro.dyndns.org>
	<46AD30DA.6050405@canterbury.ac.nz>
	<18093.18996.944540.279864@montanaro.dyndns.org>
	<46AE8DF1.1030409@canterbury.ac.nz>
	<ca471dc20707302041j7c834590xf0d315a2f3d3baaa@mail.gmail.com>
	<18095.3456.693480.981533@montanaro.dyndns.org>
	<ca471dc20707311111w8bf7b09qed98e72ca3f7707b@mail.gmail.com>
Message-ID: <18095.32204.309127.375372@montanaro.dyndns.org>


    Guido> You missed it -- it should definitely be equivalent to

    Guido>     4 in list(range(0, 10, 3))

    Guido> i.e.

    Guido>     4 in [0, 4, 8]

Ummm... you mean

    4 in [0, 3, 6, 9]

right? <wink>

Skip

From dalcinl at gmail.com  Tue Jul 31 20:31:12 2007
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Tue, 31 Jul 2007 15:31:12 -0300
Subject: [Python-3000] optimizing [x]range
In-Reply-To: <ca471dc20707311111w8bf7b09qed98e72ca3f7707b@mail.gmail.com>
References: <1d85506f0707280806n1764151cx4961a0573dda435e@mail.gmail.com>
	<ca471dc20707281504g704ff836m8aeaa966559483d3@mail.gmail.com>
	<46ABE025.4050204@async.com.br>
	<18092.34217.512107.677855@montanaro.dyndns.org>
	<46AD30DA.6050405@canterbury.ac.nz>
	<18093.18996.944540.279864@montanaro.dyndns.org>
	<46AE8DF1.1030409@canterbury.ac.nz>
	<ca471dc20707302041j7c834590xf0d315a2f3d3baaa@mail.gmail.com>
	<18095.3456.693480.981533@montanaro.dyndns.org>
	<ca471dc20707311111w8bf7b09qed98e72ca3f7707b@mail.gmail.com>
Message-ID: <e7ba66e40707311131k17eb946bpdde30b5223916112@mail.gmail.com>

On 7/31/07, Guido van Rossum <guido at python.org> wrote:
> You missed it -- it should definitely be equivalent to
>     4 in list(range(0, 10, 3))
> i.e.
>     4 in [0, 4, 8]

And then, since list/tuple __contains__ is implemented in terms of rich
comparison (with Py_EQ), a patch may not be so easy to implement; at
first glance it does not seem as trivial as previously suggested in
this thread.
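
A pure-Python sketch of what such a check might look like (the real
patch would be in C; the function name in_range is made up). To stay
equivalent to 'x in list(range(...))', only exact ints take the
arithmetic shortcut, and everything else falls back to comparing
element by element with ==:

```python
def in_range(value, start, stop, step=1):
    """Return True if value == some item of range(start, stop, step)."""
    if step == 0:
        raise ValueError("step must not be zero")

    if type(value) is not int:
        # Other types may define their own __eq__ (e.g. 3.0 == 3), so
        # keep the list semantics and compare against every item.
        return any(value == item for item in range(start, stop, step))

    if step > 0:
        in_bounds = start <= value < stop
    else:
        in_bounds = stop < value <= start

    # Inside the bounds, value is a member only if it lies on the step grid.
    return in_bounds and (value - start) % step == 0


print(in_range(4, 0, 10, 3))    # False: 4 is in the interval, not the set
print(in_range(9, 0, 10, 3))    # True
```

The fast path is constant time; the fallback costs the same as the
list-based test it replaces.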

-- 
Lisandro Dalcín

From unknown_kev_cat at hotmail.com  Tue Jul 31 20:34:27 2007
From: unknown_kev_cat at hotmail.com (Joe Smith)
Date: Tue, 31 Jul 2007 14:34:27 -0400
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
References: <f7ithr$lrr$1@sea.gmane.org>
	<f7lnd8$l2s$1@sea.gmane.org><ca471dc20707181158p17417c9cg37c5382d61b53fe5@mail.gmail.com><f8is94$tgh$1@sea.gmane.org>
	<f8j862$st0$1@sea.gmane.org><ca471dc20707291727q6438b6eax5eaadbb6c712ef29@mail.gmail.com><46AD6E67.50407@v.loewis.de><ca471dc20707301027m71b91ffbnbfb31a9075e66d8b@mail.gmail.com><46AE5A9C.5000103@v.loewis.de>
	<f8nr3b$s25$1@sea.gmane.org>
	<ca471dc20707311106v696b3c7an67939bd802b81176@mail.gmail.com>
Message-ID: <f8nvbm$bcs$1@sea.gmane.org>


"Guido van Rossum" <guido at python.org> wrote in message 
news:ca471dc20707311106v696b3c7an67939bd802b81176 at mail.gmail.com...
> On 7/31/07, Joe Smith <unknown_kev_cat at hotmail.com> wrote:
>> Hmm... The documentation for Cygwin's unlink() implies that it should
>> function the same as a POSIX unlink() except perhaps if a non-Cygwin 
>> process
>> has an open handle for it without the correct attributes. I see nothing 
>> on
>> my system that would have done that. (No indexing service or virus 
>> scanner)
>> So that implies that at the time Python is trying to create the file, it
>> still has an open handle for it. Either that, or something besides Python 
>> is
>> opening the file without my knowledge.
>
> Regular Windows typically won't let you remove a file when you still
> have it open.

My understanding is that POSIX does not require that ability.

> Is this also a restriction on CYGWIN? I don't know
> anything about CYGWIN but I could imagine that they allow unlink() to
> succeed when there's still a file descriptor referencing it, and that
> they will delete the file when you close it.

Exactly. That is exactly what they do.

The claim was that this meets the POSIX standard.
Looking closely, it looks like it does not.

POSIX says:

>When the file's link count becomes 0 and no process has the file open,
>the space occupied by the file shall be freed and the file shall no longer
>be accessible. If one or more processes have the file open when the last
>link is removed, the link shall be removed before unlink() returns, but
>the removal of the file contents shall be postponed until all references
>to the file are closed.


>But if that fd is never
> closed the file is probably in a weird state. Anyway, before we start
> speculating more, you probably need to find a source of more CYGWIN
> expertise elsewhere -- it's rather thin here.

Exactly the issue.
I see the problem here is cygwin's partial POSIX compliance. However,
Windows NT had a design goal of allowing a compliant implementation
of POSIX to be implemented in a subsystem (along with userspace utilities).

So it should be possible to get unlink() to work like a POSIX unlink 
using raw NT kernel calls.
Since Cygwin has dropped support for pre-NT systems, switching to that seems 
to be the correct thing to do.

I'll discuss this with the cygwin team.

Regardless, the exact same issue will likely exist on the windows side.
It seems likely that a fix for the Windows side may fix the cygwin issue.

>
> Rewriting those tests to use a more random temporary file might also
> be an option, as long as you make sure to clean up (use try/finally or
> setUp/tearDown).
>
> -- 
> --Guido van Rossum (home page: http://www.python.org/~guido/) 


From unknown_kev_cat at hotmail.com  Tue Jul 31 21:21:50 2007
From: unknown_kev_cat at hotmail.com (Joe Smith)
Date: Tue, 31 Jul 2007 15:21:50 -0400
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
References: <f7ithr$lrr$1@sea.gmane.org><f7lnd8$l2s$1@sea.gmane.org><ca471dc20707181158p17417c9cg37c5382d61b53fe5@mail.gmail.com><f8is94$tgh$1@sea.gmane.org><f8j862$st0$1@sea.gmane.org><ca471dc20707291727q6438b6eax5eaadbb6c712ef29@mail.gmail.com><46AD6E67.50407@v.loewis.de><ca471dc20707301027m71b91ffbnbfb31a9075e66d8b@mail.gmail.com><46AE5A9C.5000103@v.loewis.de><f8nr3b$s25$1@sea.gmane.org><ca471dc20707311106v696b3c7an67939bd802b81176@mail.gmail.com>
	<f8nvbm$bcs$1@sea.gmane.org>
Message-ID: <f8o24h$ku5$1@sea.gmane.org>


"Joe Smith" <unknown_kev_cat at hotmail.com> wrote in message 
news:f8nvbm$bcs$1 at sea.gmane.org...
>
> Exactly the issue.
> I see the problem here is cygwin's partial POSIX compliance. However,
> Windows NT had a design goal of allowing a compliant implementation
> of POSIX to be implemented in a subsystem (along with userspace 
> utilities).
>
> So it should be possible to get unlink() to work like a POSIX unlink
> using raw NT kernel calls.
> Since Cygwin has dropped support for pre-NT systems, switching to that 
> seems
> to be the correct thing to do.
>
> I'll discuss this with the cygwin team.
>
> Regardless, the exact same issue will likely exist on the windows side.
> It seems likely that a fix for the Windows side may fix the cygwin issue.
>

Looks like the fix needed for cygwin's unlink was checked in two days ago.
The problem should automatically disappear in the next cygwin release. 


From martin at v.loewis.de  Tue Jul 31 21:33:12 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 31 Jul 2007 21:33:12 +0200
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
In-Reply-To: <ca471dc20707311106v696b3c7an67939bd802b81176@mail.gmail.com>
References: <f7ithr$lrr$1@sea.gmane.org>
	<f7lnd8$l2s$1@sea.gmane.org>	<ca471dc20707181158p17417c9cg37c5382d61b53fe5@mail.gmail.com>	<f8is94$tgh$1@sea.gmane.org>
	<f8j862$st0$1@sea.gmane.org>	<ca471dc20707291727q6438b6eax5eaadbb6c712ef29@mail.gmail.com>	<46AD6E67.50407@v.loewis.de>	<ca471dc20707301027m71b91ffbnbfb31a9075e66d8b@mail.gmail.com>	<46AE5A9C.5000103@v.loewis.de>
	<f8nr3b$s25$1@sea.gmane.org>
	<ca471dc20707311106v696b3c7an67939bd802b81176@mail.gmail.com>
Message-ID: <46AF8E78.6020607@v.loewis.de>

> Regular Windows typically won't let you remove a file when you still
> have it open.

It depends. If FILE_SHARE_DELETE was passed to CreateFile when opening,
you may DeleteFile it while it is still open. Otherwise, you get an
error from DeleteFile.

> Is this also a restriction on CYGWIN?

Cygwin is a wrapper around Win32. So it "can't do" anything that Win32
can't do (like deleting a file that is still open).

> I don't know
> anything about CYGWIN but I could imagine that they allow unlink() to
> succeed when there's still a file descriptor referencing it, and that
> they will delete the file when you close it.

They can't do that, because there is no Win32 mechanism for that.

Regards,
martin

From martin at v.loewis.de  Tue Jul 31 21:42:54 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 31 Jul 2007 21:42:54 +0200
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
In-Reply-To: <f8nvbm$bcs$1@sea.gmane.org>
References: <f7ithr$lrr$1@sea.gmane.org>	<f7lnd8$l2s$1@sea.gmane.org><ca471dc20707181158p17417c9cg37c5382d61b53fe5@mail.gmail.com><f8is94$tgh$1@sea.gmane.org>	<f8j862$st0$1@sea.gmane.org><ca471dc20707291727q6438b6eax5eaadbb6c712ef29@mail.gmail.com><46AD6E67.50407@v.loewis.de><ca471dc20707301027m71b91ffbnbfb31a9075e66d8b@mail.gmail.com><46AE5A9C.5000103@v.loewis.de>	<f8nr3b$s25$1@sea.gmane.org>	<ca471dc20707311106v696b3c7an67939bd802b81176@mail.gmail.com>
	<f8nvbm$bcs$1@sea.gmane.org>
Message-ID: <46AF90BE.3050803@v.loewis.de>

>> Is this also a restriction on CYGWIN? I don't know
>> anything about CYGWIN but I could imagine that they allow unlink() to
>> succeed when there's still a file descriptor referencing it, and that
>> they will delete the file when you close it.
> 
> Exactly. That is exactly what they do.

Not exactly; it's not possible with Win32 to do that.

What they do instead is
1. try to delete the file. If that fails for sharing
   violation, try 2.
2. move the file to the recycle bin, and set the
   "delete" disposition flag on the file, this will
   cause it to be removed from the recycle bin when
   the last handle is closed.

Regards,
Martin

From guido at python.org  Tue Jul 31 21:58:23 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 31 Jul 2007 12:58:23 -0700
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
In-Reply-To: <46AF90BE.3050803@v.loewis.de>
References: <f7ithr$lrr$1@sea.gmane.org> <f8j862$st0$1@sea.gmane.org>
	<ca471dc20707291727q6438b6eax5eaadbb6c712ef29@mail.gmail.com>
	<46AD6E67.50407@v.loewis.de>
	<ca471dc20707301027m71b91ffbnbfb31a9075e66d8b@mail.gmail.com>
	<46AE5A9C.5000103@v.loewis.de> <f8nr3b$s25$1@sea.gmane.org>
	<ca471dc20707311106v696b3c7an67939bd802b81176@mail.gmail.com>
	<f8nvbm$bcs$1@sea.gmane.org> <46AF90BE.3050803@v.loewis.de>
Message-ID: <ca471dc20707311258l26f0ab6apb157464e3db16496@mail.gmail.com>

On 7/31/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> >> Is this also a restriction on CYGWIN? I don't know
> >> anything about CYGWIN but I could imagine that they allow unlink() to
> >> succeed when there's still a file descriptor referencing it, and that
> >> they will delete the file when you close it.
> >
> > Exactly. That is exactly what they do.
>
> Not exactly; it's not possible with Win32 to do that.
>
> What they do instead is
> 1. try to delete the file. If that fails for sharing
>    violation, try 2.
> 2. move the file to the recycle bin, and set the
>    "delete" disposition flag on the file, this will
>    cause it to be removed from the recycle bin when
>    the last handle is closed.

I don't understand how that approach would cause the permission error
when trying to create the same file later again. Unless (a) I don't
understand the phrase "move it to the recycle bin" (is this a rename()
call?), or (b) you're describing the new version that was submitted 2
days ago (but not yet released).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From unknown_kev_cat at hotmail.com  Tue Jul 31 22:02:33 2007
From: unknown_kev_cat at hotmail.com (Joe Smith)
Date: Tue, 31 Jul 2007 16:02:33 -0400
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
References: <f7ithr$lrr$1@sea.gmane.org>	<f7lnd8$l2s$1@sea.gmane.org><ca471dc20707181158p17417c9cg37c5382d61b53fe5@mail.gmail.com><f8is94$tgh$1@sea.gmane.org>	<f8j862$st0$1@sea.gmane.org><ca471dc20707291727q6438b6eax5eaadbb6c712ef29@mail.gmail.com><46AD6E67.50407@v.loewis.de><ca471dc20707301027m71b91ffbnbfb31a9075e66d8b@mail.gmail.com><46AE5A9C.5000103@v.loewis.de>	<f8nr3b$s25$1@sea.gmane.org>	<ca471dc20707311106v696b3c7an67939bd802b81176@mail.gmail.com><f8nvbm$bcs$1@sea.gmane.org>
	<46AF90BE.3050803@v.loewis.de>
Message-ID: <f8o4gs$tbu$1@sea.gmane.org>


"Martin v. Löwis" <martin at v.loewis.de> wrote in message
news:46AF90BE.3050803 at v.loewis.de...
>>> Is this also a restriction on CYGWIN? I don't know
>>> anything about CYGWIN but I could imagine that they allow unlink() to
>>> succeed when there's still a file descriptor referencing it, and that
>>> they will delete the file when you close it.
>>
>> Exactly. That is exactly what they do.
> Not exactly; it's not possible with Win32 to do that.

Um. It is indeed possible to mark a file for deletion on close. The
requirement is that every handle to the file was opened with
FILE_SHARE_DELETE. This is one of the things Cygwin has tried. It works
fine except when a Windows app has opened the file without that flag.
To prevent name clashes, moving the file to the recycle bin is required.

> What they do instead is
> 1. try to delete the file. If that fails for sharing
>   violation, try 2.
> 2. move the file to the recycle bin, and set the
>   "delete" disposition flag on the file, this will
>   cause it to be removed from the recycle bin when
>   the last handle is closed.

That is what they do with the latest patches. It is pretty much equivalent
to the POSIX behaviour.
That requires native NT calls, and is not part of Win32. It is equivalent to
marking the file for deletion on close, except that the other handles do not
need to have been opened with FILE_SHARE_DELETE.
Moving the file to the recycle bin just gets the file out of the way.

But for what it is worth, the next cygwin release will be doing exactly what 
is described above.
So to Python it will look and act *almost* exactly like POSIX. It should fix 
the problem.

GVR: The move to the recycle bin is more or less a rename() call, except I
believe it has special support for avoiding name conflicts.
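
The two-step fallback discussed above (try a plain delete, and if that is
refused, rename the file out of the way under a unique name) can be sketched
in Python. This is only an illustration of the control flow, not the Cygwin
code: unlink_with_fallback, trash_dir and the injectable unlink parameter are
hypothetical names, and a plain rename stands in for the recycle-bin move. The
sketch does not set the delete disposition, so the parked file is not removed
automatically, and on Windows a plain rename of a file held open without
FILE_SHARE_DELETE may be refused as well, which is why the real fix needs
native NT calls.

```python
import os
import tempfile
import uuid


def unlink_with_fallback(path, trash_dir, unlink=os.unlink):
    """Delete path; if that is refused, rename it out of the way.

    Returns None when the direct delete worked, otherwise the new
    (unique) location of the file inside trash_dir.
    """
    try:
        unlink(path)                      # step 1: plain delete
        return None
    except PermissionError:
        # step 2 (assumption: a rename stands in for the recycle-bin move);
        # a random name avoids clashes between parked files
        parked = os.path.join(trash_dir, uuid.uuid4().hex)
        os.replace(path, parked)
        return parked


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work, \
            tempfile.TemporaryDirectory() as trash:
        target = os.path.join(work, "data.txt")
        with open(target, "w") as f:
            f.write("x")
        print(unlink_with_fallback(target, trash))
```

The unlink parameter exists only so the fallback branch can be exercised on a
system where the plain delete would normally succeed.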


From unknown_kev_cat at hotmail.com  Tue Jul 31 22:06:14 2007
From: unknown_kev_cat at hotmail.com (Joe Smith)
Date: Tue, 31 Jul 2007 16:06:14 -0400
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
References: <f7ithr$lrr$1@sea.gmane.org><f7lnd8$l2s$1@sea.gmane.org><ca471dc20707181158p17417c9cg37c5382d61b53fe5@mail.gmail.com><f8is94$tgh$1@sea.gmane.org><f8j862$st0$1@sea.gmane.org><ca471dc20707291727q6438b6eax5eaadbb6c712ef29@mail.gmail.com><46AD6E67.50407@v.loewis.de><ca471dc20707301027m71b91ffbnbfb31a9075e66d8b@mail.gmail.com><46AE5A9C.5000103@v.loewis.de><f8nr3b$s25$1@sea.gmane.org><ca471dc20707311106v696b3c7an67939bd802b81176@mail.gmail.com><f8nvbm$bcs$1@sea.gmane.org>
	<f8o24h$ku5$1@sea.gmane.org>
Message-ID: <f8o4no$u4b$1@sea.gmane.org>


"Joe Smith" <unknown_kev_cat at hotmail.com> wrote in message 
news:f8o24h$ku5$1 at sea.gmane.org...
>
> "Joe Smith" <unknown_kev_cat at hotmail.com> wrote in message
> news:f8nvbm$bcs$1 at sea.gmane.org...
>>
>> Exactly the issue.
>> I see the problem here is cygwin's partial POSIX complience. However,
>> Windows NT had a design goal of allowing a complient implementation
>> of POSIX to be implmented in a subsystem (along with userespace
>> utilities).
>>
>> So it should be possible to get unlink() to work as like a POSIX unlink
>> using raw NT kernel calls.
>> Since Cygwin has dropped support for pre-NT systems, swithing to that
>> seems
>> to be the correct thing to do.
>>
>> I'll discuss this with the cygwin team.
>>
>> Regardless, the exact same issue will likely exist on the windows side.
>> It seems likely that a fix for the Windows side may fix the cygwin issue.
>>
>
> Looks like the fix needed for cygwin's unlink was checked in two days ago.
> The problem should automatically disappear in the next cygwin release.

Sorry for the misinformation. It looks like the change was made more than 2
days ago; 2 days ago is just the date of the most recent change. Regardless,
it looks like the code that does the right thing is not in the latest
released DLL.


From martin at v.loewis.de  Tue Jul 31 22:35:26 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 31 Jul 2007 22:35:26 +0200
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
In-Reply-To: <f8o4gs$tbu$1@sea.gmane.org>
References: <f7ithr$lrr$1@sea.gmane.org>	<f7lnd8$l2s$1@sea.gmane.org><ca471dc20707181158p17417c9cg37c5382d61b53fe5@mail.gmail.com><f8is94$tgh$1@sea.gmane.org>	<f8j862$st0$1@sea.gmane.org><ca471dc20707291727q6438b6eax5eaadbb6c712ef29@mail.gmail.com><46AD6E67.50407@v.loewis.de><ca471dc20707301027m71b91ffbnbfb31a9075e66d8b@mail.gmail.com><46AE5A9C.5000103@v.loewis.de>	<f8nr3b$s25$1@sea.gmane.org>	<ca471dc20707311106v696b3c7an67939bd802b81176@mail.gmail.com><f8nvbm$bcs$1@sea.gmane.org>	<46AF90BE.3050803@v.loewis.de>
	<f8o4gs$tbu$1@sea.gmane.org>
Message-ID: <46AF9D0E.4060700@v.loewis.de>

> That is what they do with the latest patches. It is pretty much
> equivent to the POSIX system. That requires Native NT Calls, and is
> not part of win32. It is equivlent to marking the file for deletion
> on close, except the other handles do not need to have shared_delete.
>  The moving the file to the recycle bin just gets the file out of the
> way.

On a true POSIX system, this would not be necessary: you can immediately
create a new file in place of the previous one after you're done
with unlink, and there is no way to get the file back - but there
is on Windows (go to the recycle bin).
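
The POSIX behaviour described here can be checked with a short Python
sketch. The helper name unlink_while_open_then_recreate and the file name
victim.txt are made up for illustration. On a POSIX system the call is
expected to return True; where unlink() of an open file is refused (for
example a sharing violation on Windows), it returns False instead.

```python
import os
import tempfile


def unlink_while_open_then_recreate(directory):
    """Return True if a name can be reused right after unlinking an open file."""
    path = os.path.join(directory, "victim.txt")
    with open(path, "w") as first:
        first.write("old")
        first.flush()
        try:
            os.unlink(path)               # name is gone; data lives until close
            with open(path, "w") as second:
                second.write("new")       # a fresh file under the same name
        except OSError:
            return False                  # unlink or re-creation was refused
    with open(path) as check:
        return check.read() == "new"


with tempfile.TemporaryDirectory() as d:
    print(unlink_while_open_then_recreate(d))
```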

> But for what it is worth, the next cygwin release will be doing
> exactly what is described above. 

Indeed, I reported what the Cygwin code does (or will do once
released).

Regards,
Martin


From martin at v.loewis.de  Tue Jul 31 22:37:05 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 31 Jul 2007 22:37:05 +0200
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
In-Reply-To: <ca471dc20707311258l26f0ab6apb157464e3db16496@mail.gmail.com>
References: <f7ithr$lrr$1@sea.gmane.org> <f8j862$st0$1@sea.gmane.org>	
	<ca471dc20707291727q6438b6eax5eaadbb6c712ef29@mail.gmail.com>	
	<46AD6E67.50407@v.loewis.de>	
	<ca471dc20707301027m71b91ffbnbfb31a9075e66d8b@mail.gmail.com>	
	<46AE5A9C.5000103@v.loewis.de> <f8nr3b$s25$1@sea.gmane.org>	
	<ca471dc20707311106v696b3c7an67939bd802b81176@mail.gmail.com>	
	<f8nvbm$bcs$1@sea.gmane.org> <46AF90BE.3050803@v.loewis.de>
	<ca471dc20707311258l26f0ab6apb157464e3db16496@mail.gmail.com>
Message-ID: <46AF9D71.6090804@v.loewis.de>


>> What they do instead is
>> 1. try to delete the file. If that fails for sharing
>>    violation, try 2.
>> 2. move the file to the recycle bin, and set the
>>    "delete" disposition flag on the file, this will
>>    cause it to be removed from the recycle bin when
>>    the last handle is closed.
> 
> I don't understand how that approach would cause the permission error
> when trying to create the same file later again. Unless (a) I don't
> understand the phrase "move it to the recycle bin" (is this a rename()
> call?), or (b) you're describing the new version that was submitted 2
> days ago (but not yet released).

The latter - I just looked into the CVS tree to find out what they
do.

Regards,
Martin