From l.mastrodomenico at gmail.com Sun Jul 1 00:18:53 2007 From: l.mastrodomenico at gmail.com (Lino Mastrodomenico) Date: Sun, 1 Jul 2007 00:18:53 +0200 Subject: [Python-3000] PEP 368: Standard image protocol and class Message-ID: Hi everyone, I have submitted a new PEP: http://www.python.org/dev/peps/pep-0368/ It starts from a Pete Shinners' suggestion and from the consideration that there are a lot of Python libraries that use image objects, but almost all of them have implemented their own image classes, incompatible with everyone else's (and often not very pythonic). The PEP tries to improve the situation by defining a standard image protocol: in practice this is a definition of how a minimal "image-like" object should look and act in Python. Its details are carefully chosen to allow existing image classes in Tkinter, PIL, wxPython and pygame to implement it without breaking backward compatibility with their existing user bases. It also proposes the inclusion in the standard library of a fast and efficient default implementation of the new protocol. The PEP is long and detailed, but it's not in any way meant to be a take-it-or-leave-it deal: I'm open to any change, even radical, to improve it. It isn't py3k-specific (and it has a low number), but I posted here anyway because IMHO the main question is if and how to include this in Python 3.0; then, if the PEP is accepted, I'll backport the new classes to Python 2.6. Any suggestion or criticism is welcome; I'll also solicit feedback from external libraries developers that might be interested in implementing the new protocol. 
Regards -- Lino Mastrodomenico E-mail: l.mastrodomenico at gmail.com From robert.kern at gmail.com Sun Jul 1 00:33:07 2007 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 30 Jun 2007 17:33:07 -0500 Subject: [Python-3000] PEP 368: Standard image protocol and class In-Reply-To: References: Message-ID: Lino Mastrodomenico wrote: > Hi everyone, > > I have submitted a new PEP: > > http://www.python.org/dev/peps/pep-0368/ > > It starts from a Pete Shinners' suggestion and from the consideration > that there are a lot of Python libraries that use image objects, but > almost all of them have implemented their own image classes, > incompatible with everyone else's (and often not very pythonic). Could you build this on top of the new buffer protocol that we're working on? http://www.python.org/dev/peps/pep-3118/ Enabling this kind of data sharing is precisely what the new buffer type is intended for. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Sun Jul 1 00:36:08 2007 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 30 Jun 2007 17:36:08 -0500 Subject: [Python-3000] PEP 368: Standard image protocol and class In-Reply-To: References: Message-ID: Robert Kern wrote: > Lino Mastrodomenico wrote: >> Hi everyone, >> >> I have submitted a new PEP: >> >> http://www.python.org/dev/peps/pep-0368/ >> >> It starts from a Pete Shinners' suggestion and from the consideration >> that there are a lot of Python libraries that use image objects, but >> almost all of them have implemented their own image classes, >> incompatible with everyone else's (and often not very pythonic). > > Could you build this on top of the new buffer protocol that we're working on? > > http://www.python.org/dev/peps/pep-3118/ Never mind. I found the reference in your PEP. 
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From l.mastrodomenico at gmail.com Sun Jul 1 03:00:29 2007 From: l.mastrodomenico at gmail.com (Lino Mastrodomenico) Date: Sun, 1 Jul 2007 03:00:29 +0200 Subject: [Python-3000] PEP 368: Standard image protocol and class In-Reply-To: References: Message-ID: Here's the full text of the PEP's current draft, so you can comment directly on it (thanks to Collin Winter for the suggestion): PEP: 368 Title: Standard image protocol and class Version: $Revision: 56133 $ Last-Modified: $Date: 2007-06-30 21:07:03 +0200 (sab, 30 giu 2007) $ Author: Lino Mastrodomenico Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 28-Jun-2007 Python-Version: 2.6, 3.0 Post-History: Abstract ======== The current situation of image storage and manipulation in the Python world is extremely fragmented: almost every library that uses image objects has implemented its own image class, incompatible with everyone else's and often not very pythonic. A basic RGB image class exists in the standard library (``Tkinter.PhotoImage``), but is pretty much unusable, and unused, for anything except Tkinter programming. This fragmentation not only takes up valuable space in the developers minds, but also makes the exchange of images between different libraries (needed in relatively common use cases) slower and more complex than it needs to be. This PEP proposes to improve the situation by defining a simple and pythonic image protocol/interface that can be hopefully accepted and implemented by existing image classes inside and outside the standard library *without breaking backward compatibility* with their existing user bases. 
In practice this is a definition of how a minimal *image-like* object should look and act (in a similar way to the ``read()`` and ``write()`` methods in *file-like* objects). The inclusion in the standard library of a class that provides basic image manipulation functionality and implements the new protocol is also proposed, together with a mixin class that helps adding support for the protocol to existing image classes. Rationale ========= A good way to have high quality modules ready for inclusion in the Python standard library is to simply wait for natural selection among competing external libraries to provide a clear winner with useful functionality and a big user base. Then the de-facto standard can be officially sanctioned by including it in the standard library. Unfortunately this approach hasn't worked well for the creation of a dominant image class in the Python world: almost every third-party library that requires an image object creates its own class incompatible with the ones from other libraries. This is a real problem because it's entirely reasonable for a program to create and manipulate an image using, e.g., PIL (the Python Imaging Library) and then display it using wxPython or pygame. But these libraries have different and incompatible image classes, and the usual solution is to manually "export" an image from the source to a (width, height, bytes_string) tuple and "import" it creating a new instance in the target format. This approach *works*, but is both uglier and slower than it needs to be. Another "solution" that has been sometimes used is the creation of specific adapters and/or converters from a class to another (e.g. PIL offers the ``ImageTk`` module for converting PIL images to a class compatible with the Tkinter one). 
But this approach doesn't scale well with the number of libraries involved and it's still annoying for the user: if I have a perfectly good image object why should I convert it before passing it to the next method, why can't it simply accept my image as-is?

The problem isn't by any stretch limited to the three mentioned libraries and probably has multiple causes, including two that IMO are very important to understand before solving it:

* in today's computing world an image is a basic type not strictly tied to a specific domain. This is why there will never be a clear winner between the image classes from the three libraries mentioned above (PIL, wxPython and pygame): they cover different domains and don't really compete with each other;

* the Python standard library has never provided a good image class that can be adopted or imitated by third-party modules. ``Tkinter.PhotoImage`` provides basic RGB functionality, but it's by far the slowest and ugliest of the bunch and it can be instantiated only after the Tkinter root window has been created.

This PEP tries to improve this situation in four ways:

1. It defines a simple and pythonic image protocol/interface (both on the Python and the C side) that can hopefully be accepted and implemented by existing image classes inside and outside the standard library *without breaking backward compatibility* with their existing user bases.

2. It proposes the inclusion in the standard library of three new classes:

   * ``ImageMixin`` provides almost everything necessary to implement the new protocol; its main purpose is to make it as simple as possible to support this interface for existing libraries, in some cases as simple as adding it to the list of base classes and doing minor additions to the constructor.

   * ``Image`` is a subclass of ``ImageMixin`` and will add a constructor that can resize and/or convert an image between different pixel formats.
     This is intended to provide a fast and efficient default implementation of the new protocol.

   * ``ImageSize`` is a minor helper class. See below for details.

3. ``Tkinter.PhotoImage`` will implement the new protocol (mostly through the ``ImageMixin`` class) and all the Tkinter methods that can receive an image will be modified to accept any object that implements the interface. As an aside the author of this PEP will collaborate with the developers of the most common external libraries to achieve the same goal (supporting the protocol in their classes and accepting any class that implements it).

4. New ``PyImage_*`` functions will be added to the CPython C API: they implement the C side of the protocol and accept as first parameter **any** object that supports it, even if it isn't an instance of the ``Image``/``ImageMixin`` classes.

The main effects for the end user will be a simplification of the interchange of images between different libraries (if everything goes well, any Python library will accept images from any other library) and the out-of-the-box availability of the new ``Image`` class. The new class is intended to cover simple but common use cases like cropping and/or resizing a photograph to the desired size and passing it to an appropriate widget for displaying it on a window, or darkening a texture and passing it to a 3D library.

The ``Image`` class is not intended to replace or compete with PIL, Pythonmagick or NumPy, even if it provides a (very small) subset of the functionality of these three libraries. In particular PIL offers very rich image manipulation features with *dozens* of classes, filters, transformations and file formats. The inclusion of PIL (or something similar) in the standard library may, or may not, be a worthy goal but it's completely outside the scope of this PEP.
Specification
=============

The ``imageop`` module is used as the *default* location for the new classes and objects because it has for a long time hosted functions that provided a somewhat similar functionality, but a new module may be created if preferred (e.g. a new "``image``" or "``media``" module; the latter may eventually include other multimedia classes).

``MODES`` is a new module level constant: it is a set of the pixel formats supported by the ``Image`` class. Any image object that implements the new protocol is guaranteed to be formatted in one of these modes, but libraries that accept images are allowed to support only a subset of them.

These modes are in turn also available as module level constants (e.g. ``imageop.RGB``).

The following table is a summary of the modes currently supported and their properties:

========= =============== ========= =========== ======================
Name      Component       Bits per  Subsampling Valid
          names           component             intervals
========= =============== ========= =========== ======================
L         l (lowercase L) 8         no          full range
L16       l               16        no          full range
L32       l               32        no          full range
LA        l, a            8         no          full range
LA32      l, a            16        no          full range
RGB       r, g, b         8         no          full range
RGB48     r, g, b         16        no          full range
RGBA      r, g, b, a      8         no          full range
RGBA64    r, g, b, a      16        no          full range
YV12      y, cr, cb       8         1, 2, 2     16-235, 16-240, 16-240
JPEG_YV12 y, cr, cb       8         1, 2, 2     full range
CMYK      c, m, y, k      8         no          full range
CMYK64    c, m, y, k      16        no          full range
========= =============== ========= =========== ======================

When the name of a mode ends with a number, it represents the average number of bits per pixel. All the other modes simply use a byte per component per pixel.

No palette modes or modes with less than 8 bits per component are supported. Welcome to the 21st century.

Here's a quick description of the modes and the rationale for their inclusion; there are four groups of modes:

1.
**grayscale** (``L*`` modes): they are heavily used in scientific computing (those people may also need a very high dynamic range and precision, hence ``L32``, the only mode with 32 bits per component) and sometimes it can be useful to consider a single component of a color image as a grayscale image (this is used by the individual planes of the planar images, see ``YV12`` below); the name of the component (``'l'``, lowercase letter L) stands for luminance, the second optional component (``'a'``) is the alpha value and represents the opacity of the pixels: alpha = 0 means full transparency, alpha = 255/65535 represents a fully opaque pixel; 2. **RGB\* modes**: the garden variety color images. The optional alpha component has the same meaning as in grayscale modes; 3. **YCbCr**, a.k.a. YUV (``*YV12`` modes). These modes are planar (i.e. the values of all the pixel for each component are stored in a consecutive memory area, instead of the usual arrangement where all the components of a pixel reside in consecutive bytes) and use a 1, 2, 2 (a.k.a. 4:2:0) subsampling (i.e. each pixel has its own Y value, but the Cb and Cr components are shared between groups of 2x2 adjacent pixels) because this is the format that's by far the most common for YCbCr images. Please note that the V (Cr) plane is stored before the U (Cb) plane. ``YV12`` is commonly used for MPEG2 (including DVDs), MPEG4 (both ASP/DivX and AVC/H.264) and Theora video frames. Valid values for Y are in range(16, 236) (excluding 236), and valid values for Cb and Cr are in range(16, 241). ``JPEG_YV12`` is similar to ``YV12``, but the three components can have the full range of 256 values. It's the native format used by almost all JPEG/JFIF files and by MJPEG video frames. 
   The "strangeness" of these two with respect to all the other supported modes derives from the fact that they are widely used that way by a lot of existing libraries and applications; this is also the reason why they are included (and the fact that they can't be losslessly converted to RGB because YCbCr is a bigger color space); the funny 4:2:0 planar arrangement of the pixel values is relatively easy to support because in most cases the three planes can be considered three separate grayscale images;

4. **CMYK\* modes** (cyan, magenta, yellow and black) are subtractive color modes, used for printing color images on dead trees. Professional designers love to pretend that they can't live without them, so here they are.

Python API
----------

See the examples_ below.

In Python 2.x, all the new classes defined here are new-style classes.

Mode Objects
''''''''''''

The mode objects offer a number of attributes and methods that can be used for implementing generic algorithms that work on different types of images:

``components``
    The number of components per pixel (e.g. 4 for an RGBA image).

``component_names``
    A tuple of strings; see the column "Component names" in the above table.

``bits_per_component``
    8, 16 or 32; see "Bits per component" in the above table.

``bytes_per_pixel``
    ``components * bits_per_component // 8``, only available for non planar modes (see below).

``planar``
    Boolean; ``True`` if the image components reside each in a separate plane. Currently this happens if and only if the mode uses subsampling.

``subsampling``
    A tuple that for each component in the mode contains a tuple of two integers that represent the amount of downsampling in the horizontal and vertical direction, respectively. In practice it's ``((1, 1), (2, 2), (2, 2))`` for ``YV12`` and ``JPEG_YV12`` and ``((1, 1),) * components`` for everything else.

``x_divisor``
    ``max(x for x, y in subsampling)``; the width of an image that uses this mode must be divisible by this value.
``y_divisor``
    ``max(y for x, y in subsampling)``; the height of an image that uses this mode must be divisible by this value.

``intervals``
    A tuple that for each component in the mode contains a tuple of two integers: the minimum and maximum valid value for the component. Its value is ``((16, 235), (16, 240), (16, 240))`` for ``YV12`` and ``((0, 2 ** bits_per_component - 1),) * components`` for everything else.

``get_length(iterable[integer]) -> int``
    The parameter must be an iterable that contains two integers: the width and height of an image; it returns the number of bytes needed to store an image of these dimensions with this mode.

Implementation detail: the modes are instances of a subclass of ``str`` and have a value equal to their name (e.g. ``imageop.RGB == 'RGB'``) except for ``L32`` that has value ``'I'``. This is only intended for backward compatibility with existing PIL users; new code that uses the image protocol proposed here should not rely on this detail.

Image Protocol
''''''''''''''

Any object that supports the image protocol must provide the following methods and attributes:

``mode``
    The format and the arrangement of the pixels in this image; it's one of the constants in the ``MODES`` set.

``size``
    An instance of the `ImageSize class`_; it's a named tuple of two integers: the width and the height of the image in pixels; both of them must be >= 1 and can also be accessed as the ``width`` and ``height`` attributes of ``size``.

``buffer``
    A sequence of integers between 0 and 255; they are the actual bytes used for storing the image data (i.e. modifying their values affects the image pixels and vice versa); the data has a row-major/C-contiguous order without padding and without any special memory alignment, even when there are more than 8 bits per component. The only supported methods are ``__len__``, ``__getitem__``/``__setitem__`` (with both integers and slice indexes) and ``__iter__``; on the C side it implements the buffer protocol.
    This is a pretty low level interface to the image and the user is responsible for using the correct (native) byte order for modes with more than 8 bits per component and the correct value ranges for ``YV12`` images. A buffer may or may not keep a reference to its image, but it's still safe (if useless) to use the buffer even after the corresponding image has been destroyed by the garbage collector (this will require changes to the image class of wxPython and possibly other libraries).

    Implementation detail: this can be an ``array('B')``, a ``bytes()`` object or a specialized fixed-length type.

``info``
    A ``dict`` object that can contain arbitrary metadata associated with the image (e.g. DPI, gamma, ICC profile, exposure time...); the interpretation of this data is beyond the scope of this PEP and probably depends on the library used to create and/or to save the image; if a method of the image returns a new image, it can copy or adapt metadata from its own ``info`` attribute (the ``ImageMixin`` implementation always creates a new image with an empty ``info`` dictionary).

| ``bits_per_component``
| ``bytes_per_pixel``
| ``component_names``
| ``components``
| ``intervals``
| ``planar``
| ``subsampling``

    Shortcuts for the corresponding ``mode.*`` attributes.

``map(function[, function...]) -> None``
    For every pixel in the image, maps each component through the corresponding function. If only one function is passed, it is used repeatedly for each component.
This method modifies the image **in place** and is usually very fast (most of the time the functions are called only a small number of times, possibly only once for simple functions without branches), but it imposes a number of restrictions on the function(s) passed: * it must accept a single integer argument and return a number (``map`` will round the result to the nearest integer and clip it to ``range(0, 2 ** bits_per_component)``, if necessary); * it must *not* try to intercept any ``BaseException``, ``Exception`` or any unknown subclass of ``Exception`` raised by any operation on the argument (implementations may try to optimize the speed by passing funny objects, so even a simple ``"if n == 10:"`` may raise an exception: simply ignore it, ``map`` will take care of it); catching any other exception is fine; * it should be side-effect free and its result should not depend on values (other than the argument) that may change during a single invocation of ``map``. | ``rotate90() -> image`` | ``rotate180() -> image`` | ``rotate270() -> image`` Return a copy of the image rotated 90, 180 or 270 degrees counterclockwise around its center. ``clip() -> None`` Saturates invalid component values in ``YV12`` images to the minimum or the maximum allowed (see ``mode.intervals``), for other image modes this method does nothing, very fast; libraries that save/export ``YV12`` images are encouraged to always call this method, since intermediate operations (e.g. the ``map`` method) may assign to pixels values outside the valid intervals. ``split() -> tuple[image]`` Returns a tuple of ``L``, ``L16`` or ``L32`` images corresponding to the individual components in the image. 
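The remark that the functions passed to ``map`` are usually called only a small number of times suggests a lookup-table strategy. What follows is only a rough sketch of that idea for 8-bit, non-planar modes, written for current Python 3; it is not the PEP's reference implementation. The names ``build_lut`` and ``map_components`` are hypothetical, the image is modelled as a plain ``bytearray`` of interleaved components, and the "funny objects" optimization mentioned above is not attempted (each function is simply evaluated once per possible component value).

```python
def build_lut(function, bits_per_component=8):
    """Tabulate `function` over every possible component value.

    Hypothetical helper: results are rounded and then clipped to
    range(0, 2 ** bits_per_component), as the text requires of map().
    Note that Python 3's round() rounds exact halves to the nearest even
    integer; the PEP does not say how ties should be resolved.
    """
    top = 2 ** bits_per_component - 1
    lut = []
    for value in range(top + 1):
        result = int(round(function(value)))
        lut.append(min(max(result, 0), top))
    return lut


def map_components(buffer, components, *functions):
    """Apply one function per component, in place, to an 8-bit buffer.

    `buffer` is assumed to be a bytearray of interleaved components
    (e.g. r, g, b, r, g, b, ...), `components` the number per pixel.
    """
    if len(functions) == 1:
        # A single function is reused for every component.
        functions = functions * components
    if len(functions) != components:
        raise ValueError("expected 1 or %d functions" % components)
    for index, function in enumerate(functions):
        table = bytes(build_lut(function))
        # Every `components`-th byte, starting at `index`, belongs to
        # the same component; translate them all through the table.
        buffer[index::components] = buffer[index::components].translate(table)
```

With this approach each function is called exactly 256 times regardless of the image size, which is why side effects or results that depend on changing external state would give surprising output.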
Planar images also support attributes with the same names defined in ``component_names``: they contain grayscale (mode ``L``) images that offer a view on the pixel values for the corresponding component; any change to the subimages is immediately reflected on the parent image and vice versa (their buffers refer to the same memory location).

Non-planar images offer the following additional methods:

``pixels() -> iterator[pixel]``
    Returns an iterator that iterates over all the pixels in the image, starting from the top line and scanning each line from left to right. See below for a description of the `pixel objects`_.

``__iter__() -> iterator[line]``
    Returns an iterator that iterates over all the lines in the image, from top to bottom. See below for a description of the `line objects`_.

``__len__() -> int``
    Returns the number of lines in the image (``size.height``).

``__getitem__(integer) -> line``
    Returns the line at the specified (y) position.

``__getitem__(tuple[integer]) -> pixel``
    The parameter must be a tuple of two integers; they are interpreted respectively as x and y coordinates in the image (0, 0 is the top left corner) and a pixel object is returned.

``__getitem__(slice | tuple[integer | slice]) -> image``
    The parameter must be a slice or a tuple that contains two slices or an integer and a slice; the selected area of the image is copied and a new image is returned; ``image[x:y:z]`` is equivalent to ``image[:, x:y:z]``.

``__setitem__(tuple[integer], integer | iterable[integer]) -> None``
    Modifies the pixel at the specified position; ``image[x, y] = integer`` is a shortcut for ``image[x, y] = (integer,)`` for images with a single component.
``__setitem__(slice | tuple[integer | slice], image) -> None`` Selects an area in the same way as the corresponding form of the ``__getitem__`` method and assigns to it a copy of the pixels from the image in the second argument, that must have exactly the same mode as this image and the same size as the specified area; the alpha component, if present, is simply copied and doesn't affect the other components of the image (i.e. no alpha compositing is performed). The ``mode``, ``size`` and ``buffer`` (including the address in memory of the ``buffer``) never change after an image is created. It is expected that, if PEP 3118 is accepted, all the image objects will support the new buffer protocol, however this is beyond the scope of this PEP. ``Image`` and ``ImageMixin`` Classes '''''''''''''''''''''''''''''''''''' The ``ImageMixin`` class implements all the methods and attributes described above except ``mode``, ``size``, ``buffer`` and ``info``. ``Image`` is a subclass of ``ImageMixin`` that adds support for these four attributes and offers the following constructor (please note that the constructor is not part of the image protocol): ``__init__(mode, size, color, source)`` ``mode`` must be one of the constants in the ``MODES`` set, ``size`` is a sequence of two integers (width and height of the new image); ``color`` is a sequence of integers, one for each component of the image, used to initialize all the pixels to the same value; ``source`` can be a sequence of integers of the appropriate size and format that is copied as-is in the buffer of the new image or an existing image; in Python 2.x ``source`` can also be an instance of ``str`` and is interpreted as a sequence of bytes. ``color`` and ``source`` are mutually exclusive and if they are both omitted the image is initialized to transparent black (all the bytes in the buffer have value 16 in the ``YV12`` mode, 255 in the ``CMYK*`` modes and 0 for everything else). 
    If ``source`` is present and is an image, ``mode`` and/or ``size`` can be omitted; if they are specified and are different from the source mode and/or size, the source image is converted.

    The exact algorithms used for resizing and doing color space conversions may differ between Python versions and implementations, but they always give high quality results (e.g.: a cubic spline interpolation can be used for upsampling and an antialias filter can be used for downsampling images); any combination of mode conversion is supported, but the algorithm used for conversions to and from the ``CMYK*`` modes is pretty naïve: if you have the exact color profiles of your devices you may want to use a good color management tool such as LittleCMS. The new image has an empty ``info`` ``dict``.

Line Objects
''''''''''''

The line objects (returned, e.g., when iterating over an image) support the following attributes and methods:

``mode``
    The mode of the image from where this line comes.

``__iter__() -> iterator[pixel]``
    Returns an iterator that iterates over all the pixels in the line, from left to right. See below for a description of the `pixel objects`_.

``__len__() -> int``
    Returns the number of pixels in the line (the image width).

``__getitem__(integer) -> pixel``
    Returns the pixel at the specified (x) position.

``__getitem__(slice) -> image``
    The selected part of the line is copied and a new image is returned; the new image will always have height 1.

``__setitem__(integer, integer | iterable[integer]) -> None``
    Modifies the pixel at the specified position; ``line[x] = integer`` is a shortcut for ``line[x] = (integer,)`` for images with a single component.
``__setitem__(slice, image) -> None`` Selects a part of the line and assigns to it a copy of the pixels from the image in the second argument, that must have height 1, a width equal to the specified slice and the same mode as this line; the alpha component, if present, is simply copied and doesn't affect the other components of the image (i.e. no alpha compositing is performed). Pixel Objects ''''''''''''' The pixel objects (returned, e.g., when iterating over a line) support the following attributes and methods: ``mode`` The mode of the image from where this pixel comes. ``value`` A tuple of integers, one for each component. Any iterable of the correct length can be assigned to ``value`` (it will be automagically converted to a tuple), but you can't assign to it an integer, even if the mode has only a single component: use, e.g., ``pixel.l = 123`` instead. ``r, g, b, a, l, c, m, y, k`` The integer values of each component; only those applicable for the current mode (in ``mode.component_names``) will be available. | ``__iter__() -> iterator[int]`` | ``__len__() -> int`` | ``__getitem__(integer | slice) -> int | tuple[int]`` | ``__setitem__(integer | slice, integer | iterable[integer]) -> None`` These four methods emulate a fixed length list of integers, one for each pixel component. ``ImageSize`` Class ''''''''''''''''''' ``ImageSize`` is a named tuple, a class identical to ``tuple`` except that: * its constructor only accepts two integers, width and height; they are converted in the constructor using their ``__index__()`` methods, so all the ``ImageSize`` objects are guaranteed to contain only ``int`` (or possibly ``long``, in Python 2.x) instances; * it has a ``width`` and a ``height`` property that are equivalent to the first and the second number in the tuple, respectively; * the string returned by its ``__repr__`` method is ``'imageop.ImageSize(width=%d, height=%d)' % (width, height)``. 
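The three properties listed for ``ImageSize`` can be approximated with ``collections.namedtuple``. This is a minimal sketch under that assumption, written for current Python 3; the PEP does not say how the class would actually be implemented, and only the behaviour named in the list above is reproduced.

```python
import operator
from collections import namedtuple


class ImageSize(namedtuple("ImageSize", ["width", "height"])):
    """Illustrative stand-in for the proposed imageop.ImageSize."""

    __slots__ = ()

    def __new__(cls, width, height):
        # operator.index() calls __index__(), so floats and other
        # non-integer objects are rejected with a TypeError.
        return super().__new__(
            cls, operator.index(width), operator.index(height)
        )

    def __repr__(self):
        return "imageop.ImageSize(width=%d, height=%d)" % (
            self.width,
            self.height,
        )
```

Because the result is still a tuple, it compares equal to ``(width, height)`` and can be unpacked, so code that already passes sizes around as plain tuples would keep working.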
``ImageSize`` is not usually instantiated by end-users, but can be used when creating a new class that implements the image protocol, since the ``size`` attribute must be an ``ImageSize`` instance. C API ----- The available image modes are visible at the C level as ``PyImage_*`` constants of type ``PyObject *`` (e.g.: ``PyImage_RGB`` is ``imageop.RGB``). The following functions offer a C-friendly interface to mode and image objects (all the functions return ``NULL`` or -1 on failure): ``int PyImageMode_Check(PyObject *obj)`` Returns true if the object ``obj`` is a valid image mode. | ``int PyImageMode_GetComponents(PyObject *mode)`` | ``PyObject* PyImageMode_GetComponentNames(PyObject *mode)`` | ``int PyImageMode_GetBitsPerComponent(PyObject *mode)`` | ``int PyImageMode_GetBytesPerPixel(PyObject *mode)`` | ``int PyImageMode_GetPlanar(PyObject *mode)`` | ``PyObject* PyImageMode_GetSubsampling(PyObject *mode)`` | ``int PyImageMode_GetXDivisor(PyObject *mode)`` | ``int PyImageMode_GetYDivisor(PyObject *mode)`` | ``Py_ssize_t PyImageMode_GetLength(PyObject *mode, Py_ssize_t width, Py_ssize_t height)`` These functions are equivalent to their corresponding Python attributes or methods. ``int PyImage_Check(PyObject *obj)`` Returns true if the object ``obj`` is an ``Image`` object or an instance of a subtype of the ``Image`` type; see also ``PyObject_CheckImage`` below. ``int PyImage_CheckExact(PyObject *obj)`` Returns true if the object ``obj`` is an ``Image`` object, but not an instance of a subtype of the ``Image`` type. | ``PyObject* PyImage_New(PyObject *mode, Py_ssize_t width, Py_ssize_t height)`` Returns a new ``Image`` instance, initialized to transparent black (see ``Image.__init__`` above for the details). | ``PyObject* PyImage_FromImage(PyObject *image, PyObject *mode, Py_ssize_t width, Py_ssize_t height)`` Returns a new ``Image`` instance, initialized with the contents of the ``image`` object rescaled and converted to the specified ``mode``, if necessary. 
| ``PyObject* PyImage_FromBuffer(PyObject *buffer, PyObject *mode, Py_ssize_t width, Py_ssize_t height)`` Returns a new ``Image`` instance, initialized with the contents of the ``buffer`` object. ``int PyObject_CheckImage(PyObject *obj)`` Returns true if the object ``obj`` implements a sufficient subset of the image protocol to be accepted by the functions defined below, even if its class is not a subclass of ``ImageMixin`` and/or ``Image``. Currently it simply checks for the existence and correctness of the attributes ``mode``, ``size`` and ``buffer``. | ``PyObject* PyImage_GetMode(PyObject *image)`` | ``Py_ssize_t PyImage_GetWidth(PyObject *image)`` | ``Py_ssize_t PyImage_GetHeight(PyObject *image)`` | ``int PyImage_Clip(PyObject *image)`` | ``PyObject* PyImage_Split(PyObject *image)`` | ``PyObject* PyImage_GetBuffer(PyObject *image)`` | ``int PyImage_AsBuffer(PyObject *image, const void **buffer, Py_ssize_t *buffer_len)`` These functions are equivalent to their corresponding Python attributes or methods; the image memory can be accessed only with the GIL and a reference to the image or its buffer held, and extra care should be taken for modes with more than 8 bits per component: the data is stored in native byte order and it can be **not** aligned on 2 or 4 byte boundaries. 
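The layout rules stated for the ``buffer`` attribute and repeated here for the C side (row-major order, no padding, native byte order, no alignment guarantee, V plane before U plane in ``YV12``) can be made concrete with a little arithmetic. The sketch below is in Python rather than C and uses hypothetical helper names; it illustrates the layout as described in this draft, under the assumption that the three ``YV12`` planes are stored back to back, and is not part of the proposed API.

```python
import struct


def packed_length(width, height, components, bits_per_component):
    """Bytes needed by a non-planar image: no padding between rows."""
    return width * height * components * bits_per_component // 8


def yv12_plane_offsets(width, height):
    """Return (y_offset, cr_offset, cb_offset, total_length) for YV12.

    One Y byte per pixel, followed by the Cr (V) plane and then the
    Cb (U) plane, each shared between 2x2 groups of pixels.
    """
    if width % 2 or height % 2:
        raise ValueError("width and height must be divisible by 2")
    luma = width * height
    chroma = (width // 2) * (height // 2)
    return 0, luma, luma + chroma, luma + 2 * chroma


def read_rgb48_pixel(buffer, width, x, y):
    """Read one RGB48 pixel (three 16-bit components) from a buffer.

    The "=" prefix selects native byte order with standard sizes and
    no alignment, matching a buffer that may not be 2-byte aligned.
    """
    offset = (y * width + x) * 6  # 3 components * 2 bytes each
    return struct.unpack_from("=3H", buffer, offset)
```

For example, a 6x9 ``RGB`` image needs ``packed_length(6, 9, 3, 8)`` bytes, while a ``YV12`` image of the same pixel count needs only one and a half bytes per pixel because of the 4:2:0 subsampling.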
Examples
========

A few examples of common operations with the new ``Image`` class and
protocol::

    # create a new black RGB image of 6x9 pixels
    rgb_image = imageop.Image(imageop.RGB, (6, 9))

    # same as above, but initialize the image to bright red
    rgb_image = imageop.Image(imageop.RGB, (6, 9), color=(255, 0, 0))

    # convert the image to YCbCr
    yuv_image = imageop.Image(imageop.JPEG_YV12, source=rgb_image)

    # read the value of a pixel and split it into three ints
    r, g, b = rgb_image[x, y]

    # modify the magenta component of a pixel in a CMYK image
    cmyk_image[x, y].m = 13

    # modify the Y (luma) component of a pixel in a *YV12 image and
    # its corresponding subsampled Cr (red chroma)
    yuv_image.y[x, y] = 42
    yuv_image.cr[x // 2, y // 2] = 54

    # iterate over an image
    for line in rgb_image:
        for pixel in line:
            # swap red and blue, and set green to 0
            pixel.value = pixel.b, 0, pixel.r

    # find the maximum value of the red component in the image
    max_red = max(pixel.r for pixel in rgb_image.pixels())

    # count the number of colors in the image
    num_of_colors = len(set(tuple(pixel) for pixel in image.pixels()))

    # copy a block of 4x2 pixels near the upper right corner of an
    # image and paste it into the lower left corner of the same image
    image[:4, -2:] = image[-6:-2, 1:3]

    # create a copy of the image, except that the new image can have a
    # different (usually empty) info dict
    new_image = image[:]

    # create a mirrored copy of the image, with the left and right
    # sides flipped
    flipped_image = image[::-1, :]

    # downsample an image to half its original size using a fast, low
    # quality operation and a slower, high quality one:
    low_quality_image = image[::2, ::2]
    new_size = image.size.width // 2, image.size.height // 2
    high_quality_image = imageop.Image(size=new_size, source=image)

    # direct buffer access
    rgb_image[0, 0] = r, g, b
    assert tuple(rgb_image.buffer[:3]) == (r, g, b)


Backwards Compatibility
=======================

There are three areas touched by this PEP where backwards compatibility should
be considered:

* **Python 2.6**: new classes and objects are added to the ``imageop``
  module without touching the existing module contents; new methods and
  attributes will be added to ``Tkinter.PhotoImage`` and its
  ``__getitem__`` and ``__setitem__`` methods will be modified to accept
  integers, tuples and slices (currently they only accept strings). All
  the changes provide a superset of the existing functionality, so no
  major compatibility issues are expected.

* **Python 3.0**: the legacy contents of the ``imageop`` module will be
  deleted, according to PEP 3108; everything defined in this proposal
  will work as in Python 2.x, with the exception of the usual 2.x/3.0
  differences (e.g. support for ``long`` integers and for interpreting
  ``str`` instances as sequences of bytes will be dropped).

* **external libraries**: the names and the semantics of the standard
  image methods and attributes are carefully chosen to allow some
  external libraries that manipulate images (including at least PIL,
  wxPython and pygame) to implement the new protocol in their image
  classes without breaking compatibility with existing code. The only
  blatant conflicts between the image protocol and NumPy arrays are the
  value of the ``size`` attribute and the coordinates order in the
  ``image[x, y]`` expression.


Reference Implementation
========================

If this PEP is accepted, the author will provide a reference
implementation of the new classes in pure Python (that can run in
CPython, PyPy, Jython and IronPython) and a second one optimized for
speed in Python and C, suitable for inclusion in the CPython standard
library. The author will also submit the required Tkinter patches. All
the code will be available in a version for Python 2.x and a version for
Python 3.0 (the two versions are expected to be very similar, and the
Python 3.0 one will probably be generated almost completely
automatically).
Acknowledgments =============== The implementation of this PEP, if accepted, is sponsored by Google through the Google Summer of Code program. Copyright ========= This document has been placed in the public domain. -- Lino Mastrodomenico E-mail: l.mastrodomenico at gmail.com From bjourne at gmail.com Sun Jul 1 14:34:03 2007 From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=) Date: Sun, 1 Jul 2007 14:34:03 +0200 Subject: [Python-3000] PEP 368: Standard image protocol and class In-Reply-To: References: Message-ID: <740c3aec0707010534j4049efbchb2389bf61413c300@mail.gmail.com> Cool PEP! I really love the API for the Image class. A standard Image class would be a useful addition to the standard library. But I cannot see how it would solve the problem with to many image classes. The reason why PIL, PyGame and wxPython has different image classes is because each of them use different C functions for manipulating said image classes. These differences bubble up through the bindings and results in PIL exposing an Image, PyGame a Surface and wxPython a wxImage. The result is that if you want to use a PIL Image in say PyGame, you still need to convert it. If PIL stores RGB images with 32 bpp and PyGame uses 24, then you'll have to convert it to get it into the proper format. The only way to get compatibility between the libraries is to create an image library in C _and_ get those libraries to start using it. -- mvh Bj?rn From stargaming at gmail.com Sun Jul 1 18:01:12 2007 From: stargaming at gmail.com (Stargaming) Date: Sun, 01 Jul 2007 18:01:12 +0200 Subject: [Python-3000] PEP 368: Standard image protocol and class In-Reply-To: <740c3aec0707010534j4049efbchb2389bf61413c300@mail.gmail.com> References: <740c3aec0707010534j4049efbchb2389bf61413c300@mail.gmail.com> Message-ID: BJ?rn Lindqvist schrieb: > Cool PEP! I really love the API for the Image class. A standard Image > class would be a useful addition to the standard library. 
> > But I cannot see how it would solve the problem with to many image > classes. The reason why PIL, PyGame and wxPython has different image > classes is because each of them use different C functions for > manipulating said image classes. These differences bubble up through > the bindings and results in PIL exposing an Image, PyGame a Surface > and wxPython a wxImage. The result is that if you want to use a PIL > Image in say PyGame, you still need to convert it. If PIL stores RGB > images with 32 bpp and PyGame uses 24, then you'll have to convert it > to get it into the proper format. > > The only way to get compatibility between the libraries is to create > an image library in C _and_ get those libraries to start using it. > They'll all quack the same way. (This is paraphrased in the PEP's abstract, as far as I read it.) From martin at v.loewis.de Sun Jul 1 18:55:42 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 01 Jul 2007 18:55:42 +0200 Subject: [Python-3000] PEP 368: Standard image protocol and class In-Reply-To: References: <740c3aec0707010534j4049efbchb2389bf61413c300@mail.gmail.com> Message-ID: <4687DC8E.6010109@v.loewis.de> >> The only way to get compatibility between the libraries is to create >> an image library in C _and_ get those libraries to start using it. >> > > They'll all quack the same way. (This is paraphrased in the PEP's > abstract, as far as I read it.) To the Python side, yes. But to the underlying C library, some quack, some bark. How would you pass a Tkinter.PhotoImage to wxPython if both supported the PEP? wxPython would likely be able to produce objects that provide the Image interface, but I can't see how wxPython could consume such a thing - the underlying C libraries surely expect something completely different. The only way I can see this work is if each library imports Image objects by copying them, pixel for pixel, through this interface. 
Regards, Martin From l.mastrodomenico at gmail.com Sun Jul 1 18:59:09 2007 From: l.mastrodomenico at gmail.com (Lino Mastrodomenico) Date: Sun, 1 Jul 2007 18:59:09 +0200 Subject: [Python-3000] PEP 368: Standard image protocol and class In-Reply-To: <740c3aec0707010534j4049efbchb2389bf61413c300@mail.gmail.com> References: <740c3aec0707010534j4049efbchb2389bf61413c300@mail.gmail.com> Message-ID: 2007/7/1, BJ?rn Lindqvist : > But I cannot see how it would solve the problem with to many image > classes. The reason why PIL, PyGame and wxPython has different image > classes is because each of them use different C functions for > manipulating said image classes. These differences bubble up through > the bindings and results in PIL exposing an Image, PyGame a Surface > and wxPython a wxImage. The result is that if you want to use a PIL > Image in say PyGame, you still need to convert it. Actually, this is not always true. :-) For example it's entirely possible to have the *same* python RGBA image considered as a SDL_Surface by SDL (the underlying library used by pygame), as an ImagingMemoryInstance by the PIL C library and have its buffer directly accepted by the OpenGL function glTexImage2D (with a bit of care in the order of the corners passed to glTexCoord2f), independently by who created the image in the first place. This works because most C/C++ libraries give the possibility of creating a native image struct/class using an existing memory buffer (without copying it) and they support at least a subset of the modes currently defined, with the exact byte order, padding, etc, specified in the PEP (usually L and at least one of RGB or RGBA). But you are right, the particular format specified in the PEP is not always supported by existing the libraries, even when they support that particular mode. Sometimes this can be fixed (e.g. 
PIL currently uses by default 4 bytes per pixel for RGB images and has only experimental support for 3 bytes per pixel, but its C library is written by the same people that maintain the Python bindings, so they can change it if they want) and sometimes it cannot be easily fixed (e.g. a wxImage class will happily accept a RGB buffer as defined by the PEP, but it has a funny memory arrangement for RGBA images that is completely incompatible). So I expect that each Python library that jumps on the PEP bandwagon will have three levels of support for the modes listed: 1) no support at all (e.g. most 3D libraries will probably never accept CMYK images as textures); the user can explicitly convert the image using "new_image = Image(new_mode, source=old_image)"; 2) limited support: they support a particular mode, but cannot directly use the standard memory arrangement, so when they receive an alien image object they convert it on the fly to their preferred byte order and they do the reverse operation when a foreign library tries to access the buffer property of their images (they may offer a read-only buffer); this is not ideal, but it's better than the current situation because it's transparent to the user and it requires only a single memory copy/conversion instead of the two usually performed by the current tostring/fromstring dance; 3) full support: no conversion or memory copy ever necessary for the exchange of images between two libraries if they both have full support for a particular mode. Of course the Image class that I'm writing and that I hope will be included in the stdlib, will have full support for all the modes. Please note that the conversions in "2)" above can be avoided in some (most?) 
cases if PEP 3118 is accepted, because it will become possible to expose and discover the "native" memory arrangement of an image without accessing its buffer property (that, in my vision, will always offer the "standard" arrangement defined in the PEP, to simplify things for libraries that prefer a simpler interface, even if it may be slightly less efficient in some, hopefully rare, cases). -- Lino Mastrodomenico E-mail: l.mastrodomenico at gmail.com From alexandre at peadrop.com Mon Jul 2 19:46:28 2007 From: alexandre at peadrop.com (Alexandre Vassalotti) Date: Mon, 2 Jul 2007 13:46:28 -0400 Subject: [Python-3000] StringIO/BytesIO in io.py doesn't over-seek properly In-Reply-To: References: Message-ID: If StringIO is not allowed to over-seek, what should happen to the current file position when it is truncated? >>> s = StringIO("Hello world!") >>> s.seek(0, 2) >>> s.truncate(2) >>> s.tell() ??? Truncating can either set the position to the new string size, or it leaves it alone. -- Alexandre From guido at python.org Mon Jul 2 20:38:54 2007 From: guido at python.org (Guido van Rossum) Date: Mon, 2 Jul 2007 11:38:54 -0700 Subject: [Python-3000] StringIO/BytesIO in io.py doesn't over-seek properly In-Reply-To: References: Message-ID: Honestly, I think truncate() should always set the current position to the new size, even though that's not what it currently does. Or at least it should set it to the new size if that's less than the current position. What's the rationale (apart from "Unix defined it so") why it currently leaves the position unchanged? At least I think it's fine if StringIO does it this way. I think TextIOWrapper should also do it this way, as it has the same issue (writing null bytes is not defined for encoded files). --Guido On 7/2/07, Alexandre Vassalotti wrote: > If StringIO is not allowed to over-seek, what should happen to the > current file position when it is truncated? 
> > >>> s = StringIO("Hello world!") > >>> s.seek(0, 2) > >>> s.truncate(2) > >>> s.tell() > ??? > > Truncating can either set the position to the new string size, or it > leaves it alone. > > -- Alexandre > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From alexandre at peadrop.com Mon Jul 2 20:59:28 2007 From: alexandre at peadrop.com (Alexandre Vassalotti) Date: Mon, 2 Jul 2007 14:59:28 -0400 Subject: [Python-3000] StringIO/BytesIO in io.py doesn't over-seek properly In-Reply-To: References: Message-ID: On 7/2/07, Guido van Rossum wrote: > Honestly, I think truncate() should always set the current position to > the new size, even though that's not what it currently does. Or at > least it should set it to the new size if that's less than the current > position. What's the rationale (apart from "Unix defined it so") why > it currently leaves the position unchanged? No idea. I just know that truncate in the old StringIO module do set the position to the new size if the new size is less than the current position. And that is how I implemented it in _bytes_io and _string_io. From rasky at develer.com Tue Jul 3 00:51:41 2007 From: rasky at develer.com (Giovanni Bajo) Date: Tue, 03 Jul 2007 00:51:41 +0200 Subject: [Python-3000] Announcing PEP 3136 In-Reply-To: <20070630205444.GD22221@theory.org> References: <20070630205444.GD22221@theory.org> Message-ID: On 30/06/2007 22.54, Matt Chisholm wrote: > I've created and submitted a new PEP proposing support for labels in > Python's break and continue statements. 
Georg Brandl has graciously
> added it to the PEP list as PEP 3136:
>
> http://www.python.org/dev/peps/pep-3136/
>
> I understand that the deadline for submitting features for Python 3.0
> has passed, so this PEP targets Python 3.1. I also expect that people
> might not want to take time off from the Python 3.0 effort to discuss
> features that are even further off in the future.
>
> Thanks for your time, and thanks for letting me contribute an idea to
> Python.

I didn't see one simple alternative listed: move everything within a
function:

def func():
    for a in a_list:
        for b in b_list:
            if condition1(a, b):
                return
            [...]
            if condition2(a, b):
                break
func()

-- 
Giovanni Bajo

From greg.ewing at canterbury.ac.nz Tue Jul 3 01:35:25 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 03 Jul 2007 11:35:25 +1200
Subject: [Python-3000] Announcing PEP 3136
In-Reply-To:
References: <20070630205444.GD22221@theory.org>
Message-ID: <46898BBD.1040901@canterbury.ac.nz>

On 30/06/2007 22.54, Matt Chisholm wrote:

> I've created and submitted a new PEP proposing support for labels in
> Python's break and continue statements.
>
> http://www.python.org/dev/peps/pep-3136/

-1. Confusing nested loops are best broken out into
separate functions rather than patching over the
problem with features like this.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | Carpe post meridiem!                 |
Christchurch, New Zealand          | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz     +--------------------------------------+

From ntoronto at cs.byu.edu Tue Jul 3 09:17:05 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Tue, 03 Jul 2007 01:17:05 -0600
Subject: [Python-3000] Announcing PEP 3136
In-Reply-To: <46898BBD.1040901@canterbury.ac.nz>
References: <20070630205444.GD22221@theory.org> <46898BBD.1040901@canterbury.ac.nz>
Message-ID: <4689F7F1.1070503@cs.byu.edu>

Greg Ewing wrote:
> On 30/06/2007 22.54, Matt Chisholm wrote:
>
>
>> I've created and submitted a new PEP proposing support for labels in
>> Python's break and continue statements.
>>
>> http://www.python.org/dev/peps/pep-3136/
>>
>
> -1. Confusing nested loops are best broken out into
> separate functions rather than patching over the
> problem with features like this.
>

+1 (not that my vote really counts for much). Breaking logic out into
separate functions can obscure the meaning of an algorithm that is most
naturally implemented with nested loops.

Neil

From edin.salkovic at gmail.com Tue Jul 3 10:11:59 2007
From: edin.salkovic at gmail.com (Edin Salkovic)
Date: Tue, 3 Jul 2007 10:11:59 +0200
Subject: [Python-3000] PEP 368: Standard image protocol and class
In-Reply-To:
References:
Message-ID: <63eb7fa90707030111r2ab33606xb2f76269e9e80b1f@mail.gmail.com>

Hi Lino,

On 7/1/07, Lino Mastrodomenico wrote:
> ``__getitem__(integer) -> line``
>
> Returns the line at the specified (y) position.

Just some ideas to think about.

1) Have you considered adding a separate lines property to the Image
protocol?

2) Does one, by default, want to iterate over lines or over pixels of
an image?
Even your example iterates over pixels: # iterate over an image for line in rgb_image: for pixel in line: # swap red and blue, and set green to 0 pixel.value = pixel.b, 0, pixel.r why not just: # iterate over an image for pixel in rgb_image: pixel.value = pixel.b, 0, pixel.r 3) The pixels method (same for the possible lines property that I mentioned above) should probably be a property, i.e.: pixels -> iterator[pixel], not: pixels() -> iterator[pixel] P.S.: You might also inform the SciPy/NumPy lists about the PEP. Keep up the good work!, Edin From guido at python.org Tue Jul 3 10:14:17 2007 From: guido at python.org (Guido van Rossum) Date: Tue, 3 Jul 2007 10:14:17 +0200 Subject: [Python-3000] Announcing PEP 3136 In-Reply-To: <20070630205444.GD22221@theory.org> References: <20070630205444.GD22221@theory.org> Message-ID: On 6/30/07, Matt Chisholm wrote: > I've created and submitted a new PEP proposing support for labels in > Python's break and continue statements. Georg Brandl has graciously > added it to the PEP list as PEP 3136: > > http://www.python.org/dev/peps/pep-3136/ I think this is a good summary of various proposals that have been floated in the past, plus some new ones. As a PEP, it falls short because it doesn't pick a solution but merely offers a large menu of possible options. Also, there is nothing about implementation yet. However, I'm rejecting it on the basis that code so complicated to require this feature is very rare. In most cases there are existing work-arounds that produce clean code, for example using 'return'. While I'm sure there are some (rare) real cases where clarity of the code would suffer from a refactoring that makes it possible to use return, this is offset by two issues: 1. The complexity added to the language, permanently. This affects not only all Python implementations, but also every source analysis tool, plus of course all documentation for the language. 2. 
My expectation that the feature will be abused more than it will be used right, leading to a net decrease in code clarity (measured across all Python code written henceforth). Lazy programmers are everywhere, and before you know it you have an incredible mess on your hands of unintelligible code. I realize this is a heavy bar to pass, and somewhat subjective. That's okay. There is real value in having a small language. Also, as I said, while there are no past PEPs to document it, this has been brought up and rejected many times before. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From rasky at develer.com Tue Jul 3 10:27:03 2007 From: rasky at develer.com (Giovanni Bajo) Date: Tue, 03 Jul 2007 10:27:03 +0200 Subject: [Python-3000] Announcing PEP 3136 In-Reply-To: <4689F7F1.1070503@cs.byu.edu> References: <20070630205444.GD22221@theory.org> <46898BBD.1040901@canterbury.ac.nz> <4689F7F1.1070503@cs.byu.edu> Message-ID: On 03/07/2007 9.17, Neil Toronto wrote: > Greg Ewing wrote: >> On 30/06/2007 22.54, Matt Chisholm wrote: >> >> >>> I've created and submitted a new PEP proposing support for labels in >>> Python's break and continue statements. >>> >>> http://www.python.org/dev/peps/pep-3136/ >>> >> -1. Confusing nested loops are best broken out into >> separate functions rather than patching over the >> problem with features like this. >> > > +1 (not that my vote really counts for much). Breaking logic out into > separate functions can obscure the meaning of an algorithm that is most > naturally implemented with nested loops. Do you have a concrete, real-world example? 
-- Giovanni Bajo From ntoronto at cs.byu.edu Tue Jul 3 11:42:09 2007 From: ntoronto at cs.byu.edu (Neil Toronto) Date: Tue, 03 Jul 2007 03:42:09 -0600 Subject: [Python-3000] Announcing PEP 3136 In-Reply-To: References: <20070630205444.GD22221@theory.org> <46898BBD.1040901@canterbury.ac.nz> <4689F7F1.1070503@cs.byu.edu> Message-ID: <468A19F1.7070302@cs.byu.edu> Giovanni Bajo wrote: > On 03/07/2007 9.17, Neil Toronto wrote: > >> Greg Ewing wrote: >> >>> On 30/06/2007 22.54, Matt Chisholm wrote: >>> >>> >>> >>>> I've created and submitted a new PEP proposing support for labels in >>>> Python's break and continue statements. >>>> >>>> http://www.python.org/dev/peps/pep-3136/ >>>> >>>> >>> -1. Confusing nested loops are best broken out into >>> separate functions rather than patching over the >>> problem with features like this. >>> >>> >> +1 (not that my vote really counts for much). Breaking logic out into >> separate functions can obscure the meaning of an algorithm that is most >> naturally implemented with nested loops. >> > > Do you have a concrete, real-world example? > You pragmatists and your concrete, real-world examples. :p Anyway, sure: image processing -> binary morphological operators -> erode. It's a four-deep nested loop. You pass a binary bitmask (kernel) over a binary image, centering it on each pixel. If one bit in the image is off that's on in the kernel, you turn off the center pixel in the destination. This is the obvious break - it only takes one, so it's senseless to keep going in the inner two loops. Moving the innermost two loops into a new function makes the flow of the algorithm less linear and therefore less clear. (Also, the function would never be called from anywhere else. How about an inner function? That's worse for understandability, IMNSHO.) Other ways of avoiding the inner break, such as counting hits, or overwriting the center pixel repeatedly, obscure the meaning of the morphological operator. 
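For readers who want to see the shape of the loop being described, here is a minimal sketch of binary erosion in Python. It is not the Java code mentioned in this message. The names ``erode`` and ``is_on`` are hypothetical, and treating pixels outside the image as off is an assumption. The early exit is expressed with ``any()``, which stops at the first miss; that is one of the function-style workarounds under debate in this thread, not the labelled break that the PEP proposes.

```python
def erode(image, kernel):
    """Binary erosion of `image` by `kernel` (lists of lists of 0/1).

    Assumption: pixels outside the image count as off.
    """
    height, width = len(image), len(image[0])
    k_height, k_width = len(kernel), len(kernel[0])
    center_y, center_x = k_height // 2, k_width // 2

    def is_on(y, x):
        # Out-of-bounds coordinates are treated as off pixels.
        return 0 <= y < height and 0 <= x < width and image[y][x] == 1

    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # any() short-circuits on the first kernel bit that is on
            # while the matching image bit is off: this plays the role
            # of the break out of the two innermost loops.
            miss = any(
                kernel[ky][kx] and not is_on(y + ky - center_y,
                                             x + kx - center_x)
                for ky in range(k_height)
                for kx in range(k_width)
            )
            result[y][x] = 0 if miss else 1
    return result


# Hypothetical input: a 3x3 block of on pixels inside a 5x5 image,
# eroded with a full 3x3 kernel.
image = [[1 if 1 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(5)]
         for y in range(5)]
kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
eroded = erode(image, kernel)
```

With this input only the center pixel survives, since it is the only position where the whole kernel fits inside the block of on pixels.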
Granted, Python doesn't usually get used for low-level stuff like this, and I'd probably use Numpy array operations in the place of the inner two loops, which would be less efficient, but faster. But you were asking whether algorithms that are naturally expressed as nested loops with breaks exist, and this just happened to be on my hard drive, written in Java. FWIW, I've read Guido's recent rejection of this PEP, but I wanted to take up the challenge of showing that these (admittedly rare) use cases do exist. A lot of them come from 2D analogues of algorithms that call for a break from an inner loop. Neil From tomerfiliba at gmail.com Tue Jul 3 12:59:51 2007 From: tomerfiliba at gmail.com (tomer filiba) Date: Tue, 3 Jul 2007 12:59:51 +0200 Subject: [Python-3000] the do-while pep Message-ID: <1d85506f0707030359w45bd864cn2007e67459df18cc@mail.gmail.com> i haven't seen this issue discussed at all, so i thought i'd bring it up -- what's the status of the pep 315 (do-while syntax)? is it getting into py3k? -tomer -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-3000/attachments/20070703/30989ac6/attachment.html From guido at python.org Tue Jul 3 13:42:14 2007 From: guido at python.org (Guido van Rossum) Date: Tue, 3 Jul 2007 13:42:14 +0200 Subject: [Python-3000] the do-while pep In-Reply-To: <1d85506f0707030359w45bd864cn2007e67459df18cc@mail.gmail.com> References: <1d85506f0707030359w45bd864cn2007e67459df18cc@mail.gmail.com> Message-ID: On 7/3/07, tomer filiba wrote: > i haven't seen this issue discussed at all, so i thought i'd bring it up -- > what's the status of the pep 315 (do-while syntax)? is it getting into py3k? No, it wasn't even considered. It was in the deferred list and nobody suggested we look at it for Py3k. From the message quoted in the deferral note it doesn't look like it's an easy sell. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From john at yates-sheets.org Tue Jul 3 14:32:19 2007 From: john at yates-sheets.org (John S. Yates, Jr.) Date: Tue, 03 Jul 2007 08:32:19 -0400 Subject: [Python-3000] Announcing PEP 3136 In-Reply-To: References: <20070630205444.GD22221@theory.org> Message-ID: On Tue, 3 Jul 2007, "Guido van Rossum" wrote: >However, I'm rejecting it on the basis that code so complicated to >require this feature is very rare. I assume that you are familiar with Donald E. Knuth's classic paper: "Structured Programming with go to Statements" http://pplab.snu.ac.kr/courses/adv_pl05/papers/p261-knuth.pdf /john From alexandre at peadrop.com Tue Jul 3 17:06:20 2007 From: alexandre at peadrop.com (Alexandre Vassalotti) Date: Tue, 3 Jul 2007 11:06:20 -0400 Subject: [Python-3000] StringIO/BytesIO in io.py doesn't over-seek properly In-Reply-To: References: Message-ID: On 7/2/07, Guido van Rossum wrote: > Honestly, I think truncate() should always set the current position to > the new size, even though that's not what it currently does. Thought about that and I think that would be the best thing to do. That would avoid making StringIO unnecessary different from BytesIO. And IMHO, it is less prone to bugs. If someone wants to truncate while keeping the current position, then he will have to state is intention explicitly by saving the value of tell() and calling seek() after truncating. I also find the semantic make more sense too. For example: >>> s = StringIO("Good bye, world") >>> s.truncate(10) >>> s.write("cruel world") >>> s.getvalue() ??? I think that should return "Good bye, cruel world", not "cruel world". So, does anyone else agree with this small semantic change of truncate()? 
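A short sketch of the "state the intention explicitly" approach, written against the ``io.StringIO`` class as found in current CPython releases (an assumption relative to the in-progress ``io.py`` discussed in this thread). Seeking explicitly after ``truncate()`` gives the same result whichever position semantics ``truncate()`` itself ends up with.

```python
import io

s = io.StringIO("Good bye, world")
s.truncate(10)            # keep only "Good bye, "
s.seek(0, io.SEEK_END)    # make the write position explicit
s.write("cruel world")
result = s.getvalue()
```

Because the position is set by hand, the code does not depend on whether ``truncate()`` moves the file pointer.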
-- Alexandre

From p.f.moore at gmail.com Tue Jul 3 17:13:51 2007
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 3 Jul 2007 16:13:51 +0100
Subject: [Python-3000] StringIO/BytesIO in io.py doesn't over-seek properly
In-Reply-To:
References:
Message-ID: <79990c6b0707030813o38b36960m7a6469722fd05444@mail.gmail.com>

On 03/07/07, Alexandre Vassalotti wrote:
> I also find the semantic make more sense too. For example:
>
> >>> s = StringIO("Good bye, world")
> >>> s.truncate(10)
> >>> s.write("cruel world")
> >>> s.getvalue()
> ???
>
> I think that should return "Good bye, cruel world", not "cruel world".
>
> So, does anyone else agree with this small semantic change of truncate()?

Looks reasonable to me - without checking documentation, your proposal
is what I'd expect the example to do.

Paul.

From tjreedy at udel.edu Wed Jul 4 01:40:14 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 3 Jul 2007 19:40:14 -0400
Subject: [Python-3000] Announcing PEP 3136
References: <20070630205444.GD22221@theory.org>
Message-ID:

"John S. Yates, Jr." wrote in message
news:j1gk83hfeltf7uqi0u4b5tpgbg80qg3cp0 at 4ax.com...
| On Tue, 3 Jul 2007, "Guido van Rossum" wrote:
|
| >However, I'm rejecting it on the basis that code so complicated to
| >require this feature is very rare.
|
| I assume that you are familiar with Donald E. Knuth's classic paper:
| "Structured Programming with go to Statements"
| http://pplab.snu.ac.kr/courses/adv_pl05/papers/p261-knuth.pdf

Do you consider this to be for or against the PEP?
Rereading it....

At least half Knuth's goto examples are covered
by Python's single level restricted gotos:

Example 1 (switched to 0-based arrays, not tested):

for i in range(m):
    if A[i] == x: break
else:
    A[m] = x
    B[m] = 0
    m += 1
B[i] += 1

Example 5 (ditto):

i = 0  #? initial value not given
while True:
    if A[i] < x:
        if L[i] != 0: i = L[i]; continue
        else: L[i] = j; break
    else: # > x
        if R[i] != 0: i = R[i]; continue
        else: R[i] = j; break
# dup code could be factored with LR = L or R as A[i] < or > x
A[j] = x
L[j] = R[j] = 0
j += 1

The rest are general gotos, including jumps into the middle of loops.
None are multilevel continues or breaks.

tjr

From john at yates-sheets.org Wed Jul 4 15:41:48 2007
From: john at yates-sheets.org (John S. Yates, Jr.)
Date: Wed, 04 Jul 2007 09:41:48 -0400
Subject: [Python-3000] Announcing PEP 3136
In-Reply-To:
References: <20070630205444.GD22221@theory.org>
Message-ID:

On Tue, 3 Jul 2007, "Terry Reedy" wrote:

>Do you consider this to be for or against the PEP?
>Rereading it....
>
>At least half Knuth's goto examples are covered
>by Python's single level restricted gotos:

In all honesty I did not reread the paper. I posted based on the
recollection that it was the basis for my feeling no compunction about
using reviled gotos in my C / C++ code to effect multi-level exits and
continuations. When called to task by my peers I invoke Knuth's name.

Let's chalk it up to the fallibility of memory over a span of more than
30 years. Thanks for keeping me honest. I am printing out the paper.
Rereading it should help me recall the state of programming when I was
first starting out.

/john

From turnbull at sk.tsukuba.ac.jp Wed Jul 4 18:46:08 2007
From: turnbull at sk.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Thu, 05 Jul 2007 01:46:08 +0900
Subject: [Python-3000] Announcing PEP 3136
In-Reply-To:
References: <20070630205444.GD22221@theory.org>
Message-ID: <876450xgkv.fsf@uwakimon.sk.tsukuba.ac.jp>

John S. Yates, Jr. writes:

> In all honesty I did not reread the paper.

Sir, you have my thanks for this small misstep, without which you would
have undoubtedly abstained from posting that URL, and I, in turn, would
have missed a chance to read that wonderful paper.
From collinw at gmail.com Fri Jul 6 16:03:56 2007
From: collinw at gmail.com (Collin Winter)
Date: Fri, 6 Jul 2007 16:03:56 +0200
Subject: [Python-3000] Change to class construction?
Message-ID: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>

While experimenting with porting setuptools to py3k (as of r56155), I
ran into this situation:

class C:
    a = (4, 5)
    b = [c for c in range(2) if a]

results in a "NameError: global name 'a' is not defined" error, while

class C:
    a = (4, 5)
    b = [c for c in a]

works fine. This gives the same error as above:

class C:
    a = (4, 5)
    b = [a for c in range(2)]

Both now-erroneous snippets work in 2.5.1. Was this change intentional?

Collin Winter

From g.brandl at gmx.net Fri Jul 6 17:00:08 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 06 Jul 2007 17:00:08 +0200
Subject: [Python-3000] Change to class construction?
In-Reply-To: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
Message-ID:

Collin Winter schrieb:
> While experimenting with porting setuptools to py3k (as of r56155), I
> ran into this situation:
>
> class C:
>     a = (4, 5)
>     b = [c for c in range(2) if a]
>
> results in a "NameError: global name 'a' is not defined" error, while
>
> class C:
>     a = (4, 5)
>     b = [c for c in a]
>
> works fine. This gives the same error as above:
>
> class C:
>     a = (4, 5)
>     b = [a for c in range(2)]
>
> Both now-erroneous snippets work in 2.5.1. Was this change intentional?

It is at least intentional in the sense that in 3k it works the same as
with genexps, which give the same errors in 2.5.

What's different is that all code inside a genexp except the first
iterator (which is why the second example works) is contained in its own
function namespace. So, an equivalent problem is:

class C:
    foo = 1
    def bar():
        print(foo)
    bar()

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From pje at telecommunity.com Fri Jul 6 19:25:10 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 06 Jul 2007 13:25:10 -0400 Subject: [Python-3000] Change to class construction? In-Reply-To: References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com> Message-ID: <20070706172258.1E65B3A4046@sparrow.telecommunity.com> At 05:00 PM 7/6/2007 +0200, Georg Brandl wrote: >Collin Winter schrieb: > > While experimenting with porting setuptools to py3k (as of r56155), I > > ran into this situation: > > > > class C: > > a = (4, 5) > > b = [c for c in range(2) if a] > > > > results in a "NameError: global name 'a' is not defined" error, while > > > > class C: > > a = (4, 5) > > b = [c for c in a] > > > > works fine. This gives the same error as above: > > > > class C: > > a = (4, 5) > > b = [a for c in range(2)] > > > > Both now-erroneous snippets work in 2.5.1. Was this change intentional? > >It is at least intentional in the sense that in 3k it works the same as with >genexps, which give the same errors in 2.5. This looks like a bug to me. A list comprehension's local scope should be the locals of the enclosing code, even if its loop indexes aren't exposed to that scope. From guido at python.org Sat Jul 7 00:32:15 2007 From: guido at python.org (Guido van Rossum) Date: Sat, 7 Jul 2007 00:32:15 +0200 Subject: [Python-3000] Change to class construction? In-Reply-To: <20070706172258.1E65B3A4046@sparrow.telecommunity.com> References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com> <20070706172258.1E65B3A4046@sparrow.telecommunity.com> Message-ID: On 7/6/07, Phillip J. 
Eby wrote:
> At 05:00 PM 7/6/2007 +0200, Georg Brandl wrote:
> >Collin Winter wrote:
> > > While experimenting with porting setuptools to py3k (as of r56155), I
> > > ran into this situation:
> > >
> > > class C:
> > >     a = (4, 5)
> > >     b = [c for c in range(2) if a]
> > >
> > > results in a "NameError: global name 'a' is not defined" error, while
> > >
> > > class C:
> > >     a = (4, 5)
> > >     b = [c for c in a]
> > >
> > > works fine. This gives the same error as above:
> > >
> > > class C:
> > >     a = (4, 5)
> > >     b = [a for c in range(2)]
> > >
> > > Both now-erroneous snippets work in 2.5.1. Was this change intentional?
> >
> >It is at least intentional in the sense that in 3k it works the same as with
> >genexps, which give the same errors in 2.5.
>
> This looks like a bug to me. A list comprehension's local scope
> should be the locals of the enclosing code, even if its loop indexes
> aren't exposed to that scope.

It's because the class scope is not made available to the methods.
That is intentional. Georg's later example is relevant:

class C:
    a = 1
    def f(self):
        print(a)  # <-- raises NameError for 'a'

This is in turn intentional so that too-clever kids don't develop a
habit of referencing class variables without prefixing them with self
or C.

The OP's use case is rare enough that I don't think we should do
anything about it.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com Sat Jul 7 01:36:07 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 06 Jul 2007 19:36:07 -0400
Subject: [Python-3000] Change to class construction?
In-Reply-To:
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
 <20070706172258.1E65B3A4046@sparrow.telecommunity.com>
Message-ID: <20070706233354.DA07A3A4046@sparrow.telecommunity.com>

At 12:32 AM 7/7/2007 +0200, Guido van Rossum wrote:
>On 7/6/07, Phillip J.
Eby wrote: > > At 05:00 PM 7/6/2007 +0200, Georg Brandl wrote: > > >Collin Winter schrieb: > > > > While experimenting with porting setuptools to py3k (as of r56155), I > > > > ran into this situation: > > > > > > > > class C: > > > > a = (4, 5) > > > > b = [c for c in range(2) if a] > > > > > > > > results in a "NameError: global name 'a' is not defined" error, while > > > > > > > > class C: > > > > a = (4, 5) > > > > b = [c for c in a] > > > > > > > > works fine. > > > > This looks like a bug to me. A list comprehension's local scope > > should be the locals of the enclosing code, even if its loop indexes > > aren't exposed to that scope. > >It's because the class scope is not made available to the methods. The examples are in the class body, not in methods. The code is statically initializing the class contents, so using C.a isn't possible. I suppose it can be worked around by moving the static initialization code outside the class body; it's just not obvious why it happens. Collin, where did you find this code in setuptools, btw? I've been looking around at other packages of mine where static class initialization uses data structures like this, and I haven't found any place where anything but the "in" clause of a comprehension depends on class-scope variables. So, if setuptools is the only one of my libraries that does this, I'd have to agree with Guido that it is indeed quite rare. :) If I had to hazard a guess, I'd guess that it's in one of the setuptools command classes that subclasses a distutils command, and proceeds to muck around with the original options in some fashion. I just don't want to check all of them if you know which one it is. :) From guido at python.org Sat Jul 7 01:41:16 2007 From: guido at python.org (Guido van Rossum) Date: Sat, 7 Jul 2007 01:41:16 +0200 Subject: [Python-3000] Change to class construction? 
In-Reply-To: <20070706233354.DA07A3A4046@sparrow.telecommunity.com> References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com> <20070706172258.1E65B3A4046@sparrow.telecommunity.com> <20070706233354.DA07A3A4046@sparrow.telecommunity.com> Message-ID: On 7/7/07, Phillip J. Eby wrote: > At 12:32 AM 7/7/2007 +0200, Guido van Rossum wrote: > >On 7/6/07, Phillip J. Eby wrote: > > > At 05:00 PM 7/6/2007 +0200, Georg Brandl wrote: > > > >Collin Winter schrieb: > > > > > While experimenting with porting setuptools to py3k (as of r56155), I > > > > > ran into this situation: > > > > > > > > > > class C: > > > > > a = (4, 5) > > > > > b = [c for c in range(2) if a] > > > > > > > > > > results in a "NameError: global name 'a' is not defined" error, while > > > > > > > > > > class C: > > > > > a = (4, 5) > > > > > b = [c for c in a] > > > > > > > > > > works fine. > > > > > > This looks like a bug to me. A list comprehension's local scope > > > should be the locals of the enclosing code, even if its loop indexes > > > aren't exposed to that scope. > > > >It's because the class scope is not made available to the methods. > > The examples are in the class body, not in methods. The code is > statically initializing the class contents, so using C.a isn't possible. Understood, but a generator expression (and hence in 3.0 also a list comprehension) is treated the same as a method body. > I suppose it can be worked around by moving the static initialization > code outside the class body; it's just not obvious why it happens. > > Collin, where did you find this code in setuptools, btw? I've been > looking around at other packages of mine where static class > initialization uses data structures like this, and I haven't found > any place where anything but the "in" clause of a comprehension > depends on class-scope variables. So, if setuptools is the only one > of my libraries that does this, I'd have to agree with Guido that it > is indeed quite rare. 
:) > > If I had to hazard a guess, I'd guess that it's in one of the > setuptools command classes that subclasses a distutils command, and > proceeds to muck around with the original options in some fashion. I > just don't want to check all of them if you know which one it is. :) > > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Sat Jul 7 03:17:39 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 07 Jul 2007 13:17:39 +1200 Subject: [Python-3000] Change to class construction? In-Reply-To: <20070706172258.1E65B3A4046@sparrow.telecommunity.com> References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com> <20070706172258.1E65B3A4046@sparrow.telecommunity.com> Message-ID: <468EE9B3.40902@canterbury.ac.nz> Phillip J. Eby wrote: > This looks like a bug to me. A list comprehension's local scope > should be the locals of the enclosing code, even if its loop indexes > aren't exposed to that scope. It sounds like list comprehensions are being implemented using genexps behind the scenes now. Is this wise? In a recent thread, I suggested that one of the reasons for keeping the LC syntax was that it could be faster than list(genexp). Has anyone investigated whether any speed is being lost by making them equivalent? -- Greg From g.brandl at gmx.net Sat Jul 7 08:55:12 2007 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 07 Jul 2007 08:55:12 +0200 Subject: [Python-3000] Change to class construction? In-Reply-To: <468EE9B3.40902@canterbury.ac.nz> References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com> <20070706172258.1E65B3A4046@sparrow.telecommunity.com> <468EE9B3.40902@canterbury.ac.nz> Message-ID: Greg Ewing schrieb: > Phillip J. Eby wrote: >> This looks like a bug to me. A list comprehension's local scope >> should be the locals of the enclosing code, even if its loop indexes >> aren't exposed to that scope. 
> > It sounds like list comprehensions are being implemented > using genexps behind the scenes now. That's not true, but the implementation is somewhat similar in that the code is executed in its own function context. > Is this wise? In a recent thread, I suggested that one > of the reasons for keeping the LC syntax was that it > could be faster than list(genexp). Has anyone investigated > whether any speed is being lost by making them equivalent? I don't remember the details, but IIRC the new LC implementation was not slower than the 2.x one. Nick should know more about that. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From ncoghlan at gmail.com Sat Jul 7 16:15:54 2007 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 08 Jul 2007 00:15:54 +1000 Subject: [Python-3000] Change to class construction? In-Reply-To: References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com> <20070706172258.1E65B3A4046@sparrow.telecommunity.com> <468EE9B3.40902@canterbury.ac.nz> Message-ID: <468FA01A.6040707@gmail.com> Georg Brandl wrote: > Greg Ewing schrieb: >> Phillip J. Eby wrote: >>> This looks like a bug to me. A list comprehension's local scope >>> should be the locals of the enclosing code, even if its loop indexes >>> aren't exposed to that scope. >> It sounds like list comprehensions are being implemented >> using genexps behind the scenes now. > > That's not true, but the implementation is somewhat similar in that > the code is executed in its own function context. Georg is correct. 
A list comprehension like:

    [(x * y) for x in seq1 for y in seq2]

expands to the following in 2.x (% prefixes the compiler's hidden
variables):

    %n = []
    for x in seq1:
        for y in seq2:
            %n.append(x*y) # Special opcode, not a normal call

In py3k it expands to:

    def (outermost):
        %0 = []
        for x in outermost:
            for y in seq2:
                %0.append(x*y) # Special opcode, not a normal call
        return %0
    %n = (seq1)

Python's scoping rules are somewhat tricky - doing it this way means we
know they are being applied the same way in list and set comprehensions
as they are applied in generator expressions, even if it isn't quite as
fast as the 2.x approach to comprehensions.

Another significant benefit from a maintainability point of view is that
the 3 kinds of comprehension (list, set, genexp) now follow the same
code path through the compiler, with only minor variations in the
setup/cleanup code and the statement inside the innermost loop.

>> Is this wise? In a recent thread, I suggested that one
>> of the reasons for keeping the LC syntax was that it
>> could be faster than list(genexp). Has anyone investigated
>> whether any speed is being lost by making them equivalent?
>
> I don't remember the details, but IIRC the new LC implementation
> was not slower than the 2.x one. Nick should know more about that.

Inside a function, Py3k is slower by a constant amount relative to 2.x
(the cost of creating and calling a function object) regardless of the
length of the resulting list/set.

At module level, Py3k will typically be faster, as the fixed cost from
the anonymous function object will be overtaken by the speedup from the
iteration variables becoming function locals instead of module globals.

The Py3k comprehensions are still significantly faster than the
equivalent generator expressions, as they still avoid suspending and
resuming a generator for each value in the resulting sequence.
The bit that makes all of this tricky isn't really hiding the iteration variables from the containing scope - it's making sure that the body of the comprehension can still see them after you have done so (particularly challenging if the comprehension itself contains a lambda expression, or another comprehension/genexp). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From tjreedy at udel.edu Sat Jul 7 19:08:15 2007 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 7 Jul 2007 13:08:15 -0400 Subject: [Python-3000] Change to class construction? References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com> <20070706172258.1E65B3A4046@sparrow.telecommunity.com> <468EE9B3.40902@canterbury.ac.nz> <468FA01A.6040707@gmail.com> Message-ID: "Nick Coghlan" wrote in message news:468FA01A.6040707 at gmail.com... | Georg is correct. A list comprehension like: | | [(x * y) for x in seq1 for y in seq2] | | expands to the following in 2.x (% prefixes the compiler's hidden | variables): | | %n = [] | for x in seq1: | for y in seq2: | %n.append(x*y) # Special opcode, not a normal call | | In py3k it expands to: | | def (outermost): | %0 = [] | for x in outermost: | for y in seq2: | %0.append(x*y) # Special opcode, not a normal call | return %0 | %n = (seq1) Why not pass both seq1 *and* seq2 to the function so both become locals? The difference of treatment is quite surprising. 
From tjreedy at udel.edu Sat Jul 7 19:15:55 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 7 Jul 2007 13:15:55 -0400
Subject: [Python-3000] PEP 368: Standard image protocol and class
References:
Message-ID:

Reference Implementation
========================

If this PEP is accepted, the author will provide a reference
implementation of the new classes in pure Python (that can run in
CPython, PyPy, Jython and IronPython) and a second one optimized for
speed in Python and C, suitable for inclusion in the CPython standard
library. The author will also submit the required Tkinter patches.
For all the code will be available a version for Python 2.x and a
version for Python 3.0 (it is expected that the two versions will be
very similar and the Python 3.0 one will probably be generated almost
completely automatically).

Acknowledgments
===============

The implementation of this PEP, if accepted, is sponsored by Google
through the Google Summer of Code program.

****************************************************
*****************************************************

1. I think this *should* conform to the new buffer protocol. Assume that it
will be in 3.0.

2. I don't see how work you promised to do for your stipend can be
contingent on acceptance into the standard lib. In any case, this should
be released at the end of the summer as patches and a 3rd-party module on
PyPI so it can be tested in practice and then proposed for the library.
Very few new library modules get accepted before written ;-).

From ncoghlan at gmail.com Sun Jul 8 07:10:16 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 08 Jul 2007 15:10:16 +1000
Subject: [Python-3000] Change to class construction?
In-Reply-To: References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com> <20070706172258.1E65B3A4046@sparrow.telecommunity.com> <468EE9B3.40902@canterbury.ac.nz> <468FA01A.6040707@gmail.com> Message-ID: <469071B8.8030604@gmail.com> Terry Reedy wrote: > "Nick Coghlan" wrote in message > news:468FA01A.6040707 at gmail.com... > | In py3k it expands to: > | > | def (outermost): > | %0 = [] > | for x in outermost: > | for y in seq2: > | %0.append(x*y) # Special opcode, not a normal call > | return %0 > | %n = (seq1) > > Why not pass both seq1 *and* seq2 to the function so both become locals? > The difference of treatment is quite surprising. The inner iterable expressions can't be evaluated early, as they need to be re-evaluated for each pass around the outer loop (or loops). An example where the iterable expression for the inner loop refers to the iteration variable of the outer loop should make that clear: .>>> [y for x in range(4) for y in range(x)] [0, 0, 1, 0, 1, 2] The advantage of the Py3k approach is that it eliminates the current semantic differences between a list comprehension and list() with a generator expression argument, while keeping most of the performance benefits of the special syntax. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From collinw at gmail.com Sun Jul 8 10:55:45 2007 From: collinw at gmail.com (Collin Winter) Date: Sun, 8 Jul 2007 11:55:45 +0300 Subject: [Python-3000] Change to class construction? In-Reply-To: <20070706233354.DA07A3A4046@sparrow.telecommunity.com> References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com> <20070706172258.1E65B3A4046@sparrow.telecommunity.com> <20070706233354.DA07A3A4046@sparrow.telecommunity.com> Message-ID: <43aa6ff70707080155r820ec31t595442753817f0ae@mail.gmail.com> On 7/7/07, Phillip J. 
Eby wrote: > Collin, where did you find this code in setuptools, btw? I've been > looking around at other packages of mine where static class > initialization uses data structures like this, and I haven't found > any place where anything but the "in" clause of a comprehension > depends on class-scope variables. So, if setuptools is the only one > of my libraries that does this, I'd have to agree with Guido that it > is indeed quite rare. :) > > If I had to hazard a guess, I'd guess that it's in one of the > setuptools command classes that subclasses a distutils command, and > proceeds to muck around with the original options in some fashion. I > just don't want to check all of them if you know which one it is. :) Yep, it's in setuptools.command.install, lines 20-23 (setuptools v0.6c6). Collin Winter From ferringb at gmail.com Sun Jul 8 16:07:42 2007 From: ferringb at gmail.com (Brian Harring) Date: Sun, 8 Jul 2007 07:07:42 -0700 Subject: [Python-3000] Change to class construction? In-Reply-To: References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com> <20070706172258.1E65B3A4046@sparrow.telecommunity.com> <468FA01A.6040707@gmail.com> Message-ID: <20070708140741.GD23765@seldon> On Sat, Jul 07, 2007 at 01:08:15PM -0400, Terry Reedy wrote: > > "Nick Coghlan" wrote in message > news:468FA01A.6040707 at gmail.com... > | Georg is correct. A list comprehension like: > | > | [(x * y) for x in seq1 for y in seq2] > | > | expands to the following in 2.x (% prefixes the compiler's hidden > | variables): > | > | %n = [] > | for x in seq1: > | for y in seq2: > | %n.append(x*y) # Special opcode, not a normal call > | > | In py3k it expands to: > | > | def (outermost): > | %0 = [] > | for x in outermost: > | for y in seq2: > | %0.append(x*y) # Special opcode, not a normal call > | return %0 > | %n = (seq1) > > Why not pass both seq1 *and* seq2 to the function so both become locals? > The difference of treatment is quite surprising. 
I'd be curious if there is any way to preserve the existing behaviour;

class foo:
    some_list = ('blacklist1', 'blacklist2')
    known_bad = some_list += ('blah',)
    locals().update([(attr, some_callable) for attr in some_list])

is slightly contrived, but I use similar code quite often for method
generation- both for tests, and standard enough objects. Realize I
could do the same via metaclasses, but it's an extra step and not
nearly as easy/friendly imo.

So... any way to preserve that trick under py3k?

~harring

From pje at telecommunity.com Sun Jul 8 19:50:30 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 08 Jul 2007 13:50:30 -0400
Subject: [Python-3000] Change to class construction?
In-Reply-To: <43aa6ff70707080155r820ec31t595442753817f0ae@mail.gmail.com>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
 <20070706172258.1E65B3A4046@sparrow.telecommunity.com>
 <20070706233354.DA07A3A4046@sparrow.telecommunity.com>
 <43aa6ff70707080155r820ec31t595442753817f0ae@mail.gmail.com>
Message-ID: <20070708174816.E26A33A404D@sparrow.telecommunity.com>

At 11:55 AM 7/8/2007 +0300, Collin Winter wrote:
>On 7/7/07, Phillip J. Eby wrote:
>>Collin, where did you find this code in setuptools, btw? I've been
>>looking around at other packages of mine where static class
>>initialization uses data structures like this, and I haven't found
>>any place where anything but the "in" clause of a comprehension
>>depends on class-scope variables. So, if setuptools is the only one
>>of my libraries that does this, I'd have to agree with Guido that it
>>is indeed quite rare.
:) >> >>If I had to hazard a guess, I'd guess that it's in one of the >>setuptools command classes that subclasses a distutils command, and >>proceeds to muck around with the original options in some fashion. I >>just don't want to check all of them if you know which one it is. :) > >Yep, it's in setuptools.command.install, lines 20-23 (setuptools v0.6c6). Ah. Yeah, no big deal to change it; 'new_commands' and '_nc' don't need to be attributes of the class, and so could just be done before the 'class:' statement. I don't know why I even bothered with the _nc thing there, either. From ncoghlan at gmail.com Mon Jul 9 13:03:15 2007 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 09 Jul 2007 21:03:15 +1000 Subject: [Python-3000] Change to class construction? In-Reply-To: <20070708140741.GD23765@seldon> References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com> <20070706172258.1E65B3A4046@sparrow.telecommunity.com> <468FA01A.6040707@gmail.com> <20070708140741.GD23765@seldon> Message-ID: <469215F3.90807@gmail.com> Brian Harring wrote: > > I'd be curious if there is anyway to preserve the existing behaviour; > > class foo: > some_list = ('blacklist1', 'blacklist2') > known_bad = some_list += ('blah',) > locals().update([(attr, some_callable) for attr in some_list]) > > is slightly contrived, but I use similar code quite often for method > generation- both for tests, and standard enough objects. Realize I > could do the same via metaclasses, but it's an extra step and not > nearly as easy/friendly imo. > > So... anyway to preserve that trick under py3k? As you've written it, that trick isn't affected by the semantic change at all (as the expression inside the list comprehension doesn't try to refer to a class variable). 
If 'some_callable' was actually a method of the class, then you'd need
to use an actual for loop instead of the list comprehension:

class foo(object):
    some_list = ('blacklist1', 'blacklist2')
    def some_method(self):
        # whatever
        pass
    for attr in some_list:
        locals()[attr] = some_method

However, I will point out that setting class attributes via locals() is
formally undefined (it happens to work in current versions of CPython,
but there's no guarantee that will always be the case).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org

From pje at telecommunity.com Mon Jul 9 17:07:06 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 09 Jul 2007 11:07:06 -0400
Subject: [Python-3000] Change to class construction?
In-Reply-To: <469215F3.90807@gmail.com>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
 <20070706172258.1E65B3A4046@sparrow.telecommunity.com>
 <468FA01A.6040707@gmail.com>
 <20070708140741.GD23765@seldon>
 <469215F3.90807@gmail.com>
Message-ID: <20070709150454.641193A404D@sparrow.telecommunity.com>

At 09:03 PM 7/9/2007 +1000, Nick Coghlan wrote:
>However, I will point out that setting class attributes via locals() is
>formally undefined (it happens to work in current versions of CPython,
>but there's no guarantee that will always be the case).

As of PEP 3115, it's no longer undefined for class statements.

Of course, if it were truly undefined to begin with, we wouldn't be
so worried about how to implement the potential optimizations that
the undefinedness theoretically implies. :) (i.e. optimized globals/locals)

From guido at python.org Mon Jul 9 17:13:46 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 9 Jul 2007 18:13:46 +0300
Subject: [Python-3000] Change to class construction?
In-Reply-To: <20070709150454.641193A404D@sparrow.telecommunity.com>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
 <20070706172258.1E65B3A4046@sparrow.telecommunity.com>
 <468FA01A.6040707@gmail.com>
 <20070708140741.GD23765@seldon>
 <469215F3.90807@gmail.com>
 <20070709150454.641193A404D@sparrow.telecommunity.com>
Message-ID:

On 7/9/07, Phillip J. Eby wrote:
> At 09:03 PM 7/9/2007 +1000, Nick Coghlan wrote:
> >However, I will point out that setting class attributes via locals() is
> >formally undefined (it happens to work in current versions of CPython,
> >but there's no guarantee that will always be the case).
>
> As of PEP 3115, it's no longer undefined for class statements.

Where does it say so? To be honest, I don't know where to find Nick's
claim in the reference manual. But I'm surprised that you read
anything about locals() into that PEP, as it doesn't mention that
function at all.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com Mon Jul 9 18:03:28 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 09 Jul 2007 12:03:28 -0400
Subject: [Python-3000] Change to class construction?
In-Reply-To:
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com>
 <20070706172258.1E65B3A4046@sparrow.telecommunity.com>
 <468FA01A.6040707@gmail.com>
 <20070708140741.GD23765@seldon>
 <469215F3.90807@gmail.com>
 <20070709150454.641193A404D@sparrow.telecommunity.com>
Message-ID: <20070709160115.16C323A404D@sparrow.telecommunity.com>

At 06:13 PM 7/9/2007 +0300, Guido van Rossum wrote:
>On 7/9/07, Phillip J. Eby wrote:
> > At 09:03 PM 7/9/2007 +1000, Nick Coghlan wrote:
> > >However, I will point out that setting class attributes via locals() is
> > >formally undefined (it happens to work in current versions of CPython,
> > >but there's no guarantee that will always be the case).
> >
> > As of PEP 3115, it's no longer undefined for class statements.
>
>Where does it say so?
To be honest, I don't know where to find Nick's
>claim in the reference manual.

I assume Nick is referring to:

http://www.python.org/doc/2.2/ref/execframes.html

which says it's undefined. I can't seem to find where this section
went to in 2.3 and beyond, or anything that says what happens with
non-dictionary objects, except:

http://docs.python.org/ref/exec.html

which makes a much stronger claim: "The built-in functions globals()
and locals() return the current global and local dictionary,
respectively" and also states that as of 2.4, exec allows the use of
any mapping object as the locals.

There isn't any mention of the fact that locals() may not be
writable, which should probably be considered an error.

>But I'm surprised that you read
>anything about locals() into that PEP, as it doesn't mention that
>function at all.

Correct -- which means that either the PEP is in error, or the
semantics of locals() must be that the actual namespace in use is returned.

My reasoning: since PEP 3115 allows an arbitrary mapping object to be
used, there is no way that such an object can be converted to a
read-only dictionary, and the current definition (as I understand it)
is that locals() returns you either the actual local namespace
object, or a "dictionary representing the ... namespace" (per the
reference manual).

Since PEP 3115 does not require that there be any way of converting
the arbitrary mapping object into a dictionary (or even that there be
any pre-defined way of *reading* its contents!) there is no way that
locals() can fulfill its existing contract *except* by returning that
object. QED.

Well, that's the spelled-out reasoning for my intuition, anyway. :)

That doesn't mean the PEP or the specification of locals() can't
change, but it seems to me that if one or the other doesn't, then
modifying class-suite locals() to create class members implicitly
becomes official, since the failure for it to do so would become a
bug in locals().
(Since it will no longer be returning a "dictionary representing the namespace" if it doesn't return that mapping object, and can't possibly return anything else that "represents" the namespace in any meaningful way.) From pje at telecommunity.com Mon Jul 9 20:44:09 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 09 Jul 2007 14:44:09 -0400 Subject: [Python-3000] A request to keep dict.setdefault() in 3.0 Message-ID: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com> PEP 3100 suggests dict.setdefault() may be removed in Python 3, since it is in principle no longer necessary (due to the new defaultdict type). However, there is another class of use cases which use setdefault for its limited atomic properties - the initialization of non-mutated data structures that are shared among threads. (And defaultdict cannot achieve the same thing.) I currently have three places where I use this, off the top of my head: 1. a "synchronized" decorator that initializes an object's __lock__ attribute (if not found) using ob.__dict__.setdefault('__lock__', allocate_lock()) 2. an Aspect implementation that does almost exactly the same thing, so that if multiple threads ask for an Aspect that doesn't exist for a given object, they will not end up using different instances. 3. a configuration library that supports "write many, read once" configurations shared across threads. A key may have its value written to any number of times, so long as it has never been read. As soon as the value has been read by any thread, it becomes fixed and it cannot be set to any other value. (Setting it to the same value has no effect.) This is essentially a simple way of having a provably race-condition-free data structure -- if you have a race condition, you will get an error. As a bonus, it is completely non-blocking and single threaded code does not pay any overhead for the use of the data structures. 
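The first use case in the list above could look roughly like the sketch below. It is an assumption-laden illustration rather than the code the message refers to: `threading.RLock` stands in for `allocate_lock()`, and `Counter` and `run_demo` are hypothetical names invented for the demonstration. As the message goes on to note, the atomicity of `setdefault` that this relies on is a CPython property, not a language guarantee.

```python
import functools
import threading


def synchronized(method):
    """Serialize calls to `method` on a per-instance lock (sketch only)."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        # setdefault returns the lock already stored, if any; otherwise it
        # stores and returns the freshly created one. Threads racing here
        # therefore all end up sharing a single lock object.
        lock = self.__dict__.setdefault('__lock__', threading.RLock())
        with lock:
            return method(self, *args, **kwargs)
    return wrapper


class Counter:
    # Hypothetical class used only to exercise the decorator.
    def __init__(self):
        self.value = 0

    @synchronized
    def increment(self):
        self.value += 1


def run_demo(threads=4, per_thread=1000):
    counter = Counter()

    def work():
        for _ in range(per_thread):
            counter.increment()

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return counter
```

One cost of this spelling is that a throwaway lock is created on every call once the attribute exists; checking `self.__dict__.get('__lock__')` first would avoid that while keeping `setdefault` for the initial race.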
Of course, to take advantage of setdefault's atomic properties, one must be using CPython, and all the dictionary keys must have __hash__ and __eq__ methods implemented entirely in C (recursively to their contents, if tuples are involved). However, for all three of the above applications this latter condition is actually quite trivial to ensure. I realize, however, that this is an "impure" usage, in that other Python implementations usually do not have any atomicity guarantees, period. But it would save me having to write a setdefault function in C when porting any of the above code to 3.0. ;-) From tav at espians.com Mon Jul 9 20:59:11 2007 From: tav at espians.com (tav) Date: Mon, 9 Jul 2007 19:59:11 +0100 Subject: [Python-3000] A request to keep dict.setdefault() in 3.0 In-Reply-To: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com> References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com> Message-ID: <95d8c0810707091159k657f0fe1k5aa4eb9a5a4c96c4@mail.gmail.com> > PEP 3100 suggests dict.setdefault() may be removed in Python 3, since > it is in principle no longer necessary (due to the new defaultdict type). > > However, there is another class of use cases which use setdefault for > its limited atomic properties - the initialization of non-mutated > data structures that are shared among threads. (And defaultdict > cannot achieve the same thing.) +1 setdefault's ability to return current value is also a very useful functionality and has saved writing: if key not in dict: value = dict[key] = value with the simpler: value = dict.setdefault(key, ) Is there a better way to do the above without .setdefault? -- love, tav founder and ceo, esp metanational llp plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 From pje at telecommunity.com Mon Jul 9 21:17:12 2007 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Mon, 09 Jul 2007 15:17:12 -0400 Subject: [Python-3000] A request to keep dict.setdefault() in 3.0 In-Reply-To: <95d8c0810707091159k657f0fe1k5aa4eb9a5a4c96c4@mail.gmail.co m> References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com> <95d8c0810707091159k657f0fe1k5aa4eb9a5a4c96c4@mail.gmail.com> Message-ID: <20070709191500.4C4033A404D@sparrow.telecommunity.com> At 07:59 PM 7/9/2007 +0100, tav wrote: >>PEP 3100 suggests dict.setdefault() may be removed in Python 3, since >>it is in principle no longer necessary (due to the new defaultdict type). >> >>However, there is another class of use cases which use setdefault for >>its limited atomic properties - the initialization of non-mutated >>data structures that are shared among threads. (And defaultdict >>cannot achieve the same thing.) > >+1 > >setdefault's ability to return current value is also a very useful >functionality and has saved writing: > > if key not in dict: > value = > dict[key] = value > >with the simpler: > > value = dict.setdefault(key, ) > >Is there a better way to do the above without .setdefault? Yes, in 2.5 there's collections.defaultdict. Of course, that only works if there is a fixed mapping from keys to initial computed values for the entire dictionary for all time. Oh, and if your code gets to create the dictionary. :) From barry at python.org Mon Jul 9 21:35:50 2007 From: barry at python.org (Barry Warsaw) Date: Mon, 9 Jul 2007 15:35:50 -0400 Subject: [Python-3000] A request to keep dict.setdefault() in 3.0 In-Reply-To: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com> References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Jul 9, 2007, at 2:44 PM, Phillip J. Eby wrote: > PEP 3100 suggests dict.setdefault() may be removed in Python 3, since > it is in principle no longer necessary (due to the new defaultdict > type). 
>
> However, there is another class of use cases which use setdefault for
> its limited atomic properties - the initialization of non-mutated
> data structures that are shared among threads. (And defaultdict
> cannot achieve the same thing.)

Phillip, I support any initiative to keep .setdefault() or similar functionality. When this thread came up before, I wasn't against defaultdict, I just didn't think it covered enough of the use cases of .setdefault() to warrant its removal. You describe some additional use cases.

However, .setdefault() is a horrible name because it's not clear from the name that a 'get' operation also happens.

It occurs to me that I haven't reached my stupid idea quota for the day, so here goes. What if we ditched .setdefault() as a name and gave .get() an optional argument to also set the key's value when it's missing.

class dict2(dict):
    """
    >>> d = dict2()
    >>> d.setdefault('foo', []).append(7)
    >>> sorted(d.items())
    [('foo', [7])]
    >>> d.setdefault('foo', []).append(8)
    >>> sorted(d.items())
    [('foo', [7, 8])]
    >>> d.get('bar', [], set_missing=True).append(9)
    >>> sorted(d.items())
    [('bar', [9]), ('foo', [7, 8])]
    >>> d.get('bar', [], True).append(10)
    >>> sorted(d.items())
    [('bar', [9, 10]), ('foo', [7, 8])]
    """
    def get(self, key, default=None, set_missing=False):
        missing = object()
        value = super(dict2, self).get(key, missing)
        if value is not missing:
            return value
        if set_missing:
            self[key] = default
        return default

This more or less conveys that both a get and a set operation is happening. It also doesn't violate the rule against letting an argument change the return type of a function. Maybe it will make this useful functionality more palatable.
Cheers, - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (Darwin) iQCVAwUBRpKOGHEjvBPtnXfVAQJIxwP9Ev7aASfVOw3q1aiCZ3Pr4VsQwzmeb0SR 4xJR9VvAZVcsjL4wAaleU55vFir9fBnFkvEnMMRFOBJ49NtS6EuLt+yGkt22gadg TSlfNK0t4oVeFT4MJ6AebaHwBL8PvILAbV5eJ6x3H0hH383rdcdtrRyFzvhKnBRy tPqtjIZlU6Q= =WxDp -----END PGP SIGNATURE----- From guido at python.org Mon Jul 9 22:56:08 2007 From: guido at python.org (Guido van Rossum) Date: Mon, 9 Jul 2007 23:56:08 +0300 Subject: [Python-3000] Change to class construction? In-Reply-To: <20070709160115.16C323A404D@sparrow.telecommunity.com> References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com> <20070706172258.1E65B3A4046@sparrow.telecommunity.com> <468FA01A.6040707@gmail.com> <20070708140741.GD23765@seldon> <469215F3.90807@gmail.com> <20070709150454.641193A404D@sparrow.telecommunity.com> <20070709160115.16C323A404D@sparrow.telecommunity.com> Message-ID: On 7/9/07, Phillip J. Eby wrote: > At 06:13 PM 7/9/2007 +0300, Guido van Rossum wrote: > >On 7/9/07, Phillip J. Eby wrote: > > > At 09:03 PM 7/9/2007 +1000, Nick Coghlan wrote: > > > >However, I will point out that setting class attributes via locals() is > > > >formally undefined (it happens to work in current versions of CPython, > > > >but there's no guarantee that will always be the case). > > > > > > As of PEP 3115, it's no longer undefined for class statements. > > > >Where does it say so? To be honest, I don't know where ti find Nick's > >claim in the reference manual. > > I assume Nick is referring to: > > http://www.python.org/doc/2.2/ref/execframes.html > > which says it's undefined. 
I can't seem to find where this section > went to in 2.3 and beyond, or anything that says what happens with > non-dictionary objects, except: > > http://docs.python.org/ref/exec.html > > which makes a much stronger claim: > > "The built-in functions globals() and locals() return the current > global and local dictionary, respectively" > > and also states that as of 2.4, exec allows the use of any mapping > object as the locals. There isn't any mention of the fact that > locals() may not be writable, which should probably be considered an error. > > > >But I'm surprised that you read > >anything about locals() into that PEP, as it doesn't mention that > >function at all. > > Correct -- which means that either the PEP is in error, or the > semantics of locals() must be that the actual namespace in use is returned. > > My reasoning: since PEP 3115 allows an arbitrary mapping object to be > used, there is no way that such an object can be converted to a > read-only dictionary, and the current definition (as I understand it) > is that locals() returns you either the actual local namespace > object, or a "dictionary representing the ... namespace" (per the > reference manual). > > Since PEP 3115 does not require that there be any way of converting > the arbitrary mapping object into a dictionary (or even that there be > any pre-defined way of *reading* its contents!) there is no way that > locals() can fulfill its existing contract *except* by returning that object. > > QED. Well, that's the spelled-out reasoning for my intuition, > anyway. :) That doesn't mean the PEP or the specification of > locals() can't change, but it seems to me that if one or the other > doesn't, then modifying class-suite locals() to create class members > implicitly becomes official, since the failure for it to do so would > become a bug in locals(). 
(Since it will no longer be returning a > "dictionary representing the namespace" if it doesn't return that > mapping object, and can't possibly return anything else that > "represents" the namespace in any meaningful way.) Python's specification isn't as rigid as it should be, and such a "proof" isn't worth much, especially as the reference manual hasn't always been updated as things changed. The use of the word "mapping" might easily be construed as implementing abc.Mapping, and then iteration and reading the contents would be well-defined. The weasel-words about "a dictionary representing the namespace" are meant to cover the situation for a function's local scope, which isn't stored in a mapping-like object at all until you use exec() or locals(), or a few others. We could easily change this to return a writable mapping that's not a dict at all but a "view" on the locals just as dict.keys() returns a view on a dict. I don't see why locals() couldn't return the object used to represent the namespace, but I don't see that it couldn't be some view on that object either, depending on the details of the implementation. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Jul 9 23:01:15 2007 From: guido at python.org (Guido van Rossum) Date: Tue, 10 Jul 2007 00:01:15 +0300 Subject: [Python-3000] A request to keep dict.setdefault() in 3.0 In-Reply-To: <95d8c0810707091159k657f0fe1k5aa4eb9a5a4c96c4@mail.gmail.com> References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com> <95d8c0810707091159k657f0fe1k5aa4eb9a5a4c96c4@mail.gmail.com> Message-ID: On 7/9/07, tav wrote: > setdefault's ability to return current value is also a very useful > functionality and has saved writing: > > if key not in dict: > value = > dict[key] = value > > with the simpler: > > value = dict.setdefault(key, ) > > Is there a better way to do the above without .setdefault? 
Those are not equivalent, as the form using setdefault() *always* evaluates while the other form only evaluates it when needed.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From brandon at rhodesmill.org Mon Jul 9 23:01:38 2007
From: brandon at rhodesmill.org (Brandon Craig Rhodes)
Date: Mon, 09 Jul 2007 17:01:38 -0400
Subject: [Python-3000] A request to keep dict.setdefault() in 3.0
In-Reply-To: (Barry Warsaw's message of "Mon, 9 Jul 2007 15:35:50 -0400")
References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com>
Message-ID: <87odilnvf1.fsf@ten22.rhodesmill.org>

Barry Warsaw writes:

> However, .setdefault() is a horrible name because it's not clear
> from the name that a 'get' operation also happens.

Agreed! From the name, a clever but naive user would assume that "setdefault" sets what value the dictionary returns when a key does not exist. On first encountering the name, one imagines:

>>> d = {}
>>> d[1]
KeyError: 1
>>> d.setdefault('missing')
>>> d[1]
'missing'

--
Brandon Craig Rhodes brandon at rhodesmill.org http://rhodesmill.org/brandon

From guido at python.org Mon Jul 9 23:04:56 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 10 Jul 2007 00:04:56 +0300
Subject: [Python-3000] A request to keep dict.setdefault() in 3.0
In-Reply-To:
References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com>
Message-ID:

On 7/9/07, Barry Warsaw wrote:
> Phillip, I support any initiative to keep .setdefault() or similar
> functionality. When this thread came up before, I wasn't against
> defaultdict, I just didn't think it covered enough of the use cases
> of .setdefault() to warrant its removal. You describe some
> additional use cases.
>
> However, .setdefault() is a horrible name because it's not clear from
> the name that a 'get' operation also happens.

We had a long name discussion when it was introduced. Perhaps we can go back to the list suggested then and see if a better alternative was overlooked?
> It occurs to me that I haven't reached my stupid idea quota for the > day, so here goes. What if we ditched .setdefault() as a name and > gave .get() an optional argument to also set the key's value when > it's missing. > [...] > def get(self, key, default=None, set_missing=False): > missing = object() > value = super(dict2, self).get(key, missing) > if value is not missing: > return value > if set_missing: > self[key] = default > return default > > This more or less conveys that both a get and a set operation is > happening. It also doesn't violate the rule against letting an > argument change the return type of a function. Maybe it will make > this useful functionality more palatable. But it does violate the rule that if you have a boolean flag to indicate a "variant" of an API and in practice you'll always be passing a constant for that flag, you're better off defining two methods with different names. Although if the return type isn't different, the semantics are certainly *very* different here. So I'm strongly against this. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Mon Jul 9 23:29:04 2007 From: python at rcn.com (Raymond Hettinger) Date: Mon, 9 Jul 2007 14:29:04 -0700 Subject: [Python-3000] A request to keep dict.setdefault() in 3.0 References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com> Message-ID: <002a01c7c270$42de5060$a389763f@RaymondLaptop1> > PEP 3100 suggests dict.setdefault() may be removed in Python 3, since > it is in principle no longer necessary (due to the new defaultdict type). I've forgotten. What was the whole point of Python 3.0? Is it to make the language fat with lots of ways to do everything? Guys, this is your ONE chance to slim down the language and pare away anything that is unnecessary or arcane. The setdefault() method has too many defects to keep around. Why would you want a method that instantiates the default on every call even if not needed. Let this one die. 
The dict API is already heavily loaded. Thinning it a bit would be a nice improvement.

Raymond

From fumanchu at amor.org Mon Jul 9 23:55:41 2007
From: fumanchu at amor.org (Robert Brewer)
Date: Mon, 9 Jul 2007 14:55:41 -0700
Subject: [Python-3000] A request to keep dict.setdefault() in 3.0
In-Reply-To: <002a01c7c270$42de5060$a389763f@RaymondLaptop1>
Message-ID: <435DF58A933BA74397B42CDEB8145A860DBCAFEA@ex9.hostedexchange.local>

Raymond Hettinger wrote:
> > PEP 3100 suggests dict.setdefault() may be removed in
> > Python 3, since it is in principle no longer necessary
> > (due to the new defaultdict type).
>
> I've forgotten. What was the whole point of Python 3.0?
> Is it to make the language fat with lots of ways to do everything?
> Guys, this is your ONE chance to slim down the language and
> pare away anything that is unnecessary or arcane.
>
> The setdefault() method has too many defects to keep around.
> Why would you want a method that instantiates the default on
> every call even if not needed.
>
> Let this one die. The dict API is already heavily loaded. Thinning
> it a bit would be a nice improvement.

I have to agree, even though it means more work for me (due to my own heavy use of setdefault for its atomicity).

Perhaps a better resolution for these use cases would be a stdlib module which would provide fast, thread-safe collections. This would standardize, across implementations, some of the CPython behaviors we've come to rely on. It would also make it clear that the given type is being used specifically for its thread-safety.
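
To make the idea of an explicitly thread-safe collection concrete, here is a minimal sketch under stated assumptions: `LockedDict` is a hypothetical name, not an existing stdlib type, and it trades speed for a lock around each operation instead of relying on CPython internals.

```python
import threading


class LockedDict:
    """Hypothetical dict wrapper whose operations are serialized by a lock."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def __getitem__(self, key):
        with self._lock:
            return self._data[key]

    def __setitem__(self, key, value):
        with self._lock:
            self._data[key] = value

    def setdefault(self, key, default=None):
        # The check and the store both happen while the lock is held, so no
        # other method of this class can run between them.
        with self._lock:
            if key not in self._data:
                self._data[key] = default
            return self._data[key]


shared = LockedDict()
first = shared.setdefault('config', {'debug': False})
second = shared.setdefault('config', {'debug': True})
```

Unlike the lock-free approach discussed earlier in the thread, single-threaded code pays for the lock here too; the gain is that the intent is visible in the type being used.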
Robert Brewer System Architect Amor Ministries fumanchu at amor.org From barry at python.org Tue Jul 10 00:14:33 2007 From: barry at python.org (Barry Warsaw) Date: Mon, 9 Jul 2007 18:14:33 -0400 Subject: [Python-3000] A request to keep dict.setdefault() in 3.0 In-Reply-To: References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com> Message-ID: <9D661F09-FBD2-4C5D-9F90-2DDC476767D7@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Jul 9, 2007, at 5:04 PM, Guido van Rossum wrote: > On 7/9/07, Barry Warsaw wrote: >> Phillip, I support any initiative to keep .setdefault() or similar >> functionality. When this thread came up before, I wasn't against >> defaultdict, I just didn't think it covered enough of the use cases >> of .setdefault() to warrant its removal. You describe some >> additional use cases. >> >> However, .setdefault() is a horrible name because it's not clear from >> the name that a 'get' operation also happens. > > We had a long name discussion when it was introduced. Perhaps we can > go back to the list suggested then and see if a better alternative was > overlooked? 
Don't look here because some big dummy contradicts himself seven years later: http://mail.python.org/pipermail/python-dev/2000-August/007819.html hmm-put()-ly y'rs, - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (Darwin) iQCVAwUBRpKzSXEjvBPtnXfVAQKRmQP8DZDYKFOhOjYvtf+OkmmgAnwWaOI5tpPv kHHxtMGPdgEM3cXAdT0U5m04W1IUmMKBItV/JE4qGO4OdD0eFIUPaZBufVUIIg3b 230qJnamVWrzZ/uRUhgDK363Kt2NstrxKce+kX37FPy2qHUSu3RMiBpzx9NJBW8I P3rjaqYZycg= =cU+w -----END PGP SIGNATURE----- From barry at python.org Tue Jul 10 00:17:08 2007 From: barry at python.org (Barry Warsaw) Date: Mon, 9 Jul 2007 18:17:08 -0400 Subject: [Python-3000] A request to keep dict.setdefault() in 3.0 In-Reply-To: <002a01c7c270$42de5060$a389763f@RaymondLaptop1> References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com> <002a01c7c270$42de5060$a389763f@RaymondLaptop1> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Jul 9, 2007, at 5:29 PM, Raymond Hettinger wrote: >> PEP 3100 suggests dict.setdefault() may be removed in Python 3, since >> it is in principle no longer necessary (due to the new defaultdict >> type). > > I've forgotten. What was the whole point of Python 3.0? > Is it to make the language fat with lots of ways to do everything? > Guys, this is your ONE chance to slim down the language and > pare away anything that is unnecessary or arcane. > > The setdefault() method has too many defects to keep around. > Why would you want a method that instantiates the default on > every call even if not needed. Um, like .get()? > Let this one die. The dict API already heavily loaded. Thinning > it a bit would be a nice improvement. Unless you remove something useful. The problem with setdefault() isn't what it does, it's the name. 
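
The cost being argued over can be seen in a small sketch: the default passed to setdefault() or get() is an ordinary argument, so it is built before the method runs, whether or not the key is present. `make_default` is a hypothetical helper used only to count how often a default gets built.

```python
calls = []


def make_default():
    """Hypothetical factory that records each time a default is built."""
    calls.append(1)
    return []


d = {'foo': [7]}

# The key exists, yet the argument is still evaluated before each call.
d.setdefault('foo', make_default())
d.get('foo', make_default())
eager_calls = len(calls)

# The explicit membership test builds the default only when it is needed.
if 'foo' not in d:
    d['foo'] = make_default()
lazy_calls = len(calls) - eager_calls
```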
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (Darwin) iQCVAwUBRpKz5HEjvBPtnXfVAQKV4gP+Ntpkcmo9Yx0d0CvPuGen1E78RLGVquhm wtaGY2OHsQk8Fq+5DSLdTLQcqba5Ru8kToxcFG+FbKuul7xvN+yFJ4yfFzBKvp6z CLwE+GkP6v/zC/W1hJ0zkd/0zWE4tPp5Egmug5BhZ6n2ZkwX2ExCfq2jMXf/xmsV cmu7z3TWQXI= =BzxB -----END PGP SIGNATURE----- From pje at telecommunity.com Tue Jul 10 02:13:56 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 09 Jul 2007 20:13:56 -0400 Subject: [Python-3000] Change to class construction? In-Reply-To: References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com> <20070706172258.1E65B3A4046@sparrow.telecommunity.com> <468FA01A.6040707@gmail.com> <20070708140741.GD23765@seldon> <469215F3.90807@gmail.com> <20070709150454.641193A404D@sparrow.telecommunity.com> <20070709160115.16C323A404D@sparrow.telecommunity.com> Message-ID: <20070710001144.3B26A3A404D@sparrow.telecommunity.com> At 11:56 PM 7/9/2007 +0300, Guido van Rossum wrote: > The use of the word "mapping" >might easily be construed as implementing abc.Mapping, and then >iteration and reading the contents would be well-defined. I'm not sure which use of the word "mapping" you're talking about. PEP 3115 is explicit that there is no specific requirements for the __prepare__()'d namespace; it just mentions some things that might be useful to have in such an object. So, in order to replace it with a view or something, we'd want to change the PEP to explicitly document what is required. Personally, I'd just as soon make it explicitly official that locals() in a class suite gives you the __prepare__()'d object, whatever it is. If a given Python implementation can support PEP 3115 in the first place, then it clearly knows what object to return. ;-) From pje at telecommunity.com Tue Jul 10 02:21:44 2007 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Mon, 09 Jul 2007 20:21:44 -0400 Subject: [Python-3000] A request to keep dict.setdefault() in 3.0 In-Reply-To: References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com> Message-ID: <20070710001930.07D653A404D@sparrow.telecommunity.com> At 12:04 AM 7/10/2007 +0300, Guido van Rossum wrote: >On 7/9/07, Barry Warsaw wrote: >>Phillip, I support any initiative to keep .setdefault() or similar >>functionality. When this thread came up before, I wasn't against >>defaultdict, I just didn't think it covered enough of the use cases >>of .setdefault() to warrant its removal. You describe some >>additional use cases. >> >>However, .setdefault() is a horrible name because it's not clear from >>the name that a 'get' operation also happens. > >We had a long name discussion when it was introduced. Perhaps we can >go back to the list suggested then and see if a better alternative was >overlooked? Personally, for my use cases it wouldn't matter if it didn't return a value, because I'm not using it to shorten the code. So if you took away the return value and left the name (or changed it to something clearer), that'd be okay by me. The alternative, of course, is as Robert suggested, to just write some library code to deal with this and similar issues. If I have to import setdefault from somewhere to use it (ala the heapq.* functions), that's fine by me too, as long as it's still able to be atomic. That approach might also address Raymond's desire to narrow the dictionary object API. From rrr at ronadam.com Tue Jul 10 02:21:31 2007 From: rrr at ronadam.com (Ron Adam) Date: Mon, 09 Jul 2007 19:21:31 -0500 Subject: [Python-3000] A request to keep dict.setdefault() in 3.0 In-Reply-To: References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com> Message-ID: <4692D10B.5040902@ronadam.com> Barry Warsaw wrote: > However, .setdefault() is a horrible name because it's not clear from > the name that a 'get' operation also happens. 
The return value of .setdefault() could be changed to None, then the name would be correct. And then a helper function could fill the current use case of returning the added object at the same time.

>>> d = {}
>>> def setget(setter, getter, vars):
...     setter(*vars)
...     return getter(*vars)
...
>>> setget(d.setdefault, d.get, ('foo', [])).append(7)
>>> d
{'foo': [7]}
>>> setget(d.setdefault, d.get, ('foo', [])).append(8)
>>> d
{'foo': [7, 8]}

Now if this could be made to be more general so it worked with other objects it might really be useful. ;-)

Cheers,
Ron

From rrr at ronadam.com Tue Jul 10 02:40:18 2007
From: rrr at ronadam.com (Ron Adam)
Date: Mon, 09 Jul 2007 19:40:18 -0500
Subject: [Python-3000] Change to class construction?
In-Reply-To:
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com> <20070706172258.1E65B3A4046@sparrow.telecommunity.com> <468FA01A.6040707@gmail.com> <20070708140741.GD23765@seldon> <469215F3.90807@gmail.com> <20070709150454.641193A404D@sparrow.telecommunity.com> <20070709160115.16C323A404D@sparrow.telecommunity.com>
Message-ID: <4692D572.4080401@ronadam.com>

Guido van Rossum wrote:

> We could easily change this to return a
> writable mapping that's not a dict at all but a "view" on the locals
> just as dict.keys() returns a view on a dict. I don't see why locals()
> couldn't return the object used to represent the namespace, but I
> don't see that it couldn't be some view on that object either,
> depending on the details of the implementation.

This sounds great! I just recently wanted to pass a namespace to exec, but it refuses to accept anything but a dictionary for a local name space.

What I really want to do is pass an object as the local namespace. And have the exec() use it complete with it's properties intact. Passing obj.__dict__ doesn't work in this case.

Cheers,
Ron

From pje at telecommunity.com Tue Jul 10 02:48:54 2007
From: pje at telecommunity.com (Phillip J.
Eby)
Date: Mon, 09 Jul 2007 20:48:54 -0400
Subject: [Python-3000] Change to class construction?
In-Reply-To: <4692D572.4080401@ronadam.com>
References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com> <20070706172258.1E65B3A4046@sparrow.telecommunity.com> <468FA01A.6040707@gmail.com> <20070708140741.GD23765@seldon> <469215F3.90807@gmail.com> <20070709150454.641193A404D@sparrow.telecommunity.com> <20070709160115.16C323A404D@sparrow.telecommunity.com> <4692D572.4080401@ronadam.com>
Message-ID: <20070710004640.93BD83A40A4@sparrow.telecommunity.com>

At 07:40 PM 7/9/2007 -0500, Ron Adam wrote:
>Guido van Rossum wrote:
>
>>We could easily change this to return a
>>writable mapping that's not a dict at all but a "view" on the locals
>>just as dict.keys() returns a view on a dict. I don't see why locals()
>>couldn't return the object used to represent the namespace, but I
>>don't see that it couldn't be some view on that object either,
>>depending on the details of the implementation.
>
>This sounds great! I just recently wanted to pass a namespace to
>exec, but it refuses to accept anything but a dictionary for a local
>name space.

You can already do that in Python 2.4.

>What I really want to do is pass an object as the local
>namespace. And have the exec() use it complete with it's properties
>intact. Passing obj.__dict__ doesn't work in this case.

You need a wrapper, e.g.:

class AttrMap(object):
    def __init__(self, ob):
        self.ob = ob
    def __getitem__(self, key):
        try: return getattr(self.ob, key)
        except AttributeError: raise KeyError, key
    # setitem, delitem, etc...
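
For readers on Python 3, the wrapper might be filled out along these lines. This is a sketch, not the author's code: the `__setitem__` and `__delitem__` bodies are one guess at what "setitem, delitem, etc." would contain, `Circle` is a hypothetical class used only to show a property surviving the round trip, and the `raise` uses the Python 3 call form.

```python
class AttrMap:
    """Expose an object's attributes through the mapping protocol."""

    def __init__(self, ob):
        self.ob = ob

    def __getitem__(self, key):
        try:
            return getattr(self.ob, key)
        except AttributeError:
            # A KeyError lets name lookup fall back to globals and builtins.
            raise KeyError(key) from None

    def __setitem__(self, key, value):
        setattr(self.ob, key, value)

    def __delitem__(self, key):
        try:
            delattr(self.ob, key)
        except AttributeError:
            raise KeyError(key) from None


class Circle:
    """Hypothetical class with a property, used only for the demonstration."""

    def __init__(self, radius):
        self.radius = radius

    @property
    def diameter(self):
        return 2 * self.radius


circle = Circle(3)
# Names in the executed code are read from and written to `circle`.
exec("doubled = diameter * 2\nradius = 5", {}, AttrMap(circle))
```

The globals argument still has to be a real dict; only the locals may be an arbitrary mapping such as this wrapper.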
From greg.ewing at canterbury.ac.nz Tue Jul 10 03:16:20 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 10 Jul 2007 13:16:20 +1200 Subject: [Python-3000] A request to keep dict.setdefault() in 3.0 In-Reply-To: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com> References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com> Message-ID: <4692DDE4.9090707@canterbury.ac.nz> Phillip J. Eby wrote: > However, there is another class of use cases which use setdefault for > its limited atomic properties - the initialization of non-mutated > data structures that are shared among threads. Isn't it rather dangerous to rely on any built-in Python operations to be atomic? They might happen to be, but I don't think there's any guarantee they will stay that way. -- Greg From matt-python at theory.org Tue Jul 10 03:33:07 2007 From: matt-python at theory.org (Matt Chisholm) Date: Mon, 9 Jul 2007 18:33:07 -0700 Subject: [Python-3000] Announcing PEP 3136 In-Reply-To: References: <20070630205444.GD22221@theory.org> Message-ID: <20070710013307.GA5495@theory.org> On Jul 3 2007, 10:14, Guido van Rossum wrote: >On 6/30/07, Matt Chisholm wrote: >>I've created and submitted a new PEP proposing support for labels in >>Python's break and continue statements. Georg Brandl has graciously >>added it to the PEP list as PEP 3136: >> >>http://www.python.org/dev/peps/pep-3136/ > >I think this is a good summary of various proposals that have been >floated in the past, plus some new ones. As a PEP, it falls short >because it doesn't pick a solution but merely offers a large menu of >possible options. Also, there is nothing about implementation yet. I was hoping the community would pick their favorite option. And I planned to address implementation if the PEP was well received. >However, I'm rejecting it on the basis that code so complicated to >require this feature is very rare. In most cases there are existing >work-arounds that produce clean code, for example using 'return'. 
>While I'm sure there are some (rare) real cases where clarity of the >code would suffer from a refactoring that makes it possible to use >return, this is offset by two issues: > >1. The complexity added to the language, permanently. This affects not >only all Python implementations, but also every source analysis tool, >plus of course all documentation for the language. Not knowing anything about the implementation details, I can't argue with that. >2. My expectation that the feature will be abused more than it will be >used right, leading to a net decrease in code clarity (measured across >all Python code written henceforth). Lazy programmers are everywhere, >and before you know it you have an incredible mess on your hands of >unintelligible code. Are break / continue currently abused more than they are used right, or used to make code difficult to understand? I am trying to come up with an example of mis-use of labeled break or continue that is mitigated by the absence of labels, and I can't quite think of one. Maybe I'm being unimaginative. :) >I realize this is a heavy bar to pass, and somewhat subjective. That's >okay. There is real value in having a small language. Also, as I said, >while there are no past PEPs to document it, this has been brought up >and rejected many times before. So, I don't quite agree, but you're the boss. If this has been rejected before, I don't want to waste everybody's time discussing it again. Should I add your justification to the PEP and change it's status? -matt P.S. Thanks to everybody who read the PEP and commented. 
:) From greg.ewing at canterbury.ac.nz Tue Jul 10 03:40:51 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 10 Jul 2007 13:40:51 +1200 Subject: [Python-3000] Announcing PEP 3136 In-Reply-To: <20070710013307.GA5495@theory.org> References: <20070630205444.GD22221@theory.org> <20070710013307.GA5495@theory.org> Message-ID: <4692E3A3.9010600@canterbury.ac.nz> Matt Chisholm wrote: > Are break / continue currently abused more than they are used right, > or used to make code difficult to understand? In my experience, using break and continue for anything other than a standard loop-and-a-half makes code hard to follow, even when there is only one loop. Labels would not mitigate that. -- Greg From fdrake at acm.org Tue Jul 10 05:20:01 2007 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Mon, 9 Jul 2007 23:20:01 -0400 Subject: [Python-3000] A request to keep dict.setdefault() in 3.0 In-Reply-To: <4692DDE4.9090707@canterbury.ac.nz> References: <20070709184156.E2EFC3A404D@sparrow.telecommunity.com> <4692DDE4.9090707@canterbury.ac.nz> Message-ID: <200707092320.01898.fdrake@acm.org> On Monday 09 July 2007, Greg Ewing wrote: > Isn't it rather dangerous to rely on any built-in > Python operations to be atomic? They might happen > to be, but I don't think there's any guarantee > they will stay that way. My limited recollection is that setdefault() was all about it being atomic; otherwise there's no benefit to building it in C. The documentation sadly omits mentioning this very important property of setdefault(), however. If the atomicity isn't promised, then there's no benefit, and writing a helper in Python would be fine. However, as we've seen in this discussion, that's critical to many users of the method. Without it, most users would have to add a C (or whatever) function that did the same task and made the atomicity promise. 
IMHO, it's better to have a single shared implementation with this promise; that makes it easier to recognize when reading unfamiliar code. -Fred -- Fred L. Drake, Jr. From rrr at ronadam.com Tue Jul 10 06:03:04 2007 From: rrr at ronadam.com (Ron Adam) Date: Mon, 09 Jul 2007 23:03:04 -0500 Subject: [Python-3000] Change to class construction? In-Reply-To: <20070710004640.93BD83A40A4@sparrow.telecommunity.com> References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com> <20070706172258.1E65B3A4046@sparrow.telecommunity.com> <468FA01A.6040707@gmail.com> <20070708140741.GD23765@seldon> <469215F3.90807@gmail.com> <20070709150454.641193A404D@sparrow.telecommunity.com> <20070709160115.16C323A404D@sparrow.telecommunity.com> <4692D572.4080401@ronadam.com> <20070710004640.93BD83A40A4@sparrow.telecommunity.com> Message-ID: <469304F8.2060303@ronadam.com> Phillip J. Eby wrote: > At 07:40 PM 7/9/2007 -0500, Ron Adam wrote: > >> Guido van Rossum wrote: >> >>> We could easily change this to return a >>> writable mapping that's not a dict at all but a "view" on the locals >>> just as dict.keys() returns a view on a dict. I don't see why locals() >>> couldn't return the object used to represent the namespace, but I >>> don't see that it couldn't be some view on that object either, >>> depending on the details of the implementation. >> >> This sounds great! I just recently wanted to pass a namespace to exec, >> but it refuses to accept anything but a dictionary for a local name >> space. > > You can already do that in Python 2.4. > > >> What I really want to do is pass an object as the local namespace. >> And have the exec() use it complete with it's properties intact. >> Passing obj.__dict__ doesn't work in this case. > > You need a wrapper, e.g.: > > class AttrMap(object): > def __init__(self, ob): > self.ob = ob > def __getitem__(self, key): > try: return getattr(self.ob, key) > except AttributeError: raise KeyError, key > # setitem, delitem, etc... 
Thanks, that should solve (I hope) the particular case I have. Although it would have been nicer if it was in the library someplace. Of course everyone says that about nearly everything. It might be nice if locals() could receive an argument so it can be used with classes. Possibly returning a wrapped class view such as the example you gave. Regards, Ron From ncoghlan at gmail.com Tue Jul 10 11:33:04 2007 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 10 Jul 2007 19:33:04 +1000 Subject: [Python-3000] Change to class construction? In-Reply-To: <20070709160115.16C323A404D@sparrow.telecommunity.com> References: <43aa6ff70707060703w6d4f2edbm82a0a6da4a6bbd90@mail.gmail.com> <20070706172258.1E65B3A4046@sparrow.telecommunity.com> <468FA01A.6040707@gmail.com> <20070708140741.GD23765@seldon> <469215F3.90807@gmail.com> <20070709150454.641193A404D@sparrow.telecommunity.com> <20070709160115.16C323A404D@sparrow.telecommunity.com> Message-ID: <46935250.8060903@gmail.com> Phillip J. Eby wrote: > At 06:13 PM 7/9/2007 +0300, Guido van Rossum wrote: >> On 7/9/07, Phillip J. Eby wrote: >> > At 09:03 PM 7/9/2007 +1000, Nick Coghlan wrote: >> > >However, I will point out that setting class attributes via >> locals() is >> > >formally undefined (it happens to work in current versions of CPython, >> > >but there's no guarantee that will always be the case). >> > >> > As of PEP 3115, it's no longer undefined for class statements. >> >> Where does it say so? To be honest, I don't know where to find Nick's >> claim in the reference manual. > > I assume Nick is referring to: > > http://www.python.org/doc/2.2/ref/execframes.html > > which says it's undefined. I was actually referring to this warning in the library reference docs for the locals() function: """Warning: The contents of this dictionary should not be modified; changes may not affect the values of local variables used by the interpreter.""" Cheers, Nick.
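The warning quoted above can be seen in a small sketch. This assumes CPython, where writing into the mapping returned by locals() inside a function does not rebind the function's local variable; other implementations may differ, and the function name is made up for illustration.

```python
def show_locals_write():
    x = 1
    # Write to the mapping returned by locals() ...
    locals()["x"] = 2
    # ... and then read the real local variable, which CPython
    # does not rebind as a result of the write above.
    return x


result = show_locals_write()
print(result)
```

This is the function-scope case only; the class-body case discussed in the thread is a separate question.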
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From guido at python.org Tue Jul 10 23:14:27 2007 From: guido at python.org (Guido van Rossum) Date: Wed, 11 Jul 2007 00:14:27 +0300 Subject: [Python-3000] Need help fixing failing Py3k Unittests in py3k-struni Message-ID: One of the most daunting tasks remaining for Python 3.0a1 (to be released by the end of August) is fixing the remaining failing unit tests in the py3k-struni branch (http://svn.python.org/view/python/branches/py3k-struni/). This is the branch where I have started the work on the string/unification branch. I want to promote this branch to become the "main" Py3k branch ASAP (by renaming it to py3k), but I don't want to do that until all unit tests pass. I've been working diligently on this task, and I've got it down to about 50 tests that are failing on at least one of OSX and Ubuntu (the platforms to which I have easy access). Now I need help. To facilitate distributing the task of getting the remaining tests to pass, I've created a wiki page: http://wiki.python.org/moin/Py3kStrUniTests . Please help! It's easy to help: (1) check out the py3k-struni branch; (2) build it; (3) pick a test and figure out why it's failing; (4) produce a fix; (5) submit the fix to SF (or check it in, if you have submit privileges and are confident enough). In order to avoid duplicate work, I've come up with a simple protocol: you mark a test in the wiki as "MINE" (with your name) when you start looking at it. You mark it as "FIXED [IN SF]" once you fix it, adding the patch# if the fix is in SF. If you give up, remove your lock, adding instead a note with what you've found (even just the names of the failing subtests is helpful). Please help! There are other tasks, see PEP 3100. Mail me if you're interested in anything specifically. 
(Please don't ask me "do you think I could do this" -- you know better than I whether you're capable of coding at a specific level. If you don't understand the task, you're probably not qualified.) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From lists at cheimes.de Wed Jul 11 00:30:13 2007 From: lists at cheimes.de (Christian Heimes) Date: Wed, 11 Jul 2007 00:30:13 +0200 Subject: [Python-3000] Need help fixing failing Py3k Unittests in py3k-struni In-Reply-To: References: Message-ID: <46940875.2000606@cheimes.de> Guido van Rossum wrote: > Please help! I've made a meta patch that makes debugging the bugs a lot easier. It replaces assert_(foo == bar) and failUnless(foo == bar) with failUnlessEqual(foo, bar). failUnlessEqual shows the values of foo and bar when they are not equal. http://www.python.org/sf/1751515 sed -r "s/self\.assert_\((.*)\ ==/self.failUnlessEqual\(\1,/" -i *.py sed -r "s/self\.failUnless\((.*)\ ==/self.failUnlessEqual\(\1,/" -i *.py By the way, the ctypes unit tests are causing a segfault on my machine: test_ctypes Warning: could not import ctypes.test.test_numbers: unpack requires a string argument of length 1 Segmentation fault Ubuntu 7.04 on i386 machine with an Intel P3. Christian From steven.bethard at gmail.com Wed Jul 11 00:38:53 2007 From: steven.bethard at gmail.com (Steven Bethard) Date: Tue, 10 Jul 2007 16:38:53 -0600 Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k Unittests in py3k-struni In-Reply-To: <46940875.2000606@cheimes.de> References: <46940875.2000606@cheimes.de> Message-ID: On 7/10/07, Christian Heimes wrote: > Guido van Rossum wrote: > > Please help! > > I've made a meta patch that makes debugging the bugs a lot easier. It > replaces assert_(foo == bar) and failUnless(foo == bar) with > failUnlessEqual(foo, bar). failUnlessEqual shows the value of foo and > bar when they are not equal.
> > http://www.python.org/sf/1751515 > > sed -r "s/self\.assert_\((.*)\ ==/self.failUnlessEqual\(\1,/" -i *.py > sed -r "s/self\.failUnless\((.*)\ ==/self.failUnlessEqual\(\1,/" -i *.py Some of these look questionable, e.g.: - self.assert_(d == self.spamle or d == self.spambe) + self.failUnlessEqual(d == self.spamle or d, self.spambe) ... - self.assert_((a == 42) is False) + self.failUnlessEqual((a, 42) is False) I'd probably go with something a little more restrictive, maybe: r'self.assert_\(\S+ == \S+\)' Something like that ought to have fewer false positives. STeVe -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy From lists at cheimes.de Wed Jul 11 01:17:26 2007 From: lists at cheimes.de (Christian Heimes) Date: Wed, 11 Jul 2007 01:17:26 +0200 Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k Unittests in py3k-struni In-Reply-To: References: <46940875.2000606@cheimes.de> Message-ID: <46941386.9080301@cheimes.de> Steven Bethard wrote: > I'd probably go with something a little more restrictive, maybe: > > r'self.assert_\(\S+ == \S+\)' > > Something like that ought to have fewer false positives. Woops! You are right. Even your pattern has caused some false positives but I've reread the patch and removed the offending lines. I'm going to upload another patch as soon as I have verified mine again. Christian From lists at cheimes.de Wed Jul 11 03:54:49 2007 From: lists at cheimes.de (Christian Heimes) Date: Wed, 11 Jul 2007 03:54:49 +0200 Subject: [Python-3000] Need help fixing failing Py3k Unittests in py3k-struni In-Reply-To: References: Message-ID: I found a bug in the str type that may affect a lot of tests. In the py3k-struni branch the str() constructor doesn't use __str__ when the argument is an instance of a subclass of str. A user defined string can't change __str__(). The __repr__ method isn't affected. It works in Python 2.5 and in the p3yk branch. 
Python 3.0x (py3k-struni:56245, Jul 10 2007, 23:34:56)
[GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> class Mystr(str):
...     def __str__(self): return 'v'
...
>>> s = Mystr('x')
>>> s
'x'
>>> str(s)
'x' # <- SHOULD RETURN 'v'
Christian From guido at python.org Wed Jul 11 08:48:49 2007 From: guido at python.org (Guido van Rossum) Date: Wed, 11 Jul 2007 09:48:49 +0300 Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k Unittests in py3k-struni In-Reply-To: <46941386.9080301@cheimes.de> References: <46940875.2000606@cheimes.de> <46941386.9080301@cheimes.de> Message-ID: Please use self.assertEqual() instead of self.failUnlessEqual() -- the assertEqual() form is much more common. Otherwise, good idea! On 7/11/07, Christian Heimes wrote: > Steven Bethard wrote: > > I'd probably go with something a little more restrictive, maybe: > > > > r'self.assert_\(\S+ == \S+\)' > > > > Something like that ought to have fewer false positives. > > Woops! You are right. Even your pattern has caused some false positives > but I've reread the patch and removed the offending lines. I'm going to > upload another patch as soon as I have verified mine again. > > Christian > > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From walter at livinglogic.de Wed Jul 11 09:51:58 2007 From: walter at livinglogic.de (Walter Dörwald) Date: Wed, 11 Jul 2007 09:51:58 +0200 Subject: [Python-3000] Need help fixing failing Py3k Unittests in py3k-struni In-Reply-To: References: Message-ID: <46948C1E.5050800@livinglogic.de> Christian Heimes wrote: > I found a bug in the str type that may affect a lot of tests.
> > In the py3k-struni branch the str() constructor doesn't use __str__ when > the argument is an instance of a subclass of str. A user defined string > can't change __str__(). The __repr__ method isn't affected. This hasn't been rewired yet. Behind the covers str still behaves like unicode, i.e. it uses __unicode__ for conversion. Servus, Walter From guido at python.org Wed Jul 11 10:01:05 2007 From: guido at python.org (Guido van Rossum) Date: Wed, 11 Jul 2007 11:01:05 +0300 Subject: [Python-3000] Need help fixing failing Py3k Unittests in py3k-struni In-Reply-To: <46948C1E.5050800@livinglogic.de> References: <46948C1E.5050800@livinglogic.de> Message-ID: Yeah, I'm looking into this right now. What a mess! But I'm close to a fix. There's more that causes test_descr to fail, however. Bleh, what a terrible unit test -- it doesn't use the unittest module, and a single failure aborts the rest of the test. --Guido On 7/11/07, Walter Dörwald wrote: > Christian Heimes wrote: > > > I found a bug in the str type that may affect a lot of tests. > > > > In the py3k-struni branch the str() constructor doesn't use __str__ when > > the argument is an instance of a subclass of str. A user defined string > > can't change __str__(). The __repr__ method isn't affected. > > This hasn't been rewired yet. Behind the covers str still behaves like > unicode, i.e. it uses __unicode__ for conversion.
> > Servus, > Walter > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Jul 11 11:30:58 2007 From: guido at python.org (Guido van Rossum) Date: Wed, 11 Jul 2007 12:30:58 +0300 Subject: [Python-3000] Need help fixing failing Py3k Unittests in py3k-struni In-Reply-To: References: <46948C1E.5050800@livinglogic.de> Message-ID: Fixed in subversion. Please do review r56252 to see that I did the right thing. On 7/11/07, Guido van Rossum wrote: > Yeah, I'm looking into this right now. What a mess! But I'm close to a fix. > > There's more that causes test_descr to fail, however. Bleh, what a > terrible unit test -- it doesn't use the unittest module, and a single > failure aborts the rest of the test. > > --Guido > > On 7/11/07, Walter Dörwald wrote: > > Christian Heimes wrote: > > > > > I found a bug in the str type that may affect a lot of tests. > > > > > > In the py3k-struni branch the str() constructor doesn't use __str__ when > > > the argument is an instance of a subclass of str. A user defined string > > > can't change __str__(). The __repr__ method isn't affected. > > > > This hasn't been rewired yet. Behind the covers str still behaves like > > unicode, i.e. it uses __unicode__ for conversion.
> > > > Servus, > > Walter > > _______________________________________________ > > Python-3000 mailing list > > Python-3000 at python.org > > http://mail.python.org/mailman/listinfo/python-3000 > > Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org > > > > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Jul 11 13:45:04 2007 From: guido at python.org (Guido van Rossum) Date: Wed, 11 Jul 2007 14:45:04 +0300 Subject: [Python-3000] Fwd: Your confirmation is required to leave the Python-3000 mailing list In-Reply-To: References: Message-ID: Which joker tried to unsub me? ---------- Forwarded message ---------- From: python-3000-confirm+e08ed5828...1ff418f380758543 at python.org Date: Jul 11, 2007 12:43 PM Subject: Your confirmation is required to leave the Python-3000 mailing list To: guido at python.org Mailing list removal confirmation notice for mailing list Python-3000 We have received a request for the removal of your email address, "guido at python.org" from the python-3000 at python.org mailing list. To confirm that you want to be removed from this mailing list, simply reply to this message, keeping the Subject: header intact. Or visit this web page: http://mail.python.org/mailman/confirm/python-3000/e08ed5828...8543 Or include the following line -- and only the following line -- in a message to python-3000-request at python.org: confirm e08e...0758543 Note that simply sending a `reply' to this message should work from most mail readers, since that usually leaves the Subject: line in the right form (additional "Re:" text in the Subject: is okay). If you do not wish to be removed from this list, please simply disregard this message. If you think you are being maliciously removed from the list, or have any other questions, send them to python-3000-owner at python.org. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Jul 11 13:46:12 2007 From: guido at python.org (Guido van Rossum) Date: Wed, 11 Jul 2007 14:46:12 +0300 Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k Unittests in py3k-struni In-Reply-To: References: <46940875.2000606@cheimes.de> Message-ID: On 7/11/07, Thomas Heller wrote: > Christian Heimes schrieb: > > > > By the way the ctypes unit tests are causing a segfault on my machine: > > test_ctypes > > Warning: could not import ctypes.test.test_numbers: unpack requires a > > string argument of length 1 > > Segmentation fault > > > > Ubunutu 7.04 on i386 machine with an Intel P3. > > I can reproduce this. ctypes.test.test_numbers is easy to fix, but there > are other severe problems with ctypes. > > I would love to look into these, but I prefer debugging on Windows. > However, the windows build does not work because the _fileio builtin > module is missing from config.c. Again, this is not so easy to fix, > because the ftruncate function does not exist on Windows. I don't have a Windows box; contributions to fix this situation are welcome. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From amauryfa at gmail.com Wed Jul 11 14:27:35 2007 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Wed, 11 Jul 2007 14:27:35 +0200 Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k Unittests in py3k-struni In-Reply-To: References: <46940875.2000606@cheimes.de> Message-ID: Thomas Heller wrote: > I would love to look into these, but I prefer debugging on Windows. > However, the windows build does not work because the _fileio builtin > module is missing from config.c. Again, this is not so easy to fix, > because the ftruncate function does not exist on Windows. In fileobject.c, there is a replacement for ftruncate. See the code around the call to SetEndOfFile(). I'll try to provide a patch later today. 
-- Amaury Forgeot d'Arc From guido at python.org Wed Jul 11 14:41:21 2007 From: guido at python.org (Guido van Rossum) Date: Wed, 11 Jul 2007 15:41:21 +0300 Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k Unittests in py3k-struni In-Reply-To: References: <46940875.2000606@cheimes.de> Message-ID: That would be great! Assign it to theller who can test it much better than I can. On 7/11/07, Amaury Forgeot d'Arc wrote: > Thomas Heller wrote: > > I would love to look into these, but I prefer debugging on Windows. > > However, the windows build does not work because the _fileio builtin > > module is missing from config.c. Again, this is not so easy to fix, > > because the ftruncate function does not exist on Windows. > > In fileobject.c, there is a replacement for ftruncate. See the code > around the call to SetEndOfFile(). > > I'll try to provide a patch later today. > > -- > Amaury Forgeot d'Arc > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From theller at ctypes.org Wed Jul 11 14:50:44 2007 From: theller at ctypes.org (Thomas Heller) Date: Wed, 11 Jul 2007 14:50:44 +0200 Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k Unittests in py3k-struni In-Reply-To: References: <46940875.2000606@cheimes.de> Message-ID: Guido van Rossum schrieb: > That would be great! Assign it to theller who can test it much better > than I can. > > On 7/11/07, Amaury Forgeot d'Arc wrote: >> Thomas Heller wrote: >> > I would love to look into these, but I prefer debugging on Windows. >> > However, the windows build does not work because the _fileio builtin >> > module is missing from config.c. Again, this is not so easy to fix, >> > because the ftruncate function does not exist on Windows. 
>> >> In fileobject.c, there is a replacement for ftruncate. See the code >> around the call to SetEndOfFile(). >> >> I'll try to provide a patch later today. Awaiting your patch ;-). The most important problem, IMO, is now that wide filenames on Windows are not implemented, see the code starting at line 148 in _fileio.c. This prevents most unittests to run because test_support cannot be imported: C:\svn\py3k-struni\PCbuild>python -E -tt ../lib/test/regrtest.py Traceback (most recent call last): File "../lib/test/regrtest.py", line 165, in from test import test_support File "C:\svn\py3k-struni\lib\test\test_support.py", line 182, in fp = open(TESTFN, 'w+') File "C:\svn\py3k-struni\lib\site.py", line 412, in __new__ return io.open(*args, **kwds) File "C:\svn\py3k-struni\lib\io.py", line 122, in open (updating and "+" or "")) NotImplementedError: Windows wide filenames are not yet supported Thomas From theller at ctypes.org Wed Jul 11 16:08:47 2007 From: theller at ctypes.org (Thomas Heller) Date: Wed, 11 Jul 2007 16:08:47 +0200 Subject: [Python-3000] Heaptypes Message-ID: ctypes creates heaptypes with this call, in _ctypes.c, line 3986 (slightly simplified): result = PyObject_CallFunction((PyObject *)&ArrayType_Type, "s(O){s:n,s:O}", name, &Array_Type, "_length_", length, "_type_", itemtype ); The call succeeds. Printing the type fails with an assertion: theller at tubu:~/devel/py3k-struni$ ./python Python 3.0x (py3k-struni:56268M, Jul 11 2007, 15:56:43) [GCC 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu9)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> from ctypes import c_int [54751 refs] >>> atype = c_int * 3 [54762 refs] >>> atype.__name__ s'c_int_Array_3' [55278 refs] >>> repr(atype) python: Objects/unicodeobject.c:630: PyUnicodeUCS2_FromFormatV: Assertion `obj && ((((obj)->ob_type)->tp_flags & ((1L<<28))) != 0)' failed. 
Abgebrochen theller at tubu:~/devel/py3k-struni$ As one can see, the __name__ is a byte string (or what is this called now?). The fix is probably to use a 'U' format character in the PyObject_CallFunction format string, but I assume the call should have failed in the first place? And what about the dictionary that is constructed for the call '{s:n,s:O}', should it use 'U' format chars also? Thomas From guido at python.org Wed Jul 11 16:15:41 2007 From: guido at python.org (Guido van Rossum) Date: Wed, 11 Jul 2007 17:15:41 +0300 Subject: [Python-3000] Heaptypes In-Reply-To: References: Message-ID: There are currently three "string" types, here shown with their repr styles: - str = 'same as unicode in 2.x' - bytes = b'new, mutable list of small ints' - str8 = s'same as str in 2.x' The s'...' notation means it's an 8-bit string (not a bytes array). This is not supported in the syntax; it's just used on output. (Use str8(b'...') to create one of these.) I'm still hoping to remove this type before the release, but it appears to be still necessary so far. I don't know enough about ...CallFunction to help you with the rest. --Guido On 7/11/07, Thomas Heller wrote: > ctypes creates heaptypes with this call, in _ctypes.c, line 3986 (slightly simplified): > > result = PyObject_CallFunction((PyObject *)&ArrayType_Type, > "s(O){s:n,s:O}", > name, > &Array_Type, > "_length_", > length, > "_type_", > itemtype > ); > > The call succeeds. Printing the type fails with an assertion: > > theller at tubu:~/devel/py3k-struni$ ./python > Python 3.0x (py3k-struni:56268M, Jul 11 2007, 15:56:43) > [GCC 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu9)] on linux2 > Type "help", "copyright", "credits" or "license" for more information.
> >>> from ctypes import c_int > [54751 refs] > >>> atype = c_int * 3 > [54762 refs] > >>> atype.__name__ > s'c_int_Array_3' > [55278 refs] > >>> repr(atype) > python: Objects/unicodeobject.c:630: PyUnicodeUCS2_FromFormatV: Assertion `obj && ((((obj)->ob_type)->tp_flags & ((1L<<28))) != 0)' failed. > Abgebrochen > theller at tubu:~/devel/py3k-struni$ > > As one can see, the __name__ is a byte string (or how is this called now?). > The fix is probably to use an 'U' format character in the PyObject_CallFunction format string, > but I assume the call should have failed in the first place? And what about the dictionary that > is constructed for the call '{s:n,s:O}', should it use 'U' format chars also? > > Thomas > > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From theller at ctypes.org Wed Jul 11 16:39:00 2007 From: theller at ctypes.org (Thomas Heller) Date: Wed, 11 Jul 2007 16:39:00 +0200 Subject: [Python-3000] Heaptypes In-Reply-To: References: Message-ID: Guido van Rossum schrieb: > There are currently three "string" types, here shown with there repr styles: > > - str = 'same as unicode in 2.x' > - bytes = b'new, mutable list of small ints' > - str8 = s'same as str in 2.x' > > The s'...' notation means it's an 8-bit string (not a bytes array). > This is not supported in the syntax; it's just used on output. (Use > str8(b'...') to create one of these.) I'm still hoping to remove this > type before the release, but it appears to be still necessary so far. > > I don't know enouch about ...CallFunction to help you with the rest. Let me explain it in other words. 
This code creates a new type: >>> ht = type("name", (object,), {}) [47054 refs] >>> ht [47093 refs] The '__name__' attribute is a (unicode) string: >>> ht.__name__ 'name' [47121 refs] >>> But I can also create a type in this way: >>> ht = type(str8(b"name"), (object,), {}) [47208 refs] The __name__ attribute is a str8 instance: >>> ht.__name__ s'name' [47236 refs] Printing the type triggers an assertion: >>> ht Assertion failed: obj && PyUnicode_Check(obj), file \svn\py3k-struni\Objects\unicodeobject.c, line 630 C:\svn\py3k-struni\PCbuild> because parts of the code assume that the '__name__' is a (unicode) string. Thomas From guido at python.org Wed Jul 11 16:47:47 2007 From: guido at python.org (Guido van Rossum) Date: Wed, 11 Jul 2007 17:47:47 +0300 Subject: [Python-3000] Heaptypes In-Reply-To: References: Message-ID: On 7/11/07, Thomas Heller wrote: > Let me explain it in other words. This code creates a new type: > > >>> ht = type("name", (object,), {}) > [47054 refs] > >>> ht > > [47093 refs] > > The '__name__' attribute is a (unicode) string: > > >>> ht.__name__ > 'name' > [47121 refs] > >>> > > But I can also create a type in this way: > > >>> ht = type(str8(b"name"), (object,), {}) > [47208 refs] > > The __name__ attribute is a str8 instance: > > >>> ht.__name__ > s'name' > [47236 refs] > > Printing the type triggers an assertion: > > >>> ht > Assertion failed: obj && PyUnicode_Check(obj), file \svn\py3k-struni\Objects\unicodeobject.c, line 630 > C:\svn\py3k-struni\PCbuild> > > because parts of the code assume that the '__name__' is a (unicode) string. Hm. I guess the creation must insist that __name__ is a unicode. Can you fix this yourself? 
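As a sketch of what "insisting" on a text name looks like from Python code, the following assumes a released CPython 3, where type() rejects a bytes name with TypeError. That is the behaviour of later releases and not necessarily what the py3k-struni branch did at the time of this thread.

```python
# A text (unicode) name is accepted and stored as-is.
ht = type("name", (object,), {})
print(ht.__name__)

# A bytes name is refused at creation time, so repr() never sees it.
try:
    type(b"name", (object,), {})
except TypeError as exc:
    creation_rejected = True
    print("rejected at creation:", exc)
else:
    creation_rejected = False

# Assigning a bytes object to __name__ afterwards is refused as well.
try:
    ht.__name__ = b"other"
except TypeError as exc:
    assignment_rejected = True
    print("rejected on assignment:", exc)
else:
    assignment_rejected = False
```

Checking at both entry points means the rest of the code can keep assuming that __name__ is a text string.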
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas at python.org Wed Jul 11 17:07:48 2007 From: thomas at python.org (Thomas Wouters) Date: Wed, 11 Jul 2007 08:07:48 -0700 Subject: [Python-3000] Fwd: Your confirmation is required to leave the Python-3000 mailing list In-Reply-To: References: Message-ID: <9e804ac0707110807i1c2e515dy5538aa487b711649@mail.gmail.com> I can't find the message you forwarded in mail.python.org's logs (although I may be looking wrong; it's hard to do such a search without the full headers of the original) -- but it looks to me like it was a hoax, and not an actual unsubscription request from mail.python.org. On 7/11/07, Guido van Rossum wrote: > > Which joker tried to unsub me? > > ---------- Forwarded message ---------- > From: python-3000-confirm+e08ed5828...1ff418f380758543 at python.org > > Date: Jul 11, 2007 12:43 PM > Subject: Your confirmation is required to leave the Python-3000 mailing > list > To: guido at python.org > > > Mailing list removal confirmation notice for mailing list Python-3000 > > We have received a request for the removal of your email address, > "guido at python.org" from the python-3000 at python.org mailing list. To > confirm that you want to be removed from this mailing list, simply > reply to this message, keeping the Subject: header intact. Or visit > this web page: > > http://mail.python.org/mailman/confirm/python-3000/e08ed5828...8543 > > > Or include the following line -- and only the following line -- in a > message to python-3000-request at python.org: > > confirm e08e...0758543 > > Note that simply sending a `reply' to this message should work from > most mail readers, since that usually leaves the Subject: line in the > right form (additional "Re:" text in the Subject: is okay). > > If you do not wish to be removed from this list, please simply > disregard this message.
If you think you are being maliciously > removed from the list, or have any other questions, send them to > python-3000-owner at python.org. > > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: > http://mail.python.org/mailman/options/python-3000/thomas%40python.org > -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-3000/attachments/20070711/b89d3fc1/attachment.html From walter at livinglogic.de Wed Jul 11 17:28:09 2007 From: walter at livinglogic.de (=?UTF-8?B?V2FsdGVyIETDtnJ3YWxk?=) Date: Wed, 11 Jul 2007 17:28:09 +0200 Subject: [Python-3000] Need help fixing failing Py3k Unittests in py3k-struni In-Reply-To: References: <46948C1E.5050800@livinglogic.de> Message-ID: <4694F709.2040304@livinglogic.de> Guido van Rossum wrote: > Fixed in subversion. Please do review r56252 to see that I did the right > thing. I haven't looked at test_descr.py but the rest looks good to me. I guess for the final version of Py3000 type_set_name() in typeobject.c will not downgrade unicode strings to str8, but instead upgrade str8 objects to unicode. Also now that PyObject_Unicode() tries __unicode__ first and then tp_str should we rename all __unicode__ methods to __str__, or will __unicode__ stay? Servus, Walter > On 7/11/07, Guido van Rossum wrote: >> Yeah, I'm looking in to this right now. What a mess! But I'm close to >> a fix. >> >> There's more that causes test_descr to fail however. Bleh, what a >> terrible unit test -- it doesn't use the unittest module, and a single >> failure aborts the rest of the test. 
>> >> --Guido >> >> On 7/11/07, Walter Dörwald wrote: >> > Christian Heimes wrote: >> > >> > > I found a bug in the str type that may affect a lot of tests. >> > > >> > > In the py3k-struni branch the str() constructor doesn't use >> __str__ when >> > > the argument is an instance of a subclass of str. A user defined >> string >> > > can't change __str__(). The __repr__ method isn't affected. >> > >> > This hasn't been rewired yet. Behind the covers str still behaves like >> > unicode, i.e. it uses __unicode__ for conversion. >> > >> > Servus, >> > Walter >> > _______________________________________________ >> > Python-3000 mailing list >> > Python-3000 at python.org >> > http://mail.python.org/mailman/listinfo/python-3000 >> > Unsubscribe: >> http://mail.python.org/mailman/options/python-3000/guido%40python.org >> > >> >> >> -- >> --Guido van Rossum (home page: http://www.python.org/~guido/) >> > > From theller at ctypes.org Wed Jul 11 17:52:49 2007 From: theller at ctypes.org (Thomas Heller) Date: Wed, 11 Jul 2007 17:52:49 +0200 Subject: [Python-3000] Need help fixing failing Py3k Unittests in py3k-struni In-Reply-To: <4694F709.2040304@livinglogic.de> References: <46948C1E.5050800@livinglogic.de> <4694F709.2040304@livinglogic.de> Message-ID: Walter Dörwald schrieb: > > I guess for the final version of Py3000 type_set_name() in typeobject.c > will not downgrade unicode strings to str8, but instead upgrade str8 > objects to unicode. I'm currently working on type_set_name, see the other message with subject 'Heaptypes'.
Thomas From amauryfa at gmail.com Wed Jul 11 17:53:14 2007 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Wed, 11 Jul 2007 17:53:14 +0200 Subject: [Python-3000] Fwd: Your confirmation is required to leave the Python-3000 mailing list In-Reply-To: <9e804ac0707110807i1c2e515dy5538aa487b711649@mail.gmail.com> References: <9e804ac0707110807i1c2e515dy5538aa487b711649@mail.gmail.com> Message-ID: Hello, Thomas Wouters wrote: > > I can't find the message you forwarded in mail.python.org's logs (although I > may be looking wrong; its hard to do such a search without the full headers > of the original) -- but it looks to me like it was a hoax, and not an actual > unsubscription request from mail.python.org. ... > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: > http://mail.python.org/mailman/options/python-3000/amauryfa%40gmail.com > Every mail sent by mailman seems to contain a self-unsubscribe link, like the one just above. On reply, the link (with *my* address) is part of the quoted text. Did someone click on such a link, and used the web interface? -- Amaury Forgeot d'Arc From chrism at plope.com Wed Jul 11 19:16:01 2007 From: chrism at plope.com (Chris McDonough) Date: Wed, 11 Jul 2007 13:16:01 -0400 Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k Unittests in py3k-struni In-Reply-To: References: Message-ID: I have a very remedial question about how to fix test failures due to the side effects of string-unicode integration. The xmlrpc library uses explicit encoding to encode XML tag payloads to (almost always) utf8. Tag literals are not encoded. What would be the best way to mimic this behavior under the new regime? Just use unicode everywhere and encode the entire XML body to utf-8 at the end? Or deal explicitly in bytes everywhere? Or..? 
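The first option mentioned above (text everywhere, one encode at the end) could be sketched as follows. This is not how xmlrpclib is or was implemented; build_call and its parameters are hypothetical names, and only string parameters are handled.

```python
from xml.sax.saxutils import escape


def build_call(method_name, params, encoding="utf-8"):
    """Assemble a request body as text, then encode it exactly once."""
    parts = [
        '<?xml version="1.0" encoding="%s"?>' % encoding,
        "<methodCall>",
        "<methodName>%s</methodName>" % escape(method_name),
        "<params>",
    ]
    for value in params:
        # Payloads stay as text here; escape() only handles &, < and >.
        parts.append(
            "<param><value><string>%s</string></value></param>" % escape(value)
        )
    parts.append("</params></methodCall>")
    # The single point where text becomes bytes.
    return "".join(parts).encode(encoding)


body = build_call("greet", ["Dörwald & co"])
print(type(body).__name__, len(body))
```

With this shape the tag literals and the payloads are encoded together, so there is no place where encoded and unencoded pieces get mixed.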
Remedially, - C On Jul 10, 2007, at 5:14 PM, Guido van Rossum wrote: > One of the most daunting tasks remaining for Python 3.0a1 (to be > released by the end of August) is fixing the remaining failing unit > tests in the py3k-struni branch > (http://svn.python.org/view/python/branches/py3k-struni/). > > This is the branch where I have started the work on the > string/unification branch. I want to promote this branch to become the > "main" Py3k branch ASAP (by renaming it to py3k), but I don't want to > do that until all unit tests pass. I've been working diligently on > this task, and I've got it down to about 50 tests that are failing on > at least one of OSX and Ubuntu (the platforms to which I have easy > access). Now I need help. > > To facilitate distributing the task of getting the remaining tests to > pass, I've created a wiki page: > http://wiki.python.org/moin/Py3kStrUniTests . Please help! It's easy > to help: (1) check out the py3k-struni branch; (2) build it; (3) pick > a test and figure out why it's failing; (4) produce a fix; (5) submit > the fix to SF (or check it in, if you have submit privileges and are > confident enough). > > In order to avoid duplicate work, I've come up with a simple protocol: > you mark a test in the wiki as "MINE" (with your name) when you start > looking at it. You mark it as "FIXED [IN SF]" once you fix it, adding > the patch# if the fix is in SF. If you give up, remove your lock, > adding instead a note with what you've found (even just the names of > the failing subtests is helpful). > > Please help! > > There are other tasks, see PEP 3100. Mail me if you're interested in > anything specifically. (Please don't ask me "do you think I could do > this" -- you know better than I whether you're capable of coding at a > specific level. If you don't understand the task, you're probably not > qualified.) 
> > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists > %40plope.com > From amauryfa at gmail.com Wed Jul 11 20:33:46 2007 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Wed, 11 Jul 2007 20:33:46 +0200 Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k Unittests in py3k-struni In-Reply-To: References: <46940875.2000606@cheimes.de> Message-ID: Hi, Thomas Heller wrote: > The most important problem, IMO, is now that wide filenames on Windows are not > implemented, see the code starting at line 148 in _fileio.c. This prevents > most unittests to run because test_support cannot be imported: > > C:\svn\py3k-struni\PCbuild>python -E -tt ../lib/test/regrtest.py > Traceback (most recent call last): > File "../lib/test/regrtest.py", line 165, in > from test import test_support > File "C:\svn\py3k-struni\lib\test\test_support.py", line 182, in > fp = open(TESTFN, 'w+') > File "C:\svn\py3k-struni\lib\site.py", line 412, in __new__ > return io.open(*args, **kwds) > File "C:\svn\py3k-struni\lib\io.py", line 122, in open > (updating and "+" or "")) > NotImplementedError: Windows wide filenames are not yet supported The attached patch corrects this. Now open() accept both unicode strings and bytes objects. -- Amaury Forgeot d'Arc -------------- next part -------------- A non-text attachment was scrubbed... 
Name: fileio-1.diff Type: application/octet-stream Size: 1473 bytes Desc: not available Url : http://mail.python.org/pipermail/python-3000/attachments/20070711/5bb54244/attachment.obj From amauryfa at gmail.com Wed Jul 11 21:13:31 2007 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Wed, 11 Jul 2007 21:13:31 +0200 Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k Unittests in py3k-struni In-Reply-To: References: <46940875.2000606@cheimes.de> Message-ID: Re-hello, Thomas Heller wrote: > > On 7/11/07, Amaury Forgeot d'Arc wrote: > >> Thomas Heller wrote: > >> > I would love to look into these, but I prefer debugging on Windows. > >> > However, the windows build does not work because the _fileio builtin > >> > module is missing from config.c. Again, this is not so easy to fix, > >> > because the ftruncate function does not exist on Windows. > >> > >> In fileobject.c, there is a replacement for ftruncate. See the code > >> around the call to SetEndOfFile(). > >> > >> I'll try to provide a patch later today. > > Awaiting your patch ;-). Ok, here it is; shamelessly copied from fileobject.c. BTW, what is the status of this fileobject? open() doesn't seem to use it anymore. Will file() be removed at some point? Now test_fileio passes on Windows, with the exception of testAbles(): since c:\dev is an existing directory on my machine, /dev/tty is a regular file and is seekable... Maybe skip this test on win32? I have a couple of other corrections, found by randomly playing with the tests functions... shall I post the corrections here as well? -- Amaury Forgeot d'Arc -------------- next part -------------- A non-text attachment was scrubbed... 
Name: fileio-2.diff Type: application/octet-stream Size: 1681 bytes Desc: not available Url : http://mail.python.org/pipermail/python-3000/attachments/20070711/d18cdd5d/attachment-0001.obj From theller at ctypes.org Wed Jul 11 22:07:11 2007 From: theller at ctypes.org (Thomas Heller) Date: Wed, 11 Jul 2007 22:07:11 +0200 Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k Unittests in py3k-struni In-Reply-To: References: <46940875.2000606@cheimes.de> Message-ID: Amaury Forgeot d'Arc schrieb: > Re-hello, > > Thomas Heller wrote: >> > On 7/11/07, Amaury Forgeot d'Arc wrote: >> >> Thomas Heller wrote: >> >> > I would love to look into these, but I prefer debugging on Windows. >> >> > However, the windows build does not work because the _fileio builtin >> >> > module is missing from config.c. Again, this is not so easy to fix, >> >> > because the ftruncate function does not exist on Windows. >> >> >> >> In fileobject.c, there is a replacement for ftruncate. See the code >> >> around the call to SetEndOfFile(). >> >> >> >> I'll try to provide a patch later today. >> >> Awaiting your patch ;-). > > Ok, here it is; shamelessly copied from fileobject.c. Amaury, please upload your patches to the SF bug tracker, and assign them to me. I will (hopefully) look into them tomorrow. > BTW, what is the status of this fileobject? open() doesn't seem to use > it anymore. Will file() be removed at some point? > > Now test_fileio passes on Windows, > with the exception of testAbles(): since c:\dev is an existing > directory on my machine, /dev/tty is a regular file and is seekable... > Maybe skip this test on win32? > > I have a couple of other corrections, found by randomly playing with > the tests functions... shall I post the corrections here as well? See above: posting them to the tracker makes sure they don't get lost. 
Thanks, Thomas From guido at python.org Wed Jul 11 23:03:03 2007 From: guido at python.org (Guido van Rossum) Date: Thu, 12 Jul 2007 00:03:03 +0300 Subject: [Python-3000] Need help fixing failing Py3k Unittests in py3k-struni In-Reply-To: <4694F709.2040304@livinglogic.de> References: <46948C1E.5050800@livinglogic.de> <4694F709.2040304@livinglogic.de> Message-ID: On 7/11/07, Walter Dörwald wrote: > I guess for the final version of Py3000 type_set_name() in typeobject.c > will not downgrade unicode strings to str8, but instead upgrade str8 > objects to unicode. Right, Thomas is working on this (but I have some feedback on his fix). > Also now that PyObject_Unicode() tries __unicode__ first and then tp_str > should we rename all __unicode__ methods to __str__, or will __unicode__ > stay? __unicode__ should be renamed to __str__, or removed (depending on whether the __str__ method already does the right thing). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Jul 11 23:05:39 2007 From: guido at python.org (Guido van Rossum) Date: Thu, 12 Jul 2007 00:05:39 +0300 Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k Unittests in py3k-struni In-Reply-To: References: Message-ID: On 7/11/07, Chris McDonough wrote: > I have a very remedial question about how to fix test failures due to > the side effects of string-unicode integration. > > The xmlrpc library uses explicit encoding to encode XML tag payloads > to (almost always) utf8. Tag literals are not encoded. > > What would be the best way to mimic this behavior under the new > regime? Just use unicode everywhere and encode the entire XML body > to utf-8 at the end? Or deal explicitly in bytes everywhere? Or..? The correct approach would be to use Unicode (i.e., str) everywhere and encode to UTF-8 at the end. If that's too hard something's wrong with the philosophy of using Unicode everywhere...
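A minimal sketch of the "str everywhere, encode once at the end" approach described above, written in present-day Python 3. The helper name build_method_call and its string-only parameter handling are assumptions made for illustration; this is not the xmlrpc library's actual code.

```python
from xml.sax.saxutils import escape


def build_method_call(method_name, string_params):
    """Build an XML-RPC style request as text, then encode it once.

    Hypothetical helper for illustration only: it is not the stdlib
    xmlrpc implementation and it handles string parameters only.
    """
    parts = ["<?xml version='1.0' encoding='utf-8'?>", "<methodCall>"]
    parts.append("<methodName>%s</methodName>" % escape(method_name))
    parts.append("<params>")
    for value in string_params:
        # Tag literals and payloads are both plain str (Unicode) here.
        parts.append(
            "<param><value><string>%s</string></value></param>" % escape(value)
        )
    parts.append("</params></methodCall>")
    # The only place where text becomes bytes.
    return "".join(parts).encode("utf-8")


request_body = build_method_call("echo", ["caf\u00e9 & cr\u00e8me"])
```

Keeping the whole document as str until the final encode() means tag literals and payloads can never end up in different encodings, which is the failure mode the per-payload encoding had to guard against.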
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Jul 11 23:08:35 2007 From: guido at python.org (Guido van Rossum) Date: Thu, 12 Jul 2007 00:08:35 +0300 Subject: [Python-3000] [Python-Dev] Need help fixing failing Py3k Unittests in py3k-struni In-Reply-To: References: <46940875.2000606@cheimes.de> Message-ID: On 7/11/07, Amaury Forgeot d'Arc wrote: > BTW, what is the status of this fileobject? open() doesn't seem to use > it anymore. Will file() be removed at some point? The 'file' builtin is already gone. (You did use the py3k-struni branch, didn't you?) Some parts of the fileobject.c file will remain, but the only APIs that remain in there are generic I/O APIs that work with file-like objects (in particular io.IOBase). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From nas at arctrix.com Thu Jul 12 01:12:45 2007 From: nas at arctrix.com (Neil Schemenauer) Date: Wed, 11 Jul 2007 23:12:45 +0000 (UTC) Subject: [Python-3000] Change _Py prefix for 3k? Message-ID: It's a small detail but I wonder if it's time to stop using a leading underscore for internal APIs. I'm not sure what would be a good replacement, perhaps a trailing underscore. In case people don't remember, the _Py prefix could, theoretically, be invalid C on some platforms. Regards, Neil From tjreedy at udel.edu Thu Jul 12 04:01:01 2007 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 11 Jul 2007 22:01:01 -0400 Subject: [Python-3000] PEP 3099 += no bool change? Message-ID: Someone asked if Py3 would get a "real" or "pure" bool type (one not subclassing int). [The usual complaints and rehash about current bool ensued.] I believe (and said so) that this is a settled question. If so, please add a line under Standard types * bool will continue to subclass int.
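For readers unsure what "bool subclasses int" means in practice, a short sketch of the observable behaviour. The variable names are illustrative only.

```python
# bool is a subclass of int, so True and False behave as 1 and 0.
bool_is_int_subclass = issubclass(bool, int)
true_is_int_instance = isinstance(True, int)

# Arithmetic on booleans is ordinary integer arithmetic.
two = True + True

# A common idiom that relies on this: counting the true flags.
flags = [True, False, True]
true_count = sum(flags)
```

A "pure" bool type that did not subclass int would break idioms such as sum(flags), which is part of why the question is considered settled.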
tjr From joe at bitworking.org Thu Jul 12 07:02:51 2007 From: joe at bitworking.org (Joe Gregorio) Date: Thu, 12 Jul 2007 01:02:51 -0400 Subject: [Python-3000] test_mmap.py and OSError Message-ID: <3f1451f50707112202n320e4b25hfadb3670129ba33a@mail.gmail.com> I decided to try to tackle the unit tests failing on the py3k-struni branch for mmap. It now passes all the unit tests but one, and the problem is that I don't know what should be 'fixed'. The code in the unit test is: finally: try: f.close() except OSError: pass The problem is that the file is already closed and in Lib/io.py, the close calls flush() and flush() raises ValueError() if the file is already closed, but the unit test is looking for OSError. Should io.py raise OSError instead of ValueError? Or should test_mmap.py be expecting ValueError? Or is there something else that I'm completely missing? [ The wisdom of choosing mmap as my first fiddling with Python internals can be debated later :) ] Thanks, -joe -- Joe Gregorio http://bitworking.org From greg.ewing at canterbury.ac.nz Thu Jul 12 07:26:54 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 12 Jul 2007 17:26:54 +1200 Subject: [Python-3000] test_mmap.py and OSError In-Reply-To: <3f1451f50707112202n320e4b25hfadb3670129ba33a@mail.gmail.com> References: <3f1451f50707112202n320e4b25hfadb3670129ba33a@mail.gmail.com> Message-ID: <4695BB9E.2030202@canterbury.ac.nz> Joe Gregorio wrote: > flush() raises > ValueError() if the file is already closed, > > Should io.py raise OSError instead of ValueError? Is it really necessary to raise anything at all? An already-closed file is as flushed as it can get, so why not just let it be a no-op? -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiem! | Christchurch, New Zealand | (I'm not a morning person.) 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From guido at python.org Thu Jul 12 09:02:29 2007 From: guido at python.org (Guido van Rossum) Date: Thu, 12 Jul 2007 10:02:29 +0300 Subject: [Python-3000] test_mmap.py and OSError In-Reply-To: <4695BB9E.2030202@canterbury.ac.nz> References: <3f1451f50707112202n320e4b25hfadb3670129ba33a@mail.gmail.com> <4695BB9E.2030202@canterbury.ac.nz> Message-ID: On 7/12/07, Greg Ewing wrote: > Joe Gregorio wrote: > > flush() raises > > ValueError() if the file is already closed, > > > > Should io.py raise OSError instead of ValueError? > > Is it really necessary to raise anything at all? > An already-closed file is as flushed as it can > get, so why not just let it be a no-op? I like that much better. So close() shouldn't try to flush() if it's already closed. This means fixing io.py. (Unfortunately it's a bit of a mess, a bit of refactoring would do it good.) BTW whenever changing io.py, always run both test_io.py and test_file.py, as they test slightly different sets of behavior. (Though occasionally these tests must be adjusted too.) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Jul 12 09:04:44 2007 From: guido at python.org (Guido van Rossum) Date: Thu, 12 Jul 2007 10:04:44 +0300 Subject: [Python-3000] Change _Py prefix for 3k? In-Reply-To: References: Message-ID: On 7/12/07, Neil Schemenauer wrote: > It's a small detail but I wonder if it's time to stop using a > leading underscore for internal APIs. I'm not sure what would be a > good replacement, perhaps a trailing underscore. In case people > don't remember, the _Py prefix could, theoretically, be invalid C on > some platforms. There are lots of things we do that could theoretically be bad C. I doubt that this particular one will ever bite us. Are there any other reasons for such a change? 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From walter at livinglogic.de Thu Jul 12 14:16:33 2007 From: walter at livinglogic.de (Walter Dörwald) Date: Thu, 12 Jul 2007 14:16:33 +0200 Subject: [Python-3000] Need help fixing failing Py3k Unittests in py3k-struni In-Reply-To: References: <46948C1E.5050800@livinglogic.de> <4694F709.2040304@livinglogic.de> Message-ID: <46961BA1.9080206@livinglogic.de> Guido van Rossum wrote: > On 7/11/07, Walter Dörwald wrote: >> I guess for the final version of Py3000 type_set_name() in typeobject.c >> will not downgrade unicode strings to str8, but instead upgrade str8 >> objects to unicode. > > Right, Thomas is working on this (but I have some feedback on his fix). > >> Also now that PyObject_Unicode() tries __unicode__ first and then tp_str >> should we rename all __unicode__ methods to __str__, or will __unicode__ >> stay? > > __unicode__ should be renamed to __str__, or removed (depending on > whether the __str__ method already does the right thing). I've dropped __unicode__ from tkinter. The only remaining __unicode__ use is in the email package (besides the tests, where IMHO __unicode__ should stay as long as it's handled by PyObject_Unicode()). email.Header.Header defines a __unicode__ which is different from the __str__ method. I guess Barry will know how to fix this. Servus, Walter From joe at bitworking.org Thu Jul 12 15:54:23 2007 From: joe at bitworking.org (Joe Gregorio) Date: Thu, 12 Jul 2007 09:54:23 -0400 Subject: [Python-3000] test_mmap.py and OSError In-Reply-To: References: <3f1451f50707112202n320e4b25hfadb3670129ba33a@mail.gmail.com> <4695BB9E.2030202@canterbury.ac.nz> Message-ID: <3f1451f50707120654s13e81551x25df9a1dadccafb0@mail.gmail.com> On 7/12/07, Guido van Rossum wrote: > On 7/12/07, Greg Ewing wrote: > > Joe Gregorio wrote: > > > flush() raises > > > ValueError() if the file is already closed, > > > > > > Should io.py raise OSError instead of ValueError?
> > > > Is it really necessary to raise anything at all? > > An already-closed file is as flushed as it can > > get, so why not just let it be a no-op? > > I like that much better. So close() shouldn't try to flush() if it's > already closed. This means fixing io.py. (Unfortunately it's a bit of > a mess, a bit of refactoring would do it good.) Thanks for the guidance. This patch fixes mmap and also changes io.py so that close() doesn't flush if it's already closed. I did run both test_io.py and test_file.py when checking the changes to io.py. http://www.python.org/sf/1752647 Thanks, -joe -- Joe Gregorio http://bitworking.org From nas at arctrix.com Thu Jul 12 17:53:48 2007 From: nas at arctrix.com (Neil Schemenauer) Date: Thu, 12 Jul 2007 09:53:48 -0600 Subject: [Python-3000] Change _Py prefix for 3k? In-Reply-To: References: Message-ID: <20070712155348.GA29907@arctrix.com> On Thu, Jul 12, 2007 at 10:04:44AM +0300, Guido van Rossum wrote: > There are lots of things we do that could theoretically be bad C. I > doubt that this particular one will ever bite us. Are there any other > reasons for such a change? I think Python is one of the only open source projects to use a _[A-Z] prefix on non-local symbols. That seems more dangerous than other non-standard stuff. Also, it could be hard to work around if someone runs into trouble. My gut feeling is that it's not worth the effort to change but I wanted it to be considered for 3k. Neil From martin at v.loewis.de Fri Jul 13 17:19:45 2007 From: martin at v.loewis.de ("Martin v. Löwis") Date: Fri, 13 Jul 2007 17:19:45 +0200 Subject: [Python-3000] Heaptypes In-Reply-To: References: Message-ID: <46979811.2050405@v.loewis.de> > I don't know enough about ...CallFunction to help you with the rest. I wonder whether the "s" specifier in CallFunction, BuildValue etc should create Unicode objects, rather than str8 objects.
Regards, Martin From pje at telecommunity.com Fri Jul 13 19:41:47 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 13 Jul 2007 13:41:47 -0400 Subject: [Python-3000] pep 3124 plans In-Reply-To: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.co m> References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com> Message-ID: <20070713173936.53C213A404D@sparrow.telecommunity.com> At 07:39 AM 7/13/2007 +0200, Michele Simionato wrote: >But I want to ask your opinion first, in order to understand if you >are willing to scale down your proposal or not. At EuroPython Guido >said that in private mail you made some strong argument explaining >why the PEP could not be simplified, but he did not say more than that It's not an argument that the PEP can't be simplified; only that a simpler PEP won't accomplish my original goal for the PEP (of having a generic API for generic functions) vs. simply having a generic function implementation in the stdlib. The first goal requires the second, but the second doesn't need the first, and as far as I'm aware, I'm the only person who really wants the first. A simpler PEP could exist to implement the second goal only, implementing dynamic overloading in Python 3.0 with all of the non-controversial features of 3124, and using Guido's preferred API. The holdup is that I don't have time to work on the *implementation* of both my version *and* this simplified version; there is little overlap between the two because mine is highly self-referential/self-bootstrapping, absolutely dependent on being able to modify functions in-place (a feature Guido seems near -1 on), and virtually impossible to scale down. So, it is much lower on my priorities at the moment to implement the simplified version, because I will neither gain code reuse *nor* the API standardization I'd hoped for. 
At the moment, my plan is to finish implementing a PEP 3124-like, fully extensible implementation for Python 2.x (see PEAK-Rules), then look at splitting 3124 into a simplified version and a separate extension API PEP aimed at Python 3.1 or later. At that point, I will know for sure what extension API features are necessary to implement the more advanced features I want in PEAK-Rules. I expect to be able to start work on this (i.e., revisiting the proposal) in about a month. With luck, I will be able to carve out enough time to create the simpler implementation and update the PEP in a reasonable amount of time. However, there is nothing stopping anyone else who wishes it from either making the simpler implementation or drafting the scaled-down PEP. The simpler version Guido wants isn't really that different from his existing generic function prototype, especially if you drop all forms of method combination (including :next_method). It will also need positional dispatching, but that's another feature that could perhaps wait for 3.1 as well. In short, if you want a PEP 3124 implementation started on sooner than about a month from now, you need to find a volunteer or do it yourself. >The point is that for 95% of my use cases, simplegeneric would be >enough, and it is alreay available *now*. So, if Guido was willing >to accept something like simplegeneric for Python 3.0, I would not >mind waiting for multiple dispatch in 3.1. You'll have to ask him about that. For what it's worth, the pkgutil module already contains an even simpler generic function implementation than simplegeneric, and is already in the stdlib albeit undocumented. >The reason why I am not using simplegeneric or RuleDispatch already, >is that I do not want to commit in production to a technology >without the official approval of the BDFL, and I prefer to wait now >than having to change my code later. I guess this means you never use any packages from the Cheeseshop? 
:) From michele.simionato at gmail.com Fri Jul 13 20:37:40 2007 From: michele.simionato at gmail.com (Michele Simionato) Date: Fri, 13 Jul 2007 18:37:40 +0000 (UTC) Subject: [Python-3000] pep 3124 plans References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com> <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.co m> <20070713173936.53C213A404D@sparrow.telecommunity.com> Message-ID: Phillip J. Eby <pje at telecommunity.com> writes: > For what it's worth, the pkgutil > module already contains an even simpler generic function > implementation than simplegeneric, and is already in the stdlib > albeit undocumented. Well, that is good to know. Personally I would be content with something at that level of sophistication (i.e. the absolute minimum). I think there is not much experience in the community with generic functions (except for you) and there is no danger in waiting and in acquiring more experience before including in the standard library a fully featured package. After all, RuleDispatch is already out there and there is no reason for putting everything in the stdlib. For the same reason, I am happy that Zope interfaces will stay out of the stdlib, and that we will have the much simpler ABC (of course one could argue that generic functions are better than ABC and actually I think so, but still ABC are a simpler entry point for most programmers, more in line with how Python has worked until now, and they will allow me to throw away a half-baked interface implementation I am using now, which is always a good thing ;) Michele Simionato From theller at ctypes.org Fri Jul 13 21:13:39 2007 From: theller at ctypes.org (Thomas Heller) Date: Fri, 13 Jul 2007 21:13:39 +0200 Subject: [Python-3000] pep3115 - metaclasses in python 3000 Message-ID: playing a little with py3k... pep3115 mentions that "__prepare__ returns a dictionary-like object which is used to store the class member definitions during evaluation of the class body."
It does not mention whether this dict-like object is used afterwards as the class-dictionary of the created class or not (when the __new__ method of the metaclass is called). The sample-code suggests that it would be used as class dict of the newly created class (the sample code copies it into a regular dictionary before it is passed to the type.__new__ call). However, the actual code in the py3k-struni branch (typeobject.c) copies the passed in dict again. In other words, it seems impossible even with pep3115 to use a custom subclass of dict as a type's __dict__ member, and afaik it is impossible in Python to replace that afterwards. Is this analysis correct? Is that the intent of pep3115? Or could the code be changed so that it is possible to supply a custom type dict with the metaclass? Thanks, Thomas From pje at telecommunity.com Fri Jul 13 23:51:52 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 13 Jul 2007 17:51:52 -0400 Subject: [Python-3000] pep3115 - metaclasses in python 3000 In-Reply-To: References: Message-ID: <20070713214940.9A5883A404D@sparrow.telecommunity.com> At 09:13 PM 7/13/2007 +0200, Thomas Heller wrote: >playing a little with py3k... > >pep3115 mentions that "__prepare__ returns a dictionary-like object >which is used to store the class member definitions during evaluation >of the class body." > >It does not mention whether this dict-like object is used afterwards >as the class-dictionary of the created class or not (when the __new__ >method of the metaclass is called). > >The sample-code suggests that it would be used as class dict of the >newly created class (the sample code copies it into a regular dictionary >before it is passed to the type.__new__ call). >However, the actual code in the py3k-struni branch (typeobject.c) copies >the passed in dict again. 
> >In other words, it seems impossible even with pep3115 to use a custom >subclass of dict as a type's __dict__ member, and afaik it is impossible >in Python to replace that afterwards. > >Is this analysis correct? Is that the intent of pep3115? Or could >the code be changed so that it is possible to supply a custom type dict >with the metaclass? I would suggest that we do not intend that the class __dict__ == the __prepare__ object, even as the default case. Otherwise, we have to find everything that accesses type dictionaries and make sure they can work with other kinds of objects. From talin at acm.org Sat Jul 14 06:56:44 2007 From: talin at acm.org (Talin) Date: Fri, 13 Jul 2007 21:56:44 -0700 Subject: [Python-3000] pep3115 - metaclasses in python 3000 In-Reply-To: References: Message-ID: <4698578C.3080808@acm.org> Thomas Heller wrote: > playing a little with py3k... > > pep3115 mentions that "__prepare__ returns a dictionary-like object > which is used to store the class member definitions during evaluation > of the class body." > > It does not mention whether this dict-like object is used afterwards > as the class-dictionary of the created class or not (when the __new__ > method of the metaclass is called). The intention is that it's up to the metaclass to decide. I suspect that most metaclasses won't want to use the dict-like object as the class dict, for two reasons: 1) The behavior of assigning to the class dict after class creation is likely to be different than the behavior of assignment during class creation. In particular, a typical 'dict-like' object is likely to be slower than a dict (it has more work to do, after all), and you don't want that slowness around once your class is finished initializing. 2) A 'dict-like' object doesn't have to support all of the methods of a real dict, whereas a class dict does. So your dict-like wrapper can be relatively simple.
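The pattern discussed above (a custom namespace from __prepare__ that is used only while the class body runs, then copied into a plain dict) can be sketched as follows in present-day Python 3. RecordingDict, OrderedMeta, member_order and Point are hypothetical names chosen for illustration; they are not taken from the PEP or from typeobject.c.

```python
class RecordingDict(dict):
    """Namespace that remembers the order in which names are first bound."""

    def __init__(self):
        super().__init__()
        self.member_order = []

    def __setitem__(self, key, value):
        if key not in self:
            self.member_order.append(key)
        super().__setitem__(key, value)


class OrderedMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # Used only while the class body is being evaluated.
        return RecordingDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        # Hand a plain dict to type.__new__; the custom namespace is
        # not kept as the class dictionary.
        cls = super().__new__(mcls, name, bases, dict(namespace))
        cls.member_order = [
            n for n in namespace.member_order if not n.startswith("__")
        ]
        return cls


class Point(metaclass=OrderedMeta):
    y = 2
    x = 1

    def norm(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5
```

Whatever the metaclass passes to type.__new__, the resulting Point.__dict__ is not the RecordingDict instance, which matches the observation in this thread that the type machinery copies the dictionary it is given.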
-- Talin From lists at cheimes.de Sat Jul 14 15:36:04 2007 From: lists at cheimes.de (Christian Heimes) Date: Sat, 14 Jul 2007 15:36:04 +0200 Subject: [Python-3000] TextIOWrapper.write(s:str) and bytes in py3k-struni Message-ID: Hello! I'm having some trouble with unit tests in the py3k-struni branch. Some tests like test_uu are failing because an io.TextIOWrapper instance's write() method doesn't handle bytes. The method is defined as:

    def write(self, s: str):
        if self.closed:
            raise ValueError("write to closed file")
        # XXX What if we were just reading?
        b = s.encode(self._encoding)
        if isinstance(b, str):
            b = bytes(b)
        n = self.buffer.write(b)
        if "\n" in s:
            # XXX only if isatty
            self.flush()
        self._snapshot = self._decoder = None
        return len(s)

The problematic lines are the lines from s.encode() to b = bytes(b). The behavior is more than questionable. A bytes object doesn't have an encode() method and str's encode() method always returns bytes. IMO the write() method should be changed to:

    def write(self, s: (str, bytes)):
        if self.closed:
            raise ValueError("write to closed file")
        # XXX What if we were just reading?
        if isinstance(s, basestring):
            b = s.encode(self._encoding)
        elif isinstance(s, bytes):
            b = s
        else:
            b = bytes(s)
        n = self.buffer.write(b)
        if b"\n" in b:
            # XXX only if isatty
            self.flush()
        self._snapshot = self._decoder = None
        return len(s)

Or the write() should explicitly raise a TypeError when it is not allowed to handle bytes. Christian From guido at python.org Sat Jul 14 16:08:31 2007 From: guido at python.org (Guido van Rossum) Date: Sat, 14 Jul 2007 17:08:31 +0300 Subject: [Python-3000] Heaptypes In-Reply-To: <46979811.2050405@v.loewis.de> References: <46979811.2050405@v.loewis.de> Message-ID: That sounds like a good idea to try. It may break some more tests but those are all indications of places that incorrectly still require str8. On 7/13/07, "Martin v. Löwis" wrote: > > I don't know enough about ...CallFunction to help you with the rest.
> > I wonder whether the "s" specifier in CallFunction, BuildValue etc > should create Unicode objects, rather than str8 objects. > > Regards, > Martin > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sun Jul 15 16:17:00 2007 From: guido at python.org (Guido van Rossum) Date: Sun, 15 Jul 2007 07:17:00 -0700 Subject: [Python-3000] Invalid \U escape in source code give hard-to-trace error Message-ID: When a source file contains a string literal with an out-of-range \U escape (e.g. "\U12345678"), instead of a syntax error pointing to the offending literal, I get this, without any indication of the file or line: UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in position 0-9: illegal Unicode character This is quite hard to track down. (Both the location of the bad literal in the source file, and the origin of the error in the parser. :-) Can someone come up with a fix? I note that raw escapes show a slightly different error. I also note that the same issue exists for u"..." literals in Python 2.5. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From g.brandl at gmx.net Sun Jul 15 23:04:19 2007 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 15 Jul 2007 23:04:19 +0200 Subject: [Python-3000] exclusion feature for 2to3? Message-ID: In order to have a codebase run in 2.x and 3.x, via automated translated by 2to3, there should be some "exclusion feature" for single lines that tells the refactorer not to touch those lines. For example, if you have some object that still has an iteritems() method and keeps it, it'll have to stay the same during translation. Same goes, e.g., for methods named next(), has_key() etc. Most obvious would be a special comment, something like for x in curiousobject.iteritems(): # 2to3:keep foo(x) Does that make sense? Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. 
Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From python3now at gmail.com Mon Jul 16 03:14:00 2007 From: python3now at gmail.com (James Thiele) Date: Sun, 15 Jul 2007 18:14:00 -0700 Subject: [Python-3000] exclusion feature for 2to3? In-Reply-To: References: Message-ID: <8f01efd00707151814y600e6derb248e3dec921162c@mail.gmail.com> It makes sense - what would you suggest to specify lines/features to exclude? On 7/15/07, Georg Brandl wrote: > In order to have a codebase run in 2.x and 3.x, via automated translated by > 2to3, there should be some "exclusion feature" for single lines that tells > the refactorer not to touch those lines. > > For example, if you have some object that still has an iteritems() method and > keeps it, it'll have to stay the same during translation. > Same goes, e.g., for methods named next(), has_key() etc. > > Most obvious would be a special comment, something like > > for x in curiousobject.iteritems(): # 2to3:keep > foo(x) > > Does that make sense? > > Georg > > -- > Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. > Four shall be the number of spaces thou shalt indent, and the number of thy > indenting shall be four. Eight shalt thou not indent, nor either indent thou > two, excepting that thou then proceed to four. Tabs are right out. > > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: http://mail.python.org/mailman/options/python-3000/python3now%40gmail.com > From guido at python.org Mon Jul 16 04:22:15 2007 From: guido at python.org (Guido van Rossum) Date: Sun, 15 Jul 2007 19:22:15 -0700 Subject: [Python-3000] exclusion feature for 2to3? 
In-Reply-To: References: Message-ID: On 7/15/07, Georg Brandl wrote: > In order to have a codebase run in 2.x and 3.x, via automated translated by > 2to3, there should be some "exclusion feature" for single lines that tells > the refactorer not to touch those lines. > > For example, if you have some object that still has an iteritems() method and > keeps it, it'll have to stay the same during translation. > Same goes, e.g., for methods named next(), has_key() etc. > > Most obvious would be a special comment, something like > > for x in curiousobject.iteritems(): # 2to3:keep > foo(x) > > Does that make sense? Absolutely. (Were you in the audience of my keynote at EuroPython? I believe I briefly mentioned the need for such a feature there. :-) Can't say I have a good feeling for how to implement it yet, but it should definitely be possible. Precise syntax to be done. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From nnorwitz at gmail.com Mon Jul 16 08:12:26 2007 From: nnorwitz at gmail.com (Neal Norwitz) Date: Sun, 15 Jul 2007 23:12:26 -0700 Subject: [Python-3000] Invalid \U escape in source code give hard-to-trace error In-Reply-To: References: Message-ID: On 7/15/07, Guido van Rossum wrote: > When a source file contains a string literal with an out-of-range \U > escape (e.g. "\U12345678"), instead of a syntax error pointing to the > offending literal, I get this, without any indication of the file or > line: > > UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in > position 0-9: illegal Unicode character > > This is quite hard to track down. (Both the location of the bad > literal in the source file, and the origin of the error in the parser. > :-) Can someone come up with a fix? Take a look at the patch http://python.org/sf/1031213 That might help. I'm not sure if it's the same problem. I really need to dispose of a bunch of things assigned to me. 
:-( n From g.brandl at gmx.net Mon Jul 16 13:23:29 2007 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 16 Jul 2007 13:23:29 +0200 Subject: [Python-3000] exclusion feature for 2to3? In-Reply-To: References: Message-ID: Guido van Rossum schrieb: > On 7/15/07, Georg Brandl wrote: >> In order to have a codebase run in 2.x and 3.x, via automated translated by >> 2to3, there should be some "exclusion feature" for single lines that tells >> the refactorer not to touch those lines. >> >> For example, if you have some object that still has an iteritems() method and >> keeps it, it'll have to stay the same during translation. >> Same goes, e.g., for methods named next(), has_key() etc. >> >> Most obvious would be a special comment, something like >> >> for x in curiousobject.iteritems(): # 2to3:keep >> foo(x) >> >> Does that make sense? > > Absolutely. (Were you in the audience of my keynote at EuroPython? I > believe I briefly mentioned the need for such a feature there. :-) No, I ran the new documentation toolset through 2to3; and e.g. docutils nodes have a has_key() that does something else than __contains__(). Good to know it's planned! Georg From guido at python.org Mon Jul 16 16:16:10 2007 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Jul 2007 07:16:10 -0700 Subject: [Python-3000] exclusion feature for 2to3? In-Reply-To: References: Message-ID: On 7/16/07, Georg Brandl wrote: > > Absolutely. (Were you in the audience of my keynote at EuroPython? I > > believe I briefly mentioned the need for such a feature there. :-) > > No, I ran the new documentation toolset through 2to3; and e.g. docutils > nodes have a has_key() that does something else than __contains__(). > > Good to know it's planned! Planned is a big word. Someone has to design and implement it. BTW I hope to see more core developers from Europe at EuroPython next year! 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Jul 16 20:29:17 2007 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Jul 2007 11:29:17 -0700 Subject: [Python-3000] test_mmap.py and OSError In-Reply-To: <3f1451f50707120654s13e81551x25df9a1dadccafb0@mail.gmail.com> References: <3f1451f50707112202n320e4b25hfadb3670129ba33a@mail.gmail.com> <4695BB9E.2030202@canterbury.ac.nz> <3f1451f50707120654s13e81551x25df9a1dadccafb0@mail.gmail.com> Message-ID: So, after seeing the patch and thinking this over some more, I have changed my mind (again). Attempting to flush a closed file seems to indicate that you're confused about whether a file is closed or not, and that seems indicative of unclear thinking, i.e. it's likely a bug that ought to be caught. I think the original thinking that lead to this being treated as an error in 2.x was correct. I don't see attempts to close an already closed file the same way -- this is a state transition to a final state and it makes total sense that you can reach that state from itself. There are good use cases for allowing this. I don't see the use case for flushing a closed file. --Guido On 7/12/07, Joe Gregorio wrote: > On 7/12/07, Guido van Rossum wrote: > > On 7/12/07, Greg Ewing wrote: > > > Joe Gregorio wrote: > > > > flush() raises > > > > ValueError() if the file is already closed, > > > > > > > > Should io.py raise OSError instead of ValueError? > > > > > > Is it really necessary to raise anything at all? > > > An already-closed file is as flushed as it can > > > get, so why not just let it be a no-op? > > > > I like that much better. So close() shouldn't try to flush() if it's > > already closed. This means fixing io.py. (Unfortunately it's a bit of > > a mess, a bit of refactoring would do it good.) > > Thanks for the guidance. > > This patch fixes mmap and also changes io.py > so that close() doesn't flush if it's already closed. 
> I did run both test_io.py and test_file.py when checking > the changes to io.py. > > http://www.python.org/sf/1752647 > > Thanks, > -joe > > -- > Joe Gregorio http://bitworking.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From joe at bitworking.org Mon Jul 16 20:45:05 2007 From: joe at bitworking.org (Joe Gregorio) Date: Mon, 16 Jul 2007 14:45:05 -0400 Subject: [Python-3000] test_mmap.py and OSError In-Reply-To: References: <3f1451f50707112202n320e4b25hfadb3670129ba33a@mail.gmail.com> <4695BB9E.2030202@canterbury.ac.nz> <3f1451f50707120654s13e81551x25df9a1dadccafb0@mail.gmail.com> Message-ID: <3f1451f50707161145i3a17541arf8c9c8595d641c39@mail.gmail.com> On 7/16/07, Guido van Rossum wrote: > So, after seeing the patch and thinking this over some more, I have > changed my mind (again). Attempting to flush a closed file seems to > indicate that you're confused about whether a file is closed or not, > and that seems indicative of unclear thinking, i.e. it's likely a bug > that ought to be caught. I think the original thinking that lead to > this being treated as an error in 2.x was correct. > > I don't see attempts to close an already closed file the same way -- > this is a state transition to a final state and it makes total sense > that you can reach that state from itself. There are good use cases > for allowing this. I don't see the use case for flushing a closed > file. Personally I like that better, it seems more consistent. Should I change the try/except block in the mmap unit test to look for ValueError or should the exception raised in io.py be of type OSError like the 2.5 code expects? 
test_mmap.py:108 try: f.close() except OSError: pass Thanks, -joe -- Joe Gregorio http://bitworking.org From guido at python.org Mon Jul 16 21:36:59 2007 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Jul 2007 12:36:59 -0700 Subject: [Python-3000] test_mmap.py and OSError In-Reply-To: <3f1451f50707161145i3a17541arf8c9c8595d641c39@mail.gmail.com> References: <3f1451f50707112202n320e4b25hfadb3670129ba33a@mail.gmail.com> <4695BB9E.2030202@canterbury.ac.nz> <3f1451f50707120654s13e81551x25df9a1dadccafb0@mail.gmail.com> <3f1451f50707161145i3a17541arf8c9c8595d641c39@mail.gmail.com> Message-ID: On 7/16/07, Joe Gregorio wrote: > On 7/16/07, Guido van Rossum wrote: > > So, after seeing the patch and thinking this over some more, I have > > changed my mind (again). Attempting to flush a closed file seems to > > indicate that you're confused about whether a file is closed or not, > > and that seems indicative of unclear thinking, i.e. it's likely a bug > > that ought to be caught. I think the original thinking that lead to > > this being treated as an error in 2.x was correct. > > > > I don't see attempts to close an already closed file the same way -- > > this is a state transition to a final state and it makes total sense > > that you can reach that state from itself. There are good use cases > > for allowing this. I don't see the use case for flushing a closed > > file. > > Personally I like that better, it seems more consistent. > > Should I change the try/except block in the mmap unit test to look for > ValueError or should the exception raised in io.py be of type OSError like > the 2.5 code expects? > > test_mmap.py:108 > > try: > f.close() > except OSError: > pass > > Thanks, > -joe I just checked in your changes, but looking at the code, I think it's bogus either way: there should be two separate try/finally blocks corresponding to the two 'f = open(...)' calls. I'll fix it that way. 
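A minimal sketch of the behaviour settled on above, assuming today's io module rather than the 2007 io.py: closing an already closed file is harmless, flushing one raises ValueError, and each open() call gets its own try/finally. The helper name write_then_read and the temporary file are made up for illustration; this is not the actual test_mmap.py fix.

```python
import io
import os
import tempfile

# close() is a transition to a final state, so repeating it is harmless.
buf = io.BytesIO(b"data")
buf.close()
buf.close()  # no error the second time

# flush() on a closed file suggests a bug in the caller, so it raises.
try:
    buf.flush()
except ValueError as exc:
    flush_error = exc
else:
    flush_error = None


def write_then_read(path, payload):
    """Hypothetical helper: one try/finally per open() call."""
    f = open(path, "wb")
    try:
        f.write(payload)
    finally:
        f.close()

    f = open(path, "rb")
    try:
        return f.read()
    finally:
        f.close()


# Exercise the helper on a throwaway file.
fd, tmp_path = tempfile.mkstemp()
os.close(fd)
try:
    round_trip = write_then_read(tmp_path, b"hello")
finally:
    os.remove(tmp_path)
```

With two separate blocks, a failure in the second open() can never reach a close() that belongs to the first file, so no exception needs to be swallowed.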
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Jul 16 22:35:12 2007 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Jul 2007 13:35:12 -0700 Subject: [Python-3000] Invalid \U escape in source code give hard-to-trace error In-Reply-To: References: Message-ID: Doesn't look like it's the same problem. I've assigned that one to Martin who knows that area best of all. On 7/15/07, Neal Norwitz wrote: > On 7/15/07, Guido van Rossum wrote: > > When a source file contains a string literal with an out-of-range \U > > escape (e.g. "\U12345678"), instead of a syntax error pointing to the > > offending literal, I get this, without any indication of the file or > > line: > > > > UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in > > position 0-9: illegal Unicode character > > > > This is quite hard to track down. (Both the location of the bad > > literal in the source file, and the origin of the error in the parser. > > :-) Can someone come up with a fix? > > Take a look at the patch http://python.org/sf/1031213 > > That might help. I'm not sure if it's the same problem. > > I really need to dispose of a bunch of things assigned to me. :-( > > n > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Jul 16 23:23:33 2007 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Jul 2007 14:23:33 -0700 Subject: [Python-3000] TextIOWrapper.write(s:str) and bytes in py3k-struni In-Reply-To: References: Message-ID: On 7/14/07, Christian Heimes wrote: > I'm having some troubles with unit tests in the py3k-struni branch. Some > test like test_uu are failing because an io.TextIOWrapper instance's > write() method doesn't handle bytes. The method is defined as: > > def write(self, s: str): > if self.closed: > raise ValueError("write to closed file") > # XXX What if we were just reading? 
> b = s.encode(self._encoding) > if isinstance(b, str): > b = bytes(b) > n = self.buffer.write(b) > if "\n" in s: > # XXX only if isatty > self.flush() > self._snapshot = self._decoder = None > return len(s) > > The problematic lines are the lines from s.encode() to b = bytes(b). The > behavior is more than questionable. A bytes object doesn't have an > encode() method and str's encode method() always returns bytes. IMO the > write() method should be changed to: > > def write(self, s: (str, bytes)): > if self.closed: > raise ValueError("write to closed file") > # XXX What if we were just reading? > if isinstance(s, basestring): > b = s.encode(self._encoding) > elif isinstance(s, bytes): > b = s > else: > b = bytes(b) > n = self.buffer.write(b) > if b"\n" in b: > # XXX only if isatty > self.flush() > self._snapshot = self._decoder = None > return len(s) > > Or the write() should explictly raise a TypeError when it is not allowed > to handle bytes. I came across this in your SF patch. I disagree with your desire to let TextIOWrapper.write() handle bytes: it should *only* be passed str objects. The uu test was failing because it was writing bytes to a text stream. Perhaps the error should be better; though I'm not sure I want to add explicit type checks (as it would defeat duck typing). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Jul 17 01:58:36 2007 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Jul 2007 16:58:36 -0700 Subject: [Python-3000] pep3115 - metaclasses in python 3000 In-Reply-To: <4698578C.3080808@acm.org> References: <4698578C.3080808@acm.org> Message-ID: On 7/13/07, Talin wrote: > Thomas Heller wrote: > > playing a little with py3k... > > > > pep3115 mentions that "__prepare__ returns a dictionary-like object > > which is used to store the class member definitions during evaluation > > of the class body." 
> > > > It does not mention whether this dict-like object is used afterwards > > as the class-dictionary of the created class or not (when the __new__ > > method of the metaclass is called). > > The intention is that it's up to the metaclass to decide. I suspect that > most metaclasses won't want to use the dict-like object as the class > dict, for two reasons: > > 1) The behavior of assigning to the class dict after class creation is > likely to be different than the behavior of assignment during class > creation. In particular, a typical 'dict-like' object is likely to be > slower than a dict (it has more work to do, after all), and you don't > want that slowness around once your class is finished initializing. > > 2) A 'dict-like' object doesn't have to support all of the methods of a > real dict, wherease a class dict does. So your dict-like wrapper can be > relatively simple. The object returned by __prepare__() actually *is* incorporated into the class object, unless the metaclass' __new__() passes something else to type.__new__(). However this isn't obvious when you ask for the class' __dict__ attribute: you always get a dict proxy. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Jul 17 02:11:09 2007 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Jul 2007 17:11:09 -0700 Subject: [Python-3000] pep3115 - metaclasses in python 3000 In-Reply-To: References: <4698578C.3080808@acm.org> Message-ID: On 7/16/07, Guido van Rossum wrote: > The object returned by __prepare__() actually *is* incorporated into > the class object, unless the metaclass' __new__() passes something > else to type.__new__(). However this isn't obvious when you ask for > the class' __dict__ attribute: you always get a dict proxy. I take it back. The object is copied, for the reasons Phillip explained. 
There is no way around this without writing C code, as the only way to create a type object from Python is to call type.__new__() -- the __new__() method of a subclass of type still must call type's __new__() method to create the actual object. (Embarrassed, since I wrote all the code involved.) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From lists at cheimes.de Tue Jul 17 03:22:13 2007 From: lists at cheimes.de (Christian Heimes) Date: Tue, 17 Jul 2007 03:22:13 +0200 Subject: [Python-3000] TextIOWrapper.write(s:str) and bytes in py3k-struni In-Reply-To: References: Message-ID: <469C19C5.3010006@cheimes.de> Guido van Rossum wrote: > I came across this in your SF patch. I disagree with your desire to > let TextIOWrapper.write() handle bytes: it should *only* be passed str > objects. The uu test was failing because it was writing bytes to a > text stream. > > Perhaps the error should be better; though I'm not sure I want to add > explicit type checks (as it would defeat duck typing). Yes, duck typing is very useful but this duck doesn't quack me why it hurts. ;) It's rather confusing at first. What do you think about def write(self, s: str): if self.closed: raise ValueError("write to closed file") try: b = s.encode(self._encoding) except AttributeError: raise TypeError("str expected, got %r" % s) ... def write(self, s: str): if self.closed: raise ValueError("write to closed file") if not hasattr(s, 'encode'): raise TypeError("str expected, got %r" % s) ... ? It explains what is going wrong. Christian From martin at v.loewis.de Tue Jul 17 06:52:27 2007 From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue, 17 Jul 2007 06:52:27 +0200 Subject: [Python-3000] Heaptypes In-Reply-To: References: <46979811.2050405@v.loewis.de> Message-ID: <469C4B0B.50605@v.loewis.de> Guido van Rossum wrote: > That sounds like a good idea to try.
It may break some more tests but > those are all indications of places that incorrectly still require > str8. > >> I wonder whether the "s" specifier in CallFunction, BuildValue etc >> should create Unicode objects, rather than str8 objects. Done. I fixed a number of test cases that broke because of that. In particular, bytes.__reduce__ could not easily return str8 objects as its marshalling state anymore (and shouldn't do so, anyway). So I made bytes a builtin type of pickle, using the S code. As a consequence, a number of other types had to get fixed. So in total, it adds one new failure: something in test_pickle now complains that bytes objects are not hashable. Regards, Martin From p.f.moore at gmail.com Tue Jul 17 13:04:13 2007 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 17 Jul 2007 12:04:13 +0100 Subject: [Python-3000] TextIOWrapper.write(s:str) and bytes in py3k-struni In-Reply-To: <469C19C5.3010006@cheimes.de> References: <469C19C5.3010006@cheimes.de> Message-ID: <79990c6b0707170404x68b6b99cj6be77e4f8e65c82@mail.gmail.com> On 17/07/07, Christian Heimes wrote: > def write(self, s: str): > if self.closed: > raise ValueError("write to closed file") > if not hasattr(s, 'encode'): > raise TypeError("str expected, got %r" % s) > ... > > ? It explains what is going wrong. Surely the error should say that the object passed needs an encode method, rather than that it should be a str? Paul.
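For illustration only, a small sketch of the two points in this subthread, assuming a current CPython io module: a text stream refuses bytes with a TypeError, and a duck-typed encode step can name the missing encode() method in its complaint. The helper encode_for_write is hypothetical and is not the code that went into io.py.

```python
import io

# A text layer over a binary buffer, as in the thread.
text = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")

try:
    text.write(b"raw bytes")      # bytes passed to a text stream
except TypeError as exc:
    stream_error = exc            # current CPython rejects this with TypeError
else:
    stream_error = None


def encode_for_write(s, encoding):
    """Hypothetical helper: duck-typed encode with a clearer complaint."""
    try:
        encode = s.encode         # guard only the attribute lookup
    except AttributeError:
        raise TypeError(
            "object with an encode() method expected, got %r" % (s,)
        ) from None
    return encode(encoding)


encoded = encode_for_write("abc", "ascii")
```

Guarding only the attribute lookup keeps duck typing intact (anything with an encode() method is accepted) while an AttributeError raised inside encode() itself is not mistaken for a wrong argument type.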
From ncoghlan at gmail.com Tue Jul 17 14:15:31 2007 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 17 Jul 2007 22:15:31 +1000 Subject: [Python-3000] TextIOWrapper.write(s:str) and bytes in py3k-struni In-Reply-To: <469C19C5.3010006@cheimes.de> References: <469C19C5.3010006@cheimes.de> Message-ID: <469CB2E3.5070309@gmail.com> Christian Heimes wrote: > What do you think about > > def write(self, s: str): > if self.closed: > raise ValueError("write to closed file") > try: > b = s.encode(self._encoding) > except AttributeError: > raise TypeError("str expected, got %r" % s) > ... The try/except here is a bit too broad - you only want to trap the attribute error. That said, I'm not sure what error you could raise that would be clearer than complaining that the object passed in doesn't have an encode() method. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From guido at python.org Tue Jul 17 16:25:30 2007 From: guido at python.org (Guido van Rossum) Date: Tue, 17 Jul 2007 07:25:30 -0700 Subject: [Python-3000] Heaptypes In-Reply-To: <469C4B0B.50605@v.loewis.de> References: <46979811.2050405@v.loewis.de> <469C4B0B.50605@v.loewis.de> Message-ID: Thanks! Can you add test_pickle to the wiki page? (http://wiki.python.org/moin/Py3kStrUniTests) On 7/16/07, "Martin v. Löwis" wrote: > Guido van Rossum wrote: > > That sounds like a good idea to try. It may break some more tests but > > those are all indications of places that incorrectly still require > > str8. > > > >> I wonder whether the "s" specifier in CallFunction, BuildValue etc > >> should create Unicode objects, rather than str8 objects. > > Done. I fixed a number of test cases that broke because of that. > In particular, bytes.__reduce__ could not easily return str8 objects > as its marshalling state anymore (and shouldn't do so, anyway).
> So I made bytes a builtin type of pickle, using the S code. > As a consequence, a number of other types had to get fixed. > > So in total, it adds one new failure: something in test_pickle > now complains that bytes objects are not hashable. > > Regards, > Martin > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Tue Jul 17 22:42:54 2007 From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue, 17 Jul 2007 22:42:54 +0200 Subject: [Python-3000] Heaptypes In-Reply-To: References: <46979811.2050405@v.loewis.de> <469C4B0B.50605@v.loewis.de> Message-ID: <469D29CE.5050600@v.loewis.de> Guido van Rossum wrote: > Thanks! Can you add test_pickle to the wiki page? > (http://wiki.python.org/moin/Py3kStrUniTests) Done! Martin From guido at python.org Tue Jul 17 23:04:14 2007 From: guido at python.org (Guido van Rossum) Date: Tue, 17 Jul 2007 14:04:14 -0700 Subject: [Python-3000] Heaptypes In-Reply-To: <469D29CE.5050600@v.loewis.de> References: <46979811.2050405@v.loewis.de> <469C4B0B.50605@v.loewis.de> <469D29CE.5050600@v.loewis.de> Message-ID: On 7/17/07, "Martin v. Löwis" wrote: > Guido van Rossum wrote: > > Thanks! Can you add test_pickle to the wiki page? > > (http://wiki.python.org/moin/Py3kStrUniTests) > > Done! But now I'm confused. I don't see the failure. Are you sure you checked in what you did? In the py3k-struni branch? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Jul 17 23:47:51 2007 From: guido at python.org (Guido van Rossum) Date: Tue, 17 Jul 2007 14:47:51 -0700 Subject: [Python-3000] pep 3124 plans In-Reply-To: References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com> <20070713173936.53C213A404D@sparrow.telecommunity.com> Message-ID: On 7/13/07, Michele Simionato wrote: > Phillip J.
Eby <pje at telecommunity.com> writes: > > For what it's worth, the pkgutil > > module already contains an even simpler generic function > > implementation than simplegeneric, and is already in the stdlib > > albeit undocumented. > > Well, that is good to know. Personally I would be content with something > at that level of sophistication (i.e. the absolute minimum). I think there > is not much experience in the community with generic functions (except for > you) and there is no danger in waiting and in acquiring more experience > before including in the standard library a fully featured package. After > all, RuleDispatch is already out there and there is no reason for putting > everything in the stdlib. For the same reason, I am happy that Zope interfaces > will stay out of the stdlib, and that we will have the much simpler ABC > (of course one could argue that generic functions are better than ABC and > actually I think so, but still ABC are a simpler entry point for most > programmers, more in line with how Python has worked until now, and they > will allow me to throw away a half-baked interface implementation I am > using now, which is always a good thing ;) Actually, I believe ABCs and GFs work well together, and I believe Phillip has said so too. Regarding the fate of PEP 3124, perhaps the right thing is to reject the PEP, and be content with having GFs as a third party add-on? There seems to be nothing particular about Python 3.0 as the point of introduction of GFs anyway -- they can be introduced just as easily in 3.1 or 4.0 or any time later (or earlier, as Phillip's existing implementation shows). I have one remaining question for Phillip: why is your design "absolutely dependent on being able to modify functions in-place"? That dependency would appear to make it harder to port the design to other Python implementations whose function objects don't behave the same way.
I can see it as a philosophical desirable feature; but I don't understand the technical need for it. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Wed Jul 18 00:38:06 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 17 Jul 2007 18:38:06 -0400 Subject: [Python-3000] pep 3124 plans In-Reply-To: References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com> <20070713173936.53C213A404D@sparrow.telecommunity.com> Message-ID: <20070717223550.7B1B13A403A@sparrow.telecommunity.com> At 02:47 PM 7/17/2007 -0700, Guido van Rossum wrote: >I have one remaining question for Phillip: why is your design >"absolutely dependent on being able to modify functions in-place"? >That dependency would appear to make it harder to port the design to >other Python implementations whose function objects don't behave the >same way. I can see it as a philosophical desirable feature; but I >don't understand the technical need for it. It allows the framework to bootstrap via successive approximation. Initially, the 'implies()' function is just a plain function, and then it later becomes a generic function. (And of course it gets called in between those two points.) The same happens for 'disjuncts()' and 'overrides()'. Is it potentially possible that there's another way to do it, given enough restrictions on how other code uses the exported API and enough hackery during bootstrapping? Perhaps, but I don't know of such a way. The modification-in-place approach allows me to just write the functions and not care precisely when they become generic. I still have to do a little extra special bootstrapping for implies(), because of its self-referential nature, but everything else I can pretty much blaze right on through with. (By the way, AFAIK IronPython, Jython (2.2), and PyPy all support writable func_code attributes, so it's evidently practical to do so for reasonably dynamic Python implementations.) 
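To make "modify functions in-place" concrete, here is a rough sketch assuming Python 3, where func_code is spelled __code__. It is not PEAK-Rules code; the names _richer_implies and early_reference and both function bodies are invented for illustration. The point is only that a reference taken before the upgrade sees the new behaviour afterwards.

```python
def implies(a, b):
    """Plain bootstrap version: exact equality only."""
    return a == b


# Someone grabs a reference (or calls the function) before the upgrade.
early_reference = implies
before = early_reference(bool, int)   # bool == int is False


def _richer_implies(a, b):
    """Hypothetical later version: a subclass implies its base class."""
    if isinstance(a, type) and isinstance(b, type):
        return issubclass(a, b)
    return a == b


# Swap the code object in place; the function object's identity is unchanged.
# This works here because neither function uses closure variables.
implies.__code__ = _richer_implies.__code__

after = early_reference(bool, int)    # same object, new behaviour
```

Rebinding the module-level name instead would leave early_reference pointing at the old function, which is why code that may already have been imported or called elsewhere is upgraded in place in this sketch.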
>Regarding the fate of PEP 3124, perhaps the right thing is to reject >the PEP, and be content with having GFs as a third party add-on? I've also suggested simply deferring it. I'd still like to see a "blessed" meta-API for generic functions at some point. Also, as I've said, there's nothing stopping anybody from stepping up with a less-ambitious and less-controversial implementation based on your preferred API. I just won't be able to get to it myself for a month or so. (Also, nothing stops such a less-ambitious approach from being later folded into something more like my approach, with full extensibility and all the bells and whistles. In the worst case, one could always make a backward compatibility layer that fakes the more limited API using the more general one, as long as the lesser API is a strict subset of the greater -- and I believe it is.) >There seems to be nothing particular about Python 3.0 as the point of >introduction of GFs anyway -- they can be introduced just as easily in >3.1 or 4.0 or any time later (or earlier, as Phillip's existing >implementation show). Well, the one thing that might still be relevant is the "overloading inside classes" rule. That's the only bit that has any effect on Python 3.0 semantics vis-a-vis metaclasses, class decorators, etc. The way things currently stand for 3.0, I actually *won't* be able to make a GF implementation that handles the "first argument should be of the containing class" rule without users having an explicit metaclass or class decorator that supports it. In 2.x, I take advantage of the ability of code run inside a class suite to change the enclosing class' __metaclass__; in 3.0, you can't do this anymore since the __metaclass__ doesn't come from the class suite, and there isn't a replacement hook. 
From guido at python.org Wed Jul 18 00:53:24 2007 From: guido at python.org (Guido van Rossum) Date: Tue, 17 Jul 2007 15:53:24 -0700 Subject: [Python-3000] pep 3124 plans In-Reply-To: <20070717223550.7B1B13A403A@sparrow.telecommunity.com> References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com> <20070713173936.53C213A404D@sparrow.telecommunity.com> <20070717223550.7B1B13A403A@sparrow.telecommunity.com> Message-ID: On 7/17/07, Phillip J. Eby wrote: > At 02:47 PM 7/17/2007 -0700, Guido van Rossum wrote: > >I have one remaining question for Phillip: why is your design > >"absolutely dependent on being able to modify functions in-place"? > >That dependency would appear to make it harder to port the design to > >other Python implementations whose function objects don't behave the > >same way. I can see it as a philosophical desirable feature; but I > >don't understand the technical need for it. > > It allows the framework to bootstrap via successive > approximation. Initially, the 'implies()' function is just a plain > function, and then it later becomes a generic function. (And of > course it gets called in between those two points.) The same happens > for 'disjuncts()' and 'overrides()'. Why isn't it possible to mark these functions as explicitly overloadable? I'm not sure I understand what you mean by "bootstrapping". > Is it potentially possible that there's another way to do it, given > enough restrictions on how other code uses the exported API and > enough hackery during bootstrapping? Perhaps, but I don't know of > such a way. The modification-in-place approach allows me to just > write the functions and not care precisely when they become > generic. I still have to do a little extra special bootstrapping for > implies(), because of its self-referential nature, but everything > else I can pretty much blaze right on through with. I guess I'll have to reserve judgment until the implementation exists. 
> (By the way, AFAIK IronPython, Jython (2.2), and PyPy all support > writable func_code attributes, so it's evidently practical to do so > for reasonably dynamic Python implementations.) Fair enough, though I suspect that IronPython might use certain optimizations that depend on func_code not being written. However, I certainly don't know enough about it. Anyone familiar with IronPython on this list care to comment? > >Regarding the fate of PEP 3124, perhaps the right thing is to reject > >the PEP, and be content with having GFs as a third party add-on? > > I've also suggested simply deferring it. I'd still like to see a > "blessed" meta-API for generic functions at some point. I'll defer it. It seems you are the only one who can write such a blessed meta-API, and I'm guessing that's the part of PEP 3124 that was never completed. > Also, as I've said, there's nothing stopping anybody from stepping up > with a less-ambitious and less-controversial implementation based on > your preferred API. I just won't be able to get to it myself for a > month or so. I'm not sure anybody else cares enough to pre-empt you. > (Also, nothing stops such a less-ambitious approach from being later > folded into something more like my approach, with full extensibility > and all the bells and whistles. In the worst case, one could always > make a backward compatibility layer that fakes the more limited API > using the more general one, as long as the lesser API is a strict > subset of the greater -- and I believe it is.) > > > >There seems to be nothing particular about Python 3.0 as the point of > >introduction of GFs anyway -- they can be introduced just as easily in > >3.1 or 4.0 or any time later (or earlier, as Phillip's existing > >implementation show). > > Well, the one thing that might still be relevant is the "overloading > inside classes" rule. That's the only bit that has any effect on > Python 3.0 semantics vis-a-vis metaclasses, class decorators, etc. 
> > The way things currently stand for 3.0, I actually *won't* be able to > make a GF implementation that handles the "first argument should be > of the containing class" rule without users having an explicit > metaclass or class decorator that supports it. > > In 2.x, I take advantage of the ability of code run inside a class > suite to change the enclosing class' __metaclass__; in 3.0, you can't > do this anymore since the __metaclass__ doesn't come from the class > suite, and there isn't a replacement hook. I don't understand enough of your implementation to understand this requirement. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From alexandre at peadrop.com Wed Jul 18 01:27:54 2007 From: alexandre at peadrop.com (Alexandre Vassalotti) Date: Tue, 17 Jul 2007 19:27:54 -0400 Subject: [Python-3000] Introspection broken for objects using Py_FindMethod() Message-ID: Hi, Is it intentional that introspection is broken for objects using Py_FindMethod()? For example: Python 3.0x (cpy_merge:56413:56414M, Jul 17 2007, 13:57:23) [GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2 >>> import cPickle >>> dir(cPickle.Unpickler(file)) [] >>> dir(cPickle.Pickler(file)) ['PicklingError', snip...] Thanks, -- Alexandre From guido at python.org Wed Jul 18 01:52:16 2007 From: guido at python.org (Guido van Rossum) Date: Tue, 17 Jul 2007 16:52:16 -0700 Subject: [Python-3000] Introspection broken for objects using Py_FindMethod() In-Reply-To: References: Message-ID: On 7/17/07, Alexandre Vassalotti wrote: > Hi, > > Is it intentional that introspection is broken for objects using > Py_FindMethod()? For example: > > Python 3.0x (cpy_merge:56413:56414M, Jul 17 2007, 13:57:23) > [GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2 > >>> import cPickle > >>> dir(cPickle.Unpickler(file)) > [] > >>> dir(cPickle.Pickler(file)) > ['PicklingError', snip...]
Yes, see a thread between me, Georg and Brett around March 7-10: http://mail.python.org/pipermail/python-3000/2007-March/006061.html I think the conclusion was to get rid of Py_FindMethod altogether. The replacement isn't very hard. But it hasn't been done yet. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Wed Jul 18 02:27:02 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 17 Jul 2007 20:27:02 -0400 Subject: [Python-3000] pep 3124 plans In-Reply-To: References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com> <20070713173936.53C213A404D@sparrow.telecommunity.com> <20070717223550.7B1B13A403A@sparrow.telecommunity.com> Message-ID: <20070718002446.4B2763A403A@sparrow.telecommunity.com> At 03:53 PM 7/17/2007 -0700, Guido van Rossum wrote: >On 7/17/07, Phillip J. Eby wrote: > > At 02:47 PM 7/17/2007 -0700, Guido van Rossum wrote: > > >I have one remaining question for Phillip: why is your design > > >"absolutely dependent on being able to modify functions in-place"? > > >That dependency would appear to make it harder to port the design to > > >other Python implementations whose function objects don't behave the > > >same way. I can see it as a philosophical desirable feature; but I > > >don't understand the technical need for it. > > > > It allows the framework to bootstrap via successive > > approximation. Initially, the 'implies()' function is just a plain > > function, and then it later becomes a generic function. (And of > > course it gets called in between those two points.) The same happens > > for 'disjuncts()' and 'overrides()'. > >Why isn't it possible to mark these functions as explicitly >overloadable? How would I ever add rules to them, if I need them to already be callable in order to add any rules in the first place? :) (In practice, things are even hairier, because I also sometimes need to call these functions *while they are already being called*, if there's no cache hit!) 
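To make the bootstrapping problem concrete, here is a minimal sketch of upgrading a function in place, so that references taken before the upgrade see the new behaviour. It is an illustration only, not PEAK-Rules code: `_rules`, `_default_implies`, `_dispatching_implies`, `early_reference` and the body of `copy_code` are names invented for this example, and it uses the `__code__` spelling (Python 2.6+ and 3.x) of the attribute this thread calls `func_code`.

```python
# Hypothetical rule registry: (type, type) -> implementation.
_rules = {}


def _default_implies(a, b):
    # Fallback rule used when no more specific rule is registered.
    return a == b


def implies(a, b):
    # Bootstrap version: an ordinary function, callable from the start.
    return a == b


# A reference taken (and a call made) before the upgrade.
early_reference = implies
before = early_reference(2, True)


def _dispatching_implies(a, b):
    # "Generic" version: look up a rule by the argument types.
    impl = _rules.get((type(a), type(b)), _default_implies)
    return impl(a, b)


def copy_code(srcfunc, dstfunc):
    # Swap the body in place; the function object keeps its identity.
    # This works here because neither function has closure variables.
    dstfunc.__code__ = srcfunc.__code__


copy_code(_dispatching_implies, implies)
_rules[(int, bool)] = lambda a, b: bool(a) == b

after = early_reference(2, True)
print(before, after)  # prints: False True
```

Because the function object itself is mutated, every module that imported `implies` early picks up the dispatching behaviour without being re-imported, which is the property the successive-approximation bootstrap relies on.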
This is partly a consequence of splitting responsibilities between "rule sets" and "dispatch engines". PEAK-Rules wants to be able to use a simple type-tuple dispatcher (like your prototype), but also upgrade to fancier engines as required for specific functions, without changing the rules already registered for the function. So it treats the set of overloads as a separate object from the engine that actually implements dispatching. That way, you can upgrade the engine, even while keeping the rules. However, to populate a rule set, you need to know the disjuncts() of a rule... so you could never add the default rule to disjuncts() without a default rule already being there. None of this is relevant for a design that doesn't care about having more than one supported implementation, though, which is why a reduced-in-scope implementation that's not trying to be a universal API can just ignore all of this. (Heck, disjuncts() wouldn't even be needed in an implementation that wasn't trying to support arbitrary engine extensions, since its purpose is to list the "or"-ed conditions of a rule that can be fulfilled in more than one way.) > > Well, the one thing that might still be relevant is the "overloading > > inside classes" rule. That's the only bit that has any effect on > > Python 3.0 semantics vis-a-vis metaclasses, class decorators, etc. > > > > The way things currently stand for 3.0, I actually *won't* be able to > > make a GF implementation that handles the "first argument should be > > of the containing class" rule without users having an explicit > > metaclass or class decorator that supports it. > > > > In 2.x, I take advantage of the ability of code run inside a class > > suite to change the enclosing class' __metaclass__; in 3.0, you can't > > do this anymore since the __metaclass__ doesn't come from the class > > suite, and there isn't a replacement hook. > >I don't understand enough of your implementation to understand this >requirement. 
This part would actually be relevant even for a scaled-down non-extensible implementation. The requirement is this: overloads defined in a class need to implicitly treat the first argument of the overloading method as if it were explicitly declared "self: EnclosingClass". In order to do this, the equivalent code in RuleDispatch currently sticks a temporary metaclass into the class locals(), so that it can defer the overload operation until after the class exists. Then it adds the class to the overload registration. This could be handled by any other sort of mechanism that would allow code in a class body to register a callback to receive the created class. A custom metaclass or class decorator would certainly do the trick, but then you have to do something like:

    @class_contains_overloads
    class Something:

        @some_function.overload
        def blah(self, ...):
            yadda()

It'd be nice to be able to skip the redundant class decorator, as it's not adding any useful information for the reader, and forgetting it will produce a bug. So if method decorators were allowed to request class decorators to be added, that would be the simplest way to manage this. However, if this has to wait for 3.1, it's no big deal. From jimjjewett at gmail.com Wed Jul 18 03:04:01 2007 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 17 Jul 2007 21:04:01 -0400 Subject: [Python-3000] pep 3124 plans In-Reply-To: <20070718002446.4B2763A403A@sparrow.telecommunity.com> References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com> <20070713173936.53C213A404D@sparrow.telecommunity.com> <20070717223550.7B1B13A403A@sparrow.telecommunity.com> <20070718002446.4B2763A403A@sparrow.telecommunity.com> Message-ID: On 7/17/07, Phillip J. Eby wrote: > At 02:47 PM 7/17/2007 -0700, Guido van Rossum wrote: > >I have one remaining question for Phillip: why is your design > >"absolutely dependent on being able to modify functions in-place"? > It allows the framework to bootstrap via successive > approximation.
Initially, the 'implies()' function is just a plain Would it work to make the original 'implies()' something other than an ordinary function? I realize that you prefer being able to overload anything, but it seems that you *could* mark the ones you'll need to overload as part of bootstrapping. > In 2.x, I take advantage of the ability of code run inside a class > suite to change the enclosing class' __metaclass__; in 3.0, What was missing from the __class__ attribute that you get from the super PEP fail? Was it that you wanted access to the class while defining the class, before the method is ever called? Why can't an ordinary class decorator work? Is it because you want the funky stuff to be conditional? If so, is that really required? Or are you just objecting to the fact that metaclasses like this won't be the default? -jJ From greg.ewing at canterbury.ac.nz Wed Jul 18 03:37:10 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 18 Jul 2007 13:37:10 +1200 Subject: [Python-3000] pep 3124 plans In-Reply-To: <20070717223550.7B1B13A403A@sparrow.telecommunity.com> References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com> <20070713173936.53C213A404D@sparrow.telecommunity.com> <20070717223550.7B1B13A403A@sparrow.telecommunity.com> Message-ID: <469D6EC6.9010005@canterbury.ac.nz> Phillip J. Eby wrote: > It allows the framework to bootstrap via successive > approximation. Initially, the 'implies()' function is just a plain > function, and then it later becomes a generic function. (And of > course it gets called in between those two points.) The same happens > for 'disjuncts()' and 'overrides()'. But you know from the outset that these functions will eventually become generic, so why can't they be defined as some callable object that can have its insides switched, if you're on a Python whose normal function objects don't allow that? -- Greg From pje at telecommunity.com Wed Jul 18 04:03:20 2007 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Tue, 17 Jul 2007 22:03:20 -0400 Subject: [Python-3000] pep 3124 plans In-Reply-To: References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com> <20070713173936.53C213A404D@sparrow.telecommunity.com> <20070717223550.7B1B13A403A@sparrow.telecommunity.com> <20070718002446.4B2763A403A@sparrow.telecommunity.com> Message-ID: <20070718020107.EA7123A403A@sparrow.telecommunity.com> At 09:04 PM 7/17/2007 -0400, Jim Jewett wrote: >On 7/17/07, Phillip J. Eby wrote: > > At 02:47 PM 7/17/2007 -0700, Guido van Rossum wrote: > > >I have one remaining question for Phillip: why is your design > > >"absolutely dependent on being able to modify functions in-place"? > > > It allows the framework to bootstrap via successive > > approximation. Initially, the 'implies()' function is just a plain > >Would it work to make the original 'implies()' something other than an >ordinary function? I realize that you prefer being able to overload >anything, but it seems that you *could* mark the ones you'll need to >overload as part of bootstrapping. Fair enough. The design is still dependent on modifying "functions" in place, for some value of "function". It just never occurred to me to introduce a *third* type of "function", besides the two already being dealt with (i.e., standard functions and generic functions). Even *thinking* about the idea right now is like fingernails on a chalkboard to me, so I can see why it didn't occur to me. :) > > In 2.x, I take advantage of the ability of code run inside a class > > suite to change the enclosing class' __metaclass__; in 3.0, > >What was missing from the __class__ attribute that you get from the >super PEP fail? Was it that you wanted access to the class while >defining the class, before the method is ever called? Correct; you need access to it before the method is called, since it's to add an overload to a generic function. >Why can't an ordinary class decorator work? It can; it's just noise. 
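A rough sketch of the explicit class-decorator route may help show both why it works and why forgetting it fails silently. Everything here is hypothetical and far simpler than RuleDispatch or PEAK-Rules: `GenericFunction`, its `registry`, the `_pending_overload_of` marker, `describe` and the two example classes are invented for illustration; only the decorator name `class_contains_overloads` comes from the thread.

```python
class GenericFunction:
    """Toy generic function dispatching on the type of the first argument."""

    def __init__(self, default):
        self.default = default
        self.registry = {}  # class -> implementation

    def overload(self, method):
        # The enclosing class does not exist yet while its body runs,
        # so only mark the method; registration happens later.
        method._pending_overload_of = self
        return method

    def __call__(self, obj, *args, **kwargs):
        # Walk the MRO so subclasses inherit registered overloads.
        for cls in type(obj).__mro__:
            if cls in self.registry:
                return self.registry[cls](obj, *args, **kwargs)
        return self.default(obj, *args, **kwargs)


def class_contains_overloads(cls):
    # Runs after the class exists: complete the deferred registrations.
    for attr in list(vars(cls).values()):
        gf = getattr(attr, "_pending_overload_of", None)
        if gf is not None:
            gf.registry[cls] = attr
    return cls


@GenericFunction
def describe(obj):
    return "something"


@class_contains_overloads
class Something:
    @describe.overload
    def blah(self):
        return "a Something"


class Forgotten:  # decorator omitted: the overload is never registered
    @describe.overload
    def blah(self):
        return "never registered"


print(describe(Something()))  # prints: a Something
print(describe(Forgotten()))  # prints: something
```

The `Forgotten` case is the failure mode described in this thread: nothing raises, the default implementation simply runs instead of the overload.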
> Is it because you want >the funky stuff to be conditional? If so, is that really required? I don't understand what you mean by "funky stuff" or "conditional", here. >Or are you just objecting to the fact that metaclasses like this won't >be the default? The idea is to make it so that using generic functions doesn't require a bunch of extra bookkeeping, like adding metaclasses or decorators. Metaclasses are particularly problematic in that mixing multiple metaclasses is not an activity for novice wizards. That's why I don't use that approach in today's Python: I can safely wizard around the problem using pseudo-metaclasses, such that the user's metaclasses aren't touched. Post-PEP 3115, however, it won't be an option any more, and you'll at least need a boilerplate decorator for it to work, and it'll silently break without it, giving absolutely no clue as to the problem. From pje at telecommunity.com Wed Jul 18 04:05:25 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 17 Jul 2007 22:05:25 -0400 Subject: [Python-3000] pep 3124 plans In-Reply-To: <469D6EC6.9010005@canterbury.ac.nz> References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com> <20070713173936.53C213A404D@sparrow.telecommunity.com> <20070717223550.7B1B13A403A@sparrow.telecommunity.com> <469D6EC6.9010005@canterbury.ac.nz> Message-ID: <20070718020310.2168A3A403A@sparrow.telecommunity.com> At 01:37 PM 7/18/2007 +1200, Greg Ewing wrote: >Phillip J. Eby wrote: > > It allows the framework to bootstrap via successive > > approximation. Initially, the 'implies()' function is just a plain > > function, and then it later becomes a generic function. (And of > > course it gets called in between those two points.) The same happens > > for 'disjuncts()' and 'overrides()'. 
> >But you know from the outset that these functions will >eventually become generic, so why can't they be defined >as some callable object that can have its insides >switched, if you're on a Python whose normal function >objects don't allow that? Well, phrased that way, it sounds like a justification for treating it as a porting strategy for such Pythons. The library could just use a "copy_code(srcfunc, dstfunc)" function that's implemented differently on different Pythons. From martin at v.loewis.de Wed Jul 18 04:29:14 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 18 Jul 2007 04:29:14 +0200 Subject: [Python-3000] Heaptypes In-Reply-To: References: <46979811.2050405@v.loewis.de> <469C4B0B.50605@v.loewis.de> <469D29CE.5050600@v.loewis.de> Message-ID: <469D7AFA.5030505@v.loewis.de> > But now I'm confused. I don't see the failure. Are you sure you > checked in what you did? In the py3k-struni branch? Oops, no. The commit was rejected because it was not whitespace-normalized correctly, and I didn't notice. Now I tried again. Martin From unknown_kev_cat at hotmail.com Tue Jul 17 19:16:42 2007 From: unknown_kev_cat at hotmail.com (Joe Smith) Date: Tue, 17 Jul 2007 13:16:42 -0400 Subject: [Python-3000] Py3k_struni additional test failures under cygwin Message-ID: Building Py3k_struni under Cygwin I've noticed a few more tests failing than the wiki shows. These are using SVN revision 56413. Some spurious errors seem to occur if Python/ is not remaned temporally. I have not included those. (This is an oddity of the cygwin '.exe' autohandling combined with case-insensitivity) Test_coding: Errors. Traceback included at end of message. "test test_descr failed -- ['foo\u1234bar'] slots not caught" "test test_largefile failed -- got b'z', but expected 'z'" test_marshal: Tests that fail are fasiling with a recursion limit exceeded error. 
Tracebacks: test test_coding failed -- Traceback (most recent call last): File "/home/Owner/py3k-struni/Lib/test/test_coding.py", line 12, in test_bad_c oding2 self.verify_bad_module(module_name) File "/home/Owner/py3k-struni/Lib/test/test_coding.py", line 20, in verify_bad _module text = fp.read() File "/home/Owner/py3k-struni/Lib/io.py", line 1186, in read res += decoder.decode(self.buffer.read(), True) File "/home/Owner/py3k-struni/Lib/encodings/ascii.py", line 26, in decode return codecs.ascii_decode(input, self.errors)[0] UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 0: ordinal not in range(128) Just a heads up. From martin at v.loewis.de Wed Jul 18 05:36:05 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 18 Jul 2007 05:36:05 +0200 Subject: [Python-3000] Invalid \U escape in source code give hard-to-trace error In-Reply-To: References: Message-ID: <469D8AA5.1080502@v.loewis.de> > When a source file contains a string literal with an out-of-range \U > escape (e.g. "\U12345678"), instead of a syntax error pointing to the > offending literal, I get this, without any indication of the file or > line: > > UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in > position 0-9: illegal Unicode character > > This is quite hard to track down. I think the fundamental flaw is that a codec is used to implement the Python syntax (or, rather, lexical rules). Not quite sure what the rationale for this design was; doing it on the lexical level is (was) tricky because \u escapes were allowed only for Unicode literals, and the lexer had no knowledge of the prefix preceding a literal. (In 3k, it's still similar, because \U escapes have no effect in bytes and raw literals). Still, even if it is "only" handled at the parsing level, I don't see why it needs to be a codec. 
Instead, implementing escapes in the compiler would still allow for proper diagnostics (notice that in the AST the original lexical form of the string literal is gone). > (Both the location of the bad > literal in the source file, and the origin of the error in the parser. > :-) Can someone come up with a fix? The language definition makes it difficult to fix it where I would consider the "proper" place, i.e. in the tokenization: http://docs.python.org/ref/strings.html says that escapeseq is "\" <any ASCII character>. So "\x" is a valid shortstring. Then it becomes fuzzy: It says that any unrecognized escape sequences are left in the string. While that appears like a clear specification, it is not implemented (and has not been since Python 2.0). According to the spec, '\U12345678' is well-formed, and denotes the same string as '\\U12345678'. I now see the following choices:
1. Implement the spec again. Stop complaining about invalid escapes for \x and \U, and just interpret the \ as '\\'. In this case, the current design could be left in place, and the codecs would just stop raising these errors.
2. Change the spec to make it an error if \x is not followed by two hex digits, \u not by four hex digits, \U not by 8, or the value denoted by the \U digits is out of range. In this case, I would propose to move the lexical analysis back into the parser, or just make an internal API that will raise a proper SyntaxError (it will be tricky to compute the column in the original source line, though).
3. Change the spec to constrain escapeseq, giving up the rule that uninterpreted escapes silently become two characters. That's difficult to write down in EBNF, so should be formulated through constraints in natural language. The lexer would have to keep track of what kind of literal it is processing, and reject invalid escapes directly on source level.
There are probably other options as well.
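As a rough illustration of choice 2, the check could look something like the sketch below: scan the raw text of a literal's body and raise a SyntaxError that carries a position, instead of letting a codec raise UnicodeDecodeError. This is not the CPython implementation and not the patch discussed in this thread; `check_escapes`, `_HEX_DIGITS_REQUIRED` and the `filename`/`lineno`/`col` parameters are hypothetical, and the sketch simply assumes the caller already knows the column at which the literal body starts, which is exactly the part described above as tricky.

```python
import string

# Number of hex digits each escape requires under choice 2.
_HEX_DIGITS_REQUIRED = {"x": 2, "u": 4, "U": 8}


def check_escapes(body, filename="<string>", lineno=1, col=0):
    """Validate \\x, \\u and \\U escapes in the raw text of a literal body.

    Raises SyntaxError with a 1-based offset pointing at the backslash.
    Other escapes are skipped without validation.
    """
    i = 0
    while i < len(body):
        if body[i] != "\\":
            i += 1
            continue
        kind = body[i + 1:i + 2]  # "" for a trailing backslash
        ndigits = _HEX_DIGITS_REQUIRED.get(kind)
        if ndigits is None:
            i += 2  # not a hex escape; skip backslash and next char
            continue
        digits = body[i + 2:i + 2 + ndigits]
        location = (filename, lineno, col + i + 1, body)
        if len(digits) != ndigits or any(
            c not in string.hexdigits for c in digits
        ):
            raise SyntaxError("truncated \\%s escape" % kind, location)
        if int(digits, 16) > 0x10FFFF:
            raise SyntaxError("illegal Unicode character", location)
        i += 2 + ndigits


try:
    check_escapes(r"ok \U12345678", filename="example.py", lineno=3)
except SyntaxError as exc:
    print(exc.filename, exc.lineno, exc.offset)  # prints: example.py 3 4
```

Raw strings are used in the example so that the function sees the backslashes exactly as they would appear in source text.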
Regards, Martin From martin at v.loewis.de Wed Jul 18 05:37:56 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 18 Jul 2007 05:37:56 +0200 Subject: [Python-3000] exclusion feature for 2to3? In-Reply-To: References: Message-ID: <469D8B14.4050907@v.loewis.de> > BTW I hope to see more core developers from Europe at EuroPython next year! It's always difficult to get there for me, as it takes place during the semester :-( Regards, Martin From kbk at shore.net Wed Jul 18 08:04:13 2007 From: kbk at shore.net (Kurt B. Kaiser) Date: Wed, 18 Jul 2007 02:04:13 -0400 Subject: [Python-3000] Invalid \U escape in source code give hard-to-trace error In-Reply-To: (Guido van Rossum's message of "Sun, 15 Jul 2007 07:17:00 -0700") References: Message-ID: <87odia5jtu.fsf@hydra.bayview.thirdcreek.com> "Guido van Rossum" writes: > When a source file contains a string literal with an out-of-range \U > escape (e.g. "\U12345678"), instead of a syntax error pointing to the > offending literal, I get this, without any indication of the file or > line: > > UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in > position 0-9: illegal Unicode character > > This is quite hard to track down. (Both the location of the bad > literal in the source file, and the origin of the error in the parser. > :-) Can someone come up with a fix? > > I note that raw escapes show a slightly different error. I also note > that the same issue exists for u"..." literals in Python 2.5. For what it's worth, I posted a patch to ast.c against the 2.6 trunk which massages the unicode exception into a SyntaxError showing the location. That approach lets unicodeobject.c handle the gory details while ast.c handles the SyntaxError generation. It might be a solution until something deeper along the lines of Martin's thoughts is possibly developed. I don't think that any reference adjustments are needed, but someone should check the patch. 
www.python.org/sf/1755885 -- KBK From amauryfa at gmail.com Wed Jul 18 10:20:36 2007 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Wed, 18 Jul 2007 10:20:36 +0200 Subject: [Python-3000] Py3k_struni additional test failures under cygwin In-Reply-To: References: Message-ID: Hello, 2007/7/17, Joe Smith wrote: > Building Py3k_struni under Cygwin I've noticed a few more tests failing than > the wiki shows. > These are using SVN revision 56413.
> > Some spurious errors seem to occur if Python/ is not remaned temporally. I > have not included those. (This is an oddity of the cygwin '.exe' > autohandling combined with case-insensitivity) For this, I have added a line to runtests.sh: # Choose the Python binary. case `uname` in Darwin) PYTHON=./python.exe;; CYGWIN*) PYTHON=./python.exe;; *) PYTHON=./python;; esac Hope this helps, -- Amaury Forgeot d'Arc From guido at python.org Wed Jul 18 18:47:13 2007 From: guido at python.org (Guido van Rossum) Date: Wed, 18 Jul 2007 09:47:13 -0700 Subject: [Python-3000] pep 3124 plans In-Reply-To: <20070718020310.2168A3A403A@sparrow.telecommunity.com> References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com> <20070713173936.53C213A404D@sparrow.telecommunity.com> <20070717223550.7B1B13A403A@sparrow.telecommunity.com> <469D6EC6.9010005@canterbury.ac.nz> <20070718020310.2168A3A403A@sparrow.telecommunity.com> Message-ID: On 7/17/07, Phillip J. Eby wrote: > At 01:37 PM 7/18/2007 +1200, Greg Ewing wrote: > >Phillip J. Eby wrote: > > > It allows the framework to bootstrap via successive > > > approximation. Initially, the 'implies()' function is just a plain > > > function, and then it later becomes a generic function. (And of > > > course it gets called in between those two points.) The same happens > > > for 'disjuncts()' and 'overrides()'. > > > >But you know from the outset that these functions will > >eventually become generic, so why can't they be defined > >as some callable object that can have its insides > >switched, if you're on a Python whose normal function > >objects don't allow that? > > Well, phrased that way, it sounds like a justification for treating > it as a porting strategy for such Pythons. The library could just > use a "copy_code(srcfunc, dstfunc)" function that's implemented > differently on different Pythons. Sorry, but I'm still totally uncomfortable with this. 
While I admit the feature exists, I really, really, really don't want it to be used on a regular basis. As long as Phillip calls a counterproposal "fingernails on a chalkboard", I call this unpythonic. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Jul 18 18:59:46 2007 From: guido at python.org (Guido van Rossum) Date: Wed, 18 Jul 2007 09:59:46 -0700 Subject: [Python-3000] Py3k_struni additional test failures under cygwin In-Reply-To: References: Message-ID: On 7/18/07, Amaury Forgeot d'Arc wrote: > Hello, > > 2007/7/17, Joe Smith wrote: > > Building Py3k_struni under Cygwin I've noticed a few more tests failing than > > the wiki shows. > > These are using SVN revision 56413. > > > > Some spurious errors seem to occur if Python/ is not remaned temporally. I > > have not included those. (This is an oddity of the cygwin '.exe' > > autohandling combined with case-insensitivity) > > For this, I have added a line to runtests.sh: > > # Choose the Python binary. > case `uname` in > Darwin) PYTHON=./python.exe;; > CYGWIN*) PYTHON=./python.exe;; > *) PYTHON=./python;; > esac This is now committed to Subversion: (r56440). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Jul 18 19:02:07 2007 From: guido at python.org (Guido van Rossum) Date: Wed, 18 Jul 2007 10:02:07 -0700 Subject: [Python-3000] Py3k_struni additional test failures under cygwin In-Reply-To: References: Message-ID: On 7/17/07, Joe Smith wrote: > Building Py3k_struni under Cygwin I've noticed a few more tests failing than > the wiki shows. > These are using SVN revision 56413. > > Some spurious errors seem to occur if Python/ is not remaned temporally. I > have not included those. (This is an oddity of the cygwin '.exe' > autohandling combined with case-insensitivity) > > > Test_coding: Errors. Traceback included at end of message. 
> "test test_descr failed -- ['foo\u1234bar'] slots not caught" > "test test_largefile failed -- got b'z', but expected 'z'" > test_marshal: Tests that fail are fasiling with a recursion limit exceeded > error. > > > > Tracebacks: > > test test_coding failed -- Traceback (most recent call last): > File "/home/Owner/py3k-struni/Lib/test/test_coding.py", line 12, in > test_bad_c > oding2 > self.verify_bad_module(module_name) > File "/home/Owner/py3k-struni/Lib/test/test_coding.py", line 20, in > verify_bad > _module > text = fp.read() > File "/home/Owner/py3k-struni/Lib/io.py", line 1186, in read > res += decoder.decode(self.buffer.read(), True) > File "/home/Owner/py3k-struni/Lib/encodings/ascii.py", line 26, in decode > return codecs.ascii_decode(input, self.errors)[0] > UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 0: > ordinal > not in range(128) The test_descr and test_largefile failures are reproducible on Ubuntu and someone will eventually fix them. I can't reproduce the test_marshal and test_coding failures; please investigate more on CYGWIN. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Jul 18 19:27:01 2007 From: guido at python.org (Guido van Rossum) Date: Wed, 18 Jul 2007 10:27:01 -0700 Subject: [Python-3000] Invalid \U escape in source code give hard-to-trace error In-Reply-To: <87k5sy5j6l.fsf@hydra.bayview.thirdcreek.com> References: <87k5sy5j6l.fsf@hydra.bayview.thirdcreek.com> Message-ID: On 7/17/07, Kurt B. Kaiser wrote: > For what it's worth, I posted a patch to ast.c against the 2.6 trunk > which massages the unicode exception into a SyntaxError showing the > location. > > That approach lets unicodeobject.c handle the gory details while ast.c > handles the SyntaxError generation. It might be a solution until > something deeper along the lines of Martin's thoughts is possibly > developed. 
> > I don't think that any reference adjustments are needed, but someone > should check the patch. > > www.python.org/sf/1755885 Thanks! Checked in, and merged into p3yk. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Wed Jul 18 19:27:49 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 18 Jul 2007 13:27:49 -0400 Subject: [Python-3000] pep 3124 plans In-Reply-To: References: <4edc17eb0707122239y1af87a17k99eaa710726050b0@mail.gmail.com> <20070713173936.53C213A404D@sparrow.telecommunity.com> <20070717223550.7B1B13A403A@sparrow.telecommunity.com> <469D6EC6.9010005@canterbury.ac.nz> <20070718020310.2168A3A403A@sparrow.telecommunity.com> Message-ID: <20070718172907.861383A40A4@sparrow.telecommunity.com> At 09:47 AM 7/18/2007 -0700, Guido van Rossum wrote: >Sorry, but I'm still totally uncomfortable with this. While I admit >the feature exists, I really, really, really don't want it to be used >on a regular basis. As long as Phillip calls a counterproposal >"fingernails on a chalkboard", I call this unpythonic. I didn't say I wouldn't *do* it, I just explained why I'd have never come up with the idea on my own. I don't have to like something in order to do it, though of course it helps. :) From guido at python.org Wed Jul 18 19:31:53 2007 From: guido at python.org (Guido van Rossum) Date: Wed, 18 Jul 2007 10:31:53 -0700 Subject: [Python-3000] Invalid \U escape in source code give hard-to-trace error In-Reply-To: <469D8AA5.1080502@v.loewis.de> References: <469D8AA5.1080502@v.loewis.de> Message-ID: On 7/17/07, "Martin v. L?wis" wrote: > > When a source file contains a string literal with an out-of-range \U > > escape (e.g. 
"\U12345678"), instead of a syntax error pointing to the > > offending literal, I get this, without any indication of the file or > > line: > > > > UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in > > position 0-9: illegal Unicode character > > > > This is quite hard to track down. > > I think the fundamental flaw is that a codec is used to implement > the Python syntax (or, rather, lexical rules). > > Not quite sure what the rationale for this design was; doing it on > the lexical level is (was) tricky because \u escapes were allowed > only for Unicode literals, and the lexer had no knowledge of the > prefix preceding a literal. (In 3k, it's still similar, because > \U escapes have no effect in bytes and raw literals). > > Still, even if it is "only" handled at the parsing level, I > don't see why it needs to be a codec. Instead, implementing > escapes in the compiler would still allow for proper diagnostics > (notice that in the AST the original lexical form of the string > literal is gone). I guess because it was deemed useful to have a codec for this purpose too, thereby exposing the algorithm to Python code that needs the same functionality (e.g. the compiler package, RIP). > > (Both the location of the bad > > literal in the source file, and the origin of the error in the parser. > > :-) Can someone come up with a fix? > > The language definition makes it difficult to fix it where I would > consider the "proper" place, i.e. in the tokenization: > > http://docs.python.org/ref/strings.html > > says that escapeseq is "\" . So > "\x" is a valid shortstring. > > Then it becomes fuzzy: It says that any unrecognized escape > sequences are left in the string. While that appears like a clear > specification, it is not implemented (and has not since Python > 2.0 anymore). According to the spec, '\U12345678' is well-formed, > and denotes the same string as '\\U12345678'. > > I now see the following choices: > 1. Restore implementing the spec again. 
Stop complaining about > invalid escapes for \x and \U, and just interpret the \ > as '\\'. In this case, the current design could be left in > place, and the codecs would just stop raising these errors. Sounds like a bad idea. I think \xNN (where N is not a hex digit) once behaved this way, and it was changed to explicitly complain instead as a service to users. > 2. Change the spec to make it an error if \x is not followed > by two hex digits, \u not by four hex digits, \U not by > 8, or the value denoted by the \U digits is out of range. > In this case, I would propose to move the lexical analysis > back into the parser, or just make an internal API that > will raise a proper SyntaxError (it will be tricky to > compute the column in the original source line, though). I'm all in favor of this spec change. Eventually we should change the lexer to do this right; for now, Kurt's patch is good enough. > 3. Change the spec to make constrain escapeseq, giving up > the rule that uninterpreted escapes silently become > two characters. That's difficult to write down in EBNF, > so should be formulated through constraints in natural > language. The lexer would have to keep track of what kind > of literal it is processing, and reject invalid escapes > directly on source level. -1 > There are probably other options as well. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From unknown_kev_cat at hotmail.com Wed Jul 18 19:56:07 2007 From: unknown_kev_cat at hotmail.com (Joe Smith) Date: Wed, 18 Jul 2007 13:56:07 -0400 Subject: [Python-3000] Py3k_struni additional test failures under cygwin References: Message-ID: "Guido van Rossum" wrote in message news:ca471dc20707181002w64e076aco9a509ec7e4e15b9a at mail.gmail.com... > On 7/17/07, Joe Smith wrote: >> Building Py3k_struni under Cygwin I've noticed a few more tests failing >> than >> the wiki shows. >> These are using SVN revision 56413. 
>>
>> Some spurious errors seem to occur if Python/ is not renamed temporarily. I
>> have not included those. (This is an oddity of the cygwin '.exe'
>> autohandling combined with case-insensitivity.)
>>
>> test_coding: Errors. Traceback included at end of message.
>> "test test_descr failed -- ['foo\u1234bar'] slots not caught"
>> "test test_largefile failed -- got b'z', but expected 'z'"
>> test_marshal: Tests that fail are failing with a recursion limit exceeded
>> error.
>>
>> Tracebacks:
>>
>> test test_coding failed -- Traceback (most recent call last):
>>   File "/home/Owner/py3k-struni/Lib/test/test_coding.py", line 12, in test_bad_coding2
>>     self.verify_bad_module(module_name)
>>   File "/home/Owner/py3k-struni/Lib/test/test_coding.py", line 20, in verify_bad_module
>>     text = fp.read()
>>   File "/home/Owner/py3k-struni/Lib/io.py", line 1186, in read
>>     res += decoder.decode(self.buffer.read(), True)
>>   File "/home/Owner/py3k-struni/Lib/encodings/ascii.py", line 26, in decode
>>     return codecs.ascii_decode(input, self.errors)[0]
>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 0: ordinal not in range(128)
>
> The test_descr and test_largefile failures are reproducible on Ubuntu
> and someone will eventually fix them.
>
> I can't reproduce the test_marshal and test_coding failures; please
> investigate more on CYGWIN.

For test_coding, apparently the module's contents are intended to be
loaded, and the test then verifies that a syntax error occurs when trying
to parse the module. However, on Cygwin I'm consistently getting an error
on the line that reads the file, specifically fp.read().

fp.read() appears to be trying to produce a unicode string by interpreting
the byte string as ASCII. The byte string is most certainly not valid ASCII,
so the codec throws an error. I'm guessing that for some reason Python
normally chooses a different codec, but on my Cygwin builds it is choosing
ASCII. I'm not sure why.
Nor am I sure how to investigate further.

Here's a fairly useless looking traceback for test_marshal. Many of the tests
fail with nearly identical tracebacks:

#======================================================================
#ERROR: test_tuple (test.test_marshal.ContainerTestCase)
#----------------------------------------------------------------------
#Traceback (most recent call last):
#  File "/home/Owner/py3k-struni/Lib/test/test_marshal.py", line 134, in test_tuple
#    self.helper(tuple(self.d.keys()))
#  File "/home/Owner/py3k-struni/Lib/test/test_marshal.py", line 21, in helper
#    new = marshal.load(f)
#ValueError: recursion limit exceeded

For what it's worth, here is the full subtest list and status for
test_marshal:

#test_bool (test.test_marshal.IntTestCase) ... ERROR
#test_int64 (test.test_marshal.IntTestCase) ... ok
#test_ints (test.test_marshal.IntTestCase) ... ERROR
#test_floats (test.test_marshal.FloatTestCase) ... ERROR
#test_buffer (test.test_marshal.StringTestCase) ... ERROR
#test_string (test.test_marshal.StringTestCase) ... ERROR
#test_unicode (test.test_marshal.StringTestCase) ... ERROR
#test_code (test.test_marshal.CodeTestCase) ... ok
#test_dict (test.test_marshal.ContainerTestCase) ... ERROR
#test_list (test.test_marshal.ContainerTestCase) ... ERROR
#test_sets (test.test_marshal.ContainerTestCase) ... ERROR
#test_tuple (test.test_marshal.ContainerTestCase) ... ERROR
#test_exceptions (test.test_marshal.ExceptionTestCase) ... ok
#test_bug_5888452 (test.test_marshal.BugsTestCase) ... ok
#test_fuzz (test.test_marshal.BugsTestCase) ... ok
#test_loads_recursion (test.test_marshal.BugsTestCase) ... ok
#test_patch_873224 (test.test_marshal.BugsTestCase) ... ok
#test_recursion_limit (test.test_marshal.BugsTestCase) ... ok
#test_version_argument (test.test_marshal.BugsTestCase) ... ok

I'm wondering if the recursion limit on my build is getting set too low
somehow.
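A minimal sketch of one way to narrow down a failure like the one reported above, assuming a current Python 3 interpreter rather than the 2007 py3k-struni branch. The helper names roundtrip_loads and roundtrip_load are hypothetical and do not come from test_marshal.py. It prints the interpreter's recursion limit and round-trips a few small values through both the in-memory and the file-based marshal APIs, so that a problem specific to the file path stands out:

```python
import marshal
import sys
import tempfile


def roundtrip_loads(value):
    """Round-trip a value through the in-memory API (dumps/loads)."""
    return marshal.loads(marshal.dumps(value))


def roundtrip_load(value):
    """Round-trip a value through the file-based API (dump/load)."""
    # TemporaryFile opens in binary mode ("w+b") by default, which is
    # what marshal.dump() and marshal.load() expect.
    with tempfile.TemporaryFile() as f:
        marshal.dump(value, f)
        f.seek(0)
        return marshal.load(f)


print("recursion limit:", sys.getrecursionlimit())

samples = [True, 12345, 1.5, "text", (1, 2, 3), [1, 2], {"a": 1}]
for value in samples:
    for name, func in (("loads", roundtrip_loads), ("load", roundtrip_load)):
        try:
            status = "ok" if func(value) == value else "MISMATCH"
        except ValueError as exc:
            # This is the exception type shown in the traceback above.
            status = "ERROR: %s" % exc
        print("%-5s %-12r %s" % (name, value, status))
```

If only the "load" rows fail, the problem is more likely in how the file is opened or read than in the recursion limit itself.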
From guido at python.org Wed Jul 18 20:13:24 2007 From: guido at python.org (Guido van Rossum) Date: Wed, 18 Jul 2007 11:13:24 -0700 Subject: [Python-3000] Py3k_struni additional test failures under cygwin In-Reply-To: References: Message-ID: On 7/18/07, Joe Smith wrote: > > "Guido van Rossum" wrote in message > news:ca471dc20707181002w64e076aco9a509ec7e4e15b9a at mail.gmail.com... > > On 7/17/07, Joe Smith wrote: > >> Building Py3k_struni under Cygwin I've noticed a few more tests failing > >> than > >> the wiki shows. > >> These are using SVN revision 56413. > >> > >> Some spurious errors seem to occur if Python/ is not remaned temporally. > >> I > >> have not included those. (This is an oddity of the cygwin '.exe' > >> autohandling combined with case-insensitivity) > >> > >> > >> Test_coding: Errors. Traceback included at end of message. > >> "test test_descr failed -- ['foo\u1234bar'] slots not caught" > >> "test test_largefile failed -- got b'z', but expected 'z'" > >> test_marshal: Tests that fail are fasiling with a recursion limit > >> exceeded > >> error. > >> > >> > >> > >> Tracebacks: > >> > >> test test_coding failed -- Traceback (most recent call last): > >> File "/home/Owner/py3k-struni/Lib/test/test_coding.py", line 12, in > >> test_bad_c > >> oding2 > >> self.verify_bad_module(module_name) > >> File "/home/Owner/py3k-struni/Lib/test/test_coding.py", line 20, in > >> verify_bad > >> _module > >> text = fp.read() > >> File "/home/Owner/py3k-struni/Lib/io.py", line 1186, in read > >> res += decoder.decode(self.buffer.read(), True) > >> File "/home/Owner/py3k-struni/Lib/encodings/ascii.py", line 26, in > >> decode > >> return codecs.ascii_decode(input, self.errors)[0] > >> UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 0: > >> ordinal > >> not in range(128) > > > > The test_descr and test_largefile failures are reproducible on Ubuntu > > and someone will eventually fix them. 
> > > > I can't reproduce the test_marshal and test_coding failures; please > > investigate more on CYGWIN. > > For the test coding, apprently the module's contents are intended to be > loaded, and then verified that a syntax error occurs when trying to parse > the module. However, on cygwin i'm consistantly getting an error on the line > that reads the file. Specificly fp.read(). > > Fp.read() appears to be trying to export a unicode string by interpreting > the byte string as ascii. The byte string is most certainly not valid ascii. > So the codec throws an error. I'm guessing for some reason python normally > chose a different codec, but on my cygwin compiles it is choosing ascii. I'm > not sure why. Nor am I sure how to inestigate further. The encoding defaults to the filesystem encoding or otherwise Latin-1. There's an XXX comment in io.py, in TextIOWrapper.__init__, admitting this is questionable. I'm guessing CYGWIN has a filesystem encoding equal to ASCII? Is this a good idea? Maybe the default encoding should always be UTF-8 (matching the source code default encoding). I can also fix it by changing test_coding.py to add encoding="utf-8" to the open() call in verify_bad_module(). > Heres a fairly useless loking traceback for test_marshal. Many of the tests > fail with nearly identical tracebacks: > > #====================================================================== > #ERROR: test_tuple (test.test_marshal.ContainerTestCase) > #---------------------------------------------------------------------- > #Traceback (most recent call last): > # File "/home/Owner/py3k-struni/Lib/test/test_marshal.py", line 134, in > test_tuple > # self.helper(tuple(self.d.keys())) > # File "/home/Owner/py3k-struni/Lib/test/test_marshal.py", line 21, in > helper > # new = marshal.load(f) > #ValueError: recursion limit exceeded > > For what it's worth here is the fll subtest list and status for > test_marshal: > > #test_bool (test.test_marshal.IntTestCase) ... 
ERROR > #test_int64 (test.test_marshal.IntTestCase) ... ok > #test_ints (test.test_marshal.IntTestCase) ... ERROR > #test_floats (test.test_marshal.FloatTestCase) ... ERROR > #test_buffer (test.test_marshal.StringTestCase) ... ERROR > #test_string (test.test_marshal.StringTestCase) ... ERROR > #test_unicode (test.test_marshal.StringTestCase) ... ERROR > #test_code (test.test_marshal.CodeTestCase) ... ok > #test_dict (test.test_marshal.ContainerTestCase) ... ERROR > #test_list (test.test_marshal.ContainerTestCase) ... ERROR > #test_sets (test.test_marshal.ContainerTestCase) ... ERROR > #test_tuple (test.test_marshal.ContainerTestCase) ... ERROR > #test_exceptions (test.test_marshal.ExceptionTestCase) ... ok > #test_bug_5888452 (test.test_marshal.BugsTestCase) ... ok > #test_fuzz (test.test_marshal.BugsTestCase) ... ok > #test_loads_recursion (test.test_marshal.BugsTestCase) ... ok > #test_patch_873224 (test.test_marshal.BugsTestCase) ... ok > #test_recursion_limit (test.test_marshal.BugsTestCase) ... ok > #test_version_argument (test.test_marshal.BugsTestCase) ... ok > > I'm wondering if the recusion limit on my build is getting set too low > somehow. Can you find out what it is? sys.getrecursionlimit(). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From alexandre at peadrop.com Wed Jul 18 20:27:40 2007 From: alexandre at peadrop.com (Alexandre Vassalotti) Date: Wed, 18 Jul 2007 14:27:40 -0400 Subject: [Python-3000] Introspection broken for objects using Py_FindMethod() In-Reply-To: References: Message-ID: On 7/17/07, Guido van Rossum wrote: > Yes, see a thread between me, Georg and Brett around March 7-10: > > http://mail.python.org/pipermail/python-3000/2007-March/006061.html > Thanks for the pointer. > I think the conclusion was to get rid of Py_FindMethod altogether. The > replacement isn't very hard. But it hasn't been done yet. Do you need you some help for that? 
Perhaps, I could try to write a patch to replace the trivial use cases of Py_FindMethod in the stdlib. Also, I think it would be a good idea to document the change, too. -- Alexandre From unknown_kev_cat at hotmail.com Wed Jul 18 20:50:14 2007 From: unknown_kev_cat at hotmail.com (Joe Smith) Date: Wed, 18 Jul 2007 14:50:14 -0400 Subject: [Python-3000] Py3k_struni additional test failures under cygwin References: Message-ID: "Guido van Rossum" wrote in message news:ca471dc20707181113m360db736h2fd079f29f71220 at mail.gmail.com... > On 7/18/07, Joe Smith wrote: >> >> "Guido van Rossum" wrote in message >> news:ca471dc20707181002w64e076aco9a509ec7e4e15b9a at mail.gmail.com... >> > On 7/17/07, Joe Smith wrote: >> >> Building Py3k_struni under Cygwin I've noticed a few more tests >> >> failing >> >> than >> >> the wiki shows. >> >> These are using SVN revision 56413. >> >> >> >> Some spurious errors seem to occur if Python/ is not remaned >> >> temporally. >> >> I >> >> have not included those. (This is an oddity of the cygwin '.exe' >> >> autohandling combined with case-insensitivity) >> >> >> >> >> >> Test_coding: Errors. Traceback included at end of message. >> >> "test test_descr failed -- ['foo\u1234bar'] slots not caught" >> >> "test test_largefile failed -- got b'z', but expected 'z'" >> >> test_marshal: Tests that fail are fasiling with a recursion limit >> >> exceeded >> >> error. 
>> >> >> >> >> >> >> >> Tracebacks: >> >> >> >> test test_coding failed -- Traceback (most recent call last): >> >> File "/home/Owner/py3k-struni/Lib/test/test_coding.py", line 12, in >> >> test_bad_c >> >> oding2 >> >> self.verify_bad_module(module_name) >> >> File "/home/Owner/py3k-struni/Lib/test/test_coding.py", line 20, in >> >> verify_bad >> >> _module >> >> text = fp.read() >> >> File "/home/Owner/py3k-struni/Lib/io.py", line 1186, in read >> >> res += decoder.decode(self.buffer.read(), True) >> >> File "/home/Owner/py3k-struni/Lib/encodings/ascii.py", line 26, in >> >> decode >> >> return codecs.ascii_decode(input, self.errors)[0] >> >> UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position >> >> 0: >> >> ordinal >> >> not in range(128) >> > >> > The test_descr and test_largefile failures are reproducible on Ubuntu >> > and someone will eventually fix them. >> > >> > I can't reproduce the test_marshal and test_coding failures; please >> > investigate more on CYGWIN. >> >> For the test coding, apprently the module's contents are intended to be >> loaded, and then verified that a syntax error occurs when trying to parse >> the module. However, on cygwin i'm consistantly getting an error on the >> line >> that reads the file. Specificly fp.read(). >> >> Fp.read() appears to be trying to export a unicode string by interpreting >> the byte string as ascii. The byte string is most certainly not valid >> ascii. >> So the codec throws an error. I'm guessing for some reason python >> normally >> chose a different codec, but on my cygwin compiles it is choosing ascii. >> I'm >> not sure why. Nor am I sure how to inestigate further. > > The encoding defaults to the filesystem encoding or otherwise Latin-1. > There's an XXX comment in io.py, in TextIOWrapper.__init__, admitting > this is questionable. I'm guessing CYGWIN has a filesystem encoding > equal to ASCII? Is this a good idea? Quite possibly. 
I know they have wanted to move to using the Unicode APIs to support
everything, but that is a pain because of the method that Windows uses
internally to support Unicode.

> Maybe the default encoding should always be UTF-8 (matching the source
> code default encoding).
>
> I can also fix it by changing test_coding.py to add encoding="utf-8"
> to the open() call in verify_bad_module().
>
>> Here's a fairly useless looking traceback for test_marshal. Many of the
>> tests fail with nearly identical tracebacks:
>>
>> #======================================================================
>> #ERROR: test_tuple (test.test_marshal.ContainerTestCase)
>> #----------------------------------------------------------------------
>> #Traceback (most recent call last):
>> #  File "/home/Owner/py3k-struni/Lib/test/test_marshal.py", line 134, in test_tuple
>> #    self.helper(tuple(self.d.keys()))
>> #  File "/home/Owner/py3k-struni/Lib/test/test_marshal.py", line 21, in helper
>> #    new = marshal.load(f)
>> #ValueError: recursion limit exceeded
>>
>> For what it's worth, here is the full subtest list and status for
>> test_marshal:
>>
>> #test_bool (test.test_marshal.IntTestCase) ... ERROR
>> #test_int64 (test.test_marshal.IntTestCase) ... ok
>> #test_ints (test.test_marshal.IntTestCase) ... ERROR
>> #test_floats (test.test_marshal.FloatTestCase) ... ERROR
>> #test_buffer (test.test_marshal.StringTestCase) ... ERROR
>> #test_string (test.test_marshal.StringTestCase) ... ERROR
>> #test_unicode (test.test_marshal.StringTestCase) ... ERROR
>> #test_code (test.test_marshal.CodeTestCase) ... ok
>> #test_dict (test.test_marshal.ContainerTestCase) ... ERROR
>> #test_list (test.test_marshal.ContainerTestCase) ... ERROR
>> #test_sets (test.test_marshal.ContainerTestCase) ... ERROR
>> #test_tuple (test.test_marshal.ContainerTestCase) ... ERROR
>> #test_exceptions (test.test_marshal.ExceptionTestCase) ... ok
>> #test_bug_5888452 (test.test_marshal.BugsTestCase) ...
ok >> #test_fuzz (test.test_marshal.BugsTestCase) ... ok >> #test_loads_recursion (test.test_marshal.BugsTestCase) ... ok >> #test_patch_873224 (test.test_marshal.BugsTestCase) ... ok >> #test_recursion_limit (test.test_marshal.BugsTestCase) ... ok >> #test_version_argument (test.test_marshal.BugsTestCase) ... ok >> >> I'm wondering if the recusion limit on my build is getting set too low >> somehow. > > Can you find out what it is? sys.getrecursionlimit(). Hmm... It is a limit of 1000. That is probably large enough, no? Anyway, from some basic testing it looks like marshal is always throwing that error when marshal.load() is called. However, marshal.loads() works fine. Might this be another encoding related error? From guido at python.org Wed Jul 18 20:56:07 2007 From: guido at python.org (Guido van Rossum) Date: Wed, 18 Jul 2007 11:56:07 -0700 Subject: [Python-3000] Introspection broken for objects using Py_FindMethod() In-Reply-To: References: Message-ID: On 7/18/07, Alexandre Vassalotti wrote: > On 7/17/07, Guido van Rossum wrote: > > Yes, see a thread between me, Georg and Brett around March 7-10: > > > > http://mail.python.org/pipermail/python-3000/2007-March/006061.html > > > > Thanks for the pointer. > > > I think the conclusion was to get rid of Py_FindMethod altogether. The > > replacement isn't very hard. But it hasn't been done yet. > > Do you need you some help for that? Perhaps, I could try to write a > patch to replace the trivial use cases of Py_FindMethod in the stdlib. > Also, I think it would be a good idea to document the change, too. That would be great! The Python 3000 project can use all the help it can get! Please use the py3k-struni branch. 
--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org Wed Jul 18 20:58:17 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 18 Jul 2007 11:58:17 -0700
Subject: [Python-3000] Py3k_struni additional test failures under cygwin
In-Reply-To:
References:
Message-ID:

On 7/18/07, Joe Smith wrote:
> >> I'm wondering if the recursion limit on my build is getting set too low
> >> somehow.
> >
> > Can you find out what it is? sys.getrecursionlimit().
>
> Hmm... It is a limit of 1000.
> That is probably large enough, no?

Yes, that's what it is for me.

> Anyway, from some basic testing it looks like marshal is always throwing
> that error when marshal.load() is called.
> However, marshal.loads() works fine.
>
> Might this be another encoding related error?

Perhaps. Or something else. Do try to investigate.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From alexandre at peadrop.com Wed Jul 18 22:32:57 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Wed, 18 Jul 2007 16:32:57 -0400
Subject: [Python-3000] StringIO/BytesIO in io.py doesn't over-seek properly
In-Reply-To:
References:
Message-ID:

So, any decision on the proposed semantic change of truncate?

-- Alexandre

On 7/3/07, Alexandre Vassalotti wrote:
> On 7/2/07, Guido van Rossum wrote:
> > Honestly, I think truncate() should always set the current position to
> > the new size, even though that's not what it currently does.
>
> Thought about that and I think that would be the best thing to do.
> That would avoid making StringIO unnecessarily different from BytesIO.
> And IMHO, it is less prone to bugs. If someone wants to truncate while
> keeping the current position, then he will have to state his intention
> explicitly by saving the value of tell() and calling seek() after
> truncating.
>
> I also find that the semantics make more sense. For example:
>
>   >>> s = StringIO("Good bye, world")
>   >>> s.truncate(10)
>   >>> s.write("cruel world")
>   >>> s.getvalue()
>   ???
>
> I think that should return "Good bye, cruel world", not "cruel world".
>
> So, does anyone else agree with this small semantic change of truncate()?
>

From guido at python.org Wed Jul 18 22:36:26 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 18 Jul 2007 13:36:26 -0700
Subject: [Python-3000] StringIO/BytesIO in io.py doesn't over-seek properly
In-Reply-To:
References:
Message-ID:

Unless anyone cares, it should imply a seek to the indicated position
if an argument was present.

On 7/18/07, Alexandre Vassalotti wrote:
> So, any decision on the proposed semantic change of truncate?
>
> -- Alexandre
>
> On 7/3/07, Alexandre Vassalotti wrote:
> > On 7/2/07, Guido van Rossum wrote:
> > > Honestly, I think truncate() should always set the current position to
> > > the new size, even though that's not what it currently does.
> >
> > Thought about that and I think that would be the best thing to do.
> > That would avoid making StringIO unnecessarily different from BytesIO.
> > And IMHO, it is less prone to bugs. If someone wants to truncate while
> > keeping the current position, then he will have to state his intention
> > explicitly by saving the value of tell() and calling seek() after
> > truncating.
> >
> > I also find that the semantics make more sense. For example:
> >
> >   >>> s = StringIO("Good bye, world")
> >   >>> s.truncate(10)
> >   >>> s.write("cruel world")
> >   >>> s.getvalue()
> >   ???
> >
> > I think that should return "Good bye, cruel world", not "cruel world".
> >
> > So, does anyone else agree with this small semantic change of truncate()?
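The behaviour being asked for can be sketched without relying on what truncate() itself does to the position. This is only an illustration, assuming a current Python 3 io.StringIO; the helper truncate_and_append is a hypothetical name, not something proposed in the thread. It performs the explicit seek() that the proposal would make implicit, so it yields the "Good bye, cruel world" result regardless of which semantics the io module uses:

```python
import io


def truncate_and_append(initial, size, extra):
    """Truncate a StringIO to `size` characters, then append `extra`.

    The explicit seek() makes the outcome independent of whether
    truncate() moves the stream position on its own.
    """
    s = io.StringIO(initial)   # position starts at 0
    s.truncate(size)           # keep only the first `size` characters
    s.seek(size)               # move to the new end explicitly
    s.write(extra)
    return s.getvalue()


print(truncate_and_append("Good bye, world", 10, "cruel world"))
```

Without the seek(), the write would start wherever the position happened to be, which is exactly the ambiguity the question above is about.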
> > > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From kbk at shore.net Wed Jul 18 23:34:05 2007 From: kbk at shore.net (Kurt B. Kaiser) Date: Wed, 18 Jul 2007 17:34:05 -0400 Subject: [Python-3000] Invalid \U escape in source code give hard-to-trace error In-Reply-To: (Guido van Rossum's message of "Wed, 18 Jul 2007 10:27:01 -0700") References: <87k5sy5j6l.fsf@hydra.bayview.thirdcreek.com> Message-ID: <87d4yp5rci.fsf@hydra.bayview.thirdcreek.com> "Guido van Rossum" writes: >> www.python.org/sf/1755885 > > Thanks! Checked in, and merged into p3yk. Thanks! Unfortunately, I see there's an error from test_unicode.py, which I neglected to re-run. My apologies! I've checked in a fix on the trunk and the buildbots are relatively happy once more, it seems. Should be caught in the next merge. -- KBK From guido at python.org Wed Jul 18 23:42:37 2007 From: guido at python.org (Guido van Rossum) Date: Wed, 18 Jul 2007 14:42:37 -0700 Subject: [Python-3000] Invalid \U escape in source code give hard-to-trace error In-Reply-To: <87d4yp5rci.fsf@hydra.bayview.thirdcreek.com> References: <87k5sy5j6l.fsf@hydra.bayview.thirdcreek.com> <87d4yp5rci.fsf@hydra.bayview.thirdcreek.com> Message-ID: On 7/18/07, Kurt B. Kaiser wrote: > "Guido van Rossum" writes: > > >> www.python.org/sf/1755885 > > > > Thanks! Checked in, and merged into p3yk. > > Thanks! > > Unfortunately, I see there's an error from test_unicode.py, which I > neglected to re-run. My apologies! > > I've checked in a fix on the trunk and the buildbots are relatively > happy once more, it seems. > > Should be caught in the next merge. Ah, I see. I fixed it separately in the py3k-struni branch. I'll try to remember the next time I merge. 
--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From g.brandl at gmx.net Wed Jul 18 23:42:56 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 18 Jul 2007 23:42:56 +0200
Subject: [Python-3000] Invalid \U escape in source code give hard-to-trace error
In-Reply-To:
References: <469D8AA5.1080502@v.loewis.de>
Message-ID:

Guido van Rossum wrote:
> On 7/17/07, "Martin v. Löwis" wrote:
>> > When a source file contains a string literal with an out-of-range \U
>> > escape (e.g. "\U12345678"), instead of a syntax error pointing to the
>> > offending literal, I get this, without any indication of the file or
>> > line:
>> >
>> > UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in
>> > position 0-9: illegal Unicode character
>> >
>> > This is quite hard to track down.
>>
>> I think the fundamental flaw is that a codec is used to implement
>> the Python syntax (or, rather, lexical rules).
>>
>> Not quite sure what the rationale for this design was; doing it on
>> the lexical level is (was) tricky because \u escapes were allowed
>> only for Unicode literals, and the lexer had no knowledge of the
>> prefix preceding a literal. (In 3k, it's still similar, because
>> \U escapes have no effect in bytes and raw literals).
>>
>> Still, even if it is "only" handled at the parsing level, I
>> don't see why it needs to be a codec. Instead, implementing
>> escapes in the compiler would still allow for proper diagnostics
>> (notice that in the AST the original lexical form of the string
>> literal is gone).
>
> I guess because it was deemed useful to have a codec for this purpose
> too, thereby exposing the algorithm to Python code that needs the same
> functionality (e.g. the compiler package, RIP).

And it still is useful. If you want to convert a string into a printable
representation, you can use repr(), but for the inverse you need this codec.
(or eval()...)
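As a rough illustration of the point about the codec, here is a sketch assuming a current Python 3 interpreter, where the codec is reachable under the name "unicode_escape". The helpers escape and unescape are hypothetical names, and the encoding direction is only similar to repr() (it adds no quotes). The sketch round-trips a string through the codec and shows the out-of-range \U case from this thread both at the codec level and through compile():

```python
def escape(text):
    """Turn a str into ASCII-only bytes using backslash escapes."""
    return text.encode("unicode_escape")


def unescape(raw):
    """Inverse direction: interpret backslash escapes in bytes."""
    return raw.decode("unicode_escape")


sample = "tab\there, snowman \u2603"
print(escape(sample))
print(unescape(escape(sample)) == sample)

# An escape beyond the Unicode range is rejected by the codec itself ...
try:
    unescape(b"\\U12345678")
except UnicodeDecodeError as exc:
    print("codec error:", exc)

# ... and, in current interpreters, the same literal in source code is
# reported as a SyntaxError that carries a line number.
try:
    compile('"\\U12345678"', "<demo>", "eval")
except SyntaxError as exc:
    print("compile error on line", exc.lineno, "-", exc.msg)
```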
Georg From alexandre at peadrop.com Wed Jul 18 23:43:54 2007 From: alexandre at peadrop.com (Alexandre Vassalotti) Date: Wed, 18 Jul 2007 17:43:54 -0400 Subject: [Python-3000] exclusion feature for 2to3? In-Reply-To: References: Message-ID: On 7/15/07, Georg Brandl wrote: > Most obvious would be a special comment, something like > > for x in curiousobject.iteritems(): # 2to3:keep > foo(x) > > Does that make sense? It would be a good idea to define a convention for these special comments. For example, we could define something similar to C's pragma: #pragma