From g.brandl at gmx.net Mon Jan 3 18:30:12 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Mon, 03 Jan 2011 18:30:12 +0100
Subject: [Python-porting] Time to celebrate (a bit)
Message-ID:

We now have over 300 packages supporting Python 3 in the Cheese Shop!

(Full development at http://dev.pocoo.org/~gbrandl/py3.html)

cheers,
Georg

From g.brandl at gmx.net Mon Jan 3 18:50:45 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Mon, 03 Jan 2011 18:50:45 +0100
Subject: [Python-porting] Time to celebrate (a bit)
In-Reply-To:
References:
Message-ID:

On 03.01.2011 18:30, Georg Brandl wrote:
> We now have over 300 packages supporting Python 3 in the Cheese Shop!

Also, the slope is increasing: while the step from 50 to 100 packages
took 6.5 months, the step from 250 to 300 packages took only 2.5 months.

A simple exponential extrapolation shows that we will have 100% Python 3
coverage by April 2016 ;-)

Georg

From solipsis at pitrou.net Mon Jan 3 20:35:57 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 3 Jan 2011 20:35:57 +0100
Subject: [Python-porting] Time to celebrate (a bit)
References:
Message-ID: <20110103203557.42dfb67d@pitrou.net>

On Mon, 03 Jan 2011 18:30:12 +0100
Georg Brandl wrote:
> We now have over 300 packages supporting Python 3 in the Cheese Shop!
>
> (Full development at http://dev.pocoo.org/~gbrandl/py3.html)

Merry Christmas and happy new year to Python 3.2!

cheers

Antoine.

From daniele.varrazzo at gmail.com Tue Jan 11 19:41:17 2011
From: daniele.varrazzo at gmail.com (Daniele Varrazzo)
Date: Tue, 11 Jan 2011 18:41:17 +0000
Subject: [Python-porting] A few questions about the psycopg2 porting to Python 3
Message-ID:

Hello,

I've finished porting psycopg2 to Python 3. The port is based on the
patch Martin v. Löwis wrote back in 2008 (thank you *very* much);
unfortunately at that time it wasn't promptly merged and the code base
diverged too much: the library has grown a lot of features since then
(and luckily a big fat test suite).
Martin's "abstraction layer" of macros in python.h has been the base for
the new porting anyway. The code is available in the python3 branch of
my psycopg repos .

Something I've done in a different way is the "adapters" area: Martin
did much of the processing in str, but there is a critical function
provided by the libpq, PQEscapeString , defined char* -> char*, that we
use on strings before passing them to the backend. Py3 strings and Py2
unicode must be converted to bytes (in the connection encoding) to use
it. It feels awkward to go back to unicode after passing through that
function, only to go back to bytes again when pushing data to the
socket, so I've used mostly bytes in the Python->Postgres adaptation
code path.

OTOH, doing it my way we eventually need a "format % args" with bytes in
both format and args: because this function is not provided by the
Python API, I made my own "PyBytes_Format" by converting PyString_Format
from the Python 2.7 source code. I understand there is no problem in
using parts of the Python (2.7) source code into an LGPL-licensed work:
is this right? The resulting file is in and includes the Python license
as well: I'd like an opinion about whether the result is "legal" and/or
if the license had to be specified some other way.

As emerged from the discussion in this ML back in 2008, there is
somewhere the need for a python function b('literal') that would
evaluate to 'literal' in Py2 and to b'literal' in py3 (we want to keep
compatibility with Python 2.4). Currently there is an encode() involved
in Py3, so it would be great to have the transformation
b('literal') -> b'literal' performed by 2to3 instead. Looking at the
other fixers it seems easy, but I haven't found how to register a custom
fixer for the use of build_py_2to3 in setup.py. Is there any reference?
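
A minimal sketch of the kind of b() helper described above, for illustration only: the thread does not show psycopg2's actual definition, so the function bodies and the choice of latin-1 are assumptions. On Python 2 the literal is already a byte string; on Python 3 it is encoded at run time, which is the encode() call that a 2to3 fixer would make unnecessary.

```python
import sys

if sys.version_info[0] < 3:
    def b(s):
        """Python 2: a plain str literal is already a byte string."""
        return s
else:
    def b(s):
        """Python 3: encode the str literal to bytes at run time.

        latin-1 is an assumption made for this sketch: it maps the
        code points 0-255 one-to-one onto single bytes.
        """
        return s.encode('latin-1')

# The same source line yields a byte string on both major versions.
query = b('SELECT 1')
```

The syntax avoids b'...' literals on purpose, so the module can still be imported by interpreters that predate them.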

There may be a few other points still open: they are more specifically
psycopg-related as they influence how 3rd party adapters should be
written, so they will probably be discussed in the psycopg mailing list
before releasing what is currently in the python3 branch. If you want to
join you are welcome.

Thank you very much.

--
Daniele

From daniele.varrazzo at gmail.com Wed Jan 12 18:34:12 2011
From: daniele.varrazzo at gmail.com (Daniele Varrazzo)
Date: Wed, 12 Jan 2011 17:34:12 +0000
Subject: [Python-porting] A few questions about the psycopg2 porting to Python 3
In-Reply-To:
References:
Message-ID:

On Tue, Jan 11, 2011 at 6:41 PM, Daniele Varrazzo wrote:

> As emerged from the discussion in this ML back in 2008, there is
> somewhere the need for a python function b('literal') that would
> evaluate to 'literal' in Py2 and to b'literal' in py3 (we want to keep
> compatibility with Python 2.4). Currently there is an encode()
> involved in Py3, so it would be great to have the transformation
> b('literal') -> b'literal' performed by 2to3 instead. Looking at the
> other fixers it seems easy, but I haven't found how to register a
> custom fixer for the use of build_py_2to3 in setup.py. Is there any
> reference?

I got a suggestion by PM to use distribute. My answer is that I would
prefer to avoid an extra dependency to solve this problem.

I've tested with some nasty monkeypatching to have the fix_b injected.
This seems to work, for instance:

diff --git a/setup.py b/setup.py
index 926169c..836d3e6 100644
--- a/setup.py
+++ b/setup.py
@@ -58,6 +58,16 @@ try:
     from distutils.command.build_py import build_py_2to3 as build_py
 except ImportError:
     from distutils.command.build_py import build_py
+else:
+    # Monkeypatch lib2to3 to make it find our custom fixers
+    import lib2to3.refactor
+    from lib2to3.refactor import get_fixers_from_package
+    def get_fixers_from_package_hacked(pkg_name):
+        rv = get_fixers_from_package(pkg_name)
+        return rv + ['fix_b']
+
+    lib2to3.refactor.get_fixers_from_package = get_fixers_from_package_hacked
+    sys.path.insert(0, 'scripts')
 
 try:
     import configparser

Is there a more proper way to use a custom fixer?

As b() fixer I've written the following:

    """Fixer to change b('string') into b'string'."""
    # Author: Daniele Varrazzo

    import token
    from lib2to3 import fixer_base
    from lib2to3.pytree import Leaf

    class FixB(fixer_base.BaseFix):

        # Match a call to b() with at most one argument node
        PATTERN = """
                  power< wrapper='b' trailer< '(' arg=[any] ')' > rest=any* >
                  """

        def transform(self, node, results):
            arg = results['arg']
            wrapper = results["wrapper"]
            # Only rewrite when the argument is a single string literal
            if len(arg) == 1 and arg[0].type == token.STRING:
                b = Leaf(token.STRING, 'b' + arg[0].value, prefix=wrapper.prefix)
                node.children = [ b ] + results['rest']

It has been obtained by reverse-engineering the other fixers: is it
written the way it is meant to be?

Thanks

--
Daniele

From martin at v.loewis.de Wed Jan 12 21:27:16 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 12 Jan 2011 21:27:16 +0100
Subject: [Python-porting] A few questions about the psycopg2 porting to Python 3
In-Reply-To:
References:
Message-ID: <4D2E0EA4.90009@v.loewis.de>

> I understand there is no problem in using parts of the
> Python (2.7) source code into an LGPL-licensed work: is this right?
> The resulting file is in and includes the
> Python license as well: I'd like an opinion about whether the result
> is "legal" and/or if the license had to be specified some other way.

I can certainly offer an opinion. Not as a PSF director, but as a
legal layman: this all sounds fine to me.

>
> As emerged from the discussion in this ML back in 2008, there is
> somewhere the need for a python function b('literal') that would
> evaluate to 'literal' in Py2 and to b'literal' in py3 (we want to keep
> compatibility with Python 2.4). Currently there is an encode()
> involved in Py3, so it would be great to have the transformation
> b('literal') -> b'literal' performed by 2to3 instead. Looking at the
> other fixers it seems easy, but I haven't found how to register a
> custom fixer for the use of build_py_2to3 in setup.py. Is there any
> reference?

See the source of distutils.util.Mixin2to3. Overriding fixer_names
should do the trick. The default list is computed as

    get_fixers_from_package('lib2to3.fixes')

and you'll have to append to this list.

I'm glad this is making progress!

Regards,
Martin

From regebro at gmail.com Wed Jan 12 22:35:38 2011
From: regebro at gmail.com (Lennart Regebro)
Date: Wed, 12 Jan 2011 22:35:38 +0100
Subject: [Python-porting] A few questions about the psycopg2 porting to Python 3
In-Reply-To:
References:
Message-ID:

On Tue, Jan 11, 2011 at 19:41, Daniele Varrazzo wrote:
> other fixers it seems easy, but I haven't found how to register a
> custom fixer for the use of build_py_2to3 in setup.py. Is there any
> reference?

Yes: http://packages.python.org/distribute/python3.html#distributing-python-3-modules

    setup(
        ...
        use_2to3_fixers = ['your.fixers'],
        ...
    )

From daniele.varrazzo at gmail.com Thu Jan 13 12:23:09 2011
From: daniele.varrazzo at gmail.com (Daniele Varrazzo)
Date: Thu, 13 Jan 2011 11:23:09 +0000
Subject: [Python-porting] A few questions about the psycopg2 porting to Python 3
In-Reply-To: <4D2E0EA4.90009@v.loewis.de>
References: <4D2E0EA4.90009@v.loewis.de>
Message-ID:

On Wed, Jan 12, 2011 at 8:27 PM, "Martin v. Löwis" wrote:

>> As emerged from the discussion in this ML back in 2008, there is
>> somewhere the need for a python function b('literal') that would
>> evaluate to 'literal' in Py2 and to b'literal' in py3 (we want to keep
>> compatibility with Python 2.4). Currently there is an encode()
>> involved in Py3, so it would be great to have the transformation
>> b('literal') -> b'literal' performed by 2to3 instead. Looking at the
>> other fixers it seems easy, but I haven't found how to register a
>> custom fixer for the use of build_py_2to3 in setup.py. Is there any
>> reference?
>
> See the source of distutils.util.Mixin2to3. Overriding fixer_names
> should do the trick. The default list is computed as
>
>     get_fixers_from_package('lib2to3.fixes')
>
> and you'll have to append to this list.

Yes, configuring the mixin by setting the class variable works
perfectly. As a reference here it is:

    try:
        from distutils.command.build_py import build_py_2to3 as build_py
    except ImportError:
        from distutils.command.build_py import build_py
    else:
        # Configure distutils to run our custom 2to3 fixers as well
        from lib2to3.refactor import get_fixers_from_package
        build_py.fixer_names = get_fixers_from_package('lib2to3.fixes')
        build_py.fixer_names.append('fix_b')

    setup(name="yourmodule",
          cmdclass={ 'build_py': build_py, },
          [...]
    )

--
Daniele

From daniele.varrazzo at gmail.com Thu Jan 13 12:40:27 2011
From: daniele.varrazzo at gmail.com (Daniele Varrazzo)
Date: Thu, 13 Jan 2011 11:40:27 +0000
Subject: [Python-porting] A few questions about the psycopg2 porting to Python 3
In-Reply-To:
References:
Message-ID:

On Wed, Jan 12, 2011 at 9:35 PM, Lennart Regebro wrote:
> On Tue, Jan 11, 2011 at 19:41, Daniele Varrazzo wrote:
>> other fixers it seems easy, but I haven't found how to register a
>> custom fixer for the use of build_py_2to3 in setup.py. Is there any
>> reference?
>
> Yes: http://packages.python.org/distribute/python3.html#distributing-python-3-modules

Isn't "distribute" a third party library? I would be fine using it if
there was no chance to do what I want with the stdlib; but if some
hacking is enough, even if the stdlib interface is not as polished as
distribute's, I prefer to avoid an extra dependency.

--
Daniele

From regebro at gmail.com Thu Jan 13 12:43:18 2011
From: regebro at gmail.com (Lennart Regebro)
Date: Thu, 13 Jan 2011 12:43:18 +0100
Subject: [Python-porting] A few questions about the psycopg2 porting to Python 3
In-Reply-To:
References:
Message-ID:

On Thu, Jan 13, 2011 at 12:40, Daniele Varrazzo wrote:
> Isn't "distribute" a third party library?

Yes, I answered before I read the part where you say you didn't want
to use it. :-)

From daniele.varrazzo at gmail.com Mon Jan 24 01:33:28 2011
From: daniele.varrazzo at gmail.com (Daniele Varrazzo)
Date: Mon, 24 Jan 2011 00:33:28 +0000
Subject: [Python-porting] Details about the psycopg porting
Message-ID:

Hello,

I've written to the Psycopg mailing list about the details in the
psycopg2 porting to Python 3. You can also read everything here: .

There is a couple of points still open, so if you want to take a look
at them I'd be happy to receive comments before releasing the code.

Regards,

--
Daniele

From regebro at gmail.com Mon Jan 24 08:21:02 2011
From: regebro at gmail.com (Lennart Regebro)
Date: Mon, 24 Jan 2011 08:21:02 +0100
Subject: [Python-porting] Details about the psycopg porting
In-Reply-To:
References:
Message-ID:

On Mon, Jan 24, 2011 at 01:33, Daniele Varrazzo wrote:
> Hello,
>
> I've written to the Psycopg mailing list about the details in the
> psycopg2 porting to Python 3. You can also read everything here:
> .
>
> There is a couple of points still open, so if you want to take a look
> at them I'd be happy to receive comments before releasing the code.

"Is there an interface in Python 3 to know if a file is binary or text?"

You can check if it inherits from io.TextIOBase or not. I think that's
the official way. Correct me if I'm wrong.

For the other issues I guess I would have to know psycopg2 to be able
to help. :-)

//Lennart

From solipsis at pitrou.net Mon Jan 24 16:20:48 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 24 Jan 2011 15:20:48 +0000 (UTC)
Subject: [Python-porting] Details about the psycopg porting
References:
Message-ID:

Hello,

> I've written to the Psycopg mailing list about the details in the
> psycopg2 porting to Python 3. You can also read everything here:
> .
>
> There is a couple of points still open, so if you want to take a look
> at them I'd be happy to receive comments before releasing the code.

From your article:

> the data (bytes) from the libpq are passed to file.write() using
> PyObject_CallFunction(func, "s#", buffer, len)?

You shouldn't use "s#" as it will implicitly decode the buffer to unicode.
Instead, use "y#" to write bytes.

> Is there an interface in Python 3 to know if a file is binary or text?

`isinstance(myfile, io.TextIOBase)` should do the trick. Or the
corresponding C call, using PyObject_IsInstance().

> In binary mode the file always returns bytes (str in py2, unicode in py3)

I suppose you mean "str in py2, bytes in py3".

> bytea fields are returned as MemoryView, from which is easy to get bytes

Is this because it is easier for you to return a memoryview? Otherwise it
would make more sense to return a bytes object.

Regards

Antoine.

From daniele.varrazzo at gmail.com Mon Jan 24 17:10:30 2011
From: daniele.varrazzo at gmail.com (Daniele Varrazzo)
Date: Mon, 24 Jan 2011 16:10:30 +0000
Subject: [Python-porting] Details about the psycopg porting
In-Reply-To:
References:
Message-ID:

On Mon, Jan 24, 2011 at 3:20 PM, Antoine Pitrou wrote:
>
> Hello,
>
>> I've written to the Psycopg mailing list about the details in the
>> psycopg2 porting to Python 3. You can also read everything here:
>> .
>>
>> There is a couple of points still open, so if you want to take a look
>> at them I'd be happy to receive comments before releasing the code.
>
> From your article:
>
>> the data (bytes) from the libpq are passed to file.write() using
>> PyObject_CallFunction(func, "s#", buffer, len)?
>
> You shouldn't use "s#" as it will implicitly decode the buffer to unicode.
> Instead, use "y#" to write bytes.

Yes, the s# is a leftover from before the conversion: I just have to
decide whether it's better to always emit bytes and break on text
files or to check for the file capability. Because text mode is the
default for open() I think the former would be surprising: I'll go for
the second option if not overly complex (seems trivial if
PyTextIOBase_Type is available in C without the need of importing
anything from Python, annoying otherwise).

>> In binary mode the file always returns bytes (str in py2, unicode in py3)
>
> I suppose you mean "str in py2, bytes in py3".

Yes: fixed, thanks.

>> bytea fields are returned as MemoryView, from which is easy to get bytes
>
> Is this because it is easier for you to return a memoryview? Otherwise it
> would make more sense to return a bytes object.

In Py2 bytea is converted to buffer objects, passing through a "chunk"
object implementing the buffer interface.
So yes, MemoryView is a more direct port.

--
Daniele

From solipsis at pitrou.net Mon Jan 24 17:21:03 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 24 Jan 2011 16:21:03 +0000 (UTC)
Subject: [Python-porting] Details about the psycopg porting
References:
Message-ID:

Daniele Varrazzo writes:
> >> the data (bytes) from the libpq are passed to file.write() using
> >> PyObject_CallFunction(func, "s#", buffer, len)?
> >
> > You shouldn't use "s#" as it will implicitly decode the buffer to unicode.
> > Instead, use "y#" to write bytes.
>
> Yes, the s# is a leftover from before the conversion: I just have to
> decide whether it's better to always emit bytes and break on text
> files or to check for the file capability. Because text mode is the
> default for open() I think the former would be surprising: I'll go for
> the second option if not overly complex (seems trivial if
> PyTextIOBase_Type is available in C without the need of importing
> anything from Python, annoying otherwise).

No, you'll have to import. The actual TextIOBase ABC is declared in Python.
(see Lib/io.py if you are curious)

> >> bytea fields are returned as MemoryView, from which is easy to get bytes
> >
> > Is this because it is easier for you to return a memoryview? Otherwise it
> > would make more sense to return a bytes object.
>
> In Py2 bytea is converted to buffer objects, passing through a "chunk"
> object implementing the buffer interface. So yes, MemoryView is a more
> direct port.

Well, does it point to some external memory managed by pgsql itself?
Otherwise bytes or bytearray would still be a better choice IMO (as in
better-known and more practical). In 3.x there's no confusion between
8-bit strings and unicode strings, so use of an obscure type such as
buffer() shouldn't be necessary.

Regards

Antoine.
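
The file check discussed in this exchange can be sketched at the Python level as follows. This is only an illustration, not psycopg2 code: write_chunk and its encoding parameter are hypothetical names, and the real decision would be made in C through PyObject_IsInstance() after importing io.

```python
import io

def write_chunk(fileobj, data, encoding='utf-8'):
    """Write the bytes in `data` to fileobj.

    Hypothetical helper: text files only accept str, so the data is
    decoded first when fileobj derives from io.TextIOBase; any other
    file object receives the bytes unchanged.
    """
    if isinstance(fileobj, io.TextIOBase):
        fileobj.write(data.decode(encoding))
    else:
        fileobj.write(data)

# In-memory stand-ins for a text-mode and a binary-mode file.
text_file = io.StringIO()
binary_file = io.BytesIO()
write_chunk(text_file, b'1\tfoo\n')
write_chunk(binary_file, b'1\tfoo\n')
```

In a real adapter the encoding would presumably be the connection encoding rather than a fixed default.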

From daniele.varrazzo at gmail.com Tue Jan 25 01:24:11 2011
From: daniele.varrazzo at gmail.com (Daniele Varrazzo)
Date: Tue, 25 Jan 2011 00:24:11 +0000
Subject: [Python-porting] Details about the psycopg porting
In-Reply-To:
References:
Message-ID:

On Mon, Jan 24, 2011 at 4:21 PM, Antoine Pitrou wrote:
> Daniele Varrazzo writes:
>> >> the data (bytes) from the libpq are passed to file.write() using
>> >> PyObject_CallFunction(func, "s#", buffer, len)?
>> >
>> > You shouldn't use "s#" as it will implicitly decode the buffer to unicode.
>> > Instead, use "y#" to write bytes.
>>
>> Yes, the s# is a leftover from before the conversion: I just have to
>> decide whether it's better to always emit bytes and break on text
>> files or to check for the file capability. Because text mode is the
>> default for open() I think the former would be surprising: I'll go for
>> the second option if not overly complex (seems trivial if
>> PyTextIOBase_Type is available in C without the need of importing
>> anything from Python, annoying otherwise).
>
> No, you'll have to import. The actual TextIOBase ABC is declared in Python.
> (see Lib/io.py if you are curious)

Annoying, then :) Will give it a try.

>> >> bytea fields are returned as MemoryView, from which is easy to get bytes
>> >
>> > Is this because it is easier for you to return a memoryview? Otherwise it
>> > would make more sense to return a bytes object.
>>
>> In Py2 bytea is converted to buffer objects, passing through a "chunk"
>> object implementing the buffer interface. So yes, MemoryView is a more
>> direct port.
>
> Well, does it point to some external memory managed by pgsql itself? Otherwise
> bytes or bytearray would still be a better choice IMO (as in better-known and
> more practical). In 3.x there's no confusion between 8-bit strings and unicode
> strings, so use of an obscure type such as buffer() shouldn't be necessary.

Reviewing the code, the buffer object was probably used initially
because the memory is handled by the libpq. I will have a talk with
some heavy users of the bytea type (I am not one, but people such as
the gnumed developers are) about what would be the best choice for the
library users. I want to avoid introducing unnecessary changes for Py2
users, so the buffer should stay unless we decide there are better
options and it's time for an incompatible change. I fear that having a
radically different interface for Py3 would be a problem for people
migrating from Py2.

Thank you very much.

--
Daniele
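
To make the trade-off in this thread concrete, here is a small Python 3 sketch of how a memoryview relates to bytes. It is not taken from psycopg2: the bytearray merely stands in for memory owned by another component (the libpq, in the discussion above).

```python
# Stand-in for a buffer whose memory is owned elsewhere (assumption
# made for this sketch; psycopg2 would expose libpq-managed memory).
raw = bytearray(b'\x00\x01binary\xff')

view = memoryview(raw)         # no copy: the view shares raw's memory
as_bytes = bytes(view)         # explicit copy into an immutable bytes object
as_bytes_too = view.tobytes()  # equivalent spelling of the same copy

# Changing the underlying buffer is visible through the view,
# but not in the copies made earlier.
raw[0] = 0x7f
```

Getting bytes from the view is a single call, but it copies the data, whereas the view itself does not.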