From barry at python.org Wed Dec 5 01:12:01 2012
From: barry at python.org (Barry Warsaw)
Date: Tue, 4 Dec 2012 19:12:01 -0500
Subject: [Python-porting] main() -> Py_SetProgramName()
Message-ID: <20121204191201.3589a972@limelight.wooz.org>

One gotcha with porting embedded Python 3 is the mismatch between main()'s
signature and Py_SetProgramName() and PySys_SetArgv().

In Python 2, everything was easy. You got char*'s from main() and could pass
them directly to these two calls. Not in Python 3, because they now take
wchar_t*'s instead. I get why these signatures have changed, but that doesn't
make life very easy for porters.

Take a look at main() in Modules/python.c to see the headaches Python itself
goes through to do the conversions. I think we're doing a disservice to
embedders not to provide convenience functions, alternative APIs, or at the
very least code examples for helping them do the argument conversions. This
is not easy code, it's error prone, and folks shouldn't have to roll their own
every time they need to do this.

Using the algorithm in main() is probably not the best recommendation either,
because it uses non-public API methods such as _Py_char2wchar(). Perhaps
these should be promoted to a public method, or we should add a method to get
from main()'s char** to a wchar_t**.

For now, I've tried to use mbsrtowcs(), though I haven't done extensive
testing on the code. I think Python ultimately uses mbstowcs() down deep in
its bowels.

There was some discussion of this back in 2009 IIRC, but nothing ever came of
it. I think MvL at the time was against adding any convenience or alternative
API to Python.

Has anybody else encountered this while porting embedded Python applications
to Python 3? How did you solve it?

I'm happy to bring this up on python-dev, but I also don't want to have to
wait until Python 3.4 to have a nice solution.

Cheers,
-Barry
From mal at egenix.com Wed Dec 5 09:16:44 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 05 Dec 2012 09:16:44 +0100
Subject: [Python-porting] main() -> Py_SetProgramName()
In-Reply-To: <20121204191201.3589a972@limelight.wooz.org>
References: <20121204191201.3589a972@limelight.wooz.org>
Message-ID: <50BF02EC.9000804@egenix.com>

On 05.12.2012 01:12, Barry Warsaw wrote:
> One gotcha with porting embedded Python 3 is the mismatch between main()'s
> signature and Py_SetProgramName() and PySys_SetArgv().
>
> In Python 2, everything was easy. You got char*'s from main() and could pass
> them directly to these two calls. Not in Python 3, because they now take
> wchar_t*'s instead. I get why these signatures have changed, but that doesn't
> make life very easy for porters.
>
> Take a look at main() in Modules/python.c to see the headaches Python itself
> goes through to do the conversions. I think we're doing a disservice to
> embedders not to provide convenience functions, alternative APIs, or at the
> very least code examples for helping them do the argument conversions. This
> is not easy code, it's error prone, and folks shouldn't have to roll their own
> every time they need to do this.
>
> Using the algorithm in main() is probably not the best recommendation either,
> because it uses non-public API methods such as _Py_char2wchar(). Perhaps
> these should be promoted to a public method, or we should add a method to get
> from main()'s char** to a wchar_t**.
>
> For now, I've tried to use mbsrtowcs(), though I haven't done extensive
> testing on the code. I think Python ultimately uses mbstowcs() down deep in
> its bowels.

There's also another issue with the approach, since changing the
**argv from within Python is no longer possible on non-Windows
platforms.

This doesn't only affect embedded uses of Python, but all other
uses as well, e.g.
it's no longer possible to change the ps output
under Unix for daemons and the like.

I think that we should have APIs going from the original
char **argv to the Py_Main() wchar_t **argv one, as well
as APIs that allow changing or at least accessing the original
char **argv from within Python (on non-Windows platforms).

That said, I don't think this is going to happen in a patch
level release...

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Dec 05 2012)
>>> Python Projects, Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________
2012-11-28: Released eGenix mx Base 3.2.5 ... http://egenix.com/go36
2013-01-22: Python Meeting Duesseldorf ... 48 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From martin at v.loewis.de Wed Dec 5 13:53:01 2012
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 05 Dec 2012 13:53:01 +0100
Subject: [Python-porting] main() -> Py_SetProgramName()
In-Reply-To: <20121204191201.3589a972@limelight.wooz.org>
References: <20121204191201.3589a972@limelight.wooz.org>
Message-ID: <50BF43AD.90401@v.loewis.de>

> Using the algorithm in main() is probably not the best recommendation either,
> because it uses non-public API methods such as _Py_char2wchar(). Perhaps
> these should be promoted to a public method, or we should add a method to get
> from main()'s char** to a wchar_t**.
>
> For now, I've tried to use mbsrtowcs(), though I haven't done extensive
> testing on the code.
> I think Python ultimately uses mbstowcs() down deep in
> its bowels.
>
> There was some discussion of this back in 2009 IIRC, but nothing ever came of
> it. I think MvL at the time was against adding any convenience or alternative
> API to Python.

If I said that, I may not have meant it this way. I may have been opposed
to a convenience function that implicitly calls setlocale, which in turn
would be necessary before mbsrtowcs can do anything useful (for non-ASCII
characters).

> I'm happy to bring this up on python-dev, but I also don't want to have to
> wait until Python 3.4 to have a nice solution.

In which case a stand-alone convenience function could be provided, to
be included in every project facing this issue.

Regards,
Martin

From solipsis at pitrou.net Sat Dec 8 10:40:32 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 8 Dec 2012 09:40:32 +0000 (UTC)
Subject: [Python-porting] main() -> Py_SetProgramName()
References: <20121204191201.3589a972@limelight.wooz.org> <50BF02EC.9000804@egenix.com>
Message-ID:

M.-A. Lemburg writes:
>
> There's also another issue with the approach, since changing the
> **argv from within Python is no longer possible on non-Windows
> platforms.
>
> This doesn't only affect embedded uses of Python, but all other
> uses as well, e.g. it's no longer possible to change the ps output
> under Unix for daemons and the like.

setproctitle is your friend:
http://pypi.python.org/pypi/setproctitle

Regards

Antoine.

From mal at egenix.com Sat Dec 8 12:57:53 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Sat, 08 Dec 2012 12:57:53 +0100
Subject: [Python-porting] main() -> Py_SetProgramName()
In-Reply-To:
References: <20121204191201.3589a972@limelight.wooz.org> <50BF02EC.9000804@egenix.com>
Message-ID: <50C32B41.8000300@egenix.com>

On 08.12.2012 10:40, Antoine Pitrou wrote:
> M.-A. Lemburg writes:
>>
>> There's also another issue with the approach, since changing the
>> **argv from within Python is no longer possible on non-Windows
>> platforms.
>>
>> This doesn't only affect embedded uses of Python, but all other
>> uses as well, e.g. it's no longer possible to change the ps output
>> under Unix for daemons and the like.
>
> setproctitle is your friend:
> http://pypi.python.org/pypi/setproctitle

Thanks for the pointer, but I think this is more than enough proof
that something should be done to make the situation in Py3 easier
for everyone.

Here's the hack he's using to find the original argv areas by walking
backwards from environ[0]...

https://github.com/dvarrazzo/py-setproctitle/blob/master/src/spt_setup.c#L139

--
Marc-Andre Lemburg
eGenix.com
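
For reference, the char ** to wchar_t ** conversion discussed in this thread
can be sketched in standard C roughly as follows. This is only an
illustration, not code from any of the posters: the names argv_to_wide() and
free_wide_argv() are hypothetical, it uses the standard mbsrtowcs() that
Barry mentions rather than the private _Py_char2wchar(), and, per Martin's
point, it assumes the caller has already called setlocale(LC_ALL, "") so that
non-ASCII arguments can be decoded. It simply returns NULL when the current
locale cannot decode an argument.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: free an array returned by argv_to_wide(). */
void free_wide_argv(int argc, wchar_t **wargv)
{
    if (wargv == NULL)
        return;
    for (int i = 0; i < argc; i++)
        free(wargv[i]);
    free(wargv);
}

/*
 * Hypothetical helper: convert main()'s char **argv into a newly allocated,
 * NULL-terminated wchar_t ** array, decoding with the current locale.
 * Assumption: the caller has already called setlocale(LC_ALL, "").
 * Returns NULL on allocation failure or on an undecodable byte sequence.
 */
wchar_t **argv_to_wide(int argc, char **argv)
{
    /* calloc zero-fills, so the array is NULL-terminated from the start. */
    wchar_t **wargv = calloc((size_t)argc + 1, sizeof(*wargv));
    if (wargv == NULL)
        return NULL;

    for (int i = 0; i < argc; i++) {
        mbstate_t state;
        const char *src = argv[i];

        /* First pass: a NULL destination only counts the wide characters. */
        memset(&state, 0, sizeof(state));
        size_t len = mbsrtowcs(NULL, &src, 0, &state);
        if (len == (size_t)-1) {
            free_wide_argv(i, wargv);
            return NULL;
        }

        wargv[i] = malloc((len + 1) * sizeof(wchar_t));
        if (wargv[i] == NULL) {
            free_wide_argv(i, wargv);
            return NULL;
        }

        /* Second pass: convert for real, including the terminating L'\0'. */
        memset(&state, 0, sizeof(state));
        src = argv[i];
        mbsrtowcs(wargv[i], &src, len + 1, &state);
    }
    return wargv;
}
```

An embedder would then pass wargv[0] to Py_SetProgramName() and (argc, wargv)
to PySys_SetArgv(), and should keep the array alive for as long as the
interpreter may still refer to it.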