From report at bugs.python.org Sat Oct 1 00:07:26 2011
From: report at bugs.python.org (Tom Christiansen)
Date: Fri, 30 Sep 2011 22:07:26 +0000
Subject: [issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace
In-Reply-To: <1317414642.85.0.0881821462071.issue12753@psf.upfronthosting.co.za>
Message-ID: <29624.1317420430@chthon>

Tom Christiansen added the comment:

>Ezio Melotti added the comment:

> Leaving named sequences for unicodedata.lookup() only (and not for
> \N{}) makes sense.

There are certainly advantages to that strategy: you don't have to
deal with [\N{sequence}] issues. If the argument to unicodedata.lookup()
can be any of name, alias, or sequence, that seems ok. \N{} should
still do aliases, though, since those don't have the complication that
sequences have.

You may wish unicodedata.name() to return the alias in preference,
however. That's what we do. And of course, there is no issue of
sequences there.

The rest of this perhaps painfully long message is just elaboration
and icing on what I've said above.

--tom

> The list of aliases is so small (11 entries) that I'm not sure using a
> binary search for it would bring any advantage. Having a single
> lookup algorithm that looks in both tables doesn't work because the
> aliases lookup must be in _getcode for \N{...} to work, whereas the
> lookup of named sequences will happen in unicodedata_lookup
> (Modules/unicodedata.c:1187). I think we can leave the for loop over
> aliases in _getcode and implement a separate (and binary) search in
> unicodedata_lookup for the named sequences. Does that sound fine?

If you mean, is it ok to add just the aliases and not the named sequences
to \N{}, it is certainly better than not doing so at all. Plus that way
you do *not* have to figure out what in the world to do with
[^a-c\N{sequence}], since that would have to be something like
(?!\N{sequence})[^a-c], which is hardly obvious, especially if
\N{sequence} actually starts with [a-c].

However, because the one namespace comprises all three of names, aliases, and named sequences, it might be best to have a functional (meaning, non-regex) API that allows one to do a fetch on the whole namespace, or on each individual component. The ICU library supports this sort of thing. In ICU4J's Java bindings, we find this: static int getCharFromExtendedName(String name) [icu] Find a Unicode character by either its name and return its code point value. static int getCharFromName(String name) [icu] Finds a Unicode code point by its most current Unicode name and return its code point value. static int getCharFromName1_0(String name) [icu] Find a Unicode character by its version 1.0 Unicode name and return its code point value. static int getCharFromNameAlias(String name) [icu] Find a Unicode character by its corrected name alias and return its code point value. The first one obviously has a bug in its definition, as the English doesn't scan. Looking at the full definition is even worse. Rather than dig out the src jar, I looked at ICU4C, but its own bindings are completely different. There you have only one function, with an enum to say what namespace to access: UChar32 u_charFromName ( UCharNameChoice nameChoice, const char * name, UErrorCode * pErrorCode ) The UCharNameChoice enum tells what sort of thing you want: U_UNICODE_CHAR_NAME, U_UNICODE_10_CHAR_NAME, U_EXTENDED_CHAR_NAME, U_CHAR_NAME_ALIAS, U_CHAR_NAME_CHOICE_COUNT Looking at the src for the Java is no more immediately illuminating, but I think that "extended" may refer to a union of the old 1.0 names with the current names. Now I'll tell you what Perl does. I do this not to say it is "right", but just to show you one possible strategy. I also am in the middle of writing about this for the Camel, so it is in my head. Perl does not provide the old 1.0 names at all. We don't have a Unicode 1.0 legacy to support, which makes this cleaner. 
However, we do provide for the names of the C0 and C1 Control Codes,
because apart from Unicode 1.0, they don't condescend to name the ASCII
or Latin1 control codes.

We also provide for certain well known aliases from the Names file:
anything that says "* commonly abbreviated as ...", so things like LRO
and ZWJ and such.

Perl makes no distinction between anything in the namespace when using
the \N{} form for string and regex escapes. That means when you use
"\N{...}" or /\N{...}/, you don't know which it is, nor can you. (And
yes, the bracketed character class issue is annoying and unsolved.)

However, the "functional" API does make a slight distinction.

 -- charnames::vianame() takes a name or alias (as a string) and returns
    a single integer code point.

    eg: This therefore converts "LATIN SMALL LETTER A" into 0x61.
        It also converts both
            BYZANTINE MUSICAL SYMBOL FHTORA SKLIRON CHROMA VASIS
        and
            BYZANTINE MUSICAL SYMBOL FTHORA SKLIRON CHROMA VASIS
        into 0x1D0C5. See below.

 -- charnames::string_vianame() takes a string name, alias, *or*
    sequence, and gives back a string.

    eg: This therefore converts "LATIN SMALL LETTER A" into "a".
        Since it has a string return instead of an int, it now also
        handles everything from the NamedSequences file as well.
        (See below.)

 -- charnames::viacode() takes an integer and gives back the official
    alias if there is one, and the official name if there is not.

    eg: This converts 0x61 into "LATIN SMALL LETTER A".
        It also converts 0x1D0C5 into
        "BYZANTINE MUSICAL SYMBOL FTHORA SKLIRON CHROMA VASIS".

Consider

    BYZANTINE MUSICAL SYMBOL FHTORA SKLIRON CHROMA VASIS

That was an error, and there is an official alias fixing it:

    BYZANTINE MUSICAL SYMBOL FTHORA SKLIRON CHROMA VASIS

(That's FHTORA vs FTHORA.)

You may use either as the name, and if you reverse the code point to
name, you get the replacement alias.

    % perl -mcharnames -wle 'printf "%04X\n", charnames::vianame("BYZANTINE MUSICAL SYMBOL FHTORA SKLIRON CHROMA VASIS")'
    1D0C5

    % perl -mcharnames -wle 'printf "%04X\n", charnames::vianame("BYZANTINE MUSICAL SYMBOL FTHORA SKLIRON CHROMA VASIS")'
    1D0C5

    % perl -mcharnames -wle 'print charnames::viacode(charnames::vianame("BYZANTINE MUSICAL SYMBOL FHTORA SKLIRON CHROMA VASIS"))'
    BYZANTINE MUSICAL SYMBOL FTHORA SKLIRON CHROMA VASIS

So on round-tripping, I gave it the "wrong" one (the original) and it
gave me back the "right" one (the replacement).

Using the \N{} thing, it again doesn't matter:

    % perl -mcharnames=:full -wle 'printf "%04X\n", ord "\N{BYZANTINE MUSICAL SYMBOL FHTORA SKLIRON CHROMA VASIS}"'
    1D0C5

    % perl -mcharnames=:full -wle 'printf "%04X\n", ord "\N{BYZANTINE MUSICAL SYMBOL FTHORA SKLIRON CHROMA VASIS}"'
    1D0C5

The interesting thing is the named sequences. string_vianame() works
just fine on those:

    % perl -mcharnames -wle 'print length charnames::string_vianame("LATIN CAPITAL LETTER A WITH MACRON AND GRAVE")'
    2

    % perl -mcharnames -wle 'printf "U+%v04X\n", charnames::string_vianame("LATIN CAPITAL LETTER A WITH MACRON AND GRAVE")'
    U+0100.0300

And that works fine with \N{} as well (provided you don't try
charclasses):

    % perl -mcharnames=:full -wle 'print "\N{LATIN CAPITAL LETTER A WITH MACRON AND GRAVE}"'
    Ā̀

    % perl -mcharnames=:full -wle 'print "\N{LATIN CAPITAL LETTER A WITH MACRON AND GRAVE}"' | uniquote -v
    \N{LATIN CAPITAL LETTER A WITH MACRON}\N{COMBINING GRAVE ACCENT}

    % perl -mcharnames=:full -wle 'print length "\N{LATIN CAPITAL LETTER A WITH MACRON AND GRAVE}"'
    2

    % perl -mcharnames=:full -wle 'printf "U+%v04X\n", "\N{LATIN CAPITAL LETTER A WITH MACRON AND GRAVE}"'
    U+0100.0300

It's kinda sad that for \N{} and sequences you can't just "do the right
thing" with strings and say that charclass stuff just isn't supported.
But my guess is that this simply won't work because you don't have
first class regexes.

If you pass both of these to the regex engine, they should behave the
same (and would, assuming the regex compiler knows about \N{} escapes):

    "\N{LATIN CAPITAL LETTER A WITH MACRON AND GRAVE}"
    r'\N{LATIN CAPITAL LETTER A WITH MACRON AND GRAVE}'

However, that falls apart if you do

    "[^\N{LATIN CAPITAL LETTER A WITH MACRON AND GRAVE}]"
    r'[^\N{LATIN CAPITAL LETTER A WITH MACRON AND GRAVE}]'

Because the compiler will do the substitution early on the first one
but not the second. This seems a problem, eh? So I guess you can't do
it at all? Or could you document it?

I think there is no good solution here. Perl can and does actually do
something quite reasonable in the noncharclass case, but that is because
we know that we are compiling a regex in virtually all scenarios.

    % perl -Mcharnames=:full -le 'print qr/\N{LATIN SMALL LETTER A}/'
    (?^u:\N{U+61})

    % perl -Mcharnames=:full -le 'print qr/\N{LATIN CAPITAL LETTER A WITH MACRON}/'
    (?^u:\N{U+100})

    % perl -Mcharnames=:full -le 'print qr/\N{LATIN CAPITAL LETTER A WITH MACRON AND GRAVE}/'
    (?^u:\N{U+100.300})

So you can do:

    % perl -Mcharnames=:full -le 'print "\N{LATIN CAPITAL LETTER A WITH MACRON AND GRAVE}" =~ /\N{LATIN CAPITAL LETTER A WITH MACRON AND GRAVE}/'
    1

And it is just fine. The issue is that there are ways for you to get
yourself into trouble if you do string-string stuff:

    % perl -Mcharnames=:full -le 'print "\N{LATIN CAPITAL LETTER A WITH MACRON AND GRAVE}" =~ "\N{LATIN CAPITAL LETTER A WITH MACRON AND GRAVE}"'
    1

    % perl -Mcharnames=:full -le 'print "\N{LATIN CAPITAL LETTER A WITH MACRON AND GRAVE}" =~ "^[\N{LATIN CAPITAL LETTER A WITH MACRON AND GRAVE}]+\$"'
    1

That works, but only accidentally, because of course U+0100.0300
contains nothing but either U+0100 or U+0300.

This is not a solved problem.

I hope this helps.

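The "only accidentally" point can be reproduced in Python. This is a minimal sketch, not anything from the patch: it spells the named sequence out as its two code points (so it does not depend on \N{} or on any lookup support) and interpolates it into a bracketed character class. The variable names are made up for illustration.

```python
import re

# LATIN CAPITAL LETTER A WITH MACRON AND GRAVE, written as its two
# code points so that no name lookup is involved.
SEQ = "\u0100\u0300"

# Inside a character class the sequence is no longer a unit: the class
# simply has two independent members, U+0100 and U+0300.
char_class = re.compile("^[" + SEQ + "]+$")

matches_sequence = bool(char_class.match(SEQ))
matches_reversed = bool(char_class.match(SEQ[::-1]))  # U+0300 then U+0100
matches_lone_mark = bool(char_class.match("\u0300"))  # just the accent

# Outside a character class the two code points must appear in order.
as_unit = re.compile(SEQ)
unit_matches_reversed = bool(as_unit.search(SEQ[::-1]))
```

The class accepts the reversed pair and a lone combining accent, which is why a match against the real sequence proves nothing about sequence support.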
--tom ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 00:18:39 2011 From: report at bugs.python.org (Terry J. Reedy) Date: Fri, 30 Sep 2011 22:18:39 +0000 Subject: [issue13071] IDLE refuses to open on windows 7 In-Reply-To: <1317341122.37.0.866117748717.issue13071@psf.upfronthosting.co.za> Message-ID: <1317421119.87.0.0227784627219.issue13071@psf.upfronthosting.co.za> Terry J. Reedy added the comment: IDLE with Py3.2.2 works fine on Win 7 for me (desktop) and daughter (laptop), so there is something peculiar with your system. ---------- nosy: +terry.reedy _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 01:54:18 2011 From: report at bugs.python.org (Victor Semionov) Date: Fri, 30 Sep 2011 23:54:18 +0000 Subject: [issue13070] segmentation fault in pure-python multi-threaded server In-Reply-To: <1317336165.27.0.207190934922.issue13070@psf.upfronthosting.co.za> Message-ID: <1317426858.54.0.0535941427661.issue13070@psf.upfronthosting.co.za> Victor Semionov added the comment: Thanks Charles-Fran?ois, I tested your patch with "make test" and with my program. Both work fine. 
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 02:24:38 2011 From: report at bugs.python.org (Alexander) Date: Sat, 01 Oct 2011 00:24:38 +0000 Subject: [issue13082] Can't open new window in python Message-ID: <1317428678.05.0.613085535741.issue13082@psf.upfronthosting.co.za> New submission from Alexander : When I try to open a new window in python to actually type a program and not just make single python instructions (this is while running on IDLE), python stops working and I have to force quit ---------- assignee: ronaldoussoren components: Macintosh messages: 144711 nosy: Reason2Rage, ronaldoussoren priority: normal severity: normal status: open title: Can't open new window in python type: crash versions: Python 3.2 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 03:17:59 2011 From: report at bugs.python.org (Mike Hoy) Date: Sat, 01 Oct 2011 01:17:59 +0000 Subject: [issue13075] PEP-0001 contains dead links In-Reply-To: <1317374485.54.0.239839552518.issue13075@psf.upfronthosting.co.za> Message-ID: <1317431879.92.0.681828900965.issue13075@psf.upfronthosting.co.za> Mike Hoy added the comment: I'm working on making a patch for this. I just want to confirm that the information I found is correct: Article Name OLD URL ARCHIVED URL How Python is Developed OLD: http://www.python.org/dev/intro/ ARCHIVE: http://www.etsimo.uniovi.es/python/dev/intro/ Python's Development Process http://www.python.org/dev/process/ http://www.etsimo.uniovi.es/python/dev/process/ Why Develop Python? 
http://www.python.org/dev/why/ http://www.etsimo.uniovi.es/python/dev/why/ Development Tools http://www.python.org/dev/tools/ http://www.etsimo.uniovi.es/python/dev/tools/ Frequently Asked Questions for Developers http://www.python.org/dev/faq/ http://www.etsimo.uniovi.es/python/dev/faq/ ---------- nosy: +mikehoy _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 03:19:07 2011 From: report at bugs.python.org (Mike Hoy) Date: Sat, 01 Oct 2011 01:19:07 +0000 Subject: [issue13075] PEP-0001 contains dead links In-Reply-To: <1317374485.54.0.239839552518.issue13075@psf.upfronthosting.co.za> Message-ID: <1317431947.96.0.55975689313.issue13075@psf.upfronthosting.co.za> Mike Hoy added the comment: Of course I would be creating new articles based on the archived pages. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 03:50:51 2011 From: report at bugs.python.org (STINNER Victor) Date: Sat, 01 Oct 2011 01:50:51 +0000 Subject: [issue13083] _sre: getstring() releases the buffer before using it Message-ID: <1317433851.47.0.137164360321.issue13083@psf.upfronthosting.co.za> New submission from STINNER Victor : getstring() of the _sre module contains the following code: ------------- ... buffer = Py_TYPE(string)->tp_as_buffer; if (!buffer || !buffer->bf_getbuffer || (*buffer->bf_getbuffer)(string, &view, PyBUF_SIMPLE) < 0) { PyErr_SetString(PyExc_TypeError, "expected string or buffer"); return NULL; } /* determine buffer size */ bytes = view.len; ptr = view.buf; /* Release the buffer immediately --- possibly dangerous but doing something else would require some re-factoring */ PyBuffer_Release(&view); ... ------------- getstring() is used to initialize a state or a pattern. 
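A rough Python-level analogue of the hazard, offered as a sketch only (it is not the _sre code, and the variable names are made up): while a buffer export is held, the exporter refuses to resize, and that guarantee ends as soon as the view is released.

```python
data = bytearray(b"abc")
view = memoryview(data)      # acquires a buffer export on `data`

try:
    data.extend(b"def")      # resizing is refused while the export is live
    resized_while_held = True
except BufferError:
    resized_while_held = False

view.release()               # the Python-level counterpart of PyBuffer_Release()
data.extend(b"def")          # now the resize is allowed, so the storage may move
```

In C, a pointer saved from view.buf before the release would have no such protection afterwards, which is the concern raised in this report.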
State and pattern have destructors (pattern_dealloc() and state_fini()), so it should be possible to keep the view active and call PyBuffer_Release() in the destructor. ---------- components: Library (Lib) messages: 144714 nosy: haypo, pitrou priority: normal severity: normal status: open title: _sre: getstring() releases the buffer before using it versions: Python 3.3 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 03:52:00 2011 From: report at bugs.python.org (Nick Coghlan) Date: Sat, 01 Oct 2011 01:52:00 +0000 Subject: [issue13075] PEP-0001 contains dead links In-Reply-To: <1317374485.54.0.239839552518.issue13075@psf.upfronthosting.co.za> Message-ID: <1317433920.76.0.141121490091.issue13075@psf.upfronthosting.co.za> Nick Coghlan added the comment: These pages are all still on python.org - the links just need to be updated to point to the devguide equivalents (under http://docs.python.org/devguide) ---------- nosy: +ncoghlan _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 04:15:43 2011 From: report at bugs.python.org (Ezio Melotti) Date: Sat, 01 Oct 2011 02:15:43 +0000 Subject: [issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace In-Reply-To: <1313430514.3.0.983525514499.issue12753@psf.upfronthosting.co.za> Message-ID: <1317435343.47.0.967017479927.issue12753@psf.upfronthosting.co.za> Ezio Melotti added the comment: Attached a new patch that adds support for named sequences (still needs some test and can probably be improved). > There are certainly advantages to that strategy: you don't have to > deal with [\N{sequence}] issues. I assume with [] you mean a regex character class, right? > If the argument to unicode.lookup() and be any of name, alias, or > sequence, that seems ok. With my latest patch, all 3 are supported. 
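For readers who want to try the three kinds of lookup, here is a short sketch. It assumes an interpreter where unicodedata.lookup() accepts aliases and named sequences (documented for CPython 3.3 and later); it does not show the patch itself, and the variable names are only illustrative.

```python
import unicodedata

# Plain character name.
by_name = unicodedata.lookup("LATIN SMALL LETTER A")

# Formal alias (the corrected FTHORA spelling) and the original name
# (the FHTORA misspelling) should resolve to the same code point.
by_alias = unicodedata.lookup(
    "BYZANTINE MUSICAL SYMBOL FTHORA SKLIRON CHROMA VASIS")
by_original = unicodedata.lookup(
    "BYZANTINE MUSICAL SYMBOL FHTORA SKLIRON CHROMA VASIS")

# Named sequence: the result is a string of more than one code point.
by_sequence = unicodedata.lookup(
    "LATIN CAPITAL LETTER A WITH MACRON AND GRAVE")
sequence_length = len(by_sequence)
```

Because a named sequence comes back as a multi-character string, callers that assume lookup() returns exactly one character (for example by passing the result to ord()) would need to check the length first.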
> \N{} should still do aliases, though, since those don't have the > complication that sequences have. \N{} will only support names and aliases (maybe this can go in 2.7/3.2 too). > You may wish unicode.name() to return the alias in preference, > however. That's what we do. And of course, there is no issue of > sequences there. This can be done for 3.3, but I wonder if it might create problems. People might use unicodedata.name() to get a name and use it elsewhere, and the other side might not be aware of aliases. ---------- Added file: http://bugs.python.org/file23280/issue12753-2.diff _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 08:39:35 2011 From: report at bugs.python.org (Stefan Krah) Date: Sat, 01 Oct 2011 06:39:35 +0000 Subject: [issue13080] test_email fails in refleak mode In-Reply-To: <1317411097.99.0.612727168161.issue13080@psf.upfronthosting.co.za> Message-ID: <1317451175.5.0.233175581313.issue13080@psf.upfronthosting.co.za> Stefan Krah added the comment: I think this is a duplicate of #12788. 
---------- nosy: +skrah resolution: -> duplicate stage: -> committed/rejected status: open -> closed superseder: -> test_email fails with -R _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 11:48:23 2011 From: report at bugs.python.org (Stefan Krah) Date: Sat, 01 Oct 2011 09:48:23 +0000 Subject: [issue13084] test_signal failure Message-ID: <1317462503.63.0.0518543153457.issue13084@psf.upfronthosting.co.za> New submission from Stefan Krah : Got this failure on Debian lenny amd64: [1/1] test_signal test test_signal failed -- Traceback (most recent call last): File "/home/stefan/cpython/Lib/test/test_signal.py", line 339, in test_pending """, *signals) File "/home/stefan/cpython/Lib/test/test_signal.py", line 263, in check_wakeup assert_python_ok('-c', code) File "/home/stefan/cpython/Lib/test/script_helper.py", line 50, in assert_python_ok return _assert_python(True, *args, **env_vars) File "/home/stefan/cpython/Lib/test/script_helper.py", line 42, in _assert_python "stderr follows:\n%s" % (rc, err.decode('ascii', 'ignore'))) AssertionError: Process return code is 1, stderr follows: Traceback (most recent call last): File "", line 41, in File "", line 16, in check_signum Exception: (10, 12) != (12, 10) 1 test failed: test_signal [103837 refs] ---------- components: Tests messages: 144718 nosy: skrah priority: normal severity: normal status: open title: test_signal failure versions: Python 3.3 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 11:51:39 2011 From: report at bugs.python.org (Stefan Krah) Date: Sat, 01 Oct 2011 09:51:39 +0000 Subject: [issue13085] : memory leaks Message-ID: <1317462699.45.0.870961708747.issue13085@psf.upfronthosting.co.za> New submission from Stefan Krah : I think a couple of leaks were introduced by the pep-393 changes (see the patch). 
---------- components: Interpreter Core files: pep-393-leaks.diff keywords: patch messages: 144719 nosy: haypo, loewis, skrah priority: normal severity: normal stage: patch review status: open title: : memory leaks type: resource usage versions: Python 3.3 Added file: http://bugs.python.org/file23281/pep-393-leaks.diff _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 11:52:15 2011 From: report at bugs.python.org (Stefan Krah) Date: Sat, 01 Oct 2011 09:52:15 +0000 Subject: [issue13085] pep-393: memory leaks In-Reply-To: <1317462699.45.0.870961708747.issue13085@psf.upfronthosting.co.za> Message-ID: <1317462735.78.0.198685401056.issue13085@psf.upfronthosting.co.za> Changes by Stefan Krah : ---------- title: : memory leaks -> pep-393: memory leaks _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 11:54:05 2011 From: report at bugs.python.org (Stefan Krah) Date: Sat, 01 Oct 2011 09:54:05 +0000 Subject: [issue13084] test_signal failure In-Reply-To: <1317462503.63.0.0518543153457.issue13084@psf.upfronthosting.co.za> Message-ID: <1317462845.66.0.0454078462025.issue13084@psf.upfronthosting.co.za> Changes by Stefan Krah : ---------- type: -> behavior _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 12:19:05 2011 From: report at bugs.python.org (=?utf-8?q?Charles-Fran=C3=A7ois_Natali?=) Date: Sat, 01 Oct 2011 10:19:05 +0000 Subject: [issue13084] test_signal failure In-Reply-To: <1317462503.63.0.0518543153457.issue13084@psf.upfronthosting.co.za> Message-ID: <1317464345.91.0.210701626116.issue13084@psf.upfronthosting.co.za> Charles-Fran?ois Natali added the comment: See http://bugs.python.org/issue12469, specifically http://bugs.python.org/issue12469#msg139831 """ > > When signals are unblocked, pending signal ared 
delivered in the reverse order > > of their number (also on Linux, not only on FreeBSD 6). > > I don't like this. > POSIX doesn't make any guarantee about signal delivery order, except > for real-time signals. > It might work on FreeBSD and Linux, but that's definitely not > documented, and might break with new kernel releases, or other > kernels. It looks like it works like this on most OSes (Linux, Mac OS X, Solaris, FreeBSD): I don't see any test_signal failure on 3.x buildbots. If we have a failure, we can use set() again, but only for test_pending: signal order should be reliable if signals are not blocked. """ Looks like we now have a failure :-) ---------- nosy: +haypo, neologix _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 12:57:42 2011 From: report at bugs.python.org (Larry Hastings) Date: Sat, 01 Oct 2011 10:57:42 +0000 Subject: [issue13086] Update howto/cporting.rst so it talks about 3.x instead of 3.0 Message-ID: <1317466662.3.0.331555988129.issue13086@psf.upfronthosting.co.za> New submission from Larry Hastings : The title of howto/cporting.rst is "Porting Extension Modules To 3.0". It then talks about 3.0 in a whole bunch of places. Considering that we're working on 3.3, and considering that 3.0 is end-of-lifed (not even meriting a branch in hg), wouldn't it be better for the document to talk about "3.x"? It already talks about "2.x" in several places, so it's not like this would confuse the reader. Alternatively, we could remove the ".0" (and maybe the ".x"s) so the document talks about porting from "Python 2" to "Python 3". I'd be happy to make the patch / check in the change. 
---------- assignee: larry components: Documentation messages: 144721 nosy: larry priority: low severity: normal status: open title: Update howto/cporting.rst so it talks about 3.x instead of 3.0 type: feature request versions: Python 3.1, Python 3.2, Python 3.3, Python 3.4 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 12:59:48 2011 From: report at bugs.python.org (=?utf-8?q?Martin_v=2E_L=C3=B6wis?=) Date: Sat, 01 Oct 2011 10:59:48 +0000 Subject: [issue12737] str.title() is overzealous by upcasing combining marks inappropriately In-Reply-To: <26418.1317386261@chthon> Message-ID: <4E86F2A2.9020107@v.loewis.de> Martin v. L?wis added the comment: > * Word characters are Alphabetic + Mn+Mc+Me + Nd + Pc. Where did you get that definition from? UTS#18 defines "", which is Alphabetic + U+200C + U+200D (i.e. not including marks, but including those > I think you are looking for here are Word characters without > Nd + Pc, so just Alphabetic + Mn+Mc+Me. > > Is that right? With your definition of "Word character" above, yes, that's right. Marks won't start a word, though. As for terminology: I think the documentation should continue to speak about "words" and "letters", and then define what is meant in this context. It's not that the Unicode consortium invented the term "letter", so we should use it more liberally than just referring to the L* categories. 
----------
title: str.title() is overzealous by upcasing combining marks inappropriately -> str.title() is overzealous by upcasing combining marks inappropriately

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sat Oct 1 13:07:49 2011
From: report at bugs.python.org (Tom Christiansen)
Date: Sat, 01 Oct 2011 11:07:49 +0000
Subject: [issue12737] str.title() is overzealous by upcasing combining marks inappropriately
In-Reply-To: <4E86F2A2.9020107@v.loewis.de>
Message-ID: <32317.1317467261@chthon>

Tom Christiansen added the comment:

Martin v. Löwis wrote on Sat, 01 Oct 2011 10:59:48 -0000:

>> * Word characters are Alphabetic + Mn+Mc+Me + Nd + Pc.

> Where did you get that definition from? UTS#18 defines
> "", which is Alphabetic + U+200C + U+200D
> (i.e. not including marks, but including those

From UTS#18 RL1.2A in Annex C, where a \p{word} or \w character
is defined to be

    \p{alpha}
    \p{gc=Mark}
    \p{digit}
    \p{gc=Connector_Punctuation}

>> I think you are looking for here are Word characters without
>> Nd + Pc, so just Alphabetic + Mn+Mc+Me.
>>
>> Is that right?
>
> With your definition of "Word character" above, yes, that's right.

It's not mine. It's tr18's.

> Marks won't start a word, though.

That's the smarter boundary thing they talk about. I'm not myself
familiar with \pM

> As for terminology: I think the documentation should continue to
> speak about "words" and "letters", and then define what is meant
> in this context. It's not that the Unicode consortium invented
> the term "letter", so we should use it more liberally than just
> referring to the L* categories.

I really don't think it wise to have private definitions of these.

If Letter doesn't mean L?, things get too weird. That's why
there are separate definitions of alphabetic, word, etc.

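The four-part definition of a word character quoted above can be approximated in Python. This is a sketch under a stated assumption: the unicodedata module does not expose the Alphabetic property, so str.isalpha() (general categories L*) stands in for \p{alpha}, which is narrower than the real property. The helper name is_word_char is made up for illustration.

```python
import unicodedata

def is_word_char(ch):
    """Approximate the tr18 word-character test for a single character."""
    cat = unicodedata.category(ch)
    return (
        ch.isalpha()            # assumption: L* categories stand in for Alphabetic
        or cat.startswith("M")  # gc=Mark: Mn, Mc, Me
        or cat == "Nd"          # decimal digits
        or cat == "Pc"          # gc=Connector_Punctuation, e.g. "_"
    )

# A combining acute accent counts as a word character, a hyphen does not.
accent_is_word = is_word_char("\u0301")
hyphen_is_word = is_word_char("-")
```

Dropping the Nd and Pc branches gives the narrower "Alphabetic + Mn+Mc+Me" set discussed in the quoted exchange.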
--tom ---------- title: str.title() is overzealous by upcasing combining marks inappropriately -> str.title() is overzealous by upcasing combining marks inappropriately _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 13:13:56 2011 From: report at bugs.python.org (Larry Hastings) Date: Sat, 01 Oct 2011 11:13:56 +0000 Subject: [issue13053] Add Capsule migration documentation to "cporting" In-Reply-To: <1317224248.53.0.263122473581.issue13053@psf.upfronthosting.co.za> Message-ID: <1317467636.96.0.0753461105779.issue13053@psf.upfronthosting.co.za> Larry Hastings added the comment: Attached is a patch against trunk branch "2.7" (rev dec00ae64ca8) adding documentation on how to migrate CObjects to Capsules. Delta the inevitable formatting bikeshedding, this should be ready to go. I've smoke-tested the "capsulethunk.h" locally and it works fine. When accepted, I'll check this in to the 2.7 branch, then merge into the 3.1, 3.2, and trunk branches. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 13:14:37 2011 From: report at bugs.python.org (Larry Hastings) Date: Sat, 01 Oct 2011 11:14:37 +0000 Subject: [issue13053] Add Capsule migration documentation to "cporting" In-Reply-To: <1317224248.53.0.263122473581.issue13053@psf.upfronthosting.co.za> Message-ID: <1317467677.08.0.487945214.issue13053@psf.upfronthosting.co.za> Larry Hastings added the comment: Whoops, forgot to attach. *Here's* the patch. 
---------- keywords: +patch Added file: http://bugs.python.org/file23282/larry.cporting.capsules.r1.diff _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 13:39:15 2011 From: report at bugs.python.org (Ezio Melotti) Date: Sat, 01 Oct 2011 11:39:15 +0000 Subject: [issue13086] Update howto/cporting.rst so it talks about 3.x instead of 3.0 In-Reply-To: <1317466662.3.0.331555988129.issue13086@psf.upfronthosting.co.za> Message-ID: <1317469155.24.0.940219483241.issue13086@psf.upfronthosting.co.za> Changes by Ezio Melotti : ---------- nosy: +ezio.melotti stage: -> needs patch versions: -Python 3.1, Python 3.4 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 13:47:32 2011 From: report at bugs.python.org (Larry Hastings) Date: Sat, 01 Oct 2011 11:47:32 +0000 Subject: [issue13086] Update howto/cporting.rst so it talks about 3.x instead of 3.0 In-Reply-To: <1317466662.3.0.331555988129.issue13086@psf.upfronthosting.co.za> Message-ID: <1317469652.17.0.802115057196.issue13086@psf.upfronthosting.co.za> Larry Hastings added the comment: Why shouldn't I check this in to the 2.7 / 3.1 branches? ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 13:52:38 2011 From: report at bugs.python.org (Georg Brandl) Date: Sat, 01 Oct 2011 11:52:38 +0000 Subject: [issue13086] Update howto/cporting.rst so it talks about 3.x instead of 3.0 In-Reply-To: <1317466662.3.0.331555988129.issue13086@psf.upfronthosting.co.za> Message-ID: <1317469958.43.0.105880138486.issue13086@psf.upfronthosting.co.za> Georg Brandl added the comment: 3.1 because it won't have any effect; it's in security-fix mode. For 2.7 go ahead. 
---------- nosy: +georg.brandl _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 13:58:07 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Sat, 01 Oct 2011 11:58:07 +0000 Subject: [issue13072] Getting a buffer from a Unicode array uses invalid format In-Reply-To: <1317341390.65.0.255779328054.issue13072@psf.upfronthosting.co.za> Message-ID: <1317470287.77.0.789193837648.issue13072@psf.upfronthosting.co.za> Changes by Antoine Pitrou : ---------- nosy: +mark.dickinson, skrah _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 14:02:49 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Sat, 01 Oct 2011 12:02:49 +0000 Subject: [issue13070] segmentation fault in pure-python multi-threaded server In-Reply-To: <1317336165.27.0.207190934922.issue13070@psf.upfronthosting.co.za> Message-ID: <1317470569.15.0.467009188866.issue13070@psf.upfronthosting.co.za> Antoine Pitrou added the comment: Shouldn't the test use "self.BufferedRWPair" instead of "io.BufferedRWPair"? Also, is it ok to just return NULL or should the error state also be set? ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 14:17:07 2011 From: report at bugs.python.org (STINNER Victor) Date: Sat, 01 Oct 2011 12:17:07 +0000 Subject: [issue13084] test_signal failure In-Reply-To: <1317462503.63.0.0518543153457.issue13084@psf.upfronthosting.co.za> Message-ID: <1317471427.31.0.485011570963.issue13084@psf.upfronthosting.co.za> STINNER Victor added the comment: WakeupSignalTests.test_pending() doesn't really check our signal handler but more the operating system, especially pthread_sigmask(SIG_UNBLOCK). I don't think that Python should test the signal order delivered by the operating systems when SIG_UNBLOCK. 
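To make the set() idea concrete, here is a small sketch of an order-insensitive comparison. The helper name same_signals is hypothetical and this is not the contents of any attached patch; the numbers are the ones from the reported failure, "(10, 12) != (12, 10)".

```python
def same_signals(received, expected):
    """Compare delivered signal numbers while ignoring delivery order."""
    return set(received) == set(expected)

# Tuple comparison is order-sensitive, which is what made the test fail.
ordered_equal = (10, 12) == (12, 10)

# Comparing as sets only checks that the same signals arrived.
unordered_equal = same_signals((10, 12), (12, 10))
```

A set also collapses duplicates, so this form only suits checks where each signal is expected once.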
Anyone motivated to write a patch to use a set() again?

----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sat Oct 1 15:43:59 2011
From: report at bugs.python.org (etuardu)
Date: Sat, 01 Oct 2011 13:43:59 +0000
Subject: [issue13077] Unclear behavior of daemon threads on main thread exit
In-Reply-To: <1317396642.03.0.535538271179.issue13077@psf.upfronthosting.co.za>
Message-ID: <1317476639.29.0.288812849627.issue13077@psf.upfronthosting.co.za>

etuardu added the comment:

Let me put it this way: the definition of daemon thread describes the
behaviour of the Python program running it (its exit condition in
particular) instead of going straight to the point, describing the
behaviour of the daemon thread itself first and adding other
considerations afterwards.

Specifically, I think a situation like the following is not quite clear
from the given definition:

- I have a main thread and a secondary thread, both printing to stdout.
- At some point, I press Ctrl+c, raising an unhandled KeyboardInterrupt
  exception in the main thread, which kills it.

This is what I get using a daemon thread:

    etuardu@subranu:~/Desktop$ python foo.py # other = daemon
    other thread
    main thread
    other thread
    main thread
    ^C
    Traceback [...]
    KeyboardInterrupt
    etuardu@subranu:~/Desktop$ # process terminates

This is what I get using a non-daemon thread:

    etuardu@subranu:~/Desktop$ python foo.py # other = non-daemon
    other thread
    main thread
    other thread
    main thread
    ^C
    Traceback [...]
    KeyboardInterrupt
    other thread
    other thread
    other thread
    ... (process still running)

So, to explain the significance of the "daemon" flag, I'd say something
like:

A daemon thread is shut down when the main thread and all other
non-daemon threads end. This means a Python program runs as long as
non-daemon threads, such as the main thread, are running. When only
daemon threads are left, the Python program exits.

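The two behaviours can be checked without pressing Ctrl+c. This sketch is not the attached foo.py: it starts a child interpreter whose main thread finishes immediately while a worker loops forever, then observes whether the process exits. The names CHILD and exits_promptly and the 3-second timeout are arbitrary choices for illustration.

```python
import subprocess
import sys
import textwrap

# Child program: the worker never returns; the main thread ends right away.
CHILD = textwrap.dedent("""
    import sys, threading, time

    def worker():
        while True:
            time.sleep(0.05)

    t = threading.Thread(target=worker)
    t.daemon = (sys.argv[1] == "daemon")
    t.start()
    # the main thread ends here
""")

def exits_promptly(mode, timeout=3.0):
    """Return True if the child process ends within `timeout` seconds."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD, mode])
    try:
        proc.wait(timeout=timeout)
        return True
    except subprocess.TimeoutExpired:
        proc.kill()          # the non-daemon worker would keep it alive forever
        proc.wait()
        return False

daemon_exits = exits_promptly("daemon")
non_daemon_exits = exits_promptly("non-daemon")
```

With the daemon flag set the child should end as soon as its main thread does; without it the child keeps running until it is killed.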
Of course this can be understood from the current definition ("the entire Python program exits when only daemon threads are left"), still it looks a bit sibylline to me.

----------
Added file: http://bugs.python.org/file23283/foo.py

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sat Oct 1 16:24:10 2011
From: report at bugs.python.org (Charles-François Natali)
Date: Sat, 01 Oct 2011 14:24:10 +0000
Subject: [issue13084] test_signal failure
In-Reply-To: <1317462503.63.0.0518543153457.issue13084@psf.upfronthosting.co.za>
Message-ID: <1317479050.55.0.125396587147.issue13084@psf.upfronthosting.co.za>

Changes by Charles-François Natali:

----------
keywords: +patch
Added file: http://bugs.python.org/file23284/check_signum_order.diff

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sat Oct 1 16:35:49 2011
From: report at bugs.python.org (Roundup Robot)
Date: Sat, 01 Oct 2011 14:35:49 +0000
Subject: [issue13085] pep-393: memory leaks
In-Reply-To: <1317462699.45.0.870961708747.issue13085@psf.upfronthosting.co.za>
Message-ID:

Roundup Robot added the comment:

New changeset 1b203e741fb2 by Martin v. Löwis in branch 'default':
Issue 13085: Fix some memory leaks. Patch by Stefan Krah.
http://hg.python.org/cpython/rev/1b203e741fb2

----------
nosy: +python-dev

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sat Oct 1 16:36:37 2011
From: report at bugs.python.org (Martin v. Löwis)
Date: Sat, 01 Oct 2011 14:36:37 +0000
Subject: [issue13085] pep-393: memory leaks
In-Reply-To: <1317462699.45.0.870961708747.issue13085@psf.upfronthosting.co.za>
Message-ID: <1317479797.26.0.623219116204.issue13085@psf.upfronthosting.co.za>

Martin v. Löwis added the comment:

Thanks for the patch!
----------
resolution: -> fixed
status: open -> closed

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sat Oct 1 16:37:39 2011
From: report at bugs.python.org (Martin v. Löwis)
Date: Sat, 01 Oct 2011 14:37:39 +0000
Subject: [issue13086] Update howto/cporting.rst so it talks about 3.x instead of 3.0
In-Reply-To: <1317466662.3.0.331555988129.issue13086@psf.upfronthosting.co.za>
Message-ID: <1317479859.54.0.853932632552.issue13086@psf.upfronthosting.co.za>

Martin v. Löwis added the comment:

I like "Python 2" more than "2.x".

----------
nosy: +loewis

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sat Oct 1 16:37:50 2011
From: report at bugs.python.org (Antoine Pitrou)
Date: Sat, 01 Oct 2011 14:37:50 +0000
Subject: [issue12804] make test should not enable the urlfetch resource
In-Reply-To: <1313935702.14.0.838210022398.issue12804@psf.upfronthosting.co.za>
Message-ID: <1317479870.49.0.708137548835.issue12804@psf.upfronthosting.co.za>

Antoine Pitrou added the comment:

Please consider reverting this patch. If you have a flaky network connection, you can override the test flags yourself.

----------
nosy: +brett.cannon, pitrou
status: closed -> open

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sat Oct 1 16:42:35 2011
From: report at bugs.python.org (Martin v. Löwis)
Date: Sat, 01 Oct 2011 14:42:35 +0000
Subject: [issue12737] str.title() is overzealous by upcasing combining marks inappropriately
In-Reply-To: <32317.1317467261@chthon>
Message-ID: <4E8726D9.2040604@v.loewis.de>

Martin v. Löwis added the comment:

>> As for terminology: I think the documentation should continue to
>> speak about "words" and "letters", and then define what is meant
>> in this context. It's not that the Unicode consortium invented
>> the term "letter", so we should use it more liberally than just
>> referring to the L* categories.
>
> I really don't think it wise to have private definitions of these.
>
> If Letter doesn't mean L?, things get too weird. That's why
> there are separate definitions of alphabetic, word, etc.

But I won't be using the word "Letter", but "letter" (lower case). Nobody will assume that this refers to the Unicode standard; people would rather expect that this is [A-Za-z] (i.e. not expect non-ASCII characters to be considered at all). So elaboration is necessary, anyway. I take the risk of confusing the 10 people that ever read UTS#18 :-)

----------
title: str.title() is overzealous by upcasing combining marks inappropriately -> str.title() is overzealous by upcasing combining marks inappropriately

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sat Oct 1 16:45:26 2011
From: report at bugs.python.org (Roundup Robot)
Date: Sat, 01 Oct 2011 14:45:26 +0000
Subject: [issue12804] make test should not enable the urlfetch resource
In-Reply-To: <1313935702.14.0.838210022398.issue12804@psf.upfronthosting.co.za>
Message-ID:

Roundup Robot added the comment:

New changeset 7fabd75a6ae4 by Antoine Pitrou in branch 'default':
Backout of changeset 228fd2bd83a5 by Nadeem Vawda in branch 'default':
http://hg.python.org/cpython/rev/7fabd75a6ae4

----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sat Oct 1 16:48:28 2011
From: report at bugs.python.org (Antoine Pitrou)
Date: Sat, 01 Oct 2011 14:48:28 +0000
Subject: [issue12804] make test should not enable the urlfetch resource
In-Reply-To: <1313935702.14.0.838210022398.issue12804@psf.upfronthosting.co.za>
Message-ID: <1317480508.44.0.991334430589.issue12804@psf.upfronthosting.co.za>

Antoine Pitrou added the comment:

Change reverted. "make test" should run a comprehensive test of Python's facilities, and that includes network facilities. We only exclude functionality where testing is hostile to the user (largefile,audio,gui). You could add "make offlinetest" if you care, though.

----------
components: +Tests
priority: normal -> low
resolution: fixed ->
stage: committed/rejected ->
type: -> behavior

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sat Oct 1 17:04:40 2011
From: report at bugs.python.org (Martin v. Löwis)
Date: Sat, 01 Oct 2011 15:04:40 +0000
Subject: [issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace
In-Reply-To: <1317414642.85.0.0881821462071.issue12753@psf.upfronthosting.co.za>
Message-ID: <4E872C06.8080506@v.loewis.de>

Martin v. Löwis added the comment:

> Does that sound fine?

Yes, that's fine as well.

----------
title: \N{...} neglects formal aliases and named sequences from Unicode charnames namespace -> \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sat Oct 1 17:17:52 2011
From: report at bugs.python.org (Martin v. Löwis)
Date: Sat, 01 Oct 2011 15:17:52 +0000
Subject: [issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace
In-Reply-To: <29624.1317420430@chthon>
Message-ID: <4E872F1E.6050604@v.loewis.de>

Martin v. Löwis added the comment:

> You may wish unicode.name() to return the alias in preference, however.

-1. .name() is documented (and users familiar with it expect it) as returning the name of the character from the UCD. It doesn't really matter much to me if it's non-sensical - it's just a label. Notice that many characters have names like "CJK UNIFIED IDEOGRAPH-4E20", which isn't very descriptive, either.
What does matter is that the name returned matches the same name in many other places on the net, which (rightfully) all use the UCD name (they might provide the alias as well if they are aware of aliases, but often don't).

> If you mean, is it ok to add just the aliases and not the named sequences to
> \N{}, it is certainly better than not doing so at all. Plus that way you do
> *not* have to figure out what in the world to do with [^a-c\N{sequence}],

Python doesn't use regexes in the language parser, but does do \N escapes in the parser. So there is no way this transformation could possibly be made - except when you are talking about escapes in regexes, and not escapes in Unicode strings.

> Perl does not provide the old 1.0 names at all. We don't have a Unicode
> 1.0 legacy to support, which makes this cleaner. However, we do provide
> for the names of the C0 and C1 Control Codes, because apart from Unicode
> 1.0, they don't condescend to name the ASCII or Latin1 control codes.

If there would be a reasonably official source for these names, and one that guarantees that there is no collision with UCD names, I could accept doing so for Python as well.

> We also provide for certain well known aliases from the Names file:
> anything that says "* commonly abbreviated as ...", so things like LRO
> and ZWJ and such.

-1. Readability counts, writability not so much (I know this is different for Perl :-). If there is too much aliasing, people will wonder what these codes actually mean.
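For reference, a small sketch of the unicodedata behaviour this reply relies on. It uses only unicodedata.name() and unicodedata.lookup() with ordinary UCD names; whether aliases or named sequences are accepted depends on the Python version and is deliberately not assumed here.

```python
import unicodedata

# name() returns the UCD name, even when it is not very descriptive.
ideograph = "\u4e20"
ideograph_name = unicodedata.name(ideograph)
print(ideograph_name)

# lookup() is the inverse operation for ordinary character names.
u_uml = unicodedata.lookup("LATIN SMALL LETTER U WITH DIAERESIS")
print("U+%04X" % ord(u_uml))

# Control characters have no UCD name, so name() needs a default here;
# without one it raises ValueError.
control_name = unicodedata.name("\n", "<no name>")
print(control_name)
```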
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 19:26:03 2011 From: report at bugs.python.org (Roundup Robot) Date: Sat, 01 Oct 2011 17:26:03 +0000 Subject: [issue13034] Python does not read Alternative Subject Names from SSL certificates larger than 1024 bits In-Reply-To: <1316778793.21.0.229691764484.issue13034@psf.upfronthosting.co.za> Message-ID: Roundup Robot added the comment: New changeset 65e7f40fefd4 by Antoine Pitrou in branch '3.2': Issue #13034: When decoding some SSL certificates, the subjectAltName extension could be unreported. http://hg.python.org/cpython/rev/65e7f40fefd4 New changeset 90a06fbb1f85 by Antoine Pitrou in branch 'default': Issue #13034: When decoding some SSL certificates, the subjectAltName extension could be unreported. http://hg.python.org/cpython/rev/90a06fbb1f85 ---------- nosy: +python-dev _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 19:34:39 2011 From: report at bugs.python.org (Roundup Robot) Date: Sat, 01 Oct 2011 17:34:39 +0000 Subject: [issue13034] Python does not read Alternative Subject Names from SSL certificates larger than 1024 bits In-Reply-To: <1316778793.21.0.229691764484.issue13034@psf.upfronthosting.co.za> Message-ID: Roundup Robot added the comment: New changeset 8e6694387c98 by Antoine Pitrou in branch '2.7': Issue #13034: When decoding some SSL certificates, the subjectAltName extension could be unreported. 
http://hg.python.org/cpython/rev/8e6694387c98

----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sat Oct 1 19:35:20 2011
From: report at bugs.python.org (Antoine Pitrou)
Date: Sat, 01 Oct 2011 17:35:20 +0000
Subject: [issue13034] Python does not read Alternative Subject Names from SSL certificates larger than 1024 bits
In-Reply-To: <1316778793.21.0.229691764484.issue13034@psf.upfronthosting.co.za>
Message-ID: <1317490520.91.0.406089580227.issue13034@psf.upfronthosting.co.za>

Antoine Pitrou added the comment:

This should be fixed now.

----------
resolution: -> fixed
stage: -> committed/rejected
status: open -> closed

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sat Oct 1 19:35:45 2011
From: report at bugs.python.org (Antoine Pitrou)
Date: Sat, 01 Oct 2011 17:35:45 +0000
Subject: [issue13034] Python does not read Alternative Subject Names from some SSL certificates
In-Reply-To: <1316778793.21.0.229691764484.issue13034@psf.upfronthosting.co.za>
Message-ID: <1317490545.78.0.75203820451.issue13034@psf.upfronthosting.co.za>

Antoine Pitrou added the comment:

(fixing the title)

----------
title: Python does not read Alternative Subject Names from SSL certificates larger than 1024 bits -> Python does not read Alternative Subject Names from some SSL certificates

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sat Oct 1 19:47:03 2011
From: report at bugs.python.org (Charles-François Natali)
Date: Sat, 01 Oct 2011 17:47:03 +0000
Subject: [issue13070] segmentation fault in pure-python multi-threaded server
In-Reply-To: <1317336165.27.0.207190934922.issue13070@psf.upfronthosting.co.za>
Message-ID: <1317491223.79.0.275640510789.issue13070@psf.upfronthosting.co.za>

Charles-François Natali added the comment:

> Shouldn't the test use "self.BufferedRWPair" instead of
> "io.BufferedRWPair"?

Yes.

> Also, is it ok to just return NULL or should the error state also be
> set?

Well, I'm not sure, that's why I made you and Amaury nosy :-)
AFAICT, this is the only case where _check_closed can encounter a NULL self->writer.
And this specific situation is not an error (nothing prevents the rwpair from being garbage collected before the textio), and _PyIOBase_finalize() explicitly clears any error returned:
"""
    /* If `closed` doesn't exist or can't be evaluated as bool, then the
       object is probably in an unusable state, so ignore. */
    res = PyObject_GetAttr(self, _PyIO_str_closed);
    if (res == NULL)
        PyErr_Clear();
    else {
        closed = PyObject_IsTrue(res);
        Py_DECREF(res);
        if (closed == -1)
            PyErr_Clear();
    }
"""

Furthermore, I'm not sure about what kind of error would make sense here.

----------
Added file: http://bugs.python.org/file23285/buffered_closed_gc-2.diff

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sat Oct 1 19:47:18 2011
From: report at bugs.python.org (Charles-François Natali)
Date: Sat, 01 Oct 2011 17:47:18 +0000
Subject: [issue13070] segmentation fault in pure-python multi-threaded server
In-Reply-To: <1317336165.27.0.207190934922.issue13070@psf.upfronthosting.co.za>
Message-ID: <1317491238.2.0.838011927947.issue13070@psf.upfronthosting.co.za>

Changes by Charles-François Natali:

Removed file: http://bugs.python.org/file23279/buffered_closed_gc-1.diff

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sat Oct 1 20:21:03 2011
From: report at bugs.python.org (Dan Kenigsberg)
Date: Sat, 01 Oct 2011 18:21:03 +0000
Subject: [issue4147] xml.dom.minidom toprettyxml: omit whitespace for text-only elements
In-Reply-To: <1224430404.43.0.0672391440841.issue4147@psf.upfronthosting.co.za>
Message-ID: <1317493263.8.0.90067564446.issue4147@psf.upfronthosting.co.za>

Dan Kenigsberg added the comment:

Here's another take on fixing this bug, with an accompanying unit test. Personally, I'm monkey-patching xml.dom.minidom in order to avoid it, but please consider fixing it properly upstream.

----------
Added file: http://bugs.python.org/file23286/minidom-Text-toprettyxml.patch

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sat Oct 1 21:18:11 2011
From: report at bugs.python.org (Stefan Krah)
Date: Sat, 01 Oct 2011 19:18:11 +0000
Subject: [issue6632] Include more fullwidth chars in the decimal codec
In-Reply-To: <1249317285.35.0.709481915004.issue6632@psf.upfronthosting.co.za>
Message-ID: <1317496691.08.0.0585901595676.issue6632@psf.upfronthosting.co.za>

Changes by Stefan Krah:

----------
nosy: +skrah

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sat Oct 1 22:49:15 2011
From: report at bugs.python.org (Roundup Robot)
Date: Sat, 01 Oct 2011 20:49:15 +0000
Subject: [issue4147] xml.dom.minidom toprettyxml: omit whitespace for text-only elements
In-Reply-To: <1224430404.43.0.0672391440841.issue4147@psf.upfronthosting.co.za>
Message-ID:

Roundup Robot added the comment:

New changeset 086ca132e161 by R David Murray in branch '3.2':
#4147: minidom's toprettyxml no longer adds whitespace to text nodes.
http://hg.python.org/cpython/rev/086ca132e161

New changeset fa0b1e50270f by R David Murray in branch 'default':
merge #4147: minidom's toprettyxml no longer adds whitespace to text nodes.
http://hg.python.org/cpython/rev/fa0b1e50270f ---------- nosy: +python-dev _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 22:49:38 2011 From: report at bugs.python.org (Roundup Robot) Date: Sat, 01 Oct 2011 20:49:38 +0000 Subject: [issue4147] xml.dom.minidom toprettyxml: omit whitespace for text-only elements In-Reply-To: <1224430404.43.0.0672391440841.issue4147@psf.upfronthosting.co.za> Message-ID: Roundup Robot added the comment: New changeset 406c5b69cb1b by R David Murray in branch '2.7': #4147: minidom's toprettyxml no longer adds whitespace to text nodes. http://hg.python.org/cpython/rev/406c5b69cb1b ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sat Oct 1 22:53:01 2011 From: report at bugs.python.org (R. David Murray) Date: Sat, 01 Oct 2011 20:53:01 +0000 Subject: [issue4147] xml.dom.minidom toprettyxml: omit whitespace for text-only elements In-Reply-To: <1224430404.43.0.0672391440841.issue4147@psf.upfronthosting.co.za> Message-ID: <1317502381.15.0.708404454341.issue4147@psf.upfronthosting.co.za> R. David Murray added the comment: This looks correct to me, and it tested out fine on the test suite (and the provided test failed without the provided fix), so I committed it. I have a small concern that the change in output might be a bit radical for a bug fix release, but it does seem to me that this is clearly a bug. If people think it shouldn't go in the bug fix releases let me know and I'll back it out. Thanks for the patch, Dan. 
----------
nosy: +r.david.murray
stage: test needed -> committed/rejected
status: open -> closed
type: feature request -> behavior
versions: +Python 2.7, Python 3.3

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sat Oct 1 22:59:19 2011
From: report at bugs.python.org (Stefan Krah)
Date: Sat, 01 Oct 2011 20:59:19 +0000
Subject: [issue10156] Initialization of globals in unicodeobject.c
In-Reply-To: <1287595627.82.0.548953621137.issue10156@psf.upfronthosting.co.za>
Message-ID: <1317502759.72.0.0483075027418.issue10156@psf.upfronthosting.co.za>

Stefan Krah added the comment:

The PEP-393 changes apparently fix this leak; at least I can't reproduce it in default any longer (but still in 3.2).

----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sun Oct 2 00:08:42 2011
From: report at bugs.python.org (STINNER Victor)
Date: Sat, 01 Oct 2011 22:08:42 +0000
Subject: [issue13081] Crash in Windows with unknown cause
In-Reply-To: <1317417665.11.0.474916583341.issue13081@psf.upfronthosting.co.za>
Message-ID: <1317506922.75.0.289810147628.issue13081@psf.upfronthosting.co.za>

STINNER Victor added the comment:

I suppose that the application uses extensions written in C and one of these extensions is buggy. Can you write a script to reproduce the bug without the application?
If not, we cannot help you :-( You may try the faulthandler to get more information: https://github.com/haypo/faulthandler/wiki ---------- nosy: +haypo _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sun Oct 2 00:32:49 2011 From: report at bugs.python.org (John O'Connor) Date: Sat, 01 Oct 2011 22:32:49 +0000 Subject: [issue13087] C BufferedReader seek() is inconsistent with UnsupportedOperation for unseekable streams Message-ID: <1317508369.4.0.585699022019.issue13087@psf.upfronthosting.co.za> New submission from John O'Connor : The C implementation of BufferedReader.seek() does not throw an UnsupportedOperation exception when its underlying stream is unseekable IF the current buffer can accommodate the seek in memory. It probably saves a few cycles for the seekable streams but, I think currently, it is inconsistent with the _pyio implementation and documentation. ---------- components: IO files: unseekable.patch keywords: patch messages: 144751 nosy: haypo, jcon, pitrou priority: normal severity: normal status: open title: C BufferedReader seek() is inconsistent with UnsupportedOperation for unseekable streams type: behavior versions: Python 3.2, Python 3.3 Added file: http://bugs.python.org/file23287/unseekable.patch _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sun Oct 2 00:41:33 2011 From: report at bugs.python.org (John O'Connor) Date: Sat, 01 Oct 2011 22:41:33 +0000 Subject: [issue12807] Optimization/refactoring for {bytearray, bytes, unicode}.strip() In-Reply-To: <1313981941.49.0.99491814096.issue12807@psf.upfronthosting.co.za> Message-ID: <1317508893.41.0.642593920744.issue12807@psf.upfronthosting.co.za> John O'Connor added the comment: The patch no longer applies cleanly. Is there enough interest in this to justify rebasing? 
----------
title: Optimizations for {bytearray,bytes,unicode}.strip() -> Optimization/refactoring for {bytearray,bytes,unicode}.strip()

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sun Oct 2 01:14:49 2011
From: report at bugs.python.org (Ned Deily)
Date: Sat, 01 Oct 2011 23:14:49 +0000
Subject: [issue13082] Can't open new window in python
In-Reply-To: <1317428678.05.0.613085535741.issue13082@psf.upfronthosting.co.za>
Message-ID: <1317510889.76.0.760958056683.issue13082@psf.upfronthosting.co.za>

Ned Deily added the comment:

From the symptoms you describe, you are almost certainly trying to use a version of IDLE with the Cocoa Tcl/Tk 8.5 supplied by Apple in Mac OS X 10.6. That version of Tcl/Tk is known to be buggy. If you installed a 64-bit/32-bit version of Python 3.2 using an installer downloaded from python.org, the download pages and the installer splash screen and IDLE itself all warn you to read the up-to-date information here:

http://www.python.org/download/mac/tcltk/

If, after installing a current version of ActiveTcl 8.5 or a 32-bit version of Python, the problem persists, please re-open with appropriate information about versions and how to reproduce.

----------
assignee: ronaldoussoren -> ned.deily
nosy: +ned.deily
resolution: -> out of date
status: open -> closed

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sun Oct 2 01:32:57 2011
From: report at bugs.python.org (STINNER Victor)
Date: Sat, 01 Oct 2011 23:32:57 +0000
Subject: [issue13088] Add Py_hexdigits constant: use one unique constant to format a digit to hexadecimal
Message-ID: <1317511977.11.0.738521421258.issue13088@psf.upfronthosting.co.za>

New submission from STINNER Victor:

CPython source code contains a lot of duplicate "0123456789abcdef" constants, declared as static variables. Attached patch uses one unique variable.
Also use Py_hexdigits instead of ((c>9) ? c+'a'-10 : c + '0') in the binascii, _hashopenssl, md5, sha1, sha256 and sha512 modules.

----------
files: hexdigits.patch
keywords: patch
messages: 144754
nosy: haypo
priority: normal
severity: normal
status: open
title: Add Py_hexdigits constant: use one unique constant to format a digit to hexadecimal
versions: Python 3.3
Added file: http://bugs.python.org/file23288/hexdigits.patch

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sun Oct 2 03:14:16 2011
From: report at bugs.python.org (Ezio Melotti)
Date: Sun, 02 Oct 2011 01:14:16 +0000
Subject: [issue4147] xml.dom.minidom toprettyxml: omit whitespace for text-only elements
In-Reply-To: <1224430404.43.0.0672391440841.issue4147@psf.upfronthosting.co.za>
Message-ID: <1317518056.81.0.194361185901.issue4147@psf.upfronthosting.co.za>

Ezio Melotti added the comment:

The patch seems wrong to me:

>>> d = minidom.parseString('AAABBBCCC')
>>> print(d.toprettyxml())
AAA BBB CCC

Even if the newlines are gone, the indentation before the closing tag is preserved. Also a newline is added before the text node BBB.

It would be good to check what the XML standard says about the whitespace. I'm pretty sure HTML has well defined rules about it, but I don't know if that's the same for XML.

FWIW the link in msg102247 contains a different fix (not sure if it's any better), and also a link to an article about XML and whitespace: http://www.oracle.com/technetwork/articles/wang-whitespace-092897.html (the link seems broken in the page).
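One way to check how a given interpreter treats whitespace around text nodes is a sketch like the following. The `SAMPLE` document is a hypothetical stand-in (it is not the input from the report above), and the result depends on the Python version: interpreters without the fix put the text of a text-only element on a line of its own.

```python
from xml.dom import minidom

# Hypothetical sample: one text-only element and one element with a child.
SAMPLE = "<root><a>text</a><b><c>more</c></b></root>"

doc = minidom.parseString(SAMPLE)
pretty = doc.toprettyxml(indent="  ")
print(pretty)

# True when the text-only element was written without added whitespace.
text_only_kept_inline = "<a>text</a>" in pretty
print(text_only_kept_inline)
```

Mixed content (text interleaved with child elements) is the harder case raised in this comment, and the sketch does not try to settle what the right output for it should be.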
----------
nosy: +ezio.melotti -BreamoreBoy
stage: committed/rejected -> test needed
status: closed -> open

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sun Oct 2 05:29:04 2011
From: report at bugs.python.org (Mike Hoy)
Date: Sun, 02 Oct 2011 03:29:04 +0000
Subject: [issue13075] PEP-0001 contains dead links
In-Reply-To: <1317374485.54.0.239839552518.issue13075@psf.upfronthosting.co.za>
Message-ID: <1317526144.22.0.878821995463.issue13075@psf.upfronthosting.co.za>

Mike Hoy added the comment:

Added links under Resources to the new Dev Guide. Added a link to the Guide itself and a link to the FAQ.

----------
keywords: +patch
Added file: http://bugs.python.org/file23289/pep-0001-broken-links.diff

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sun Oct 2 07:33:41 2011
From: report at bugs.python.org (Tom Christiansen)
Date: Sun, 02 Oct 2011 05:33:41 +0000
Subject: [issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace
In-Reply-To: <4E872F1E.6050604@v.loewis.de>
Message-ID: <6829.1317533598@chthon>

Tom Christiansen added the comment:

>> Perl does not provide the old 1.0 names at all. We don't have a Unicode
>> 1.0 legacy to support, which makes this cleaner. However, we do provide
>> for the names of the C0 and C1 Control Codes, because apart from Unicode
>> 1.0, they don't condescend to name the ASCII or Latin1 control codes.

> If there would be a reasonably official source for these names, and one
> that guarantees that there is no collision with UCD names, I could
> accept doing so for Python as well.

The C0 and C1 control code names don't change. There is/was one stability issue where they screwed up, because they ended up having a UAX (required) and a UTS (not required) fighting because of the dumb stuff they did with the Emoji names.
They neglected to prefix them with "Emoji ..." or some such, the way things like "GREEK ... LETTER ..." or "MATHEMATICAL ..." or "MUSICAL ..." did. The problem is they stole BELL without calling it EMOJI BELL. This is the C0 name for Control-G. Dimwits.

The problem with official names is that they have things in them that you would not expect in names. Do you really and truly mean to tell me you think it is somehow **good** that people are forced to write

    \N{LINE FEED (LF)}

rather than the more obvious pair of

    \N{LINE FEED}
    \N{LF}

?? If so, then I don't understand that. Nobody in their right mind prefers "\N{LINE FEED (LF)}" over "\N{LINE FEED}" -- do they?

    % perl -Mcharnames=:full -le 'printf "U+%04X\n", ord "\N{LINE FEED}"'
    U+000A
    % perl -Mcharnames=:full -le 'printf "U+%04X\n", ord "\N{LF}"'
    U+000A
    % perl -Mcharnames=:full -le 'printf "U+%04X\n", ord "\N{LINE FEED (LF)}"'
    U+000A
    % perl -Mcharnames=:full -le 'printf "U+%04X\n", ord "\N{NEXT LINE}"'
    U+0085
    % perl -Mcharnames=:full -le 'printf "U+%04X\n", ord "\N{NEL}"'
    U+0085
    % perl -Mcharnames=:full -le 'printf "U+%04X\n", ord "\N{NEXT LINE (NEL)}"'
    U+0085

>> We also provide for certain well known aliases from the Names file:
>> anything that says "* commonly abbreviated as ...", so things like LRO
>> and ZWJ and such.

> -1. Readability counts, writability not so much (I know this is
> different for Perl :-).

I actually very strongly resent and rebuff that entire mindset in the most extreme way possible. Well-written Perl code is perfectly readable by people who speak that language. If you find Perl code that isn't readable, it is by definition not well-written.

*PLEASE* don't start. Yes, I just got done driving 16 hours and am overtired, but it's something I've been fighting against all of my professional career. It's a "leyenda negra".

> If there is too much aliasing, people will
> wonder what these codes actually mean.

There are 15 "commonly abbreviated as" aliases in the Names.txt file.
    * commonly abbreviated as NBSP
    * commonly abbreviated as SHY
    * commonly abbreviated as CGJ
    * commonly abbreviated ZWSP
    * commonly abbreviated ZWNJ
    * commonly abbreviated ZWJ
    * commonly abbreviated LRM
    * commonly abbreviated RLM
    * commonly abbreviated LRE
    * commonly abbreviated RLE
    * commonly abbreviated PDF
    * commonly abbreviated LRO
    * commonly abbreviated RLO
    * commonly abbreviated NNBSP
    * commonly abbreviated WJ

All of the standards documents *talk* about things like LRO and ZWNJ. I guess the standards aren't "readable" then, right? :)

From the charnames manpage, which shows that we really don't just make these up as we feel like (although we could; see below). They're all from this or that standard:

  ALIASES
    A few aliases have been defined for convenience: instead of having to
    use the official names

        LINE FEED (LF)
        FORM FEED (FF)
        CARRIAGE RETURN (CR)
        NEXT LINE (NEL)

    (yes, with parentheses), one can use

        LINE FEED
        FORM FEED
        CARRIAGE RETURN
        NEXT LINE
        LF
        FF
        CR
        NEL

    All the other standard abbreviations for the controls, such as "ACK"
    for "ACKNOWLEDGE" also can be used.

    One can also use

        BYTE ORDER MARK
        BOM

    and these abbreviations

        Abbreviation    Full Name

        CGJ             COMBINING GRAPHEME JOINER
        FVS1            MONGOLIAN FREE VARIATION SELECTOR ONE
        FVS2            MONGOLIAN FREE VARIATION SELECTOR TWO
        FVS3            MONGOLIAN FREE VARIATION SELECTOR THREE
        LRE             LEFT-TO-RIGHT EMBEDDING
        LRM             LEFT-TO-RIGHT MARK
        LRO             LEFT-TO-RIGHT OVERRIDE
        MMSP            MEDIUM MATHEMATICAL SPACE
        MVS             MONGOLIAN VOWEL SEPARATOR
        NBSP            NO-BREAK SPACE
        NNBSP           NARROW NO-BREAK SPACE
        PDF             POP DIRECTIONAL FORMATTING
        RLE             RIGHT-TO-LEFT EMBEDDING
        RLM             RIGHT-TO-LEFT MARK
        RLO             RIGHT-TO-LEFT OVERRIDE
        SHY             SOFT HYPHEN
        VS1             VARIATION SELECTOR-1
        .
        .
        .
        VS256           VARIATION SELECTOR-256
        WJ              WORD JOINER
        ZWJ             ZERO WIDTH JOINER
        ZWNJ            ZERO WIDTH NON-JOINER
        ZWSP            ZERO WIDTH SPACE

    For backward compatibility one can use the old names for certain C0
    and C1 controls

        old                                       new

        FILE SEPARATOR                            INFORMATION SEPARATOR FOUR
        GROUP SEPARATOR                           INFORMATION SEPARATOR THREE
        HORIZONTAL TABULATION                     CHARACTER TABULATION
        HORIZONTAL TABULATION SET                 CHARACTER TABULATION SET
        HORIZONTAL TABULATION WITH JUSTIFICATION  CHARACTER TABULATION WITH JUSTIFICATION
        PARTIAL LINE DOWN                         PARTIAL LINE FORWARD
        PARTIAL LINE UP                           PARTIAL LINE BACKWARD
        RECORD SEPARATOR                          INFORMATION SEPARATOR TWO
        REVERSE INDEX                             REVERSE LINE FEED
        UNIT SEPARATOR                            INFORMATION SEPARATOR ONE
        VERTICAL TABULATION                       LINE TABULATION
        VERTICAL TABULATION SET                   LINE TABULATION SET

    but the old names in addition to giving the character will also give a
    warning about being deprecated.

    And finally, certain published variants are usable, including some for
    controls that have no Unicode names:

        name                                      character

        END OF PROTECTED AREA                     END OF GUARDED AREA, U+0097
        HIGH OCTET PRESET                         U+0081
        HOP                                       U+0081
        IND                                       U+0084
        INDEX                                     U+0084
        PAD                                       U+0080
        PADDING CHARACTER                         U+0080
        PRIVATE USE 1                             PRIVATE USE ONE, U+0091
        PRIVATE USE 2                             PRIVATE USE TWO, U+0092
        SGC                                       U+0099
        SINGLE GRAPHIC CHARACTER INTRODUCER       U+0099
        SINGLE-SHIFT 2                            SINGLE SHIFT TWO, U+008E
        SINGLE-SHIFT 3                            SINGLE SHIFT THREE, U+008F
        START OF PROTECTED AREA                   START OF GUARDED AREA, U+0096

  perl v5.14.0                 2011-05-07                 2

Those are the defaults. They are overridable. That's because we feel that people should be able to name their character constants however they feel makes sense for them.
If they get tired of typing

    \N{LATIN SMALL LETTER U WITH DIAERESIS}

let alone

    \N{LATIN CAPITAL LETTER THORN WITH STROKE THROUGH DESCENDER}

then they can define something shorter, because there is a mechanism
for making aliases:

    use charnames ":full", ":alias" => {
        U_uml => "LATIN CAPITAL LETTER U WITH DIAERESIS",
        u_uml => "LATIN SMALL LETTER U WITH DIAERESIS",
    };

That way you can do

    s/\N{U_uml}/UE/;
    s/\N{u_uml}/ue/;

This is probably not as persuasive as the private-use case described
below.

It is important to remember that all charname bindings in Perl are
attached to a *lexically-scoped* declaration.  It is completely
constrained to operate only within that lexical scope.  That's why the
compiler replaces the names in things like

    use charnames ":full", ":alias" => {
        U_uml => "LATIN CAPITAL LETTER U WITH DIAERESIS",
        u_uml => "LATIN SMALL LETTER U WITH DIAERESIS",
    };

    my $find_u_uml = qr/\N{u_uml}/i;

    print "Search pattern is: $find_u_uml\n";

Which dutifully prints out:

    Search pattern is: (?^ui:\N{U+FC})

So charname bindings are never "hard to read" because the effect is
completely lexically constrained, and can never leak outside of the
scope.  I realize (or at least, believe) that Python has no notion of
nested lexical scopes, and like many things, this sort of thing can
therefore never work there because of that.

The most persuasive use-case for user-defined names is for private-use
area code points.  These will never have an official name.  But it is
just fine to use them.  Don't they deserve a better name, one that
makes sense within your own program that uses them?  Of course they do.

For example, Apple has a bunch of private-use glyphs they use all the
time.  In the 8-bit MacRoman encoding, the byte 0xF0 represents the
Apple corporate logo/glyph thingie of an apple with a bite taken out of
it.  (Microsoft also has a bunch of these.)  If you upgrade MacRoman to
Unicode, you will find that that 0xF0 maps to code point U+F8FF using
the regular converter.

Now what are you supposed to do in your program when you want a named
character there?  You certainly do not want to make users put an opaque
magic number as a Unicode escape.  That is always really lame, because
the whole reason we have \N{...} escapes is so we don't have to put
mysterious unreadable magic numbers in our code!!

So all you do is

    use charnames ":alias" => {
        "APPLE LOGO" => 0xF8FF,
    };

and now you can use \N{APPLE LOGO} anywhere within that lexical scope.
The compiler will dutifully resolve it to U+F8FF, since all name lookups
happen at compile-time.  And it cannot leak out of the scope.

I assert that this facility makes your program more readable, and its
absence makes your program less readable.

Private use characters are important in Asian texts, but they are also
important for other things.  For example, Unicode intends to get around
to allocating Tengwar up in the SMP.  However, lots of stupid old code
can't use full Unicode, being constrained to UCS-2 only.  So many
Tengwar fonts start at a different base, and put it in the private use
area instead of the SMP.  Here are two constants:

    use constant {
        TB_CONSCRIPT_UNICODE_REGISTRY => 0x00_E000,  # private use
        TB_UNICODE_CONSORTIIUM        => 0x01_6080,  # where it will really go
    };

I have an entire Tengwar module that makes heavy use of named
private-use characters.  All I do is this:

    use constant TENGWAR_BASE => TB_CONSCRIPT_UNICODE_REGISTRY;

    use charnames ":alias" => {
        reverse (
            (TENGWAR_BASE + 0x00) => "TENGWAR LETTER TINCO",
            (TENGWAR_BASE + 0x01) => "TENGWAR LETTER PARMA",
            (TENGWAR_BASE + 0x02) => "TENGWAR LETTER CALMA",
            (TENGWAR_BASE + 0x03) => "TENGWAR LETTER QUESSE",
            (TENGWAR_BASE + 0x04) => "TENGWAR LETTER ANDO",
            ....
        )
    };

Now you can write \N{TENGWAR LETTER TINCO} etc.  See how slick that is?
Consider the alternative.  Magic numbers.  Worse, magic numbers with
funny calculations in them.
That is just so wrong that it completely justifies letting people name things how they want to, so long as they don't make other people do the same. What people do in the privacy of their own lexical scope is their own business. It gets better. Perl lets you define your character properties, too. Therefore I can write things like \p{Is_Tengwar_Decimal} and such. Right now I have these properties: In_Tengwar, Is_Tengwar In_Tengwar_Alphanumerics In_Tengwar_Consonants, In_Tengwar_Vowels, In_Tengwar_Alphabetics In_Tengwar_Numerals, Is_Tengwar_Decimal, Is_Tengwar_Duodecimal In_Tengwar_Punctuation In_Tengwar_Marks So I have code in my Tengwar module that does stuff like this, using my own named characters (which again, are compile-time resolved and work only within this lexical scope): chr( $1 + ord("\N{TENGWAR DIGIT ZERO}") ) Not to mention this using my own properties: $TENGWAR_GRAPHEME_RX = qr/(?:(?=\p{In_Tengwar})\P{In_Tengwar_Marks}\p{In_Tengwar_Marks}*)|\p{In_Tengwar_Marks}/x; Actually, I'm fibbing. I *never* write regexes all on one line like that: they are abhorrent to me. The pattern really looks like this in the code: $TENGWAR_GRAPHEME_RX = qr{ (?: (?= \p{In_Tengwar} ) \P{In_Tengwar_Marks} # Either one basechar... \p{In_Tengwar_Marks} * # ... plus 0 or more marks ) | \p{In_Tengwar_Marks} # or else a naked unpaired mark. }x; People who write patterns without whitespace for cognitive chunking (plus comments for explanation) are wicked wicked wicked. Frankly I'm surprised Python doesn't require it. :)/2 Anyway, do you see how much better that is than opaque unreadable magic numbers? Can you just imagine the sheer horror of writing that sort of code without the ability to define your own named characters *and* your own character properties? It's beautiful, simple, clean, and readable. I'll even go so far as to call it intuitive. No, I don't expect Python to do this sort of thing. You don't have proper scoping, so you can't ever do it cleanly the way Perl can. 
I just wanted to give a concrete example where flexibility leads to a
much more readable program than inflexibility ever can.

--tom

    "We hates magic numberses.  We hates them forevers!"
        --Sméagol the Hacker

----------
title: \N{...} neglects formal aliases and named sequences from Unicode charnames namespace -> \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Sun Oct 2 08:46:26 2011
From: report at bugs.python.org (Ezio Melotti)
Date: Sun, 02 Oct 2011 06:46:26 +0000
Subject: [issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace
In-Reply-To: <1313430514.3.0.983525514499.issue12753@psf.upfronthosting.co.za>
Message-ID: <1317537986.58.0.797304932484.issue12753@psf.upfronthosting.co.za>

Ezio Melotti added the comment:

> The problem with official names is that they have things in them that
> you are not expected in names.  Do you really and truly mean to tell
> me you think it is somehow **good** that people are forced to write
>     \N{LINE FEED (LF)}
> Rather than the more obvious pair of
>     \N{LINE FEED}
>     \N{LF}
> ??

Actually Python doesn't seem to support \N{LINE FEED (LF)}, most likely
because that's a Unicode 1 name, and nowadays these codepoints are
simply marked as '<control>'.

> If so, then I don't understand that.  Nobody in their right
> mind prefers "\N{LINE FEED (LF)}" over "\N{LINE FEED}" -- do they?

They probably don't, but they just write \n anyway.  I don't think we
need to support any of these aliases, especially if they are not defined
in the Unicode standard.

I'm also not sure humans use \N{...}: you don't want to write
'R\N{LATIN SMALL LETTER E WITH ACUTE}sum\N{LATIN SMALL LETTER E WITH ACUTE}'
and you would need to look up the exact name somewhere anyway before
using it (unless you know them by heart).
If 'R\xe9sum\xe9' or 'R\u00e9sum\u00e9' are too obscure and/or magic,
you can always print() them and get 'Résumé' (or just write 'Résumé'
directly in the source).

> All of the standards documents *talk* about things like LRO and ZWNJ.
> I guess the standards aren't "readable" then, right? :)

Right, I had to read down till the table with the meanings before
figuring out what they were (and I already forgot it).

> The most persuasive use-case for user-defined names is for private-use
> area code points.  These will never have an official name.  But it is
> just fine to use them.  Don't they deserve a better name, one that
> makes sense within your own program that uses them?  Of course they do.
>
> For example, Apple has a bunch of private-use glyphs they use all the time.
> In the 8-bit MacRoman encoding, the byte 0xF0 represents the Apple corporate
> logo/glyph thingie of an apple with a bite taken out of it.  (Microsoft
> also has a bunch of these.)  If you upgrade MacRoman to Unicode, you will
> find that that 0xF0 maps to code point U+F8FF using the regular converter.
>
> Now what are you supposed to do in your program when you want a named character
> there?  You certainly do not want to make users put an opaque magic number
> as a Unicode escape.  That is always really lame, because the whole reason
> we have \N{...} escapes is so we don't have to put mysterious unreadable magic
> numbers in our code!!
>
> So all you do is
>     use charnames ":alias" => {
>         "APPLE LOGO" => 0xF8FF,
>     };
>
> and now you can use \N{APPLE LOGO} anywhere within that lexical scope.  The
> compiler will dutifully resolve it to U+F8FF, since all name lookups happen
> at compile-time.  And it cannot leak out of the scope.

This is actually a good use case for \N{..}.
One way to solve that problem is doing:

    apples = {
        'APPLE': '\uF8FF',
        'GREEN APPLE': '\U0001F34F',
        'RED APPLE': '\U0001F34E',
    }

and then:

    print('I like {GREEN APPLE} and {RED APPLE}, but not {APPLE}.'.format(**apples))

This requires the format call for each string and it's a workaround,
but at least it is readable (I hope you don't have too many apples in
your strings).

I guess we could add some way to define a global list of names, and
that would probably be enough for most applications.  Making it
per-module would be more complicated and maybe not too elegant.

> People who write patterns without whitespace for cognitive chunking (plus
> comments for explanation) are wicked wicked wicked.  Frankly I'm surprised
> Python doesn't require it. :)/2

I actually find those *less* readable.  If there's something fancy in
the regex, a comment *before* it is welcomed, but having to read a regex
divided on several lines and remove meaningless whitespace and redundant
comments just makes the parsing more difficult for me.

----------

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Sun Oct 2 09:32:21 2011
From: report at bugs.python.org (Ben Hayden)
Date: Sun, 02 Oct 2011 07:32:21 +0000
Subject: [issue13073] message_body argument of HTTPConnection.endheaders is undocumented
In-Reply-To: <1317370143.22.0.297306447289.issue13073@psf.upfronthosting.co.za>
Message-ID: <1317540741.36.0.628580760096.issue13073@psf.upfronthosting.co.za>

Ben Hayden added the comment:

I added in docs for the method from the actual method docstring from the
http.client module.
---------- keywords: +patch nosy: +beardedp Added file: http://bugs.python.org/file23290/issue13073.patch _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sun Oct 2 09:34:21 2011 From: report at bugs.python.org (Ezio Melotti) Date: Sun, 02 Oct 2011 07:34:21 +0000 Subject: [issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace In-Reply-To: <1313430514.3.0.983525514499.issue12753@psf.upfronthosting.co.za> Message-ID: <1317540861.82.0.203778046548.issue12753@psf.upfronthosting.co.za> Ezio Melotti added the comment: Attached a new patch with more tests and doc. ---------- Added file: http://bugs.python.org/file23291/issue12753-3.diff _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sun Oct 2 10:18:23 2011 From: report at bugs.python.org (Ezio Melotti) Date: Sun, 02 Oct 2011 08:18:23 +0000 Subject: [issue13076] Bad links to 'time' in datetime documentation In-Reply-To: <1317390308.93.0.874341309843.issue13076@psf.upfronthosting.co.za> Message-ID: <1317543503.59.0.981181094898.issue13076@psf.upfronthosting.co.za> Ezio Melotti added the comment: The broken links seem to be only in the "time objects" section, and only in the body of attribute/method directives. The attached patch fixes the issue by using :class:`~datetime.time` explicitly where the links are broken. Georg, is this a bug in Sphinx? 
---------- assignee: docs at python -> ezio.melotti keywords: +patch nosy: +ezio.melotti, georg.brandl stage: needs patch -> patch review Added file: http://bugs.python.org/file23292/issue13076.diff _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sun Oct 2 10:23:18 2011 From: report at bugs.python.org (Georg Brandl) Date: Sun, 02 Oct 2011 08:23:18 +0000 Subject: [issue13076] Bad links to 'time' in datetime documentation In-Reply-To: <1317390308.93.0.874341309843.issue13076@psf.upfronthosting.co.za> Message-ID: <1317543798.27.0.509906321551.issue13076@psf.upfronthosting.co.za> Georg Brandl added the comment: No, it's not, it's how Sphinx works. Use :class:`.time` to refer to the datetime class. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sun Oct 2 10:50:16 2011 From: report at bugs.python.org (Stefan Krah) Date: Sun, 02 Oct 2011 08:50:16 +0000 Subject: [issue13089] parsetok.c: memory leak Message-ID: <1317545416.55.0.717898890924.issue13089@psf.upfronthosting.co.za> New submission from Stefan Krah : Seen in test_mailbox: ==31621== 6 bytes in 2 blocks are definitely lost in loss record 27 of 10,370 ==31621== at 0x4C2154B: malloc (vg_replace_malloc.c:236) ==31621== by 0x5271A5: parsetok (parsetok.c:179) ==31621== by 0x526E8A: PyParser_ParseStringFlagsFilenameEx (parsetok.c:67) ==31621== by 0x4BC385: PyParser_ASTFromString (pythonrun.c:1898) ==31621== by 0x4BC1E1: Py_CompileStringExFlags (pythonrun.c:1842) ==31621== by 0x478AB8: builtin_compile (bltinmodule.c:626) ==31621== by 0x5759F3: PyCFunction_Call (methodobject.c:84) ==31621== by 0x48F2CF: ext_do_call (ceval.c:4292) ==31621== by 0x489992: PyEval_EvalFrameEx (ceval.c:2646) ==31621== by 0x48E67B: fast_function (ceval.c:4068) ==31621== by 0x48E3C7: call_function (ceval.c:4001) ==31621== by 0x4895E1: PyEval_EvalFrameEx (ceval.c:2605) ---------- 
components: Interpreter Core messages: 144763 nosy: skrah priority: normal severity: normal stage: needs patch status: open title: parsetok.c: memory leak type: resource usage versions: Python 3.3 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sun Oct 2 11:10:29 2011 From: report at bugs.python.org (Stefan Krah) Date: Sun, 02 Oct 2011 09:10:29 +0000 Subject: [issue13090] posix_read: memory leak Message-ID: <1317546629.79.0.861373881282.issue13090@psf.upfronthosting.co.za> New submission from Stefan Krah : Seen in test_multiprocessing: ==31662== 37 bytes in 1 blocks are definitely lost in loss record 629 of 10,548 ==31662== at 0x4C2154B: malloc (vg_replace_malloc.c:236) ==31662== by 0x53BBE9: PyBytes_FromStringAndSize (bytesobject.c:98) ==31662== by 0x4E2363: posix_read (posixmodule.c:6980) ==31662== by 0x5759D8: PyCFunction_Call (methodobject.c:81) ==31662== by 0x48E294: call_function (ceval.c:3980) ==31662== by 0x4895E1: PyEval_EvalFrameEx (ceval.c:2605) ==31662== by 0x48C54F: PyEval_EvalCodeEx (ceval.c:3355) ==31662== by 0x48E786: fast_function (ceval.c:4078) ==31662== by 0x48E3C7: call_function (ceval.c:4001) ==31662== by 0x4895E1: PyEval_EvalFrameEx (ceval.c:2605) ==31662== by 0x48C54F: PyEval_EvalCodeEx (ceval.c:3355) ==31662== by 0x48E786: fast_function (ceval.c:4078) ---------- components: Extension Modules messages: 144764 nosy: skrah priority: normal severity: normal stage: needs patch status: open title: posix_read: memory leak type: resource usage versions: Python 3.3 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sun Oct 2 11:11:58 2011 From: report at bugs.python.org (Stefan Krah) Date: Sun, 02 Oct 2011 09:11:58 +0000 Subject: [issue13091] ctypes: memory leak Message-ID: <1317546718.25.0.742978918789.issue13091@psf.upfronthosting.co.za> New submission from Stefan Krah : Seen in 
test_multiprocessing: ==31662== 44 bytes in 1 blocks are definitely lost in loss record 687 of 10,548 ==31662== at 0x4C2154B: malloc (vg_replace_malloc.c:236) ==31662== by 0x41CC27: PyMem_Malloc (object.c:1699) ==31662== by 0x127D9F51: resize (callproc.c:1664) ==31662== by 0x5759D8: PyCFunction_Call (methodobject.c:81) ==31662== by 0x48E294: call_function (ceval.c:3980) ==31662== by 0x4895E1: PyEval_EvalFrameEx (ceval.c:2605) ==31662== by 0x48E67B: fast_function (ceval.c:4068) ==31662== by 0x48E3C7: call_function (ceval.c:4001) ==31662== by 0x4895E1: PyEval_EvalFrameEx (ceval.c:2605) ==31662== by 0x48C54F: PyEval_EvalCodeEx (ceval.c:3355) ==31662== by 0x48E786: fast_function (ceval.c:4078) ==31662== by 0x48E3C7: call_function (ceval.c:4001) ---------- components: Extension Modules messages: 144765 nosy: skrah priority: normal severity: normal stage: needs patch status: open title: ctypes: memory leak type: resource usage versions: Python 3.3 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sun Oct 2 11:47:24 2011 From: report at bugs.python.org (Roundup Robot) Date: Sun, 02 Oct 2011 09:47:24 +0000 Subject: [issue13076] Bad links to 'time' in datetime documentation In-Reply-To: <1317390308.93.0.874341309843.issue13076@psf.upfronthosting.co.za> Message-ID: Roundup Robot added the comment: New changeset 854e31d80151 by Ezio Melotti in branch '2.7': #13076: fix links to datetime.time. http://hg.python.org/cpython/rev/854e31d80151 New changeset 95689ed69097 by Ezio Melotti in branch '3.2': #13076: fix links to datetime.time and datetime.datetime. http://hg.python.org/cpython/rev/95689ed69097 New changeset 175cd2a51ea9 by Ezio Melotti in branch 'default': #13076: merge with 3.2. 
http://hg.python.org/cpython/rev/175cd2a51ea9 ---------- nosy: +python-dev _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sun Oct 2 11:49:05 2011 From: report at bugs.python.org (Stefan Krah) Date: Sun, 02 Oct 2011 09:49:05 +0000 Subject: [issue13092] pep-393: memory leaks #2 Message-ID: <1317548945.05.0.236513746453.issue13092@psf.upfronthosting.co.za> New submission from Stefan Krah : I found a couple of additional leaks related to the PEP-393 changes. ---------- components: Interpreter Core files: pep-393-leaks-2.diff keywords: patch messages: 144767 nosy: loewis, skrah priority: normal severity: normal stage: patch review status: open title: pep-393: memory leaks #2 type: resource usage versions: Python 3.3 Added file: http://bugs.python.org/file23293/pep-393-leaks-2.diff _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sun Oct 2 11:49:51 2011 From: report at bugs.python.org (Ezio Melotti) Date: Sun, 02 Oct 2011 09:49:51 +0000 Subject: [issue13076] Bad links to 'time' in datetime documentation In-Reply-To: <1317390308.93.0.874341309843.issue13076@psf.upfronthosting.co.za> Message-ID: <1317548991.65.0.778030714562.issue13076@psf.upfronthosting.co.za> Ezio Melotti added the comment: This should be fixed now, thanks for the report. FTR with Sphinx 1.0 all the links to :class:`time` and also :class:`datetime` needed to be fixed because they were pointing to the modules, with 0.6 only the :class:`time` in the body of attribute/method directives were broken. 
---------- resolution: -> fixed stage: patch review -> committed/rejected status: open -> closed _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sun Oct 2 11:59:54 2011 From: report at bugs.python.org (Stefan Krah) Date: Sun, 02 Oct 2011 09:59:54 +0000 Subject: [issue13084] test_signal failure In-Reply-To: <1317462503.63.0.0518543153457.issue13084@psf.upfronthosting.co.za> Message-ID: <1317549594.77.0.441464525613.issue13084@psf.upfronthosting.co.za> Stefan Krah added the comment: Patch looks good to me (and it fixes the problem). ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sun Oct 2 12:51:43 2011 From: report at bugs.python.org (Lance Hepler) Date: Sun, 02 Oct 2011 10:51:43 +0000 Subject: [issue7689] Pickling of classes with a metaclass and copy_reg In-Reply-To: <1263375134.71.0.434114641669.issue7689@psf.upfronthosting.co.za> Message-ID: <1317552703.0.0.59446883722.issue7689@psf.upfronthosting.co.za> Changes by Lance Hepler : ---------- nosy: +nlhepler _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sun Oct 2 14:03:19 2011 From: report at bugs.python.org (Dan Kenigsberg) Date: Sun, 02 Oct 2011 12:03:19 +0000 Subject: [issue4147] xml.dom.minidom toprettyxml: omit whitespace for text-only elements In-Reply-To: <1224430404.43.0.0672391440841.issue4147@psf.upfronthosting.co.za> Message-ID: <1317556999.25.0.183199780575.issue4147@psf.upfronthosting.co.za> Dan Kenigsberg added the comment: Oh dear. Thanks, Enzio, for pointing out that former patch is wrong. It is also quite naive, since the whole NATURE of toprettyprint() is to add whitespace to Text nodes. Tomas Lee's http://bugs.python.org/file11832/minidom-toprettyxml-01.patch made an effort to touch only "simple" Text nodes, that are confined within a single . 
I did not expect
http://bugs.python.org/file23286/minidom-Text-toprettyxml.patch to get
in so quickly, after the former one spent several years in the queue.
However, now is the time to fix it, possibly with my second patch.

----------
Added file: http://bugs.python.org/file23294/minidom-Text-toprettyxml-02.patch

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Sun Oct 2 14:58:00 2011
From: report at bugs.python.org (Dan Kenigsberg)
Date: Sun, 02 Oct 2011 12:58:00 +0000
Subject: [issue4147] xml.dom.minidom toprettyxml: omit whitespace for text-only elements
In-Reply-To: <1224430404.43.0.0672391440841.issue4147@psf.upfronthosting.co.za>
Message-ID: <1317560280.44.0.350953952281.issue4147@psf.upfronthosting.co.za>

Dan Kenigsberg added the comment:

btw, http://www.w3.org/TR/xml/#sec-white-space is a bit vague on how a
parser should deal with whitespace, and seems to allow non-preservation
of text nodes.  Preserving "simple" text nodes is allowed, too, and is
more polite to applications reading the prettyxml.

----------

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Sun Oct 2 15:07:53 2011
From: report at bugs.python.org (Éric Araujo)
Date: Sun, 02 Oct 2011 13:07:53 +0000
Subject: [issue12804] make test should not enable the urlfetch resource
In-Reply-To: <1313935702.14.0.838210022398.issue12804@psf.upfronthosting.co.za>
Message-ID: <1317560873.33.0.874228604028.issue12804@psf.upfronthosting.co.za>

Éric Araujo added the comment:

I don’t have a flaky connection, I have none at all; until this change
I could always just use “make test” for all Python versions.

OTOH, I agree with your point that testing networking facilities in the
standard test suite makes sense, as most people probably have network
access.  If it is easy to detect network availability programmatically,
we could just use the skip system.
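
A minimal sketch of what "detect network availability and skip" could
look like, assuming a plain TCP connection attempt is an acceptable
probe.  The names network_available and requires_network, and the
default host, are made up for illustration; they are not part of
test.support.

```python
import functools
import socket
import unittest


def network_available(host="www.python.org", port=80, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper: the default host is only an example probe target.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


def requires_network(host="www.python.org", port=80, timeout=3.0):
    """Decorator raising SkipTest when the probe connection fails.

    The probe runs when the test is called, not when the module is imported.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kw):
            if not network_available(host, port, timeout):
                raise unittest.SkipTest(
                    "network access to %s:%d required" % (host, port))
            return func(*args, **kw)
        return wrapper
    return decorator
```

A test method would then be marked with @requires_network() and would
be reported as skipped, rather than failed, on a machine with no
connection at all.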

----------

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Sun Oct 2 16:14:32 2011
From: report at bugs.python.org (Éric Araujo)
Date: Sun, 02 Oct 2011 14:14:32 +0000
Subject: [issue3902] Packages containing only extension modules have to contain __init__.py
In-Reply-To: <1221761506.23.0.180859211428.issue3902@psf.upfronthosting.co.za>
Message-ID: <1317564872.31.0.584948185524.issue3902@psf.upfronthosting.co.za>

Éric Araujo added the comment:

We’re working on a patch on the core-mentorship list.

----------
components: +Distutils
nosy: +alexis
stage: needs patch -> patch review
versions: +Python 2.7, Python 3.2, Python 3.3

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Sun Oct 2 17:11:20 2011
From: report at bugs.python.org (Meador Inge)
Date: Sun, 02 Oct 2011 15:11:20 +0000
Subject: [issue13089] parsetok.c: memory leak
In-Reply-To: <1317545416.55.0.717898890924.issue13089@psf.upfronthosting.co.za>
Message-ID: <1317568280.25.0.194906812849.issue13089@psf.upfronthosting.co.za>

Changes by Meador Inge :

----------
nosy: +meador.inge

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Sun Oct 2 17:12:12 2011
From: report at bugs.python.org (Meador Inge)
Date: Sun, 02 Oct 2011 15:12:12 +0000
Subject: [issue13091] ctypes: memory leak
In-Reply-To: <1317546718.25.0.742978918789.issue13091@psf.upfronthosting.co.za>
Message-ID: <1317568332.1.0.586734618512.issue13091@psf.upfronthosting.co.za>

Changes by Meador Inge :

----------
components: +ctypes
nosy: +amaury.forgeotdarc, belopolsky, meador.inge

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Sun Oct 2 17:39:15 2011
From: report at bugs.python.org (Stefan Krah)
Date: Sun, 02 Oct 2011 15:39:15 +0000
Subject: [issue13093] Redundant code in PyUnicode_EncodeDecimal() Message-ID: <1317569955.86.0.21646060057.issue13093@psf.upfronthosting.co.za> New submission from Stefan Krah : I can't see what this code is supposed to accomplish (see patch): while (collend < end) { if ((0 < *collend && *collend < 256) || !Py_UNICODE_ISSPACE(*collend) || Py_UNICODE_TODECIMAL(*collend)) break; } Since 'collend' and 'end' don't change in the loop, it would be infinite if the 'if' condition evaluated to false. But the 'if' condition is always true. ---------- components: Interpreter Core files: encode_decimal_redundant_code.diff keywords: needs review, patch messages: 144774 nosy: skrah priority: normal severity: normal stage: patch review status: open title: Redundant code in PyUnicode_EncodeDecimal() type: behavior versions: Python 3.3 Added file: http://bugs.python.org/file23295/encode_decimal_redundant_code.diff _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sun Oct 2 18:07:55 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Sun, 02 Oct 2011 16:07:55 +0000 Subject: [issue13090] posix_read: memory leak In-Reply-To: <1317546629.79.0.861373881282.issue13090@psf.upfronthosting.co.za> Message-ID: <1317571675.31.0.431159272919.issue13090@psf.upfronthosting.co.za> Changes by Antoine Pitrou : ---------- nosy: +haypo _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Sun Oct 2 18:08:19 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Sun, 02 Oct 2011 16:08:19 +0000 Subject: [issue13092] pep-393: memory leaks #2 In-Reply-To: <1317548945.05.0.236513746453.issue13092@psf.upfronthosting.co.za> Message-ID: <1317571699.48.0.778042837917.issue13092@psf.upfronthosting.co.za> Changes by Antoine Pitrou : ---------- nosy: +haypo _______________________________________ Python tracker _______________________________________ From 
report at bugs.python.org Sun Oct 2 18:33:36 2011
From: report at bugs.python.org (Roundup Robot)
Date: Sun, 02 Oct 2011 16:33:36 +0000
Subject: [issue13084] test_signal failure
In-Reply-To: <1317462503.63.0.0518543153457.issue13084@psf.upfronthosting.co.za>
Message-ID:

Roundup Robot added the comment:

New changeset e4f4272479d0 by Charles-François Natali in branch 'default':
Issue #13084: Fix a test_signal failure: the delivery order is only defined for
http://hg.python.org/cpython/rev/e4f4272479d0

----------
nosy: +python-dev

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Sun Oct 2 18:43:10 2011
From: report at bugs.python.org (Charles-François Natali)
Date: Sun, 02 Oct 2011 16:43:10 +0000
Subject: [issue13001] test_socket.testRecvmsgTrunc failure on FreeBSD 7.2 buildbot
In-Reply-To: <1316340627.52.0.056452746704.issue13001@psf.upfronthosting.co.za>
Message-ID:

Charles-François Natali added the comment:

> @requires_freebsd_version should be factorized with
> @requires_linux_version.

Patches attached.

> Can we workaround FreeBSD (< 8) bug in C/Python?

Not really.

> Or should we remove the function on FreeBSD < 8?

There's really no reason to do that (and it's really a minor bug).

----------
Added file: http://bugs.python.org/file23296/freebsd_msgtrunc-1.diff
Added file: http://bugs.python.org/file23297/requires_unix_version.diff

_______________________________________
Python tracker
_______________________________________
-------------- next part --------------
diff --git a/Lib/test/test_socket.py b/Lib/test/test_socket.py --- a/Lib/test/test_socket.py +++ b/Lib/test/test_socket.py @@ -1659,6 +1659,9 @@ def _testRecvmsgShorter(self): self.sendToServer(MSG) + # FreeBSD < 8 doesn't always set the MSG_TRUNC flag when a truncated + # datagram is received (issue #13001).
+ @support.requires_freebsd_version(8) def testRecvmsgTrunc(self): # Receive part of message, check for truncation indicators. msg, ancdata, flags, addr = self.doRecvmsg(self.serv_sock, @@ -1668,6 +1671,7 @@ self.assertEqual(ancdata, []) self.checkFlags(flags, eor=False) + @support.requires_freebsd_version(8) def _testRecvmsgTrunc(self): self.sendToServer(MSG) -------------- next part -------------- diff --git a/Lib/test/support.py b/Lib/test/support.py --- a/Lib/test/support.py +++ b/Lib/test/support.py @@ -44,8 +44,8 @@ "Error", "TestFailed", "ResourceDenied", "import_module", "verbose", "use_resources", "max_memuse", "record_original_stdout", "get_original_stdout", "unload", "unlink", "rmtree", "forget", - "is_resource_enabled", "requires", "requires_linux_version", - "requires_mac_ver", "find_unused_port", "bind_port", + "is_resource_enabled", "requires", "requires_freebsd_version", + "requires_linux_version", "requires_mac_ver", "find_unused_port", "bind_port", "IPV6_ENABLED", "is_jython", "TESTFN", "HOST", "SAVEDCWD", "temp_cwd", "findfile", "create_empty_file", "sortdict", "check_syntax_error", "open_urlresource", "check_warnings", "CleanImport", "EnvironmentVarGuard", "TransientResource", @@ -312,17 +312,17 @@ msg = "Use of the %r resource not enabled" % resource raise ResourceDenied(msg) -def requires_linux_version(*min_version): - """Decorator raising SkipTest if the OS is Linux and the kernel version is - less than min_version. +def _requires_unix_version(sysname, min_version): + """Decorator raising SkipTest if the OS is `sysname` and the version is less + than `min_version`. - For example, @requires_linux_version(2, 6, 35) raises SkipTest if the Linux - kernel version is less than 2.6.35. + For example, @_requires_unix_version('FreeBSD', (7, 2)) raises SkipTest if + the FreeBSD version is less than 7.2. 
""" def decorator(func): @functools.wraps(func) def wrapper(*args, **kw): - if sys.platform == 'linux': + if platform.system() == sysname: version_txt = platform.release().split('-', 1)[0] try: version = tuple(map(int, version_txt.split('.'))) @@ -332,13 +332,29 @@ if version < min_version: min_version_txt = '.'.join(map(str, min_version)) raise unittest.SkipTest( - "Linux kernel %s or higher required, not %s" - % (min_version_txt, version_txt)) - return func(*args, **kw) - wrapper.min_version = min_version + "%s version %s or higher required, not %s" + % (sysname, min_version_txt, version_txt)) return wrapper return decorator +def requires_freebsd_version(*min_version): + """Decorator raising SkipTest if the OS is FreeBSD and the FreeBSD version is + less than `min_version`. + + For example, @requires_freebsd_version(7, 2) raises SkipTest if the FreeBSD + version is less than 7.2. + """ + return _requires_unix_version('FreeBSD', min_version) + +def requires_linux_version(*min_version): + """Decorator raising SkipTest if the OS is Linux and the Linux version is + less than `min_version`. + + For example, @requires_linux_version(2, 6, 32) raises SkipTest if the Linux + version is less than 2.6.32. + """ + return _requires_unix_version('Linux', min_version) + def requires_mac_ver(*min_version): """Decorator raising SkipTest if the OS is Mac OS X and the OS X version if less than min_version. From report at bugs.python.org Sun Oct 2 18:45:33 2011 From: report at bugs.python.org (=?utf-8?q?Charles-Fran=C3=A7ois_Natali?=) Date: Sun, 02 Oct 2011 16:45:33 +0000 Subject: [issue10141] SocketCan support In-Reply-To: <1316813717.26.0.696772580838.issue10141@psf.upfronthosting.co.za> Message-ID: Charles-Fran?ois Natali added the comment: So, Victor, what do you think of the last version? This patch has been lingering for quite some time, and it's really a cool feature. 

----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sun Oct 2 18:55:14 2011
From: report at bugs.python.org (R. David Murray)
Date: Sun, 02 Oct 2011 16:55:14 +0000
Subject: [issue4147] xml.dom.minidom toprettyxml: omit whitespace for text-only elements
In-Reply-To: <1224430404.43.0.0672391440841.issue4147@psf.upfronthosting.co.za>
Message-ID: <1317574514.57.0.743346754185.issue4147@psf.upfronthosting.co.za>

R. David Murray added the comment:

Heh, you happened to post your patch at a time when I wanted something
to do as a break from something I didn't want to do...and I *thought* I
understood the problem, after reading the various links. But clearly I
didn't. We don't have someone who has stepped forward to be xml
maintainer, so I just went ahead and committed it.

I should find time to look at your new patch some time today, or
perhaps Ezio will have time. (Clearly minidom doesn't have enough
tests.)

----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sun Oct 2 18:58:12 2011
From: report at bugs.python.org (lesmana)
Date: Sun, 02 Oct 2011 16:58:12 +0000
Subject: [issue12458] Tracebacks should contain the first line of continuation lines
In-Reply-To: <1309499207.17.0.676241559437.issue12458@psf.upfronthosting.co.za>
Message-ID: <1317574692.03.0.231370396881.issue12458@psf.upfronthosting.co.za>

Changes by lesmana :

----------
nosy: +lesmana

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sun Oct 2 19:00:59 2011
From: report at bugs.python.org (=?utf-8?q?Charles-Fran=C3=A7ois_Natali?=)
Date: Sun, 02 Oct 2011 17:00:59 +0000
Subject: [issue13084] test_signal failure
In-Reply-To: <1317462503.63.0.0518543153457.issue13084@psf.upfronthosting.co.za>
Message-ID: <1317574859.63.0.0629543325795.issue13084@psf.upfronthosting.co.za>

Changes by Charles-François Natali :

----------
resolution:  -> fixed
stage:  -> committed/rejected
status: open -> closed

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sun Oct 2 20:41:12 2011
From: report at bugs.python.org (Tom Christiansen)
Date: Sun, 02 Oct 2011 18:41:12 +0000
Subject: [issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace
In-Reply-To: <1317537986.58.0.797304932484.issue12753@psf.upfronthosting.co.za>
Message-ID: <32145.1317580838@chthon>

Tom Christiansen added the comment:

Ezio Melotti wrote on Sun, 02 Oct 2011 06:46:26 -0000:

> Actually Python doesn't seem to support \N{LINE FEED (LF)}, most likely
> because that's a Unicode 1 name, and nowadays these codepoints are
> simply marked as '<control>'.

Yes, but there are a lot of them, 65 of them in fact. I do not care to
see people being forced to use literal control characters or inscrutable
magic numbers. It really bothers me that you have all these defined code
points with properties and all that have no name. People do use these.
Some of them a lot. I don't mind \n and such -- and in fact, prefer them
even -- but I feel I should not have to scratch my head over character
\033, \0177, and brethren. The C0 and C1 standards are not just
inventions, so we use them. Far better that one should write \N{ESCAPE}
for \033 or \N{DELETE} for \0177, don't you think?

>> If so, then I don't understand that. Nobody in their right
>> mind prefers "\N{LINE FEED (LF)}" over "\N{LINE FEED}" -- do they?

> They probably don't, but they just write \n anyway. I don't think we
> need to support any of these aliases, especially if they are not
> defined in the Unicode standard.

If you look at Names.txt, there are significant "aliases" there for the
C0/C1 stuff. My bottom line is that I don't like to be forced to use
magic numbers. I prefer to name my abstractions.
It is more readable and more maintainable that way.

There are still "holes" of course. Code point 128 has no name even in
C1. But something is better than nothing. Plus at least in Perl we *can*
give things names if we want, per the APPLE LOGO example for U+F8FF. So
nothing needs to remain nameless. Why, you can even name your Kanji if
you want, using whatever Romanization you prefer.

I think the private-use case example is really motivating, but I have no
idea how to do this for Python because there is no lexical scope. I
suppose you could attach it to the module, but that still doesn't really
work because of how things get evaluated. With a Perl compile-time use,
we can change the compiler's ideas about things, like adding function
prototypes and even extending the base types:

    % perl -Mbigrat -le 'print 1/2 + 2/3 * 4/5'
    31/30

    % perl -Mbignum -le 'print 21->is_odd'
    1
    % perl -Mbignum -le 'print 18->is_odd'
    0

    % perl -Mbignum -le 'print substr(2**5000, -3)'
    376
    % perl -Mbignum -le 'print substr(2**5000-1, -3)'
    375

    % perl -Mbignum -le 'print length(2**5000)'
    1506
    % perl -Mbignum -le 'print length(10**5000)'
    5001

    % perl -Mbignum -le 'print ref 10**5000'
    Math::BigInt
    % perl -Mbigrat -le 'print ref 1/3'
    Math::BigRat

I recognize that redefining what sort of object the compiler treats some
of its constants as is never going to happen in Python, but we actually
did manage that with charnames without having to subclass our strings:
the hook for \N{...} doesn't require object games like the ones above.

But it still has to happen at compile time, of course, so I don't know
what you could do in Python. Is there any way to change how the compiler
behaves even vaguely along these lines?

The run-time lookups of Python's unicodedata.lookup (like Perl's
charnames::vianame) and unicodedata.name (like Perl's charnames::viacode
on the ord) could be managed with a hook, but the compile-time lookups
of \N{...} I don't see any way around.
But I don't know anything about Python's internals, so I don't even know
what is or is not possible.

I do note that if you could extend \N{...} the way we do with charname
aliases for private-use characters, the user could load something that
did the C0 and C1 controls if they wanted to. I just don't know how to
do that early enough that the Python compiler would see it. Does your
import happen at run-time or at compile-time? This would be some sort of
compile-time binding of constants.

>> Python doesn't require it. :)/2

> I actually find those *less* readable. If there's something fancy in
> the regex, a comment *before* it is welcomed, but having to read a
> regex divided on several lines and remove meaningless whitespace and
> redundant comments just makes the parsing more difficult for me.

Really? White space makes things harder to read? I thought Pythonistas
believed the opposite of that. Whitespace is very useful for cognitive
chunking: you see how things logically group together.

Inomorewantaregexwithoutwhitespacethananyothercodeortext. :)

I do grant you that chatty comments may be a separate matter.

White space in patterns is also good when you have successive patterns
across multiple lines that have parts that are the same and parts that
are different, as in most of these, which is from a function to render
an English headline/book/movie/etc title into its proper casing:

    # put into lowercase if on our stop list, else titlecase
    s/  ( \pL [\pL']* )  /$stoplist{$1} ? lc($1) : ucfirst(lc($1))/xge;

    # capitalize a title's last word and its first word
    s/^ ( \pL [\pL']* )  /\u\L$1/x;
    s/  ( \pL [\pL']* ) $/\u\L$1/x;

    # treat parenthesized portion as a complete title
    s/ \( ( \pL [\pL']* )    /(\u\L$1/x;
    s/    ( \pL [\pL']* ) \) /\u\L$1)/x;

    # capitalize first word following colon or semi-colon
    s/ ( [:;] \s+ ) ( \pL [\pL']* ) /$1\u\L$2/x;

Now, that isn't good code for all *kinds* of reasons, but white space is
not one of them.
Perhaps what it is best at demonstrating is why Python goes about this
the right way and Perl does not.

Oh drat, I'm about to attach this to the wrong bug. But it was the dumb
code above that made me think about the following.

By virtue of having a "titlecase each word's first letter and lowercase
the rest" function in Python, you can put the logic in just one place,
and therefore if a bug is found, you can fix all code all at once.

But because Perl has always made it easy to grab "words" (actually,
traditional programming language identifiers) and diddle their case,
people write this all the time:

    s/(\w+)/\u\L$1/g;

and that has all kinds of problems. If you prefer the functional
approach, that is really

    s/(\w+)/ucfirst(lc($1))/ge;

but that is still wrong.

1. Too much code duplication. Yes, it's nice to see \pL[\pL']* stand
   out on each line, but shouldn't that be in a variable, like

       $word = qr/\pL[\pL']*/;

2. What is a "word"? That code above is better than \w because it
   avoids numbers and underscores; however, it still uses letters only,
   not letters and marks, let alone number letters like Roman numerals.

3. I see the apostrophe there, which is a good start, but what if it is
   a RIGHT SINGLE QUOTATION MARK, as in "Henry’s"? And what about
   hyphens? Those should not trigger capitalization in normal titles.

4. It turns out that all code that does a titlecase on the first
   character of a string it has already converted to lowercase has
   irreversibly lost information. Unicode casing is not reversible.
   Using \w for convenience, these can do different things:

       s/(\w+)/\u\L$1/g;
       s/(\w)(\w*)/\u$1\L$2/g;

   or in the functional approach,

       s/(\w+)/ucfirst(lc($1))/ge;
       s/(\w)(\w*)/ucfirst($1) . lc($2)/ge;

Now while it is true that only these code points alone do the wrong
thing using the naïve approach under Unicode 6.0:

    % unichars -gas 'ucfirst ne ucfirst lc'
     İ  U+00130 GC=Lu SC=Latin    LATIN CAPITAL LETTER I WITH DOT ABOVE
     ϴ  U+003F4 GC=Lu SC=Greek    GREEK CAPITAL THETA SYMBOL
     ẞ  U+01E9E GC=Lu SC=Latin    LATIN CAPITAL LETTER SHARP S
     Ω  U+02126 GC=Lu SC=Greek    OHM SIGN
     K  U+0212A GC=Lu SC=Latin    KELVIN SIGN
     Å  U+0212B GC=Lu SC=Latin    ANGSTROM SIGN

it is still the wrong thing, and we never know what might happen in the
future.

I think Python is being smarter than Perl in simply providing people
with a
titlecase-each-word('s-first-letter-and-lowercase-the-rest)-in-the-whole-string
function, because this means people won't be tempted to write

    s/(\w+)/ucfirst(lc($1))/ge;

all the time. However, as I have written elsewhere, I question a lot of
its underlying assumptions. It's clear that a "word" must in general
include not just Letters but also Marks, or else you get different
results in NFD and NFC, and the Unicode Standard is very against that.

However, the problem is that what a word is cannot be considered
independent of language. Words in English can contain apostrophes
(whether written as an APOSTROPHE or as RIGHT SINGLE QUOTATION MARK)
and hyphens (written as HYPHEN-MINUS, HYPHEN, and rarely even EN DASH).

Each of these is a single word:

    ’tisn’t
    anti‐intellectual
    earth–moon

The capitalization there should be

    ’Tisn’t
    Anti‐intellectual
    Earth–Moon

Notice how you can't do the same with the first apostrophe+t as with the
second on "’Tisn’t". That is all challenging to code correctly (did you
notice the EN DASH?), especially when you find something like
red‐violet–colored. You probably want that to be Red‐violet–colored,
because it is not an equal compound like earth–moon or yin–yang, which
in correct orthography take an EN DASH not a HYPHEN, just as occurs when
you hyphenate an already hyphenated word like red‐violet against
colored, as in a red‐violet–colored flower. English titling rules only
capitalize the first word in hyphenated words, which is why it's
Anti‐intellectual not Anti-Intellectual.

And of course, you can't actually create something in true English
titlecase without having a stop list of articles and (short)
prepositions, and paying attention to whether it is the first or last
word in the title, and whether it follows a colon or semicolon. Consider
that phrasal verbs are construed to take adverbs not prepositions, and
so "Bringing In the Sheaves" would be the correct capitalization of that
song, since "to bring in" is a phrasal verb, but "A Ringing in My Ears"
would be right for that. It is remarkably complicated.

With English titlecasing, you have to respect what your publishing house
considers a "short" preposition. A common cut-off is that short preps
have 4 or fewer characters, but I have seen longer cutoffs. Here is one
rather exhaustive list of English prepositions sorted by length:

     2: as at by in of on to up vs
     3: but for off out per pro qua via
     4: amid atop down from into like near next onto over pace past
        plus sans save than till upon with
     5: about above after among below circa given minus round since
        thru times under until worth
     6: across amidst around before behind beside beyond during
        except inside toward unlike versus within
     7: against barring beneath besides between betwixt despite
        failing outside through thruout towards without
    10: throughout underneath

The thing is that prepositions become adverbs in phrasal verbs, like "to
go out" or "to come in", and all adverbs are capitalized. So a complete
solution requires actual parsing of English!!!! Just say no -- or
stronger.

Merely getting something like this right:

    the lord of the rings: the fellowship of the ring   # Unicode lowercase
    THE LORD OF THE RINGS: THE FELLOWSHIP OF THE RING   # Unicode uppercase
    The Lord of the Rings: The Fellowship of the Ring   # English titlecase

is going to take a bit of work. So is

    the sad tale of king henry ? and caterina de aragón   # Unicode lowercase
    THE SAD TALE OF KING HENRY ?
 AND CATERINA DE ARAGÓN   # Unicode uppercase
    The Sad Tale of King Henry ? and Caterina de Aragón   # English titlecase

(and that must give the same answer in NFC vs NFD, of course.)

Plus what to do with something like num2ascii is ill-defined in English,
because having digits in the middle of a word is a very new phenomenon.
Yes, Y2K gets caps, but that is for another reason. There is no
agreement on what one should do with num2ascii or people42see. A
function name shouldn't be capitalized at all of course.

And that is just English. Other languages have completely different
rules. For example, per Wikipedia's entry on the colon:

    In Finnish and Swedish, the colon can appear inside words in a
    manner similar to the English apostrophe, between a word (or
    abbreviation, especially an acronym) and its grammatical (mostly
    genitive) suffixes. In Swedish, it also occurs in names, for example
    Antonia Ax:son Johnson (Ax:son for Axelson). In Finnish it is used
    in loanwords and abbreviations; e.g., USA:han for the illative case
    of "USA". For loanwords ending orthographically in a consonant but
    phonetically in a vowel, the apostrophe is used instead: e.g. show'n
    for the genitive case of the English loan "show" or Versailles'n for
    the French place name Versailles.

Isn't that tricky! I guess that you would have to treat punctuation that
has a word character immediately following it (and immediately preceding
it) as being part of the word, and that it doesn't signal that a change
in case is merited.

I'm really not sure. It is not obvious what the right thing to do here
is.

I do believe that Python's titlecase function can and should be fixed to
work correctly with Unicode. There really is no excuse for turning
Aragón into AragóN, for example, or not doing the right thing
with ? and ? .

I fear the only thing you can do with the confusion of Unicode titlecase
and English titlecase is to explain that properly rendering English
titles and headlines is a much more complicated job which you will not
even attempt. (And shouldn't. English titlecase is clearly too
specialized for a general function.)

However, I'm still bothered by things with apostrophes though.

    can't
    isn't
    wouldn't've
    Bill's
    'tisn't

since I can't countenance the obviously wrong:

    Can'T
    Isn'T
    Wouldn'T'Ve
    Bill'S
    'Tisn'T

with the last the hardest to get right. I do have code that correctly
handles English words and code that correctly handles English titles,
but it is much trickier than the titlecase() function.

And Swedes might be upset seeing Antonia Ax:Son Johnson instead of
Antonia Ax:son Johnson.

Maybe we should just go back to the Pythonic equivalent of

    s/(\w)(\w*)/ucfirst($1) . lc($2)/ge;

where \w is specifically per tr18's Annex C, and give up on punctuation
altogether, with a footnoted caveat or something. I wouldn't complain
about that. The rest is just too, too hard. Wouldn't you agree?

Thank you very much for all your hard work -- and patience with me.

--tom

----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sun Oct 2 22:43:50 2011
From: report at bugs.python.org (Vlad Riscutia)
Date: Sun, 02 Oct 2011 20:43:50 +0000
Subject: [issue5001] Remove assertion-based checking in multiprocessing
In-Reply-To: <1232382762.58.0.610641171059.issue5001@psf.upfronthosting.co.za>
Message-ID: <1317588230.22.0.452780658169.issue5001@psf.upfronthosting.co.za>

Vlad Riscutia added the comment:

I attached a patch which replaces all asserts with checks that raise
exceptions. I used my judgement in determining exception types but I
might have been off in some places. Also, this patch replaces ALL
asserts.
It is possible that some of the internal functions should be kept using
asserts but I am not familiar enough with the module to say which. I
figured I'd better do the whole thing, then reviewers can say where
lines should be reverted to asserts.

----------
keywords: +patch
nosy: +vladris
Added file: http://bugs.python.org/file23298/issue5001.diff

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sun Oct 2 23:26:27 2011
From: report at bugs.python.org (Antoine Pitrou)
Date: Sun, 02 Oct 2011 21:26:27 +0000
Subject: [issue5001] Remove assertion-based checking in multiprocessing
In-Reply-To: <1232382762.58.0.610641171059.issue5001@psf.upfronthosting.co.za>
Message-ID: <1317590787.08.0.584271293637.issue5001@psf.upfronthosting.co.za>

Antoine Pitrou added the comment:

Thank you. I've attached some comments, click on the "review" link to
read them.

----------
assignee: jnoller ->
nosy: +pitrou -BreamoreBoy
stage: needs patch -> patch review
versions: +Python 3.3 -Python 3.1

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sun Oct 2 23:46:38 2011
From: report at bugs.python.org (Nadeem Vawda)
Date: Sun, 02 Oct 2011 21:46:38 +0000
Subject: [issue1625] bz2.BZ2File doesn't support multiple streams
In-Reply-To: <1197624030.23.0.503522059328.issue1625@psf.upfronthosting.co.za>
Message-ID: <1317591998.92.0.34006048175.issue1625@psf.upfronthosting.co.za>

Nadeem Vawda added the comment:

> This is all fine and well, but this is clearly a bug and not a feature.

No, it is not at all clear that this is a bug. I agree that this is a
desirable capability to have, but nowhere does the module claim to
support multi-stream files. Nor is it an inherent feature of the
underlying bzip2 library that we are failing to expose to users.

> [...] python 2.x users will never be able to extract multiple-stream bz2 files.

Incorrect.
It is perfectly possible to extract a multi-stream bz2 file in 2.x - you
just need to open it with open() and decompress the data using
BZ2Decompressor (pretty much the same way that 3.3's BZ2File does it).

If there is really a large demand for these facilities in 2.x, I would
be willing to port 3.3's BZ2File implementation to 2.x and make it
available on PyPI. But this patch is not going in to the 2.7 stdlib; it
is simply not the sort of behavior change that is acceptable in a bugfix
release.

----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Sun Oct 2 23:56:59 2011
From: report at bugs.python.org (Terry J. Reedy)
Date: Sun, 02 Oct 2011 21:56:59 +0000
Subject: [issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace
In-Reply-To: <32145.1317580838@chthon>
Message-ID: <4E88DE0D.2020800@udel.edu>

Terry J. Reedy added the comment:

> Really? White space makes things harder to read? I thought Pythonistas
> believed the opposite of that.

I was surprised at that too ;-). One person's opinion in a specific
context. Don't generalize.

> English titling rules
> only capitalize the first word in hyphenated words, which is why it's
> Anti?intellectual not Anti-Intellectual.

Except that I can imagine someone using the latter as a noun to make the
work more officious or something. There are no official English titling
rules and as you noted, publishers vary. I agree that str.title should
do something sensible based on Unicode, with the improvements you
mentioned.
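
The apostrophe problem discussed in this thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not anything proposed in the issue: the helper name `titlecase_words` is hypothetical, a "word" is assumed to be a run of letters optionally joined by APOSTROPHE or RIGHT SINGLE QUOTATION MARK, and the text is composed to NFC first (so the output may be in a different normalization form than the input). It makes no attempt at English title rules such as stop lists, hyphens or en dashes.

```python
import re
import unicodedata

# Assumption: a "word" is letters only (no digits, no underscore),
# optionally joined by ' (APOSTROPHE) or U+2019 (RIGHT SINGLE QUOTATION MARK).
_WORD = re.compile(r"[^\W\d_]+(?:['\u2019][^\W\d_]+)*")


def titlecase_words(text):
    """Titlecase the first letter of each word and lowercase the rest."""
    # Compose to NFC so that a base letter plus combining mark is matched
    # as a single letter where a precomposed form exists.
    text = unicodedata.normalize("NFC", text)

    def fix(match):
        word = match.group(0)
        # Titlecase the first character and lowercase the remainder
        # separately, rather than titlecasing an already-lowercased word.
        return word[0].title() + word[1:].lower()

    return _WORD.sub(fix, text)


print("can't isn't bill's".title())           # Can'T Isn'T Bill'S
print(titlecase_words("can't isn't bill's"))  # Can't Isn't Bill's
```

Marks without a precomposed form, and language-specific punctuation such as the Swedish colon in Ax:son, are still not handled by this sketch.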

----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 00:13:54 2011
From: report at bugs.python.org (Amaury Forgeot d'Arc)
Date: Sun, 02 Oct 2011 22:13:54 +0000
Subject: [issue13091] ctypes: memory leak
In-Reply-To: <1317546718.25.0.742978918789.issue13091@psf.upfronthosting.co.za>
Message-ID: <1317593634.85.0.642585577102.issue13091@psf.upfronthosting.co.za>

Amaury Forgeot d'Arc added the comment:

How did you obtain this? The resize() function is not called by
test_multiprocessing.
And are you sure that it's not some kind of reference leak? (this
pointer is tied to a CDataObject; its tp_alloc should free the memory)

----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 00:27:40 2011
From: report at bugs.python.org (Nadeem Vawda)
Date: Sun, 02 Oct 2011 22:27:40 +0000
Subject: [issue12804] "make test" fails on systems without internet access
In-Reply-To: <1313935702.14.0.838210022398.issue12804@psf.upfronthosting.co.za>
Message-ID: <1317594460.75.0.686221808347.issue12804@psf.upfronthosting.co.za>

Nadeem Vawda added the comment:

> Change reverted. "make test" should run a comprehensive test of
> Python's facilities

Fair enough.

> If it is easy to detect network availability programmatically, we could
> just use the skip system.

+1. I don't know if there is a reasonable way to do this, but if so,
that would be the best solution.

----------
stage:  -> needs patch
title: make test should not enable the urlfetch resource -> "make test" fails on systems without internet access

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 00:31:53 2011
From: report at bugs.python.org (Nick Coghlan)
Date: Sun, 02 Oct 2011 22:31:53 +0000
Subject: [issue13053] Add Capsule migration documentation to "cporting"
In-Reply-To: <1317224248.53.0.263122473581.issue13053@psf.upfronthosting.co.za>
Message-ID: <1317594713.75.0.531206922598.issue13053@psf.upfronthosting.co.za>

Nick Coghlan added the comment:

Mostly looks good - couple of minor comments in Rietveld.

As far as the patch flow goes, the 2.x and 3.x branches are actually
handled independently (they're too divergent for merging to make sense).

So 2.7 and 3.2 will be independent commits, then the changes will be
merged into default from the 3.2 branch.

----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 00:45:53 2011
From: report at bugs.python.org (Antoine Pitrou)
Date: Sun, 02 Oct 2011 22:45:53 +0000
Subject: [issue12804] "make test" fails on systems without internet access
In-Reply-To: <1317594460.75.0.686221808347.issue12804@psf.upfronthosting.co.za>
Message-ID: <1317595344.3562.4.camel@localhost.localdomain>

Antoine Pitrou added the comment:

> > If it is easy to detect network availability programmatically, we could
> > just use the skip system.
>
> +1. I don't know if there is a reasonable way to do this, but if so, that
> would be the best solution.

Actually, the skip system is already supposed to work for that if used
properly (see test.support.transient_internet()). However, perhaps it
actually doesn't work in all situations.
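
For illustration only, the "detect network availability and skip" idea can be sketched with the standard library alone. This is not how test.support.transient_internet() is implemented; `network_available`, its default host and port, and the `UrlFetchTests` class are hypothetical names chosen for this example, and a successful TCP connection is only a rough proxy for "internet access".

```python
import socket
import unittest


def network_available(host="www.python.org", port=80, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts alike.
        return False


class UrlFetchTests(unittest.TestCase):
    def test_fetch(self):
        # Skip instead of failing when the machine has no network access.
        if not network_available():
            self.skipTest("no internet access")
        # ... the actual network-dependent assertions would go here ...
```

Checking inside the test (rather than in a class decorator) means the probe runs when the test runs, not when the module is imported.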

----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 00:49:23 2011
From: report at bugs.python.org (Nadeem Vawda)
Date: Sun, 02 Oct 2011 22:49:23 +0000
Subject: [issue6715] xz compressor support
In-Reply-To: <1250502444.31.0.107447392137.issue6715@psf.upfronthosting.co.za>
Message-ID: <1317595763.49.0.480994691997.issue6715@psf.upfronthosting.co.za>

Nadeem Vawda added the comment:

Thanks for investigating the Windows situation.

> - liblzma can't be compiled by Visual Studio: too many C99 isms, mostly
> variables declared in the middle of a block. It's doable for sure, but it's a
> lot of work.

I don't think that creating our own MSVC-friendly fork of liblzma is
really an option. Over and above the work of porting it in the first
place (and all the opportunities for bugs to creep in along the way),
we'd also have to worry about keeping up to date with upstream changes.
I believe we currently do something similar with libffi (for ctypes),
and the impression I've gotten is that it's caused a lot of trouble.

> - The way recommended by XZ is to use a precompiled liblzma.dll; Then it
> should be easy to build an extension module, but its would be the first time
> that we distribute an extension module which needs a non-system DLL. Is it
> enough to copy it next to _lzma.pyd? Is there some work to do in the
> installer?

I would guess that this is sufficient, but my knowledge of how Windows
DLLs work is minimal. Could someone with more platform knowledge weigh
in on whether this would work (and if there are any problems it might
cause)?

----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 00:55:06 2011
From: report at bugs.python.org (Antoine Pitrou)
Date: Sun, 02 Oct 2011 22:55:06 +0000
Subject: [issue12911] Expose a private accumulator C API
In-Reply-To: <1315311609.99.0.705451675521.issue12911@psf.upfronthosting.co.za>
Message-ID: <1317596106.47.0.26052755268.issue12911@psf.upfronthosting.co.za>

Antoine Pitrou added the comment:

New patch implementing Martin's suggested optimization (only instantiate
the large list when necessary).

----------
Added file: http://bugs.python.org/file23299/accu3.patch

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 00:55:42 2011
From: report at bugs.python.org (Dan Stromberg)
Date: Sun, 02 Oct 2011 22:55:42 +0000
Subject: [issue6715] xz compressor support
In-Reply-To: <1317595763.49.0.480994691997.issue6715@psf.upfronthosting.co.za>
Message-ID:

Dan Stromberg added the comment:

On Sun, Oct 2, 2011 at 3:49 PM, Nadeem Vawda wrote:

> Nadeem Vawda added the comment:
>
> Thanks for investigating the Windows situation.
>
> > - liblzma can't be compiled by Visual Studio: too many C99 isms, mostly
> > variables declared in the middle of a block. It's doable for sure, but
> > it's a lot of work.
>
> I don't think that creating our own MSVC-friendly fork of liblzma is
> really an option. Over and above the work of porting it in the first
> place (and all the opportunities for bugs to creep in along the way),
> we'd also have to worry about keeping up to date with upstream changes.
> I believe we currently do something similar with libffi (for ctypes),
> and the impression I've gotten is that it's caused a lot of trouble.

It's much better to contribute patches upstream.

> > - The way recommended by XZ is to use a precompiled liblzma.dll; Then it
> > should be easy to build an extension module, but its would be the first
> > time that we distribute an extension module which needs a non-system
> > DLL. Is it enough to copy it next to _lzma.pyd? Is there some work to
> > do in the installer?
>
> I would guess that this is sufficient, but my knowledge of how Windows
> DLLs work is minimal. Could someone with more platform knowledge weigh
> in on whether this would work (and if there are any problems it might
> cause)?

I've not done much with Windows DLLs, but I've heard they're pretty
similar to AIX shared libraries, which I've done some work with. AIX
shared libraries don't deal with versioning well - if you have two
versions of the same library on a system, you have to pop them into two
different loader domains, or suffer unresolved externals at runtime.
Then your applications are in one of the two loader domains, and if
they're in the wrong one, you again suffer unresolved externals at
runtime.

----------
Added file: http://bugs.python.org/file23300/unnamed

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 00:58:36 2011
From: report at bugs.python.org (Meador Inge)
Date: Sun, 02 Oct 2011 22:58:36 +0000
Subject: [issue13091] ctypes: memory leak
In-Reply-To: <1317546718.25.0.742978918789.issue13091@psf.upfronthosting.co.za>
Message-ID: <1317596316.01.0.352885503928.issue13091@psf.upfronthosting.co.za>

Meador Inge added the comment:

I can reproduce this with:

    valgrind --tool=memcheck --log-file=leaks.txt --leak-check=full --suppressions=Misc/valgrind-python.supp ./python -m test test_ctypes

Whereas:

    valgrind --tool=memcheck --log-file=leaks.txt --leak-check=full --suppressions=Misc/valgrind-python.supp ./python -m test test_multiprocessing

turns up nothing in 'ctypes.resize'.

----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 01:03:31 2011
From: report at bugs.python.org (Nadeem Vawda)
Date: Sun, 02 Oct 2011 23:03:31 +0000
Subject: [issue12804] "make test" fails on systems without internet access
In-Reply-To: <1313935702.14.0.838210022398.issue12804@psf.upfronthosting.co.za>
Message-ID: <1317596611.07.0.224669598146.issue12804@psf.upfronthosting.co.za>

Nadeem Vawda added the comment:

Oh, neat. I'll take a look at that when I get a chance.

----------
assignee:  -> nadeem.vawda

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 01:23:55 2011
From: report at bugs.python.org (Larry Hastings)
Date: Sun, 02 Oct 2011 23:23:55 +0000
Subject: [issue13053] Add Capsule migration documentation to "cporting"
In-Reply-To: <1317224248.53.0.263122473581.issue13053@psf.upfronthosting.co.za>
Message-ID: <1317597835.01.0.401483616141.issue13053@psf.upfronthosting.co.za>

Larry Hastings added the comment:

Attached is r2 of the patch, incorporating Nick's suggestions. Base
revision hasn't changed.
---------- Added file: http://bugs.python.org/file23301/larry.cporting.capsules.r2.diff _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 01:30:05 2011 From: report at bugs.python.org (Larry Hastings) Date: Sun, 02 Oct 2011 23:30:05 +0000 Subject: [issue13053] Add Capsule migration documentation to "cporting" In-Reply-To: <1317224248.53.0.263122473581.issue13053@psf.upfronthosting.co.za> Message-ID: <1317598205.65.0.0406181880199.issue13053@psf.upfronthosting.co.za> Larry Hastings added the comment: In case you're curious, here's how I tested "capsulethunk.h". I added the file to Python 2.7 (hg head), 3.0.0 (tarball), and 3.1.0 (tarball). For 2.7 and 3.0.0 I quickly hacked four files to use the Capsule API instead of CObjects: * Python/compile.c * Python/getargs.c * Modules/_ctypes/callproc.c * Modules/_ctypes/cfield.c (For 3.1 I simply included the file in those four files, as they already use the Capsule API.) I then built and ran regrtest.py. While developing capsulethunk.h, I had a more thorough test suite; sadly that's on a laptop that is shut off, and I'm on vacation across the Atlantic and can't get at it. But everything was working fine last I checked ;-) ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 01:37:30 2011 From: report at bugs.python.org (Vlad Riscutia) Date: Sun, 02 Oct 2011 23:37:30 +0000 Subject: [issue5001] Remove assertion-based checking in multiprocessing In-Reply-To: <1232382762.58.0.610641171059.issue5001@psf.upfronthosting.co.za> Message-ID: <1317598650.93.0.651603368224.issue5001@psf.upfronthosting.co.za> Vlad Riscutia added the comment: Thanks for the quick review! I attached second iteration addressing feedback + changed all occurrences of checks like "type(x) is y" to "isinstance(x, y)".
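A minimal sketch of the kind of change being described, under assumptions: the names Base, Derived, check_with_assert and check_with_isinstance are hypothetical and are not taken from the multiprocessing patch. It only illustrates how an assert on the exact type differs from an explicit isinstance() check that raises.

```python
class Base(object):
    pass


class Derived(Base):
    pass


def check_with_assert(obj):
    # Assertion-based style: requires the exact type, and the whole
    # check is skipped when Python runs with -O.
    assert type(obj) is Base
    return obj


def check_with_isinstance(obj):
    # Explicit style: accepts subclasses and is always executed.
    if not isinstance(obj, Base):
        raise TypeError("expected a Base instance, got %s" % type(obj).__name__)
    return obj


# A subclass instance passes the isinstance() check.
accepted = check_with_isinstance(Derived())

# An unrelated object is rejected with a real exception, not an AssertionError.
try:
    check_with_isinstance(42)
except TypeError as exc:
    message = str(exc)
```

The exact-type comparison would have rejected the Derived instance, which is one reason a review might prefer isinstance() in such checks.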
I would appreciate a second look because this patch has many small changes and even though I ran the full test suite, which passed, I'm afraid I made a typo somewhere :) ---------- Added file: http://bugs.python.org/file23302/issue5001_v2.diff _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 01:39:31 2011 From: report at bugs.python.org (Senthil Kumaran) Date: Sun, 02 Oct 2011 23:39:31 +0000 Subject: [issue13073] message_body argument of HTTPConnection.endheaders is undocumented In-Reply-To: <1317370143.22.0.297306447289.issue13073@psf.upfronthosting.co.za> Message-ID: <1317598771.59.0.292415701319.issue13073@psf.upfronthosting.co.za> Senthil Kumaran added the comment: This is fixed in the following changesets. changeset a3f2dba93743 changeset 1ed413b52af3 changeset 277688052c5a Thanks for the patch, Ben Hayden. ---------- resolution: -> fixed stage: needs patch -> committed/rejected status: open -> closed _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 02:22:15 2011 From: report at bugs.python.org (Meador Inge) Date: Mon, 03 Oct 2011 00:22:15 +0000 Subject: [issue13091] ctypes: memory leak In-Reply-To: <1317546718.25.0.742978918789.issue13091@psf.upfronthosting.co.za> Message-ID: <1317601335.6.0.176142368064.issue13091@psf.upfronthosting.co.za> Meador Inge added the comment: > this pointer is tied to a CDataObject; its tp_alloc should free the > memory The free in 'PyCData_clear' is conditional: if ((self->b_needsfree) && ((size_t)dict->size > sizeof(self->b_value))) PyMem_Free(self->b_ptr); As written, 'PyCData_clear' has no way of knowing that memory has been {m,re}alloc'd in 'resize'. So in some cases memory will leak. Here is a small reproduction case extracted from 'test_varsize_struct.py'.
from ctypes import *

class X(Structure):
    _fields_ = [("item", c_int),
                ("array", c_int * 1)]

x = X()
x.item = 42
x.array[0] = 100
new_size = sizeof(X) + sizeof(c_int) * 5
resize(x, new_size)

One potential fix is:

diff --git a/Modules/_ctypes/_ctypes.c b/Modules/_ctypes/_ctypes.c
--- a/Modules/_ctypes/_ctypes.c
+++ b/Modules/_ctypes/_ctypes.c
@@ -2440,7 +2440,7 @@ PyCData_clear(CDataObject *self)
     assert(dict); /* Cannot be NULL for CDataObject instances */
     Py_CLEAR(self->b_objects);
     if ((self->b_needsfree)
-        && ((size_t)dict->size > sizeof(self->b_value)))
+        && (self->b_ptr != (char *)&self->b_value))
         PyMem_Free(self->b_ptr);
     self->b_ptr = NULL;
     Py_CLEAR(self->b_base);

I need to think about that more, though. ---------- versions: +Python 2.7, Python 3.2 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 03:36:06 2011 From: report at bugs.python.org (Meador Inge) Date: Mon, 03 Oct 2011 01:36:06 +0000 Subject: [issue13062] Introspection generator and function closure state In-Reply-To: <1317318190.56.0.437971601638.issue13062@psf.upfronthosting.co.za> Message-ID: <1317605766.23.0.467816301133.issue13062@psf.upfronthosting.co.za> Meador Inge added the comment: Here is a first cut at a patch. There is one slight deviation from the original spec: > some nice error checking for when the generator's frame is already gone > (or the supplied object isn't a generator iterator). The attached patch returns empty mappings for these cases. I can easily add the error checks, but in what cases is it useful to know *exactly* why a mapping could not be created? Having an empty mapping for all invalid cases is simpler and seems more robust.
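To make the trade-off concrete, here is a rough sketch of the behaviour under discussion. It is not the attached patch: generator_locals and counter are hypothetical names, and the choice to raise TypeError for non-generators while returning an empty mapping once the frame is gone is just one of the possible combinations, assumed here for illustration.

```python
import inspect


def generator_locals(gen):
    """Hypothetical helper: map local names to their current values."""
    if not inspect.isgenerator(gen):
        # One possible error check: reject objects that are not generators.
        raise TypeError("expected a generator iterator, got %s"
                        % type(gen).__name__)
    frame = gen.gi_frame
    if frame is None:
        # The generator has finished and its frame is gone.
        return {}
    # Copy, so the caller gets a snapshot rather than a live view.
    return dict(frame.f_locals)


def counter(limit):
    total = 0
    for i in range(limit):
        total += i
        yield total


g = counter(3)
before = generator_locals(g)   # not started: only the argument is bound
next(g)
during = generator_locals(g)   # suspended at the first yield
for _ in g:
    pass
after = generator_locals(g)    # exhausted: no frame, empty mapping
```

With this split, an empty mapping only ever means "no locals right now", and passing the wrong kind of object is reported separately.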
---------- keywords: +patch stage: needs patch -> patch review Added file: http://bugs.python.org/file23303/issue13062.patch _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 04:02:59 2011 From: report at bugs.python.org (Nick Coghlan) Date: Mon, 03 Oct 2011 02:02:59 +0000 Subject: [issue13062] Introspection generator and function closure state In-Reply-To: <1317318190.56.0.437971601638.issue13062@psf.upfronthosting.co.za> Message-ID: <1317607379.86.0.272110831639.issue13062@psf.upfronthosting.co.za> Nick Coghlan added the comment: Because a generator can legitimately have no locals: >>> def gen(): ... yield 1 ... >>> g = gen() >>> g.gi_frame.f_locals {} Errors should be reported as exceptions - AttributeError or TypeError if there's no gi_frame and then ValueError or RuntimeError if gi_frame is None. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 04:09:09 2011 From: report at bugs.python.org (Meador Inge) Date: Mon, 03 Oct 2011 02:09:09 +0000 Subject: [issue12943] tokenize: add python -m tokenize support back In-Reply-To: <1315537867.15.0.614423357455.issue12943@psf.upfronthosting.co.za> Message-ID: <1317607749.78.0.985477200064.issue12943@psf.upfronthosting.co.za> Meador Inge added the comment: Fixed a few more nits pointed out in review. 
---------- Added file: http://bugs.python.org/file23304/issue12943-6.patch _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 04:12:03 2011 From: report at bugs.python.org (Nick Coghlan) Date: Mon, 03 Oct 2011 02:12:03 +0000 Subject: [issue13062] Introspection generator and function closure state In-Reply-To: <1317318190.56.0.437971601638.issue13062@psf.upfronthosting.co.za> Message-ID: <1317607923.92.0.651978518474.issue13062@psf.upfronthosting.co.za> Nick Coghlan added the comment: The function case is simpler - AttributeError or TypeError if there's no __closure__ attribute, empty mapping if there's no closure. I've also changed my mind on the "no frame" generator case - since that mapping will evolve over time as the generator executes anyway, the empty mapping accurately reflects the "no locals currently defined" that applies when the generator either hasn't been started yet or has finished. People can use getgeneratorstate() to find that information if they need to know. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 04:25:23 2011 From: report at bugs.python.org (Tom Christiansen) Date: Mon, 03 Oct 2011 02:25:23 +0000 Subject: [issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace In-Reply-To: <4E88DE0D.2020800@udel.edu> Message-ID: <9072.1317605488@chthon> Tom Christiansen added the comment: >> Really? White space makes things harder to read? I thought Pythonistas >> believed the opposite of that. > I was surprised at that too ;-). One person's opinion in a specific > context. Don't generalize. The example I initially showed probably wasn't the best for that. Mostly I was just trying to demonstrate how useful it is to have user-defined properties. But I have not asked for that (I have asked for properties, though).
>> English titling rules >> only capitalize the first word in hyphenated words, which is why it's >> Anti-intellectual not Anti-Intellectual. > Except that I can imagine someone using the latter as a noun to make the > work more officious or something. If Good-Looking looks more officious than Good-looking, I bet GOOD-LOOKING is better still. :) > There are no official English titling rules and as you noted, > publishers vary. If there aren't any rules, then how come all book and movie titles always look the same? :) I don't think anyone would argue with these two: 1. Capitalize the first word, the last word, and the word right after a colon (or semicolon). 2. Capitalize all intervening words except for articles (a, an, the) and short prepositions. Those are the basic rules. The main problem is that "short" isn't well defined--and indeed, there are even places where "preposition" isn't well defined either. English has sentence casing (only the first word) and headline casing (most of them). It's problematic that computer people call capitalizing each word titlecasing, since in English, this is never correct. http://www.chicagomanualofstyle.org/CMS_FAQ/CapitalizationTitles/CapitalizationTitles23.html Although Chicago style lowercases prepositions (but see CMOS 8.157 for exceptions), some style guides uppercase them. Ask your editor for a style guide. I myself usually fall back to the Chicago Manual of Style or the Oxford Guide to Style. I don't think I do anything neither of them says to do. But I completely agree that this should *not* be in the titlecase() function. I think the docs for the function might perhaps say something about how it does not mean correct English headline case when it says titlecase, but that's largely just nitpicking. > I agree that str.title should do something sensible > based on Unicode, with the improvements you mentioned. One of the goals of Unicode is that casing not be language dependent. And they almost got there, too.
The Turkic I is the most notable exception. Did you know there is a problem with all the case stuff in Python? It was clearly put in before they had realized that they needed to have things other than Lu/Lt/Ll have casing properties. That's why there is a difference between GC=Ll and the Lowercase property. str.islower() Return true if all cased characters in the string are lowercase and there is at least one cased character, false otherwise. Cased characters are those with general category property being one of 'Lu', 'Ll', or 'Lt' and lowercase characters are those with general category property 'Ll'. http://docs.python.org/release/3.2/library/stdtypes.html That really isn't right. A cased character is one with the Unicode "Cased" property, and a lowercase character is one with the Unicode "Lowercase" property. The General Category is actually immaterial here. I've spent all bloody day trying to model Python's islower, isupper, and istitle functions, but I get all kinds of errors, both in the definitions and in the models of the definitions. Under both 2.7 and 3.2, I get all these bugs: ? not islower() but has at least one cased character with all cased characters lowercase! ? not islower() but has at least one cased character with all cased characters lowercase! ? not islower() but has at least one cased character with all cased characters lowercase! ? not islower() but has at least one cased character with all cased characters lowercase! ? not isupper() but has at least one cased character with all cased characters uppercase! ? not istitle() but should be ? not islower() but has at least one cased character with all cased characters lowercase! 2?? not islower() but has at least one cased character with all cased characters lowercase! 2?? not islower() but has at least one cased character with all cased characters lowercase! ?? isupper() but fails to have at least one cased character with all cased characters uppercase!
ThisIsInTitleCaseYouKnow not istitle() but should be M? isupper() but fails to have at least one cased character with all cased characters uppercase! ?M isupper() but fails to have at least one cased character with all cased characters uppercase! ?M istitle() but should not be M?KINLEY isupper() but fails to have at least one cased character with all cased characters uppercase! I really don't understand. BTW, I feel that M?Kinley is titlecase in that lowercase always follows uppercase and uppercase never follows itself. And Python agrees with me. But that same definition should vet ThisIsInTitleCaseYouKnow, but Python disagrees. I really don't understand any of these functions. I'm very sad. I think they are wrong, but maybe I am. It is extremely confusing. Shall I file a separate bug report? --tom from __future__ import unicode_literals from __future__ import print_function import regex VERBOSE = 0 data = [ # first test the problem cases just one at a time "\N{MODIFIER LETTER SMALL C}", "\N{SUPERSCRIPT LATIN SMALL LETTER N}", "\N{MODIFIER LETTER CAPITAL D}", "\N{CIRCLED LATIN SMALL LETTER K}", "\N{COMBINING GREEK YPOGEGRAMMENI}", "\N{ROMAN NUMERAL EIGHT}", "\N{SMALL ROMAN NUMERAL EIGHT}", "\N{LATIN CAPITAL LETTER D WITH SMALL LETTER Z}", "\N{LATIN LETTER SMALL CAPITAL R}", # test superscripts "2\N{SUPERSCRIPT LATIN SMALL LETTER N}\N{MODIFIER LETTER SMALL D}", "2\N{MODIFIER LETTER CAPITAL N}\N{MODIFIER LETTER CAPITAL D}", "2\N{FEMININE ORDINAL INDICATOR}", # as in "segunda" # test romans "ROMAN NUMERAL EIGHT IS \N{ROMAN NUMERAL EIGHT}", "roman numeral eight is \N{SMALL ROMAN NUMERAL EIGHT}", # test small caps "\N{LATIN LETTER SMALL CAPITAL R}\N{LATIN LETTER SMALL CAPITAL A}\N{LATIN LETTER SMALL CAPITAL R}\N{LATIN LETTER SMALL CAPITAL E}", # test cased combining mark (this is in titlecase) "\N{GREEK CAPITAL LETTER ALPHA WITH VARIA}\N{COMBINING GREEK YPOGEGRAMMENI}", "\N{GREEK CAPITAL LETTER ALPHA WITH VARIA}\N{COMBINING GREEK YPOGEGRAMMENI} \N{GREEK CAPITAL LETTER 
SIGMA}\N{GREEK SMALL LETTER TAU}\N{GREEK SMALL LETTER OMICRON} \N{GREEK CAPITAL LETTER DELTA}\N{GREEK SMALL LETTER IOTA}\N{GREEK SMALL LETTER ALPHA WITH TONOS}\N{GREEK SMALL LETTER OMICRON}\N{GREEK SMALL LETTER LAMDA}\N{GREEK SMALL LETTER OMICRON}", # test cased symbols "circle \N{CIRCLED LATIN SMALL LETTER K}", "CIRCLE \N{CIRCLED LATIN CAPITAL LETTER K}", # test titlecased code point 3-way "\N{LATIN CAPITAL LETTER DZ}", "\N{LATIN CAPITAL LETTER DZ}UR", "\N{LATIN CAPITAL LETTER D WITH SMALL LETTER Z}ur", "\N{LATIN CAPITAL LETTER D WITH SMALL LETTER Z}", "\N{LATIN SMALL LETTER DZ}ur", "\N{LATIN SMALL LETTER DZ}", # test titlecase "FBI", "F B I", "F.B.I", "HP Company", "H.P. Company", "ThisIsInTitleCaseYouKnow", "M\N{MODIFIER LETTER SMALL C}", "\N{MODIFIER LETTER SMALL C}M", "M\N{MODIFIER LETTER SMALL C}Kinley", # titlecase "M\N{MODIFIER LETTER SMALL C}KINLEY", # uppercase "m\N{MODIFIER LETTER SMALL C}kinley", # lowercase # Return true if the string is a titlecased string and there # is at least one character, for example uppercase characters may # only follow uncased characters and lowercase characters only # cased ones. Return false otherwise. 
# Return true if all cased characters in the string are lowercase and there is at least one cased character, ] for s in data: # "Return true if all cased characters in the string are lowercase # and there is at least one cased character" if s.islower(): if not ( regex.search(r'\p{cased}', s) and not regex.search(r'(?=\p{cased})\P{LOWERCASE}', s)): print(s+" islower() but fails to have at least one cased character with all cased characters lowercase!") else: if ( regex.search(r'\p{cased}', s) and not regex.search(r'(?=\p{cased})\P{LOWERCASE}', s)): print(s+" not islower() but has at least one cased character with all cased characters lowercase!") # "Return true if all cased characters in the string are uppercase # and there is at least one cased character" if s.isupper(): if not ( regex.search(r'\p{cased}', s) and not regex.search(r'(?=\p{cased})\P{UPPERCASE}', s)): print(s+" isupper() but fails to have at least one cased character with all cased characters uppercase!") else: if ( regex.search(r'\p{cased}', s) and not regex.search(r'(?=\p{cased})\P{UPPERCASE}', s)): print(s+" not isupper() but has at least one cased character with all cased characters uppercase!") # "Return true if the string is a titlecased string and there is at # least one character, for example uppercase characters may only # follow uncased characters and lowercase characters only cased ones." has_it = s.istitle() want_it1 = ( # at least one title/uppercase regex.search(r'[\p{Lt}\p{uppercase}]', s) and not # plus no title/uppercase follows cased character regex.search(r'(?<=\p{cased})[\p{Lt}\p{uppercase}]', s) and not # plus no lowercase follows uncased character regex.search(r'(?<=\P{CASED})\p{lowercase}', s) ) want_it = regex.search(r'''(?x) ^ (?: \P{CASED} * [\p{Lt}\p{uppercase}] (?! 
[\p{Lt}\p{uppercase}] ) \p{lowercase} * ) + \P{CASED} * $ ''', s) if VERBOSE: if has_it and want_it: print( s + " istitle() and should be (OK)") if not has_it and not want_it: print( s + " not istitle() and should not be (OK)") if has_it and not want_it: print( s + " istitle() but should not be") if want_it and not has_it: print( s + " not istitle() but should be") ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 06:15:50 2011 From: report at bugs.python.org (Ezio Melotti) Date: Mon, 03 Oct 2011 04:15:50 +0000 Subject: [issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace In-Reply-To: <1313430514.3.0.983525514499.issue12753@psf.upfronthosting.co.za> Message-ID: <1317615350.8.0.846192018045.issue12753@psf.upfronthosting.co.za> Ezio Melotti added the comment: > But it still has to happen at compile time, of course, so I don't know > what you could do in Python. Is there any way to change how the compiler > behaves even vaguely along these lines? I think things like "from __future__ import ..." do something similar, but I'm not sure it will work in this case (also because you will have to provide the list of aliases somehow). >> Really? White space makes things harder to read? I thought Pythonistas >> believed the opposite of that. Whitespace is very useful for cognitive >> chunking: you see how things logically group together. > I was surprised at that too ;-). One person's opinion in a specific > context. Don't generaliza. Also don't generalize my opinion regarding *where* whitespace makes thing less readable: I was just talking about regex. What I was trying to say here is best summarized by a quote from Paul Graham's article "Succinctness is Power": """ If you're used to reading novels and newspaper articles, your first experience of reading a math paper can be dismaying. It could take half an hour to read a single page. 
And yet, I am pretty sure that the notation is not the problem, even though it may feel like it is. The math paper is hard to read because the ideas are hard. If you expressed the same ideas in prose (as mathematicians had to do before they evolved succinct notations), they wouldn't be any easier to read, because the paper would grow to the size of a book. """

Try replacing

s/novels and newspaper articles|prose/Python code/g
s/single page/single regex/
s/math paper/regex/g.

To provide an example, I find:

# define a function to capitalize s
def my_capitalize(s):
    """This function capitalizes the argument s and returns it"""
    the_first_letter = s[0]  # 0 means the first char
    the_rest_of_s = s[1:]  # 1: means from the second till the end
    the_first_letter_uppercased = the_first_letter.upper()  # upper makes the string uppercase
    the_rest_of_s_lowercased = the_rest_of_s.lower()  # lower makes the string lowercase
    s_capitalized = the_first_letter_uppercased + the_rest_of_s_lowercased  # + concatenates
    return s_capitalized

less readable than:

def my_capitalize(s):
    return s[0].upper() + s[1:].lower()

You could argue that the first is much more explicit and in a way clearer, but overall I think you agree with me that it is less readable. Also this clearly depends on how well you know the notation you are reading: if you don't know it very well, you might still prefer the commented/verbose/extended/redundant version. Another important thing to mention is that the notation of regular expressions is fairly simple (especially if you leave out look-arounds and Unicode-related things that are not used too often), but having a similar succinct notation for a whole programming language (like Perl) might not work as well (I'm not picking on Perl here, as you said you can write readable programs if you don't abuse the notation, and the succinctness offered by the language has some advantages, but with Python we prefer more readable, even if we have to be a little more verbose).
Another example of a trade-off between verbosity and succinctness is the new string formatting mini-language. > That really isn't right. A cased character is one with the Unicode "Cased" > property, and a lowercase character is one with the Unicode "Lowercase" > property. The General Category is actually immaterial here. You might want to take a look and possibly add a comment on #12204 about this. > I've spent all bloody day trying to model Python's islower, isupper, and istitle > functions, but I get all kinds of errors, both in the definitions and in the > models of the definitions. If by "model" you mean "trying to figure out how they work", it's probably easier to look at the implementation (I assume you know enough C to understand what they do). You can find the code for str.istitle() at http://hg.python.org/cpython/file/default/Objects/unicodeobject.c#l10358 and the actual implementation of some macros like Py_UNICODE_ISTITLE at http://hg.python.org/cpython/file/default/Objects/unicodectype.c. > I really don't understand any of these functions. I'm very sad. I think they are > wrong, but maybe I am. It is extremely confusing. > Shall I file a separate bug report? If after reading the code and/or the documentation you still think they are broken and/or that they can be improved, then you can open another issue. BTW, instead of writing custom scripts to test things, it might be better to use unittest (see http://docs.python.org/py3k/library/unittest.html#basic-example), or even better write a patch for Lib/test/test_unicode.py. Using unittest has the advantage that it is then easy to integrate those tests within our test suite, but on the other hand as soon as something fails the failure is returned without evaluating the following assertions in the method.
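As a rough illustration of that suggestion, here is a small unittest sketch. It is not taken from Lib/test/test_unicode.py: the class name CasePredicateTests and the ASCII-only sample strings are assumptions chosen so that the expected values are uncontroversial. It also assumes a Python that provides TestCase.subTest (3.4 or later), which lets the remaining cases in a method keep running after one of them fails.

```python
import unittest


class CasePredicateTests(unittest.TestCase):
    """Hypothetical test case for str.islower() and str.istitle()."""

    def test_islower(self):
        cases = [("abc", True), ("Abc", False), ("123", False)]
        for s, expected in cases:
            # Each case is reported on its own; a failure does not
            # stop the loop.
            with self.subTest(s=s):
                self.assertEqual(s.islower(), expected)

    def test_istitle(self):
        cases = [
            ("Hello World", True),
            ("FBI", False),
            ("ThisIsInTitleCaseYouKnow", False),
        ]
        for s, expected in cases:
            with self.subTest(s=s):
                self.assertEqual(s.istitle(), expected)


# Run the tests explicitly so the result object can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CasePredicateTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The non-ASCII problem cases from the script earlier in the thread could be added to the same tables once the expected values are agreed on.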
This as the advantage that ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 07:22:11 2011 From: report at bugs.python.org (jfalskfjdsl;akfdjsa;l laksfj;aslkfdj;sal) Date: Mon, 03 Oct 2011 05:22:11 +0000 Subject: [issue13071] IDLE refuses to open on windows 7 In-Reply-To: <1317341122.37.0.866117748717.issue13071@psf.upfronthosting.co.za> Message-ID: <1317619331.76.0.626152135204.issue13071@psf.upfronthosting.co.za> jfalskfjdsl;akfdjsa;l laksfj;aslkfdj;sal added the comment: Traceback (most recent call last): File "C:\Python32\Lib\idlelib/idle.py", line 11, in idlelib.PyShell.main() File "C:\Python32\Lib\idlelib\PyShell.py",line 1377, in main shell = flist.open_shell() File "C:\Python32\Lib\idlelib\PyShell.py", line 273, in open_shell self.pyshell = PyShell(self) File "C:\Python32\Lib\idlelib\Pyshell.py", line 802, in __init__ OutputWindow.__init__(self,flist, none, none) File "C:\Python32\Lib\idlelib\OutputWindow.py", line 16, in __init__ EditorWindow.__init__(self,*args) File "C:\Python32\Lib\idlelib\EditorWindow.py", line 145, in __init__ self.aply_bindings() File "C:\Python32\Lib\idlelib\EditorWindow.py", line 985, in apply_bindings text.event_add(event, *keylist) File "C:\Python32\Lib\idlelib\MultiCall.py", line 359, in event_add widget.event_add(self, virtual, seq) File C:\Python32\Lib\tkinter\__init__.py", line 1353, in event_add self.tk.call(args) _tkinter.TclError: bad event type or keysym "Alt" ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 07:23:03 2011 From: report at bugs.python.org (jfalskfjdsl;akfdjsa;l laksfj;aslkfdj;sal) Date: Mon, 03 Oct 2011 05:23:03 +0000 Subject: [issue13071] IDLE refuses to open on windows 7 In-Reply-To: <1317341122.37.0.866117748717.issue13071@psf.upfronthosting.co.za> Message-ID: 
<1317619383.85.0.443772759877.issue13071@psf.upfronthosting.co.za> jfalskfjdsl;akfdjsa;l laksfj;aslkfdj;sal added the comment: That is the traceback given when I run idle.py through the Windows command prompt ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 08:36:53 2011 From: report at bugs.python.org (Mark Hammond) Date: Mon, 03 Oct 2011 06:36:53 +0000 Subject: [issue7833] Bdist_wininst installers fail to load extensions built with Issue4120 patch In-Reply-To: <1265062373.01.0.461114831555.issue7833@psf.upfronthosting.co.za> Message-ID: <1317623813.09.0.874194510321.issue7833@psf.upfronthosting.co.za> Mark Hammond added the comment: This is biting people (including me :) so I'm going to try hard to get this fixed. One user on the python-win32 mailing list resorts to rebuilding every 3rd party module he uses with this patch to get things working again (although apps which use only builtin modules or pywin32 modules - which already hacks this basic functionality in - don't see the problem.) I'm attaching a different patch that should have the same default effect as Christoph's, but also allows the behaviour to be overridden. Actually overriding it is quite difficult as distutils isn't set up to easily allow such compiler customizations - but at least it *is* possible. To test this out I hacked both the pywin32 and py2exe build processes to use those customizations and it works fine and allows them both to customize the behaviour to meet various modules' requirements. Finally, this whole thing is still fundamentally broken for extensions which need a manifest (eg, to reference the common controls or the requestedExecutionLevel cruft). These extensions will still need to include the CRT reference in their manifest and thus will need a copy of the CRT next to each of them.
I *think* this also means they basically get a private copy of the CRT - they are not sharing the CRT with Python, which means they are at risk of hitting problems such as trying to share FILE * objects. In practice, this means such modules are probably better off just embedding the CRT statically. This is the route I've taken for one pywin32 module so the module can have a manifest and still work without a complete, private copy of the CRT needing to live next to it. But even with that problem I think this patch should land. It would be great if someone can review and test this version of the patch and I'll check it in. ---------- versions: +Python 3.3, Python 3.4 -Python 2.6 Added file: http://bugs.python.org/file23305/bug-7833-overridable-manifest-settings.patch _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 10:17:51 2011 From: report at bugs.python.org (Terry J. Reedy) Date: Mon, 03 Oct 2011 08:17:51 +0000 Subject: [issue13071] IDLE refuses to open on windows 7 In-Reply-To: <1317341122.37.0.866117748717.issue13071@psf.upfronthosting.co.za> Message-ID: <1317629871.2.0.459395112092.issue13071@psf.upfronthosting.co.za> Terry J. Reedy added the comment: Are you using the .msi installer from python.org? Or one from activestate or enthought? ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 10:20:37 2011 From: report at bugs.python.org (Ezio Melotti) Date: Mon, 03 Oct 2011 08:20:37 +0000 Subject: [issue13071] IDLE refuses to open on windows 7 In-Reply-To: <1317341122.37.0.866117748717.issue13071@psf.upfronthosting.co.za> Message-ID: <1317630037.98.0.973499009545.issue13071@psf.upfronthosting.co.za> Ezio Melotti added the comment: Are you using some "unusual" keyboard layout?
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 10:51:45 2011 From: report at bugs.python.org (Amaury Forgeot d'Arc) Date: Mon, 03 Oct 2011 08:51:45 +0000 Subject: [issue13071] IDLE refuses to open on windows 7 In-Reply-To: <1317341122.37.0.866117748717.issue13071@psf.upfronthosting.co.za> Message-ID: <1317631905.55.0.824742964473.issue13071@psf.upfronthosting.co.za> Amaury Forgeot d'Arc added the comment: This issue is very similar to issue5707: it is possible to define a custom key binding to "" or "": just click the Alt box and don't select a letter. There is no check, it's possible to save this buggy key binding, and IDLE won't start anymore. IDLE should: - check the validity of the binding and refuse to save it when it is invalid - gracefully skip invalid bindings from the config file ---------- nosy: +amaury.forgeotdarc _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 12:09:58 2011 From: report at bugs.python.org (Ezio Melotti) Date: Mon, 03 Oct 2011 10:09:58 +0000 Subject: [issue12804] "make test" fails on systems without internet access In-Reply-To: <1313935702.14.0.838210022398.issue12804@psf.upfronthosting.co.za> Message-ID: <1317636598.22.0.929523788925.issue12804@psf.upfronthosting.co.za> Ezio Melotti added the comment: FWIW there's also support.open_urlresource that can be used to download test data. open_urlresource calls requires('urlfetch') and skips the test when the resource is not enabled.
For instance, test_normalization uses it: try: testdata = support.open_urlresource(TESTDATAURL, encoding="utf-8", check=check_version) except (IOError, HTTPException): self.skipTest("Could not retrieve " + TESTDATAURL) self.addCleanup(testdata.close) This also saves the file on the disk and reuses it when the test is run again, so the connection is actually used once and only if available. ---------- nosy: +ezio.melotti _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 12:20:11 2011 From: report at bugs.python.org (Ned Deily) Date: Mon, 03 Oct 2011 10:20:11 +0000 Subject: [issue7425] Improve the robustness of "pydoc -k" in the face of broken modules In-Reply-To: <1259786972.24.0.35920903506.issue7425@psf.upfronthosting.co.za> Message-ID: <1317637211.05.0.491072884624.issue7425@psf.upfronthosting.co.za> Ned Deily added the comment: It turns out that the proposed fix here for pydoc was independently added in the early days of Python 3 but was not backported. That fix for 2.7 plus a fix-in-progress for Issue7367 (for both 2.7 and 3.x) and additional test cases (also in progress) should address most if not all of the pydoc crash issues. I'll also review the other open issues. ---------- assignee: -> ned.deily _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 12:34:34 2011 From: report at bugs.python.org (Stefan Krah) Date: Mon, 03 Oct 2011 10:34:34 +0000 Subject: [issue13072] Getting a buffer from a Unicode array uses invalid format In-Reply-To: <1317341390.65.0.255779328054.issue13072@psf.upfronthosting.co.za> Message-ID: <1317638074.27.0.573358700089.issue13072@psf.upfronthosting.co.za> Stefan Krah added the comment: The automatic conversion of 'u' to 'I' or 'L' causes test_buffer (PEP-3118 repo) to fail: # Not implemented formats. Ugly, but inevitable. 
This is the same as # issue #2531: equality is also used for membership testing and must # return a result. a = array.array('u', 'xyz') v = memoryview(a) self.assertNotEqual(v, a) self.assertNotEqual(a, v) I don't have a better idea though what to do about 'u' except officially implementing it for struct and memoryview as well. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 12:44:48 2011 From: report at bugs.python.org (Stefan Krah) Date: Mon, 03 Oct 2011 10:44:48 +0000 Subject: [issue13072] Getting a buffer from a Unicode array uses invalid format In-Reply-To: <1317341390.65.0.255779328054.issue13072@psf.upfronthosting.co.za> Message-ID: <1317638688.48.0.173780283507.issue13072@psf.upfronthosting.co.za> Changes by Stefan Krah : ---------- nosy: +meador.inge _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 12:49:35 2011 From: report at bugs.python.org (Ezio Melotti) Date: Mon, 03 Oct 2011 10:49:35 +0000 Subject: [issue12458] Tracebacks should contain the first line of continuation lines In-Reply-To: <1309499207.17.0.676241559437.issue12458@psf.upfronthosting.co.za> Message-ID: <1317638975.4.0.223337846609.issue12458@psf.upfronthosting.co.za> Ezio Melotti added the comment: This is an interesting proposal. The line number comes from Python/traceback.c:120: tb->tb_lineno = PyFrame_GetLineNumber(frame); and this function is defined in Objects/frameobject.c:35: int PyFrame_GetLineNumber(PyFrameObject *f) { if (f->f_trace) return f->f_lineno; else return PyCode_Addr2Line(f->f_code, f->f_lasti); } and documented as "Return the line number that frame is currently executing.", so that would explain why it's pointing to the last line. 
I'm not sure if there's an easy way to get the line where the beginning of the expression is, but if you find a way to get it, we could try to use it in PyFrame_GetLineNumber and see how it works. ---------- nosy: +ezio.melotti, georg.brandl, ncoghlan stage: -> test needed versions: +Python 3.3 -Python 2.6 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 12:52:43 2011 From: report at bugs.python.org (Stefan Krah) Date: Mon, 03 Oct 2011 10:52:43 +0000 Subject: [issue13072] Getting a buffer from a Unicode array uses invalid format In-Reply-To: <1317341390.65.0.255779328054.issue13072@psf.upfronthosting.co.za> Message-ID: <1317639163.43.0.12362076793.issue13072@psf.upfronthosting.co.za> Stefan Krah added the comment: >It would be better to use a format for a Py_UCS4 string, but struct doesn't support such type. PEP-3118 suggests for the extended struct syntax: 'c' -> ucs-1 (latin-1) encoding 'u' -> ucs-2 'w' -> ucs-4 ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 12:59:47 2011 From: report at bugs.python.org (Ezio Melotti) Date: Mon, 03 Oct 2011 10:59:47 +0000 Subject: [issue6632] Include more fullwidth chars in the decimal codec In-Reply-To: <1249317285.35.0.709481915004.issue6632@psf.upfronthosting.co.za> Message-ID: <1317639587.07.0.341077909878.issue6632@psf.upfronthosting.co.za> Changes by Ezio Melotti : ---------- nosy: +tchrist _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 13:44:14 2011 From: report at bugs.python.org (=?utf-8?q?Martin_v=2E_L=C3=B6wis?=) Date: Mon, 03 Oct 2011 11:44:14 +0000 Subject: [issue12911] Expose a private accumulator C API In-Reply-To: <1315311609.99.0.705451675521.issue12911@psf.upfronthosting.co.za> Message-ID: 
<1317642254.13.0.480486303814.issue12911@psf.upfronthosting.co.za>

Martin v. Löwis added the comment:

> It's not a container type, just a small C struct that
> gets allocated on the stack. Think of it as a library, like stringlib.

That's what I call a container type: a structure with a library :-)

> That's another possibility. But we'd have to expose a
> C API anyway, and this one is as good as any other.

No, it's not: it's additional clutter. If new API needs to be added, adding it for existing structures is better. Notice that you don't *need* new API, as you can use StringIO just fine from C also.

> Note that StringIO will copy data twice (once when calling
> write(), once when calling getvalue()), while ''.join() only once (at
> the end, when concatenating all strings).

Sounds like a flaw in StringIO to me. It could also manage a list of strings that have been written, rather than only using a flat buffer. Only when someone actually needs a linear buffer, it could convert it (and use a plain string.join when getvalue is called and there is no buffer at all).

----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 13:58:02 2011
From: report at bugs.python.org (Antoine Pitrou)
Date: Mon, 03 Oct 2011 11:58:02 +0000
Subject: [issue12911] Expose a private accumulator C API
In-Reply-To: <1317642254.13.0.480486303814.issue12911@psf.upfronthosting.co.za>
Message-ID: <1317642873.3670.2.camel@localhost.localdomain>

Antoine Pitrou added the comment:

> > That's another possibility. But we'd have to expose a
> > C API anyway, and this one is as good as any other.
>
> No, it's not: it's additional clutter. If new API needs to be added,
> adding it for existing structures is better. Notice that you don't
> *need* new API, as you can use StringIO just fine from C also.

Yes, but using StringIO without a dedicated C API is more tedious and quite slower.
> > Note that StringIO will copy data twice (once when calling > > write(), once when calling getvalue()), while ''.join() only once (at > > the end, when concatenating all strings). > > Sounds like a flaw in StringIO to me. It could also manage a list of > strings that have been written, rather than only using a flat buffer. > Only when someone actually needs a linear buffer, it could convert it > (and use a plain string.join when getvalue is called and there is no > buffer at all). That's what I thought as well. However, that's probably too much for a bugfix release (and this issue is meant to allow test_bigmem to pass on 3.x). ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 15:01:45 2011 From: report at bugs.python.org (Fred L. Drake, Jr.) Date: Mon, 03 Oct 2011 13:01:45 +0000 Subject: [issue670664] HTMLParser.py - more robust SCRIPT tag parsing Message-ID: <1317646905.81.0.00533848408685.issue670664@psf.upfronthosting.co.za> Changes by Fred L. Drake, Jr. : ---------- nosy: -fdrake _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 15:34:27 2011 From: report at bugs.python.org (STINNER Victor) Date: Mon, 03 Oct 2011 13:34:27 +0000 Subject: [issue13072] Getting a buffer from a Unicode array uses invalid format In-Reply-To: <1317638074.27.0.573358700089.issue13072@psf.upfronthosting.co.za> Message-ID: <4E89B9E1.3020801@haypocalc.com> STINNER Victor added the comment: > The automatic conversion of 'u' to 'I' or 'L' causes test_buffer > (PEP-3118 repo) to fail: > > > # Not implemented formats. Ugly, but inevitable. This is the same as > # issue #2531: equality is also used for membership testing and must > # return a result. 
> a = array.array('u', 'xyz') > v = memoryview(a) > self.assertNotEqual(v, a) > self.assertNotEqual(a, v) I don't understand: a buffer format is a format for the struct module, or for the array module? ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 16:00:51 2011 From: report at bugs.python.org (Stefan Krah) Date: Mon, 03 Oct 2011 14:00:51 +0000 Subject: [issue13072] Getting a buffer from a Unicode array uses invalid format In-Reply-To: <4E89B9E1.3020801@haypocalc.com> Message-ID: <20111003135737.GA6212@sleipnir.bytereef.org> Stefan Krah added the comment: STINNER Victor wrote: > > # Not implemented formats. Ugly, but inevitable. This is the same as > > # issue #2531: equality is also used for membership testing and must > > # return a result. > > a = array.array('u', 'xyz') > > v = memoryview(a) > > self.assertNotEqual(v, a) > > self.assertNotEqual(a, v) > > I don't understand: a buffer format is a format for the struct module, > or for the array module? It's like this: memoryview follows the current struct syntax, which doesn't have 'u'. memory_richcompare() does not understand 'u', but is required to return something for __eq__ and __ne__, so it returns 'not equal'. This isn't so important, since I discovered (see my later post) that 'u' and 'w' were scheduled for inclusion in the struct module anyway. So I think we should focus on whether the proposed 'c', 'u' and 'w' format specifiers still make sense after the PEP-393 changes. 
----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 16:39:57 2011
From: report at bugs.python.org (=?utf-8?b?VG9tw6HFoSBEdm/FmcOhaw==?=)
Date: Mon, 03 Oct 2011 14:39:57 +0000
Subject: [issue13094] setattr misbehaves when used with lambdas inside for loop
Message-ID: <1317652797.3.0.357677492942.issue13094@psf.upfronthosting.co.za>

New submission from Tomáš Dvořák :

I have this python script, and run it in python 2.7.2 (installed from EPD_free 7.1-2 (32-bit)), but I guess this has nothing to do with EPD.

----8<---fail.py------
class X(object):
    pass

x = X()
items = ["foo", "bar", "baz"]

for each in items:
    setattr(x, each, lambda: each)

print("foo", x.foo())
print("bar", x.bar())
print("baz", x.baz())
----8<---fail.py------

I'd naively expect it to print

('foo', 'foo')
('bar', 'bar')
('baz', 'baz')

but it surprisingly (and annoyingly) outputs

('foo', 'baz')
('bar', 'baz')
('baz', 'baz')

Please, tell me that this is a bug :) I'd hate it if this was the intended behaviour. I spent two hours today before I found out this was the cause of my program to fail.

Best regards,
Tomáš Dvořák

----------
components: None
messages: 144819
nosy: Tomáš.Dvořák
priority: normal
severity: normal
status: open
title: setattr misbehaves when used with lambdas inside for loop
versions: Python 2.7

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 16:45:28 2011
From: report at bugs.python.org (R. David Murray)
Date: Mon, 03 Oct 2011 14:45:28 +0000
Subject: [issue13094] setattr misbehaves when used with lambdas inside for loop
In-Reply-To: <1317652797.3.0.357677492942.issue13094@psf.upfronthosting.co.za>
Message-ID: <1317653128.63.0.875115518309.issue13094@psf.upfronthosting.co.za>

R. David Murray added the comment:

Sorry. It is intended behavior.
The lambda 'each' is bound to the local 'each', and by the time the lambda's execute, the value of 'each' is 'baz'. I'm going to turn this into a doc bug, because while I'm pretty sure this is documented *somewhere*, I don't see it in the programming FAQ, and it should be there. ---------- assignee: -> docs at python components: +Documentation -None nosy: +docs at python, r.david.murray stage: -> needs patch type: -> behavior versions: +Python 3.2, Python 3.3 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 16:46:35 2011 From: report at bugs.python.org (R. David Murray) Date: Mon, 03 Oct 2011 14:46:35 +0000 Subject: [issue13094] Need Programming FAQ entry for the behavior of closures In-Reply-To: <1317652797.3.0.357677492942.issue13094@psf.upfronthosting.co.za> Message-ID: <1317653195.47.0.0454031356019.issue13094@psf.upfronthosting.co.za> Changes by R. David Murray : ---------- title: setattr misbehaves when used with lambdas inside for loop -> Need Programming FAQ entry for the behavior of closures _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 17:03:19 2011 From: report at bugs.python.org (Ezio Melotti) Date: Mon, 03 Oct 2011 15:03:19 +0000 Subject: [issue13094] Need Programming FAQ entry for the behavior of closures In-Reply-To: <1317652797.3.0.357677492942.issue13094@psf.upfronthosting.co.za> Message-ID: <1317654199.49.0.0212202123252.issue13094@psf.upfronthosting.co.za> Ezio Melotti added the comment: To understand better what's going on, try to change the value of 'each' after the 3 prints and then call again the 3 methods: you will see that they now return the new value of each. This is because the lambdas refer to global 'each' (that at the end of the loop is set to 'baz'). 
If you do setattr(x, each, lambda each=each: each), the each will be local to the lambda, and it will then work as expected.

An entry in the FAQ would be useful, I thought it was there already but apparently it's not (I'm pretty sure I saw this already somewhere in the doc, but I can't seem to find where).

----------
assignee: docs at python ->
components: +None -Documentation
nosy: +ezio.melotti
stage: needs patch ->
type: behavior ->
versions: -Python 3.2, Python 3.3

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 17:13:33 2011
From: report at bugs.python.org (=?utf-8?b?VG9tw6HFoSBEdm/FmcOhaw==?=)
Date: Mon, 03 Oct 2011 15:13:33 +0000
Subject: [issue13094] Need Programming FAQ entry for the behavior of closures
In-Reply-To: <1317652797.3.0.357677492942.issue13094@psf.upfronthosting.co.za>
Message-ID: <1317654813.95.0.772062209318.issue13094@psf.upfronthosting.co.za>

Tomáš Dvořák added the comment:

Thank you all very much for the super-quick responses. I'm used to smalltalk, so the python variable binding behaviour is unnatural to me, but I guess there must have been some reasons for making it behave this way.

Ezio, the "lambda each=each: each" trick works nicely, thanks a lot. But - what does it mean? :) I just don't know how to parse and understand it :-)

Best regards,
Tomáš Dvořák

----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 17:19:15 2011
From: report at bugs.python.org (Artyom Gavrichenkov)
Date: Mon, 03 Oct 2011 15:19:15 +0000
Subject: [issue13045] socket.getsockopt may require custom buffer contents
In-Reply-To: <1316970740.88.0.976956666008.issue13045@psf.upfronthosting.co.za>
Message-ID: <1317655155.84.0.736571217854.issue13045@psf.upfronthosting.co.za>

Changes by Artyom Gavrichenkov :

----------
nosy: +python-dev

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 17:22:44 2011
From: report at bugs.python.org (Ezio Melotti)
Date: Mon, 03 Oct 2011 15:22:44 +0000
Subject: [issue13094] Need Programming FAQ entry for the behavior of closures
In-Reply-To: <1317652797.3.0.357677492942.issue13094@psf.upfronthosting.co.za>
Message-ID: <1317655364.34.0.784602471719.issue13094@psf.upfronthosting.co.za>

Ezio Melotti added the comment:

Maybe with a different name is less confusing:

    lambda return_value=each: return_value

This copies the value of 'each' in a variable called 'return_value' that is local to the lambda. Since the copy happens when the lambdas are defined, 'return_value' has then the right value.
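
The difference between the two forms discussed in this thread can be seen side by side in a minimal sketch. It is written for Python 3 (the original report used Python 2.7), the class X and the attribute names come from the report, and the helper names make_late and make_early are made up for illustration only.

```python
class X(object):
    pass


def make_late(names):
    """Attach lambdas that look up 'each' only when they are called."""
    x = X()
    for each in names:
        setattr(x, each, lambda: each)
    return x


def make_early(names):
    """Attach lambdas that capture the current value as a default argument."""
    x = X()
    for each in names:
        setattr(x, each, lambda each=each: each)
    return x


names = ["foo", "bar", "baz"]
late = make_late(names)
early = make_early(names)

# Every late-bound lambda sees the final value of the loop variable.
print(late.foo(), late.bar(), late.baz())
# Default arguments are evaluated once, when each lambda is created.
print(early.foo(), early.bar(), early.baz())
```

The default-argument form works because the default is evaluated at definition time, so each lambda keeps its own copy of the value rather than a reference to the shared loop variable.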
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 18:33:08 2011 From: report at bugs.python.org (Eric Snow) Date: Mon, 03 Oct 2011 16:33:08 +0000 Subject: [issue11816] Refactor the dis module to provide better building blocks for bytecode analysis In-Reply-To: <1302394810.0.0.38146154248.issue11816@psf.upfronthosting.co.za> Message-ID: <1317659588.19.0.256725431039.issue11816@psf.upfronthosting.co.za> Changes by Eric Snow : ---------- nosy: +eric.snow _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 18:41:08 2011 From: report at bugs.python.org (Artyom Gavrichenkov) Date: Mon, 03 Oct 2011 16:41:08 +0000 Subject: [issue13045] socket.getsockopt may require custom buffer contents In-Reply-To: <1316970740.88.0.976956666008.issue13045@psf.upfronthosting.co.za> Message-ID: <1317660068.61.0.564487931286.issue13045@psf.upfronthosting.co.za> Changes by Artyom Gavrichenkov : ---------- nosy: +neologix _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 18:48:27 2011 From: report at bugs.python.org (Artyom Gavrichenkov) Date: Mon, 03 Oct 2011 16:48:27 +0000 Subject: [issue13045] socket.getsockopt may require custom buffer contents In-Reply-To: <1316970740.88.0.976956666008.issue13045@psf.upfronthosting.co.za> Message-ID: <1317660507.85.0.606315347557.issue13045@psf.upfronthosting.co.za> Changes by Artyom Gavrichenkov : ---------- nosy: +pitrou _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 19:04:47 2011 From: report at bugs.python.org (=?utf-8?q?Charles-Fran=C3=A7ois_Natali?=) Date: Mon, 03 Oct 2011 17:04:47 +0000 Subject: [issue13045] socket.getsockopt may require custom buffer contents In-Reply-To: 
<1316970740.88.0.976956666008.issue13045@psf.upfronthosting.co.za>
Message-ID: <1317661487.13.0.16150929828.issue13045@psf.upfronthosting.co.za>

Charles-François Natali added the comment:

Hello,

method:: socket.getsockopt(level, optname[, optarg])

The overloading of the third parameter is confusing: it can already be an integer value or a buffer size, I don't think that adding a third possibility is a good idea. It might be better to add another optional `buffer` argument (and ignore `buflen` if this argument is provided).

Also, it would be nice to have a corresponding unit test: since I doubt this buffer argument is supported by many Unices out there, you can probably reuse a subset of what ipset does (just take care and guard it by @support.requires_linux_version() if applicable).

----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 19:29:46 2011
From: report at bugs.python.org (=?utf-8?q?Martin_v=2E_L=C3=B6wis?=)
Date: Mon, 03 Oct 2011 17:29:46 +0000
Subject: [issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace
In-Reply-To: <9072.1317605488@chthon>
Message-ID: <4E89F108.8090301@v.loewis.de>

Martin v. Löwis added the comment:

>> There are no official English titling rules and as you noted,
>> publishers vary.
>
> If there aren't any rules, then how come all book and movie titles always
> look the same? :)

Can we please leave the English language out of this issue? Else I will ask that Python uses German text-processing rules, just so that this gets fewer comments :-)

As a point of order, please all try to stick to the issue at hand. Linguistics discussions or general Unicode discussion have better places than this bug tracker. I just had to stop reading Tom's comments as too verbose (which is more difficult since it's in a foreign language).
----------
title: \N{...} neglects formal aliases and named sequences from Unicode charnames namespace -> \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 19:41:07 2011
From: report at bugs.python.org (Roundup Robot)
Date: Mon, 03 Oct 2011 17:41:07 +0000
Subject: [issue13001] test_socket.testRecvmsgTrunc failure on FreeBSD 7.2 buildbot
In-Reply-To: <1316270071.04.0.179563462632.issue13001@psf.upfronthosting.co.za>
Message-ID:

Roundup Robot added the comment:

New changeset 4378bae6b8dc by Charles-François Natali in branch 'default':
Issue #13001: Fix test_socket.testRecvmsgTrunc failure on FreeBSD < 8, which
http://hg.python.org/cpython/rev/4378bae6b8dc

----------
nosy: +python-dev

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 19:44:44 2011
From: report at bugs.python.org (Ezio Melotti)
Date: Mon, 03 Oct 2011 17:44:44 +0000
Subject: [issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace
In-Reply-To: <1313430514.3.0.983525514499.issue12753@psf.upfronthosting.co.za>
Message-ID: <1317663884.57.0.332916970151.issue12753@psf.upfronthosting.co.za>

Ezio Melotti added the comment:

The patch is pretty much complete, it just needs a review (I left some comments on the review page).
One thing that can be added is some compression for the names of the named sequences. I'm not sure I can reuse the same compression used for the other names easily.
Does the size of the db really matter? Are the new names using too much extra space?
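
For readers following the issue, here is a small sketch of how the split discussed in this thread looks from Python code. It assumes Python 3.3 or later, where unicodedata.lookup() accepts names, aliases and named sequences while \N{...} accepts names and aliases only. The two example names are taken from the Unicode NameAliases.txt and NamedSequences.txt data files, not from the messages above.

```python
import unicodedata

# A formal alias from NameAliases.txt: U+01A2 is officially named
# LATIN CAPITAL LETTER OI, and this is its corrected alias.
alias = "LATIN CAPITAL LETTER GHA"
# A named sequence from NamedSequences.txt: 'r' followed by U+0303.
sequence = "LATIN SMALL LETTER R WITH TILDE"

by_alias = unicodedata.lookup(alias)
by_sequence = unicodedata.lookup(sequence)

print(hex(ord(by_alias)))                    # a single code point
print([hex(ord(ch)) for ch in by_sequence])  # more than one code point

# \N{...} in a string literal accepts the alias...
literal = eval(r'"\N{%s}"' % alias)

# ...but a named sequence is rejected when the literal is compiled.
try:
    eval(r'"\N{%s}"' % sequence)
    sequence_in_literal = True
except SyntaxError:
    sequence_in_literal = False

print(literal == by_alias, sequence_in_literal)
```

Keeping sequences out of \N{...} means a single escape always denotes a single code point, which avoids the character-class complications raised earlier in the thread.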
----------
keywords: +needs review

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 20:29:28 2011
From: report at bugs.python.org (Artyom Gavrichenkov)
Date: Mon, 03 Oct 2011 18:29:28 +0000
Subject: [issue13045] socket.getsockopt may require custom buffer contents
In-Reply-To: <1316970740.88.0.976956666008.issue13045@psf.upfronthosting.co.za>
Message-ID: <1317666568.73.0.957785734202.issue13045@psf.upfronthosting.co.za>

Artyom Gavrichenkov added the comment:

Hi Charles-François,

I've attached an update for the previous patch. Now there's no more overloading for the third argument and socket.getsockopt accepts one more optional argument -- a buffer to use as an input to kernel.

I can provide a manual sample script, with getsockopt being used this way, that depends on Linux ipset kernel module being installed and modprobe'd. However, automatic unit test is not that easy to implement. Generally ipset requires certain kernel modules to operate, and we either have to install ipset in order to run a unit test, or to implement some mock Linux kernel module purely for testing and support it against all possible kernel versions. Not to mention that such a test should be carried out by root user only.

By the way, I don't really think that any POSIX-compliant UNIX out there would treat the buffer given to getsockopt in any way different from what Linux does. It is very easy to copy the buffer from user to kernel and back, and it is so inconvenient to prevent kernel from reading it prior to modification, that I bet no one has ever bothered to do this.
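
For context, a short sketch of the two call forms that socket.getsockopt() already supports, which is the existing behaviour the patch builds on: without a third argument the option is returned as an integer, and with a buffer length the raw bytes are returned. The extra input-buffer argument proposed in the patch is not shown, since its final signature is not given in these messages. SO_TYPE is used only because it is widely available; that choice is an assumption made for this example.

```python
import socket
import struct

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    # Two-argument form: the option value is decoded as a C int.
    as_int = sock.getsockopt(socket.SOL_SOCKET, socket.SO_TYPE)

    # Three-argument form: the third argument is a buffer length and
    # the raw option bytes are returned for the caller to unpack.
    size = struct.calcsize("i")
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_TYPE, size)
    from_bytes = struct.unpack("i", raw)[0]
finally:
    sock.close()

print(as_int == from_bytes)
```

In both forms the buffer is output only from the caller's point of view, which is why options that expect the caller to pre-fill the buffer need the additional argument discussed here.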
---------- Added file: http://bugs.python.org/file23306/getsockopt_buffer_input_v2.patch _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 20:30:50 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Mon, 03 Oct 2011 18:30:50 +0000 Subject: [issue12555] PEP 3151 implementation In-Reply-To: <1310595572.06.0.21556538548.issue12555@psf.upfronthosting.co.za> Message-ID: <1317666650.85.0.486187378965.issue12555@psf.upfronthosting.co.za> Antoine Pitrou added the comment: > Should the input of OSError be checked? It could, but pre-PEP it is not, so I assumed it's better to minimize compatibility-breaking changes. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 20:31:56 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Mon, 03 Oct 2011 18:31:56 +0000 Subject: [issue12555] PEP 3151 implementation In-Reply-To: <1310595572.06.0.21556538548.issue12555@psf.upfronthosting.co.za> Message-ID: <1317666716.74.0.563443938608.issue12555@psf.upfronthosting.co.za> Changes by Antoine Pitrou : Added file: http://bugs.python.org/file23307/554524a74bbe.diff _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 20:32:48 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Mon, 03 Oct 2011 18:32:48 +0000 Subject: [issue12555] PEP 3151 implementation In-Reply-To: <1310595572.06.0.21556538548.issue12555@psf.upfronthosting.co.za> Message-ID: <1317666768.91.0.761296781545.issue12555@psf.upfronthosting.co.za> Antoine Pitrou added the comment: Patch update against latest default. There shouldn't be anything interesting to see. 
----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 20:39:55 2011
From: report at bugs.python.org (Cal Leeming)
Date: Mon, 03 Oct 2011 18:39:55 +0000
Subject: [issue13095] Support for splitting lists/tuples into chunks
Message-ID: <1317667195.74.0.956951285114.issue13095@psf.upfronthosting.co.za>

New submission from Cal Leeming :

After a while of digging around, I noticed that the core libs don't provide an easy way of splitting a list/tuple into chunks - as per the following discussion:

http://www.aspwinhost.com/In-what-way-do-you-split-an-list-into-evenly-sized-chunks-on-Python/

Therefore, I'd like to +1 feature request this.

Any thoughts??

Cal

----------
components: Library (Lib)
messages: 144831
nosy: sleepycal
priority: normal
severity: normal
status: open
title: Support for splitting lists/tuples into chunks
type: feature request

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 20:44:25 2011
From: report at bugs.python.org (=?utf-8?q?Martin_v=2E_L=C3=B6wis?=)
Date: Mon, 03 Oct 2011 18:44:25 +0000
Subject: [issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace
In-Reply-To: <1313430514.3.0.983525514499.issue12753@psf.upfronthosting.co.za>
Message-ID: <1317667465.84.0.262627039258.issue12753@psf.upfronthosting.co.za>

Martin v. Löwis added the comment:

The patch needs to take versioning into account. It seems that NamedSequences were added in 4.1, and NameAliases in 5.0. So for the moment, when using 3.2 (i.e. when self is not NULL), it is fine to look up neither. Please put an assertion into makeunicodedata that this needs to be reviewed when an old version other than 3.2 needs to be supported.

The size of the DB does matter; there are frequent complaints about it. The named sequences take 20kB on my system; not sure whether that's too much.
If you want to reduce the size (and also speed up lookup), you could use private-use characters, like so:
- add the named sequences as PUA characters to the names table of makeunicodename, in the range(P, P+418) (for some P).
- in lookup, check whether the _getcode result is in range(P,P+418). If so, subtract P from the code and use this as an index into _namedsequences.
- add a _getcode wrapper that filters out all private use characters, for regular lookup.

----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 20:47:00 2011
From: report at bugs.python.org (Ezio Melotti)
Date: Mon, 03 Oct 2011 18:47:00 +0000
Subject: [issue13095] Support for splitting lists/tuples into chunks
In-Reply-To: <1317667195.74.0.956951285114.issue13095@psf.upfronthosting.co.za>
Message-ID: <1317667620.38.0.47815609184.issue13095@psf.upfronthosting.co.za>

Ezio Melotti added the comment:

This sounds like the grouper() recipe of itertools. You could try to convince Raymond and see if he wants to include it in itertools.

----------
nosy: +ezio.melotti, rhettinger
versions: +Python 3.3

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 20:56:33 2011
From: report at bugs.python.org (Cal Leeming)
Date: Mon, 03 Oct 2011 18:56:33 +0000
Subject: [issue13095] Support for splitting lists/tuples into chunks
In-Reply-To: <1317667195.74.0.956951285114.issue13095@psf.upfronthosting.co.za>
Message-ID: <1317668193.93.0.970561499463.issue13095@psf.upfronthosting.co.za>

Cal Leeming added the comment:

Oh - and while we are at it - how about having merge_list() and unique_list() as part of the core too??

def unique_list(seq): # Dave Kirby
    # Order preserving
    seen = set()
    return [x for x in seq if x not in seen and not seen.add(x)]

def merge_list(seq):
    merged = []
    for s in seq:
        for x in s:
            merged.append(x)
    return merged

Raymond - any thoughts on these 3 requests???

Cal

----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 20:57:14 2011
From: report at bugs.python.org (Amorilia)
Date: Mon, 03 Oct 2011 18:57:14 +0000
Subject: [issue13081] Crash in Windows with unknown cause
In-Reply-To: <1317417665.11.0.474916583341.issue13081@psf.upfronthosting.co.za>
Message-ID: <1317668234.4.0.377356380025.issue13081@psf.upfronthosting.co.za>

Amorilia added the comment:

I'm the author of the application. The tool is written in pure Python, and only uses libraries from stdlib. It would be really nice to have a simple standalone script to reproduce the crash, however I am still trying to reproduce it myself. So far no success.

----------
nosy: +amorilia

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 20:57:20 2011
From: report at bugs.python.org (Tom Christiansen)
Date: Mon, 03 Oct 2011 18:57:20 +0000
Subject: [issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace
In-Reply-To: <1317615350.8.0.846192018045.issue12753@psf.upfronthosting.co.za>
Message-ID: <9892.1317668226@chthon>

Tom Christiansen added the comment:

Ezio Melotti wrote on Mon, 03 Oct 2011 04:15:51 -0000:

>> But it still has to happen at compile time, of course, so I don't know
>> what you could do in Python. Is there any way to change how the compiler
>> behaves even vaguely along these lines?

> I think things like "from __future__ import ..." do something similar,
> but I'm not sure it will work in this case (also because you will have
> to provide the list of aliases somehow).

Ah yes, that's right.
Hm. I bet then it *would* be possible, just perhaps a bit of a run-around to get there. Not a high priority, but interesting. > less readable than: > > def my_capitalize(s): > return s[0].upper() + s[1:].lower() > You could argue that the first is much more explicit and in a way > clearer, but overall I think you agree with me that is less readable. Certainly. It's a bit like the way bug rate per lines of code is invariant across programming languages. When you have more opcodes, it gets harder to understand because there are more interactions and things to remember. >> That really isn't right. A cased character is one with the Unicode "Cased" >> property, and a lowercase character is one wiht the Unicode "Lowercase" >> property. The General Category is actually immaterial here. > You might want to take a look and possibly add a comment on #12204 about this. >> I've spent all bloody day trying to model Python's islower, isupper, and istitle >> functions, but I get all kinds of errors, both in the definitions and in the >> models of the definitions. > If by "model" you mean "trying to figure out how they work", it's > probably easier to look at the implementation (I assume you know > enough C to understand what they do). You can find the code for > str.istitle() at http://hg.python.org/cpython/file/default/Objects/un- > icodeobject.c#l10358 and the actual implementation of some macros like > Py_UNICODE_ISTITLE at > http://hg.python.org/cpython/file/default/Objects/unicodectype.c. Thanks, that helps immensely. I'm completely fluent in C. I've gone and built a tags file of your whole v3.2 source tree to help me navigate. The main underlying problem is that the internal macros are defined in a way that made sense a long time ago, but no longer do ever since (for example) the Unicode lowercase property stopped being synonymous with GC=Ll and started also including all code points with the Other_Lowercase property as well. 
The originating culprit is Tools/unicode/makeunicodedata.py. It builds your tables only using UnicodeData.txt, which is not enough. For example:

    if category in ["Lm", "Lt", "Lu", "Ll", "Lo"]:
        flags |= ALPHA_MASK
    if category == "Ll":
        flags |= LOWER_MASK
    if 'Line_Break' in properties or bidirectional == "B":
        flags |= LINEBREAK_MASK
        linebreaks.append(char)
    if category == "Zs" or bidirectional in ("WS", "B", "S"):
        flags |= SPACE_MASK
        spaces.append(char)
    if category == "Lt":
        flags |= TITLE_MASK
    if category == "Lu":
        flags |= UPPER_MASK

It needs to use DerivedCoreProperties.txt to figure out whether something is Other_Uppercase, Other_Lowercase, etc. In particular:

    Alphabetic := Lu+Ll+Lt+Lm+Lo + Nl + Other_Alphabetic
    Lowercase  := Ll + Other_Lowercase
    Uppercase  := Lu + Other_Uppercase

This affects a lot of things, but you should be able to just fix it in Tools/unicode/makeunicodedata.py and have all of them start working correctly.

You will probably also want to add

    Py_UCS4 _PyUnicode_IsWord(Py_UCS4 ch)

that uses the UTS#18 Annex C definition, so that you catch marks, too. That definition is:

    Word := Alphabetic + Mc+Me+Mn + Nd + Pc

where Alphabetic is defined above to include Nl and Other_Alphabetic.

Somewhat related is stuff like this:

    typedef struct {
        const Py_UCS4 upper;
        const Py_UCS4 lower;
        const Py_UCS4 title;
        const unsigned char decimal;
        const unsigned char digit;
        const unsigned short flags;
    } _PyUnicode_TypeRecord;

There are two different bugs here. First, you are missing

    const Py_UCS4 fold;

which is another field from UnicodeData.txt, one that is critical for doing case-insensitive matches correctly.

Second, there's also the problem that Py_UCS4 is an int. That means you are stuck with just the character-based simple versions of upper-, title-, lower-, and foldcase. You need to have fields for the full mappings, which are now strings (well, int arrays) not single ints. I'll use ??? for the int-array type that I don't know:

    const ??? upper_full;
    const ??? lower_full;
    const ??? title_full;
    const ??? fold_full;

You will also need to extend the API from just

    Py_UCS4 _PyUnicode_ToUppercase(Py_UCS4 ch)

to something like

    ??? _PyUnicode_ToUppercase_Full(Py_UCS4 ch)

I don't know what the ??? return type is there, but it's whatever the upper_full field in _PyUnicode_TypeRecord would be.

I know that Matthew Barnett has had to cover a bunch of these for his regex module, including generating his own tables. It might be possible to piggy-back on that effort; certainly it would be desirable to try.

> I really don't understand any of these functions. I'm very sad. I think they are
> wrong, but maybe I am. It is extremely confusing.

>> Shall I file a separate bug report?

> If after reading the code and/or the documentation you still think
> they are broken and/or that they can be improved, then you can open
> another issue.

I hadn't actually *looked* at capitalize yet, because I stumbled over these errors in the way-underlying code that necessarily supports it. The errors in definitions explain a lot of what I was

Ok, more bugs. Consider this:

    static int
    fixcapitalize(PyUnicodeObject *self)
    {
        Py_ssize_t len = self->length;
        Py_UNICODE *s = self->str;
        int status = 0;

        if (len == 0)
            return 0;
        if (Py_UNICODE_ISLOWER(*s)) {
            *s = Py_UNICODE_TOUPPER(*s);
            status = 1;
        }
        s++;
        while (--len > 0) {
            if (Py_UNICODE_ISUPPER(*s)) {
                *s = Py_UNICODE_TOLOWER(*s);
                status = 1;
            }
            s++;
        }
        return status;
    }

There are several bugs there. First, you have to use the TITLECASE if there is one, and only use the uppercase if there is no titlecase. Uppercase is wrong.

Second, you cannot decide to do the case change only if it starts out as a certain case. You have to do it unconditionally, especially since your tests for whether something is upper or lower are wrong. For example, Roman numerals, the iota subscript, the circled letters, and a few other things all are case-changing but are not themselves Letters in the GC=Ll/Lu/Lt sense.
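
The two points above (titlecase differing from uppercase, and case-changing code points that are not GC=Ll/Lu/Lt) can be checked from Python with the stdlib `unicodedata` module. This is only a sketch: the helper name `describe` and the two sample code points are chosen here for illustration, and it relies only on General Category and the case mappings, not on `islower()`/`isupper()`, whose behaviour is exactly what is being disputed.

```python
import unicodedata


def describe(ch):
    """Return (name, general category, uppercase, titlecase) for one character."""
    return (unicodedata.name(ch), unicodedata.category(ch), ch.upper(), ch.title())


# SMALL ROMAN NUMERAL ONE: category Nl (not a Letter), yet it has an uppercase mapping.
roman = describe("\u2170")

# LATIN SMALL LETTER DZ WITH CARON: its titlecase (U+01C5) differs from its uppercase (U+01C4).
dz = describe("\u01c6")

for row in (roman, dz):
    print(row)
```

A capitalize() built on the uppercase mapping alone would pick U+01C4 for the second character, where the titlecase mapping gives U+01C5.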
Also, there are cased letters in the GC=Lm category, which you miss. Unicode has properties like Cased that you should be using to determine whether something is cased. It also has properties like Changes_When_Uppercased (aka CWU) that tell you whether something will change.

For example, most of the small capitals are cased code points that are considered lowercase and which do not change when uppercased. However, the LATIN LETTER SMALL CAPITAL R (which is a lowercase code point) actually does have an uppercase mapping. Strange but true.

Does this help at all? I have to go to a meeting now.

--tom

----------
title: \N{...} neglects formal aliases and named sequences from Unicode charnames namespace -> \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 21:01:04 2011
From: report at bugs.python.org (Brian Curtin)
Date: Mon, 03 Oct 2011 19:01:04 +0000
Subject: [issue13081] Crash in Windows with unknown cause
In-Reply-To: <1317417665.11.0.474916583341.issue13081@psf.upfronthosting.co.za>
Message-ID: <1317668464.52.0.28107553647.issue13081@psf.upfronthosting.co.za>

Brian Curtin added the comment:

I recently created "minidumper" to write Visual Studio "MiniDump" files of interpreter crashes, but it's currently only available on 3.x. If I port it to 2.x, you could add "import minidumper;minidumper.enable()" to the top of your script, then we could probably get somewhere with it.

An additional example script, possibly including sample data to run through it, would be even better.
----------
nosy: +brian.curtin

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 21:12:37 2011
From: report at bugs.python.org (Victor Semionov)
Date: Mon, 03 Oct 2011 19:12:37 +0000
Subject: [issue13070] segmentation fault in pure-python multi-threaded server
In-Reply-To: <1317336165.27.0.207190934922.issue13070@psf.upfronthosting.co.za>
Message-ID: <1317669157.48.0.434474447977.issue13070@psf.upfronthosting.co.za>

Victor Semionov added the comment:

Any plans to fix this in the next release?

----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 21:19:40 2011
From: report at bugs.python.org (=?utf-8?q?Martin_v=2E_L=C3=B6wis?=)
Date: Mon, 03 Oct 2011 19:19:40 +0000
Subject: [issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace
In-Reply-To: <9892.1317668226@chthon>
Message-ID: <4E8A0ACA.7000508@v.loewis.de>

Martin v. Löwis added the comment:

> The main underlying problem is that the internal macros are defined in a
> way that made sense a long time ago, but no longer do ever since (for
> example) the Unicode lowercase property stopped being synonymous with
> GC=Ll and started also including all code points with the
> Other_Lowercase property as well.

Tom: PLEASE focus on one issue at a time. This is about formal aliases and named sequences, NOT about upper and lower case.

If you want to have a discussion about upper and lower case, please open a separate issue. There I would explain why I think your reasoning is flawed (i.e. just because your interpretation of Unicode differs from Python's implementation doesn't already make Python's implementation incorrect - just different).
----------
title: \N{...} neglects formal aliases and named sequences from Unicode charnames namespace -> \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 22:53:21 2011
From: report at bugs.python.org (=?utf-8?q?Charles-Fran=C3=A7ois_Natali?=)
Date: Mon, 03 Oct 2011 20:53:21 +0000
Subject: [issue13001] test_socket.testRecvmsgTrunc failure on FreeBSD 7.2 buildbot
In-Reply-To: <1316270071.04.0.179563462632.issue13001@psf.upfronthosting.co.za>
Message-ID: <1317675201.52.0.0357782599526.issue13001@psf.upfronthosting.co.za>

Changes by Charles-François Natali:

----------
resolution: -> fixed
stage: patch review -> committed/rejected
status: open -> closed

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 23:18:06 2011
From: report at bugs.python.org (=?utf-8?q?Charles-Fran=C3=A7ois_Natali?=)
Date: Mon, 03 Oct 2011 21:18:06 +0000
Subject: [issue12156] test_multiprocessing.test_notify_all() timeout (1 hour) on FreeBSD 7.2
In-Reply-To: <1306147254.92.0.286085135906.issue12156@psf.upfronthosting.co.za>
Message-ID: <1317676686.26.0.639651817903.issue12156@psf.upfronthosting.co.za>

Charles-François Natali added the comment:

test_multiprocessing frequently hangs on FreeBSD < 8 buildbots, and this probably has to do with the limit on the max number of POSIX semaphores:

"""
======================================================================
ERROR: test_notify_all (test.test_multiprocessing.WithProcessesTestCondition)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/test_multiprocessing.py", line 777, in test_notify_all
    cond = self.Condition()
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/multiprocessing/__init__.py", line 189, in Condition
    return Condition(lock)
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/multiprocessing/synchronize.py", line 198, in __init__
    self._lock = lock or RLock()
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/multiprocessing/synchronize.py", line 172, in __init__
    SemLock.__init__(self, RECURSIVE_MUTEX, 1, 1)
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/multiprocessing/synchronize.py", line 75, in __init__
    sl = self._semlock = _multiprocessing.SemLock(kind, value, maxvalue)
OSError: [Errno 23] Too many open files in system
"""

There are probably dangling semaphores, since the test doesn't use that many POSIX semaphores. Either way, we can't do much about it...

----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Mon Oct 3 23:25:44 2011
From: report at bugs.python.org (Stefan Krah)
Date: Mon, 03 Oct 2011 21:25:44 +0000
Subject: [issue12210] test_smtplib: intermittent failures on FreeBSD
In-Reply-To: <1306700978.51.0.107374125859.issue12210@psf.upfronthosting.co.za>
Message-ID: <1317677144.07.0.214175759376.issue12210@psf.upfronthosting.co.za>

Stefan Krah added the comment:

I haven't seen this in a while, so let's assume it's fixed.
---------- resolution: -> out of date stage: -> committed/rejected status: open -> closed _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 23:32:40 2011 From: report at bugs.python.org (jfalskfjdsl;akfdjsa;l laksfj;aslkfdj;sal) Date: Mon, 03 Oct 2011 21:32:40 +0000 Subject: [issue13071] IDLE refuses to open on windows 7 In-Reply-To: <1317341122.37.0.866117748717.issue13071@psf.upfronthosting.co.za> Message-ID: <1317677560.73.0.136480849626.issue13071@psf.upfronthosting.co.za> jfalskfjdsl;akfdjsa;l laksfj;aslkfdj;sal added the comment: ok i have solved the problem it was the same as issue 4765 ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Mon Oct 3 23:36:57 2011 From: report at bugs.python.org (Amaury Forgeot d'Arc) Date: Mon, 03 Oct 2011 21:36:57 +0000 Subject: [issue13071] IDLE refuses to open on windows 7 In-Reply-To: <1317341122.37.0.866117748717.issue13071@psf.upfronthosting.co.za> Message-ID: <1317677817.06.0.424674935975.issue13071@psf.upfronthosting.co.za> Amaury Forgeot d'Arc added the comment: What did you do to solve the problem? ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 00:08:17 2011 From: report at bugs.python.org (Brent Payne) Date: Mon, 03 Oct 2011 22:08:17 +0000 Subject: [issue7689] Pickling of classes with a metaclass and copy_reg In-Reply-To: <1263375134.71.0.434114641669.issue7689@psf.upfronthosting.co.za> Message-ID: <1317679697.41.0.986771333914.issue7689@psf.upfronthosting.co.za> Changes by Brent Payne : ---------- nosy: +Brent.Payne _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 00:09:07 2011 From: report at bugs.python.org (Terry J. 
Reedy) Date: Mon, 03 Oct 2011 22:09:07 +0000 Subject: [issue4765] IDLE fails to "Delete Custom Key Set" properly In-Reply-To: <1230525520.13.0.417394561266.issue4765@psf.upfronthosting.co.za> Message-ID: <1317679747.31.0.845413080677.issue4765@psf.upfronthosting.co.za> Terry J. Reedy added the comment: Note the config-main.cfg contains all custom configurations and appears if you make any one of them. Mine currently says [EditorWindow] font = lucida sans unicode [General] autosave = 1 So deleting it is a hack workaround until the bug is fixed. ---------- nosy: +terry.reedy _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 00:15:19 2011 From: report at bugs.python.org (Terry J. Reedy) Date: Mon, 03 Oct 2011 22:15:19 +0000 Subject: [issue13071] IDLE accepts, then crashes, on invalid key bindings. In-Reply-To: <1317341122.37.0.866117748717.issue13071@psf.upfronthosting.co.za> Message-ID: <1317680119.5.0.825119778711.issue13071@psf.upfronthosting.co.za> Terry J. Reedy added the comment: While this issue and #4765 are about the same effect, with a similar workaround, Amaury has indentified a separate bug in the custom key mechanism. So I retitled it to refer to that bug. ---------- title: IDLE refuses to open on windows 7 -> IDLE accepts, then crashes, on invalid key bindings. _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 00:23:35 2011 From: report at bugs.python.org (Raymond Hettinger) Date: Mon, 03 Oct 2011 22:23:35 +0000 Subject: [issue13095] Support for splitting lists/tuples into chunks In-Reply-To: <1317667195.74.0.956951285114.issue13095@psf.upfronthosting.co.za> Message-ID: <1317680615.59.0.412900633251.issue13095@psf.upfronthosting.co.za> Raymond Hettinger added the comment: These have been rejected before. 
There is always a trade-off in adding tools such as this -- it can take more time to learn and remember them than to write a trivial piece of code to do it yourself. Another issue is that people tend to disagree on how to handle an odd sized left-over group -- different use cases require different handling. We're trying to keep the core toolset reasonably small so that python remains simple and learnable. That raises the threshold for adding new tools. ---------- resolution: -> rejected status: open -> closed _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 01:08:56 2011 From: report at bugs.python.org (Santoso Wijaya) Date: Mon, 03 Oct 2011 23:08:56 +0000 Subject: [issue13081] Crash in Windows with unknown cause In-Reply-To: <1317417665.11.0.474916583341.issue13081@psf.upfronthosting.co.za> Message-ID: <1317683336.46.0.903118608795.issue13081@psf.upfronthosting.co.za> Santoso Wijaya added the comment: Without the aforementioned minidump library, you can also kick off the Python interpreter using a debugger (or have a debugger break into an already-running one) [1]. When the crash happens--presumably the debugger will break at this point--you can export the mini dump into a file for us to look at [2]. [1] I like using windbg (http://msdn.microsoft.com/en-us/windows/hardware/gg463009). 
[2] It would be something like, `.dump /ma C:\path\to\crash.DMP` ---------- nosy: +santa4nt _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 01:09:02 2011 From: report at bugs.python.org (Santoso Wijaya) Date: Mon, 03 Oct 2011 23:09:02 +0000 Subject: [issue13081] Crash in Windows with unknown cause In-Reply-To: <1317417665.11.0.474916583341.issue13081@psf.upfronthosting.co.za> Message-ID: <1317683342.64.0.938641159316.issue13081@psf.upfronthosting.co.za> Changes by Santoso Wijaya : ---------- type: -> crash _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 03:14:46 2011 From: report at bugs.python.org (John O'Connor) Date: Tue, 04 Oct 2011 01:14:46 +0000 Subject: [issue12053] Add prefetch() for Buffered IO (experiment) In-Reply-To: <1305056245.15.0.0737815347229.issue12053@psf.upfronthosting.co.za> Message-ID: <1317690886.94.0.997272218868.issue12053@psf.upfronthosting.co.za> John O'Connor added the comment: Here is an update with the C implementation. I think a working prototype will be helpful before another round on python-dev. I'm not sure how to handle unseekable, non-blocking streams where the read returns before `skip` bytes are exhausted. If prefetch() returns 0, then the caller would then have to use tell() to ensure subsequent reads are sane. In other words it seems prefetch() will leave the stream in an unpredictable state. Antoine, what are your thoughts? 
---------- Added file: http://bugs.python.org/file23308/prefetch.patch _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 05:01:44 2011 From: report at bugs.python.org (Matt Joiner) Date: Tue, 04 Oct 2011 03:01:44 +0000 Subject: [issue1887] distutils doesn't support out-of-source builds In-Reply-To: <1200961919.23.0.699852435444.issue1887@psf.upfronthosting.co.za> Message-ID: <1317697304.47.0.459721676277.issue1887@psf.upfronthosting.co.za> Changes by Matt Joiner : ---------- nosy: +anacrolix _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 05:40:21 2011 From: report at bugs.python.org (Roundup Robot) Date: Tue, 04 Oct 2011 03:40:21 +0000 Subject: [issue12881] ctypes: segfault with large structure field names In-Reply-To: <1314935452.01.0.397268917791.issue12881@psf.upfronthosting.co.za> Message-ID: Roundup Robot added the comment: New changeset aa3ebc2dfc15 by Meador Inge in branch '2.7': Issue #12881: ctypes: Fix segfault with large structure field names. http://hg.python.org/cpython/rev/aa3ebc2dfc15 New changeset d05350c14e77 by Meador Inge in branch '3.2': Issue #12881: ctypes: Fix segfault with large structure field names. http://hg.python.org/cpython/rev/d05350c14e77 New changeset 2eab632864f6 by Meador Inge in branch 'default': Issue #12881: ctypes: Fix segfault with large structure field names. 
http://hg.python.org/cpython/rev/2eab632864f6 ---------- nosy: +python-dev _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 05:47:39 2011 From: report at bugs.python.org (Meador Inge) Date: Tue, 04 Oct 2011 03:47:39 +0000 Subject: [issue13096] ctypes: segfault with large POINTER type names Message-ID: <1317700059.13.0.375039668353.issue13096@psf.upfronthosting.co.za> New submission from Meador Inge : Reproducible in 2.7 and tip: [meadori at motherbrain cpython]$ ./python Python 3.3.0a0 (default:61de28fa5537+d05350c14e77+, Oct 3 2011, 21:47:04) [GCC 4.6.0 20110603 (Red Hat 4.6.0-10)] on linux Type "help", "copyright", "credits" or "license" for more information. >>> from ctypes import * >>> T = type('x' * 2 ** 25, (Structure,), {}) >>> p = POINTER(T) Segmentation fault (core dumped) ---------- components: Extension Modules, ctypes messages: 144850 nosy: amaury.forgeotdarc, belopolsky, meador.inge priority: normal severity: normal stage: needs patch status: open title: ctypes: segfault with large POINTER type names type: crash versions: Python 2.7, Python 3.2, Python 3.3 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 05:51:56 2011 From: report at bugs.python.org (Meador Inge) Date: Tue, 04 Oct 2011 03:51:56 +0000 Subject: [issue13096] ctypes: segfault with large POINTER type names In-Reply-To: <1317700059.13.0.375039668353.issue13096@psf.upfronthosting.co.za> Message-ID: <1317700316.99.0.785254056011.issue13096@psf.upfronthosting.co.za> Meador Inge added the comment: There is similar crasher to this one that can be reproduced like: [meadori at motherbrain cpython]$ ./python Python 3.3.0a0 (default:61de28fa5537+d05350c14e77+, Oct 3 2011, 21:47:04) [GCC 4.6.0 20110603 (Red Hat 4.6.0-10)] on linux Type "help", "copyright", "credits" or "license" for more information. 
>>> from ctypes import * >>> p = POINTER('x' * 2 ** 25) Segmentation fault (core dumped) It should be fixed as well. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 05:57:17 2011 From: report at bugs.python.org (Meador Inge) Date: Tue, 04 Oct 2011 03:57:17 +0000 Subject: [issue13097] ctypes: segfault with large number of callback arguments Message-ID: <1317700637.71.0.186696346618.issue13097@psf.upfronthosting.co.za> New submission from Meador Inge : Reproducible in 2.7 and tip: [meadori at motherbrain cpython]$ ./python Python 3.3.0a0 (default:61de28fa5537+d05350c14e77+, Oct 3 2011, 21:47:04) [GCC 4.6.0 20110603 (Red Hat 4.6.0-10)] on linux Type "help", "copyright", "credits" or "license" for more information. >>> from ctypes import * >>> NARGS = 2 ** 20 >>> proto = CFUNCTYPE(None, *(c_int,) * NARGS) >>> def func(*args): ... return (1, "abc", None) ... >>> cb = proto(func) >>> cb(*(1,) * NARGS) Segmentation fault (core dumped) ---------- components: ctypes messages: 144852 nosy: amaury.forgeotdarc, belopolsky, meador.inge priority: normal severity: normal stage: needs patch status: open title: ctypes: segfault with large number of callback arguments type: crash versions: Python 2.7, Python 3.2, Python 3.3 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 05:58:14 2011 From: report at bugs.python.org (Meador Inge) Date: Tue, 04 Oct 2011 03:58:14 +0000 Subject: [issue12881] ctypes: segfault with large structure field names In-Reply-To: <1314935452.01.0.397268917791.issue12881@psf.upfronthosting.co.za> Message-ID: <1317700694.67.0.00274053256914.issue12881@psf.upfronthosting.co.za> Meador Inge added the comment: Fixed. Opened issue13096 and issue13097 for the other crashers. 
---------- resolution: -> fixed stage: commit review -> committed/rejected status: open -> closed _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 06:21:34 2011 From: report at bugs.python.org (Meador Inge) Date: Tue, 04 Oct 2011 04:21:34 +0000 Subject: [issue12880] ctypes: clearly document how structure bit fields are allocated In-Reply-To: <1317403151.53.0.993055557363.issue12880@psf.upfronthosting.co.za> Message-ID: Meador Inge added the comment: On Fri, Sep 30, 2011 at 12:19 PM, Vlad Riscutia wrote: > I believe this is the better thing to do rather than detailing how GCC and MSVC allocated their bitfields because that would just > encourage people to use this feature incorrectly. So clearly documenting how a feature works will cause people to use the feature incorrectly? I think not. In any case, I agree that documenting the low-level specifics of each compiler's algorithm is too much. > Most bugs opened on bit fields are because people are toying with the underlying buffer and get other results than what they expect. The issues that I have looked at (issue6069, issue11920, and issue11920) all involve fundamental misunderstandings of *how* the structure layout is determined. I don't know if I would generalize these misunderstanding as "toying with the underlying buffer". Some times people need to know the exact layout for proper C interop. In some of the bugs reported folks are casting buffers in an attempt to discover the structure layout since it is not clearly documented. The general content of your patch seems reasonable. I will provide more specific comments shortly. 
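
The point about needing to know the layout without casting buffers can be illustrated with a small ctypes sketch. The structure `Flags` and its field names are hypothetical, invented here for illustration; the only assumption about layout is that three bit fields of the same `c_uint` type totalling 16 bits share one storage unit, which holds for the common compiler conventions but is not something the thread itself states.

```python
import ctypes


class Flags(ctypes.Structure):
    # Hypothetical structure: three bit fields declared with the same base type.
    _fields_ = [
        ("a", ctypes.c_uint, 3),
        ("b", ctypes.c_uint, 5),
        ("c", ctypes.c_uint, 8),
    ]


flags = Flags(a=5, b=17, c=200)

# Values round-trip through the fields regardless of how the bits are packed.
values = (flags.a, flags.b, flags.c)

# The total size is the part of the layout that can be relied on here.
total_size = ctypes.sizeof(Flags)

# The raw bytes expose the actual packing, which depends on platform and byte order,
# so they are printed for inspection rather than compared against a fixed value.
raw = bytes(flags)
print(values, total_size, raw.hex())
```

Reading the fields back and checking `ctypes.sizeof()` stays within documented behaviour; interpreting `raw` bit by bit is where the platform-specific allocation rules discussed above come in.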
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 06:49:47 2011 From: report at bugs.python.org (Meador Inge) Date: Tue, 04 Oct 2011 04:49:47 +0000 Subject: [issue12880] ctypes: clearly document how structure bit fields are allocated In-Reply-To: <1314930851.91.0.112444630543.issue12880@psf.upfronthosting.co.za> Message-ID: <1317703787.61.0.662563806056.issue12880@psf.upfronthosting.co.za> Meador Inge added the comment: Added some comments in rietveld. P.S. watch out for trailing whitespace when writing patches. Use 'make patchcheck' to help find bad whitespace formatting. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 07:42:03 2011 From: report at bugs.python.org (Lance Hepler) Date: Tue, 04 Oct 2011 05:42:03 +0000 Subject: [issue7689] Pickling of classes with a metaclass and copy_reg In-Reply-To: <1263375134.71.0.434114641669.issue7689@psf.upfronthosting.co.za> Message-ID: <1317706923.08.0.157657135549.issue7689@psf.upfronthosting.co.za> Lance Hepler added the comment: Hello all, sorry to be a bother, but what's the progress on this issue? I have a codebase that requires resolution of this issue to enable multiprocessing. What are the remaining outstanding problems herein preventing the attached patches from being merged? 
----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Tue Oct 4 08:04:23 2011
From: report at bugs.python.org (Antoine Pitrou)
Date: Tue, 04 Oct 2011 06:04:23 +0000
Subject: [issue7689] Pickling of classes with a metaclass and copy_reg
In-Reply-To: <1263375134.71.0.434114641669.issue7689@psf.upfronthosting.co.za>
Message-ID: <1317708263.34.0.612597595218.issue7689@psf.upfronthosting.co.za>

Antoine Pitrou added the comment:

+    def __eq__(self, other):
+        r = (type(self) == type(other))
+        if r:
+            return r

I think this should be "if not r".

----------

_______________________________________
Python tracker
_______________________________________
From report at bugs.python.org Tue Oct 4 08:57:21 2011
From: report at bugs.python.org (Craig Citro)
Date: Tue, 04 Oct 2011 06:57:21 +0000
Subject: [issue7689] Pickling of classes with a metaclass and copy_reg
In-Reply-To: <1263375134.71.0.434114641669.issue7689@psf.upfronthosting.co.za>
Message-ID: <1317711441.65.0.305999199855.issue7689@psf.upfronthosting.co.za>

Craig Citro added the comment:

Antoine -- why do you want to switch "if r" for "if not r"?

If we did, the test would just confirm that the unpickled object was of the same type as the original; if we were going to do that, we might as well just replace the whole `__cmp__` function with just `return cmp(type(self), type(other))`.

On the flipside, I could see an argument for adding *more* content to the test, but that seemed like overkill.
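
The difference between "if r" and "if not r" in the quoted `__eq__` hunk can be shown with a short sketch. Only the first three lines of the method appear in the thread; the classes `Sample` and `SampleEarlyTrue`, the `payload` attribute, and the final content comparison are assumptions made here so that the example runs, not the patch's actual code.

```python
class Sample:
    """Hypothetical stand-in for the test class; 'payload' is an assumed attribute."""

    def __init__(self, payload):
        self.payload = payload

    def __eq__(self, other):
        r = (type(self) == type(other))
        if not r:
            # Different types: stop here and report inequality.
            return r
        # Same type: the answer depends on the contents (assumed comparison).
        return self.payload == other.payload


class SampleEarlyTrue(Sample):
    """Same method with the 'if r' spelling from the quoted hunk."""

    def __eq__(self, other):
        r = (type(self) == type(other))
        if r:
            # Returns True for any two instances of the same type,
            # so the contents are never compared.
            return r
        return self.payload == other.payload


print(Sample(1) == Sample(2))                    # contents are compared
print(SampleEarlyTrue(1) == SampleEarlyTrue(2))  # types match, so this is True
```

With the "if r" spelling, a round-trip test would pass for any object of the right type, whatever its contents.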
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 09:03:05 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Tue, 04 Oct 2011 07:03:05 +0000 Subject: [issue7689] Pickling of classes with a metaclass and copy_reg In-Reply-To: <1263375134.71.0.434114641669.issue7689@psf.upfronthosting.co.za> Message-ID: <1317711785.67.0.671613891867.issue7689@psf.upfronthosting.co.za> Antoine Pitrou added the comment: Craig: I'm talking about the __eq__ version (durban's patch). The __cmp__ version is probably fine. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 09:13:12 2011 From: report at bugs.python.org (Craig Citro) Date: Tue, 04 Oct 2011 07:13:12 +0000 Subject: [issue7689] Pickling of classes with a metaclass and copy_reg In-Reply-To: <1263375134.71.0.434114641669.issue7689@psf.upfronthosting.co.za> Message-ID: <1317712392.73.0.349808009542.issue7689@psf.upfronthosting.co.za> Craig Citro added the comment: Antoine -- ah, that makes sense. Is that the only blocker? I've let this patch rot on the vine a long time; if so, I'll happily switch `__eq__` back to `__cmp__` and re-post if it'll get submitted. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 09:14:50 2011 From: report at bugs.python.org (STINNER Victor) Date: Tue, 04 Oct 2011 07:14:50 +0000 Subject: [issue12156] test_multiprocessing.test_notify_all() timeout (1 hour) on FreeBSD 7.2 In-Reply-To: <1306147254.92.0.286085135906.issue12156@psf.upfronthosting.co.za> Message-ID: <1317712490.61.0.26671581503.issue12156@psf.upfronthosting.co.za> STINNER Victor added the comment: > "OSError: [Errno 23] Too many open files in system" Yes, see issue #10348. 
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 09:25:11 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Tue, 04 Oct 2011 07:25:11 +0000 Subject: [issue7689] Pickling of classes with a metaclass and copy_reg In-Reply-To: <1263375134.71.0.434114641669.issue7689@psf.upfronthosting.co.za> Message-ID: <1317713111.47.0.747778371133.issue7689@psf.upfronthosting.co.za> Antoine Pitrou added the comment: No need, I'll do it myself. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 09:34:56 2011 From: report at bugs.python.org (Roundup Robot) Date: Tue, 04 Oct 2011 07:34:56 +0000 Subject: [issue7689] Pickling of classes with a metaclass and copy_reg In-Reply-To: <1263375134.71.0.434114641669.issue7689@psf.upfronthosting.co.za> Message-ID: Roundup Robot added the comment: New changeset 760ac320fa3d by Antoine Pitrou in branch '3.2': Issue #7689: Allow pickling of dynamically created classes when their http://hg.python.org/cpython/rev/760ac320fa3d New changeset 46c026a5ccb9 by Antoine Pitrou in branch 'default': Issue #7689: Allow pickling of dynamically created classes when their http://hg.python.org/cpython/rev/46c026a5ccb9 ---------- nosy: +python-dev _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 09:39:15 2011 From: report at bugs.python.org (Roundup Robot) Date: Tue, 04 Oct 2011 07:39:15 +0000 Subject: [issue7689] Pickling of classes with a metaclass and copy_reg In-Reply-To: <1263375134.71.0.434114641669.issue7689@psf.upfronthosting.co.za> Message-ID: Roundup Robot added the comment: New changeset 64053bd79590 by Antoine Pitrou in branch '2.7': Issue #7689: Allow pickling of dynamically created classes when their http://hg.python.org/cpython/rev/64053bd79590 ---------- 
_______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 09:40:55 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Tue, 04 Oct 2011 07:40:55 +0000 Subject: [issue7689] Pickling of classes with a metaclass and copy_reg In-Reply-To: <1263375134.71.0.434114641669.issue7689@psf.upfronthosting.co.za> Message-ID: <1317714055.24.0.599683161763.issue7689@psf.upfronthosting.co.za> Antoine Pitrou added the comment: This is fixed now, thank you! ---------- resolution: -> fixed stage: patch review -> committed/rejected status: open -> closed versions: -Python 3.1 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 10:12:37 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Tue, 04 Oct 2011 08:12:37 +0000 Subject: [issue6715] xz compressor support In-Reply-To: <1250502444.31.0.107447392137.issue6715@psf.upfronthosting.co.za> Message-ID: <1317715957.46.0.42408983329.issue6715@psf.upfronthosting.co.za> Antoine Pitrou added the comment: Based on Amaury's report, I would suggest going forward integrating the xz module for configure-based systems, and letting someone else handle Windows integration later if a solution is found. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 11:52:30 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Tue, 04 Oct 2011 09:52:30 +0000 Subject: [issue13098] the struct module should support storage for size_t / Py_ssize_t C types Message-ID: <1317721950.11.0.748294685734.issue13098@psf.upfronthosting.co.za> New submission from Antoine Pitrou : Title says it all. 
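
As an illustration of what such support looks like from the caller's side, here is a minimal sketch. It assumes an interpreter whose struct module documents the native-mode format codes `n` (ssize_t) and `N` (size_t), which is the case from Python 3.3 onward; nothing in this message specifies those codes, and the sample value is arbitrary.

```python
import ctypes
import struct

# Native-mode codes only: 'n' and 'N' are not available with '<', '>', '=' or '!' prefixes.
ssize_width = struct.calcsize("n")
size_width = struct.calcsize("N")

# Round-trip an arbitrary size_t value through pack/unpack.
packed = struct.pack("N", 12345)
(unpacked,) = struct.unpack("N", packed)

print(ssize_width, size_width, unpacked)
```

The widths match `ctypes.sizeof(ctypes.c_ssize_t)` and `ctypes.sizeof(ctypes.c_size_t)` on the running platform, which is the reason a dedicated code is more convenient than guessing between `l` and `q`.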
---------- components: Library (Lib) messages: 144867 nosy: mark.dickinson, meador.inge, pitrou, skrah priority: normal severity: normal stage: needs patch status: open title: the struct module should support storage for size_t / Py_ssize_t C types type: feature request versions: Python 3.3 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 12:33:02 2011 From: report at bugs.python.org (Roundup Robot) Date: Tue, 04 Oct 2011 10:33:02 +0000 Subject: [issue13087] C BufferedReader seek() is inconsistent with UnsupportedOperation for unseekable streams In-Reply-To: <1317508369.4.0.585699022019.issue13087@psf.upfronthosting.co.za> Message-ID: Roundup Robot added the comment: New changeset d287f0654349 by Antoine Pitrou in branch '3.2': Issue #13087: BufferedReader.seek() now always raises UnsupportedOperation http://hg.python.org/cpython/rev/d287f0654349 New changeset 0cf38407a3a2 by Antoine Pitrou in branch 'default': Issue #13087: BufferedReader.seek() now always raises UnsupportedOperation http://hg.python.org/cpython/rev/0cf38407a3a2 ---------- nosy: +python-dev _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 12:37:14 2011 From: report at bugs.python.org (Wong Wah Meng) Date: Tue, 04 Oct 2011 10:37:14 +0000 Subject: [issue12876] Make Test Error : ImportError: No module named _sha256 In-Reply-To: <1314873741.02.0.732192179897.issue12876@psf.upfronthosting.co.za> Message-ID: <1317724634.39.0.446895549236.issue12876@psf.upfronthosting.co.za> Wong Wah Meng added the comment: Hello there, I am encountering more modules/commands that use hashlib, which needs _sha256. I still haven't found a solution.
Can anyone shed some light on whether this is related to the way I "include" and "link" the library, or whether _sha256 is simply not found on the server even though I already have the latest OpenSSL software installed there? Thanks in advance for any reply, and I appreciate your input. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 12:39:18 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Tue, 04 Oct 2011 10:39:18 +0000 Subject: [issue13087] C BufferedReader seek() is inconsistent with UnsupportedOperation for unseekable streams In-Reply-To: <1317508369.4.0.585699022019.issue13087@psf.upfronthosting.co.za> Message-ID: <1317724758.38.0.38099691856.issue13087@psf.upfronthosting.co.za> Antoine Pitrou added the comment: Committed, thanks! ---------- resolution: -> fixed stage: -> committed/rejected status: open -> closed _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 12:40:42 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Tue, 04 Oct 2011 10:40:42 +0000 Subject: [issue12807] Optimization/refactoring for {bytearray, bytes, unicode}.strip() In-Reply-To: <1317508893.41.0.642593920744.issue12807@psf.upfronthosting.co.za> Message-ID: <1317724630.3784.0.camel@localhost.localdomain> Antoine Pitrou added the comment: > The patch no longer applies cleanly. Is there enough interest in this to justify rebasing? Yes, I think it's worth it.
---------- title: Optimization/refactoring for {bytearray,bytes,unicode}.strip() -> Optimization/refactoring for {bytearray, bytes, unicode}.strip() _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 12:42:14 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Tue, 04 Oct 2011 10:42:14 +0000 Subject: [issue13070] segmentation fault in pure-python multi-threaded server In-Reply-To: <1317491223.79.0.275640510789.issue13070@psf.upfronthosting.co.za> Message-ID: <1317724722.3784.2.camel@localhost.localdomain> Antoine Pitrou added the comment: > > Also, is it ok to just return NULL or should the error state also be > > set? > > Well, I'm not sure, that's why I made you and Amaury noisy :-) > AFAICT, this is the only case where _check_closed can encounter a NULL > self->writer. Probably. OTOH, not setting the error state when returning NULL is usually an error (and can result in difficult-to-debug problems), so let's stay on the safe side. > Furthermore, I'm not sure about what kind of error would make sense here. RuntimeError perhaps. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 12:46:39 2011 From: report at bugs.python.org (Thomas Kluyver) Date: Tue, 04 Oct 2011 10:46:39 +0000 Subject: [issue13099] Sqlite3 & turkish locale Message-ID: <1317725199.69.0.486458552577.issue13099@psf.upfronthosting.co.za> New submission from Thomas Kluyver : When using sqlite3 with the Turkish locale, cursor.lastrowid is not accessible after an insert statement if "INSERT" is upper case. I believe that the cause is that the detect_statement_kind function [1] calls the locale-dependent C function tolower().
The Turkish locale specifies a different case mapping for I (to a dotless lowercase i: ı), so it's not recognised as an insert statement, which looks like it will cause the transaction to be committed immediately. See also the discussion on issue 1813 [2], and a Redhat bug with a test case for this [3]. [1] http://hg.python.org/cpython/file/c4b6d9312da1/Modules/_sqlite/cursor.c#l41 [2] http://bugs.python.org/issue1813 [3] https://bugzilla.redhat.com/show_bug.cgi?id=720209 ---------- components: Extension Modules messages: 144873 nosy: takluyver priority: normal severity: normal status: open title: Sqlite3 & turkish locale versions: Python 2.7, Python 3.2 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 12:55:43 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Tue, 04 Oct 2011 10:55:43 +0000 Subject: [issue13063] test_concurrent_futures failures on Windows: IOError('[Errno 232] The pipe is being closed') on _send_bytes() In-Reply-To: <1317323234.29.0.73593324718.issue13063@psf.upfronthosting.co.za> Message-ID: <1317725743.29.0.85105754409.issue13063@psf.upfronthosting.co.za> Antoine Pitrou added the comment: I think the solution would be to map ERROR_NO_DATA (232 - "The pipe is being closed") to EPIPE. Attached patch.
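For illustration only, the mapping proposed in this comment can be sketched in Python. This is not the attached patch: the table and the helper name below are hypothetical stand-ins for the interpreter's C-level Windows error map, and only the 232 -> EPIPE entry comes from the comment itself.

```python
import errno

# Windows error code named in the comment: ERROR_NO_DATA
# (232, "The pipe is being closed").
ERROR_NO_DATA = 232

# Hypothetical lookup table standing in for the C-level
# Windows-error-to-errno map; only this one entry is from the thread.
WINERROR_TO_ERRNO = {ERROR_NO_DATA: errno.EPIPE}


def winerror_to_errno(winerror, default=errno.EINVAL):
    """Return the errno mapped to a Windows error code, or `default`."""
    return WINERROR_TO_ERRNO.get(winerror, default)
```

With such a mapping in place, code that already handles EPIPE would presumably treat a closing pipe on Windows the same way as a broken pipe elsewhere.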
---------- keywords: +patch Added file: http://bugs.python.org/file23309/error_no_data.patch _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 12:56:25 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Tue, 04 Oct 2011 10:56:25 +0000 Subject: [issue13063] test_concurrent_futures failures on Windows: IOError('[Errno 232] The pipe is being closed') on _send_bytes() In-Reply-To: <1317323234.29.0.73593324718.issue13063@psf.upfronthosting.co.za> Message-ID: <1317725785.44.0.811058233234.issue13063@psf.upfronthosting.co.za> Changes by Antoine Pitrou : ---------- components: +Tests nosy: +amaury.forgeotdarc stage: -> patch review type: -> behavior versions: +Python 3.2 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 13:00:10 2011 From: report at bugs.python.org (Mark Dickinson) Date: Tue, 04 Oct 2011 11:00:10 +0000 Subject: [issue13098] the struct module should support storage for size_t / Py_ssize_t C types In-Reply-To: <1317721950.11.0.748294685734.issue13098@psf.upfronthosting.co.za> Message-ID: <1317726010.95.0.659499145266.issue13098@psf.upfronthosting.co.za> Mark Dickinson added the comment: See issue #3163. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 13:07:03 2011 From: report at bugs.python.org (Victor Semionov) Date: Tue, 04 Oct 2011 11:07:03 +0000 Subject: [issue13070] segmentation fault in pure-python multi-threaded server In-Reply-To: <1317336165.27.0.207190934922.issue13070@psf.upfronthosting.co.za> Message-ID: <1317726423.54.0.203502621213.issue13070@psf.upfronthosting.co.za> Victor Semionov added the comment: > Probably. 
OTOH, not setting the error state when returning NULL is > usually an error (and can result in difficult-to-debug problems), so > let's stay on the safe side. > > > Furthermore, I'm not sure about what kind of error would make sense here. > > RuntimeError perhaps. Does that mean that an application will see a Python exception? ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 13:19:33 2011 From: report at bugs.python.org (Ezio Melotti) Date: Tue, 04 Oct 2011 11:19:33 +0000 Subject: [issue13099] Sqlite3 & turkish locale In-Reply-To: <1317725199.69.0.486458552577.issue13099@psf.upfronthosting.co.za> Message-ID: <1317727173.6.0.887209522737.issue13099@psf.upfronthosting.co.za> Changes by Ezio Melotti : ---------- nosy: +ezio.melotti stage: -> test needed type: -> behavior _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 13:32:50 2011 From: report at bugs.python.org (Thomas Kluyver) Date: Tue, 04 Oct 2011 11:32:50 +0000 Subject: [issue13099] Sqlite3 & turkish locale In-Reply-To: <1317725199.69.0.486458552577.issue13099@psf.upfronthosting.co.za> Message-ID: <1317727970.85.0.819713692309.issue13099@psf.upfronthosting.co.za> Thomas Kluyver added the comment: What form does the test need to be in? There's a script at the redhat bug I linked that demonstrates the issue. Do I need to turn it into a function? A patch for the existing test suite? 
---------- type: behavior -> _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 13:35:11 2011 From: report at bugs.python.org (Amaury Forgeot d'Arc) Date: Tue, 04 Oct 2011 11:35:11 +0000 Subject: [issue13070] segmentation fault in pure-python multi-threaded server In-Reply-To: <1317336165.27.0.207190934922.issue13070@psf.upfronthosting.co.za> Message-ID: <1317728111.84.0.0174860160427.issue13070@psf.upfronthosting.co.za> Amaury Forgeot d'Arc added the comment: An "unraisable exception" warning will be displayed. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 13:35:51 2011 From: report at bugs.python.org (=?utf-8?q?Charles-Fran=C3=A7ois_Natali?=) Date: Tue, 04 Oct 2011 11:35:51 +0000 Subject: [issue13070] segmentation fault in pure-python multi-threaded server In-Reply-To: <1317336165.27.0.207190934922.issue13070@psf.upfronthosting.co.za> Message-ID: <1317728151.55.0.216157545721.issue13070@psf.upfronthosting.co.za> Charles-François Natali added the comment: > Probably. OTOH, not setting the error state when returning NULL is > usually an error (and can result in difficult-to-debug problems), so > let's stay on the safe side. > RuntimeError perhaps. OK, I'll update the patch accordingly. > Does that mean that an application will see a Python exception? No, the finalization code explicitly clears any exception set.
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 13:40:54 2011 From: report at bugs.python.org (Ezio Melotti) Date: Tue, 04 Oct 2011 11:40:54 +0000 Subject: [issue13099] Sqlite3 & turkish locale In-Reply-To: <1317725199.69.0.486458552577.issue13099@psf.upfronthosting.co.za> Message-ID: <1317728454.2.0.60189687819.issue13099@psf.upfronthosting.co.za> Ezio Melotti added the comment: A patch against Lib/sqlite3/test/regression.py would be nice. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 13:41:46 2011 From: report at bugs.python.org (Roundup Robot) Date: Tue, 04 Oct 2011 11:41:46 +0000 Subject: [issue13099] Sqlite3 & turkish locale In-Reply-To: <1317725199.69.0.486458552577.issue13099@psf.upfronthosting.co.za> Message-ID: Roundup Robot added the comment: New changeset 469555867244 by Antoine Pitrou in branch '3.2': Issue #13099: Fix sqlite3.Cursor.lastrowid under a Turkish locale. http://hg.python.org/cpython/rev/469555867244 New changeset 652e2dacbf4b by Antoine Pitrou in branch 'default': Issue #13099: Fix sqlite3.Cursor.lastrowid under a Turkish locale. http://hg.python.org/cpython/rev/652e2dacbf4b New changeset 89713606b654 by Antoine Pitrou in branch '2.7': Issue #13099: Fix sqlite3.Cursor.lastrowid under a Turkish locale. 
http://hg.python.org/cpython/rev/89713606b654 ---------- nosy: +python-dev _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 13:42:17 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Tue, 04 Oct 2011 11:42:17 +0000 Subject: [issue13099] Sqlite3 & turkish locale In-Reply-To: <1317725199.69.0.486458552577.issue13099@psf.upfronthosting.co.za> Message-ID: <1317728537.85.0.510696997906.issue13099@psf.upfronthosting.co.za> Antoine Pitrou added the comment: Fixed, thank you. ---------- nosy: +pitrou resolution: -> fixed stage: test needed -> committed/rejected status: open -> closed versions: +Python 3.3 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 13:45:19 2011 From: report at bugs.python.org (Thomas Kluyver) Date: Tue, 04 Oct 2011 11:45:19 +0000 Subject: [issue13099] Sqlite3 & turkish locale In-Reply-To: <1317725199.69.0.486458552577.issue13099@psf.upfronthosting.co.za> Message-ID: <1317728719.71.0.19420461483.issue13099@psf.upfronthosting.co.za> Thomas Kluyver added the comment: Thanks, Antoine. Should I still try to write a regression test for it? ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 13:56:05 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Tue, 04 Oct 2011 11:56:05 +0000 Subject: [issue13099] Sqlite3 & turkish locale In-Reply-To: <1317728719.71.0.19420461483.issue13099@psf.upfronthosting.co.za> Message-ID: <1317729152.3784.3.camel@localhost.localdomain> Antoine Pitrou added the comment: > Thanks, Antoine. Should I still try to write a regression test for it? I've had issues writing regression tests for other Turkish locale-related failures (namely, there are other bugs in some glibcs that could cause the test to fail anyway). I'm not sure it's worth it here. 
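For readers following this thread, the failure mode from the original report can be sketched in pure Python without touching the process locale. Everything here is illustrative and assumed: the two lowering helpers simulate C tolower() under a Turkish locale and an ASCII-only alternative, and this detect_statement_kind is a simplified stand-in, not the function in Modules/_sqlite/cursor.c.

```python
def c_tolower_turkish(ch):
    """Simulate C tolower() under a Turkish locale: 'I' becomes dotless 'ı'."""
    return "\u0131" if ch == "I" else ch.lower()


def ascii_tolower(ch):
    """Locale-independent lowering: only the ASCII letters A-Z are changed."""
    return chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch


def detect_statement_kind(sql, tolower):
    """Classify a statement by its first word (simplified, hypothetical)."""
    words = sql.split(None, 1)
    if not words:
        return "other"
    lowered = "".join(tolower(ch) for ch in words[0])
    known = {"insert", "update", "delete", "replace"}
    return lowered if lowered in known else "other"
```

Under the simulated Turkish mapping, "INSERT" lowers to "ınsert" and is no longer recognised, while the ASCII-only variant gives the same answer regardless of locale. That also hints at why a regression test is awkward: it depends on the Turkish locale being installed and on how the platform's C library behaves.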
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 14:08:00 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Tue, 04 Oct 2011 12:08:00 +0000 Subject: [issue13098] the struct module should support storage for size_t / Py_ssize_t C types In-Reply-To: <1317721950.11.0.748294685734.issue13098@psf.upfronthosting.co.za> Message-ID: <1317730080.63.0.0483052657804.issue13098@psf.upfronthosting.co.za> Changes by Antoine Pitrou : ---------- resolution: -> duplicate status: open -> closed superseder: -> module struct support for ssize_t and size_t _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 14:08:27 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Tue, 04 Oct 2011 12:08:27 +0000 Subject: [issue3163] module struct support for ssize_t and size_t In-Reply-To: <1214071551.33.0.650558563727.issue3163@psf.upfronthosting.co.za> Message-ID: <1317730107.96.0.330061637666.issue3163@psf.upfronthosting.co.za> Antoine Pitrou added the comment: #3163 is a duplicate. ---------- nosy: +pitrou priority: low -> normal stage: patch review -> needs patch _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 14:08:50 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Tue, 04 Oct 2011 12:08:50 +0000 Subject: [issue3163] module struct support for ssize_t and size_t In-Reply-To: <1214071551.33.0.650558563727.issue3163@psf.upfronthosting.co.za> Message-ID: <1317730130.0.0.617855999608.issue3163@psf.upfronthosting.co.za> Antoine Pitrou added the comment: Ooops, I meant #13098. 
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 14:21:09 2011 From: report at bugs.python.org (Stefan Krah) Date: Tue, 04 Oct 2011 12:21:09 +0000 Subject: [issue3163] module struct support for ssize_t and size_t In-Reply-To: <1214071551.33.0.650558563727.issue3163@psf.upfronthosting.co.za> Message-ID: <1317730869.69.0.846288828958.issue3163@psf.upfronthosting.co.za> Changes by Stefan Krah : ---------- nosy: +skrah _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 14:37:46 2011 From: report at bugs.python.org (STINNER Victor) Date: Tue, 04 Oct 2011 12:37:46 +0000 Subject: [issue13063] test_concurrent_futures failures on Windows: IOError('[Errno 232] The pipe is being closed') on _send_bytes() In-Reply-To: <1317323234.29.0.73593324718.issue13063@psf.upfronthosting.co.za> Message-ID: <1317731866.31.0.583204642071.issue13063@psf.upfronthosting.co.za> STINNER Victor added the comment: > Attached patch. Could you please explain your change in generrmap.c in a comment? For example, just add a reference to this issue. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 14:41:28 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Tue, 04 Oct 2011 12:41:28 +0000 Subject: [issue3163] module struct support for ssize_t and size_t In-Reply-To: <1214071551.33.0.650558563727.issue3163@psf.upfronthosting.co.za> Message-ID: <1317732088.5.0.945349865438.issue3163@psf.upfronthosting.co.za> Antoine Pitrou added the comment: Here is a patch. 
---------- Added file: http://bugs.python.org/file23310/struct_nn.patch _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 15:03:14 2011 From: report at bugs.python.org (=?utf-8?q?Bo=C5=A1tjan_Mejak?=) Date: Tue, 04 Oct 2011 13:03:14 +0000 Subject: [issue12326] Linux 3: code should avoid using sys.platform == 'linux2' In-Reply-To: <1307975682.23.0.138592930251.issue12326@psf.upfronthosting.co.za> Message-ID: <1317733394.1.0.349233272923.issue12326@psf.upfronthosting.co.za> Boštjan Mejak added the comment: I have a better idea... Why don't we change the "linux2" string into just "linux". That way we will never run into this kind of issue, even in the future when Linux kernel version 4 is going to exist. Any thoughts on this? ---------- nosy: +Retro _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 16:24:54 2011 From: report at bugs.python.org (Barry A. Warsaw) Date: Tue, 04 Oct 2011 14:24:54 +0000 Subject: [issue12326] Linux 3: code should avoid using sys.platform == 'linux2' In-Reply-To: <1317733394.1.0.349233272923.issue12326@psf.upfronthosting.co.za> Message-ID: <20111004102446.1218c227@resist.wooz.org> Barry A. Warsaw added the comment: On Oct 04, 2011, at 01:03 PM, Boštjan Mejak wrote: >I have a better idea... Why don't we change the "linux2" string into just >"linux". That way we will never run into this kind of issue, even in the >future when Linux kernel version 4 is going to exist. Any thoughts on this? Python 3.3 already sets sys.platform to 'linux'. It can't be done for older versions due to backward compatibility.
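For code that has to run on both older and newer interpreters, a prefix test sidesteps the question entirely. A minimal sketch follows; the is_linux helper is a made-up name for illustration, not something proposed in this issue.

```python
import sys


def is_linux(platform=None):
    """Return True for 'linux', 'linux2', 'linux3', ... platform strings."""
    if platform is None:
        platform = sys.platform
    return platform.startswith("linux")
```

The check gives the same answer whether sys.platform is 'linux2' or 'linux3' on an older Python, or plain 'linux' on 3.3.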
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 16:26:37 2011 From: report at bugs.python.org (Meador Inge) Date: Tue, 04 Oct 2011 14:26:37 +0000 Subject: [issue3163] module struct support for ssize_t and size_t In-Reply-To: <1214071551.33.0.650558563727.issue3163@psf.upfronthosting.co.za> Message-ID: <1317738397.63.0.922330981396.issue3163@psf.upfronthosting.co.za> Meador Inge added the comment: Mostly LGTM. I have a few comments in rietveld. ---------- nosy: +meador.inge stage: needs patch -> patch review _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 16:34:13 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Tue, 04 Oct 2011 14:34:13 +0000 Subject: [issue3163] module struct support for ssize_t and size_t In-Reply-To: <1214071551.33.0.650558563727.issue3163@psf.upfronthosting.co.za> Message-ID: <1317738853.04.0.924661648783.issue3163@psf.upfronthosting.co.za> Antoine Pitrou added the comment: Thanks. I have answered one of your comments, and here is a new patch. ---------- Added file: http://bugs.python.org/file23311/struct_nn2.patch _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 16:54:44 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Tue, 04 Oct 2011 14:54:44 +0000 Subject: [issue3163] module struct support for ssize_t and size_t In-Reply-To: <1214071551.33.0.650558563727.issue3163@psf.upfronthosting.co.za> Message-ID: <1317740084.86.0.513812335462.issue3163@psf.upfronthosting.co.za> Antoine Pitrou added the comment: New patch with cosmetic doc fix. 
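As a hedged illustration of what this feature looks like from user code, the sketch below assumes the patch corresponds to the native-mode 'n' (ssize_t) and 'N' (size_t) format codes available in the struct module from Python 3.3 onwards. The patches attached here are not quoted in the thread, so treat the format letters as that assumption.

```python
import ctypes
import struct

# 'n' and 'N' are accepted only in native mode (no prefix, or '@').
ssize_t_size = struct.calcsize("n")
size_t_size = struct.calcsize("N")

# Round-trip one signed and one unsigned value.
packed = struct.pack("nN", -1, 1)
values = struct.unpack("nN", packed)

# The sizes should agree with the platform's C types.
sizes_match = (
    ssize_t_size == ctypes.sizeof(ctypes.c_ssize_t)
    and size_t_size == ctypes.sizeof(ctypes.c_size_t)
)
```

Because the codes are native-only, combining them with an explicit byte-order prefix such as '<' or '>' is expected to raise struct.error.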
---------- Added file: http://bugs.python.org/file23312/struct_nn3.patch _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 17:21:46 2011 From: report at bugs.python.org (Vlad Riscutia) Date: Tue, 04 Oct 2011 15:21:46 +0000 Subject: [issue12880] ctypes: clearly document how structure bit fields are allocated In-Reply-To: <1314930851.91.0.112444630543.issue12880@psf.upfronthosting.co.za> Message-ID: <1317741706.31.0.492233657134.issue12880@psf.upfronthosting.co.za> Vlad Riscutia added the comment: Thanks for the "make patchcheck" tip, I didn't know about that. I will update the patch soon. In the mean time, I want to point out a couple of things: First, I'm saying "toying with the underlying buffer" because none of the bugs are actual issues of the form "I created this bitfield structure with Python, passed it to C function but C structure was different". That would be a bitfield bug. All of these bugs are people setting raw memory to some bytes, then looking at bitfield members and not seeing what they expect. Since this is platform dependent, they shouldn't worry about the raw memory as long as C interop works fine. Bitfield layout is complex as it involves both allocation algorithm and structure packing and same Python code will work differently on Windows and Unix. My point is that documenting all this low-level stuff will encourage people to work with the raw memory which will open the door for other issues. I believe it would be better to encourage users to stick to declaring members and accessing them by name as raw memory WILL be different for the same code on different OSes. Second, one of your review comments is: "GCC is used for most Unix systems and Microsoft VC++ is used on Windows.". This is not how ctypes works. Ctypes implements the bitfield allocation algorithm itself, it doesn't use the compiler with which it is built. 
Basically it says #ifdef WIN32 - allocate like VC++ - #else - allocate like GCC. So it doesn't really matter with which compiler you are building Python. It will still do GCC style allocation on Solaris. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 17:30:13 2011 From: report at bugs.python.org (Oren Held) Date: Tue, 04 Oct 2011 15:30:13 +0000 Subject: [issue9035] os.path.ismount on windows doesn't support windows mount points In-Reply-To: <1277023751.91.0.438330382261.issue9035@psf.upfronthosting.co.za> Message-ID: <1317742213.38.0.423272429033.issue9035@psf.upfronthosting.co.za> Oren Held added the comment: Anything wrong with the following simple approach? (e.g. is it bad to depend on win32file?)

def win_ismount(path):
    import win32file
    volume_path = win32file.GetVolumePathName(path)
    return volume_path == path  # May have to ignore a trailing backslash

---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 17:33:30 2011 From: report at bugs.python.org (Brian Curtin) Date: Tue, 04 Oct 2011 15:33:30 +0000 Subject: [issue9035] os.path.ismount on windows doesn't support windows mount points In-Reply-To: <1277023751.91.0.438330382261.issue9035@psf.upfronthosting.co.za> Message-ID: <1317742410.82.0.629122944017.issue9035@psf.upfronthosting.co.za> Brian Curtin added the comment: We can't depend on stuff from pywin32, but we could expose GetVolumePathName ourselves.
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 18:06:24 2011 From: report at bugs.python.org (Roundup Robot) Date: Tue, 04 Oct 2011 16:06:24 +0000 Subject: [issue13054] sys.maxunicode value after PEP-393 In-Reply-To: <1317224855.38.0.863349241023.issue13054@psf.upfronthosting.co.za> Message-ID: Roundup Robot added the comment: New changeset f39b26ca7f3d by Ezio Melotti in branch 'default': #13054: fix usage of sys.maxunicode after PEP-393. http://hg.python.org/cpython/rev/f39b26ca7f3d ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 18:13:23 2011 From: report at bugs.python.org (=?utf-8?q?=C3=89ric_Araujo?=) Date: Tue, 04 Oct 2011 16:13:23 +0000 Subject: [issue6715] xz compressor support In-Reply-To: <1250502444.31.0.107447392137.issue6715@psf.upfronthosting.co.za> Message-ID: <1317744803.94.0.364164577652.issue6715@psf.upfronthosting.co.za> Changes by Éric Araujo : Removed file: http://bugs.python.org/file23300/unnamed _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 18:17:08 2011 From: report at bugs.python.org (=?utf-8?q?=C3=89ric_Araujo?=) Date: Tue, 04 Oct 2011 16:17:08 +0000 Subject: [issue12804] "make test" fails on systems without internet access In-Reply-To: <1313935702.14.0.838210022398.issue12804@psf.upfronthosting.co.za> Message-ID: <1317745028.93.0.258450218551.issue12804@psf.upfronthosting.co.za> Éric Araujo added the comment: > Actually, the skip system is already supposed to work for that if used > properly (see test.support.transient_internet()). However, perhaps it > actually doesn't work in all situations.
It's better than that: nearly all tests requiring network access use skips; it's only a few modules like test_urllib*net that fail instead of skipping (certainly because they relied on the urlfetch resource being disabled by default). ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 18:35:36 2011 From: report at bugs.python.org (=?utf-8?q?=C3=89ric_Araujo?=) Date: Tue, 04 Oct 2011 16:35:36 +0000 Subject: [issue13055] Distutils tries to handle null versions but fails In-Reply-To: <1317238322.67.0.925527851228.issue13055@psf.upfronthosting.co.za> Message-ID: <1317746136.32.0.51452952004.issue13055@psf.upfronthosting.co.za> Éric Araujo added the comment: Thanks, will fix it. ---------- assignee: tarek -> eric.araujo components: +Distutils2 nosy: +alexis versions: +3rd party, Python 3.2, Python 3.3 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 18:53:09 2011 From: report at bugs.python.org (Brent Payne) Date: Tue, 04 Oct 2011 16:53:09 +0000 Subject: [issue7689] Pickling of classes with a metaclass and copy_reg In-Reply-To: <1263375134.71.0.434114641669.issue7689@psf.upfronthosting.co.za> Message-ID: <1317747189.49.0.990398670241.issue7689@psf.upfronthosting.co.za> Brent Payne added the comment: will the 2.7 patch also be incorporated into a 2.7 release? ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 18:56:45 2011 From: report at bugs.python.org (=?utf-8?q?Martin_v=2E_L=C3=B6wis?=) Date: Tue, 04 Oct 2011 16:56:45 +0000 Subject: [issue6715] xz compressor support In-Reply-To: <1316207701.26.0.62836662317.issue6715@psf.upfronthosting.co.za> Message-ID: <4E8B3ACA.4020309@v.loewis.de> Martin v.
Löwis added the comment: > - liblzma can't be compiled by Visual Studio: too many C99-isms, > mostly variables declared in the middle of a block. It's doable for > sure, but it's a lot of work. I'd be in favor of doing so, and then feeding patches upstream. Hopefully, eventually, the code would compile out of the box on VS. > - liblzma is normally compiled with mingw, but we have to be sure > that it uses the correct MSCRT C runtime, and what about debug > builds? In principle, it's not necessary to use the same CRT, as long as we aren't passing CRT objects across DLL boundaries (memory blocks managed by malloc/free would be candidates). I haven't reviewed the module to find out whether the liblzma interface involves CRT objects. > - The way recommended by XZ is to use a precompiled liblzma.dll; Then > it should be easy to build an extension module, but it would be the > first time that we distribute an extension module which needs a > non-system DLL. Is it enough to copy it next to _lzma.pyd? Is there > some work to do in the installer? It wouldn't actually be the first time. We also ship Tcl DLLs. But it's a pain, so it would be much better if the sources were actually referenced in the VS project - so we would not need a library at all. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 19:00:01 2011 From: report at bugs.python.org (=?utf-8?q?Martin_v=2E_L=C3=B6wis?=) Date: Tue, 04 Oct 2011 17:00:01 +0000 Subject: [issue6715] xz compressor support In-Reply-To: <1317715957.46.0.42408983329.issue6715@psf.upfronthosting.co.za> Message-ID: <4E8B3B8F.4020508@v.loewis.de> Martin v. Löwis added the comment: > Based on Amaury's report, I would suggest going forward integrating > the xz module for configure-based systems, and letting someone else > handle Windows integration later if a solution is found. -1000.
I feel quite strongly that this should not be added unless there is also support in the Windows build process for it. I'll see what I can do, but it may take some time - until then, I urge that this issue be stalled, i.e. that nobody proceed with checking it in. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 19:05:49 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Tue, 04 Oct 2011 17:05:49 +0000 Subject: [issue7689] Pickling of classes with a metaclass and copy_reg In-Reply-To: <1263375134.71.0.434114641669.issue7689@psf.upfronthosting.co.za> Message-ID: <1317747949.83.0.631993272959.issue7689@psf.upfronthosting.co.za> Antoine Pitrou added the comment: Yes, it will. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 19:08:03 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Tue, 04 Oct 2011 17:08:03 +0000 Subject: [issue6715] xz compressor support In-Reply-To: <4E8B3B8F.4020508@v.loewis.de> Message-ID: <1317747866.3784.6.camel@localhost.localdomain> Antoine Pitrou added the comment: > > Based on Amaury's report, I would suggest going forward integrating > > the xz module for configure-based systems, and letting someone else > > handle Windows integration later if a solution is found. > > -1000. I feel quite strongly that this should not be added unless there > is also support in the Windows build process for it. Why?
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 19:11:58 2011 From: report at bugs.python.org (STINNER Victor) Date: Tue, 04 Oct 2011 17:11:58 +0000 Subject: [issue13100] sre_compile._optimize_unicode() needs a cleanup Message-ID: <1317748318.81.0.992039803801.issue13100@psf.upfronthosting.co.za> New submission from STINNER Victor : The following comment is wrong, except IndexError: # non-BMP characters; XXX now they should work return charset sys.maxunicode != 65535 is now always true in Python 3.3 if sys.maxunicode != 65535: # XXX: negation does not work with big charsets # XXX2: now they should work, but removing this will make the # charmap 17 times bigger return charset See the related commit: f39b26ca7f3d (from issue #13054). ---------- components: Library (Lib), Regular Expressions, Unicode messages: 144905 nosy: ezio.melotti, haypo, pitrou priority: normal severity: normal status: open title: sre_compile._optimize_unicode() needs a cleanup versions: Python 3.3 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 19:15:31 2011 From: report at bugs.python.org (Nick Coghlan) Date: Tue, 04 Oct 2011 17:15:31 +0000 Subject: [issue7689] Pickling of classes with a metaclass and copy_reg In-Reply-To: <1263375134.71.0.434114641669.issue7689@psf.upfronthosting.co.za> Message-ID: <1317748531.68.0.113143780359.issue7689@psf.upfronthosting.co.za> Nick Coghlan added the comment: Specifically, 2.7.3. A date for that has not yet been set, but somewhere in the December/January time frame is likely. 
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 19:17:52 2011 From: report at bugs.python.org (Roundup Robot) Date: Tue, 04 Oct 2011 17:17:52 +0000 Subject: [issue11956] 3.3 : test_import.py causes 'make test' to fail In-Reply-To: <1304094086.01.0.944392387954.issue11956@psf.upfronthosting.co.za> Message-ID: Roundup Robot added the comment: New changeset 7697223df6df by Charles-François Natali in branch '3.2': Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as http://hg.python.org/cpython/rev/7697223df6df New changeset 58870fe9a604 by Charles-François Natali in branch 'default': Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as http://hg.python.org/cpython/rev/58870fe9a604 ---------- nosy: +python-dev _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 19:30:57 2011 From: report at bugs.python.org (=?utf-8?q?Charles-Fran=C3=A7ois_Natali?=) Date: Tue, 04 Oct 2011 17:30:57 +0000 Subject: [issue10348] multiprocessing: use SysV semaphores on FreeBSD In-Reply-To: <1289181459.49.0.601327253766.issue10348@psf.upfronthosting.co.za> Message-ID: <1317749457.07.0.53451424878.issue10348@psf.upfronthosting.co.za> Charles-François Natali added the comment: -1 IMHO, implementing SysV semaphores would be a step backwards, plus the API is a real pain. I think there's no reason to complicate the code to accommodate such corner cases, especially since the systems that don't support POSIX semaphores will eventually die out...
---------- nosy: +neologix _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 19:31:50 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Tue, 04 Oct 2011 17:31:50 +0000 Subject: [issue10348] multiprocessing: use SysV semaphores on FreeBSD In-Reply-To: <1289181459.49.0.601327253766.issue10348@psf.upfronthosting.co.za> Message-ID: <1317749510.49.0.399656352609.issue10348@psf.upfronthosting.co.za> Antoine Pitrou added the comment: Agreed with Charles-François. ---------- nosy: +pitrou _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 19:43:37 2011 From: report at bugs.python.org (=?utf-8?q?Martin_v=2E_L=C3=B6wis?=) Date: Tue, 04 Oct 2011 17:43:37 +0000 Subject: [issue6715] xz compressor support In-Reply-To: <1317747866.3784.6.camel@localhost.localdomain> Message-ID: <4E8B45C7.10708@v.loewis.de> Martin v. Löwis added the comment: On 04.10.11 19:08, Antoine Pitrou wrote: > > Antoine Pitrou added the comment: > >>> Based on Amaury's report, I would suggest going forward integrating >>> the xz module for configure-based systems, and letting someone else >>> handle Windows integration later if a solution is found. >> >> -1000. I feel quite strongly that this should not be added unless there >> is also support in the Windows build process for it. > > Why? This module is only useful in the standard library if it is available on all systems. If it is not available on all systems, it may just as well be available from PyPI only. Now, as it needs to be available on Windows, I want to see the actual Windows support, else we would have to block the release until Windows support is available, possibly then reverting the module because no Windows support is forthcoming. So it's easier not to commit in the first place.
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 19:44:30 2011 From: report at bugs.python.org (Meador Inge) Date: Tue, 04 Oct 2011 17:44:30 +0000 Subject: [issue12880] ctypes: clearly document how structure bit fields are allocated In-Reply-To: <1317741706.31.0.492233657134.issue12880@psf.upfronthosting.co.za> Message-ID: Meador Inge added the comment: On Tue, Oct 4, 2011 at 10:21 AM, Vlad Riscutia wrote: > First, I'm saying "toying with the underlying buffer" because none of the bugs are actual issues of the form "I created this bitfield > structure with Python, passed it to C function but C structure was different". That would be a bitfield bug. All of these bugs are people > setting raw memory to some bytes, then looking at bitfield members and not seeing what they expect. Please qualify "all" instead of generalizing. I can point to two issues (issue11990 "I'm generating python code from real c code.", issue12945 "We have raw data packages from some tools. These packages contains bitfields, arrays, simple data and so on.") where C code or raw data was, in fact, involved and the reporters just don't understand what layout algorithm is being used. They may not need to know the specifics of the algorithm, but they *do* need to know if it matches the compiler they are using to do interop or the one that generated the raw data. The reason that we are seeing folks cast raw memory into a ctypes bitfield structure is because they do not understand how the structure layout algorithm works and are trying to figure it out via these examples. > Second, one of your review comments is: "GCC is used for most Unix systems and Microsoft VC++ is used on Windows.". This is not > how ctypes works. Ctypes implements the bitfield allocation algorithm itself, it doesn't use the compiler with which it is built.
Basically > it says #ifdef WIN32 - allocate like VC++ - #else - allocate like GCC. So it doesn't really matter with which compiler you are building > Python. It will still do GCC style allocation on Solaris. I understand how it works. This quote is taken somewhat out of context as the preceding sentence is important. Perhaps saying GCC- style and VC++-style would have been more clear. The reason that I mentioned the compiler used to build Python is that it is an easy reference point and more times than not the bitfield allocation and layout *do* match that of the compiler used to build the interpreter. Anyway, I am fine with dropping the "used to build the Python interpreter" and going with something similar to what you originally had. Also, in general, the compiler used to build the ctypes extension *does* matter. Look in 'cfield.c' where all of the native alignments are computed at compile time. These alignments affect the structure layout and are defined by the compiler building the ctypes extension. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 19:46:23 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Tue, 04 Oct 2011 17:46:23 +0000 Subject: [issue6715] xz compressor support In-Reply-To: <4E8B45C7.10708@v.loewis.de> Message-ID: <1317750171.3784.8.camel@localhost.localdomain> Antoine Pitrou added the comment: > This module is only useful in the standard library if it is available > on all systems. Not really. xz is becoming a defacto standard under Linux (and perhaps other free Unices) while I guess it is marginal under Windows. We have other system-specific functionality, and nobody sees it as a bad thing. 
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 19:46:36 2011 From: report at bugs.python.org (Meador Inge) Date: Tue, 04 Oct 2011 17:46:36 +0000 Subject: [issue12880] ctypes: clearly document how structure bit fields are allocated In-Reply-To: <1314930851.91.0.112444630543.issue12880@psf.upfronthosting.co.za> Message-ID: <1317750396.47.0.0335003197932.issue12880@psf.upfronthosting.co.za> Meador Inge added the comment: > Look in 'cfield.c' where all of the native alignments Well, not *all* the native alignments, but many of them. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 19:47:27 2011 From: report at bugs.python.org (Ezio Melotti) Date: Tue, 04 Oct 2011 17:47:27 +0000 Subject: [issue13054] sys.maxunicode value after PEP-393 In-Reply-To: <1317224855.38.0.863349241023.issue13054@psf.upfronthosting.co.za> Message-ID: <1317750447.26.0.485302388496.issue13054@psf.upfronthosting.co.za> Ezio Melotti added the comment: The buildbot seems happy, so I'm closing this. Antoine already took care of test_bigmem, and Victor opened #13100 for sre_compile. ---------- resolution: -> fixed stage: patch review -> committed/rejected status: open -> closed _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 20:05:57 2011 From: report at bugs.python.org (=?utf-8?q?Martin_v=2E_L=C3=B6wis?=) Date: Tue, 04 Oct 2011 18:05:57 +0000 Subject: [issue6715] xz compressor support In-Reply-To: <1317750171.3784.8.camel@localhost.localdomain> Message-ID: <4E8B4B02.60306@v.loewis.de> Martin v. L?wis added the comment: > Not really. xz is becoming a defacto standard under Linux (and perhaps > other free Unices) while I guess it is marginal under Windows. 
> We have other system-specific functionality, and nobody sees it as a bad > thing. That's because all system-specific functionality that we have really depends on system features which just can't be available elsewhere. For all functionality that in principle works on all systems, it also actually works on all systems for Python. In cases where stuff was only available on Linux even though it could work on other systems, people *did* see it as a bad thing. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 20:08:09 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Tue, 04 Oct 2011 18:08:09 +0000 Subject: [issue6715] xz compressor support In-Reply-To: <4E8B4B02.60306@v.loewis.de> Message-ID: <1317751477.3784.9.camel@localhost.localdomain> Antoine Pitrou added the comment: > That's because all system-specific functionality that we have really > depends on system features which just can't be available elsewhere. > For all functionality that in principle works on all systems, it also > actually works on all systems for Python. In cases where stuff was > only available on Linux even though it could work on other systems, > people *did* see it as a bad thing. Agreed, but it's a probably with the external library. That's like saying we are responsible if libffi fails building with the AIX compiler. 
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 20:08:55 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Tue, 04 Oct 2011 18:08:55 +0000 Subject: [issue6715] xz compressor support In-Reply-To: <1317751477.3784.9.camel@localhost.localdomain> Message-ID: <1317751516.3784.10.camel@localhost.localdomain> Antoine Pitrou added the comment: > > That's because all system-specific functionality that we have really > > depends on system features which just can't be available elsewhere. > > For all functionality that in principle works on all systems, it also > > actually works on all systems for Python. In cases where stuff was > > only available on Linux even though it could work on other systems, > > people *did* see it as a bad thing. > > Agreed, but it's a probably with the external library. That's like > saying we are responsible if libffi fails building with the AIX > compiler. s/probably/problem/ ;) ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 20:36:49 2011 From: report at bugs.python.org (Brian Curtin) Date: Tue, 04 Oct 2011 18:36:49 +0000 Subject: [issue13101] Module Doc viewer closes when browser window closes on Windows 8 Message-ID: <1317753409.09.0.358970019617.issue13101@psf.upfronthosting.co.za> New submission from Brian Curtin : Reported by Ryan Wells (v-rywel at microsoft.com) of Microsoft, in reference to a problem with the Module Doc viewer on Windows 8 when using Internet Explorer 10. This was reported on 3.2.2, but it's likely the same on 2.7. Reference #: 70652 Description of the Problem: The application Python Module Doc is automatically closed when Internet Explorer 10 is closed. Steps to Reproduce: 1. Install Windows Developer Preview 2. Install Python 3.2.2 3. Launch Module Doc. Start Menu -> All Program -> Python -> Manual Docs 4. 
Click on the button open browser 5. It should open the site http://localhost:7464/ In Internet Explorer 10 and the contents should be displayed 6. Should be able to view list of Modules, Scripts, DLLs, and Libraries etc. 7. Close Internet Explorer Expected Result: Internet Explorer 10 should only get closed and we should be able to work with the application Module Doc. Actual Result: The application Module Doc is closed with Internet Explorer 10. Developer Notes: There is likely a difference in return values between IE8 and IE9/10 when launched from the app. ---------- assignee: docs at python components: Documentation, Windows messages: 144918 nosy: brian.curtin, docs at python priority: normal severity: normal status: open title: Module Doc viewer closes when browser window closes on Windows 8 type: behavior versions: Python 3.3 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 20:39:27 2011 From: report at bugs.python.org (Vlad Riscutia) Date: Tue, 04 Oct 2011 18:39:27 +0000 Subject: [issue12880] ctypes: clearly document how structure bit fields are allocated In-Reply-To: <1314930851.91.0.112444630543.issue12880@psf.upfronthosting.co.za> Message-ID: <1317753567.25.0.786594630469.issue12880@psf.upfronthosting.co.za> Vlad Riscutia added the comment: I agree compiler matters for alignment but if you look at PyCField_FromDesc, you will see the layout is pretty much #ifdef MS_WIN32 - #else. Sorry for generalizing, "all" indeed is not the right word. My point is that we should set expectation correctly - VC++-style on Windows, GCC-style everywhere else and encourage users to access structure members by name, not raw memory. Issues opened for bitfields *usually* are of the form I mentioned - setting raw memory to some bytes then seeing members are not what user expected, even if ctypes algorithm works correctly. 
As I said, I will revise the patch and maybe make it more clear that users should look up how bitfield allocation works for their compiler instead of trying to understand this via structure raw memory. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 20:40:51 2011 From: report at bugs.python.org (Roundup Robot) Date: Tue, 04 Oct 2011 18:40:51 +0000 Subject: [issue11956] 3.3 : test_import.py causes 'make test' to fail In-Reply-To: <1304094086.01.0.944392387954.issue11956@psf.upfronthosting.co.za> Message-ID: Roundup Robot added the comment: New changeset cbda512c6d7f by Charles-Fran?ois Natali in branch '3.2': Issue #11956: Always skip test_import.test_unwritable_directory when run as http://hg.python.org/cpython/rev/cbda512c6d7f New changeset 971093a75613 by Charles-Fran?ois Natali in branch 'default': Issue #11956: Always skip test_import.test_unwritable_directory when run as http://hg.python.org/cpython/rev/971093a75613 ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 20:56:18 2011 From: report at bugs.python.org (Nick Coghlan) Date: Tue, 04 Oct 2011 18:56:18 +0000 Subject: [issue13101] Module Doc viewer closes when browser window closes on Windows 8 In-Reply-To: <1317753409.09.0.358970019617.issue13101@psf.upfronthosting.co.za> Message-ID: <1317754578.99.0.926548764728.issue13101@psf.upfronthosting.co.za> Nick Coghlan added the comment: If that's the app I think it is (pydoc -g), we're probably going to kill it off in 3.3 in favour of the -b option. 
---------- nosy: +ncoghlan _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 21:01:26 2011 From: report at bugs.python.org (Nick Coghlan) Date: Tue, 04 Oct 2011 19:01:26 +0000 Subject: [issue13101] Module Doc viewer closes when browser window closes on Windows 8 In-Reply-To: <1317753409.09.0.358970019617.issue13101@psf.upfronthosting.co.za> Message-ID: <1317754886.65.0.967783598215.issue13101@psf.upfronthosting.co.za> Nick Coghlan added the comment: Slight correction, pydoc.gui() is already gone in current hg tip. However, this error may be indicative of an underlying problem with webbrowser.open(url) throwing an exception. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 21:04:08 2011 From: report at bugs.python.org (Brian Curtin) Date: Tue, 04 Oct 2011 19:04:08 +0000 Subject: [issue13101] Module Doc viewer closes when browser window closes on Windows 8 In-Reply-To: <1317753409.09.0.358970019617.issue13101@psf.upfronthosting.co.za> Message-ID: <1317755048.33.0.0500806581238.issue13101@psf.upfronthosting.co.za> Brian Curtin added the comment: The menu shortcut opens up the following: "C:\Python32\pythonw.exe" "C:\Python32\Tools\scripts\pydocgui.pyw", which is just pydoc.gui() ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 22:11:54 2011 From: report at bugs.python.org (Glenn Washburn) Date: Tue, 04 Oct 2011 20:11:54 +0000 Subject: [issue13102] xml.dom.minidom does not support default namespaces Message-ID: <1317759114.05.0.782450819911.issue13102@psf.upfronthosting.co.za> New submission from Glenn Washburn : When using getAttributeNS, attributes with no namespace should be considered as having the default namespace for that scope. 
See examples in http://www.w3.org/TR/REC-xml-names/#defaulting. Python's xml.dom.minidom will always set the namespace to None for attributes that have no namespace prefix. I've attached a test program to illustrate this issue in action. The output I get is: [((None, u'attr'), u'value1')] [(('http://www.w3.org/2000/xmlns/', 'xmlns'), u'http://path/to/ns2#'), ((None, u'attr'), u'value2')] [((u'http://path/to/ns2#', u'attr'), u'value3')] Successfully got child3 attr value In the first two cases the namespaceURI is None, but it should be set to the default namespace specified in the root element. I believe this problem occurs with all *NS functions. Not tested in 3.x. ---------- components: XML files: test.py messages: 144924 nosy: crass priority: normal severity: normal status: open title: xml.dom.minidom does not support default namespaces type: behavior versions: Python 2.6, Python 2.7 Added file: http://bugs.python.org/file23313/test.py _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 22:41:20 2011 From: report at bugs.python.org (Xavier de Gaye) Date: Tue, 04 Oct 2011 20:41:20 +0000 Subject: [issue13103] copy of an asyncore dispatcher causes infinite recursion Message-ID: <1317760879.94.0.397539088986.issue13103@psf.upfronthosting.co.za> New submission from Xavier de Gaye : A regression occurs in python 3.2 when doing a copy of an asyncore dispatcher. $ python3.1 Python 3.1.2 (r312:79147, Apr 4 2010, 17:46:48) [GCC 4.3.2] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import asyncore, copy >>> copy.copy(asyncore.dispatcher()) $ python3.2 Python 3.2 (r32:88445, Jun 18 2011, 20:30:18) [GCC 4.3.2] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import asyncore, copy
>>> copy.copy(asyncore.dispatcher())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.2/copy.py", line 97, in copy
    return _reconstruct(x, rv, 0)
  File "/usr/local/lib/python3.2/copy.py", line 291, in _reconstruct
    if hasattr(y, '__setstate__'):
  File "/usr/local/lib/python3.2/asyncore.py", line 410, in __getattr__
    retattr = getattr(self.socket, attr)
  ....
  File "/usr/local/lib/python3.2/asyncore.py", line 410, in __getattr__
    retattr = getattr(self.socket, attr)
  File "/usr/local/lib/python3.2/asyncore.py", line 410, in __getattr__
    retattr = getattr(self.socket, attr)
RuntimeError: maximum recursion depth exceeded while calling a Python object

This occurs after the 'copy' module has created the new instance with __new__(). This new instance does not have the 'socket' attribute, hence the infinite recursion. Adding the following methods to the dispatcher class fixes the infinite recursion:

    def __getstate__(self):
        state = self.__dict__.copy()
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)

But it does not explain why the recursion occurs in 3.2 and not in 3.1.
---------- components: Extension Modules messages: 144925 nosy: xdegaye priority: normal severity: normal status: open title: copy of an asyncore dispatcher causes infinite recursion type: behavior versions: Python 3.2 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 23:31:51 2011 From: report at bugs.python.org (Amorilia) Date: Tue, 04 Oct 2011 21:31:51 +0000 Subject: [issue13081] Crash in Windows with unknown cause In-Reply-To: <1317417665.11.0.474916583341.issue13081@psf.upfronthosting.co.za> Message-ID: <1317763911.39.0.927652261818.issue13081@psf.upfronthosting.co.za> Amorilia added the comment: Quick update: apparently, fixing another seemingly unrelated bug fixed this crashing issue as well for rlibiez.
Here's the relevant commit: https://github.com/amorilia/pyffi/commit/bd7886eefedfce8fb108c4701cf0467e2a707907 Basically, the problem was with multiprocessing.Pools not getting closed and joined. I'm attaching a script (poolcrash.py) which, theoretically, ought to reproduce the crash - although it doesn't quite reproduce it on my machine; I'm running out of memory and my machine just hangs desperately accessing the swap file before anything happens... Beware that running the bugged script may force you to perform a hard reboot of your system, particularly if you wait until all physical memory is used up by zombie processes. ---------- Added file: http://bugs.python.org/file23314/poolcrash.py _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Tue Oct 4 23:32:53 2011 From: report at bugs.python.org (Jeremy Kloth) Date: Tue, 04 Oct 2011 21:32:53 +0000 Subject: [issue13102] xml.dom.minidom does not support default namespaces In-Reply-To: <1317759114.05.0.782450819911.issue13102@psf.upfronthosting.co.za> Message-ID: <1317763973.25.0.614840648667.issue13102@psf.upfronthosting.co.za> Jeremy Kloth added the comment: Please read the link which you posted. Quoting the second paragraph, second sentence: "Default namespace declarations do not apply directly to attribute names;" and from the third paragraph, third sentence: "The namespace name for an unprefixed attribute name always has no value." Therefore minidom *is* conformant by having None as the namespace-uri for unprefixed attribute names.
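For illustration only, here is a minimal sketch of the behaviour described above. It is not the test.py attached to the issue; the document, element names, and namespace URIs are made up for this example. It should show that an element picks up the default namespace while an unprefixed attribute on the same element does not:

```python
from xml.dom import minidom

# Hypothetical document (not the reporter's test.py): the root element
# declares a default namespace, the child uses an explicit prefix.
DOC = (
    '<root xmlns="http://path/to/ns1#" attr="value1">'
    '<p:child xmlns:p="http://path/to/ns2#" p:attr="value3"/>'
    '</root>'
)

dom = minidom.parseString(DOC)
root = dom.documentElement
child = root.firstChild

# The element name is in the default namespace...
print(root.namespaceURI)

# ...but the unprefixed attribute has no namespace at all.
print(root.getAttributeNode("attr").namespaceURI)
print(root.getAttributeNS(None, "attr"))

# A prefixed attribute is looked up by its namespace URI and local name.
print(child.getAttributeNS("http://path/to/ns2#", "attr"))
```

So code that wants an unprefixed attribute through the *NS functions would pass None as the namespace, not the URI of the enclosing default namespace.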
---------- nosy: +jkloth _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 00:29:31 2011 From: report at bugs.python.org (Ned Deily) Date: Tue, 04 Oct 2011 22:29:31 +0000 Subject: [issue13061] Decimal module yields incorrect results when Python compiled with clang In-Reply-To: <1317317264.78.0.478675633552.issue13061@psf.upfronthosting.co.za> Message-ID: <1317767371.34.0.464564030141.issue13061@psf.upfronthosting.co.za> Changes by Ned Deily : ---------- assignee: ronaldoussoren -> ned.deily stage: -> committed/rejected status: pending -> closed title: Decimal module yields incorrect results when Python compiled with llvm -> Decimal module yields incorrect results when Python compiled with clang _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 01:01:23 2011 From: report at bugs.python.org (Larry Hastings) Date: Tue, 04 Oct 2011 23:01:23 +0000 Subject: [issue13053] Add Capsule migration documentation to "cporting" In-Reply-To: <1317224248.53.0.263122473581.issue13053@psf.upfronthosting.co.za> Message-ID: <1317769283.59.0.720844711677.issue13053@psf.upfronthosting.co.za> Larry Hastings added the comment: New patch based on comments from Ezio Melotti--thanks, Ezio! * capsulethunk.h is now its own file in Doc/includes. * Various minor formatting touchups. * I added some rationale behind the thunked PyCapsule_SetName behavior. 
---------- Added file: http://bugs.python.org/file23315/larry.cporting.capsules.r3.diff _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 01:02:51 2011 From: report at bugs.python.org (Deokhwan Kim) Date: Tue, 04 Oct 2011 23:02:51 +0000 Subject: [issue13104] urllib.request.thishost() returns a garbage value Message-ID: <1317769371.74.0.590194297783.issue13104@psf.upfronthosting.co.za> New submission from Deokhwan Kim : There is a minor typo in Lib/urllib/request.py:thishost(). Because of it, the thishost() function is returning a garbage value: >>> import urllib.request >>> urllib.request.thishost() ('XXXXXXX.XXXXX.XXX.com', ['X.XXXXX.XXX.com'], ['123.45.67.89']) It is expected to return the IP addresses of the current host, so the correct return value would be like: >>> urllib.request.thishost.__doc__ 'Return the IP addresses of the current host.' >>> urllib.request.thishost() ('127.0.0.1', '127.0.1.1') The attached patch will fix the mistake . ---------- components: Library (Lib) files: thishost.patch keywords: patch messages: 144929 nosy: dkim priority: normal severity: normal status: open title: urllib.request.thishost() returns a garbage value type: behavior versions: Python 3.2 Added file: http://bugs.python.org/file23316/thishost.patch _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 01:24:16 2011 From: report at bugs.python.org (Larry Hastings) Date: Tue, 04 Oct 2011 23:24:16 +0000 Subject: [issue13105] Please elaborate on how 2.x and 3.x are different heads Message-ID: <1317770656.72.0.880626385137.issue13105@psf.upfronthosting.co.za> New submission from Larry Hastings : It wasn't clear to me after reading the "Forward Porting" section exactly what was going on. Nick Coghlan spelled it out for me in a private email, and suggested that maybe this stuff should be in the devguide proper. 
Here's some specific stuff that I didn't understand until Nick explained it to me with simple words: * 2.x and 3.x have separate heads in the same repository * Since they're totally divorced, the order you check in to 2.x and 3.x does not matter * DO NOT MERGE between 2.x and 3.x * Branches that are in security-fix-only mode (e.g. 3.1) don't get bugfixes or documentation fixes (surely mentioned elsewhere, but I personally would have been helped with a reminder) I suggest it'd be clearer to start with discussing "2.x and 3.x are separate heads", and *then* move on to "But when merging changes solely inside a major version" and talk about forward-porting. Would you be interested in a patch? ---------- components: Devguide messages: 144930 nosy: larry priority: normal severity: normal status: open title: Please elaborate on how 2.x and 3.x are different heads type: feature request _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 01:26:39 2011 From: report at bugs.python.org (Jesse Noller) Date: Tue, 04 Oct 2011 23:26:39 +0000 Subject: [issue10348] multiprocessing: use SysV semaphores on FreeBSD In-Reply-To: <1289181459.49.0.601327253766.issue10348@psf.upfronthosting.co.za> Message-ID: <1317770799.23.0.287521451059.issue10348@psf.upfronthosting.co.za> Jesse Noller added the comment: Charles and Antoine's votes match my own, therefore closing the bug wont fix ---------- resolution: -> wont fix status: open -> closed _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 01:51:15 2011 From: report at bugs.python.org (Terry J. 
Reedy) Date: Tue, 04 Oct 2011 23:51:15 +0000 Subject: [issue12880] ctypes: clearly document how structure bit fields are allocated In-Reply-To: <1314930851.91.0.112444630543.issue12880@psf.upfronthosting.co.za> Message-ID: <1317772275.78.0.121151886567.issue12880@psf.upfronthosting.co.za> Terry J. Reedy added the comment: If I understand correctly, this doc patch would apply to 2.7 and 3.2 also. I have two style comments. I believe "It is important to note that bit field allocation and layout in memory is not defined as a standard, rather its implementation is compiler-specific." could be shortened to "Bit field allocation and memory layout is compiler-specific." To me, this leads nicely into the proposed sentence that follows. "it is recommended that no assumptions are made about the structure size and layout." I do not like 'it is recommended'. Let us state the fact. "any assumptions about the structure size and layout may be wrong." ---------- versions: -Python 3.4 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 02:09:09 2011 From: report at bugs.python.org (Ned Deily) Date: Wed, 05 Oct 2011 00:09:09 +0000 Subject: [issue13104] urllib.request.thishost() returns a garbage value In-Reply-To: <1317769371.74.0.590194297783.issue13104@psf.upfronthosting.co.za> Message-ID: <1317773349.39.0.20605170014.issue13104@psf.upfronthosting.co.za> Changes by Ned Deily : ---------- nosy: +orsenthil stage: -> patch review versions: +Python 3.3 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 02:50:24 2011 From: report at bugs.python.org (Mark Hammond) Date: Wed, 05 Oct 2011 00:50:24 +0000 Subject: [issue13101] Module Doc viewer closes when browser window closes on Windows 8 In-Reply-To: <1317753409.09.0.358970019617.issue13101@psf.upfronthosting.co.za> Message-ID: 
<1317775824.63.0.290294797485.issue13101@psf.upfronthosting.co.za> Mark Hammond added the comment: For some reason, IE is struggling to even display the page - it just seems to sit there loading the page without displaying anything, but hitting "stop" then "refresh" usually brings it up. But if you kill IE (which best I can tell can only be done via the task manager - it has no other Windows controls) the doc server process does also terminate. If you run the doc server using python.exe, you will notice tracebacks in the console due to the socket connection being reset (which is probably related to the above problems - the socket should have been fully read by the time you manage to kill IE) - but using python.exe the process stays alive serving requests. I *guess* that the problem is pythonw.exe is hitting an error when it attempts to print to the invalid stderr handle. It might be possible that somehow under Windows 8, stderr isn't buffered (or has as large of a buffer) as other Windows versions, so dies when a small amount of data is written to stderr - but I suspect the same problem could be provoked on other Windows versions by arranging for > 8k of "connection reset by peer" tracebacks to be written, at which point the buffer is attempted to be flushed and fails. Here endeth my speculation for the day ;) ---------- nosy: +mhammond _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 04:11:49 2011 From: report at bugs.python.org (Aaron Staley) Date: Wed, 05 Oct 2011 02:11:49 +0000 Subject: [issue13106] Incorrect pool.py distributed with Python 2.7 windows 32bit Message-ID: <1317780709.51.0.218213650028.issue13106@psf.upfronthosting.co.za> New submission from Aaron Staley : The multiprocess/pool.py distributed with the Python 2.7.2 Windows Installer is different from the one distributed with the 64 bit windows installer or source tarball - and is buggy. 
Specifically, see Pool._terminate_pool: def _terminate_pool(cls, taskqueue, inqueue, outqueue, pool, worker_handler, task_handler, result_handler, cache): # this is guaranteed to only be called once debug('finalizing pool') worker_handler._state = TERMINATE task_handler._state = TERMINATE taskqueue.put(None) # THIS LINE MISSING! Without that line, termination may deadlock during Pool._help_stuff_finish. The consequence to the user is the interpreter not shutting down. ---------- components: Windows messages: 144934 nosy: Aaron.Staley priority: normal severity: normal status: open title: Incorrect pool.py distributed with Python 2.7 windows 32bit versions: Python 2.7 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 04:27:02 2011 From: report at bugs.python.org (Brian Curtin) Date: Wed, 05 Oct 2011 02:27:02 +0000 Subject: [issue13081] Crash in Windows with unknown cause In-Reply-To: <1317417665.11.0.474916583341.issue13081@psf.upfronthosting.co.za> Message-ID: <1317781622.86.0.97390353077.issue13081@psf.upfronthosting.co.za> Brian Curtin added the comment: I tried that script on 2.7 and like it did for you, it just ran until my machine became unusable. On 3.x I think I got a RuntimeError after a while, but I forgot exactly what happened since the machine ended up being hosed later from the 2.7 run. In any event, it certainly didn't crash there and only went a short time before erroring out with some exception. 
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 04:41:47 2011 From: report at bugs.python.org (Meador Inge) Date: Wed, 05 Oct 2011 02:41:47 +0000 Subject: [issue3163] module struct support for ssize_t and size_t In-Reply-To: <1214071551.33.0.650558563727.issue3163@psf.upfronthosting.co.za> Message-ID: <1317782507.79.0.972955552598.issue3163@psf.upfronthosting.co.za> Meador Inge added the comment: Found a few test case nits. Comments in Rietveld. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 05:48:02 2011 From: report at bugs.python.org (Olivier Refalo) Date: Wed, 05 Oct 2011 03:48:02 +0000 Subject: [issue9098] MSYS build fails with `S_IXGRP' undeclared In-Reply-To: <1277729031.36.0.263470334817.issue9098@psf.upfronthosting.co.za> Message-ID: <1317786482.03.0.664531017952.issue9098@psf.upfronthosting.co.za> Olivier Refalo added the comment: Hmm, your patch actually works on MSYS! OK, so I am pretty much having the very same issue.
Could not find platform dependent libraries Consider setting $PYTHONHOME to [:] Fatal Python error: Py_Initialize: unable to load the file system codec LookupError: no codec search functions registered: can't find encoding ---------- nosy: +orefalo versions: +Python 3.2 -Python 2.6 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 06:25:17 2011 From: report at bugs.python.org (Ezio Melotti) Date: Wed, 05 Oct 2011 04:25:17 +0000 Subject: [issue13106] Incorrect pool.py distributed with Python 2.7 windows 32bit In-Reply-To: <1317780709.51.0.218213650028.issue13106@psf.upfronthosting.co.za> Message-ID: <1317788717.46.0.0789620430526.issue13106@psf.upfronthosting.co.za> Changes by Ezio Melotti : ---------- nosy: +jnoller stage: -> test needed type: -> behavior _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 06:26:48 2011 From: report at bugs.python.org (Ezio Melotti) Date: Wed, 05 Oct 2011 04:26:48 +0000 Subject: [issue13105] Please elaborate on how 2.x and 3.x are different heads In-Reply-To: <1317770656.72.0.880626385137.issue13105@psf.upfronthosting.co.za> Message-ID: <1317788808.31.0.699840919829.issue13105@psf.upfronthosting.co.za> Changes by Ezio Melotti : ---------- nosy: +ezio.melotti stage: -> needs patch _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 06:28:16 2011 From: report at bugs.python.org (Ezio Melotti) Date: Wed, 05 Oct 2011 04:28:16 +0000 Subject: [issue13104] urllib.request.thishost() returns a garbage value In-Reply-To: <1317769371.74.0.590194297783.issue13104@psf.upfronthosting.co.za> Message-ID: <1317788896.12.0.517068127424.issue13104@psf.upfronthosting.co.za> Changes by Ezio Melotti : ---------- nosy: +ezio.melotti stage: patch review -> test needed 
_______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 06:29:21 2011 From: report at bugs.python.org (Ezio Melotti) Date: Wed, 05 Oct 2011 04:29:21 +0000 Subject: [issue13102] xml.dom.minidom does not support default namespaces In-Reply-To: <1317759114.05.0.782450819911.issue13102@psf.upfronthosting.co.za> Message-ID: <1317788961.56.0.700516520037.issue13102@psf.upfronthosting.co.za> Changes by Ezio Melotti : ---------- nosy: +ezio.melotti _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 06:30:49 2011 From: report at bugs.python.org (Ezio Melotti) Date: Wed, 05 Oct 2011 04:30:49 +0000 Subject: [issue13103] copy of an asyncore dispatcher causes infinite recursion In-Reply-To: <1317760879.94.0.397539088986.issue13103@psf.upfronthosting.co.za> Message-ID: <1317789049.85.0.733550478247.issue13103@psf.upfronthosting.co.za> Changes by Ezio Melotti : ---------- nosy: +giampaolo.rodola, josiahcarlson, stutzbach stage: -> test needed _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 11:24:04 2011 From: report at bugs.python.org (Jesús Cea Avión) Date: Wed, 05 Oct 2011 09:24:04 +0000 Subject: [issue6715] xz compressor support In-Reply-To: <1250502444.31.0.107447392137.issue6715@psf.upfronthosting.co.za> Message-ID: <1317806644.12.0.254163617697.issue6715@psf.upfronthosting.co.za> Jesús Cea Avión added the comment: I agree with Martin here. We should *NOT* have first and second class OS support, if we can avoid it. That said, I wonder what happens in Windows with the BZ2 module, for instance :-?. Do we include the BZ2 source code to compile it under Windows? I know, for instance, that Windows 2.* builds include Berkeley DB.
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 11:27:57 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Wed, 05 Oct 2011 09:27:57 +0000 Subject: [issue6715] xz compressor support In-Reply-To: <1317806644.12.0.254163617697.issue6715@psf.upfronthosting.co.za> Message-ID: <1317806663.3674.1.camel@localhost.localdomain> Antoine Pitrou added the comment: > I agree with Martin here. We should *NOT* have first and second class > OS support, if we can avoid it. The key word being "if we can avoid it". Jesus, if you are a Windows expert, your contribution is welcome. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 11:31:16 2011 From: report at bugs.python.org (Amaury Forgeot d'Arc) Date: Wed, 05 Oct 2011 09:31:16 +0000 Subject: [issue6715] xz compressor support In-Reply-To: <1250502444.31.0.107447392137.issue6715@psf.upfronthosting.co.za> Message-ID: <1317807076.13.0.945347413393.issue6715@psf.upfronthosting.co.za> Amaury Forgeot d'Arc added the comment: For bz2, Tools/buildbot/external-common.bat has code to download the bz2 source, and PCbuild/_bz2.vcproj includes and compiles these files together with _bz2.pyd. The _ssl module does a similar thing, except that libeay32.lib and libssleay32.lib are built in a separate step. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 11:58:12 2011 From: report at bugs.python.org (Martin v. Löwis) Date: Wed, 05 Oct 2011 09:58:12 +0000 Subject: [issue6715] xz compressor support In-Reply-To: <1317806644.12.0.254163617697.issue6715@psf.upfronthosting.co.za> Message-ID: <4E8C2A31.2040206@v.loewis.de> Martin v. Löwis added the comment: > That said, I wonder what happens in Windows with the BZ2 module, for instance :-?.
External code currently lives at http://svn.python.org/projects/external/. The build process gets it from there, and we may have local modifications to libraries where necessary. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 12:00:56 2011 From: report at bugs.python.org (Jens Diemer) Date: Wed, 05 Oct 2011 10:00:56 +0000 Subject: [issue11638] pysetup un sdist crashes with weird trace if version is unicode by accident In-Reply-To: <1300828733.05.0.475519990909.issue11638@psf.upfronthosting.co.za> Message-ID: <1317808856.68.0.144598007986.issue11638@psf.upfronthosting.co.za> Changes by Jens Diemer : ---------- nosy: +jens _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 12:09:22 2011 From: report at bugs.python.org (STINNER Victor) Date: Wed, 05 Oct 2011 10:09:22 +0000 Subject: [issue6715] xz compressor support In-Reply-To: <1250502444.31.0.107447392137.issue6715@psf.upfronthosting.co.za> Message-ID: <1317809362.03.0.0890804054503.issue6715@psf.upfronthosting.co.za> STINNER Victor added the comment: > I agree with Martin here. We should *NOT* have first > and second class OS support, if we can avoid it. Ok but who will do the job? If nobody is motivated to fix compiler issues, it would be a pity to not add the module for that. 
---------- nosy: +haypo _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 12:13:26 2011 From: report at bugs.python.org (Jens Diemer) Date: Wed, 05 Oct 2011 10:13:26 +0000 Subject: [issue11638] pysetup un sdist crashes with weird trace if version is unicode by accident In-Reply-To: <1300828733.05.0.475519990909.issue11638@psf.upfronthosting.co.za> Message-ID: <1317809606.42.0.655470832066.issue11638@psf.upfronthosting.co.za> Jens Diemer added the comment: I have the same problem, using distutils (and not distutils2): Traceback (most recent call last): File "./setup.py", line 60, in test_suite="creole.tests.run_all_tests", File "/usr/lib/python2.7/distutils/core.py", line 152, in setup dist.run_commands() File "/usr/lib/python2.7/distutils/dist.py", line 953, in run_commands self.run_command(cmd) File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command cmd_obj.run() File "/home/jens/python2creole_env/local/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg/setuptools/command/sdist.py", line 147, in run File "/usr/lib/python2.7/distutils/command/sdist.py", line 448, in make_distribution owner=self.owner, group=self.group) File "/usr/lib/python2.7/distutils/cmd.py", line 392, in make_archive owner=owner, group=group) File "/usr/lib/python2.7/distutils/archive_util.py", line 237, in make_archive filename = func(base_name, base_dir, **kwargs) File "/usr/lib/python2.7/distutils/archive_util.py", line 101, in make_tarball tar = tarfile.open(archive_name, 'w|%s' % tar_compression[compress]) File "/usr/lib/python2.7/tarfile.py", line 1687, in open _Stream(name, filemode, comptype, fileobj, bufsize), File "/usr/lib/python2.7/tarfile.py", line 431, in __init__ self._init_write_gz() File "/usr/lib/python2.7/tarfile.py", line 459, in _init_write_gz self.__write(self.name + NUL) File "/usr/lib/python2.7/tarfile.py", line 475, in __write self.buf += s UnicodeDecodeError: 
'ascii' codec can't decode byte 0x8b in position 1: ordinal not in range(128) The problem seems to be that tarfile._Stream() can't handle 'name' as unicode. With this change, it works: class _Stream: ... def __init__(self, name, mode, comptype, fileobj, bufsize): ... self.name = str(name) or "" ++++ + Don't know if this is related to the usage of: from __future__ import unicode_literals ? ---------- components: +Distutils _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 12:30:42 2011 From: report at bugs.python.org (Petri Lehtinen) Date: Wed, 05 Oct 2011 10:30:42 +0000 Subject: [issue13073] message_body argument of HTTPConnection.endheaders is undocumented In-Reply-To: <1317370143.22.0.297306447289.issue13073@psf.upfronthosting.co.za> Message-ID: <1317810642.73.0.210514641552.issue13073@psf.upfronthosting.co.za> Petri Lehtinen added the comment: The 2.7 documentation should mention the version in which the argument was added. I believe it was 2.7. ---------- resolution: fixed -> status: closed -> open _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 12:32:29 2011 From: report at bugs.python.org (Ezio Melotti) Date: Wed, 05 Oct 2011 10:32:29 +0000 Subject: [issue13073] message_body argument of HTTPConnection.endheaders is undocumented In-Reply-To: <1317370143.22.0.297306447289.issue13073@psf.upfronthosting.co.za> Message-ID: <1317810749.55.0.422552656027.issue13073@psf.upfronthosting.co.za> Ezio Melotti added the comment: I also left some comments on the review page that should be addressed.
---------- nosy: +ezio.melotti _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 12:50:45 2011 From: report at bugs.python.org (Jesús Cea Avión) Date: Wed, 05 Oct 2011 10:50:45 +0000 Subject: [issue6715] xz compressor support In-Reply-To: <1250502444.31.0.107447392137.issue6715@psf.upfronthosting.co.za> Message-ID: <1317811845.51.0.893391050844.issue6715@psf.upfronthosting.co.za> Jesús Cea Avión added the comment: Antoine, I am a Linux/Solaris/Illumos guy. I only use Windows (virtualized) to sync my iPhone with iTunes :) ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 13:08:23 2011 From: report at bugs.python.org (Adam Byrtek) Date: Wed, 05 Oct 2011 11:08:23 +0000 Subject: [issue13107] Text width in optparse.py can become negative Message-ID: <1317812903.38.0.572106699519.issue13107@psf.upfronthosting.co.za> New submission from Adam Byrtek : Code snippet from optparse.py: self.help_position = min(max_len + 2, self.max_help_position) self.help_width = self.width - self.help_position Where self.width is initialized with the COLUMNS environment variable. On narrow terminals it can happen that self.help_position >= self.width, so self.help_width is zero or negative, leading to an exception in textwrap.py: raise ValueError("invalid width %r (must be > 0)" % self.width) ValueError: invalid width -15 (must be > 0) A reasonable workaround would be to trim part of the help text instead of causing an exception.
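For illustration, a minimal sketch of the kind of workaround suggested above: clamp the computed help width to a small positive floor before handing it to textwrap. The helper names (safe_help_width, wrap_help) and the floor of 11 columns are assumptions made for this example; they are not part of the optparse API and not necessarily how the issue ends up being fixed.

```python
import textwrap


def safe_help_width(total_width, help_position, minimum=11):
    """Return a help-column width that is always positive.

    `minimum` is an arbitrary floor chosen for this sketch.
    """
    return max(total_width - help_position, minimum)


def wrap_help(text, total_width, help_position):
    # textwrap rejects widths <= 0, so clamp before wrapping.
    width = safe_help_width(total_width, help_position)
    return textwrap.wrap(text, width)
```

With a 10-column terminal and a help position of 25, the unclamped width would be -15 (the value in the traceback above); the sketch wraps at 11 columns instead of raising.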
---------- components: Library (Lib) messages: 144947 nosy: adambyrtek priority: normal severity: normal status: open title: Text width in optparse.py can become negative type: behavior _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 13:16:13 2011 From: report at bugs.python.org (Xavier de Gaye) Date: Wed, 05 Oct 2011 11:16:13 +0000 Subject: [issue13103] copy of an asyncore dispatcher causes infinite recursion In-Reply-To: <1317760879.94.0.397539088986.issue13103@psf.upfronthosting.co.za> Message-ID: <1317813373.95.0.798266766344.issue13103@psf.upfronthosting.co.za> Xavier de Gaye added the comment: The infinite recursion occurs also when running python 3.2 with the extension modules copy, copyreg and asyncore from python 3.1. So it seems this regression is not caused by a modification in these modules. Anyway, the bug is in asyncore. The attached patch fixes it and is more robust than adding the __getstate__ and __setstate__ methods to dispatcher. The patch includes a test case. ---------- keywords: +patch Added file: http://bugs.python.org/file23317/infinite_recursion_asyncore.patch _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 13:41:11 2011 From: report at bugs.python.org (Aaron Staley) Date: Wed, 05 Oct 2011 11:41:11 +0000 Subject: [issue13106] Incorrect pool.py distributed with Python 2.7 windows 32bit In-Reply-To: <1317780709.51.0.218213650028.issue13106@psf.upfronthosting.co.za> Message-ID: <1317814871.42.0.701796806576.issue13106@psf.upfronthosting.co.za> Aaron Staley added the comment: Never mind; looks like this functionality was moved to handle_workers. I had inadvertently been testing under a modified pool.py. Sorry for the inconvenience! 
---------- resolution: -> invalid status: open -> closed _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 13:45:42 2011 From: report at bugs.python.org (Ezio Melotti) Date: Wed, 05 Oct 2011 11:45:42 +0000 Subject: [issue13106] Incorrect pool.py distributed with Python 2.7 windows 32bit In-Reply-To: <1317780709.51.0.218213650028.issue13106@psf.upfronthosting.co.za> Message-ID: <1317815142.32.0.456186579595.issue13106@psf.upfronthosting.co.za> Changes by Ezio Melotti : ---------- stage: test needed -> committed/rejected _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 15:46:54 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Wed, 05 Oct 2011 13:46:54 +0000 Subject: [issue3163] module struct support for ssize_t and size_t In-Reply-To: <1214071551.33.0.650558563727.issue3163@psf.upfronthosting.co.za> Message-ID: <1317822414.04.0.564528197138.issue3163@psf.upfronthosting.co.za> Antoine Pitrou added the comment: Thanks for the comments. Here is an updated patch. 
---------- Added file: http://bugs.python.org/file23318/struct_nn4.patch _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 16:18:37 2011 From: report at bugs.python.org (Mark Dickinson) Date: Wed, 05 Oct 2011 14:18:37 +0000 Subject: [issue3163] module struct support for ssize_t and size_t In-Reply-To: <1214071551.33.0.650558563727.issue3163@psf.upfronthosting.co.za> Message-ID: <1317824317.96.0.302082374193.issue3163@psf.upfronthosting.co.za> Changes by Mark Dickinson : ---------- assignee: mark.dickinson -> _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 17:27:53 2011 From: report at bugs.python.org (Roundup Robot) Date: Wed, 05 Oct 2011 15:27:53 +0000 Subject: [issue13073] message_body argument of HTTPConnection.endheaders is undocumented In-Reply-To: <1317370143.22.0.297306447289.issue13073@psf.upfronthosting.co.za> Message-ID: Roundup Robot added the comment: New changeset befa7b926aad by Senthil Kumaran in branch '3.2': Issue #13073 - Address the review comments made by Ezio. http://hg.python.org/cpython/rev/befa7b926aad New changeset a7b7ba225de7 by Senthil Kumaran in branch 'default': merge from 3.2. Issue #13073 - Address the review comments made by Ezio. 
http://hg.python.org/cpython/rev/a7b7ba225de7 ---------- nosy: +python-dev _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 17:53:12 2011 From: report at bugs.python.org (Roundup Robot) Date: Wed, 05 Oct 2011 15:53:12 +0000 Subject: [issue13073] message_body argument of HTTPConnection.endheaders is undocumented In-Reply-To: <1317370143.22.0.297306447289.issue13073@psf.upfronthosting.co.za> Message-ID: Roundup Robot added the comment: New changeset 64fae6f7b64c by Senthil Kumaran in branch '2.7': Issue13073 - Address review comments and add versionchanged information in the docs. http://hg.python.org/cpython/rev/64fae6f7b64c ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 17:54:12 2011 From: report at bugs.python.org (Senthil Kumaran) Date: Wed, 05 Oct 2011 15:54:12 +0000 Subject: [issue13073] message_body argument of HTTPConnection.endheaders is undocumented In-Reply-To: <1317370143.22.0.297306447289.issue13073@psf.upfronthosting.co.za> Message-ID: <1317830052.93.0.129183242088.issue13073@psf.upfronthosting.co.za> Senthil Kumaran added the comment: I believe, I have addressed all the comments. Closing this report. 
---------- resolution: -> fixed status: open -> closed _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 18:21:26 2011 From: report at bugs.python.org (Xavier de Gaye) Date: Wed, 05 Oct 2011 16:21:26 +0000 Subject: [issue13103] copy of an asyncore dispatcher causes infinite recursion In-Reply-To: <1317760879.94.0.397539088986.issue13103@psf.upfronthosting.co.za> Message-ID: <1317831686.47.0.685622901498.issue13103@psf.upfronthosting.co.za> Xavier de Gaye added the comment: About why the asyncore bug shows up in python 3.2: The simple test below is ok with python 3.1 but triggers a "RuntimeError: maximum recursion depth exceeded..." with python 3.2: $ python3.1 Python 3.1.2 (r312:79147, Apr 4 2010, 17:46:48) [GCC 4.3.2] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> class C: ... def __getattr__(self, attr): ... return getattr(self.foo, attr) ... >>> c = C() >>> hasattr(c, 'bar') False >>> For the reasoning behind this change made in python 3.2, see issue 9666 and the mail http://mail.python.org/pipermail/python-dev/2010-August/103178.html ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 18:27:18 2011 From: report at bugs.python.org (=?utf-8?q?Charles-Fran=C3=A7ois_Natali?=) Date: Wed, 05 Oct 2011 16:27:18 +0000 Subject: [issue13070] segmentation fault in pure-python multi-threaded server In-Reply-To: <1317728151.55.0.216157545721.issue13070@psf.upfronthosting.co.za> Message-ID: Charles-Fran?ois Natali added the comment: Sorry, forgot about this issue... Updated patch (I'm not really satisfied with the error message, don't hesitate if you can think of a better wording). 
---------- Added file: http://bugs.python.org/file23319/buffered_closed_gc-3.diff _______________________________________ Python tracker _______________________________________ -------------- next part -------------- diff --git a/Lib/test/test_io.py b/Lib/test/test_io.py --- a/Lib/test/test_io.py +++ b/Lib/test/test_io.py @@ -2421,6 +2421,20 @@ with self.open(support.TESTFN, "rb") as f: self.assertEqual(f.read(), b"456def") + def test_rwpair_cleared_before_textio(self): + # Issue 13070: TextIOWrapper's finalization would crash when called + # after the reference to the underlying BufferedRWPair got cleared. + for i in range(1000): + b1 = self.BufferedRWPair(self.MockRawIO(), self.MockRawIO()) + t1 = self.TextIOWrapper(b1, encoding="ascii") + b2 = self.BufferedRWPair(self.MockRawIO(), self.MockRawIO()) + t2 = self.TextIOWrapper(b2, encoding="ascii") + # circular references + t1.buddy = t2 + t2.buddy = t1 + support.gc_collect() + + class PyTextIOWrapperTest(TextIOWrapperTest): pass diff --git a/Modules/_io/bufferedio.c b/Modules/_io/bufferedio.c --- a/Modules/_io/bufferedio.c +++ b/Modules/_io/bufferedio.c @@ -2307,6 +2307,10 @@ static PyObject * bufferedrwpair_closed_get(rwpair *self, void *context) { + if (self->writer == NULL) { + PyErr_SetString(PyExc_RuntimeError, "the writer object has been cleared"); + return NULL; + } return PyObject_GetAttr((PyObject *) self->writer, _PyIO_str_closed); } From report at bugs.python.org Wed Oct 5 18:33:08 2011 From: report at bugs.python.org (Roundup Robot) Date: Wed, 05 Oct 2011 16:33:08 +0000 Subject: [issue13104] urllib.request.thishost() returns a garbage value In-Reply-To: <1317769371.74.0.590194297783.issue13104@psf.upfronthosting.co.za> Message-ID: Roundup Robot added the comment: New changeset 805a0a1e3c2b by Senthil Kumaran in branch '3.2': Issue13104 - Fix urllib.request.thishost() utility function. 
http://hg.python.org/cpython/rev/805a0a1e3c2b New changeset a228e59ad693 by Senthil Kumaran in branch 'default': merge from 3.2. Issue13104 - Fix urllib.request.thishost() utility function. http://hg.python.org/cpython/rev/a228e59ad693 ---------- nosy: +python-dev _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 18:34:06 2011 From: report at bugs.python.org (Senthil Kumaran) Date: Wed, 05 Oct 2011 16:34:06 +0000 Subject: [issue13104] urllib.request.thishost() returns a garbage value In-Reply-To: <1317769371.74.0.590194297783.issue13104@psf.upfronthosting.co.za> Message-ID: <1317832446.33.0.368319366982.issue13104@psf.upfronthosting.co.za> Senthil Kumaran added the comment: Thanks for the report. This is fixed now. I hope in 3.3 I remove this old utility functions. (real soon). ---------- assignee: -> orsenthil resolution: -> fixed status: open -> closed _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 18:45:32 2011 From: report at bugs.python.org (=?utf-8?q?Charles-Fran=C3=A7ois_Natali?=) Date: Wed, 05 Oct 2011 16:45:32 +0000 Subject: [issue13045] socket.getsockopt may require custom buffer contents In-Reply-To: <1317666568.73.0.957785734202.issue13045@psf.upfronthosting.co.za> Message-ID: Charles-Fran?ois Natali added the comment: > I've attached an update for the previous patch. Now there's no more > overloading for the third argument and socket.getsockopt accepts one more > optional argument -- a buffer to use as an input to kernel. Remarks: """ + length. If *buffer* is absent and *buflen* is an integer, then *buflen* [...] + this buffer is returned as a bytes object. If *buflen* is absent, an integer """ There's a problem here, the first buflen part should probably be removed. Also, you might want to specify that if a custom buffer is provided, the length argument will be ignored. 
> By the way, I don't really think that any POSIX-compliant UNIX out there > would treat the buffer given to getsockopt in any way different from what > Linux does. It is very easy to copy the buffer from user to kernel and back, > and it is so inconvenient to prevent kernel from reading it prior to > modification, that I bet no one has ever bothered to do this. Me neither, I don't expect the syscall to return EINVAL: the goal is just to test the correct passing of the input buffer, and the length computation. If we can't test this easily within test_socket, it's ok, I guess the following should be enough: - try supplying a non-buffer argument as fourth parameter (e.g. and int), and check that you get a ValueError - supply a buffer with a size == sizeof(int) (SIZEOF_INT is defined in Lib/test/test_socket.py), and call getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 0, ): this should normally succeed, and return a buffer (check the return type) ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 18:47:27 2011 From: report at bugs.python.org (=?utf-8?q?Charles-Fran=C3=A7ois_Natali?=) Date: Wed, 05 Oct 2011 16:47:27 +0000 Subject: [issue10141] SocketCan support In-Reply-To: <1287449366.98.0.655876257649.issue10141@psf.upfronthosting.co.za> Message-ID: <1317833247.16.0.747153035621.issue10141@psf.upfronthosting.co.za> Changes by Charles-Fran?ois Natali : ---------- nosy: +pitrou _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 18:54:46 2011 From: report at bugs.python.org (=?utf-8?q?Charles-Fran=C3=A7ois_Natali?=) Date: Wed, 05 Oct 2011 16:54:46 +0000 Subject: [issue11956] 3.3 : test_import.py causes 'make test' to fail In-Reply-To: <1304094086.01.0.944392387954.issue11956@psf.upfronthosting.co.za> Message-ID: <1317833686.16.0.744852950141.issue11956@psf.upfronthosting.co.za> Changes by 
Charles-François Natali : ---------- resolution: -> fixed stage: -> committed/rejected status: open -> closed _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 19:04:15 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Wed, 05 Oct 2011 17:04:15 +0000 Subject: [issue10141] SocketCan support In-Reply-To: <1287449366.98.0.655876257649.issue10141@psf.upfronthosting.co.za> Message-ID: <1317834255.64.0.861880532954.issue10141@psf.upfronthosting.co.za> Antoine Pitrou added the comment: I don't have much to say about the patch, given that I don't know anything about CAN and my system doesn't appear to have a "vcan0" interface. I think it's ok to commit and refine later if something turns out insufficient. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 19:09:23 2011 From: report at bugs.python.org (Charles-François Natali) Date: Wed, 05 Oct 2011 17:09:23 +0000 Subject: [issue10141] SocketCan support In-Reply-To: <1287449366.98.0.655876257649.issue10141@psf.upfronthosting.co.za> Message-ID: <1317834563.88.0.461612720392.issue10141@psf.upfronthosting.co.za> Charles-François Natali added the comment: > I don't have much to say about the patch, given that I don't know > anything about CAN and my system doesn't appear to have a "vcan0" > interface. I had never heard about it before this issue, but the protocol is really simple.
If you want to try it out (just for fun :-), you just have to do the following: # modprobe vcan # ip link add dev vcan0 type vcan # ifconfig vcan0 up ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 19:17:18 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Wed, 05 Oct 2011 17:17:18 +0000 Subject: [issue10141] SocketCan support In-Reply-To: <1317834563.88.0.461612720392.issue10141@psf.upfronthosting.co.za> Message-ID: <1317834824.3713.2.camel@localhost.localdomain> Antoine Pitrou added the comment: > I had never heard about it before this issue, but the protocol is really simple. > > If you want to try it out (just for fun :-), you just have to do the following: > # modprobe vcan > # ip link add dev vcan0 type vcan > # ifconfig vcan0 up Ah, thanks! Can you add a comment about that in test_socket.py? I can confirm that all tests pass ok on my Linux system (kernel 2.6.38.8-desktop-5.mga). ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 19:20:55 2011 From: report at bugs.python.org (Amaury Forgeot d'Arc) Date: Wed, 05 Oct 2011 17:20:55 +0000 Subject: [issue13103] copy of an asyncore dispatcher causes infinite recursion In-Reply-To: <1317760879.94.0.397539088986.issue13103@psf.upfronthosting.co.za> Message-ID: <1317835255.59.0.833614583816.issue13103@psf.upfronthosting.co.za> Amaury Forgeot d'Arc added the comment: So, in 3.1 hasattr(y, '__setstate__') *did* recurse and hit the limit, but the exception was caught and hasattr returned False? I think I prefer the new behavior... The patch looks good, I would simply have raised AttributeError(name) though. 
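To make the recursion discussed in this comment concrete, here is a minimal sketch. It is not the asyncore code or the patch under review: `Dispatcher` is a hypothetical stand-in whose `__getattr__` delegates to `self.socket`. The assumption is that copy.copy() builds the new instance without calling `__init__`, so `socket` is still missing when hasattr(y, '__setstate__') runs; without the guard, reading `self.socket` would re-enter `__getattr__` again and again.

```python
import copy


class Dispatcher:
    """Hypothetical stand-in for a class that delegates to self.socket."""

    def __init__(self):
        self.socket = object()

    def __getattr__(self, name):
        # __getattr__ only runs when normal attribute lookup has failed.
        # If 'socket' itself is missing, stop here; otherwise the
        # self.socket access below would call __getattr__ recursively.
        if name == "socket":
            raise AttributeError(name)
        return getattr(self.socket, name)


original = Dispatcher()
clone = copy.copy(original)      # no infinite recursion with the guard

# An instance created without __init__ has no 'socket' attribute yet.
bare = Dispatcher.__new__(Dispatcher)
has_setstate = hasattr(bare, "__setstate__")
```

With the guard in place, the AttributeError raised for 'socket' propagates out of the outer lookup, so hasattr() simply reports that the attribute is absent.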
---------- nosy: +amaury.forgeotdarc _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 19:22:48 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Wed, 05 Oct 2011 17:22:48 +0000 Subject: [issue13070] segmentation fault in pure-python multi-threaded server In-Reply-To: <1317336165.27.0.207190934922.issue13070@psf.upfronthosting.co.za> Message-ID: <1317835368.86.0.598178860941.issue13070@psf.upfronthosting.co.za> Antoine Pitrou added the comment: The latest patch looks good to me. As for the error message, how about "the BufferedRWPair object is being garbage-collected". ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 19:32:55 2011 From: report at bugs.python.org (Giampaolo Rodola') Date: Wed, 05 Oct 2011 17:32:55 +0000 Subject: [issue13103] copy of an asyncore dispatcher causes infinite recursion In-Reply-To: <1317760879.94.0.397539088986.issue13103@psf.upfronthosting.co.za> Message-ID: <1317835975.0.0.342139987767.issue13103@psf.upfronthosting.co.za> Giampaolo Rodola' added the comment: IMO, patch should only be applied to Python 3.2. For 3.3 we finally have the chance to get rid of the dispatcher.__getattr__ aberration (see issue 8483) so I say let's just remove it and fix this issue as a consequence. 
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 19:53:35 2011 From: report at bugs.python.org (Roundup Robot) Date: Wed, 05 Oct 2011 17:53:35 +0000 Subject: [issue13070] segmentation fault in pure-python multi-threaded server In-Reply-To: <1317336165.27.0.207190934922.issue13070@psf.upfronthosting.co.za> Message-ID: Roundup Robot added the comment: New changeset d60c00015f01 by Charles-François Natali in branch '3.2': Issue #13070: Fix a crash when a TextIOWrapper caught in a reference cycle http://hg.python.org/cpython/rev/d60c00015f01 New changeset 7defc1e5d13a by Charles-François Natali in branch 'default': Issue #13070: Fix a crash when a TextIOWrapper caught in a reference cycle http://hg.python.org/cpython/rev/7defc1e5d13a ---------- nosy: +python-dev _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 20:15:37 2011 From: report at bugs.python.org (Amorilia) Date: Wed, 05 Oct 2011 18:15:37 +0000 Subject: [issue13081] Crash in Windows with unknown cause In-Reply-To: <1317417665.11.0.474916583341.issue13081@psf.upfronthosting.co.za> Message-ID: <1317838537.66.0.0157022031873.issue13081@psf.upfronthosting.co.za> Amorilia added the comment: Thanks for also trying it out, Brian. I feel there's little more I can do. I guess the multiprocessing module could be documented a bit better that join() ought to be called before the pool is deleted? Currently, the docs merely say: "Wait for the worker processes to exit. One must call close() or terminate() before using join()." (http://docs.python.org/library/multiprocessing.html#multiprocessing.pool.multiprocessing.Pool.join) Something along the following lines could be added: "You must call join() when you no longer need the pool; otherwise, zombie processes may keep running." I'm happy to provide a patch, if needed.
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 20:44:10 2011 From: report at bugs.python.org (=?utf-8?q?Charles-Fran=C3=A7ois_Natali?=) Date: Wed, 05 Oct 2011 18:44:10 +0000 Subject: [issue13070] segmentation fault in pure-python multi-threaded server In-Reply-To: <1317336165.27.0.207190934922.issue13070@psf.upfronthosting.co.za> Message-ID: <1317840250.36.0.80690024783.issue13070@psf.upfronthosting.co.za> Charles-François Natali added the comment: Committed to 3.2 and default. Victor, thanks for the report! ---------- resolution: -> fixed stage: -> committed/rejected status: open -> closed _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 21:25:05 2011 From: report at bugs.python.org (Victor Semionov) Date: Wed, 05 Oct 2011 19:25:05 +0000 Subject: [issue13070] segmentation fault in pure-python multi-threaded server In-Reply-To: <1317336165.27.0.207190934922.issue13070@psf.upfronthosting.co.za> Message-ID: <1317842705.14.0.409525881885.issue13070@psf.upfronthosting.co.za> Victor Semionov added the comment: Great, thanks to you too, for fixing it!
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 21:26:10 2011 From: report at bugs.python.org (David Andrzejewski) Date: Wed, 05 Oct 2011 19:26:10 +0000 Subject: [issue8813] SSLContext doesn't support loading a CRL In-Reply-To: <1274735830.03.0.714974872377.issue8813@psf.upfronthosting.co.za> Message-ID: <1317842770.02.0.118232909457.issue8813@psf.upfronthosting.co.za> Changes by David Andrzejewski : ---------- nosy: +dandrzejewski _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 21:55:49 2011 From: report at bugs.python.org (Amaury Forgeot d'Arc) Date: Wed, 05 Oct 2011 19:55:49 +0000 Subject: [issue13103] copy of an asyncore dispatcher causes infinite recursion In-Reply-To: <1317760879.94.0.397539088986.issue13103@psf.upfronthosting.co.za> Message-ID: <1317844549.28.0.551481965401.issue13103@psf.upfronthosting.co.za> Amaury Forgeot d'Arc added the comment: Let's add the test to 3.3 nonetheless. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 23:00:25 2011 From: report at bugs.python.org (Xavier de Gaye) Date: Wed, 05 Oct 2011 21:00:25 +0000 Subject: [issue13103] copy of an asyncore dispatcher causes infinite recursion In-Reply-To: <1317760879.94.0.397539088986.issue13103@psf.upfronthosting.co.za> Message-ID: <1317848425.1.0.101840696169.issue13103@psf.upfronthosting.co.za> Xavier de Gaye added the comment: > So, in 3.1 hasattr(y, '__setstate__') *did* recurse and hit the limit, > but the exception was caught and hasattr returned False? This is right. > I think I prefer the new behavior... > The patch looks good, I would simply have raised AttributeError(name) > though. It is fine with me to raise AttributeError(name). 
Note that when raising AttributeError('socket'), the user gets notified of the exceptions on both 'socket' and 'name'. For example with the patch applied: $ python3 Python 3.2 (r32:88445, Jun 18 2011, 20:30:18) [GCC 4.3.2] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import asyncore >>> a = asyncore.dispatcher() >>> del a.socket >>> a.foo Traceback (most recent call last): File "asyncore.py", line 415, in __getattr__ retattr = getattr(self.socket, attr) File "asyncore.py", line 413, in __getattr__ % self.__class__.__name__) AttributeError: dispatcher instance has no attribute 'socket' During handling of the above exception, another exception occurred: Traceback (most recent call last): File "", line 1, in File "asyncore.py", line 418, in __getattr__ %(self.__class__.__name__, attr)) AttributeError: dispatcher instance has no attribute 'foo' ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 23:04:49 2011 From: report at bugs.python.org (Stefan Krah) Date: Wed, 05 Oct 2011 21:04:49 +0000 Subject: [issue13108] test_urllib: buildbot failure Message-ID: <1317848689.77.0.919610092444.issue13108@psf.upfronthosting.co.za> New submission from Stefan Krah : The FreeBSD-amd64 and Fedora buildbots are recently failing with: ====================================================================== ERROR: test_thishost (test.test_urllib.Utility_Tests) Test the urllib.request.thishost utility function returns a tuple ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/buildbot/buildarea/3.x.krah-fedora/build/Lib/test/test_urllib.py", line 1063, in test_thishost self.assertIsInstance(urllib.request.thishost(), tuple) File "/home/buildbot/buildarea/3.x.krah-fedora/build/Lib/urllib/request.py", line 2128, in thishost _thishost = tuple(socket.gethostbyname_ex(socket.gethostname())[2]) 
socket.gaierror: [Errno -2] Name or service not known ---------------------------------------------------------------------- ---------- components: Tests messages: 144971 nosy: skrah priority: normal severity: normal status: open title: test_urllib: buildbot failure versions: Python 3.3 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 23:08:09 2011 From: report at bugs.python.org (xy zzy) Date: Wed, 05 Oct 2011 21:08:09 +0000 Subject: [issue13109] telnetlib insensitive to connection loss Message-ID: <1317848889.2.0.2461830982.issue13109@psf.upfronthosting.co.za> New submission from xy zzy : Using python's telnetlib I can connect and communicate with a device. While the telnet session is active I can disconnect the network cable of the device. At this point, I would expect read_until() with a timeout to throw a socket.error, EOFError or perhaps an IOError, but what I actually get is a null string. Because I'm reading in a loop, when the cable is reconnected the device will resume communicating, and the program will continue. My best guess is that read_until() or perhaps everything except open() is insensitive to the loss of a connection. ---------- components: IO messages: 144972 nosy: xy.zzy priority: normal severity: normal status: open title: telnetlib insensitive to connection loss type: behavior _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 23:50:06 2011 From: report at bugs.python.org (Eric V. Smith) Date: Wed, 05 Oct 2011 21:50:06 +0000 Subject: [issue13109] telnetlib insensitive to connection loss In-Reply-To: <1317848889.2.0.2461830982.issue13109@psf.upfronthosting.co.za> Message-ID: <1317851406.44.0.756311285278.issue13109@psf.upfronthosting.co.za> Eric V. Smith added the comment: Can you post some example code?
I would not expect disconnecting the network cable to close any TCP connections, unless you are transmitting data and/or you have keepalives turned on. ---------- nosy: +eric.smith _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 23:51:15 2011 From: report at bugs.python.org (Barry A. Warsaw) Date: Wed, 05 Oct 2011 21:51:15 +0000 Subject: [issue13110] test_socket.py failures on ARM Message-ID: <1317851475.03.0.915061914039.issue13110@psf.upfronthosting.co.za> New submission from Barry A. Warsaw : Initial results from warsaw-ubuntu-arm buildbot indicates two failures in test_socket.py ====================================================================== ERROR: test_create_connection_timeout (test.test_socket.NetworkConnectionNoServer) ---------------------------------------------------------------------- Traceback (most recent call last): File "/var/lib/buildbot/buildarea/2.7.warsaw-ubuntu-arm/build/Lib/test/test_socket.py", line 1198, in test_create_connection_timeout socket.create_connection((HOST, 1234)) File "/var/lib/buildbot/buildarea/2.7.warsaw-ubuntu-arm/build/Lib/socket.py", line 571, in create_connection raise err error: [Errno 97] Address family not supported by protocol ====================================================================== FAIL: test_create_connection (test.test_socket.NetworkConnectionNoServer) ---------------------------------------------------------------------- Traceback (most recent call last): File "/var/lib/buildbot/buildarea/2.7.warsaw-ubuntu-arm/build/Lib/test/test_socket.py", line 1191, in test_create_connection self.assertEqual(cm.exception.errno, errno.ECONNREFUSED) AssertionError: 97 != 111 ---------------------------------------------------------------------- I'm still investigating, but wanted to file the bug now so there's an issue number to reference. 
---------- components: Tests messages: 144974 nosy: barry priority: normal severity: normal status: open title: test_socket.py failures on ARM versions: Python 3.3 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Wed Oct 5 23:52:45 2011 From: report at bugs.python.org (Barry A. Warsaw) Date: Wed, 05 Oct 2011 21:52:45 +0000 Subject: [issue13110] test_socket.py failures on ARM In-Reply-To: <1317851475.03.0.915061914039.issue13110@psf.upfronthosting.co.za> Message-ID: <1317851565.88.0.945508072434.issue13110@psf.upfronthosting.co.za> Changes by Barry A. Warsaw : ---------- versions: +Python 2.7, Python 3.2 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 00:55:27 2011 From: report at bugs.python.org (MA S) Date: Wed, 05 Oct 2011 22:55:27 +0000 Subject: [issue13111] Error 2203 when installing Python/Perl? Message-ID: <1317855327.52.0.585487977144.issue13111@psf.upfronthosting.co.za> New submission from MA S : I can't install Python or Strawberry Perl on the Windows 8 Developer Preview :( I keep getting installer error 2203; the log file's attached. I really don't know what's wrong... ---------- components: Windows files: python.log messages: 144975 nosy: MA.S priority: normal severity: normal status: open title: Error 2203 when installing Python/Perl? 
type: crash versions: Python 2.7 Added file: http://bugs.python.org/file23320/python.log _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 01:31:34 2011 From: report at bugs.python.org (yoch) Date: Wed, 05 Oct 2011 23:31:34 +0000 Subject: [issue13112] backreferences in comprehensions Message-ID: <1317857494.55.0.915280284739.issue13112@psf.upfronthosting.co.za> New submission from yoch : Hi, I would like to use backreferences in list comprehensions (or other comprehensions), such as: [[elt for elt in lst if elt] for lst in matrix if \{1}] # \{1} is back reference to [elt for elt in lst if elt] # to filter the result of the first comprehension Would it be possible to do this? Thanks ---------- messages: 144976 nosy: yoch.melka priority: normal severity: normal status: open title: backreferences in comprehensions type: feature request _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 02:09:49 2011 From: report at bugs.python.org (Antoine Pitrou) Date: Thu, 06 Oct 2011 00:09:49 +0000 Subject: [issue7732] imp.find_module crashes Python if there exists a directory named "__init__.py" In-Reply-To: <1263816426.99.0.320342241745.issue7732@psf.upfronthosting.co.za> Message-ID: <1317859789.66.0.858386672243.issue7732@psf.upfronthosting.co.za> Antoine Pitrou added the comment: This broke the Windows buildbots in Python 2.7.
---------- assignee: -> haypo _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 02:28:18 2011 From: report at bugs.python.org (Meador Inge) Date: Thu, 06 Oct 2011 00:28:18 +0000 Subject: [issue3163] module struct support for ssize_t and size_t In-Reply-To: <1214071551.33.0.650558563727.issue3163@psf.upfronthosting.co.za> Message-ID: <1317860898.84.0.323806433769.issue3163@psf.upfronthosting.co.za> Meador Inge added the comment: No problem. This last version LGTM. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 02:34:18 2011 From: report at bugs.python.org (Barry A. Warsaw) Date: Thu, 06 Oct 2011 00:34:18 +0000 Subject: [issue13110] test_socket.py failures on ARM In-Reply-To: <1317851475.03.0.915061914039.issue13110@psf.upfronthosting.co.za> Message-ID: <1317861258.27.0.0389769831311.issue13110@psf.upfronthosting.co.za> Barry A. Warsaw added the comment: This appears to be the problem: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/868812 I'm going to close this Python bug since it seems to be related to the Linux kernel on armel. Editing the /etc/hosts file gets around the problem and lets the test pass. I could imagine that the test should be able to deal with the unexpected exceptions though. 
---------- resolution: -> invalid status: open -> closed _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 02:50:48 2011 From: report at bugs.python.org (STINNER Victor) Date: Thu, 06 Oct 2011 00:50:48 +0000 Subject: [issue13070] segmentation fault in pure-python multi-threaded server In-Reply-To: <1317336165.27.0.207190934922.issue13070@psf.upfronthosting.co.za> Message-ID: <1317862248.22.0.891609832993.issue13070@psf.upfronthosting.co.za> STINNER Victor added the comment: The issue doesn't affect Python 2.7? ---------- nosy: +haypo _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 02:52:17 2011 From: report at bugs.python.org (STINNER Victor) Date: Thu, 06 Oct 2011 00:52:17 +0000 Subject: [issue13104] urllib.request.thishost() returns a garbage value In-Reply-To: <1317769371.74.0.590194297783.issue13104@psf.upfronthosting.co.za> Message-ID: <1317862337.57.0.672156948755.issue13104@psf.upfronthosting.co.za> STINNER Victor added the comment: There is a failure on FreeBSD 8.2 buildbot: http://www.python.org/dev/buildbot/all/builders/AMD64%20FreeBSD%208.2%203.x/builds/1104/steps/test/logs/stdio ====================================================================== ERROR: test_thishost (test.test_urllib.Utility_Tests) Test the urllib.request.thishost utility function returns a tuple ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/test/test_urllib.py", line 1063, in test_thishost self.assertIsInstance(urllib.request.thishost(), tuple) File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/urllib/request.py", line 2128, in thishost _thishost = tuple(socket.gethostbyname_ex(socket.gethostname())[2]) socket.gaierror: [Errno 8] hostname nor servname provided, or not known 
---------- nosy: +haypo _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 02:52:26 2011 From: report at bugs.python.org (STINNER Victor) Date: Thu, 06 Oct 2011 00:52:26 +0000 Subject: [issue13104] urllib.request.thishost() returns a garbage value In-Reply-To: <1317769371.74.0.590194297783.issue13104@psf.upfronthosting.co.za> Message-ID: <1317862346.94.0.715280468107.issue13104@psf.upfronthosting.co.za> Changes by STINNER Victor : ---------- resolution: fixed -> status: closed -> open _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 03:50:27 2011 From: report at bugs.python.org (Diego Mascialino) Date: Thu, 06 Oct 2011 01:50:27 +0000 Subject: [issue8087] Unupdated source file in traceback In-Reply-To: <1267990809.73.0.116039880896.issue8087@psf.upfronthosting.co.za> Message-ID: <1317865827.71.0.0650957887768.issue8087@psf.upfronthosting.co.za> Diego Mascialino added the comment: I worked a few hours today and I have this patch. I tried to make a test but could not. I know this is not a really good patch, but it's my first one and I wanted to show my idea. ---------- keywords: +patch versions: -Python 2.7 Added file: http://bugs.python.org/file23321/issue8087.patch _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 04:36:05 2011 From: report at bugs.python.org (Senthil Kumaran) Date: Thu, 06 Oct 2011 02:36:05 +0000 Subject: [issue13104] urllib.request.thishost() returns a garbage value In-Reply-To: <1317769371.74.0.590194297783.issue13104@psf.upfronthosting.co.za> Message-ID: <1317868565.91.0.195130573901.issue13104@psf.upfronthosting.co.za> Senthil Kumaran added the comment: hmm. interesting case in FreeBSD. Looks like socket.gethostname() did not return the hostname in freebsd buildbot. 
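As an illustration of how a caller could tolerate an unresolvable hostname like the one in this buildbot failure, here is a minimal sketch. `host_addresses` and its `resolve` parameter are hypothetical names introduced for the example and are not the urllib.request code; the sketch assumes that "localhost" still resolves when the machine's own name does not.

```python
import socket


def host_addresses(resolve=socket.gethostbyname_ex, hostname=None):
    """Return a tuple of IP addresses for this host (hypothetical helper)."""
    if hostname is None:
        hostname = socket.gethostname()
    try:
        # gethostbyname_ex returns (hostname, aliaslist, ipaddrlist).
        return tuple(resolve(hostname)[2])
    except socket.gaierror:
        # The machine's own name does not resolve, for instance because
        # /etc/hosts is incomplete; fall back to the loopback name.
        return tuple(resolve("localhost")[2])
```

The resolver is passed in as a parameter only so that the fallback path can be exercised without depending on the local network configuration.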
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 08:53:25 2011 From: report at bugs.python.org (=?utf-8?q?Charles-Fran=C3=A7ois_Natali?=) Date: Thu, 06 Oct 2011 06:53:25 +0000 Subject: [issue13070] segmentation fault in pure-python multi-threaded server In-Reply-To: <1317862248.22.0.891609832993.issue13070@psf.upfronthosting.co.za> Message-ID: Charles-François Natali added the comment: > The issue doesn't affect Python 2.7? > Duh! I was sure the _io module had been introduced in Python 3 (I/O layer rewrite, etc). Yes, it does apply to 2.7. I'll commit the patch later today. ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 08:58:23 2011 From: report at bugs.python.org (Stefan Krah) Date: Thu, 06 Oct 2011 06:58:23 +0000 Subject: [issue13108] test_urllib: buildbot failure In-Reply-To: <1317848689.77.0.919610092444.issue13108@psf.upfronthosting.co.za> Message-ID: <1317884303.78.0.154067239873.issue13108@psf.upfronthosting.co.za> Changes by Stefan Krah : ---------- resolution: -> duplicate stage: -> committed/rejected status: open -> closed superseder: -> urllib.request.thishost() returns a garbage value type: -> behavior _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 08:58:40 2011 From: report at bugs.python.org (Stefan Krah) Date: Thu, 06 Oct 2011 06:58:40 +0000 Subject: [issue13104] urllib.request.thishost() returns a garbage value In-Reply-To: <1317769371.74.0.590194297783.issue13104@psf.upfronthosting.co.za> Message-ID: <1317884320.01.0.592201061802.issue13104@psf.upfronthosting.co.za> Changes by Stefan Krah : ---------- nosy: +skrah _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 09:22:32 2011
From: report at bugs.python.org (Stefan Krah) Date: Thu, 06 Oct 2011 07:22:32 +0000 Subject: [issue13104] urllib.request.thishost() returns a garbage value In-Reply-To: <1317769371.74.0.590194297783.issue13104@psf.upfronthosting.co.za> Message-ID: <1317885752.3.0.17651944283.issue13104@psf.upfronthosting.co.za> Stefan Krah added the comment: /etc/hosts was incomplete; works fine now. Closing again. ---------- resolution: -> fixed stage: test needed -> committed/rejected status: open -> closed _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 11:57:34 2011 From: report at bugs.python.org (Stefan Krah) Date: Thu, 06 Oct 2011 09:57:34 +0000 Subject: [issue12210] test_smtplib: intermittent failures on FreeBSD In-Reply-To: <1306700978.51.0.107374125859.issue12210@psf.upfronthosting.co.za> Message-ID: <1317895054.61.0.162759403762.issue12210@psf.upfronthosting.co.za> Stefan Krah added the comment: Naturally, as soon as I declare it fixed, it occurs again: http://www.python.org/dev/buildbot/all/builders/AMD64%20FreeBSD%208.2%202.7/builds/326 ---------- status: closed -> open _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 12:15:29 2011 From: report at bugs.python.org (Victor Semionov) Date: Thu, 06 Oct 2011 10:15:29 +0000 Subject: [issue13070] segmentation fault in pure-python multi-threaded server In-Reply-To: <1317336165.27.0.207190934922.issue13070@psf.upfronthosting.co.za> Message-ID: <1317896129.05.0.749333653336.issue13070@psf.upfronthosting.co.za> Victor Semionov added the comment: I did not see any segfaults when I ran my app on 2.7. Please verify that 2.7 is really affected before making changes. 
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 12:39:47 2011 From: report at bugs.python.org (Ezio Melotti) Date: Thu, 06 Oct 2011 10:39:47 +0000 Subject: [issue2771] Test issue In-Reply-To: <1210005645.74.0.283923986194.issue2771@psf.upfronthosting.co.za> Message-ID: <1317897587.25.0.618037713412.issue2771@psf.upfronthosting.co.za> Changes by Ezio Melotti : ---------- nosy: -vsemionov _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 12:41:33 2011 From: report at bugs.python.org (Ezio Melotti) Date: Thu, 06 Oct 2011 10:41:33 +0000 Subject: [issue2771] Test issue In-Reply-To: <1317897587.31.0.526787966097.issue2771@psf.upfronthosting.co.za> Message-ID: Ezio Melotti added the comment: test attachments ---------- Added file: http://bugs.python.org/file23322/unnamed Added file: http://bugs.python.org/file23323/issue12753-3.diff _______________________________________ Python tracker _______________________________________ -------------- next part -------------- test attachments
-------------- next part -------------- diff --git a/Doc/library/unicodedata.rst b/Doc/library/unicodedata.rst --- a/Doc/library/unicodedata.rst +++ b/Doc/library/unicodedata.rst @@ -29,6 +29,9 @@ Look up character by name. If a character with the given name is found, return the corresponding character. If not found, :exc:`KeyError` is raised. + .. versionchanged:: 3.3 + Support for name aliases [#]_ and named sequences [#]_ has been added. + .. function:: name(chr[, default]) @@ -160,3 +163,9 @@ >>> unicodedata.bidirectional('\u0660') # 'A'rabic, 'N'umber 'AN' + +.. rubric:: Footnotes + +.. [#] http://www.unicode.org/Public/6.0.0/ucd/NameAliases.txt + +.. [#] http://www.unicode.org/Public/6.0.0/ucd/NamedSequences.txt diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -492,13 +492,13 @@ +-----------------+---------------------------------+-------+ | Escape Sequence | Meaning | Notes | +=================+=================================+=======+ -| ``\N{name}`` | Character named *name* in the | | +| ``\N{name}`` | Character named *name* in the | \(4) | | | Unicode database | | +-----------------+---------------------------------+-------+ -| ``\uxxxx`` | Character with 16-bit hex value | \(4) | +| ``\uxxxx`` | Character with 16-bit hex value | \(5) | | | *xxxx* | | +-----------------+---------------------------------+-------+ -| ``\Uxxxxxxxx`` | Character with 32-bit hex value | \(5) | +| ``\Uxxxxxxxx`` | Character with 32-bit hex value | \(6) | | | *xxxxxxxx* | | +-----------------+---------------------------------+-------+ @@ -516,10 +516,14 @@ with the given value. (4) + .. versionchanged:: 3.3 + Support for name aliases [#]_ has been added. + +(5) Individual code units which form parts of a surrogate pair can be encoded using this escape sequence. Exactly four hex digits are required. 
-(5) +(6) Any Unicode character can be encoded this way, but characters outside the Basic Multilingual Plane (BMP) will be encoded using a surrogate pair if Python is compiled to use 16-bit code units (the default). Exactly eight hex digits @@ -706,3 +710,8 @@ occurrence outside string literals and comments is an unconditional error:: $ ? ` + + +.. rubric:: Footnotes + +.. [#] http://www.unicode.org/Public/6.0.0/ucd/NameAliases.txt diff --git a/Lib/test/test_ucn.py b/Lib/test/test_ucn.py --- a/Lib/test/test_ucn.py +++ b/Lib/test/test_ucn.py @@ -8,8 +8,11 @@ """#" import unittest +import unicodedata from test import support +from http.client import HTTPException +from test.test_normalization import check_version class UnicodeNamesTest(unittest.TestCase): @@ -59,8 +62,6 @@ ) def test_ascii_letters(self): - import unicodedata - for char in "".join(map(chr, range(ord("a"), ord("z")))): name = "LATIN SMALL LETTER %s" % char.upper() code = unicodedata.lookup(name) @@ -81,7 +82,6 @@ self.checkletter("HANGUL SYLLABLE HWEOK", "\ud6f8") self.checkletter("HANGUL SYLLABLE HIH", "\ud7a3") - import unicodedata self.assertRaises(ValueError, unicodedata.name, "\ud7a4") def test_cjk_unified_ideographs(self): @@ -97,14 +97,11 @@ self.checkletter("CJK UNIFIED IDEOGRAPH-2B81D", "\U0002B81D") def test_bmp_characters(self): - import unicodedata - count = 0 for code in range(0x10000): char = chr(code) name = unicodedata.name(char, None) if name is not None: self.assertEqual(unicodedata.lookup(name), char) - count += 1 def test_misc_symbols(self): self.checkletter("PILCROW SIGN", "\u00b6") @@ -112,8 +109,65 @@ self.checkletter("HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK", "\uFF9F") self.checkletter("FULLWIDTH LATIN SMALL LETTER A", "\uFF41") + def test_aliases(self): + # Check that the aliases defined in the NameAliases.txt file work. + # This should be updated when new aliases are added or the file + # should be downloaded and parsed instead. See #12753. 
+ aliases = [ + ('LATIN CAPITAL LETTER GHA', 0x01A2), + ('LATIN SMALL LETTER GHA', 0x01A3), + ('KANNADA LETTER LLLA', 0x0CDE), + ('LAO LETTER FO FON', 0x0E9D), + ('LAO LETTER FO FAY', 0x0E9F), + ('LAO LETTER RO', 0x0EA3), + ('LAO LETTER LO', 0x0EA5), + ('TIBETAN MARK BKA- SHOG GI MGO RGYAN', 0x0FD0), + ('YI SYLLABLE ITERATION MARK', 0xA015), + ('PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRACKET', 0xFE18), + ('BYZANTINE MUSICAL SYMBOL FTHORA SKLIRON CHROMA VASIS', 0x1D0C5) + ] + for alias, codepoint in aliases: + self.checkletter(alias, chr(codepoint)) + name = unicodedata.name(chr(codepoint)) + self.assertNotEqual(name, alias) + self.assertEqual(unicodedata.lookup(alias), + unicodedata.lookup(name)) + + def test_named_sequences_sample(self): + # Check a few named sequences. See #12753. + sequences = [ + ('LATIN SMALL LETTER R WITH TILDE', '\u0072\u0303'), + ('TAMIL SYLLABLE SAI', '\u0BB8\u0BC8'), + ('TAMIL SYLLABLE MOO', '\u0BAE\u0BCB'), + ('TAMIL SYLLABLE NNOO', '\u0BA3\u0BCB'), + ('TAMIL CONSONANT KSS', '\u0B95\u0BCD\u0BB7\u0BCD'), + ] + for seqname, codepoints in sequences: + self.assertEqual(unicodedata.lookup(seqname), codepoints) + with self.assertRaises(SyntaxError): + self.checkletter(seqname, None) + + def test_named_sequences_full(self): + # Check all the named sequences + url = ("http://www.unicode.org/Public/%s/ucd/NamedSequences.txt" % + unicodedata.unidata_version) + try: + testdata = support.open_urlresource(url, encoding="utf-8", + check=check_version) + except (IOError, HTTPException): + self.skipTest("Could not retrieve " + url) + self.addCleanup(testdata.close) + for line in testdata: + line = line.strip() + if not line or line.startswith('#'): + continue + seqname, codepoints = line.split(';') + codepoints = ''.join(chr(int(cp, 16)) for cp in codepoints.split()) + self.assertEqual(unicodedata.lookup(seqname), codepoints) + with self.assertRaises(SyntaxError): + self.checkletter(seqname, None) + def test_errors(self): - import 
unicodedata self.assertRaises(TypeError, unicodedata.name) self.assertRaises(TypeError, unicodedata.name, 'xx') self.assertRaises(TypeError, unicodedata.lookup) diff --git a/Modules/unicodedata.c b/Modules/unicodedata.c --- a/Modules/unicodedata.c +++ b/Modules/unicodedata.c @@ -1054,7 +1054,7 @@ static int _getcode(PyObject* self, const char* name, int namelen, Py_UCS4* code) { - unsigned int h, v; + unsigned int h, v, k; unsigned int mask = code_size-1; unsigned int i, incr; @@ -1100,6 +1100,17 @@ return 1; } + /* check for aliases defined in NameAliases.txt */ + for (k=0; k 0) + low = mid + 1; + else + return PyUnicode_FromKindAndData(PyUnicode_2BYTE_KIND, + named_sequences[mid].seq, + named_sequences[mid].seqlen); + } + return NULL; +} + PyDoc_STRVAR(unicodedata_lookup__doc__, "lookup(name)\n\ \n\ @@ -1187,6 +1218,7 @@ unicodedata_lookup(PyObject* self, PyObject* args) { Py_UCS4 code; + PyObject *codes; /* for named sequences */ char* name; int namelen; @@ -1194,9 +1226,13 @@ return NULL; if (!_getcode(self, name, namelen, &code)) { - PyErr_Format(PyExc_KeyError, "undefined character name '%s'", - name); - return NULL; + /* if the normal lookup fails try with named sequences */ + codes = _lookup_named_sequences(name); + if (codes == NULL) { + PyErr_Format(PyExc_KeyError, "undefined character name '%s'", name); + return NULL; + } + return codes; } return PyUnicode_FromOrdinal(code); diff --git a/Modules/unicodename_db.h b/Modules/unicodename_db.h --- a/Modules/unicodename_db.h +++ b/Modules/unicodename_db.h @@ -18811,3 +18811,452 @@ #define code_magic 47 #define code_size 32768 #define code_poly 32771 + +typedef struct Alias { + char *name; + int namelen; + int codepoint; +} alias; + +static const int aliases_count = 11; +static const alias name_aliases[] = { + {"LATIN CAPITAL LETTER GHA", 24, 0x01A2}, + {"LATIN SMALL LETTER GHA", 22, 0x01A3}, + {"KANNADA LETTER LLLA", 19, 0x0CDE}, + {"LAO LETTER FO FON", 17, 0x0E9D}, + {"LAO LETTER FO FAY", 17, 0x0E9F}, + {"LAO 
LETTER RO", 13, 0x0EA3}, + {"LAO LETTER LO", 13, 0x0EA5}, + {"TIBETAN MARK BKA- SHOG GI MGO RGYAN", 35, 0x0FD0}, + {"YI SYLLABLE ITERATION MARK", 26, 0xA015}, + {"PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRACKET", 61, 0xFE18}, + {"BYZANTINE MUSICAL SYMBOL FTHORA SKLIRON CHROMA VASIS", 52, 0x1D0C5}, +}; + +typedef struct NamedSequence { + char *name; + int seqlen; + Py_UCS2 seq[4]; +} named_sequence; + +static const int named_sequences_count = 418; +static const named_sequence named_sequences[] = { + {"BENGALI LETTER KHINYA", 3, {0x0995, 0x09CD, 0x09B7}}, + {"GEORGIAN LETTER U-BRJGU", 2, {0x10E3, 0x0302}}, + {"HIRAGANA LETTER BIDAKUON NGA", 2, {0x304B, 0x309A}}, + {"HIRAGANA LETTER BIDAKUON NGE", 2, {0x3051, 0x309A}}, + {"HIRAGANA LETTER BIDAKUON NGI", 2, {0x304D, 0x309A}}, + {"HIRAGANA LETTER BIDAKUON NGO", 2, {0x3053, 0x309A}}, + {"HIRAGANA LETTER BIDAKUON NGU", 2, {0x304F, 0x309A}}, + {"KATAKANA LETTER AINU CE", 2, {0x30BB, 0x309A}}, + {"KATAKANA LETTER AINU P", 2, {0x31F7, 0x309A}}, + {"KATAKANA LETTER AINU TO", 2, {0x30C8, 0x309A}}, + {"KATAKANA LETTER AINU TU", 2, {0x30C4, 0x309A}}, + {"KATAKANA LETTER BIDAKUON NGA", 2, {0x30AB, 0x309A}}, + {"KATAKANA LETTER BIDAKUON NGE", 2, {0x30B1, 0x309A}}, + {"KATAKANA LETTER BIDAKUON NGI", 2, {0x30AD, 0x309A}}, + {"KATAKANA LETTER BIDAKUON NGO", 2, {0x30B3, 0x309A}}, + {"KATAKANA LETTER BIDAKUON NGU", 2, {0x30AF, 0x309A}}, + {"KHMER CONSONANT SIGN COENG BA", 2, {0x17D2, 0x1794}}, + {"KHMER CONSONANT SIGN COENG CA", 2, {0x17D2, 0x1785}}, + {"KHMER CONSONANT SIGN COENG CHA", 2, {0x17D2, 0x1786}}, + {"KHMER CONSONANT SIGN COENG CHO", 2, {0x17D2, 0x1788}}, + {"KHMER CONSONANT SIGN COENG CO", 2, {0x17D2, 0x1787}}, + {"KHMER CONSONANT SIGN COENG DA", 2, {0x17D2, 0x178A}}, + {"KHMER CONSONANT SIGN COENG DO", 2, {0x17D2, 0x178C}}, + {"KHMER CONSONANT SIGN COENG HA", 2, {0x17D2, 0x17A0}}, + {"KHMER CONSONANT SIGN COENG KA", 2, {0x17D2, 0x1780}}, + {"KHMER CONSONANT SIGN COENG KHA", 2, {0x17D2, 0x1781}}, + {"KHMER 
CONSONANT SIGN COENG KHO", 2, {0x17D2, 0x1783}}, + {"KHMER CONSONANT SIGN COENG KO", 2, {0x17D2, 0x1782}}, + {"KHMER CONSONANT SIGN COENG LA", 2, {0x17D2, 0x17A1}}, + {"KHMER CONSONANT SIGN COENG LO", 2, {0x17D2, 0x179B}}, + {"KHMER CONSONANT SIGN COENG MO", 2, {0x17D2, 0x1798}}, + {"KHMER CONSONANT SIGN COENG NA", 2, {0x17D2, 0x178E}}, + {"KHMER CONSONANT SIGN COENG NGO", 2, {0x17D2, 0x1784}}, + {"KHMER CONSONANT SIGN COENG NO", 2, {0x17D2, 0x1793}}, + {"KHMER CONSONANT SIGN COENG NYO", 2, {0x17D2, 0x1789}}, + {"KHMER CONSONANT SIGN COENG PHA", 2, {0x17D2, 0x1795}}, + {"KHMER CONSONANT SIGN COENG PHO", 2, {0x17D2, 0x1797}}, + {"KHMER CONSONANT SIGN COENG PO", 2, {0x17D2, 0x1796}}, + {"KHMER CONSONANT SIGN COENG RO", 2, {0x17D2, 0x179A}}, + {"KHMER CONSONANT SIGN COENG SA", 2, {0x17D2, 0x179F}}, + {"KHMER CONSONANT SIGN COENG SHA", 2, {0x17D2, 0x179D}}, + {"KHMER CONSONANT SIGN COENG SSA", 2, {0x17D2, 0x179E}}, + {"KHMER CONSONANT SIGN COENG TA", 2, {0x17D2, 0x178F}}, + {"KHMER CONSONANT SIGN COENG THA", 2, {0x17D2, 0x1790}}, + {"KHMER CONSONANT SIGN COENG THO", 2, {0x17D2, 0x1792}}, + {"KHMER CONSONANT SIGN COENG TO", 2, {0x17D2, 0x1791}}, + {"KHMER CONSONANT SIGN COENG TTHA", 2, {0x17D2, 0x178B}}, + {"KHMER CONSONANT SIGN COENG TTHO", 2, {0x17D2, 0x178D}}, + {"KHMER CONSONANT SIGN COENG VO", 2, {0x17D2, 0x179C}}, + {"KHMER CONSONANT SIGN COENG YO", 2, {0x17D2, 0x1799}}, + {"KHMER INDEPENDENT VOWEL SIGN COENG QE", 2, {0x17D2, 0x17AF}}, + {"KHMER INDEPENDENT VOWEL SIGN COENG QU", 2, {0x17D2, 0x17A7}}, + {"KHMER INDEPENDENT VOWEL SIGN COENG RY", 2, {0x17D2, 0x17AB}}, + {"KHMER INDEPENDENT VOWEL SIGN COENG RYY", 2, {0x17D2, 0x17AC}}, + {"KHMER VOWEL SIGN AAM", 2, {0x17B6, 0x17C6}}, + {"KHMER VOWEL SIGN COENG QA", 2, {0x17D2, 0x17A2}}, + {"KHMER VOWEL SIGN OM", 2, {0x17BB, 0x17C6}}, + {"LATIN CAPITAL LETTER A WITH MACRON AND GRAVE", 2, {0x0100, 0x0300}}, + {"LATIN CAPITAL LETTER A WITH OGONEK AND ACUTE", 2, {0x0104, 0x0301}}, + {"LATIN CAPITAL LETTER A WITH OGONEK AND 
TILDE", 2, {0x0104, 0x0303}}, + {"LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND CARON", 2, {0x00CA, 0x030C}}, + {"LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND MACRON", 2, {0x00CA, 0x0304}}, + {"LATIN CAPITAL LETTER E WITH DOT ABOVE AND ACUTE", 2, {0x0116, 0x0301}}, + {"LATIN CAPITAL LETTER E WITH DOT ABOVE AND TILDE", 2, {0x0116, 0x0303}}, + {"LATIN CAPITAL LETTER E WITH OGONEK AND ACUTE", 2, {0x0118, 0x0301}}, + {"LATIN CAPITAL LETTER E WITH OGONEK AND TILDE", 2, {0x0118, 0x0303}}, + {"LATIN CAPITAL LETTER E WITH VERTICAL LINE BELOW", 2, {0x0045, 0x0329}}, + {"LATIN CAPITAL LETTER E WITH VERTICAL LINE BELOW AND ACUTE", 2, {0x00C9, 0x0329}}, + {"LATIN CAPITAL LETTER E WITH VERTICAL LINE BELOW AND GRAVE", 2, {0x00C8, 0x0329}}, + {"LATIN CAPITAL LETTER I WITH MACRON AND GRAVE", 2, {0x012A, 0x0300}}, + {"LATIN CAPITAL LETTER I WITH OGONEK AND ACUTE", 2, {0x012E, 0x0301}}, + {"LATIN CAPITAL LETTER I WITH OGONEK AND TILDE", 2, {0x012E, 0x0303}}, + {"LATIN CAPITAL LETTER J WITH TILDE", 2, {0x004A, 0x0303}}, + {"LATIN CAPITAL LETTER L WITH TILDE", 2, {0x004C, 0x0303}}, + {"LATIN CAPITAL LETTER M WITH TILDE", 2, {0x004D, 0x0303}}, + {"LATIN CAPITAL LETTER O WITH VERTICAL LINE BELOW", 2, {0x004F, 0x0329}}, + {"LATIN CAPITAL LETTER O WITH VERTICAL LINE BELOW AND ACUTE", 2, {0x00D3, 0x0329}}, + {"LATIN CAPITAL LETTER O WITH VERTICAL LINE BELOW AND GRAVE", 2, {0x00D2, 0x0329}}, + {"LATIN CAPITAL LETTER R WITH TILDE", 2, {0x0052, 0x0303}}, + {"LATIN CAPITAL LETTER S WITH VERTICAL LINE BELOW", 2, {0x0053, 0x0329}}, + {"LATIN CAPITAL LETTER U WITH MACRON AND ACUTE", 2, {0x016A, 0x0301}}, + {"LATIN CAPITAL LETTER U WITH MACRON AND GRAVE", 2, {0x016A, 0x0300}}, + {"LATIN CAPITAL LETTER U WITH MACRON AND TILDE", 2, {0x016A, 0x0303}}, + {"LATIN CAPITAL LETTER U WITH OGONEK AND ACUTE", 2, {0x0172, 0x0301}}, + {"LATIN CAPITAL LETTER U WITH OGONEK AND TILDE", 2, {0x0172, 0x0303}}, + {"LATIN SMALL LETTER A WITH MACRON AND GRAVE", 2, {0x0101, 0x0300}}, + {"LATIN SMALL LETTER A WITH OGONEK AND 
ACUTE", 2, {0x0105, 0x0301}}, + {"LATIN SMALL LETTER A WITH OGONEK AND TILDE", 2, {0x0105, 0x0303}}, + {"LATIN SMALL LETTER AE WITH GRAVE", 2, {0x00E6, 0x0300}}, + {"LATIN SMALL LETTER E WITH CIRCUMFLEX AND CARON", 2, {0x00EA, 0x030C}}, + {"LATIN SMALL LETTER E WITH CIRCUMFLEX AND MACRON", 2, {0x00EA, 0x0304}}, + {"LATIN SMALL LETTER E WITH DOT ABOVE AND ACUTE", 2, {0x0117, 0x0301}}, + {"LATIN SMALL LETTER E WITH DOT ABOVE AND TILDE", 2, {0x0117, 0x0303}}, + {"LATIN SMALL LETTER E WITH OGONEK AND ACUTE", 2, {0x0119, 0x0301}}, + {"LATIN SMALL LETTER E WITH OGONEK AND TILDE", 2, {0x0119, 0x0303}}, + {"LATIN SMALL LETTER E WITH VERTICAL LINE BELOW", 2, {0x0065, 0x0329}}, + {"LATIN SMALL LETTER E WITH VERTICAL LINE BELOW AND ACUTE", 2, {0x00E9, 0x0329}}, + {"LATIN SMALL LETTER E WITH VERTICAL LINE BELOW AND GRAVE", 2, {0x00E8, 0x0329}}, + {"LATIN SMALL LETTER HOOKED SCHWA WITH ACUTE", 2, {0x025A, 0x0301}}, + {"LATIN SMALL LETTER HOOKED SCHWA WITH GRAVE", 2, {0x025A, 0x0300}}, + {"LATIN SMALL LETTER I WITH DOT ABOVE AND ACUTE", 3, {0x0069, 0x0307, 0x0301}}, + {"LATIN SMALL LETTER I WITH DOT ABOVE AND GRAVE", 3, {0x0069, 0x0307, 0x0300}}, + {"LATIN SMALL LETTER I WITH DOT ABOVE AND TILDE", 3, {0x0069, 0x0307, 0x0303}}, + {"LATIN SMALL LETTER I WITH MACRON AND GRAVE", 2, {0x012B, 0x0300}}, + {"LATIN SMALL LETTER I WITH OGONEK AND DOT ABOVE AND ACUTE", 3, {0x012F, 0x0307, 0x0301}}, + {"LATIN SMALL LETTER I WITH OGONEK AND DOT ABOVE AND TILDE", 3, {0x012F, 0x0307, 0x0303}}, + {"LATIN SMALL LETTER J WITH DOT ABOVE AND TILDE", 3, {0x006A, 0x0307, 0x0303}}, + {"LATIN SMALL LETTER L WITH TILDE", 2, {0x006C, 0x0303}}, + {"LATIN SMALL LETTER M WITH TILDE", 2, {0x006D, 0x0303}}, + {"LATIN SMALL LETTER NG WITH TILDE ABOVE", 3, {0x006E, 0x0360, 0x0067}}, + {"LATIN SMALL LETTER O WITH VERTICAL LINE BELOW", 2, {0x006F, 0x0329}}, + {"LATIN SMALL LETTER O WITH VERTICAL LINE BELOW AND ACUTE", 2, {0x00F3, 0x0329}}, + {"LATIN SMALL LETTER O WITH VERTICAL LINE BELOW AND GRAVE", 2, {0x00F2, 
0x0329}}, + {"LATIN SMALL LETTER OPEN O WITH ACUTE", 2, {0x0254, 0x0301}}, + {"LATIN SMALL LETTER OPEN O WITH GRAVE", 2, {0x0254, 0x0300}}, + {"LATIN SMALL LETTER R WITH TILDE", 2, {0x0072, 0x0303}}, + {"LATIN SMALL LETTER S WITH VERTICAL LINE BELOW", 2, {0x0073, 0x0329}}, + {"LATIN SMALL LETTER SCHWA WITH ACUTE", 2, {0x0259, 0x0301}}, + {"LATIN SMALL LETTER SCHWA WITH GRAVE", 2, {0x0259, 0x0300}}, + {"LATIN SMALL LETTER TURNED V WITH ACUTE", 2, {0x028C, 0x0301}}, + {"LATIN SMALL LETTER TURNED V WITH GRAVE", 2, {0x028C, 0x0300}}, + {"LATIN SMALL LETTER U WITH MACRON AND ACUTE", 2, {0x016B, 0x0301}}, + {"LATIN SMALL LETTER U WITH MACRON AND GRAVE", 2, {0x016B, 0x0300}}, + {"LATIN SMALL LETTER U WITH MACRON AND TILDE", 2, {0x016B, 0x0303}}, + {"LATIN SMALL LETTER U WITH OGONEK AND ACUTE", 2, {0x0173, 0x0301}}, + {"LATIN SMALL LETTER U WITH OGONEK AND TILDE", 2, {0x0173, 0x0303}}, + {"MODIFIER LETTER EXTRA-HIGH EXTRA-LOW CONTOUR TONE BAR", 2, {0x02E5, 0x02E9}}, + {"MODIFIER LETTER EXTRA-LOW EXTRA-HIGH CONTOUR TONE BAR", 2, {0x02E9, 0x02E5}}, + {"TAMIL CONSONANT C", 2, {0x0B9A, 0x0BCD}}, + {"TAMIL CONSONANT H", 2, {0x0BB9, 0x0BCD}}, + {"TAMIL CONSONANT J", 2, {0x0B9C, 0x0BCD}}, + {"TAMIL CONSONANT K", 2, {0x0B95, 0x0BCD}}, + {"TAMIL CONSONANT KSS", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BCD}}, + {"TAMIL CONSONANT L", 2, {0x0BB2, 0x0BCD}}, + {"TAMIL CONSONANT LL", 2, {0x0BB3, 0x0BCD}}, + {"TAMIL CONSONANT LLL", 2, {0x0BB4, 0x0BCD}}, + {"TAMIL CONSONANT M", 2, {0x0BAE, 0x0BCD}}, + {"TAMIL CONSONANT N", 2, {0x0BA8, 0x0BCD}}, + {"TAMIL CONSONANT NG", 2, {0x0B99, 0x0BCD}}, + {"TAMIL CONSONANT NN", 2, {0x0BA3, 0x0BCD}}, + {"TAMIL CONSONANT NNN", 2, {0x0BA9, 0x0BCD}}, + {"TAMIL CONSONANT NY", 2, {0x0B9E, 0x0BCD}}, + {"TAMIL CONSONANT P", 2, {0x0BAA, 0x0BCD}}, + {"TAMIL CONSONANT R", 2, {0x0BB0, 0x0BCD}}, + {"TAMIL CONSONANT RR", 2, {0x0BB1, 0x0BCD}}, + {"TAMIL CONSONANT S", 2, {0x0BB8, 0x0BCD}}, + {"TAMIL CONSONANT SH", 2, {0x0BB6, 0x0BCD}}, + {"TAMIL CONSONANT SS", 2, {0x0BB7, 
0x0BCD}}, + {"TAMIL CONSONANT T", 2, {0x0BA4, 0x0BCD}}, + {"TAMIL CONSONANT TT", 2, {0x0B9F, 0x0BCD}}, + {"TAMIL CONSONANT V", 2, {0x0BB5, 0x0BCD}}, + {"TAMIL CONSONANT Y", 2, {0x0BAF, 0x0BCD}}, + {"TAMIL SYLLABLE CAA", 2, {0x0B9A, 0x0BBE}}, + {"TAMIL SYLLABLE CAI", 2, {0x0B9A, 0x0BC8}}, + {"TAMIL SYLLABLE CAU", 2, {0x0B9A, 0x0BCC}}, + {"TAMIL SYLLABLE CE", 2, {0x0B9A, 0x0BC6}}, + {"TAMIL SYLLABLE CEE", 2, {0x0B9A, 0x0BC7}}, + {"TAMIL SYLLABLE CI", 2, {0x0B9A, 0x0BBF}}, + {"TAMIL SYLLABLE CII", 2, {0x0B9A, 0x0BC0}}, + {"TAMIL SYLLABLE CO", 2, {0x0B9A, 0x0BCA}}, + {"TAMIL SYLLABLE COO", 2, {0x0B9A, 0x0BCB}}, + {"TAMIL SYLLABLE CU", 2, {0x0B9A, 0x0BC1}}, + {"TAMIL SYLLABLE CUU", 2, {0x0B9A, 0x0BC2}}, + {"TAMIL SYLLABLE HAA", 2, {0x0BB9, 0x0BBE}}, + {"TAMIL SYLLABLE HAI", 2, {0x0BB9, 0x0BC8}}, + {"TAMIL SYLLABLE HAU", 2, {0x0BB9, 0x0BCC}}, + {"TAMIL SYLLABLE HE", 2, {0x0BB9, 0x0BC6}}, + {"TAMIL SYLLABLE HEE", 2, {0x0BB9, 0x0BC7}}, + {"TAMIL SYLLABLE HI", 2, {0x0BB9, 0x0BBF}}, + {"TAMIL SYLLABLE HII", 2, {0x0BB9, 0x0BC0}}, + {"TAMIL SYLLABLE HO", 2, {0x0BB9, 0x0BCA}}, + {"TAMIL SYLLABLE HOO", 2, {0x0BB9, 0x0BCB}}, + {"TAMIL SYLLABLE HU", 2, {0x0BB9, 0x0BC1}}, + {"TAMIL SYLLABLE HUU", 2, {0x0BB9, 0x0BC2}}, + {"TAMIL SYLLABLE JAA", 2, {0x0B9C, 0x0BBE}}, + {"TAMIL SYLLABLE JAI", 2, {0x0B9C, 0x0BC8}}, + {"TAMIL SYLLABLE JAU", 2, {0x0B9C, 0x0BCC}}, + {"TAMIL SYLLABLE JE", 2, {0x0B9C, 0x0BC6}}, + {"TAMIL SYLLABLE JEE", 2, {0x0B9C, 0x0BC7}}, + {"TAMIL SYLLABLE JI", 2, {0x0B9C, 0x0BBF}}, + {"TAMIL SYLLABLE JII", 2, {0x0B9C, 0x0BC0}}, + {"TAMIL SYLLABLE JO", 2, {0x0B9C, 0x0BCA}}, + {"TAMIL SYLLABLE JOO", 2, {0x0B9C, 0x0BCB}}, + {"TAMIL SYLLABLE JU", 2, {0x0B9C, 0x0BC1}}, + {"TAMIL SYLLABLE JUU", 2, {0x0B9C, 0x0BC2}}, + {"TAMIL SYLLABLE KAA", 2, {0x0B95, 0x0BBE}}, + {"TAMIL SYLLABLE KAI", 2, {0x0B95, 0x0BC8}}, + {"TAMIL SYLLABLE KAU", 2, {0x0B95, 0x0BCC}}, + {"TAMIL SYLLABLE KE", 2, {0x0B95, 0x0BC6}}, + {"TAMIL SYLLABLE KEE", 2, {0x0B95, 0x0BC7}}, + {"TAMIL SYLLABLE KI", 2, 
{0x0B95, 0x0BBF}}, + {"TAMIL SYLLABLE KII", 2, {0x0B95, 0x0BC0}}, + {"TAMIL SYLLABLE KO", 2, {0x0B95, 0x0BCA}}, + {"TAMIL SYLLABLE KOO", 2, {0x0B95, 0x0BCB}}, + {"TAMIL SYLLABLE KSSA", 3, {0x0B95, 0x0BCD, 0x0BB7}}, + {"TAMIL SYLLABLE KSSAA", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BBE}}, + {"TAMIL SYLLABLE KSSAI", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BC8}}, + {"TAMIL SYLLABLE KSSAU", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BCC}}, + {"TAMIL SYLLABLE KSSE", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BC6}}, + {"TAMIL SYLLABLE KSSEE", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BC7}}, + {"TAMIL SYLLABLE KSSI", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BBF}}, + {"TAMIL SYLLABLE KSSII", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BC0}}, + {"TAMIL SYLLABLE KSSO", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BCA}}, + {"TAMIL SYLLABLE KSSOO", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BCB}}, + {"TAMIL SYLLABLE KSSU", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BC1}}, + {"TAMIL SYLLABLE KSSUU", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BC2}}, + {"TAMIL SYLLABLE KU", 2, {0x0B95, 0x0BC1}}, + {"TAMIL SYLLABLE KUU", 2, {0x0B95, 0x0BC2}}, + {"TAMIL SYLLABLE LAA", 2, {0x0BB2, 0x0BBE}}, + {"TAMIL SYLLABLE LAI", 2, {0x0BB2, 0x0BC8}}, + {"TAMIL SYLLABLE LAU", 2, {0x0BB2, 0x0BCC}}, + {"TAMIL SYLLABLE LE", 2, {0x0BB2, 0x0BC6}}, + {"TAMIL SYLLABLE LEE", 2, {0x0BB2, 0x0BC7}}, + {"TAMIL SYLLABLE LI", 2, {0x0BB2, 0x0BBF}}, + {"TAMIL SYLLABLE LII", 2, {0x0BB2, 0x0BC0}}, + {"TAMIL SYLLABLE LLAA", 2, {0x0BB3, 0x0BBE}}, + {"TAMIL SYLLABLE LLAI", 2, {0x0BB3, 0x0BC8}}, + {"TAMIL SYLLABLE LLAU", 2, {0x0BB3, 0x0BCC}}, + {"TAMIL SYLLABLE LLE", 2, {0x0BB3, 0x0BC6}}, + {"TAMIL SYLLABLE LLEE", 2, {0x0BB3, 0x0BC7}}, + {"TAMIL SYLLABLE LLI", 2, {0x0BB3, 0x0BBF}}, + {"TAMIL SYLLABLE LLII", 2, {0x0BB3, 0x0BC0}}, + {"TAMIL SYLLABLE LLLAA", 2, {0x0BB4, 0x0BBE}}, + {"TAMIL SYLLABLE LLLAI", 2, {0x0BB4, 0x0BC8}}, + {"TAMIL SYLLABLE LLLAU", 2, {0x0BB4, 0x0BCC}}, + {"TAMIL SYLLABLE LLLE", 2, {0x0BB4, 0x0BC6}}, + {"TAMIL SYLLABLE LLLEE", 2, {0x0BB4, 0x0BC7}}, + {"TAMIL SYLLABLE LLLI", 2, {0x0BB4, 0x0BBF}}, + {"TAMIL SYLLABLE LLLII", 
2, {0x0BB4, 0x0BC0}}, + {"TAMIL SYLLABLE LLLO", 2, {0x0BB4, 0x0BCA}}, + {"TAMIL SYLLABLE LLLOO", 2, {0x0BB4, 0x0BCB}}, + {"TAMIL SYLLABLE LLLU", 2, {0x0BB4, 0x0BC1}}, + {"TAMIL SYLLABLE LLLUU", 2, {0x0BB4, 0x0BC2}}, + {"TAMIL SYLLABLE LLO", 2, {0x0BB3, 0x0BCA}}, + {"TAMIL SYLLABLE LLOO", 2, {0x0BB3, 0x0BCB}}, + {"TAMIL SYLLABLE LLU", 2, {0x0BB3, 0x0BC1}}, + {"TAMIL SYLLABLE LLUU", 2, {0x0BB3, 0x0BC2}}, + {"TAMIL SYLLABLE LO", 2, {0x0BB2, 0x0BCA}}, + {"TAMIL SYLLABLE LOO", 2, {0x0BB2, 0x0BCB}}, + {"TAMIL SYLLABLE LU", 2, {0x0BB2, 0x0BC1}}, + {"TAMIL SYLLABLE LUU", 2, {0x0BB2, 0x0BC2}}, + {"TAMIL SYLLABLE MAA", 2, {0x0BAE, 0x0BBE}}, + {"TAMIL SYLLABLE MAI", 2, {0x0BAE, 0x0BC8}}, + {"TAMIL SYLLABLE MAU", 2, {0x0BAE, 0x0BCC}}, + {"TAMIL SYLLABLE ME", 2, {0x0BAE, 0x0BC6}}, + {"TAMIL SYLLABLE MEE", 2, {0x0BAE, 0x0BC7}}, + {"TAMIL SYLLABLE MI", 2, {0x0BAE, 0x0BBF}}, + {"TAMIL SYLLABLE MII", 2, {0x0BAE, 0x0BC0}}, + {"TAMIL SYLLABLE MO", 2, {0x0BAE, 0x0BCA}}, + {"TAMIL SYLLABLE MOO", 2, {0x0BAE, 0x0BCB}}, + {"TAMIL SYLLABLE MU", 2, {0x0BAE, 0x0BC1}}, + {"TAMIL SYLLABLE MUU", 2, {0x0BAE, 0x0BC2}}, + {"TAMIL SYLLABLE NAA", 2, {0x0BA8, 0x0BBE}}, + {"TAMIL SYLLABLE NAI", 2, {0x0BA8, 0x0BC8}}, + {"TAMIL SYLLABLE NAU", 2, {0x0BA8, 0x0BCC}}, + {"TAMIL SYLLABLE NE", 2, {0x0BA8, 0x0BC6}}, + {"TAMIL SYLLABLE NEE", 2, {0x0BA8, 0x0BC7}}, + {"TAMIL SYLLABLE NGAA", 2, {0x0B99, 0x0BBE}}, + {"TAMIL SYLLABLE NGAI", 2, {0x0B99, 0x0BC8}}, + {"TAMIL SYLLABLE NGAU", 2, {0x0B99, 0x0BCC}}, + {"TAMIL SYLLABLE NGE", 2, {0x0B99, 0x0BC6}}, + {"TAMIL SYLLABLE NGEE", 2, {0x0B99, 0x0BC7}}, + {"TAMIL SYLLABLE NGI", 2, {0x0B99, 0x0BBF}}, + {"TAMIL SYLLABLE NGII", 2, {0x0B99, 0x0BC0}}, + {"TAMIL SYLLABLE NGO", 2, {0x0B99, 0x0BCA}}, + {"TAMIL SYLLABLE NGOO", 2, {0x0B99, 0x0BCB}}, + {"TAMIL SYLLABLE NGU", 2, {0x0B99, 0x0BC1}}, + {"TAMIL SYLLABLE NGUU", 2, {0x0B99, 0x0BC2}}, + {"TAMIL SYLLABLE NI", 2, {0x0BA8, 0x0BBF}}, + {"TAMIL SYLLABLE NII", 2, {0x0BA8, 0x0BC0}}, + {"TAMIL SYLLABLE NNAA", 2, {0x0BA3, 
0x0BBE}}, + {"TAMIL SYLLABLE NNAI", 2, {0x0BA3, 0x0BC8}}, + {"TAMIL SYLLABLE NNAU", 2, {0x0BA3, 0x0BCC}}, + {"TAMIL SYLLABLE NNE", 2, {0x0BA3, 0x0BC6}}, + {"TAMIL SYLLABLE NNEE", 2, {0x0BA3, 0x0BC7}}, + {"TAMIL SYLLABLE NNI", 2, {0x0BA3, 0x0BBF}}, + {"TAMIL SYLLABLE NNII", 2, {0x0BA3, 0x0BC0}}, + {"TAMIL SYLLABLE NNNAA", 2, {0x0BA9, 0x0BBE}}, + {"TAMIL SYLLABLE NNNAI", 2, {0x0BA9, 0x0BC8}}, + {"TAMIL SYLLABLE NNNAU", 2, {0x0BA9, 0x0BCC}}, + {"TAMIL SYLLABLE NNNE", 2, {0x0BA9, 0x0BC6}}, + {"TAMIL SYLLABLE NNNEE", 2, {0x0BA9, 0x0BC7}}, + {"TAMIL SYLLABLE NNNI", 2, {0x0BA9, 0x0BBF}}, + {"TAMIL SYLLABLE NNNII", 2, {0x0BA9, 0x0BC0}}, + {"TAMIL SYLLABLE NNNO", 2, {0x0BA9, 0x0BCA}}, + {"TAMIL SYLLABLE NNNOO", 2, {0x0BA9, 0x0BCB}}, + {"TAMIL SYLLABLE NNNU", 2, {0x0BA9, 0x0BC1}}, + {"TAMIL SYLLABLE NNNUU", 2, {0x0BA9, 0x0BC2}}, + {"TAMIL SYLLABLE NNO", 2, {0x0BA3, 0x0BCA}}, + {"TAMIL SYLLABLE NNOO", 2, {0x0BA3, 0x0BCB}}, + {"TAMIL SYLLABLE NNU", 2, {0x0BA3, 0x0BC1}}, + {"TAMIL SYLLABLE NNUU", 2, {0x0BA3, 0x0BC2}}, + {"TAMIL SYLLABLE NO", 2, {0x0BA8, 0x0BCA}}, + {"TAMIL SYLLABLE NOO", 2, {0x0BA8, 0x0BCB}}, + {"TAMIL SYLLABLE NU", 2, {0x0BA8, 0x0BC1}}, + {"TAMIL SYLLABLE NUU", 2, {0x0BA8, 0x0BC2}}, + {"TAMIL SYLLABLE NYAA", 2, {0x0B9E, 0x0BBE}}, + {"TAMIL SYLLABLE NYAI", 2, {0x0B9E, 0x0BC8}}, + {"TAMIL SYLLABLE NYAU", 2, {0x0B9E, 0x0BCC}}, + {"TAMIL SYLLABLE NYE", 2, {0x0B9E, 0x0BC6}}, + {"TAMIL SYLLABLE NYEE", 2, {0x0B9E, 0x0BC7}}, + {"TAMIL SYLLABLE NYI", 2, {0x0B9E, 0x0BBF}}, + {"TAMIL SYLLABLE NYII", 2, {0x0B9E, 0x0BC0}}, + {"TAMIL SYLLABLE NYO", 2, {0x0B9E, 0x0BCA}}, + {"TAMIL SYLLABLE NYOO", 2, {0x0B9E, 0x0BCB}}, + {"TAMIL SYLLABLE NYU", 2, {0x0B9E, 0x0BC1}}, + {"TAMIL SYLLABLE NYUU", 2, {0x0B9E, 0x0BC2}}, + {"TAMIL SYLLABLE PAA", 2, {0x0BAA, 0x0BBE}}, + {"TAMIL SYLLABLE PAI", 2, {0x0BAA, 0x0BC8}}, + {"TAMIL SYLLABLE PAU", 2, {0x0BAA, 0x0BCC}}, + {"TAMIL SYLLABLE PE", 2, {0x0BAA, 0x0BC6}}, + {"TAMIL SYLLABLE PEE", 2, {0x0BAA, 0x0BC7}}, + {"TAMIL SYLLABLE PI", 2, 
{0x0BAA, 0x0BBF}}, + {"TAMIL SYLLABLE PII", 2, {0x0BAA, 0x0BC0}}, + {"TAMIL SYLLABLE PO", 2, {0x0BAA, 0x0BCA}}, + {"TAMIL SYLLABLE POO", 2, {0x0BAA, 0x0BCB}}, + {"TAMIL SYLLABLE PU", 2, {0x0BAA, 0x0BC1}}, + {"TAMIL SYLLABLE PUU", 2, {0x0BAA, 0x0BC2}}, + {"TAMIL SYLLABLE RAA", 2, {0x0BB0, 0x0BBE}}, + {"TAMIL SYLLABLE RAI", 2, {0x0BB0, 0x0BC8}}, + {"TAMIL SYLLABLE RAU", 2, {0x0BB0, 0x0BCC}}, + {"TAMIL SYLLABLE RE", 2, {0x0BB0, 0x0BC6}}, + {"TAMIL SYLLABLE REE", 2, {0x0BB0, 0x0BC7}}, + {"TAMIL SYLLABLE RI", 2, {0x0BB0, 0x0BBF}}, + {"TAMIL SYLLABLE RII", 2, {0x0BB0, 0x0BC0}}, + {"TAMIL SYLLABLE RO", 2, {0x0BB0, 0x0BCA}}, + {"TAMIL SYLLABLE ROO", 2, {0x0BB0, 0x0BCB}}, + {"TAMIL SYLLABLE RRAA", 2, {0x0BB1, 0x0BBE}}, + {"TAMIL SYLLABLE RRAI", 2, {0x0BB1, 0x0BC8}}, + {"TAMIL SYLLABLE RRAU", 2, {0x0BB1, 0x0BCC}}, + {"TAMIL SYLLABLE RRE", 2, {0x0BB1, 0x0BC6}}, + {"TAMIL SYLLABLE RREE", 2, {0x0BB1, 0x0BC7}}, + {"TAMIL SYLLABLE RRI", 2, {0x0BB1, 0x0BBF}}, + {"TAMIL SYLLABLE RRII", 2, {0x0BB1, 0x0BC0}}, + {"TAMIL SYLLABLE RRO", 2, {0x0BB1, 0x0BCA}}, + {"TAMIL SYLLABLE RROO", 2, {0x0BB1, 0x0BCB}}, + {"TAMIL SYLLABLE RRU", 2, {0x0BB1, 0x0BC1}}, + {"TAMIL SYLLABLE RRUU", 2, {0x0BB1, 0x0BC2}}, + {"TAMIL SYLLABLE RU", 2, {0x0BB0, 0x0BC1}}, + {"TAMIL SYLLABLE RUU", 2, {0x0BB0, 0x0BC2}}, + {"TAMIL SYLLABLE SAA", 2, {0x0BB8, 0x0BBE}}, + {"TAMIL SYLLABLE SAI", 2, {0x0BB8, 0x0BC8}}, + {"TAMIL SYLLABLE SAU", 2, {0x0BB8, 0x0BCC}}, + {"TAMIL SYLLABLE SE", 2, {0x0BB8, 0x0BC6}}, + {"TAMIL SYLLABLE SEE", 2, {0x0BB8, 0x0BC7}}, + {"TAMIL SYLLABLE SHAA", 2, {0x0BB6, 0x0BBE}}, + {"TAMIL SYLLABLE SHAI", 2, {0x0BB6, 0x0BC8}}, + {"TAMIL SYLLABLE SHAU", 2, {0x0BB6, 0x0BCC}}, + {"TAMIL SYLLABLE SHE", 2, {0x0BB6, 0x0BC6}}, + {"TAMIL SYLLABLE SHEE", 2, {0x0BB6, 0x0BC7}}, + {"TAMIL SYLLABLE SHI", 2, {0x0BB6, 0x0BBF}}, + {"TAMIL SYLLABLE SHII", 2, {0x0BB6, 0x0BC0}}, + {"TAMIL SYLLABLE SHO", 2, {0x0BB6, 0x0BCA}}, + {"TAMIL SYLLABLE SHOO", 2, {0x0BB6, 0x0BCB}}, + {"TAMIL SYLLABLE SHRII", 4, {0x0BB6, 0x0BCD, 
0x0BB0, 0x0BC0}}, + {"TAMIL SYLLABLE SHU", 2, {0x0BB6, 0x0BC1}}, + {"TAMIL SYLLABLE SHUU", 2, {0x0BB6, 0x0BC2}}, + {"TAMIL SYLLABLE SI", 2, {0x0BB8, 0x0BBF}}, + {"TAMIL SYLLABLE SII", 2, {0x0BB8, 0x0BC0}}, + {"TAMIL SYLLABLE SO", 2, {0x0BB8, 0x0BCA}}, + {"TAMIL SYLLABLE SOO", 2, {0x0BB8, 0x0BCB}}, + {"TAMIL SYLLABLE SSAA", 2, {0x0BB7, 0x0BBE}}, + {"TAMIL SYLLABLE SSAI", 2, {0x0BB7, 0x0BC8}}, + {"TAMIL SYLLABLE SSAU", 2, {0x0BB7, 0x0BCC}}, + {"TAMIL SYLLABLE SSE", 2, {0x0BB7, 0x0BC6}}, + {"TAMIL SYLLABLE SSEE", 2, {0x0BB7, 0x0BC7}}, + {"TAMIL SYLLABLE SSI", 2, {0x0BB7, 0x0BBF}}, + {"TAMIL SYLLABLE SSII", 2, {0x0BB7, 0x0BC0}}, + {"TAMIL SYLLABLE SSO", 2, {0x0BB7, 0x0BCA}}, + {"TAMIL SYLLABLE SSOO", 2, {0x0BB7, 0x0BCB}}, + {"TAMIL SYLLABLE SSU", 2, {0x0BB7, 0x0BC1}}, + {"TAMIL SYLLABLE SSUU", 2, {0x0BB7, 0x0BC2}}, + {"TAMIL SYLLABLE SU", 2, {0x0BB8, 0x0BC1}}, + {"TAMIL SYLLABLE SUU", 2, {0x0BB8, 0x0BC2}}, + {"TAMIL SYLLABLE TAA", 2, {0x0BA4, 0x0BBE}}, + {"TAMIL SYLLABLE TAI", 2, {0x0BA4, 0x0BC8}}, + {"TAMIL SYLLABLE TAU", 2, {0x0BA4, 0x0BCC}}, + {"TAMIL SYLLABLE TE", 2, {0x0BA4, 0x0BC6}}, + {"TAMIL SYLLABLE TEE", 2, {0x0BA4, 0x0BC7}}, + {"TAMIL SYLLABLE TI", 2, {0x0BA4, 0x0BBF}}, + {"TAMIL SYLLABLE TII", 2, {0x0BA4, 0x0BC0}}, + {"TAMIL SYLLABLE TO", 2, {0x0BA4, 0x0BCA}}, + {"TAMIL SYLLABLE TOO", 2, {0x0BA4, 0x0BCB}}, + {"TAMIL SYLLABLE TTAA", 2, {0x0B9F, 0x0BBE}}, + {"TAMIL SYLLABLE TTAI", 2, {0x0B9F, 0x0BC8}}, + {"TAMIL SYLLABLE TTAU", 2, {0x0B9F, 0x0BCC}}, + {"TAMIL SYLLABLE TTE", 2, {0x0B9F, 0x0BC6}}, + {"TAMIL SYLLABLE TTEE", 2, {0x0B9F, 0x0BC7}}, + {"TAMIL SYLLABLE TTI", 2, {0x0B9F, 0x0BBF}}, + {"TAMIL SYLLABLE TTII", 2, {0x0B9F, 0x0BC0}}, + {"TAMIL SYLLABLE TTO", 2, {0x0B9F, 0x0BCA}}, + {"TAMIL SYLLABLE TTOO", 2, {0x0B9F, 0x0BCB}}, + {"TAMIL SYLLABLE TTU", 2, {0x0B9F, 0x0BC1}}, + {"TAMIL SYLLABLE TTUU", 2, {0x0B9F, 0x0BC2}}, + {"TAMIL SYLLABLE TU", 2, {0x0BA4, 0x0BC1}}, + {"TAMIL SYLLABLE TUU", 2, {0x0BA4, 0x0BC2}}, + {"TAMIL SYLLABLE VAA", 2, {0x0BB5, 0x0BBE}}, 
+    {"TAMIL SYLLABLE VAI", 2, {0x0BB5, 0x0BC8}},
+    {"TAMIL SYLLABLE VAU", 2, {0x0BB5, 0x0BCC}},
+    {"TAMIL SYLLABLE VE", 2, {0x0BB5, 0x0BC6}},
+    {"TAMIL SYLLABLE VEE", 2, {0x0BB5, 0x0BC7}},
+    {"TAMIL SYLLABLE VI", 2, {0x0BB5, 0x0BBF}},
+    {"TAMIL SYLLABLE VII", 2, {0x0BB5, 0x0BC0}},
+    {"TAMIL SYLLABLE VO", 2, {0x0BB5, 0x0BCA}},
+    {"TAMIL SYLLABLE VOO", 2, {0x0BB5, 0x0BCB}},
+    {"TAMIL SYLLABLE VU", 2, {0x0BB5, 0x0BC1}},
+    {"TAMIL SYLLABLE VUU", 2, {0x0BB5, 0x0BC2}},
+    {"TAMIL SYLLABLE YAA", 2, {0x0BAF, 0x0BBE}},
+    {"TAMIL SYLLABLE YAI", 2, {0x0BAF, 0x0BC8}},
+    {"TAMIL SYLLABLE YAU", 2, {0x0BAF, 0x0BCC}},
+    {"TAMIL SYLLABLE YE", 2, {0x0BAF, 0x0BC6}},
+    {"TAMIL SYLLABLE YEE", 2, {0x0BAF, 0x0BC7}},
+    {"TAMIL SYLLABLE YI", 2, {0x0BAF, 0x0BBF}},
+    {"TAMIL SYLLABLE YII", 2, {0x0BAF, 0x0BC0}},
+    {"TAMIL SYLLABLE YO", 2, {0x0BAF, 0x0BCA}},
+    {"TAMIL SYLLABLE YOO", 2, {0x0BAF, 0x0BCB}},
+    {"TAMIL SYLLABLE YU", 2, {0x0BAF, 0x0BC1}},
+    {"TAMIL SYLLABLE YUU", 2, {0x0BAF, 0x0BC2}},
+};
diff --git a/Tools/unicode/makeunicodedata.py b/Tools/unicode/makeunicodedata.py
--- a/Tools/unicode/makeunicodedata.py
+++ b/Tools/unicode/makeunicodedata.py
@@ -25,7 +25,12 @@
 # written by Fredrik Lundh (fredrik at pythonware.com)
 #

-import sys, os, zipfile
+import os
+import sys
+import zipfile
+
+from textwrap import dedent
+from operator import itemgetter

 SCRIPT = sys.argv[0]
 VERSION = "3.2"
@@ -39,6 +44,8 @@
 DERIVED_CORE_PROPERTIES = "DerivedCoreProperties%s.txt"
 DERIVEDNORMALIZATION_PROPS = "DerivedNormalizationProps%s.txt"
 LINE_BREAK = "LineBreak%s.txt"
+NAME_ALIASES = "NameAliases%s.txt"
+NAMED_SEQUENCES = "NamedSequences%s.txt"

 old_versions = ["3.2.0"]

@@ -692,6 +699,40 @@
     print("/* name->code dictionary */", file=fp)
     codehash.dump(fp, trace)

+    print(dedent("""
+        typedef struct Alias {
+            char *name;
+            int namelen;
+            int codepoint;
+        } alias;
+        """), file=fp)
+
+    print('static const int aliases_count = %d;' % len(unicode.aliases), file=fp)
+
+    print('static const alias name_aliases[] = {', file=fp)
+    for name, codepoint in unicode.aliases:
+        print('    {"%s", %d, 0x%04X},' % (name, len(name), codepoint), file=fp)
+    print('};', file=fp)
+
+    # the Py_UCS2 seq[4] should use Py_UCS4 if non-BMP chars are added to the
+    # sequences and have a higher number of elements if the sequences get longer
+    print(dedent("""
+        typedef struct NamedSequence {
+            char *name;
+            int seqlen;
+            Py_UCS2 seq[4];
+        } named_sequence;
+        """), file=fp)
+
+    print('static const int named_sequences_count = %d;' % len(unicode.named_sequences),
+          file=fp)
+
+    print('static const named_sequence named_sequences[] = {', file=fp)
+    for name, sequence in unicode.named_sequences:
+        seq_str = ', '.join('0x%04X' % cp for cp in sequence)
+        print('    {"%s", %d, {%s}},' % (name, len(sequence), seq_str), file=fp)
+    print('};', file=fp)
+
     fp.close()


@@ -855,6 +896,31 @@
         self.table = table
         self.chars = list(range(0x110000)) # unicode 3.2

+        self.aliases = []
+        with open_data(NAME_ALIASES, version) as file:
+            for s in file:
+                s = s.strip()
+                if not s or s.startswith('#'):
+                    continue
+                char, name = s.split(';')
+                char = int(char, 16)
+                self.aliases.append((name, char))
+
+        self.named_sequences = []
+        with open_data(NAMED_SEQUENCES, version) as file:
+            for s in file:
+                s = s.strip()
+                if not s or s.startswith('#'):
+                    continue
+                name, chars = s.split(';')
+                chars = tuple(int(char, 16) for char in chars.split())
+                # check that the structure defined in makeunicodename is OK
+                assert 2 <= len(chars) <= 4, "change the Py_UCS2 array size"
+                assert all(c <= 0xFFFF for c in chars), "use Py_UCS4 instead"
+                self.named_sequences.append((name, chars))
+        # sort names to enable binary search
+        self.named_sequences.sort(key=itemgetter(0))
+
         self.exclusions = {}
         with open_data(COMPOSITION_EXCLUSIONS, version) as file:
             for s in file:

From report at bugs.python.org Thu Oct 6 12:48:52 2011
From: report at bugs.python.org (Ezio Melotti)
Date: Thu, 06 Oct 2011 10:48:52 +0000
Subject: [issue2771] Test issue
In-Reply-To:
Message-ID:

Ezio Melotti added the comment:

test attachments

----------
Added file: http://bugs.python.org/file23324/issue12753-3.diff

_______________________________________
Python tracker
_______________________________________
-------------- next part --------------
diff --git a/Doc/library/unicodedata.rst b/Doc/library/unicodedata.rst
--- a/Doc/library/unicodedata.rst
+++ b/Doc/library/unicodedata.rst
@@ -29,6 +29,9 @@
    Look up character by name. If a character with the given name is found, return
    the corresponding character. If not found, :exc:`KeyError` is raised.

+   .. versionchanged:: 3.3
+      Support for name aliases [#]_ and named sequences [#]_ has been added.
+

 .. function:: name(chr[, default])

@@ -160,3 +163,9 @@
    >>> unicodedata.bidirectional('\u0660') # 'A'rabic, 'N'umber
    'AN'

+
+.. rubric:: Footnotes
+
+.. [#] http://www.unicode.org/Public/6.0.0/ucd/NameAliases.txt
+
+.. [#] http://www.unicode.org/Public/6.0.0/ucd/NamedSequences.txt
diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst
--- a/Doc/reference/lexical_analysis.rst
+++ b/Doc/reference/lexical_analysis.rst
@@ -492,13 +492,13 @@
 +-----------------+---------------------------------+-------+
 | Escape Sequence | Meaning                         | Notes |
 +=================+=================================+=======+
-| ``\N{name}``    | Character named *name* in the   |       |
+| ``\N{name}``    | Character named *name* in the   | \(4)  |
 |                 | Unicode database                |       |
 +-----------------+---------------------------------+-------+
-| ``\uxxxx``      | Character with 16-bit hex value | \(4)  |
+| ``\uxxxx``      | Character with 16-bit hex value | \(5)  |
 |                 | *xxxx*                          |       |
 +-----------------+---------------------------------+-------+
-| ``\Uxxxxxxxx``  | Character with 32-bit hex value | \(5)  |
+| ``\Uxxxxxxxx``  | Character with 32-bit hex value | \(6)  |
 |                 | *xxxxxxxx*                      |       |
 +-----------------+---------------------------------+-------+

@@ -516,10 +516,14 @@
    with the given value.

 (4)
+   ..
versionchanged:: 3.3 + Support for name aliases [#]_ has been added. + +(5) Individual code units which form parts of a surrogate pair can be encoded using this escape sequence. Exactly four hex digits are required. -(5) +(6) Any Unicode character can be encoded this way, but characters outside the Basic Multilingual Plane (BMP) will be encoded using a surrogate pair if Python is compiled to use 16-bit code units (the default). Exactly eight hex digits @@ -706,3 +710,8 @@ occurrence outside string literals and comments is an unconditional error:: $ ? ` + + +.. rubric:: Footnotes + +.. [#] http://www.unicode.org/Public/6.0.0/ucd/NameAliases.txt diff --git a/Lib/test/test_ucn.py b/Lib/test/test_ucn.py --- a/Lib/test/test_ucn.py +++ b/Lib/test/test_ucn.py @@ -8,8 +8,11 @@ """#" import unittest +import unicodedata from test import support +from http.client import HTTPException +from test.test_normalization import check_version class UnicodeNamesTest(unittest.TestCase): @@ -59,8 +62,6 @@ ) def test_ascii_letters(self): - import unicodedata - for char in "".join(map(chr, range(ord("a"), ord("z")))): name = "LATIN SMALL LETTER %s" % char.upper() code = unicodedata.lookup(name) @@ -81,7 +82,6 @@ self.checkletter("HANGUL SYLLABLE HWEOK", "\ud6f8") self.checkletter("HANGUL SYLLABLE HIH", "\ud7a3") - import unicodedata self.assertRaises(ValueError, unicodedata.name, "\ud7a4") def test_cjk_unified_ideographs(self): @@ -97,14 +97,11 @@ self.checkletter("CJK UNIFIED IDEOGRAPH-2B81D", "\U0002B81D") def test_bmp_characters(self): - import unicodedata - count = 0 for code in range(0x10000): char = chr(code) name = unicodedata.name(char, None) if name is not None: self.assertEqual(unicodedata.lookup(name), char) - count += 1 def test_misc_symbols(self): self.checkletter("PILCROW SIGN", "\u00b6") @@ -112,8 +109,65 @@ self.checkletter("HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK", "\uFF9F") self.checkletter("FULLWIDTH LATIN SMALL LETTER A", "\uFF41") + def test_aliases(self): + # Check 
that the aliases defined in the NameAliases.txt file work. + # This should be updated when new aliases are added or the file + # should be downloaded and parsed instead. See #12753. + aliases = [ + ('LATIN CAPITAL LETTER GHA', 0x01A2), + ('LATIN SMALL LETTER GHA', 0x01A3), + ('KANNADA LETTER LLLA', 0x0CDE), + ('LAO LETTER FO FON', 0x0E9D), + ('LAO LETTER FO FAY', 0x0E9F), + ('LAO LETTER RO', 0x0EA3), + ('LAO LETTER LO', 0x0EA5), + ('TIBETAN MARK BKA- SHOG GI MGO RGYAN', 0x0FD0), + ('YI SYLLABLE ITERATION MARK', 0xA015), + ('PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRACKET', 0xFE18), + ('BYZANTINE MUSICAL SYMBOL FTHORA SKLIRON CHROMA VASIS', 0x1D0C5) + ] + for alias, codepoint in aliases: + self.checkletter(alias, chr(codepoint)) + name = unicodedata.name(chr(codepoint)) + self.assertNotEqual(name, alias) + self.assertEqual(unicodedata.lookup(alias), + unicodedata.lookup(name)) + + def test_named_sequences_sample(self): + # Check a few named sequences. See #12753. + sequences = [ + ('LATIN SMALL LETTER R WITH TILDE', '\u0072\u0303'), + ('TAMIL SYLLABLE SAI', '\u0BB8\u0BC8'), + ('TAMIL SYLLABLE MOO', '\u0BAE\u0BCB'), + ('TAMIL SYLLABLE NNOO', '\u0BA3\u0BCB'), + ('TAMIL CONSONANT KSS', '\u0B95\u0BCD\u0BB7\u0BCD'), + ] + for seqname, codepoints in sequences: + self.assertEqual(unicodedata.lookup(seqname), codepoints) + with self.assertRaises(SyntaxError): + self.checkletter(seqname, None) + + def test_named_sequences_full(self): + # Check all the named sequences + url = ("http://www.unicode.org/Public/%s/ucd/NamedSequences.txt" % + unicodedata.unidata_version) + try: + testdata = support.open_urlresource(url, encoding="utf-8", + check=check_version) + except (IOError, HTTPException): + self.skipTest("Could not retrieve " + url) + self.addCleanup(testdata.close) + for line in testdata: + line = line.strip() + if not line or line.startswith('#'): + continue + seqname, codepoints = line.split(';') + codepoints = ''.join(chr(int(cp, 16)) for cp in 
codepoints.split()) + self.assertEqual(unicodedata.lookup(seqname), codepoints) + with self.assertRaises(SyntaxError): + self.checkletter(seqname, None) + def test_errors(self): - import unicodedata self.assertRaises(TypeError, unicodedata.name) self.assertRaises(TypeError, unicodedata.name, 'xx') self.assertRaises(TypeError, unicodedata.lookup) diff --git a/Modules/unicodedata.c b/Modules/unicodedata.c --- a/Modules/unicodedata.c +++ b/Modules/unicodedata.c @@ -1054,7 +1054,7 @@ static int _getcode(PyObject* self, const char* name, int namelen, Py_UCS4* code) { - unsigned int h, v; + unsigned int h, v, k; unsigned int mask = code_size-1; unsigned int i, incr; @@ -1100,6 +1100,17 @@ return 1; } + /* check for aliases defined in NameAliases.txt */ + for (k=0; k 0) + low = mid + 1; + else + return PyUnicode_FromKindAndData(PyUnicode_2BYTE_KIND, + named_sequences[mid].seq, + named_sequences[mid].seqlen); + } + return NULL; +} + PyDoc_STRVAR(unicodedata_lookup__doc__, "lookup(name)\n\ \n\ @@ -1187,6 +1218,7 @@ unicodedata_lookup(PyObject* self, PyObject* args) { Py_UCS4 code; + PyObject *codes; /* for named sequences */ char* name; int namelen; @@ -1194,9 +1226,13 @@ return NULL; if (!_getcode(self, name, namelen, &code)) { - PyErr_Format(PyExc_KeyError, "undefined character name '%s'", - name); - return NULL; + /* if the normal lookup fails try with named sequences */ + codes = _lookup_named_sequences(name); + if (codes == NULL) { + PyErr_Format(PyExc_KeyError, "undefined character name '%s'", name); + return NULL; + } + return codes; } return PyUnicode_FromOrdinal(code); diff --git a/Modules/unicodename_db.h b/Modules/unicodename_db.h --- a/Modules/unicodename_db.h +++ b/Modules/unicodename_db.h @@ -18811,3 +18811,452 @@ #define code_magic 47 #define code_size 32768 #define code_poly 32771 + +typedef struct Alias { + char *name; + int namelen; + int codepoint; +} alias; + +static const int aliases_count = 11; +static const alias name_aliases[] = { + {"LATIN CAPITAL 
LETTER GHA", 24, 0x01A2}, + {"LATIN SMALL LETTER GHA", 22, 0x01A3}, + {"KANNADA LETTER LLLA", 19, 0x0CDE}, + {"LAO LETTER FO FON", 17, 0x0E9D}, + {"LAO LETTER FO FAY", 17, 0x0E9F}, + {"LAO LETTER RO", 13, 0x0EA3}, + {"LAO LETTER LO", 13, 0x0EA5}, + {"TIBETAN MARK BKA- SHOG GI MGO RGYAN", 35, 0x0FD0}, + {"YI SYLLABLE ITERATION MARK", 26, 0xA015}, + {"PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRACKET", 61, 0xFE18}, + {"BYZANTINE MUSICAL SYMBOL FTHORA SKLIRON CHROMA VASIS", 52, 0x1D0C5}, +}; + +typedef struct NamedSequence { + char *name; + int seqlen; + Py_UCS2 seq[4]; +} named_sequence; + +static const int named_sequences_count = 418; +static const named_sequence named_sequences[] = { + {"BENGALI LETTER KHINYA", 3, {0x0995, 0x09CD, 0x09B7}}, + {"GEORGIAN LETTER U-BRJGU", 2, {0x10E3, 0x0302}}, + {"HIRAGANA LETTER BIDAKUON NGA", 2, {0x304B, 0x309A}}, + {"HIRAGANA LETTER BIDAKUON NGE", 2, {0x3051, 0x309A}}, + {"HIRAGANA LETTER BIDAKUON NGI", 2, {0x304D, 0x309A}}, + {"HIRAGANA LETTER BIDAKUON NGO", 2, {0x3053, 0x309A}}, + {"HIRAGANA LETTER BIDAKUON NGU", 2, {0x304F, 0x309A}}, + {"KATAKANA LETTER AINU CE", 2, {0x30BB, 0x309A}}, + {"KATAKANA LETTER AINU P", 2, {0x31F7, 0x309A}}, + {"KATAKANA LETTER AINU TO", 2, {0x30C8, 0x309A}}, + {"KATAKANA LETTER AINU TU", 2, {0x30C4, 0x309A}}, + {"KATAKANA LETTER BIDAKUON NGA", 2, {0x30AB, 0x309A}}, + {"KATAKANA LETTER BIDAKUON NGE", 2, {0x30B1, 0x309A}}, + {"KATAKANA LETTER BIDAKUON NGI", 2, {0x30AD, 0x309A}}, + {"KATAKANA LETTER BIDAKUON NGO", 2, {0x30B3, 0x309A}}, + {"KATAKANA LETTER BIDAKUON NGU", 2, {0x30AF, 0x309A}}, + {"KHMER CONSONANT SIGN COENG BA", 2, {0x17D2, 0x1794}}, + {"KHMER CONSONANT SIGN COENG CA", 2, {0x17D2, 0x1785}}, + {"KHMER CONSONANT SIGN COENG CHA", 2, {0x17D2, 0x1786}}, + {"KHMER CONSONANT SIGN COENG CHO", 2, {0x17D2, 0x1788}}, + {"KHMER CONSONANT SIGN COENG CO", 2, {0x17D2, 0x1787}}, + {"KHMER CONSONANT SIGN COENG DA", 2, {0x17D2, 0x178A}}, + {"KHMER CONSONANT SIGN COENG DO", 2, {0x17D2, 0x178C}}, 
+ {"KHMER CONSONANT SIGN COENG HA", 2, {0x17D2, 0x17A0}}, + {"KHMER CONSONANT SIGN COENG KA", 2, {0x17D2, 0x1780}}, + {"KHMER CONSONANT SIGN COENG KHA", 2, {0x17D2, 0x1781}}, + {"KHMER CONSONANT SIGN COENG KHO", 2, {0x17D2, 0x1783}}, + {"KHMER CONSONANT SIGN COENG KO", 2, {0x17D2, 0x1782}}, + {"KHMER CONSONANT SIGN COENG LA", 2, {0x17D2, 0x17A1}}, + {"KHMER CONSONANT SIGN COENG LO", 2, {0x17D2, 0x179B}}, + {"KHMER CONSONANT SIGN COENG MO", 2, {0x17D2, 0x1798}}, + {"KHMER CONSONANT SIGN COENG NA", 2, {0x17D2, 0x178E}}, + {"KHMER CONSONANT SIGN COENG NGO", 2, {0x17D2, 0x1784}}, + {"KHMER CONSONANT SIGN COENG NO", 2, {0x17D2, 0x1793}}, + {"KHMER CONSONANT SIGN COENG NYO", 2, {0x17D2, 0x1789}}, + {"KHMER CONSONANT SIGN COENG PHA", 2, {0x17D2, 0x1795}}, + {"KHMER CONSONANT SIGN COENG PHO", 2, {0x17D2, 0x1797}}, + {"KHMER CONSONANT SIGN COENG PO", 2, {0x17D2, 0x1796}}, + {"KHMER CONSONANT SIGN COENG RO", 2, {0x17D2, 0x179A}}, + {"KHMER CONSONANT SIGN COENG SA", 2, {0x17D2, 0x179F}}, + {"KHMER CONSONANT SIGN COENG SHA", 2, {0x17D2, 0x179D}}, + {"KHMER CONSONANT SIGN COENG SSA", 2, {0x17D2, 0x179E}}, + {"KHMER CONSONANT SIGN COENG TA", 2, {0x17D2, 0x178F}}, + {"KHMER CONSONANT SIGN COENG THA", 2, {0x17D2, 0x1790}}, + {"KHMER CONSONANT SIGN COENG THO", 2, {0x17D2, 0x1792}}, + {"KHMER CONSONANT SIGN COENG TO", 2, {0x17D2, 0x1791}}, + {"KHMER CONSONANT SIGN COENG TTHA", 2, {0x17D2, 0x178B}}, + {"KHMER CONSONANT SIGN COENG TTHO", 2, {0x17D2, 0x178D}}, + {"KHMER CONSONANT SIGN COENG VO", 2, {0x17D2, 0x179C}}, + {"KHMER CONSONANT SIGN COENG YO", 2, {0x17D2, 0x1799}}, + {"KHMER INDEPENDENT VOWEL SIGN COENG QE", 2, {0x17D2, 0x17AF}}, + {"KHMER INDEPENDENT VOWEL SIGN COENG QU", 2, {0x17D2, 0x17A7}}, + {"KHMER INDEPENDENT VOWEL SIGN COENG RY", 2, {0x17D2, 0x17AB}}, + {"KHMER INDEPENDENT VOWEL SIGN COENG RYY", 2, {0x17D2, 0x17AC}}, + {"KHMER VOWEL SIGN AAM", 2, {0x17B6, 0x17C6}}, + {"KHMER VOWEL SIGN COENG QA", 2, {0x17D2, 0x17A2}}, + {"KHMER VOWEL SIGN OM", 2, {0x17BB, 0x17C6}}, + 
{"LATIN CAPITAL LETTER A WITH MACRON AND GRAVE", 2, {0x0100, 0x0300}}, + {"LATIN CAPITAL LETTER A WITH OGONEK AND ACUTE", 2, {0x0104, 0x0301}}, + {"LATIN CAPITAL LETTER A WITH OGONEK AND TILDE", 2, {0x0104, 0x0303}}, + {"LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND CARON", 2, {0x00CA, 0x030C}}, + {"LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND MACRON", 2, {0x00CA, 0x0304}}, + {"LATIN CAPITAL LETTER E WITH DOT ABOVE AND ACUTE", 2, {0x0116, 0x0301}}, + {"LATIN CAPITAL LETTER E WITH DOT ABOVE AND TILDE", 2, {0x0116, 0x0303}}, + {"LATIN CAPITAL LETTER E WITH OGONEK AND ACUTE", 2, {0x0118, 0x0301}}, + {"LATIN CAPITAL LETTER E WITH OGONEK AND TILDE", 2, {0x0118, 0x0303}}, + {"LATIN CAPITAL LETTER E WITH VERTICAL LINE BELOW", 2, {0x0045, 0x0329}}, + {"LATIN CAPITAL LETTER E WITH VERTICAL LINE BELOW AND ACUTE", 2, {0x00C9, 0x0329}}, + {"LATIN CAPITAL LETTER E WITH VERTICAL LINE BELOW AND GRAVE", 2, {0x00C8, 0x0329}}, + {"LATIN CAPITAL LETTER I WITH MACRON AND GRAVE", 2, {0x012A, 0x0300}}, + {"LATIN CAPITAL LETTER I WITH OGONEK AND ACUTE", 2, {0x012E, 0x0301}}, + {"LATIN CAPITAL LETTER I WITH OGONEK AND TILDE", 2, {0x012E, 0x0303}}, + {"LATIN CAPITAL LETTER J WITH TILDE", 2, {0x004A, 0x0303}}, + {"LATIN CAPITAL LETTER L WITH TILDE", 2, {0x004C, 0x0303}}, + {"LATIN CAPITAL LETTER M WITH TILDE", 2, {0x004D, 0x0303}}, + {"LATIN CAPITAL LETTER O WITH VERTICAL LINE BELOW", 2, {0x004F, 0x0329}}, + {"LATIN CAPITAL LETTER O WITH VERTICAL LINE BELOW AND ACUTE", 2, {0x00D3, 0x0329}}, + {"LATIN CAPITAL LETTER O WITH VERTICAL LINE BELOW AND GRAVE", 2, {0x00D2, 0x0329}}, + {"LATIN CAPITAL LETTER R WITH TILDE", 2, {0x0052, 0x0303}}, + {"LATIN CAPITAL LETTER S WITH VERTICAL LINE BELOW", 2, {0x0053, 0x0329}}, + {"LATIN CAPITAL LETTER U WITH MACRON AND ACUTE", 2, {0x016A, 0x0301}}, + {"LATIN CAPITAL LETTER U WITH MACRON AND GRAVE", 2, {0x016A, 0x0300}}, + {"LATIN CAPITAL LETTER U WITH MACRON AND TILDE", 2, {0x016A, 0x0303}}, + {"LATIN CAPITAL LETTER U WITH OGONEK AND ACUTE", 2, {0x0172, 
0x0301}}, + {"LATIN CAPITAL LETTER U WITH OGONEK AND TILDE", 2, {0x0172, 0x0303}}, + {"LATIN SMALL LETTER A WITH MACRON AND GRAVE", 2, {0x0101, 0x0300}}, + {"LATIN SMALL LETTER A WITH OGONEK AND ACUTE", 2, {0x0105, 0x0301}}, + {"LATIN SMALL LETTER A WITH OGONEK AND TILDE", 2, {0x0105, 0x0303}}, + {"LATIN SMALL LETTER AE WITH GRAVE", 2, {0x00E6, 0x0300}}, + {"LATIN SMALL LETTER E WITH CIRCUMFLEX AND CARON", 2, {0x00EA, 0x030C}}, + {"LATIN SMALL LETTER E WITH CIRCUMFLEX AND MACRON", 2, {0x00EA, 0x0304}}, + {"LATIN SMALL LETTER E WITH DOT ABOVE AND ACUTE", 2, {0x0117, 0x0301}}, + {"LATIN SMALL LETTER E WITH DOT ABOVE AND TILDE", 2, {0x0117, 0x0303}}, + {"LATIN SMALL LETTER E WITH OGONEK AND ACUTE", 2, {0x0119, 0x0301}}, + {"LATIN SMALL LETTER E WITH OGONEK AND TILDE", 2, {0x0119, 0x0303}}, + {"LATIN SMALL LETTER E WITH VERTICAL LINE BELOW", 2, {0x0065, 0x0329}}, + {"LATIN SMALL LETTER E WITH VERTICAL LINE BELOW AND ACUTE", 2, {0x00E9, 0x0329}}, + {"LATIN SMALL LETTER E WITH VERTICAL LINE BELOW AND GRAVE", 2, {0x00E8, 0x0329}}, + {"LATIN SMALL LETTER HOOKED SCHWA WITH ACUTE", 2, {0x025A, 0x0301}}, + {"LATIN SMALL LETTER HOOKED SCHWA WITH GRAVE", 2, {0x025A, 0x0300}}, + {"LATIN SMALL LETTER I WITH DOT ABOVE AND ACUTE", 3, {0x0069, 0x0307, 0x0301}}, + {"LATIN SMALL LETTER I WITH DOT ABOVE AND GRAVE", 3, {0x0069, 0x0307, 0x0300}}, + {"LATIN SMALL LETTER I WITH DOT ABOVE AND TILDE", 3, {0x0069, 0x0307, 0x0303}}, + {"LATIN SMALL LETTER I WITH MACRON AND GRAVE", 2, {0x012B, 0x0300}}, + {"LATIN SMALL LETTER I WITH OGONEK AND DOT ABOVE AND ACUTE", 3, {0x012F, 0x0307, 0x0301}}, + {"LATIN SMALL LETTER I WITH OGONEK AND DOT ABOVE AND TILDE", 3, {0x012F, 0x0307, 0x0303}}, + {"LATIN SMALL LETTER J WITH DOT ABOVE AND TILDE", 3, {0x006A, 0x0307, 0x0303}}, + {"LATIN SMALL LETTER L WITH TILDE", 2, {0x006C, 0x0303}}, + {"LATIN SMALL LETTER M WITH TILDE", 2, {0x006D, 0x0303}}, + {"LATIN SMALL LETTER NG WITH TILDE ABOVE", 3, {0x006E, 0x0360, 0x0067}}, + {"LATIN SMALL LETTER O WITH 
VERTICAL LINE BELOW", 2, {0x006F, 0x0329}}, + {"LATIN SMALL LETTER O WITH VERTICAL LINE BELOW AND ACUTE", 2, {0x00F3, 0x0329}}, + {"LATIN SMALL LETTER O WITH VERTICAL LINE BELOW AND GRAVE", 2, {0x00F2, 0x0329}}, + {"LATIN SMALL LETTER OPEN O WITH ACUTE", 2, {0x0254, 0x0301}}, + {"LATIN SMALL LETTER OPEN O WITH GRAVE", 2, {0x0254, 0x0300}}, + {"LATIN SMALL LETTER R WITH TILDE", 2, {0x0072, 0x0303}}, + {"LATIN SMALL LETTER S WITH VERTICAL LINE BELOW", 2, {0x0073, 0x0329}}, + {"LATIN SMALL LETTER SCHWA WITH ACUTE", 2, {0x0259, 0x0301}}, + {"LATIN SMALL LETTER SCHWA WITH GRAVE", 2, {0x0259, 0x0300}}, + {"LATIN SMALL LETTER TURNED V WITH ACUTE", 2, {0x028C, 0x0301}}, + {"LATIN SMALL LETTER TURNED V WITH GRAVE", 2, {0x028C, 0x0300}}, + {"LATIN SMALL LETTER U WITH MACRON AND ACUTE", 2, {0x016B, 0x0301}}, + {"LATIN SMALL LETTER U WITH MACRON AND GRAVE", 2, {0x016B, 0x0300}}, + {"LATIN SMALL LETTER U WITH MACRON AND TILDE", 2, {0x016B, 0x0303}}, + {"LATIN SMALL LETTER U WITH OGONEK AND ACUTE", 2, {0x0173, 0x0301}}, + {"LATIN SMALL LETTER U WITH OGONEK AND TILDE", 2, {0x0173, 0x0303}}, + {"MODIFIER LETTER EXTRA-HIGH EXTRA-LOW CONTOUR TONE BAR", 2, {0x02E5, 0x02E9}}, + {"MODIFIER LETTER EXTRA-LOW EXTRA-HIGH CONTOUR TONE BAR", 2, {0x02E9, 0x02E5}}, + {"TAMIL CONSONANT C", 2, {0x0B9A, 0x0BCD}}, + {"TAMIL CONSONANT H", 2, {0x0BB9, 0x0BCD}}, + {"TAMIL CONSONANT J", 2, {0x0B9C, 0x0BCD}}, + {"TAMIL CONSONANT K", 2, {0x0B95, 0x0BCD}}, + {"TAMIL CONSONANT KSS", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BCD}}, + {"TAMIL CONSONANT L", 2, {0x0BB2, 0x0BCD}}, + {"TAMIL CONSONANT LL", 2, {0x0BB3, 0x0BCD}}, + {"TAMIL CONSONANT LLL", 2, {0x0BB4, 0x0BCD}}, + {"TAMIL CONSONANT M", 2, {0x0BAE, 0x0BCD}}, + {"TAMIL CONSONANT N", 2, {0x0BA8, 0x0BCD}}, + {"TAMIL CONSONANT NG", 2, {0x0B99, 0x0BCD}}, + {"TAMIL CONSONANT NN", 2, {0x0BA3, 0x0BCD}}, + {"TAMIL CONSONANT NNN", 2, {0x0BA9, 0x0BCD}}, + {"TAMIL CONSONANT NY", 2, {0x0B9E, 0x0BCD}}, + {"TAMIL CONSONANT P", 2, {0x0BAA, 0x0BCD}}, + {"TAMIL CONSONANT R", 2, 
{0x0BB0, 0x0BCD}}, + {"TAMIL CONSONANT RR", 2, {0x0BB1, 0x0BCD}}, + {"TAMIL CONSONANT S", 2, {0x0BB8, 0x0BCD}}, + {"TAMIL CONSONANT SH", 2, {0x0BB6, 0x0BCD}}, + {"TAMIL CONSONANT SS", 2, {0x0BB7, 0x0BCD}}, + {"TAMIL CONSONANT T", 2, {0x0BA4, 0x0BCD}}, + {"TAMIL CONSONANT TT", 2, {0x0B9F, 0x0BCD}}, + {"TAMIL CONSONANT V", 2, {0x0BB5, 0x0BCD}}, + {"TAMIL CONSONANT Y", 2, {0x0BAF, 0x0BCD}}, + {"TAMIL SYLLABLE CAA", 2, {0x0B9A, 0x0BBE}}, + {"TAMIL SYLLABLE CAI", 2, {0x0B9A, 0x0BC8}}, + {"TAMIL SYLLABLE CAU", 2, {0x0B9A, 0x0BCC}}, + {"TAMIL SYLLABLE CE", 2, {0x0B9A, 0x0BC6}}, + {"TAMIL SYLLABLE CEE", 2, {0x0B9A, 0x0BC7}}, + {"TAMIL SYLLABLE CI", 2, {0x0B9A, 0x0BBF}}, + {"TAMIL SYLLABLE CII", 2, {0x0B9A, 0x0BC0}}, + {"TAMIL SYLLABLE CO", 2, {0x0B9A, 0x0BCA}}, + {"TAMIL SYLLABLE COO", 2, {0x0B9A, 0x0BCB}}, + {"TAMIL SYLLABLE CU", 2, {0x0B9A, 0x0BC1}}, + {"TAMIL SYLLABLE CUU", 2, {0x0B9A, 0x0BC2}}, + {"TAMIL SYLLABLE HAA", 2, {0x0BB9, 0x0BBE}}, + {"TAMIL SYLLABLE HAI", 2, {0x0BB9, 0x0BC8}}, + {"TAMIL SYLLABLE HAU", 2, {0x0BB9, 0x0BCC}}, + {"TAMIL SYLLABLE HE", 2, {0x0BB9, 0x0BC6}}, + {"TAMIL SYLLABLE HEE", 2, {0x0BB9, 0x0BC7}}, + {"TAMIL SYLLABLE HI", 2, {0x0BB9, 0x0BBF}}, + {"TAMIL SYLLABLE HII", 2, {0x0BB9, 0x0BC0}}, + {"TAMIL SYLLABLE HO", 2, {0x0BB9, 0x0BCA}}, + {"TAMIL SYLLABLE HOO", 2, {0x0BB9, 0x0BCB}}, + {"TAMIL SYLLABLE HU", 2, {0x0BB9, 0x0BC1}}, + {"TAMIL SYLLABLE HUU", 2, {0x0BB9, 0x0BC2}}, + {"TAMIL SYLLABLE JAA", 2, {0x0B9C, 0x0BBE}}, + {"TAMIL SYLLABLE JAI", 2, {0x0B9C, 0x0BC8}}, + {"TAMIL SYLLABLE JAU", 2, {0x0B9C, 0x0BCC}}, + {"TAMIL SYLLABLE JE", 2, {0x0B9C, 0x0BC6}}, + {"TAMIL SYLLABLE JEE", 2, {0x0B9C, 0x0BC7}}, + {"TAMIL SYLLABLE JI", 2, {0x0B9C, 0x0BBF}}, + {"TAMIL SYLLABLE JII", 2, {0x0B9C, 0x0BC0}}, + {"TAMIL SYLLABLE JO", 2, {0x0B9C, 0x0BCA}}, + {"TAMIL SYLLABLE JOO", 2, {0x0B9C, 0x0BCB}}, + {"TAMIL SYLLABLE JU", 2, {0x0B9C, 0x0BC1}}, + {"TAMIL SYLLABLE JUU", 2, {0x0B9C, 0x0BC2}}, + {"TAMIL SYLLABLE KAA", 2, {0x0B95, 0x0BBE}}, + {"TAMIL SYLLABLE 
KAI", 2, {0x0B95, 0x0BC8}}, + {"TAMIL SYLLABLE KAU", 2, {0x0B95, 0x0BCC}}, + {"TAMIL SYLLABLE KE", 2, {0x0B95, 0x0BC6}}, + {"TAMIL SYLLABLE KEE", 2, {0x0B95, 0x0BC7}}, + {"TAMIL SYLLABLE KI", 2, {0x0B95, 0x0BBF}}, + {"TAMIL SYLLABLE KII", 2, {0x0B95, 0x0BC0}}, + {"TAMIL SYLLABLE KO", 2, {0x0B95, 0x0BCA}}, + {"TAMIL SYLLABLE KOO", 2, {0x0B95, 0x0BCB}}, + {"TAMIL SYLLABLE KSSA", 3, {0x0B95, 0x0BCD, 0x0BB7}}, + {"TAMIL SYLLABLE KSSAA", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BBE}}, + {"TAMIL SYLLABLE KSSAI", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BC8}}, + {"TAMIL SYLLABLE KSSAU", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BCC}}, + {"TAMIL SYLLABLE KSSE", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BC6}}, + {"TAMIL SYLLABLE KSSEE", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BC7}}, + {"TAMIL SYLLABLE KSSI", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BBF}}, + {"TAMIL SYLLABLE KSSII", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BC0}}, + {"TAMIL SYLLABLE KSSO", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BCA}}, + {"TAMIL SYLLABLE KSSOO", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BCB}}, + {"TAMIL SYLLABLE KSSU", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BC1}}, + {"TAMIL SYLLABLE KSSUU", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BC2}}, + {"TAMIL SYLLABLE KU", 2, {0x0B95, 0x0BC1}}, + {"TAMIL SYLLABLE KUU", 2, {0x0B95, 0x0BC2}}, + {"TAMIL SYLLABLE LAA", 2, {0x0BB2, 0x0BBE}}, + {"TAMIL SYLLABLE LAI", 2, {0x0BB2, 0x0BC8}}, + {"TAMIL SYLLABLE LAU", 2, {0x0BB2, 0x0BCC}}, + {"TAMIL SYLLABLE LE", 2, {0x0BB2, 0x0BC6}}, + {"TAMIL SYLLABLE LEE", 2, {0x0BB2, 0x0BC7}}, + {"TAMIL SYLLABLE LI", 2, {0x0BB2, 0x0BBF}}, + {"TAMIL SYLLABLE LII", 2, {0x0BB2, 0x0BC0}}, + {"TAMIL SYLLABLE LLAA", 2, {0x0BB3, 0x0BBE}}, + {"TAMIL SYLLABLE LLAI", 2, {0x0BB3, 0x0BC8}}, + {"TAMIL SYLLABLE LLAU", 2, {0x0BB3, 0x0BCC}}, + {"TAMIL SYLLABLE LLE", 2, {0x0BB3, 0x0BC6}}, + {"TAMIL SYLLABLE LLEE", 2, {0x0BB3, 0x0BC7}}, + {"TAMIL SYLLABLE LLI", 2, {0x0BB3, 0x0BBF}}, + {"TAMIL SYLLABLE LLII", 2, {0x0BB3, 0x0BC0}}, + {"TAMIL SYLLABLE LLLAA", 2, {0x0BB4, 0x0BBE}}, + {"TAMIL SYLLABLE LLLAI", 2, {0x0BB4, 0x0BC8}}, + {"TAMIL SYLLABLE LLLAU", 
2, {0x0BB4, 0x0BCC}}, + {"TAMIL SYLLABLE LLLE", 2, {0x0BB4, 0x0BC6}}, + {"TAMIL SYLLABLE LLLEE", 2, {0x0BB4, 0x0BC7}}, + {"TAMIL SYLLABLE LLLI", 2, {0x0BB4, 0x0BBF}}, + {"TAMIL SYLLABLE LLLII", 2, {0x0BB4, 0x0BC0}}, + {"TAMIL SYLLABLE LLLO", 2, {0x0BB4, 0x0BCA}}, + {"TAMIL SYLLABLE LLLOO", 2, {0x0BB4, 0x0BCB}}, + {"TAMIL SYLLABLE LLLU", 2, {0x0BB4, 0x0BC1}}, + {"TAMIL SYLLABLE LLLUU", 2, {0x0BB4, 0x0BC2}}, + {"TAMIL SYLLABLE LLO", 2, {0x0BB3, 0x0BCA}}, + {"TAMIL SYLLABLE LLOO", 2, {0x0BB3, 0x0BCB}}, + {"TAMIL SYLLABLE LLU", 2, {0x0BB3, 0x0BC1}}, + {"TAMIL SYLLABLE LLUU", 2, {0x0BB3, 0x0BC2}}, + {"TAMIL SYLLABLE LO", 2, {0x0BB2, 0x0BCA}}, + {"TAMIL SYLLABLE LOO", 2, {0x0BB2, 0x0BCB}}, + {"TAMIL SYLLABLE LU", 2, {0x0BB2, 0x0BC1}}, + {"TAMIL SYLLABLE LUU", 2, {0x0BB2, 0x0BC2}}, + {"TAMIL SYLLABLE MAA", 2, {0x0BAE, 0x0BBE}}, + {"TAMIL SYLLABLE MAI", 2, {0x0BAE, 0x0BC8}}, + {"TAMIL SYLLABLE MAU", 2, {0x0BAE, 0x0BCC}}, + {"TAMIL SYLLABLE ME", 2, {0x0BAE, 0x0BC6}}, + {"TAMIL SYLLABLE MEE", 2, {0x0BAE, 0x0BC7}}, + {"TAMIL SYLLABLE MI", 2, {0x0BAE, 0x0BBF}}, + {"TAMIL SYLLABLE MII", 2, {0x0BAE, 0x0BC0}}, + {"TAMIL SYLLABLE MO", 2, {0x0BAE, 0x0BCA}}, + {"TAMIL SYLLABLE MOO", 2, {0x0BAE, 0x0BCB}}, + {"TAMIL SYLLABLE MU", 2, {0x0BAE, 0x0BC1}}, + {"TAMIL SYLLABLE MUU", 2, {0x0BAE, 0x0BC2}}, + {"TAMIL SYLLABLE NAA", 2, {0x0BA8, 0x0BBE}}, + {"TAMIL SYLLABLE NAI", 2, {0x0BA8, 0x0BC8}}, + {"TAMIL SYLLABLE NAU", 2, {0x0BA8, 0x0BCC}}, + {"TAMIL SYLLABLE NE", 2, {0x0BA8, 0x0BC6}}, + {"TAMIL SYLLABLE NEE", 2, {0x0BA8, 0x0BC7}}, + {"TAMIL SYLLABLE NGAA", 2, {0x0B99, 0x0BBE}}, + {"TAMIL SYLLABLE NGAI", 2, {0x0B99, 0x0BC8}}, + {"TAMIL SYLLABLE NGAU", 2, {0x0B99, 0x0BCC}}, + {"TAMIL SYLLABLE NGE", 2, {0x0B99, 0x0BC6}}, + {"TAMIL SYLLABLE NGEE", 2, {0x0B99, 0x0BC7}}, + {"TAMIL SYLLABLE NGI", 2, {0x0B99, 0x0BBF}}, + {"TAMIL SYLLABLE NGII", 2, {0x0B99, 0x0BC0}}, + {"TAMIL SYLLABLE NGO", 2, {0x0B99, 0x0BCA}}, + {"TAMIL SYLLABLE NGOO", 2, {0x0B99, 0x0BCB}}, + {"TAMIL SYLLABLE NGU", 2, {0x0B99, 
0x0BC1}}, + {"TAMIL SYLLABLE NGUU", 2, {0x0B99, 0x0BC2}}, + {"TAMIL SYLLABLE NI", 2, {0x0BA8, 0x0BBF}}, + {"TAMIL SYLLABLE NII", 2, {0x0BA8, 0x0BC0}}, + {"TAMIL SYLLABLE NNAA", 2, {0x0BA3, 0x0BBE}}, + {"TAMIL SYLLABLE NNAI", 2, {0x0BA3, 0x0BC8}}, + {"TAMIL SYLLABLE NNAU", 2, {0x0BA3, 0x0BCC}}, + {"TAMIL SYLLABLE NNE", 2, {0x0BA3, 0x0BC6}}, + {"TAMIL SYLLABLE NNEE", 2, {0x0BA3, 0x0BC7}}, + {"TAMIL SYLLABLE NNI", 2, {0x0BA3, 0x0BBF}}, + {"TAMIL SYLLABLE NNII", 2, {0x0BA3, 0x0BC0}}, + {"TAMIL SYLLABLE NNNAA", 2, {0x0BA9, 0x0BBE}}, + {"TAMIL SYLLABLE NNNAI", 2, {0x0BA9, 0x0BC8}}, + {"TAMIL SYLLABLE NNNAU", 2, {0x0BA9, 0x0BCC}}, + {"TAMIL SYLLABLE NNNE", 2, {0x0BA9, 0x0BC6}}, + {"TAMIL SYLLABLE NNNEE", 2, {0x0BA9, 0x0BC7}}, + {"TAMIL SYLLABLE NNNI", 2, {0x0BA9, 0x0BBF}}, + {"TAMIL SYLLABLE NNNII", 2, {0x0BA9, 0x0BC0}}, + {"TAMIL SYLLABLE NNNO", 2, {0x0BA9, 0x0BCA}}, + {"TAMIL SYLLABLE NNNOO", 2, {0x0BA9, 0x0BCB}}, + {"TAMIL SYLLABLE NNNU", 2, {0x0BA9, 0x0BC1}}, + {"TAMIL SYLLABLE NNNUU", 2, {0x0BA9, 0x0BC2}}, + {"TAMIL SYLLABLE NNO", 2, {0x0BA3, 0x0BCA}}, + {"TAMIL SYLLABLE NNOO", 2, {0x0BA3, 0x0BCB}}, + {"TAMIL SYLLABLE NNU", 2, {0x0BA3, 0x0BC1}}, + {"TAMIL SYLLABLE NNUU", 2, {0x0BA3, 0x0BC2}}, + {"TAMIL SYLLABLE NO", 2, {0x0BA8, 0x0BCA}}, + {"TAMIL SYLLABLE NOO", 2, {0x0BA8, 0x0BCB}}, + {"TAMIL SYLLABLE NU", 2, {0x0BA8, 0x0BC1}}, + {"TAMIL SYLLABLE NUU", 2, {0x0BA8, 0x0BC2}}, + {"TAMIL SYLLABLE NYAA", 2, {0x0B9E, 0x0BBE}}, + {"TAMIL SYLLABLE NYAI", 2, {0x0B9E, 0x0BC8}}, + {"TAMIL SYLLABLE NYAU", 2, {0x0B9E, 0x0BCC}}, + {"TAMIL SYLLABLE NYE", 2, {0x0B9E, 0x0BC6}}, + {"TAMIL SYLLABLE NYEE", 2, {0x0B9E, 0x0BC7}}, + {"TAMIL SYLLABLE NYI", 2, {0x0B9E, 0x0BBF}}, + {"TAMIL SYLLABLE NYII", 2, {0x0B9E, 0x0BC0}}, + {"TAMIL SYLLABLE NYO", 2, {0x0B9E, 0x0BCA}}, + {"TAMIL SYLLABLE NYOO", 2, {0x0B9E, 0x0BCB}}, + {"TAMIL SYLLABLE NYU", 2, {0x0B9E, 0x0BC1}}, + {"TAMIL SYLLABLE NYUU", 2, {0x0B9E, 0x0BC2}}, + {"TAMIL SYLLABLE PAA", 2, {0x0BAA, 0x0BBE}}, + {"TAMIL SYLLABLE PAI", 2, 
{0x0BAA, 0x0BC8}}, + {"TAMIL SYLLABLE PAU", 2, {0x0BAA, 0x0BCC}}, + {"TAMIL SYLLABLE PE", 2, {0x0BAA, 0x0BC6}}, + {"TAMIL SYLLABLE PEE", 2, {0x0BAA, 0x0BC7}}, + {"TAMIL SYLLABLE PI", 2, {0x0BAA, 0x0BBF}}, + {"TAMIL SYLLABLE PII", 2, {0x0BAA, 0x0BC0}}, + {"TAMIL SYLLABLE PO", 2, {0x0BAA, 0x0BCA}}, + {"TAMIL SYLLABLE POO", 2, {0x0BAA, 0x0BCB}}, + {"TAMIL SYLLABLE PU", 2, {0x0BAA, 0x0BC1}}, + {"TAMIL SYLLABLE PUU", 2, {0x0BAA, 0x0BC2}}, + {"TAMIL SYLLABLE RAA", 2, {0x0BB0, 0x0BBE}}, + {"TAMIL SYLLABLE RAI", 2, {0x0BB0, 0x0BC8}}, + {"TAMIL SYLLABLE RAU", 2, {0x0BB0, 0x0BCC}}, + {"TAMIL SYLLABLE RE", 2, {0x0BB0, 0x0BC6}}, + {"TAMIL SYLLABLE REE", 2, {0x0BB0, 0x0BC7}}, + {"TAMIL SYLLABLE RI", 2, {0x0BB0, 0x0BBF}}, + {"TAMIL SYLLABLE RII", 2, {0x0BB0, 0x0BC0}}, + {"TAMIL SYLLABLE RO", 2, {0x0BB0, 0x0BCA}}, + {"TAMIL SYLLABLE ROO", 2, {0x0BB0, 0x0BCB}}, + {"TAMIL SYLLABLE RRAA", 2, {0x0BB1, 0x0BBE}}, + {"TAMIL SYLLABLE RRAI", 2, {0x0BB1, 0x0BC8}}, + {"TAMIL SYLLABLE RRAU", 2, {0x0BB1, 0x0BCC}}, + {"TAMIL SYLLABLE RRE", 2, {0x0BB1, 0x0BC6}}, + {"TAMIL SYLLABLE RREE", 2, {0x0BB1, 0x0BC7}}, + {"TAMIL SYLLABLE RRI", 2, {0x0BB1, 0x0BBF}}, + {"TAMIL SYLLABLE RRII", 2, {0x0BB1, 0x0BC0}}, + {"TAMIL SYLLABLE RRO", 2, {0x0BB1, 0x0BCA}}, + {"TAMIL SYLLABLE RROO", 2, {0x0BB1, 0x0BCB}}, + {"TAMIL SYLLABLE RRU", 2, {0x0BB1, 0x0BC1}}, + {"TAMIL SYLLABLE RRUU", 2, {0x0BB1, 0x0BC2}}, + {"TAMIL SYLLABLE RU", 2, {0x0BB0, 0x0BC1}}, + {"TAMIL SYLLABLE RUU", 2, {0x0BB0, 0x0BC2}}, + {"TAMIL SYLLABLE SAA", 2, {0x0BB8, 0x0BBE}}, + {"TAMIL SYLLABLE SAI", 2, {0x0BB8, 0x0BC8}}, + {"TAMIL SYLLABLE SAU", 2, {0x0BB8, 0x0BCC}}, + {"TAMIL SYLLABLE SE", 2, {0x0BB8, 0x0BC6}}, + {"TAMIL SYLLABLE SEE", 2, {0x0BB8, 0x0BC7}}, + {"TAMIL SYLLABLE SHAA", 2, {0x0BB6, 0x0BBE}}, + {"TAMIL SYLLABLE SHAI", 2, {0x0BB6, 0x0BC8}}, + {"TAMIL SYLLABLE SHAU", 2, {0x0BB6, 0x0BCC}}, + {"TAMIL SYLLABLE SHE", 2, {0x0BB6, 0x0BC6}}, + {"TAMIL SYLLABLE SHEE", 2, {0x0BB6, 0x0BC7}}, + {"TAMIL SYLLABLE SHI", 2, {0x0BB6, 0x0BBF}}, + 
{"TAMIL SYLLABLE SHII", 2, {0x0BB6, 0x0BC0}}, + {"TAMIL SYLLABLE SHO", 2, {0x0BB6, 0x0BCA}}, + {"TAMIL SYLLABLE SHOO", 2, {0x0BB6, 0x0BCB}}, + {"TAMIL SYLLABLE SHRII", 4, {0x0BB6, 0x0BCD, 0x0BB0, 0x0BC0}}, + {"TAMIL SYLLABLE SHU", 2, {0x0BB6, 0x0BC1}}, + {"TAMIL SYLLABLE SHUU", 2, {0x0BB6, 0x0BC2}}, + {"TAMIL SYLLABLE SI", 2, {0x0BB8, 0x0BBF}}, + {"TAMIL SYLLABLE SII", 2, {0x0BB8, 0x0BC0}}, + {"TAMIL SYLLABLE SO", 2, {0x0BB8, 0x0BCA}}, + {"TAMIL SYLLABLE SOO", 2, {0x0BB8, 0x0BCB}}, + {"TAMIL SYLLABLE SSAA", 2, {0x0BB7, 0x0BBE}}, + {"TAMIL SYLLABLE SSAI", 2, {0x0BB7, 0x0BC8}}, + {"TAMIL SYLLABLE SSAU", 2, {0x0BB7, 0x0BCC}}, + {"TAMIL SYLLABLE SSE", 2, {0x0BB7, 0x0BC6}}, + {"TAMIL SYLLABLE SSEE", 2, {0x0BB7, 0x0BC7}}, + {"TAMIL SYLLABLE SSI", 2, {0x0BB7, 0x0BBF}}, + {"TAMIL SYLLABLE SSII", 2, {0x0BB7, 0x0BC0}}, + {"TAMIL SYLLABLE SSO", 2, {0x0BB7, 0x0BCA}}, + {"TAMIL SYLLABLE SSOO", 2, {0x0BB7, 0x0BCB}}, + {"TAMIL SYLLABLE SSU", 2, {0x0BB7, 0x0BC1}}, + {"TAMIL SYLLABLE SSUU", 2, {0x0BB7, 0x0BC2}}, + {"TAMIL SYLLABLE SU", 2, {0x0BB8, 0x0BC1}}, + {"TAMIL SYLLABLE SUU", 2, {0x0BB8, 0x0BC2}}, + {"TAMIL SYLLABLE TAA", 2, {0x0BA4, 0x0BBE}}, + {"TAMIL SYLLABLE TAI", 2, {0x0BA4, 0x0BC8}}, + {"TAMIL SYLLABLE TAU", 2, {0x0BA4, 0x0BCC}}, + {"TAMIL SYLLABLE TE", 2, {0x0BA4, 0x0BC6}}, + {"TAMIL SYLLABLE TEE", 2, {0x0BA4, 0x0BC7}}, + {"TAMIL SYLLABLE TI", 2, {0x0BA4, 0x0BBF}}, + {"TAMIL SYLLABLE TII", 2, {0x0BA4, 0x0BC0}}, + {"TAMIL SYLLABLE TO", 2, {0x0BA4, 0x0BCA}}, + {"TAMIL SYLLABLE TOO", 2, {0x0BA4, 0x0BCB}}, + {"TAMIL SYLLABLE TTAA", 2, {0x0B9F, 0x0BBE}}, + {"TAMIL SYLLABLE TTAI", 2, {0x0B9F, 0x0BC8}}, + {"TAMIL SYLLABLE TTAU", 2, {0x0B9F, 0x0BCC}}, + {"TAMIL SYLLABLE TTE", 2, {0x0B9F, 0x0BC6}}, + {"TAMIL SYLLABLE TTEE", 2, {0x0B9F, 0x0BC7}}, + {"TAMIL SYLLABLE TTI", 2, {0x0B9F, 0x0BBF}}, + {"TAMIL SYLLABLE TTII", 2, {0x0B9F, 0x0BC0}}, + {"TAMIL SYLLABLE TTO", 2, {0x0B9F, 0x0BCA}}, + {"TAMIL SYLLABLE TTOO", 2, {0x0B9F, 0x0BCB}}, + {"TAMIL SYLLABLE TTU", 2, {0x0B9F, 0x0BC1}}, 
+ {"TAMIL SYLLABLE TTUU", 2, {0x0B9F, 0x0BC2}}, + {"TAMIL SYLLABLE TU", 2, {0x0BA4, 0x0BC1}}, + {"TAMIL SYLLABLE TUU", 2, {0x0BA4, 0x0BC2}}, + {"TAMIL SYLLABLE VAA", 2, {0x0BB5, 0x0BBE}}, + {"TAMIL SYLLABLE VAI", 2, {0x0BB5, 0x0BC8}}, + {"TAMIL SYLLABLE VAU", 2, {0x0BB5, 0x0BCC}}, + {"TAMIL SYLLABLE VE", 2, {0x0BB5, 0x0BC6}}, + {"TAMIL SYLLABLE VEE", 2, {0x0BB5, 0x0BC7}}, + {"TAMIL SYLLABLE VI", 2, {0x0BB5, 0x0BBF}}, + {"TAMIL SYLLABLE VII", 2, {0x0BB5, 0x0BC0}}, + {"TAMIL SYLLABLE VO", 2, {0x0BB5, 0x0BCA}}, + {"TAMIL SYLLABLE VOO", 2, {0x0BB5, 0x0BCB}}, + {"TAMIL SYLLABLE VU", 2, {0x0BB5, 0x0BC1}}, + {"TAMIL SYLLABLE VUU", 2, {0x0BB5, 0x0BC2}}, + {"TAMIL SYLLABLE YAA", 2, {0x0BAF, 0x0BBE}}, + {"TAMIL SYLLABLE YAI", 2, {0x0BAF, 0x0BC8}}, + {"TAMIL SYLLABLE YAU", 2, {0x0BAF, 0x0BCC}}, + {"TAMIL SYLLABLE YE", 2, {0x0BAF, 0x0BC6}}, + {"TAMIL SYLLABLE YEE", 2, {0x0BAF, 0x0BC7}}, + {"TAMIL SYLLABLE YI", 2, {0x0BAF, 0x0BBF}}, + {"TAMIL SYLLABLE YII", 2, {0x0BAF, 0x0BC0}}, + {"TAMIL SYLLABLE YO", 2, {0x0BAF, 0x0BCA}}, + {"TAMIL SYLLABLE YOO", 2, {0x0BAF, 0x0BCB}}, + {"TAMIL SYLLABLE YU", 2, {0x0BAF, 0x0BC1}}, + {"TAMIL SYLLABLE YUU", 2, {0x0BAF, 0x0BC2}}, +}; diff --git a/Tools/unicode/makeunicodedata.py b/Tools/unicode/makeunicodedata.py --- a/Tools/unicode/makeunicodedata.py +++ b/Tools/unicode/makeunicodedata.py @@ -25,7 +25,12 @@ # written by Fredrik Lundh (fredrik at pythonware.com) # -import sys, os, zipfile +import os +import sys +import zipfile + +from textwrap import dedent +from operator import itemgetter SCRIPT = sys.argv[0] VERSION = "3.2" @@ -39,6 +44,8 @@ DERIVED_CORE_PROPERTIES = "DerivedCoreProperties%s.txt" DERIVEDNORMALIZATION_PROPS = "DerivedNormalizationProps%s.txt" LINE_BREAK = "LineBreak%s.txt" +NAME_ALIASES = "NameAliases%s.txt" +NAMED_SEQUENCES = "NamedSequences%s.txt" old_versions = ["3.2.0"] @@ -692,6 +699,40 @@ print("/* name->code dictionary */", file=fp) codehash.dump(fp, trace) + print(dedent(""" + typedef struct Alias { + char *name; + int 
namelen; + int codepoint; + } alias; + """), file=fp) + + print('static const int aliases_count = %d;' % len(unicode.aliases), file=fp) + + print('static const alias name_aliases[] = {', file=fp) + for name, codepoint in unicode.aliases: + print(' {"%s", %d, 0x%04X},' % (name, len(name), codepoint), file=fp) + print('};', file=fp) + + # the Py_UCS2 seq[4] should use Py_UCS4 if non-BMP chars are added to the + # sequences and have an higher number of elements if the sequences get longer + print(dedent(""" + typedef struct NamedSequence { + char *name; + int seqlen; + Py_UCS2 seq[4]; + } named_sequence; + """), file=fp) + + print('static const int named_sequences_count = %d;' % len(unicode.named_sequences), + file=fp) + + print('static const named_sequence named_sequences[] = {', file=fp) + for name, sequence in unicode.named_sequences: + seq_str = ', '.join('0x%04X' % cp for cp in sequence) + print(' {"%s", %d, {%s}},' % (name, len(sequence), seq_str), file=fp) + print('};', file=fp) + fp.close() @@ -855,6 +896,31 @@ self.table = table self.chars = list(range(0x110000)) # unicode 3.2 + self.aliases = [] + with open_data(NAME_ALIASES, version) as file: + for s in file: + s = s.strip() + if not s or s.startswith('#'): + continue + char, name = s.split(';') + char = int(char, 16) + self.aliases.append((name, char)) + + self.named_sequences = [] + with open_data(NAMED_SEQUENCES, version) as file: + for s in file: + s = s.strip() + if not s or s.startswith('#'): + continue + name, chars = s.split(';') + chars = tuple(int(char, 16) for char in chars.split()) + # check that the structure defined in makeunicodename is OK + assert 2 <= len(chars) <= 4, "change the Py_UCS2 array size" + assert all(c <= 0xFFFF for c in chars), "use Py_UCS4 instead" + self.named_sequences.append((name, chars)) + # sort names to enable binary search + self.named_sequences.sort(key=itemgetter(0)) + self.exclusions = {} with open_data(COMPOSITION_EXCLUSIONS, version) as file: for s in file: From 
report at bugs.python.org Thu Oct 6 12:50:53 2011 From: report at bugs.python.org (Ezio Melotti) Date: Thu, 06 Oct 2011 10:50:53 +0000 Subject: [issue2771] Test issue In-Reply-To: Message-ID: Ezio Melotti added the comment: test attachments ---------- Added file: http://bugs.python.org/file23325/unnamed Added file: http://bugs.python.org/file23326/issue12753-3.diff _______________________________________ Python tracker _______________________________________


-------------- next part -------------- diff --git a/Doc/library/unicodedata.rst b/Doc/library/unicodedata.rst --- a/Doc/library/unicodedata.rst +++ b/Doc/library/unicodedata.rst @@ -29,6 +29,9 @@ Look up character by name. If a character with the given name is found, return the corresponding character. If not found, :exc:`KeyError` is raised. + .. versionchanged:: 3.3 + Support for name aliases [#]_ and named sequences [#]_ has been added. + .. function:: name(chr[, default]) @@ -160,3 +163,9 @@ >>> unicodedata.bidirectional('\u0660') # 'A'rabic, 'N'umber 'AN' + +.. rubric:: Footnotes + +.. [#] http://www.unicode.org/Public/6.0.0/ucd/NameAliases.txt + +.. [#] http://www.unicode.org/Public/6.0.0/ucd/NamedSequences.txt diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -492,13 +492,13 @@ +-----------------+---------------------------------+-------+ | Escape Sequence | Meaning | Notes | +=================+=================================+=======+ -| ``\N{name}`` | Character named *name* in the | | +| ``\N{name}`` | Character named *name* in the | \(4) | | | Unicode database | | +-----------------+---------------------------------+-------+ -| ``\uxxxx`` | Character with 16-bit hex value | \(4) | +| ``\uxxxx`` | Character with 16-bit hex value | \(5) | | | *xxxx* | | +-----------------+---------------------------------+-------+ -| ``\Uxxxxxxxx`` | Character with 32-bit hex value | \(5) | +| ``\Uxxxxxxxx`` | Character with 32-bit hex value | \(6) | | | *xxxxxxxx* | | +-----------------+---------------------------------+-------+ @@ -516,10 +516,14 @@ with the given value. (4) + .. versionchanged:: 3.3 + Support for name aliases [#]_ has been added. + +(5) Individual code units which form parts of a surrogate pair can be encoded using this escape sequence. Exactly four hex digits are required. 
-(5) +(6) Any Unicode character can be encoded this way, but characters outside the Basic Multilingual Plane (BMP) will be encoded using a surrogate pair if Python is compiled to use 16-bit code units (the default). Exactly eight hex digits @@ -706,3 +710,8 @@ occurrence outside string literals and comments is an unconditional error:: $ ? ` + + +.. rubric:: Footnotes + +.. [#] http://www.unicode.org/Public/6.0.0/ucd/NameAliases.txt diff --git a/Lib/test/test_ucn.py b/Lib/test/test_ucn.py --- a/Lib/test/test_ucn.py +++ b/Lib/test/test_ucn.py @@ -8,8 +8,11 @@ """#" import unittest +import unicodedata from test import support +from http.client import HTTPException +from test.test_normalization import check_version class UnicodeNamesTest(unittest.TestCase): @@ -59,8 +62,6 @@ ) def test_ascii_letters(self): - import unicodedata - for char in "".join(map(chr, range(ord("a"), ord("z")))): name = "LATIN SMALL LETTER %s" % char.upper() code = unicodedata.lookup(name) @@ -81,7 +82,6 @@ self.checkletter("HANGUL SYLLABLE HWEOK", "\ud6f8") self.checkletter("HANGUL SYLLABLE HIH", "\ud7a3") - import unicodedata self.assertRaises(ValueError, unicodedata.name, "\ud7a4") def test_cjk_unified_ideographs(self): @@ -97,14 +97,11 @@ self.checkletter("CJK UNIFIED IDEOGRAPH-2B81D", "\U0002B81D") def test_bmp_characters(self): - import unicodedata - count = 0 for code in range(0x10000): char = chr(code) name = unicodedata.name(char, None) if name is not None: self.assertEqual(unicodedata.lookup(name), char) - count += 1 def test_misc_symbols(self): self.checkletter("PILCROW SIGN", "\u00b6") @@ -112,8 +109,65 @@ self.checkletter("HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK", "\uFF9F") self.checkletter("FULLWIDTH LATIN SMALL LETTER A", "\uFF41") + def test_aliases(self): + # Check that the aliases defined in the NameAliases.txt file work. + # This should be updated when new aliases are added or the file + # should be downloaded and parsed instead. See #12753. 
+ aliases = [ + ('LATIN CAPITAL LETTER GHA', 0x01A2), + ('LATIN SMALL LETTER GHA', 0x01A3), + ('KANNADA LETTER LLLA', 0x0CDE), + ('LAO LETTER FO FON', 0x0E9D), + ('LAO LETTER FO FAY', 0x0E9F), + ('LAO LETTER RO', 0x0EA3), + ('LAO LETTER LO', 0x0EA5), + ('TIBETAN MARK BKA- SHOG GI MGO RGYAN', 0x0FD0), + ('YI SYLLABLE ITERATION MARK', 0xA015), + ('PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRACKET', 0xFE18), + ('BYZANTINE MUSICAL SYMBOL FTHORA SKLIRON CHROMA VASIS', 0x1D0C5) + ] + for alias, codepoint in aliases: + self.checkletter(alias, chr(codepoint)) + name = unicodedata.name(chr(codepoint)) + self.assertNotEqual(name, alias) + self.assertEqual(unicodedata.lookup(alias), + unicodedata.lookup(name)) + + def test_named_sequences_sample(self): + # Check a few named sequences. See #12753. + sequences = [ + ('LATIN SMALL LETTER R WITH TILDE', '\u0072\u0303'), + ('TAMIL SYLLABLE SAI', '\u0BB8\u0BC8'), + ('TAMIL SYLLABLE MOO', '\u0BAE\u0BCB'), + ('TAMIL SYLLABLE NNOO', '\u0BA3\u0BCB'), + ('TAMIL CONSONANT KSS', '\u0B95\u0BCD\u0BB7\u0BCD'), + ] + for seqname, codepoints in sequences: + self.assertEqual(unicodedata.lookup(seqname), codepoints) + with self.assertRaises(SyntaxError): + self.checkletter(seqname, None) + + def test_named_sequences_full(self): + # Check all the named sequences + url = ("http://www.unicode.org/Public/%s/ucd/NamedSequences.txt" % + unicodedata.unidata_version) + try: + testdata = support.open_urlresource(url, encoding="utf-8", + check=check_version) + except (IOError, HTTPException): + self.skipTest("Could not retrieve " + url) + self.addCleanup(testdata.close) + for line in testdata: + line = line.strip() + if not line or line.startswith('#'): + continue + seqname, codepoints = line.split(';') + codepoints = ''.join(chr(int(cp, 16)) for cp in codepoints.split()) + self.assertEqual(unicodedata.lookup(seqname), codepoints) + with self.assertRaises(SyntaxError): + self.checkletter(seqname, None) + def test_errors(self): - import 
unicodedata self.assertRaises(TypeError, unicodedata.name) self.assertRaises(TypeError, unicodedata.name, 'xx') self.assertRaises(TypeError, unicodedata.lookup) diff --git a/Modules/unicodedata.c b/Modules/unicodedata.c --- a/Modules/unicodedata.c +++ b/Modules/unicodedata.c @@ -1054,7 +1054,7 @@ static int _getcode(PyObject* self, const char* name, int namelen, Py_UCS4* code) { - unsigned int h, v; + unsigned int h, v, k; unsigned int mask = code_size-1; unsigned int i, incr; @@ -1100,6 +1100,17 @@ return 1; } + /* check for aliases defined in NameAliases.txt */ + for (k=0; k 0) + low = mid + 1; + else + return PyUnicode_FromKindAndData(PyUnicode_2BYTE_KIND, + named_sequences[mid].seq, + named_sequences[mid].seqlen); + } + return NULL; +} + PyDoc_STRVAR(unicodedata_lookup__doc__, "lookup(name)\n\ \n\ @@ -1187,6 +1218,7 @@ unicodedata_lookup(PyObject* self, PyObject* args) { Py_UCS4 code; + PyObject *codes; /* for named sequences */ char* name; int namelen; @@ -1194,9 +1226,13 @@ return NULL; if (!_getcode(self, name, namelen, &code)) { - PyErr_Format(PyExc_KeyError, "undefined character name '%s'", - name); - return NULL; + /* if the normal lookup fails try with named sequences */ + codes = _lookup_named_sequences(name); + if (codes == NULL) { + PyErr_Format(PyExc_KeyError, "undefined character name '%s'", name); + return NULL; + } + return codes; } return PyUnicode_FromOrdinal(code); diff --git a/Modules/unicodename_db.h b/Modules/unicodename_db.h --- a/Modules/unicodename_db.h +++ b/Modules/unicodename_db.h @@ -18811,3 +18811,452 @@ #define code_magic 47 #define code_size 32768 #define code_poly 32771 + +typedef struct Alias { + char *name; + int namelen; + int codepoint; +} alias; + +static const int aliases_count = 11; +static const alias name_aliases[] = { + {"LATIN CAPITAL LETTER GHA", 24, 0x01A2}, + {"LATIN SMALL LETTER GHA", 22, 0x01A3}, + {"KANNADA LETTER LLLA", 19, 0x0CDE}, + {"LAO LETTER FO FON", 17, 0x0E9D}, + {"LAO LETTER FO FAY", 17, 0x0E9F}, + {"LAO 
LETTER RO", 13, 0x0EA3}, + {"LAO LETTER LO", 13, 0x0EA5}, + {"TIBETAN MARK BKA- SHOG GI MGO RGYAN", 35, 0x0FD0}, + {"YI SYLLABLE ITERATION MARK", 26, 0xA015}, + {"PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRACKET", 61, 0xFE18}, + {"BYZANTINE MUSICAL SYMBOL FTHORA SKLIRON CHROMA VASIS", 52, 0x1D0C5}, +}; + +typedef struct NamedSequence { + char *name; + int seqlen; + Py_UCS2 seq[4]; +} named_sequence; + +static const int named_sequences_count = 418; +static const named_sequence named_sequences[] = { + {"BENGALI LETTER KHINYA", 3, {0x0995, 0x09CD, 0x09B7}}, + {"GEORGIAN LETTER U-BRJGU", 2, {0x10E3, 0x0302}}, + {"HIRAGANA LETTER BIDAKUON NGA", 2, {0x304B, 0x309A}}, + {"HIRAGANA LETTER BIDAKUON NGE", 2, {0x3051, 0x309A}}, + {"HIRAGANA LETTER BIDAKUON NGI", 2, {0x304D, 0x309A}}, + {"HIRAGANA LETTER BIDAKUON NGO", 2, {0x3053, 0x309A}}, + {"HIRAGANA LETTER BIDAKUON NGU", 2, {0x304F, 0x309A}}, + {"KATAKANA LETTER AINU CE", 2, {0x30BB, 0x309A}}, + {"KATAKANA LETTER AINU P", 2, {0x31F7, 0x309A}}, + {"KATAKANA LETTER AINU TO", 2, {0x30C8, 0x309A}}, + {"KATAKANA LETTER AINU TU", 2, {0x30C4, 0x309A}}, + {"KATAKANA LETTER BIDAKUON NGA", 2, {0x30AB, 0x309A}}, + {"KATAKANA LETTER BIDAKUON NGE", 2, {0x30B1, 0x309A}}, + {"KATAKANA LETTER BIDAKUON NGI", 2, {0x30AD, 0x309A}}, + {"KATAKANA LETTER BIDAKUON NGO", 2, {0x30B3, 0x309A}}, + {"KATAKANA LETTER BIDAKUON NGU", 2, {0x30AF, 0x309A}}, + {"KHMER CONSONANT SIGN COENG BA", 2, {0x17D2, 0x1794}}, + {"KHMER CONSONANT SIGN COENG CA", 2, {0x17D2, 0x1785}}, + {"KHMER CONSONANT SIGN COENG CHA", 2, {0x17D2, 0x1786}}, + {"KHMER CONSONANT SIGN COENG CHO", 2, {0x17D2, 0x1788}}, + {"KHMER CONSONANT SIGN COENG CO", 2, {0x17D2, 0x1787}}, + {"KHMER CONSONANT SIGN COENG DA", 2, {0x17D2, 0x178A}}, + {"KHMER CONSONANT SIGN COENG DO", 2, {0x17D2, 0x178C}}, + {"KHMER CONSONANT SIGN COENG HA", 2, {0x17D2, 0x17A0}}, + {"KHMER CONSONANT SIGN COENG KA", 2, {0x17D2, 0x1780}}, + {"KHMER CONSONANT SIGN COENG KHA", 2, {0x17D2, 0x1781}}, + {"KHMER 
CONSONANT SIGN COENG KHO", 2, {0x17D2, 0x1783}}, + {"KHMER CONSONANT SIGN COENG KO", 2, {0x17D2, 0x1782}}, + {"KHMER CONSONANT SIGN COENG LA", 2, {0x17D2, 0x17A1}}, + {"KHMER CONSONANT SIGN COENG LO", 2, {0x17D2, 0x179B}}, + {"KHMER CONSONANT SIGN COENG MO", 2, {0x17D2, 0x1798}}, + {"KHMER CONSONANT SIGN COENG NA", 2, {0x17D2, 0x178E}}, + {"KHMER CONSONANT SIGN COENG NGO", 2, {0x17D2, 0x1784}}, + {"KHMER CONSONANT SIGN COENG NO", 2, {0x17D2, 0x1793}}, + {"KHMER CONSONANT SIGN COENG NYO", 2, {0x17D2, 0x1789}}, + {"KHMER CONSONANT SIGN COENG PHA", 2, {0x17D2, 0x1795}}, + {"KHMER CONSONANT SIGN COENG PHO", 2, {0x17D2, 0x1797}}, + {"KHMER CONSONANT SIGN COENG PO", 2, {0x17D2, 0x1796}}, + {"KHMER CONSONANT SIGN COENG RO", 2, {0x17D2, 0x179A}}, + {"KHMER CONSONANT SIGN COENG SA", 2, {0x17D2, 0x179F}}, + {"KHMER CONSONANT SIGN COENG SHA", 2, {0x17D2, 0x179D}}, + {"KHMER CONSONANT SIGN COENG SSA", 2, {0x17D2, 0x179E}}, + {"KHMER CONSONANT SIGN COENG TA", 2, {0x17D2, 0x178F}}, + {"KHMER CONSONANT SIGN COENG THA", 2, {0x17D2, 0x1790}}, + {"KHMER CONSONANT SIGN COENG THO", 2, {0x17D2, 0x1792}}, + {"KHMER CONSONANT SIGN COENG TO", 2, {0x17D2, 0x1791}}, + {"KHMER CONSONANT SIGN COENG TTHA", 2, {0x17D2, 0x178B}}, + {"KHMER CONSONANT SIGN COENG TTHO", 2, {0x17D2, 0x178D}}, + {"KHMER CONSONANT SIGN COENG VO", 2, {0x17D2, 0x179C}}, + {"KHMER CONSONANT SIGN COENG YO", 2, {0x17D2, 0x1799}}, + {"KHMER INDEPENDENT VOWEL SIGN COENG QE", 2, {0x17D2, 0x17AF}}, + {"KHMER INDEPENDENT VOWEL SIGN COENG QU", 2, {0x17D2, 0x17A7}}, + {"KHMER INDEPENDENT VOWEL SIGN COENG RY", 2, {0x17D2, 0x17AB}}, + {"KHMER INDEPENDENT VOWEL SIGN COENG RYY", 2, {0x17D2, 0x17AC}}, + {"KHMER VOWEL SIGN AAM", 2, {0x17B6, 0x17C6}}, + {"KHMER VOWEL SIGN COENG QA", 2, {0x17D2, 0x17A2}}, + {"KHMER VOWEL SIGN OM", 2, {0x17BB, 0x17C6}}, + {"LATIN CAPITAL LETTER A WITH MACRON AND GRAVE", 2, {0x0100, 0x0300}}, + {"LATIN CAPITAL LETTER A WITH OGONEK AND ACUTE", 2, {0x0104, 0x0301}}, + {"LATIN CAPITAL LETTER A WITH OGONEK AND 
TILDE", 2, {0x0104, 0x0303}}, + {"LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND CARON", 2, {0x00CA, 0x030C}}, + {"LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND MACRON", 2, {0x00CA, 0x0304}}, + {"LATIN CAPITAL LETTER E WITH DOT ABOVE AND ACUTE", 2, {0x0116, 0x0301}}, + {"LATIN CAPITAL LETTER E WITH DOT ABOVE AND TILDE", 2, {0x0116, 0x0303}}, + {"LATIN CAPITAL LETTER E WITH OGONEK AND ACUTE", 2, {0x0118, 0x0301}}, + {"LATIN CAPITAL LETTER E WITH OGONEK AND TILDE", 2, {0x0118, 0x0303}}, + {"LATIN CAPITAL LETTER E WITH VERTICAL LINE BELOW", 2, {0x0045, 0x0329}}, + {"LATIN CAPITAL LETTER E WITH VERTICAL LINE BELOW AND ACUTE", 2, {0x00C9, 0x0329}}, + {"LATIN CAPITAL LETTER E WITH VERTICAL LINE BELOW AND GRAVE", 2, {0x00C8, 0x0329}}, + {"LATIN CAPITAL LETTER I WITH MACRON AND GRAVE", 2, {0x012A, 0x0300}}, + {"LATIN CAPITAL LETTER I WITH OGONEK AND ACUTE", 2, {0x012E, 0x0301}}, + {"LATIN CAPITAL LETTER I WITH OGONEK AND TILDE", 2, {0x012E, 0x0303}}, + {"LATIN CAPITAL LETTER J WITH TILDE", 2, {0x004A, 0x0303}}, + {"LATIN CAPITAL LETTER L WITH TILDE", 2, {0x004C, 0x0303}}, + {"LATIN CAPITAL LETTER M WITH TILDE", 2, {0x004D, 0x0303}}, + {"LATIN CAPITAL LETTER O WITH VERTICAL LINE BELOW", 2, {0x004F, 0x0329}}, + {"LATIN CAPITAL LETTER O WITH VERTICAL LINE BELOW AND ACUTE", 2, {0x00D3, 0x0329}}, + {"LATIN CAPITAL LETTER O WITH VERTICAL LINE BELOW AND GRAVE", 2, {0x00D2, 0x0329}}, + {"LATIN CAPITAL LETTER R WITH TILDE", 2, {0x0052, 0x0303}}, + {"LATIN CAPITAL LETTER S WITH VERTICAL LINE BELOW", 2, {0x0053, 0x0329}}, + {"LATIN CAPITAL LETTER U WITH MACRON AND ACUTE", 2, {0x016A, 0x0301}}, + {"LATIN CAPITAL LETTER U WITH MACRON AND GRAVE", 2, {0x016A, 0x0300}}, + {"LATIN CAPITAL LETTER U WITH MACRON AND TILDE", 2, {0x016A, 0x0303}}, + {"LATIN CAPITAL LETTER U WITH OGONEK AND ACUTE", 2, {0x0172, 0x0301}}, + {"LATIN CAPITAL LETTER U WITH OGONEK AND TILDE", 2, {0x0172, 0x0303}}, + {"LATIN SMALL LETTER A WITH MACRON AND GRAVE", 2, {0x0101, 0x0300}}, + {"LATIN SMALL LETTER A WITH OGONEK AND 
ACUTE", 2, {0x0105, 0x0301}}, + {"LATIN SMALL LETTER A WITH OGONEK AND TILDE", 2, {0x0105, 0x0303}}, + {"LATIN SMALL LETTER AE WITH GRAVE", 2, {0x00E6, 0x0300}}, + {"LATIN SMALL LETTER E WITH CIRCUMFLEX AND CARON", 2, {0x00EA, 0x030C}}, + {"LATIN SMALL LETTER E WITH CIRCUMFLEX AND MACRON", 2, {0x00EA, 0x0304}}, + {"LATIN SMALL LETTER E WITH DOT ABOVE AND ACUTE", 2, {0x0117, 0x0301}}, + {"LATIN SMALL LETTER E WITH DOT ABOVE AND TILDE", 2, {0x0117, 0x0303}}, + {"LATIN SMALL LETTER E WITH OGONEK AND ACUTE", 2, {0x0119, 0x0301}}, + {"LATIN SMALL LETTER E WITH OGONEK AND TILDE", 2, {0x0119, 0x0303}}, + {"LATIN SMALL LETTER E WITH VERTICAL LINE BELOW", 2, {0x0065, 0x0329}}, + {"LATIN SMALL LETTER E WITH VERTICAL LINE BELOW AND ACUTE", 2, {0x00E9, 0x0329}}, + {"LATIN SMALL LETTER E WITH VERTICAL LINE BELOW AND GRAVE", 2, {0x00E8, 0x0329}}, + {"LATIN SMALL LETTER HOOKED SCHWA WITH ACUTE", 2, {0x025A, 0x0301}}, + {"LATIN SMALL LETTER HOOKED SCHWA WITH GRAVE", 2, {0x025A, 0x0300}}, + {"LATIN SMALL LETTER I WITH DOT ABOVE AND ACUTE", 3, {0x0069, 0x0307, 0x0301}}, + {"LATIN SMALL LETTER I WITH DOT ABOVE AND GRAVE", 3, {0x0069, 0x0307, 0x0300}}, + {"LATIN SMALL LETTER I WITH DOT ABOVE AND TILDE", 3, {0x0069, 0x0307, 0x0303}}, + {"LATIN SMALL LETTER I WITH MACRON AND GRAVE", 2, {0x012B, 0x0300}}, + {"LATIN SMALL LETTER I WITH OGONEK AND DOT ABOVE AND ACUTE", 3, {0x012F, 0x0307, 0x0301}}, + {"LATIN SMALL LETTER I WITH OGONEK AND DOT ABOVE AND TILDE", 3, {0x012F, 0x0307, 0x0303}}, + {"LATIN SMALL LETTER J WITH DOT ABOVE AND TILDE", 3, {0x006A, 0x0307, 0x0303}}, + {"LATIN SMALL LETTER L WITH TILDE", 2, {0x006C, 0x0303}}, + {"LATIN SMALL LETTER M WITH TILDE", 2, {0x006D, 0x0303}}, + {"LATIN SMALL LETTER NG WITH TILDE ABOVE", 3, {0x006E, 0x0360, 0x0067}}, + {"LATIN SMALL LETTER O WITH VERTICAL LINE BELOW", 2, {0x006F, 0x0329}}, + {"LATIN SMALL LETTER O WITH VERTICAL LINE BELOW AND ACUTE", 2, {0x00F3, 0x0329}}, + {"LATIN SMALL LETTER O WITH VERTICAL LINE BELOW AND GRAVE", 2, {0x00F2, 
0x0329}}, + {"LATIN SMALL LETTER OPEN O WITH ACUTE", 2, {0x0254, 0x0301}}, + {"LATIN SMALL LETTER OPEN O WITH GRAVE", 2, {0x0254, 0x0300}}, + {"LATIN SMALL LETTER R WITH TILDE", 2, {0x0072, 0x0303}}, + {"LATIN SMALL LETTER S WITH VERTICAL LINE BELOW", 2, {0x0073, 0x0329}}, + {"LATIN SMALL LETTER SCHWA WITH ACUTE", 2, {0x0259, 0x0301}}, + {"LATIN SMALL LETTER SCHWA WITH GRAVE", 2, {0x0259, 0x0300}}, + {"LATIN SMALL LETTER TURNED V WITH ACUTE", 2, {0x028C, 0x0301}}, + {"LATIN SMALL LETTER TURNED V WITH GRAVE", 2, {0x028C, 0x0300}}, + {"LATIN SMALL LETTER U WITH MACRON AND ACUTE", 2, {0x016B, 0x0301}}, + {"LATIN SMALL LETTER U WITH MACRON AND GRAVE", 2, {0x016B, 0x0300}}, + {"LATIN SMALL LETTER U WITH MACRON AND TILDE", 2, {0x016B, 0x0303}}, + {"LATIN SMALL LETTER U WITH OGONEK AND ACUTE", 2, {0x0173, 0x0301}}, + {"LATIN SMALL LETTER U WITH OGONEK AND TILDE", 2, {0x0173, 0x0303}}, + {"MODIFIER LETTER EXTRA-HIGH EXTRA-LOW CONTOUR TONE BAR", 2, {0x02E5, 0x02E9}}, + {"MODIFIER LETTER EXTRA-LOW EXTRA-HIGH CONTOUR TONE BAR", 2, {0x02E9, 0x02E5}}, + {"TAMIL CONSONANT C", 2, {0x0B9A, 0x0BCD}}, + {"TAMIL CONSONANT H", 2, {0x0BB9, 0x0BCD}}, + {"TAMIL CONSONANT J", 2, {0x0B9C, 0x0BCD}}, + {"TAMIL CONSONANT K", 2, {0x0B95, 0x0BCD}}, + {"TAMIL CONSONANT KSS", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BCD}}, + {"TAMIL CONSONANT L", 2, {0x0BB2, 0x0BCD}}, + {"TAMIL CONSONANT LL", 2, {0x0BB3, 0x0BCD}}, + {"TAMIL CONSONANT LLL", 2, {0x0BB4, 0x0BCD}}, + {"TAMIL CONSONANT M", 2, {0x0BAE, 0x0BCD}}, + {"TAMIL CONSONANT N", 2, {0x0BA8, 0x0BCD}}, + {"TAMIL CONSONANT NG", 2, {0x0B99, 0x0BCD}}, + {"TAMIL CONSONANT NN", 2, {0x0BA3, 0x0BCD}}, + {"TAMIL CONSONANT NNN", 2, {0x0BA9, 0x0BCD}}, + {"TAMIL CONSONANT NY", 2, {0x0B9E, 0x0BCD}}, + {"TAMIL CONSONANT P", 2, {0x0BAA, 0x0BCD}}, + {"TAMIL CONSONANT R", 2, {0x0BB0, 0x0BCD}}, + {"TAMIL CONSONANT RR", 2, {0x0BB1, 0x0BCD}}, + {"TAMIL CONSONANT S", 2, {0x0BB8, 0x0BCD}}, + {"TAMIL CONSONANT SH", 2, {0x0BB6, 0x0BCD}}, + {"TAMIL CONSONANT SS", 2, {0x0BB7, 
0x0BCD}}, + {"TAMIL CONSONANT T", 2, {0x0BA4, 0x0BCD}}, + {"TAMIL CONSONANT TT", 2, {0x0B9F, 0x0BCD}}, + {"TAMIL CONSONANT V", 2, {0x0BB5, 0x0BCD}}, + {"TAMIL CONSONANT Y", 2, {0x0BAF, 0x0BCD}}, + {"TAMIL SYLLABLE CAA", 2, {0x0B9A, 0x0BBE}}, + {"TAMIL SYLLABLE CAI", 2, {0x0B9A, 0x0BC8}}, + {"TAMIL SYLLABLE CAU", 2, {0x0B9A, 0x0BCC}}, + {"TAMIL SYLLABLE CE", 2, {0x0B9A, 0x0BC6}}, + {"TAMIL SYLLABLE CEE", 2, {0x0B9A, 0x0BC7}}, + {"TAMIL SYLLABLE CI", 2, {0x0B9A, 0x0BBF}}, + {"TAMIL SYLLABLE CII", 2, {0x0B9A, 0x0BC0}}, + {"TAMIL SYLLABLE CO", 2, {0x0B9A, 0x0BCA}}, + {"TAMIL SYLLABLE COO", 2, {0x0B9A, 0x0BCB}}, + {"TAMIL SYLLABLE CU", 2, {0x0B9A, 0x0BC1}}, + {"TAMIL SYLLABLE CUU", 2, {0x0B9A, 0x0BC2}}, + {"TAMIL SYLLABLE HAA", 2, {0x0BB9, 0x0BBE}}, + {"TAMIL SYLLABLE HAI", 2, {0x0BB9, 0x0BC8}}, + {"TAMIL SYLLABLE HAU", 2, {0x0BB9, 0x0BCC}}, + {"TAMIL SYLLABLE HE", 2, {0x0BB9, 0x0BC6}}, + {"TAMIL SYLLABLE HEE", 2, {0x0BB9, 0x0BC7}}, + {"TAMIL SYLLABLE HI", 2, {0x0BB9, 0x0BBF}}, + {"TAMIL SYLLABLE HII", 2, {0x0BB9, 0x0BC0}}, + {"TAMIL SYLLABLE HO", 2, {0x0BB9, 0x0BCA}}, + {"TAMIL SYLLABLE HOO", 2, {0x0BB9, 0x0BCB}}, + {"TAMIL SYLLABLE HU", 2, {0x0BB9, 0x0BC1}}, + {"TAMIL SYLLABLE HUU", 2, {0x0BB9, 0x0BC2}}, + {"TAMIL SYLLABLE JAA", 2, {0x0B9C, 0x0BBE}}, + {"TAMIL SYLLABLE JAI", 2, {0x0B9C, 0x0BC8}}, + {"TAMIL SYLLABLE JAU", 2, {0x0B9C, 0x0BCC}}, + {"TAMIL SYLLABLE JE", 2, {0x0B9C, 0x0BC6}}, + {"TAMIL SYLLABLE JEE", 2, {0x0B9C, 0x0BC7}}, + {"TAMIL SYLLABLE JI", 2, {0x0B9C, 0x0BBF}}, + {"TAMIL SYLLABLE JII", 2, {0x0B9C, 0x0BC0}}, + {"TAMIL SYLLABLE JO", 2, {0x0B9C, 0x0BCA}}, + {"TAMIL SYLLABLE JOO", 2, {0x0B9C, 0x0BCB}}, + {"TAMIL SYLLABLE JU", 2, {0x0B9C, 0x0BC1}}, + {"TAMIL SYLLABLE JUU", 2, {0x0B9C, 0x0BC2}}, + {"TAMIL SYLLABLE KAA", 2, {0x0B95, 0x0BBE}}, + {"TAMIL SYLLABLE KAI", 2, {0x0B95, 0x0BC8}}, + {"TAMIL SYLLABLE KAU", 2, {0x0B95, 0x0BCC}}, + {"TAMIL SYLLABLE KE", 2, {0x0B95, 0x0BC6}}, + {"TAMIL SYLLABLE KEE", 2, {0x0B95, 0x0BC7}}, + {"TAMIL SYLLABLE KI", 2, 
{0x0B95, 0x0BBF}}, + {"TAMIL SYLLABLE KII", 2, {0x0B95, 0x0BC0}}, + {"TAMIL SYLLABLE KO", 2, {0x0B95, 0x0BCA}}, + {"TAMIL SYLLABLE KOO", 2, {0x0B95, 0x0BCB}}, + {"TAMIL SYLLABLE KSSA", 3, {0x0B95, 0x0BCD, 0x0BB7}}, + {"TAMIL SYLLABLE KSSAA", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BBE}}, + {"TAMIL SYLLABLE KSSAI", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BC8}}, + {"TAMIL SYLLABLE KSSAU", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BCC}}, + {"TAMIL SYLLABLE KSSE", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BC6}}, + {"TAMIL SYLLABLE KSSEE", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BC7}}, + {"TAMIL SYLLABLE KSSI", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BBF}}, + {"TAMIL SYLLABLE KSSII", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BC0}}, + {"TAMIL SYLLABLE KSSO", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BCA}}, + {"TAMIL SYLLABLE KSSOO", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BCB}}, + {"TAMIL SYLLABLE KSSU", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BC1}}, + {"TAMIL SYLLABLE KSSUU", 4, {0x0B95, 0x0BCD, 0x0BB7, 0x0BC2}}, + {"TAMIL SYLLABLE KU", 2, {0x0B95, 0x0BC1}}, + {"TAMIL SYLLABLE KUU", 2, {0x0B95, 0x0BC2}}, + {"TAMIL SYLLABLE LAA", 2, {0x0BB2, 0x0BBE}}, + {"TAMIL SYLLABLE LAI", 2, {0x0BB2, 0x0BC8}}, + {"TAMIL SYLLABLE LAU", 2, {0x0BB2, 0x0BCC}}, + {"TAMIL SYLLABLE LE", 2, {0x0BB2, 0x0BC6}}, + {"TAMIL SYLLABLE LEE", 2, {0x0BB2, 0x0BC7}}, + {"TAMIL SYLLABLE LI", 2, {0x0BB2, 0x0BBF}}, + {"TAMIL SYLLABLE LII", 2, {0x0BB2, 0x0BC0}}, + {"TAMIL SYLLABLE LLAA", 2, {0x0BB3, 0x0BBE}}, + {"TAMIL SYLLABLE LLAI", 2, {0x0BB3, 0x0BC8}}, + {"TAMIL SYLLABLE LLAU", 2, {0x0BB3, 0x0BCC}}, + {"TAMIL SYLLABLE LLE", 2, {0x0BB3, 0x0BC6}}, + {"TAMIL SYLLABLE LLEE", 2, {0x0BB3, 0x0BC7}}, + {"TAMIL SYLLABLE LLI", 2, {0x0BB3, 0x0BBF}}, + {"TAMIL SYLLABLE LLII", 2, {0x0BB3, 0x0BC0}}, + {"TAMIL SYLLABLE LLLAA", 2, {0x0BB4, 0x0BBE}}, + {"TAMIL SYLLABLE LLLAI", 2, {0x0BB4, 0x0BC8}}, + {"TAMIL SYLLABLE LLLAU", 2, {0x0BB4, 0x0BCC}}, + {"TAMIL SYLLABLE LLLE", 2, {0x0BB4, 0x0BC6}}, + {"TAMIL SYLLABLE LLLEE", 2, {0x0BB4, 0x0BC7}}, + {"TAMIL SYLLABLE LLLI", 2, {0x0BB4, 0x0BBF}}, + {"TAMIL SYLLABLE LLLII", 
2, {0x0BB4, 0x0BC0}}, + {"TAMIL SYLLABLE LLLO", 2, {0x0BB4, 0x0BCA}}, + {"TAMIL SYLLABLE LLLOO", 2, {0x0BB4, 0x0BCB}}, + {"TAMIL SYLLABLE LLLU", 2, {0x0BB4, 0x0BC1}}, + {"TAMIL SYLLABLE LLLUU", 2, {0x0BB4, 0x0BC2}}, + {"TAMIL SYLLABLE LLO", 2, {0x0BB3, 0x0BCA}}, + {"TAMIL SYLLABLE LLOO", 2, {0x0BB3, 0x0BCB}}, + {"TAMIL SYLLABLE LLU", 2, {0x0BB3, 0x0BC1}}, + {"TAMIL SYLLABLE LLUU", 2, {0x0BB3, 0x0BC2}}, + {"TAMIL SYLLABLE LO", 2, {0x0BB2, 0x0BCA}}, + {"TAMIL SYLLABLE LOO", 2, {0x0BB2, 0x0BCB}}, + {"TAMIL SYLLABLE LU", 2, {0x0BB2, 0x0BC1}}, + {"TAMIL SYLLABLE LUU", 2, {0x0BB2, 0x0BC2}}, + {"TAMIL SYLLABLE MAA", 2, {0x0BAE, 0x0BBE}}, + {"TAMIL SYLLABLE MAI", 2, {0x0BAE, 0x0BC8}}, + {"TAMIL SYLLABLE MAU", 2, {0x0BAE, 0x0BCC}}, + {"TAMIL SYLLABLE ME", 2, {0x0BAE, 0x0BC6}}, + {"TAMIL SYLLABLE MEE", 2, {0x0BAE, 0x0BC7}}, + {"TAMIL SYLLABLE MI", 2, {0x0BAE, 0x0BBF}}, + {"TAMIL SYLLABLE MII", 2, {0x0BAE, 0x0BC0}}, + {"TAMIL SYLLABLE MO", 2, {0x0BAE, 0x0BCA}}, + {"TAMIL SYLLABLE MOO", 2, {0x0BAE, 0x0BCB}}, + {"TAMIL SYLLABLE MU", 2, {0x0BAE, 0x0BC1}}, + {"TAMIL SYLLABLE MUU", 2, {0x0BAE, 0x0BC2}}, + {"TAMIL SYLLABLE NAA", 2, {0x0BA8, 0x0BBE}}, + {"TAMIL SYLLABLE NAI", 2, {0x0BA8, 0x0BC8}}, + {"TAMIL SYLLABLE NAU", 2, {0x0BA8, 0x0BCC}}, + {"TAMIL SYLLABLE NE", 2, {0x0BA8, 0x0BC6}}, + {"TAMIL SYLLABLE NEE", 2, {0x0BA8, 0x0BC7}}, + {"TAMIL SYLLABLE NGAA", 2, {0x0B99, 0x0BBE}}, + {"TAMIL SYLLABLE NGAI", 2, {0x0B99, 0x0BC8}}, + {"TAMIL SYLLABLE NGAU", 2, {0x0B99, 0x0BCC}}, + {"TAMIL SYLLABLE NGE", 2, {0x0B99, 0x0BC6}}, + {"TAMIL SYLLABLE NGEE", 2, {0x0B99, 0x0BC7}}, + {"TAMIL SYLLABLE NGI", 2, {0x0B99, 0x0BBF}}, + {"TAMIL SYLLABLE NGII", 2, {0x0B99, 0x0BC0}}, + {"TAMIL SYLLABLE NGO", 2, {0x0B99, 0x0BCA}}, + {"TAMIL SYLLABLE NGOO", 2, {0x0B99, 0x0BCB}}, + {"TAMIL SYLLABLE NGU", 2, {0x0B99, 0x0BC1}}, + {"TAMIL SYLLABLE NGUU", 2, {0x0B99, 0x0BC2}}, + {"TAMIL SYLLABLE NI", 2, {0x0BA8, 0x0BBF}}, + {"TAMIL SYLLABLE NII", 2, {0x0BA8, 0x0BC0}}, + {"TAMIL SYLLABLE NNAA", 2, {0x0BA3, 
0x0BBE}}, + {"TAMIL SYLLABLE NNAI", 2, {0x0BA3, 0x0BC8}}, + {"TAMIL SYLLABLE NNAU", 2, {0x0BA3, 0x0BCC}}, + {"TAMIL SYLLABLE NNE", 2, {0x0BA3, 0x0BC6}}, + {"TAMIL SYLLABLE NNEE", 2, {0x0BA3, 0x0BC7}}, + {"TAMIL SYLLABLE NNI", 2, {0x0BA3, 0x0BBF}}, + {"TAMIL SYLLABLE NNII", 2, {0x0BA3, 0x0BC0}}, + {"TAMIL SYLLABLE NNNAA", 2, {0x0BA9, 0x0BBE}}, + {"TAMIL SYLLABLE NNNAI", 2, {0x0BA9, 0x0BC8}}, + {"TAMIL SYLLABLE NNNAU", 2, {0x0BA9, 0x0BCC}}, + {"TAMIL SYLLABLE NNNE", 2, {0x0BA9, 0x0BC6}}, + {"TAMIL SYLLABLE NNNEE", 2, {0x0BA9, 0x0BC7}}, + {"TAMIL SYLLABLE NNNI", 2, {0x0BA9, 0x0BBF}}, + {"TAMIL SYLLABLE NNNII", 2, {0x0BA9, 0x0BC0}}, + {"TAMIL SYLLABLE NNNO", 2, {0x0BA9, 0x0BCA}}, + {"TAMIL SYLLABLE NNNOO", 2, {0x0BA9, 0x0BCB}}, + {"TAMIL SYLLABLE NNNU", 2, {0x0BA9, 0x0BC1}}, + {"TAMIL SYLLABLE NNNUU", 2, {0x0BA9, 0x0BC2}}, + {"TAMIL SYLLABLE NNO", 2, {0x0BA3, 0x0BCA}}, + {"TAMIL SYLLABLE NNOO", 2, {0x0BA3, 0x0BCB}}, + {"TAMIL SYLLABLE NNU", 2, {0x0BA3, 0x0BC1}}, + {"TAMIL SYLLABLE NNUU", 2, {0x0BA3, 0x0BC2}}, + {"TAMIL SYLLABLE NO", 2, {0x0BA8, 0x0BCA}}, + {"TAMIL SYLLABLE NOO", 2, {0x0BA8, 0x0BCB}}, + {"TAMIL SYLLABLE NU", 2, {0x0BA8, 0x0BC1}}, + {"TAMIL SYLLABLE NUU", 2, {0x0BA8, 0x0BC2}}, + {"TAMIL SYLLABLE NYAA", 2, {0x0B9E, 0x0BBE}}, + {"TAMIL SYLLABLE NYAI", 2, {0x0B9E, 0x0BC8}}, + {"TAMIL SYLLABLE NYAU", 2, {0x0B9E, 0x0BCC}}, + {"TAMIL SYLLABLE NYE", 2, {0x0B9E, 0x0BC6}}, + {"TAMIL SYLLABLE NYEE", 2, {0x0B9E, 0x0BC7}}, + {"TAMIL SYLLABLE NYI", 2, {0x0B9E, 0x0BBF}}, + {"TAMIL SYLLABLE NYII", 2, {0x0B9E, 0x0BC0}}, + {"TAMIL SYLLABLE NYO", 2, {0x0B9E, 0x0BCA}}, + {"TAMIL SYLLABLE NYOO", 2, {0x0B9E, 0x0BCB}}, + {"TAMIL SYLLABLE NYU", 2, {0x0B9E, 0x0BC1}}, + {"TAMIL SYLLABLE NYUU", 2, {0x0B9E, 0x0BC2}}, + {"TAMIL SYLLABLE PAA", 2, {0x0BAA, 0x0BBE}}, + {"TAMIL SYLLABLE PAI", 2, {0x0BAA, 0x0BC8}}, + {"TAMIL SYLLABLE PAU", 2, {0x0BAA, 0x0BCC}}, + {"TAMIL SYLLABLE PE", 2, {0x0BAA, 0x0BC6}}, + {"TAMIL SYLLABLE PEE", 2, {0x0BAA, 0x0BC7}}, + {"TAMIL SYLLABLE PI", 2, 
{0x0BAA, 0x0BBF}}, + {"TAMIL SYLLABLE PII", 2, {0x0BAA, 0x0BC0}}, + {"TAMIL SYLLABLE PO", 2, {0x0BAA, 0x0BCA}}, + {"TAMIL SYLLABLE POO", 2, {0x0BAA, 0x0BCB}}, + {"TAMIL SYLLABLE PU", 2, {0x0BAA, 0x0BC1}}, + {"TAMIL SYLLABLE PUU", 2, {0x0BAA, 0x0BC2}}, + {"TAMIL SYLLABLE RAA", 2, {0x0BB0, 0x0BBE}}, + {"TAMIL SYLLABLE RAI", 2, {0x0BB0, 0x0BC8}}, + {"TAMIL SYLLABLE RAU", 2, {0x0BB0, 0x0BCC}}, + {"TAMIL SYLLABLE RE", 2, {0x0BB0, 0x0BC6}}, + {"TAMIL SYLLABLE REE", 2, {0x0BB0, 0x0BC7}}, + {"TAMIL SYLLABLE RI", 2, {0x0BB0, 0x0BBF}}, + {"TAMIL SYLLABLE RII", 2, {0x0BB0, 0x0BC0}}, + {"TAMIL SYLLABLE RO", 2, {0x0BB0, 0x0BCA}}, + {"TAMIL SYLLABLE ROO", 2, {0x0BB0, 0x0BCB}}, + {"TAMIL SYLLABLE RRAA", 2, {0x0BB1, 0x0BBE}}, + {"TAMIL SYLLABLE RRAI", 2, {0x0BB1, 0x0BC8}}, + {"TAMIL SYLLABLE RRAU", 2, {0x0BB1, 0x0BCC}}, + {"TAMIL SYLLABLE RRE", 2, {0x0BB1, 0x0BC6}}, + {"TAMIL SYLLABLE RREE", 2, {0x0BB1, 0x0BC7}}, + {"TAMIL SYLLABLE RRI", 2, {0x0BB1, 0x0BBF}}, + {"TAMIL SYLLABLE RRII", 2, {0x0BB1, 0x0BC0}}, + {"TAMIL SYLLABLE RRO", 2, {0x0BB1, 0x0BCA}}, + {"TAMIL SYLLABLE RROO", 2, {0x0BB1, 0x0BCB}}, + {"TAMIL SYLLABLE RRU", 2, {0x0BB1, 0x0BC1}}, + {"TAMIL SYLLABLE RRUU", 2, {0x0BB1, 0x0BC2}}, + {"TAMIL SYLLABLE RU", 2, {0x0BB0, 0x0BC1}}, + {"TAMIL SYLLABLE RUU", 2, {0x0BB0, 0x0BC2}}, + {"TAMIL SYLLABLE SAA", 2, {0x0BB8, 0x0BBE}}, + {"TAMIL SYLLABLE SAI", 2, {0x0BB8, 0x0BC8}}, + {"TAMIL SYLLABLE SAU", 2, {0x0BB8, 0x0BCC}}, + {"TAMIL SYLLABLE SE", 2, {0x0BB8, 0x0BC6}}, + {"TAMIL SYLLABLE SEE", 2, {0x0BB8, 0x0BC7}}, + {"TAMIL SYLLABLE SHAA", 2, {0x0BB6, 0x0BBE}}, + {"TAMIL SYLLABLE SHAI", 2, {0x0BB6, 0x0BC8}}, + {"TAMIL SYLLABLE SHAU", 2, {0x0BB6, 0x0BCC}}, + {"TAMIL SYLLABLE SHE", 2, {0x0BB6, 0x0BC6}}, + {"TAMIL SYLLABLE SHEE", 2, {0x0BB6, 0x0BC7}}, + {"TAMIL SYLLABLE SHI", 2, {0x0BB6, 0x0BBF}}, + {"TAMIL SYLLABLE SHII", 2, {0x0BB6, 0x0BC0}}, + {"TAMIL SYLLABLE SHO", 2, {0x0BB6, 0x0BCA}}, + {"TAMIL SYLLABLE SHOO", 2, {0x0BB6, 0x0BCB}}, + {"TAMIL SYLLABLE SHRII", 4, {0x0BB6, 0x0BCD, 
0x0BB0, 0x0BC0}}, + {"TAMIL SYLLABLE SHU", 2, {0x0BB6, 0x0BC1}}, + {"TAMIL SYLLABLE SHUU", 2, {0x0BB6, 0x0BC2}}, + {"TAMIL SYLLABLE SI", 2, {0x0BB8, 0x0BBF}}, + {"TAMIL SYLLABLE SII", 2, {0x0BB8, 0x0BC0}}, + {"TAMIL SYLLABLE SO", 2, {0x0BB8, 0x0BCA}}, + {"TAMIL SYLLABLE SOO", 2, {0x0BB8, 0x0BCB}}, + {"TAMIL SYLLABLE SSAA", 2, {0x0BB7, 0x0BBE}}, + {"TAMIL SYLLABLE SSAI", 2, {0x0BB7, 0x0BC8}}, + {"TAMIL SYLLABLE SSAU", 2, {0x0BB7, 0x0BCC}}, + {"TAMIL SYLLABLE SSE", 2, {0x0BB7, 0x0BC6}}, + {"TAMIL SYLLABLE SSEE", 2, {0x0BB7, 0x0BC7}}, + {"TAMIL SYLLABLE SSI", 2, {0x0BB7, 0x0BBF}}, + {"TAMIL SYLLABLE SSII", 2, {0x0BB7, 0x0BC0}}, + {"TAMIL SYLLABLE SSO", 2, {0x0BB7, 0x0BCA}}, + {"TAMIL SYLLABLE SSOO", 2, {0x0BB7, 0x0BCB}}, + {"TAMIL SYLLABLE SSU", 2, {0x0BB7, 0x0BC1}}, + {"TAMIL SYLLABLE SSUU", 2, {0x0BB7, 0x0BC2}}, + {"TAMIL SYLLABLE SU", 2, {0x0BB8, 0x0BC1}}, + {"TAMIL SYLLABLE SUU", 2, {0x0BB8, 0x0BC2}}, + {"TAMIL SYLLABLE TAA", 2, {0x0BA4, 0x0BBE}}, + {"TAMIL SYLLABLE TAI", 2, {0x0BA4, 0x0BC8}}, + {"TAMIL SYLLABLE TAU", 2, {0x0BA4, 0x0BCC}}, + {"TAMIL SYLLABLE TE", 2, {0x0BA4, 0x0BC6}}, + {"TAMIL SYLLABLE TEE", 2, {0x0BA4, 0x0BC7}}, + {"TAMIL SYLLABLE TI", 2, {0x0BA4, 0x0BBF}}, + {"TAMIL SYLLABLE TII", 2, {0x0BA4, 0x0BC0}}, + {"TAMIL SYLLABLE TO", 2, {0x0BA4, 0x0BCA}}, + {"TAMIL SYLLABLE TOO", 2, {0x0BA4, 0x0BCB}}, + {"TAMIL SYLLABLE TTAA", 2, {0x0B9F, 0x0BBE}}, + {"TAMIL SYLLABLE TTAI", 2, {0x0B9F, 0x0BC8}}, + {"TAMIL SYLLABLE TTAU", 2, {0x0B9F, 0x0BCC}}, + {"TAMIL SYLLABLE TTE", 2, {0x0B9F, 0x0BC6}}, + {"TAMIL SYLLABLE TTEE", 2, {0x0B9F, 0x0BC7}}, + {"TAMIL SYLLABLE TTI", 2, {0x0B9F, 0x0BBF}}, + {"TAMIL SYLLABLE TTII", 2, {0x0B9F, 0x0BC0}}, + {"TAMIL SYLLABLE TTO", 2, {0x0B9F, 0x0BCA}}, + {"TAMIL SYLLABLE TTOO", 2, {0x0B9F, 0x0BCB}}, + {"TAMIL SYLLABLE TTU", 2, {0x0B9F, 0x0BC1}}, + {"TAMIL SYLLABLE TTUU", 2, {0x0B9F, 0x0BC2}}, + {"TAMIL SYLLABLE TU", 2, {0x0BA4, 0x0BC1}}, + {"TAMIL SYLLABLE TUU", 2, {0x0BA4, 0x0BC2}}, + {"TAMIL SYLLABLE VAA", 2, {0x0BB5, 0x0BBE}}, 
+ {"TAMIL SYLLABLE VAI", 2, {0x0BB5, 0x0BC8}}, + {"TAMIL SYLLABLE VAU", 2, {0x0BB5, 0x0BCC}}, + {"TAMIL SYLLABLE VE", 2, {0x0BB5, 0x0BC6}}, + {"TAMIL SYLLABLE VEE", 2, {0x0BB5, 0x0BC7}}, + {"TAMIL SYLLABLE VI", 2, {0x0BB5, 0x0BBF}}, + {"TAMIL SYLLABLE VII", 2, {0x0BB5, 0x0BC0}}, + {"TAMIL SYLLABLE VO", 2, {0x0BB5, 0x0BCA}}, + {"TAMIL SYLLABLE VOO", 2, {0x0BB5, 0x0BCB}}, + {"TAMIL SYLLABLE VU", 2, {0x0BB5, 0x0BC1}}, + {"TAMIL SYLLABLE VUU", 2, {0x0BB5, 0x0BC2}}, + {"TAMIL SYLLABLE YAA", 2, {0x0BAF, 0x0BBE}}, + {"TAMIL SYLLABLE YAI", 2, {0x0BAF, 0x0BC8}}, + {"TAMIL SYLLABLE YAU", 2, {0x0BAF, 0x0BCC}}, + {"TAMIL SYLLABLE YE", 2, {0x0BAF, 0x0BC6}}, + {"TAMIL SYLLABLE YEE", 2, {0x0BAF, 0x0BC7}}, + {"TAMIL SYLLABLE YI", 2, {0x0BAF, 0x0BBF}}, + {"TAMIL SYLLABLE YII", 2, {0x0BAF, 0x0BC0}}, + {"TAMIL SYLLABLE YO", 2, {0x0BAF, 0x0BCA}}, + {"TAMIL SYLLABLE YOO", 2, {0x0BAF, 0x0BCB}}, + {"TAMIL SYLLABLE YU", 2, {0x0BAF, 0x0BC1}}, + {"TAMIL SYLLABLE YUU", 2, {0x0BAF, 0x0BC2}}, +}; diff --git a/Tools/unicode/makeunicodedata.py b/Tools/unicode/makeunicodedata.py --- a/Tools/unicode/makeunicodedata.py +++ b/Tools/unicode/makeunicodedata.py @@ -25,7 +25,12 @@ # written by Fredrik Lundh (fredrik at pythonware.com) # -import sys, os, zipfile +import os +import sys +import zipfile + +from textwrap import dedent +from operator import itemgetter SCRIPT = sys.argv[0] VERSION = "3.2" @@ -39,6 +44,8 @@ DERIVED_CORE_PROPERTIES = "DerivedCoreProperties%s.txt" DERIVEDNORMALIZATION_PROPS = "DerivedNormalizationProps%s.txt" LINE_BREAK = "LineBreak%s.txt" +NAME_ALIASES = "NameAliases%s.txt" +NAMED_SEQUENCES = "NamedSequences%s.txt" old_versions = ["3.2.0"] @@ -692,6 +699,40 @@ print("/* name->code dictionary */", file=fp) codehash.dump(fp, trace) + print(dedent(""" + typedef struct Alias { + char *name; + int namelen; + int codepoint; + } alias; + """), file=fp) + + print('static const int aliases_count = %d;' % len(unicode.aliases), file=fp) + + print('static const alias name_aliases[] = {', 
file=fp)
+    for name, codepoint in unicode.aliases:
+        print(' {"%s", %d, 0x%04X},' % (name, len(name), codepoint), file=fp)
+    print('};', file=fp)
+
+    # the Py_UCS2 seq[4] should use Py_UCS4 if non-BMP chars are added to the
+    # sequences and have a higher number of elements if the sequences get longer
+    print(dedent("""
+        typedef struct NamedSequence {
+            char *name;
+            int seqlen;
+            Py_UCS2 seq[4];
+        } named_sequence;
+        """), file=fp)
+
+    print('static const int named_sequences_count = %d;' % len(unicode.named_sequences),
+          file=fp)
+
+    print('static const named_sequence named_sequences[] = {', file=fp)
+    for name, sequence in unicode.named_sequences:
+        seq_str = ', '.join('0x%04X' % cp for cp in sequence)
+        print(' {"%s", %d, {%s}},' % (name, len(sequence), seq_str), file=fp)
+    print('};', file=fp)
+
     fp.close()
@@ -855,6 +896,31 @@
         self.table = table
         self.chars = list(range(0x110000)) # unicode 3.2
+        self.aliases = []
+        with open_data(NAME_ALIASES, version) as file:
+            for s in file:
+                s = s.strip()
+                if not s or s.startswith('#'):
+                    continue
+                char, name = s.split(';')
+                char = int(char, 16)
+                self.aliases.append((name, char))
+
+        self.named_sequences = []
+        with open_data(NAMED_SEQUENCES, version) as file:
+            for s in file:
+                s = s.strip()
+                if not s or s.startswith('#'):
+                    continue
+                name, chars = s.split(';')
+                chars = tuple(int(char, 16) for char in chars.split())
+                # check that the structure defined in makeunicodename is OK
+                assert 2 <= len(chars) <= 4, "change the Py_UCS2 array size"
+                assert all(c <= 0xFFFF for c in chars), "use Py_UCS4 instead"
+                self.named_sequences.append((name, chars))
+        # sort names to enable binary search
+        self.named_sequences.sort(key=itemgetter(0))
+
         self.exclusions = {}
         with open_data(COMPOSITION_EXCLUSIONS, version) as file:
             for s in file:

From report at bugs.python.org Thu Oct 6 12:55:51 2011
From: report at bugs.python.org (Ezio Melotti)
Date: Thu, 06 Oct 2011 10:55:51 +0000
Subject: [issue2771] Test issue
In-Reply-To:
<1210005645.74.0.283923986194.issue2771@psf.upfronthosting.co.za>
Message-ID: <1317898551.38.0.485566163143.issue2771@psf.upfronthosting.co.za>

Ezio Melotti added the comment:

Adding
  # ignore html part of multipart/alternative
  ignore_alternatives = yes
to the config.ini seems to get rid of the "unnamed" attachments.

----------

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 13:01:34 2011
From: report at bugs.python.org (Nadeem Vawda)
Date: Thu, 06 Oct 2011 11:01:34 +0000
Subject: [issue6715] xz compressor support
In-Reply-To: <1250502444.31.0.107447392137.issue6715@psf.upfronthosting.co.za>
Message-ID: <1317898894.75.0.362794527014.issue6715@psf.upfronthosting.co.za>

Nadeem Vawda added the comment:

Wow, this discussion has gotten quite busy while I've been travelling...

Martin, could you explain what the problems are with bundling a precompiled
DLL for Windows? I am willing to do the work of getting liblzma to compile
with VS if necessary, but I don't know how receptive the upstream maintainer
will be to the changes. If I can explain how lack of VS support is a problem
for us, the request should carry more weight.

----------

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 13:19:45 2011
From: report at bugs.python.org (Roundup Robot)
Date: Thu, 06 Oct 2011 11:19:45 +0000
Subject: [issue9442] Update sys.version doc
In-Reply-To: <1280605593.44.0.301230771819.issue9442@psf.upfronthosting.co.za>
Message-ID:

Roundup Robot added the comment:

New changeset 9f6704da4abb by Éric Araujo in branch '2.7':
Fix markup used in the documentation of sys.prefix and sys.exec_prefix.
http://hg.python.org/cpython/rev/9f6704da4abb

----------
nosy: +python-dev

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 13:24:05 2011
From: report at bugs.python.org (Roundup Robot)
Date: Thu, 06 Oct 2011 11:24:05 +0000
Subject: [issue9442] Update sys.version doc
In-Reply-To: <1280605593.44.0.301230771819.issue9442@psf.upfronthosting.co.za>
Message-ID:

Roundup Robot added the comment:

New changeset 6ea47522f466 by Éric Araujo in branch '3.2':
Fix markup used in the documentation of sys.prefix and sys.exec_prefix.
http://hg.python.org/cpython/rev/6ea47522f466

----------

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 13:24:05 2011
From: report at bugs.python.org (Roundup Robot)
Date: Thu, 06 Oct 2011 11:24:05 +0000
Subject: [issue12167] test_packaging reference leak
In-Reply-To: <1306231646.32.0.292171987241.issue12167@psf.upfronthosting.co.za>
Message-ID:

Roundup Robot added the comment:

New changeset e76c6aaff135 by Éric Araujo in branch 'default':
Add regrtest check for caches in packaging.database (see #12167)
http://hg.python.org/cpython/rev/e76c6aaff135

----------

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 13:24:06 2011
From: report at bugs.python.org (Roundup Robot)
Date: Thu, 06 Oct 2011 11:24:06 +0000
Subject: [issue12222] All pysetup commands should respect exit codes
In-Reply-To: <1306833373.02.0.231767797102.issue12222@psf.upfronthosting.co.za>
Message-ID:

Roundup Robot added the comment:

New changeset ab125793243f by Éric Araujo in branch 'default':
Fix return code of “pysetup run COMMAND”
(closes #12222)
http://hg.python.org/cpython/rev/ab125793243f

----------
resolution: -> fixed
stage: -> committed/rejected
status: open -> closed

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 13:24:06 2011
From: report at bugs.python.org (Roundup Robot)
Date: Thu, 06 Oct 2011 11:24:06 +0000
Subject: [issue11841] Bug in the verson comparison
In-Reply-To: <1302686380.66.0.407051830429.issue11841@psf.upfronthosting.co.za>
Message-ID:

Roundup Robot added the comment:

New changeset 2105ab8553b7 by Éric Araujo in branch 'default':
Add tests for comparing candidate and final versions in packaging (#11841).
http://hg.python.org/cpython/rev/2105ab8553b7

----------
nosy: +python-dev

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 13:24:46 2011
From: report at bugs.python.org (=?utf-8?q?=C3=89ric_Araujo?=)
Date: Thu, 06 Oct 2011 11:24:46 +0000
Subject: [issue13105] Please elaborate on how 2.x and 3.x are different heads
In-Reply-To: <1317770656.72.0.880626385137.issue13105@psf.upfronthosting.co.za>
Message-ID: <1317900286.18.0.498876119995.issue13105@psf.upfronthosting.co.za>

Éric Araujo added the comment:

Can you paste the email for a starting point?
----------
nosy: +eric.araujo, ncoghlan

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 13:25:12 2011
From: report at bugs.python.org (=?utf-8?q?=C3=89ric_Araujo?=)
Date: Thu, 06 Oct 2011 11:25:12 +0000
Subject: [issue13105] Please elaborate on how 2.x and 3.x are different heads
In-Reply-To: <1317770656.72.0.880626385137.issue13105@psf.upfronthosting.co.za>
Message-ID: <1317900312.69.0.938109305477.issue13105@psf.upfronthosting.co.za>

Changes by Éric Araujo:

----------
assignee: -> eric.araujo
versions: +3rd party

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 13:38:12 2011
From: report at bugs.python.org (=?utf-8?q?=C3=89ric_Araujo?=)
Date: Thu, 06 Oct 2011 11:38:12 +0000
Subject: [issue13073] message_body argument of HTTPConnection.endheaders is undocumented
In-Reply-To: <1317370143.22.0.297306447289.issue13073@psf.upfronthosting.co.za>
Message-ID: <1317901092.1.0.246854308757.issue13073@psf.upfronthosting.co.za>

Éric Araujo added the comment:

It is IMO a source of confusion that the doc talks about a string instead of “a bytes object” (3.x) or “a string (str)” (2.x, unless unicode is supported too).

----------
nosy: +eric.araujo

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 13:40:56 2011
From: report at bugs.python.org (=?utf-8?q?=C3=89ric_Araujo?=)
Date: Thu, 06 Oct 2011 11:40:56 +0000
Subject: [issue11841] Bug in the verson comparison
In-Reply-To: <1302686380.66.0.407051830429.issue11841@psf.upfronthosting.co.za>
Message-ID: <1317901256.29.0.168196143099.issue11841@psf.upfronthosting.co.za>

Éric Araujo added the comment:

I couldn’t reproduce the bugs but added the tests. Thanks!
----------
resolution: -> out of date
stage: -> committed/rejected
status: open -> closed
versions: +3rd party

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 13:44:28 2011
From: report at bugs.python.org (=?utf-8?q?=C3=89ric_Araujo?=)
Date: Thu, 06 Oct 2011 11:44:28 +0000
Subject: [issue12167] test_packaging reference leak
In-Reply-To: <1306231646.32.0.292171987241.issue12167@psf.upfronthosting.co.za>
Message-ID: <1317901468.42.0.106403165457.issue12167@psf.upfronthosting.co.za>

Changes by Éric Araujo:

----------
resolution: -> fixed
stage: needs patch -> committed/rejected
status: open -> closed

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 13:45:57 2011
From: report at bugs.python.org (Amaury Forgeot d'Arc)
Date: Thu, 06 Oct 2011 11:45:57 +0000
Subject: [issue13070] segmentation fault in pure-python multi-threaded server
In-Reply-To: <1317336165.27.0.207190934922.issue13070@psf.upfronthosting.co.za>
Message-ID: <1317901557.39.0.918529521475.issue13070@psf.upfronthosting.co.za>

Amaury Forgeot d'Arc added the comment:

Your application does not segfault with 2.7 because buffered files and sockets use a very different implementation.
The io module is present in all versions, but only Python3 uses it for all file-like objects.

If the unit test (test_rwpair_cleared_before_textio) crashes 2.7, the fix should be applied.
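
The remark above about the io module can be checked with a small sketch. It assumes Python 3; the temporary file and the unconnected socket are only stand-ins chosen for illustration and are not taken from the reporter's server.

```python
import io
import os
import socket
import tempfile

# Regular files: on Python 3, open() hands back io classes.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, "w") as text_file:
        text_is_io = isinstance(text_file, io.TextIOWrapper)
    with open(path, "rb") as binary_file:
        binary_is_io = isinstance(binary_file, io.BufferedReader)
finally:
    os.remove(path)

# Sockets: makefile() also builds on io. A read/write binary stream is a
# BufferedRWPair, the class named in test_rwpair_cleared_before_textio.
sock = socket.socket()
try:
    stream = sock.makefile("rwb")
    socket_is_io = isinstance(stream, io.BufferedRWPair)
    stream.close()
finally:
    sock.close()

print(text_is_io, binary_is_io, socket_is_io)
```

On 2.7 the same open() and makefile() calls return the older file and socket._fileobject types, which would be consistent with the crash not showing up there.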
----------

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 13:55:00 2011
From: report at bugs.python.org (telmich)
Date: Thu, 06 Oct 2011 11:55:00 +0000
Subject: [issue13113] Wrong error message on class instance, when giving too little positional arguments
Message-ID: <1317902100.22.0.844321043638.issue13113@psf.upfronthosting.co.za>

New submission from telmich:

I've this class:

class Path:
    """Class that handles path related configurations"""

    def __init__(self, target_host, remote_user, remote_prefix, initial_manifest=False, base_dir=None, debug=False):

That is falsely instantiated from a different class with these arguments:

    self.path = cdist.path.Path(self.target_host,
                                initial_manifest=initial_manifest,
                                base_dir=home,
                                debug=debug)

Which results in the following traceback:

[13:40] kr:cdist% ./bin/cdist config -d localhost
Traceback (most recent call last):
  File "./bin/cdist", line 119, in <module>
    commandline()
  File "./bin/cdist", line 102, in commandline
    args.func(args)
  File "/home/users/nico/oeffentlich/rechner/projekte/cdist/lib/cdist/config.py", line 296, in config
    c = Config(host, initial_manifest=args.manifest, home=args.cdist_home, debug=args.debug)
  File "/home/users/nico/oeffentlich/rechner/projekte/cdist/lib/cdist/config.py", line 52, in __init__
    debug=debug)
TypeError: __init__() takes at least 4 arguments (5 given)

Problem:
- there are 5 arguments, so an error message indicating there are at least 4 needed is not helpful

Proposal (pseudocode):

Change to "Only %d of %d required positional arguments given" required_positional, giving_positional

----------
components: Interpreter Core
messages: 145002
nosy: telmich
priority: normal
severity: normal
status: open
title: Wrong error message on class instance, when giving too little positional arguments
type: behavior
versions: Python 3.2

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 14:44:44 2011
From: report at bugs.python.org (=?utf-8?q?=C3=89ric_Araujo?=)
Date: Thu, 06 Oct 2011 12:44:44 +0000
Subject: [issue12943] tokenize: add python -m tokenize support back
In-Reply-To: <1315537867.15.0.614423357455.issue12943@psf.upfronthosting.co.za>
Message-ID: <1317905084.47.0.61944677483.issue12943@psf.upfronthosting.co.za>

Éric Araujo added the comment:

I made a few last remarks on Rietveld; feel free to address or ignore them and commit right away.

----------
nosy: +eric.araujo

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 14:48:21 2011
From: report at bugs.python.org (=?utf-8?q?=C3=89ric_Araujo?=)
Date: Thu, 06 Oct 2011 12:48:21 +0000
Subject: [issue7833] bdist_wininst installers fail to load extensions built with Issue4120 patch
In-Reply-To: <1265062373.01.0.461114831555.issue7833@psf.upfronthosting.co.za>
Message-ID: <1317905301.33.0.834326623825.issue7833@psf.upfronthosting.co.za>

Éric Araujo added the comment:

Can the patch include regression tests?
----------
components: +Distutils2
nosy: +alexis, eric.araujo
title: Bdist_wininst installers fail to load extensions built with Issue4120 patch -> bdist_wininst installers fail to load extensions built with Issue4120 patch
versions: +3rd party, Python 3.2 -Python 3.4

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 15:06:03 2011
From: report at bugs.python.org (xy zzy)
Date: Thu, 06 Oct 2011 13:06:03 +0000
Subject: [issue13109] telnetlib insensitive to connection loss
In-Reply-To: <1317848889.2.0.2461830982.issue13109@psf.upfronthosting.co.za>
Message-ID: <1317906363.83.0.512535489156.issue13109@psf.upfronthosting.co.za>

xy zzy added the comment:

Cfrom class():

        # see if we can connect to pcPart
        try:
            self.pcPart = telnetlib.Telnet(IP, PORT)
            # clear the buffer
            for i in range(10):
                self.pcPart.write('\n')
                r = self.pcPart.read_until('Prompt>', 1)
        except socket.error, e:
            logging.debug('socket.error: %d: %s' % (e.args[0], e.args[1]))
            self.pcPart.close()
            self.pcPart = None

from init():

    def talk(self,cmd,ret):
        """talk to the device"""
        read_chars = ""
        while (read_chars == ""):
            try:
                read_chars=""
                # get to the Prompt> prompt
                # logging.debug('seeking prompt')
                while (read_chars != 'Prompt>'):
                    self.pcPart.write("\n")
                    raw_data = self.pcPart.read_until('Prompt>', 1).split('\n')
                    # logging.debug('raw_data: %i %s' % (len(raw_data), raw_data))
                    read_chars = raw_data[2]
                    # logging.debug('read_chars: %s' % (read_chars))
                # send the command
                # logging.debug('found prompt')
                cmdx = (('xyzzy:%s\n') % cmd)
                self.pcPart.write(cmdx)
                # logging.debug('command %s, %s' % (cmd, cmdx))
                if (ret):
                    while ((len(raw_data) > 0) and ('{' not in read_chars)):
                        raw_data = self.pcPart.read_until('Prompt>', 1)
                        # logging.debug('raw_data: %i %s' % (len(raw_data), raw_data))
                        try:
                            read_chars = str(raw_data.split('\n\r')[1][1:-1])
                        except:
                            read_chars = ''
                        # logging.debug('read_chars: %s' % (read_chars))
                else:
                    raw_data = self.pcPart.read_until('Prompt>', 1)
                    # logging.debug('ret read: %s' % (raw_data))
                    read_chars = '@'
                return read_chars
            except IndexError, e:
                logging.debug('IndexError: %d: %s' % (e.args[0], e.args[1]))
                traceback.print_exc(file=open(LOG_FILENAME, 'a'))
                read_chars = '@'
                time.sleep(1)
            except (IOError, socket.error), e:
                logging.debug('socket.error: %d: %s' % (e.args[0], e.args[1]))
                traceback.print_exc(file=open(LOG_FILENAME, 'a'))
                self.pcPart.close()
                self.pcPart = None
                logging.debug('reconnecting...')
                self.pcPart = telnetlib.Telnet(IP, PORT)
                # clear the buffer
                for i in range(10):
                    self.pcPart.write('\n')
                    r = self.pcPart.read_until('Prompt>', 1)
                read_chars = '@'
                logging.debug('reconnected')
        # clear the buffer
        for i in range(2):
            self.pcPart.write('\n')
            r = self.pcPart.read_until('Prompt>', 1)
        # logging.debug('Data Read: ' + read_chars)
        return read_chars

called from:

        DATA = self.talk('cmd', True)
        logging.debug('talk: %s' % (DATA))

----------

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 15:22:07 2011
From: report at bugs.python.org (Benjamin Peterson)
Date: Thu, 06 Oct 2011 13:22:07 +0000
Subject: [issue13113] Wrong error message on class instance, when giving too little positional arguments
In-Reply-To: <1317902100.22.0.844321043638.issue13113@psf.upfronthosting.co.za>
Message-ID: <1317907327.65.0.0350958487229.issue13113@psf.upfronthosting.co.za>

Benjamin Peterson added the comment:

Fixed in 3.3

Traceback (most recent call last):
  File "x.py", line 16, in <module>
    debug=0)
TypeError: __init__() missing 2 required positional arguments: 'remote_user' and 'remote_prefix'

----------
nosy: +benjamin.peterson
resolution: -> out of date
status: open -> closed

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 15:22:42 2011
From: report at bugs.python.org (=?utf-8?b?0JrQuNGA0LjQu9C7INCa0YPQt9GM0LzQuNC90YvRhQ==?=)
Date:
 Thu, 06 Oct 2011 13:22:42 +0000
Subject: [issue13114] UnicodeDecodeError in command `register` due to using not ASCII chars in long_description
Message-ID: <1317907362.37.0.133459111826.issue13114@psf.upfronthosting.co.za>

New submission from Кирилл Кузьминых:

Command `register` (and `check -r` too) raises the exception UnicodeDecodeError if the long_description (stored as unicode) contains not ASCII chars.

This is because the Docutils, called from Distutils, accepts only ASCII or Unicode. But Distutils passes to Docutils text as a `str` (ASCII or UTF-8).

PS: sorry for my English

----------
assignee: tarek
components: Distutils
files: trace.log
messages: 145007
nosy: Cykooz, eric.araujo, tarek
priority: normal
severity: normal
status: open
title: UnicodeDecodeError in command `register` due to using not ASCII chars in long_description
type: crash
versions: Python 2.7
Added file: http://bugs.python.org/file23327/trace.log

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 15:24:33 2011
From: report at bugs.python.org (=?utf-8?q?=C3=89ric_Araujo?=)
Date: Thu, 06 Oct 2011 13:24:33 +0000
Subject: [issue13114] UnicodeDecodeError in command `register` due to using not ASCII chars in long_description
In-Reply-To: <1317907362.37.0.133459111826.issue13114@psf.upfronthosting.co.za>
Message-ID: <1317907473.26.0.189005836353.issue13114@psf.upfronthosting.co.za>

Éric Araujo added the comment:

Thank you for the report. Can you give us a short setup.py that reproduces the bug?
----------

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 15:31:23 2011
From: report at bugs.python.org (Roundup Robot)
Date: Thu, 06 Oct 2011 13:31:23 +0000
Subject: [issue3163] module struct support for ssize_t and size_t
In-Reply-To: <1214071551.33.0.650558563727.issue3163@psf.upfronthosting.co.za>
Message-ID:

Roundup Robot added the comment:

New changeset db3e15017172 by Antoine Pitrou in branch 'default':
Issue #3163: The struct module gets new format characters 'n' and 'N'
http://hg.python.org/cpython/rev/db3e15017172

----------
nosy: +python-dev

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 15:31:55 2011
From: report at bugs.python.org (Antoine Pitrou)
Date: Thu, 06 Oct 2011 13:31:55 +0000
Subject: [issue3163] module struct support for ssize_t and size_t
In-Reply-To: <1214071551.33.0.650558563727.issue3163@psf.upfronthosting.co.za>
Message-ID: <1317907915.8.0.021759576072.issue3163@psf.upfronthosting.co.za>

Antoine Pitrou added the comment:

Thanks for the reviews!

----------
resolution: -> fixed
stage: patch review -> committed/rejected
status: open -> closed

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 15:33:16 2011
From: report at bugs.python.org (=?utf-8?b?0JrQuNGA0LjQu9C7INCa0YPQt9GM0LzQuNC90YvRhQ==?=)
Date: Thu, 06 Oct 2011 13:33:16 +0000
Subject: [issue13114] UnicodeDecodeError in command `register` due to using not ASCII chars in long_description
In-Reply-To: <1317907362.37.0.133459111826.issue13114@psf.upfronthosting.co.za>
Message-ID: <1317907996.86.0.534698826845.issue13114@psf.upfronthosting.co.za>

Кирилл Кузьминых added the comment:

> Can you give us a short setup.py that reproduces the bug?

Ok.
Command that reproduces the bug:
python setup.py check -r

PS:
I use:
Ubuntu 11.04
Python 2.7.1
Docutils 0.8.1

----------
Added file: http://bugs.python.org/file23328/setup.py

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 15:40:31 2011
From: report at bugs.python.org (=?utf-8?q?=C3=89ric_Araujo?=)
Date: Thu, 06 Oct 2011 13:40:31 +0000
Subject: [issue13114] check -r fails with non-ASCII unicode long_description
In-Reply-To: <1317907362.37.0.133459111826.issue13114@psf.upfronthosting.co.za>
Message-ID: <1317908431.04.0.885898253817.issue13114@psf.upfronthosting.co.za>

Éric Araujo added the comment:

Your file uses setuptools, which is not part of Python, but I can reproduce the same bug with distutils.

----------
assignee: tarek -> eric.araujo
components: +Distutils2
nosy: +alexis
stage: -> needs patch
title: UnicodeDecodeError in command `register` due to using not ASCII chars in long_description -> check -r fails with non-ASCII unicode long_description
type: crash -> behavior
versions: +3rd party, Python 3.2, Python 3.3

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 16:10:26 2011
From: report at bugs.python.org (Andrew Wilkins)
Date: Thu, 06 Oct 2011 14:10:26 +0000
Subject: [issue13115] tp_as_{number, sequence, mapping} can't be set using PyType_FromSpec
Message-ID: <1317910226.56.0.365091018544.issue13115@psf.upfronthosting.co.za>

New submission from Andrew Wilkins:

I've written an extension using Py_LIMITED_API, and I've created a type using PyType_FromSpec with the slot "Py_sq_length" defined. The slot is not being picked up, i.e. len(MyType()) fails. I can see that tp_as_sequence has not been set, which explains why.
All is well if I set it manually (without Py_LIMITED_API defined), like so:

    MyType->tp_as_sequence = &((PyHeapTypeObject*)MyType)->as_sequence;

As far as I can see (docs are lacking), there's no way of setting tp_as_number, tp_as_sequence or tp_as_mapping in types created with PyType_FromSpec. I would expect the presence of any Py_sq_* slots to set tp_as_sequence (likewise for number and mapping).

----------
components: Interpreter Core
messages: 145013
nosy: awilkins
priority: normal
severity: normal
status: open
title: tp_as_{number,sequence,mapping} can't be set using PyType_FromSpec
versions: Python 3.2

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 16:24:54 2011
From: report at bugs.python.org (Nick Coghlan)
Date: Thu, 06 Oct 2011 14:24:54 +0000
Subject: [issue13105] Please elaborate on how 2.x and 3.x are different heads
In-Reply-To: <1317770656.72.0.880626385137.issue13105@psf.upfronthosting.co.za>
Message-ID: <1317911094.61.0.837987804948.issue13105@psf.upfronthosting.co.za>

Nick Coghlan added the comment:

This was from memory, so don't take it as gospel as far as the current security-fix-only branches go, but here's what I sent to Larry:

-----------------
We maintain two independent heads in hg: 2.7 and default

3.2 is open for general bugfixes

2.5 (IIRC), 2.6 and 3.1 are open for security fixes

Security fixes (if applicable to both heads) go:
2.5 -> 2.6 -> 2.7
3.1 -> 3.2 -> default

General bug fixes (if applicable to both heads) go:
2.7
3.2 -> default

New features are added to default only

The relative ordering of 2.x and 3.x changes doesn't really matter - the important thing is not to merge them in *either* direction. I think you can theoretically do cherry-picking with Hg, but most people seem to just do independent commits to the two streams.
-----------------

----------

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 16:29:10 2011
From: report at bugs.python.org (Eric V. Smith)
Date: Thu, 06 Oct 2011 14:29:10 +0000
Subject: [issue13109] telnetlib insensitive to connection loss
In-Reply-To: <1317848889.2.0.2461830982.issue13109@psf.upfronthosting.co.za>
Message-ID: <1317911350.21.0.879184331142.issue13109@psf.upfronthosting.co.za>

Eric V. Smith added the comment:

Assuming that you're unplugging the cable when the code is in the loop that occurs after the line "self.pcPart.write(cmdx)", and also assuming that you haven't turned on keepalives, then the behavior you see is expected.

You're just waiting to read some data, and there are no pending writes. Therefore the TCP stack will wait forever if there are no incoming packets. There could be no incoming packets due to no data being ready, or from the network being down. The TCP stack has no way of knowing, so it cannot notify your code.

I suggest turning on TCP keepalives, which would then allow the TCP stack to notify your code that the connection has been closed.

I'm going to close this issue. If you turn on keepalives and still see this problem, please reopen it.

----------
resolution: -> invalid
stage: -> committed/rejected
status: open -> closed

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 16:30:23 2011
From: report at bugs.python.org (Antoine Pitrou)
Date: Thu, 06 Oct 2011 14:30:23 +0000
Subject: [issue13111] Error 2203 when installing Python/Perl?
In-Reply-To: <1317855327.52.0.585487977144.issue13111@psf.upfronthosting.co.za>
Message-ID: <1317911423.89.0.898785459658.issue13111@psf.upfronthosting.co.za>

Changes by Antoine Pitrou:

----------
nosy: +loewis

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 16:31:02 2011
From: report at bugs.python.org (Antoine Pitrou)
Date: Thu, 06 Oct 2011 14:31:02 +0000
Subject: [issue13115] tp_as_{number, sequence, mapping} can't be set using PyType_FromSpec
In-Reply-To: <1317910226.56.0.365091018544.issue13115@psf.upfronthosting.co.za>
Message-ID: <1317911462.7.0.21006255411.issue13115@psf.upfronthosting.co.za>

Changes by Antoine Pitrou:

----------
nosy: +amaury.forgeotdarc, loewis
type: -> behavior
versions: +Python 3.3

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 16:35:26 2011
From: report at bugs.python.org (=?utf-8?q?Martin_v=2E_L=C3=B6wis?=)
Date: Thu, 06 Oct 2011 14:35:26 +0000
Subject: [issue6715] xz compressor support
In-Reply-To: <1317898894.75.0.362794527014.issue6715@psf.upfronthosting.co.za>
Message-ID: <4E8DBCAD.9080404@v.loewis.de>

Martin v. Löwis added the comment:

On 06.10.11 13:01, Nadeem Vawda wrote:
>
> Nadeem Vawda added the comment:
>
> Wow, this discussion has gotten quite busy while I've been travelling...
>
> Martin, could you explain what the problems are with bundling a precompiled DLL
> for Windows?

Off-hand, it's only minor issues: e.g. when running Python out of its build directory, the DLL must be in the same directory. Now, since the output directory differs depending on build option, getting the DLL there might be tricky.

Things I wonder about and couldn't quickly answer from the web: where exactly is the DLL that we would use? Is there an AMD64 version of it? Does it come with import libraries usable by VS?
----------

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 16:57:09 2011
From: report at bugs.python.org (=?utf-8?q?=C3=89ric_Araujo?=)
Date: Thu, 06 Oct 2011 14:57:09 +0000
Subject: [issue8668] Packaging: add a 'develop' command
In-Reply-To: <1273367946.24.0.0664676682922.issue8668@psf.upfronthosting.co.za>
Message-ID: <1317913029.07.0.449401698063.issue8668@psf.upfronthosting.co.za>

Éric Araujo added the comment:

higery, can you give us a status update? Do you have the time to address current reviews or would you like me to make an updated patch? I’d like to incorporate this command as soon as possible to let people play with it, and then we’ll see about integration with the install action.

----------

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 16:57:28 2011
From: report at bugs.python.org (=?utf-8?q?=C3=89ric_Araujo?=)
Date: Thu, 06 Oct 2011 14:57:28 +0000
Subject: [issue12344] Add **kwargs to get_reinitialized_command
In-Reply-To: <1308190105.67.0.431736719114.issue12344@psf.upfronthosting.co.za>
Message-ID: <1317913048.01.0.0532728453592.issue12344@psf.upfronthosting.co.za>

Changes by Éric Araujo:

----------
priority: normal -> high

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 17:03:04 2011
From: report at bugs.python.org (Senthil Kumaran)
Date: Thu, 06 Oct 2011 15:03:04 +0000
Subject: [issue13073] message_body argument of HTTPConnection.endheaders is undocumented
In-Reply-To: <1317901092.1.0.246854308757.issue13073@psf.upfronthosting.co.za>
Message-ID: <20111006150255.GE1946@mathmagic>

Senthil Kumaran added the comment:

Yes, I agree. I think it can be clarified at that point too, because in 2.7 the string is being checked and in 3.3 the message_body is checked to see if it's an instance of bytes.
But, I think, it should be carefully worded (aligned with how other socket message args are mentioned).

----------

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 17:26:51 2011
From: report at bugs.python.org (=?utf-8?q?=C3=89ric_Araujo?=)
Date: Thu, 06 Oct 2011 15:26:51 +0000
Subject: [issue12416] packaging needs {pre,post}-{install,remove} hooks
In-Reply-To: <1309132247.11.0.698244418605.issue12416@psf.upfronthosting.co.za>
Message-ID: <1317914811.64.0.235212615734.issue12416@psf.upfronthosting.co.za>

Éric Araujo added the comment:

Editing title to reflect the scope of the needed feature.

----------
title: packaging does not have hooks callable during distribution removal -> packaging needs {pre,post}-{install,remove} hooks
versions: +3rd party

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 17:30:10 2011
From: report at bugs.python.org (Amaury Forgeot d'Arc)
Date: Thu, 06 Oct 2011 15:30:10 +0000
Subject: [issue6715] xz compressor support
In-Reply-To: <1250502444.31.0.107447392137.issue6715@psf.upfronthosting.co.za>
Message-ID: <1317915010.32.0.959786735075.issue6715@psf.upfronthosting.co.za>

Amaury Forgeot d'Arc added the comment:

On http://tukaani.org/xz, I downloaded the file named xz-5.0.3-windows.zip. It contains precompiled dlls for both platforms: bin_i486/liblzma.dll and bin_x86_64/liblzma.dll

Unfortunately, there is no import library for VS. It should not be too difficult to make one, though: the provided headers are C89, so it's enough to write stubs for the functions used by the extension module.
----------

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 17:42:42 2011
From: report at bugs.python.org (=?utf-8?q?=C3=89ric_Araujo?=)
Date: Thu, 06 Oct 2011 15:42:42 +0000
Subject: [issue13116] setup.cfg in [sb]dists should be static
Message-ID: <1317915762.54.0.969795427533.issue13116@psf.upfronthosting.co.za>

New submission from Éric Araujo:

Some people (hi Ronny :) want to use a setup hook to get the version from the VCS, but the setup.cfg file in sdists and bdists should be fully static, because getting the VCS info is not possible and maybe for other reasons too (not requiring development dependencies for example, the same argument that makes people include generated HTML docs in sdists).

The way to handle that seems simple: sdist runs setup hooks and writes back the modified config object to the setup.cfg file that’s included in sdists and bdists. Command hooks are unaffected, as are post/pre install/remove hooks (to be added in #12416).

Another idea would be to split global hooks into two kinds. The code would run the volatile hooks, write the modified config as setup.cfg for *dists, and then run regular hooks. Users installing a *dist will execute the regular hooks.

----------
assignee: tarek
components: Distutils2
messages: 145021
nosy: alexis, eric.araujo, tarek
priority: normal
severity: normal
status: open
title: setup.cfg in [sb]dists should be static
versions: 3rd party, Python 3.3

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 17:43:14 2011
From: report at bugs.python.org (Nadeem Vawda)
Date: Thu, 06 Oct 2011 15:43:14 +0000
Subject: [issue6715] xz compressor support
In-Reply-To: <1250502444.31.0.107447392137.issue6715@psf.upfronthosting.co.za>
Message-ID: <1317915794.42.0.865005174221.issue6715@psf.upfronthosting.co.za>

Nadeem Vawda added the comment:

Hmm...
according to http://git.tukaani.org/?p=xz.git;a=blob;f=windows/README-Windows.txt;hb=HEAD#l80, the MinGW-compiled static libs *can* be used with MSVC. Not sure how reliable the information is, but it's worth a try at least.

----------

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 17:51:50 2011
From: report at bugs.python.org (Amaury Forgeot d'Arc)
Date: Thu, 06 Oct 2011 15:51:50 +0000
Subject: [issue6715] xz compressor support
In-Reply-To: <1250502444.31.0.107447392137.issue6715@psf.upfronthosting.co.za>
Message-ID: <1317916310.56.0.389169148931.issue6715@psf.upfronthosting.co.za>

Amaury Forgeot d'Arc added the comment:

Ah indeed, the zip archive contains a doc/liblzma.def which can be used to build a liblzma.lib

----------

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 17:56:27 2011
From: report at bugs.python.org (Amaury Forgeot d'Arc)
Date: Thu, 06 Oct 2011 15:56:27 +0000
Subject: [issue6715] xz compressor support
In-Reply-To: <1250502444.31.0.107447392137.issue6715@psf.upfronthosting.co.za>
Message-ID: <1317916587.86.0.26294371016.issue6715@psf.upfronthosting.co.za>

Amaury Forgeot d'Arc added the comment:

Hey, today I learnt something about mingw!
"""Rename liblzma.a to e.g. liblzma_static.lib and tell MSVC to link against it."""

Apparently mingw can generate COFF libraries. This may simplify things *a lot*.
----------

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 19:07:41 2011
From: report at bugs.python.org (Roundup Robot)
Date: Thu, 06 Oct 2011 17:07:41 +0000
Subject: [issue13070] segmentation fault in pure-python multi-threaded server
In-Reply-To: <1317336165.27.0.207190934922.issue13070@psf.upfronthosting.co.za>
Message-ID:

Roundup Robot added the comment:

New changeset 89b9e4bf6f1f by Charles-François Natali in branch '2.7':
Issue #13070: Fix a crash when a TextIOWrapper caught in a reference cycle
http://hg.python.org/cpython/rev/89b9e4bf6f1f

----------

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 19:13:46 2011
From: report at bugs.python.org (Roundup Robot)
Date: Thu, 06 Oct 2011 17:13:46 +0000
Subject: [issue12911] Expose a private accumulator C API
In-Reply-To: <1315311609.99.0.705451675521.issue12911@psf.upfronthosting.co.za>
Message-ID:

Roundup Robot added the comment:

New changeset f9f782f2369e by Antoine Pitrou in branch '3.2':
Issue #12911: Fix memory consumption when calculating the repr() of huge tuples or lists.
http://hg.python.org/cpython/rev/f9f782f2369e

New changeset 656c13024ede by Antoine Pitrou in branch 'default':
Issue #12911: Fix memory consumption when calculating the repr() of huge tuples or lists.
http://hg.python.org/cpython/rev/656c13024ede

----------
nosy: +python-dev

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 19:15:58 2011
From: report at bugs.python.org (Antoine Pitrou)
Date: Thu, 06 Oct 2011 17:15:58 +0000
Subject: [issue12911] Expose a private accumulator C API
In-Reply-To: <1315311609.99.0.705451675521.issue12911@psf.upfronthosting.co.za>
Message-ID: <1317921358.75.0.646419511346.issue12911@psf.upfronthosting.co.za>

Antoine Pitrou added the comment:

I added a comment insisting that the API is private and can be changed at any moment.
StringIO can actually re-use that API, rather than the reverse. No need to instantiate a full-blown file object when all you want to do is to join a bunch of strings.

----------
resolution: -> fixed
stage: patch review -> committed/rejected
status: open -> closed

_______________________________________
Python tracker
_______________________________________

From report at bugs.python.org Thu Oct 6 19:45:32 2011
From: report at bugs.python.org (Roundup Robot)
Date: Thu, 06 Oct 2011 17:45:32 +0000
Subject: [issue10141] SocketCan support
In-Reply-To: <1287449366.98.0.655876257649.issue10141@psf.upfronthosting.co.za>
Message-ID:

Roundup Robot added the comment:

New changeset e767318baccd by Charles-François Natali in branch 'default':
Issue #10141: socket: add SocketCAN (PF_CAN) support.
Initial patch by Matthias http://hg.python.org/cpython/rev/e767318baccd ---------- nosy: +python-dev _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 20:28:02 2011 From: report at bugs.python.org (Roundup Robot) Date: Thu, 06 Oct 2011 18:28:02 +0000 Subject: [issue10141] SocketCan support In-Reply-To: <1287449366.98.0.655876257649.issue10141@psf.upfronthosting.co.za> Message-ID: Roundup Robot added the comment: New changeset a4af684bb54e by Victor Stinner in branch 'default': Issue #10141: Don't use hardcoded frame size in example, use struct.calcsize() http://hg.python.org/cpython/rev/a4af684bb54e ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 20:32:39 2011 From: report at bugs.python.org (=?utf-8?q?Francisco_Mart=C3=ADn_Brugu=C3=A9?=) Date: Thu, 06 Oct 2011 18:32:39 +0000 Subject: =?utf-8?q?=5Bissue13117=5D_Broken_links_in_the_=E2=80=9Ccompiler=E2=80=9D?= =?utf-8?q?_page=2C_section_=E2=80=9Creferences=E2=80=9D_from_the_devguide?= =?utf-8?q?=2E?= Message-ID: <1317925959.0.0.308627128494.issue13117@psf.upfronthosting.co.za> New submission from Francisco Martín Brugué : Hi, In the page http://docs.python.org/devguide/compiler.html the links in the references [1] (http://www.foretec.com/python/workshops/1998-11/proceedings/papers/montanaro/montanaro.html) and in [Wang97] (http://www.cs.princeton.edu/%7Edanwang/Papers/dsl97/dsl97.html) are broken. Cheers, francis ---------- components: Devguide messages: 145030 nosy: francismb priority: normal severity: normal status: open title: Broken links in the “compiler” page, section “references” from the devguide. 
_______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 20:35:25 2011 From: report at bugs.python.org (=?utf-8?q?F=C3=A9lix-Antoine_Fortin?=) Date: Thu, 06 Oct 2011 18:35:25 +0000 Subject: [issue13118] Py_BuildValue format f incorrect description. Message-ID: <1317926125.15.0.437576148609.issue13118@psf.upfronthosting.co.za> New submission from Félix-Antoine Fortin : Python/C API Reference Manual, section Utilities, Parsing arguments and building values, function Py_BuildValue. The description for the format unit "f" is incorrect. It reads "Same as d.", as it should be "Convert a C float to a Python floating point number." since "f" is not the same as "d" when converting double to Python float. This was corrected in the documentation of Python 3, from which the proposed description comes. ---------- assignee: docs at python components: Documentation messages: 145031 nosy: docs at python, felixantoinefortin priority: normal severity: normal status: open title: Py_BuildValue format f incorrect description. versions: Python 2.6, Python 2.7 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 21:47:48 2011 From: report at bugs.python.org (=?utf-8?q?Charles-Fran=C3=A7ois_Natali?=) Date: Thu, 06 Oct 2011 19:47:48 +0000 Subject: [issue10141] SocketCan support In-Reply-To: <1287449366.98.0.655876257649.issue10141@psf.upfronthosting.co.za> Message-ID: <1317930468.15.0.516932374375.issue10141@psf.upfronthosting.co.za> Charles-François Natali added the comment: Committed. Matthias, Tiago, thanks! 
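The struct.calcsize() change named in changeset a4af684bb54e above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the committed example: the format string "=IB3x8s" (32-bit CAN ID, 8-bit length, 3 pad bytes, 8 data bytes) and the helper names build_can_frame and dissect_can_frame are chosen here for illustration.

```python
import struct

# Assumed frame layout: 32-bit CAN ID, 8-bit data length, 3 padding bytes,
# then 8 data bytes. "=" selects standard sizes with no implicit alignment.
CAN_FRAME_FMT = "=IB3x8s"

# Derive the frame size from the format instead of hardcoding a number.
CAN_FRAME_SIZE = struct.calcsize(CAN_FRAME_FMT)


def build_can_frame(can_id, data):
    """Pack a CAN ID and up to 8 payload bytes into one fixed-size frame."""
    dlc = len(data)
    # Pad the payload with zero bytes so it always fills the 8-byte field.
    return struct.pack(CAN_FRAME_FMT, can_id, dlc, data.ljust(8, b"\x00"))


def dissect_can_frame(frame):
    """Unpack a frame and trim the payload back to its declared length."""
    can_id, dlc, data = struct.unpack(CAN_FRAME_FMT, frame)
    return can_id, dlc, data[:dlc]
```

A receiver would then read CAN_FRAME_SIZE bytes per frame, so a change to the format string cannot silently disagree with a hand-written size constant.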
---------- resolution: -> fixed stage: commit review -> committed/rejected status: open -> closed _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 21:52:53 2011 From: report at bugs.python.org (=?utf-8?q?Francisco_Mart=C3=ADn_Brugu=C3=A9?=) Date: Thu, 06 Oct 2011 19:52:53 +0000 Subject: =?utf-8?q?=5Bissue13117=5D_Broken_links_in_the_=E2=80=9Ccompiler=E2=80=9D?= =?utf-8?q?_page=2C_section_=E2=80=9Creferences=E2=80=9D_from_the_devguide?= =?utf-8?q?=2E?= In-Reply-To: <1317925959.0.0.308627128494.issue13117@psf.upfronthosting.co.za> Message-ID: <1317930773.52.0.892810841415.issue13117@psf.upfronthosting.co.za> Francisco Martín Brugué added the comment: The reference for [1] could be changed to: http://www.python.org/workshops/1998-11/proceedings/papers/montanaro/montanaro.html ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 22:03:09 2011 From: report at bugs.python.org (=?utf-8?q?Francisco_Mart=C3=ADn_Brugu=C3=A9?=) Date: Thu, 06 Oct 2011 20:03:09 +0000 Subject: =?utf-8?q?=5Bissue13117=5D_Broken_links_in_the_=E2=80=9Ccompiler=E2=80=9D?= =?utf-8?q?_page=2C_section_=E2=80=9Creferences=E2=80=9D_from_the_devguide?= =?utf-8?q?=2E?= In-Reply-To: <1317925959.0.0.308627128494.issue13117@psf.upfronthosting.co.za> Message-ID: <1317931389.7.0.835683026722.issue13117@psf.upfronthosting.co.za> Francisco Martín Brugué
added the comment: The reference to [Wang97] could be changed to: http://www.cs.princeton.edu/research/techreps/TR-554-97 ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 22:09:22 2011 From: report at bugs.python.org (=?utf-8?q?Francisco_Mart=C3=ADn_Brugu=C3=A9?=) Date: Thu, 06 Oct 2011 20:09:22 +0000 Subject: =?utf-8?q?=5Bissue13117=5D_Broken_links_in_the_=E2=80=9Ccompiler=E2=80=9D?= =?utf-8?q?_page=2C_section_=E2=80=9Creferences=E2=80=9D_from_the_devguide?= =?utf-8?q?=2E?= In-Reply-To: <1317925959.0.0.308627128494.issue13117@psf.upfronthosting.co.za> Message-ID: <1317931762.24.0.382551069393.issue13117@psf.upfronthosting.co.za> Francisco Martín Brugué added the comment: A patch with the links mentioned above. ---------- keywords: +patch Added file: http://bugs.python.org/file23329/issue13117.patch _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 22:20:40 2011 From: report at bugs.python.org (=?utf-8?q?Charles-Fran=C3=A7ois_Natali?=) Date: Thu, 06 Oct 2011 20:20:40 +0000 Subject: [issue8037] multiprocessing.Queue's put() not atomic thread wise In-Reply-To: <1267482006.55.0.281200221208.issue8037@psf.upfronthosting.co.za> Message-ID: <1317932440.44.0.44932267518.issue8037@psf.upfronthosting.co.za> Charles-François Natali added the comment: > Modifying an object which is already on a traditional queue can also > change what is received by the other thread (depending on timing). > So Queue.Queue's put() is not "atomic" either. Therefore I do not > believe this behaviour is a bug. Agreed. > However the solution proposed is a good one since it fixes Issue > 10886. In addition it prevents arbitrary code being run in the > background thread by weakref callbacks or __del__ methods. 
Such > arbitrary code may cause inconsistent state in a forked process if > the fork happens while the queue's thread is running -- see issue > 6271. [...] > I would suggest closing this issue and letting Issue 10886 take it's > place. Makes sense. ---------- nosy: +neologix resolution: -> duplicate stage: test needed -> committed/rejected status: open -> closed superseder: -> Unhelpful backtrace for multiprocessing.Queue _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 22:33:53 2011 From: report at bugs.python.org (Aleksey Frolov) Date: Thu, 06 Oct 2011 20:33:53 +0000 Subject: [issue3244] multipart/form-data encoding In-Reply-To: <1214849078.87.0.171093103517.issue3244@psf.upfronthosting.co.za> Message-ID: <1317933233.18.0.458946283419.issue3244@psf.upfronthosting.co.za> Changes by Aleksey Frolov : ---------- nosy: +atommixz _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 22:35:44 2011 From: report at bugs.python.org (Roundup Robot) Date: Thu, 06 Oct 2011 20:35:44 +0000 Subject: =?utf-8?q?=5Bissue13117=5D_Broken_links_in_the_=E2=80=9Ccompiler=E2=80=9D?= =?utf-8?q?_page=2C_section_=E2=80=9Creferences=E2=80=9D_from_the_devguide?= =?utf-8?q?=2E?= In-Reply-To: <1317925959.0.0.308627128494.issue13117@psf.upfronthosting.co.za> Message-ID: Roundup Robot added the comment: New changeset 76159c6d265a by Ned Deily in branch 'default': Issue #13117: Fix broken links in the compiler page of the Developer's Guide. 
http://hg.python.org/devguide/rev/76159c6d265a ---------- nosy: +python-dev _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 22:39:03 2011 From: report at bugs.python.org (Ned Deily) Date: Thu, 06 Oct 2011 20:39:03 +0000 Subject: =?utf-8?q?=5Bissue13117=5D_Broken_links_in_the_=E2=80=9Ccompiler=E2=80=9D?= =?utf-8?q?_page=2C_section_=E2=80=9Creferences=E2=80=9D_from_the_devguide?= =?utf-8?q?=2E?= In-Reply-To: <1317925959.0.0.308627128494.issue13117@psf.upfronthosting.co.za> Message-ID: <1317933543.15.0.963849719555.issue13117@psf.upfronthosting.co.za> Ned Deily added the comment: Thanks for the patch! ---------- nosy: +ned.deily resolution: -> fixed status: open -> closed _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 22:59:48 2011 From: report at bugs.python.org (M. Zilmer) Date: Thu, 06 Oct 2011 20:59:48 +0000 Subject: [issue13119] Newline for print() is \n on Windows, and not \r\n as expected Message-ID: <1317934788.08.0.425265217585.issue13119@psf.upfronthosting.co.za> New submission from M. Zilmer : In 3.2.2 the newline for print() is \n on Windows, and not \r\n as expected. In 3.1.4 the newline is \r\n. OS is Win 7, and tried on both 32 and 64 bit. Small example with output is attached. ---------- components: Windows files: newline.py messages: 145039 nosy: M..Z. priority: normal severity: normal status: open title: Newline for print() is \n on Windows, and not \r\n as expected type: behavior versions: Python 3.2 Added file: http://bugs.python.org/file23330/newline.py _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 23:00:22 2011 From: report at bugs.python.org (M. 
Zilmer) Date: Thu, 06 Oct 2011 21:00:22 +0000 Subject: [issue13119] Newline for print() is \n on Windows, and not \r\n as expected In-Reply-To: <1317934788.08.0.425265217585.issue13119@psf.upfronthosting.co.za> Message-ID: <1317934822.25.0.0556960967364.issue13119@psf.upfronthosting.co.za> Changes by M. Zilmer : Added file: http://bugs.python.org/file23331/newline_3.1.txt _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 23:00:38 2011 From: report at bugs.python.org (M. Zilmer) Date: Thu, 06 Oct 2011 21:00:38 +0000 Subject: [issue13119] Newline for print() is \n on Windows, and not \r\n as expected In-Reply-To: <1317934788.08.0.425265217585.issue13119@psf.upfronthosting.co.za> Message-ID: <1317934838.62.0.823953502761.issue13119@psf.upfronthosting.co.za> Changes by M. Zilmer : Added file: http://bugs.python.org/file23332/newline_3.2.txt _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 23:17:28 2011 From: report at bugs.python.org (Ben Bass) Date: Thu, 06 Oct 2011 21:17:28 +0000 Subject: [issue13120] Default nosigint optionto pdb.Pdb() prevents use in non-main thread Message-ID: <1317935848.16.0.589862913408.issue13120@psf.upfronthosting.co.za> New submission from Ben Bass : The new SIGINT behaviour of pdb.Pdb prevents use of pdb within a non-main thread without explicitly setting nosigint=True. Specifically the 'continue' command causes a traceback as follows: {{{ ... 
File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/pdb.py", line 959, in do_continue signal.signal(signal.SIGINT, self.sigint_handler) ValueError: signal only works in main thread }}} Since the new behaviour seems to be to gain an enhancement rather than anything fundamentally necessary to pdb, wouldn't it be better if the default was reversed, so the same code would work identically on Python 3.1 (and potentially earlier, i.e. Python2) and Python 3.2? At the moment in my codebase (rpcpdb) I'm using inspect.getargspec sniffing for nosigint on pdb.Pdb.__init__ to determine whether to include a nosigint=True parameter, which clearly isn't ideal! ---------- components: Library (Lib) messages: 145040 nosy: bpb priority: normal severity: normal status: open title: Default nosigint optionto pdb.Pdb() prevents use in non-main thread type: behavior versions: Python 3.2 _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 23:19:19 2011 From: report at bugs.python.org (Ben Bass) Date: Thu, 06 Oct 2011 21:19:19 +0000 Subject: [issue13120] Default nosigint option to pdb.Pdb() prevents use in non-main thread In-Reply-To: <1317935848.16.0.589862913408.issue13120@psf.upfronthosting.co.za> Message-ID: <1317935959.32.0.169959724174.issue13120@psf.upfronthosting.co.za> Changes by Ben Bass : ---------- title: Default nosigint optionto pdb.Pdb() prevents use in non-main thread -> Default nosigint option to pdb.Pdb() prevents use in non-main thread _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 23:43:09 2011 From: report at bugs.python.org (Roundup Robot) Date: Thu, 06 Oct 2011 21:43:09 +0000 Subject: [issue7367] pkgutil.walk_packages fails on write-only directory in sys.path In-Reply-To: <1258702285.22.0.207199237354.issue7367@psf.upfronthosting.co.za> Message-ID: Roundup Robot added the comment: 
New changeset 096b010ae90b by Ned Deily in branch '2.7': Issue #7367: Add test case to test_pkgutil for walking path with http://hg.python.org/cpython/rev/096b010ae90b New changeset 1449095397ae by Ned Deily in branch '2.7': Issue #7367: Fix pkgutil.walk_paths to skip directories whose http://hg.python.org/cpython/rev/1449095397ae New changeset a1e6633ef3f1 by Ned Deily in branch '3.2': Issue #7367: Add test case to test_pkgutil for walking path with http://hg.python.org/cpython/rev/a1e6633ef3f1 New changeset 77bac85f610a by Ned Deily in branch '3.2': Issue #7367: Fix pkgutil.walk_paths to skip directories whose http://hg.python.org/cpython/rev/77bac85f610a New changeset 5a4018570a59 by Ned Deily in branch '3.2': Issue #7367: add NEWS item. http://hg.python.org/cpython/rev/5a4018570a59 New changeset 0408001e4765 by Ned Deily in branch 'default': Issue #7367: merge from 3.2 http://hg.python.org/cpython/rev/0408001e4765 ---------- nosy: +python-dev _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Thu Oct 6 23:43:09 2011 From: report at bugs.python.org (Roundup Robot) Date: Thu, 06 Oct 2011 21:43:09 +0000 Subject: [issue7425] Improve the robustness of "pydoc -k" in the face of broken modules In-Reply-To: <1259786972.24.0.35920903506.issue7425@psf.upfronthosting.co.za> Message-ID: Roundup Robot added the comment: New changeset 45862f4ab1c5 by Ned Deily in branch '2.7': Issue #7425: Refactor test_pydoc test case for '-k' behavior and add http://hg.python.org/cpython/rev/45862f4ab1c5 New changeset 3acf90f71178 by Ned Deily in branch '2.7': Issue #7425: Prevent pydoc -k failures due to module import errors. 
http://hg.python.org/cpython/rev/3acf90f71178 New changeset 6a45f917f167 by Ned Deily in branch '3.2': Issue #7425: Refactor test_pydoc test case for '-k' behavior and add http://hg.python.org/cpython/rev/6a45f917f167 New changeset add444274c3d by Ned Deily in branch '2.7': Issue #7425 and Issue #7367: add NEWS items. http://hg.python.org/cpython/rev/add444274c3d ---------- nosy: +python-dev _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Fri Oct 7 00:00:32 2011 From: report at bugs.python.org (Ned Deily) Date: Thu, 06 Oct 2011 22:00:32 +0000 Subject: [issue7367] pkgutil.walk_packages fails on write-only directory in sys.path In-Reply-To: <1258702285.22.0.207199237354.issue7367@psf.upfronthosting.co.za> Message-ID: <1317938432.61.0.579793043397.issue7367@psf.upfronthosting.co.za> Ned Deily added the comment: The applied changesets correct pkgutil's walk_packages for "classic" imports to ignore unreadable directories the same way that the interpreter's import does. With this fix to pkgutil, pydoc -k also no longer fails in this case. Applied in 2.7 (for 2.7.3), 3.2 (for 3.2.3), and default (for 3.3.0). 
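A minimal sketch of how a caller might walk a set of directories while tolerating problem entries, in the spirit of the fix described above. The helper name list_module_names and the failed list are hypothetical; pkgutil.walk_packages and its onerror parameter are the standard-library API.

```python
import pkgutil


def list_module_names(paths):
    """Return (names, failed) for the modules found under the given directories."""
    failed = []  # names of packages whose import raised while walking

    def remember_failure(name):
        # walk_packages calls this with the package name instead of raising.
        failed.append(name)

    # Each yielded item is a (finder, name, ispkg) triple; index 1 is the name.
    names = [info[1] for info in pkgutil.walk_packages(paths, onerror=remember_failure)]
    return names, failed
```

Collecting failures in a list rather than letting them propagate keeps a single bad package from hiding every module that follows it in the walk.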
---------- resolution: -> fixed status: open -> closed _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Fri Oct 7 00:10:42 2011 From: report at bugs.python.org (Ned Deily) Date: Thu, 06 Oct 2011 22:10:42 +0000 Subject: [issue7425] Improve the robustness of "pydoc -k" in the face of broken modules In-Reply-To: <1259786972.24.0.35920903506.issue7425@psf.upfronthosting.co.za> Message-ID: <1317939042.25.0.67148792288.issue7425@psf.upfronthosting.co.za> Ned Deily added the comment: The applied changesets backport the "ignore exceptions" fix for pydoc -k from 3.x to 2.7 and also refactor test_pydoc to remove unneeded complexity and add test cases for importing bad packages and unreadable package directories (a problem addressed in Issue7367). Applied to 2.7 (for 2.7.3), 3.2 (for 3.2.3), and default (for 3.3.0). ---------- resolution: -> fixed status: open -> closed _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Fri Oct 7 00:11:02 2011 From: report at bugs.python.org (Ned Deily) Date: Thu, 06 Oct 2011 22:11:02 +0000 Subject: [issue7425] Improve the robustness of "pydoc -k" in the face of broken modules In-Reply-To: <1259786972.24.0.35920903506.issue7425@psf.upfronthosting.co.za> Message-ID: <1317939062.39.0.887518790824.issue7425@psf.upfronthosting.co.za> Changes by Ned Deily : ---------- stage: patch review -> committed/rejected _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Fri Oct 7 00:13:34 2011 From: report at bugs.python.org (Carl Robben) Date: Thu, 06 Oct 2011 22:13:34 +0000 Subject: [issue2945] bdist_rpm does not list dist files (should effect upload) In-Reply-To: <1211466871.95.0.658180804984.issue2945@psf.upfronthosting.co.za> Message-ID: <1317939214.47.0.339777520205.issue2945@psf.upfronthosting.co.za> Carl Robben added the 
comment: I found that bdist_rpm wasn't registering distributions with dist.dist_files at all. The attached patch should be all that's needed to fix this. ---------- keywords: +patch nosy: +crobben Added file: http://bugs.python.org/file23333/bdist_rpm.patch _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Fri Oct 7 00:28:07 2011 From: report at bugs.python.org (Barry A. Warsaw) Date: Thu, 06 Oct 2011 22:28:07 +0000 Subject: [issue11250] 2to3 truncates files at formfeed character In-Reply-To: <1298159845.78.0.925914963765.issue11250@psf.upfronthosting.co.za> Message-ID: <1317940087.32.0.485614173343.issue11250@psf.upfronthosting.co.za> Barry A. Warsaw added the comment: Was this patch ever folded into Python 3.2? Looking at the hg repository, I think the answer is "no". It does appear to have made it into Python 2.7 and trunk though (afaict). In point of fact, this bug is hitting me now with 3.2.2. ---------- nosy: +barry status: closed -> open _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Fri Oct 7 00:46:56 2011 From: report at bugs.python.org (Larry Hastings) Date: Thu, 06 Oct 2011 22:46:56 +0000 Subject: [issue13105] Please elaborate on how 2.x and 3.x are different heads In-Reply-To: <1317770656.72.0.880626385137.issue13105@psf.upfronthosting.co.za> Message-ID: <1317941216.64.0.449797500696.issue13105@psf.upfronthosting.co.za> Larry Hastings added the comment: What follows is the original email from Nick. 
-- We maintain two independent heads in hg: 2.7 and default 3.2 is open for general bugfixes 2.5 (IIRC), 2.6 and 3.1 are open for security fixes Security fixes (if applicable to both heads) go: 2.5 -> 2.6 -> 2.7 3.1 -> 3.2 -> default General bug fixes (if applicable to both heads) go: 2.7 3.2 -> default New features are added to default only The relative ordering of 2.x and 3.x changes doesn't really matter - the important thing is not to merge them in *either* direction. I think you can theoretically do cherry-picking with Hg, but most people seem to just do independent commits to the two streams. If the devguide doesn't align with the above, then a tracker issue pointing that out would be handy :) ---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Fri Oct 7 00:50:38 2011 From: report at bugs.python.org (Mark Hammond) Date: Thu, 06 Oct 2011 22:50:38 +0000 Subject: [issue7833] bdist_wininst installers fail to load extensions built with Issue4120 patch In-Reply-To: <1265062373.01.0.461114831555.issue7833@psf.upfronthosting.co.za> Message-ID: <1317941438.63.0.617713356867.issue7833@psf.upfronthosting.co.za> Mark Hammond added the comment: I'm reluctant to commit to adding test infrastructure for the distutils build commands - if I've missed existing infrastructure and adding such tests would actually be relatively simple, please educate me! Or if someone else would like to help with the infrastructure so I can test just this patch, that would be awesome. But I don't think this fix should block on tests given it can easily be tested and verified manually. 
---------- _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Fri Oct 7 00:59:50 2011 From: report at bugs.python.org (Mike Hoy) Date: Thu, 06 Oct 2011 22:59:50 +0000 Subject: [issue12436] Missing items in installation/setup instructions In-Reply-To: <1309305395.29.0.101516779086.issue12436@psf.upfronthosting.co.za> Message-ID: <1317941990.78.0.378605947804.issue12436@psf.upfronthosting.co.za> Mike Hoy added the comment: > - How to prepare a text editor See: http://docs.python.org/dev/using/unix.html#editors > - How to run Python code from a file (if the tutorial or using docs don’t already have it). See: http://docs.python.org/dev/using/unix.html#miscellaneous ---------- nosy: +mikehoy _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Fri Oct 7 01:05:48 2011 From: report at bugs.python.org (Amaury Forgeot d'Arc) Date: Thu, 06 Oct 2011 23:05:48 +0000 Subject: [issue13119] Newline for print() is \n on Windows, and not \r\n as expected In-Reply-To: <1317934788.08.0.425265217585.issue13119@psf.upfronthosting.co.za> Message-ID: <1317942348.25.0.126263062119.issue13119@psf.upfronthosting.co.za> Amaury Forgeot d'Arc added the comment: To people who open the file in their browser: text files are very similar, but newline_3.1.txt has CRLF line endings and newline_3.2.txt has LF line endings. M.Z, how did you obtain them? did you start a subprocess? ---------- nosy: +amaury.forgeotdarc, haypo _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Fri Oct 7 01:12:35 2011 From: report at bugs.python.org (Amaury Forgeot d'Arc) Date: Thu, 06 Oct 2011 23:12:35 +0000 Subject: [issue13118] Py_BuildValue format f incorrect description. 
In-Reply-To: <1317926125.15.0.437576148609.issue13118@psf.upfronthosting.co.za> Message-ID: <1317942755.22.0.884446222941.issue13118@psf.upfronthosting.co.za> Amaury Forgeot d'Arc added the comment: I've checked in the code: 'f' and 'd' are really the same (Python/modsupport.c). And in http://en.wikipedia.org/wiki/Stdarg.h, you can read: """A float will automatically be promoted to a double.""" ---------- nosy: +amaury.forgeotdarc _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Fri Oct 7 01:31:38 2011 From: report at bugs.python.org (Mike Hoy) Date: Thu, 06 Oct 2011 23:31:38 +0000 Subject: [issue12823] Broken link in "SSL wrapper for socket objects" document In-Reply-To: <1314086720.91.0.705244554246.issue12823@psf.upfronthosting.co.za> Message-ID: <1317943898.08.0.537958291243.issue12823@psf.upfronthosting.co.za> Mike Hoy added the comment: Patch to remove broken link. ---------- keywords: +patch nosy: +mikehoy Added file: http://bugs.python.org/file23334/SSL-broken-link.diff _______________________________________ Python tracker _______________________________________ From report at bugs.python.org Fri Oct 7 06:34:10 2011 From: report at bugs.python.org (Terry J. Reedy) Date: Fri, 07 Oct 2011 04:34:10 +0000 Subject: [issue12602] Missing using docs cross-references In-Reply-To: <1311249393.53.0.190511112739.issue12602@psf.upfronthosting.co.za> Message-ID: <1317962050.09.0.148104173773.issue12602@psf.upfronthosting.co.za> Terry J. Reedy added the comment: This is all a puzzle to me. "