Issue1214889
Created on 2005-06-04 17:45 by georg.brandl, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Files
File name | Uploaded | Description
---|---|---
fileobject-unicodewrite.diff | hyeshik.chang, 2005-06-05 02:26 | fixed styles, added test
fileobject-unicodewrite-4.diff | georg.brandl, 2005-06-05 12:17 |
fileobject-unicodewrite-5.diff | georg.brandl, 2005-06-05 16:31 |
Messages (12) | |||
---|---|---|---|
msg48428 - (view) | Author: Georg Brandl (georg.brandl) * | Date: 2005-06-04 17:45 | |
Here is a patch that lets Unicode strings written to a file be encoded automatically. It allows Python code to set file.encoding and honors that encoding when Unicode strings are written with write() or writelines(). It is my first core hackery, so forgive me a leaked ref or two. I hope I got the error handling right; it is kind of confusing... (btw: Bug #967986 will be fixed with this) |
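The behaviour proposed above can be modelled roughly in present-day Python. This is only a sketch, not the C patch: the `EncodingWriter` class is hypothetical, and raising `TypeError` when no encoding is set is an assumption made for the illustration.

```python
import io


class EncodingWriter:
    """Hypothetical wrapper modelling the proposal; not the patch itself."""

    def __init__(self, raw, encoding=None):
        self.raw = raw            # any binary file-like object
        self.encoding = encoding  # writable attribute, may be None

    def write(self, data):
        # Text is encoded with the current encoding; byte strings pass through.
        if isinstance(data, str):
            if self.encoding is None:
                # Assumption for this sketch: refuse text without an encoding.
                raise TypeError("cannot write text without an encoding")
            data = data.encode(self.encoding)
        return self.raw.write(data)

    def writelines(self, lines):
        for line in lines:
            self.write(line)


buf = io.BytesIO()
out = EncodingWriter(buf, encoding="utf-8")
out.write("caf\u00e9\n")                    # encoded as UTF-8
out.encoding = "latin-1"                    # the encoding can be changed later
out.writelines(["caf\u00e9\n", b"raw\n"])   # text is encoded, bytes are untouched
result = buf.getvalue()
```

The same text ends up as different bytes depending on the encoding that was set at the time of the write, while the byte string is written unchanged.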
|||
msg48429 - (view) | Author: Hyeshik Chang (hyeshik.chang) * | Date: 2005-06-05 02:26 | |
The idea looks good to me. I attached a revised patch that fixes the code style and the C99-style local variable declarations, and adds a regrtest. I think the documentation will need an update as well. |
|||
msg48430 - (view) | Author: Georg Brandl (georg.brandl) * | Date: 2005-06-05 07:56 | |
Third revision; adds new documentation and allows Python code to set the encoding to Py_None. |
|||
msg48431 - (view) | Author: George Yoshida (quiver) | Date: 2005-06-05 12:09 | |
Reinhold, libstdtypes.tex needs two fixes.
\versionadded{2.3}
+\versionchanged[The encoding attribute is now writable and is used
+for encoding Unicode strings given to \method{write()} and
+\method{writelines()}.]{
~~~
First, the versionchanged tag has no closing brace, which results in a compile error. Second (really trivial), the versionchanged macro automatically appends a period at the end of the sentence (see the link [*]), so you don't need to add one by hand. The last line above would then become:
+\method{writelines()}]{2.5}
[*]: http://docs.python.org/doc/inline-markup.html |
|||
msg48432 - (view) | Author: Georg Brandl (georg.brandl) * | Date: 2005-06-05 12:17 | |
Thanks! Corrected in patch #4. |
|||
msg48433 - (view) | Author: Hyeshik Chang (hyeshik.chang) * | Date: 2005-06-05 15:20 | |
One more thing to fix: you can't put local variable declarations after non-declaration statements. Because Python uses C89 as its C source code standard, you should put all declarations at the top of functions only. |
|||
msg48434 - (view) | Author: Georg Brandl (georg.brandl) * | Date: 2005-06-05 16:31 | |
Okay, put on #5. |
|||
msg48435 - (view) | Author: Petr Prikryl (prikryl) * | Date: 2005-07-12 09:59 | |
The title and the comments do not say so, but the patch was created by Reinhold Birkenfeld to solve bug [ 1099364 ] "raw_input() displays wrong unicode prompt". As the bug was closed and Reinhold describes this as his "first core hackery", I'd like to ask someone else to review whether the patch is the correct solution to the reported bug. The bug seems to be very visible (hence serious) in non-English-speaking countries, where Unicode promises to solve many problems. Because of that, I ask whether the bug should be closed before the patch is accepted. I am also adding this text to link this patch to the original problem. Thanks, Petr |
|||
msg48436 - (view) | Author: Marc-Andre Lemburg (lemburg) * | Date: 2005-07-14 07:52 | |
This doesn't quite work (yet): you've broken the support for writing binary data to the file via file.write(). Encodings should only be used for non-binary files. Also note that you are not freeing the memory allocated by the "et#" parser for s. Please add some test cases where you open a binary file and write: a) binary strings b) contents of a buffer object c) Unicode objects to it. Case c) should raise an exception. a) and b) should result in the data being written as-is - without doing any recoding. |
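The three requested cases can be illustrated with the file objects of Python 3, which behave the way this comment asks for. Note the assumptions: this is not the 2.x file object under discussion, and `memoryview` stands in here for the old buffer object.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, "wb") as f:
        f.write(b"\x00\xffbinary")          # a) byte string, written as-is
        f.write(memoryview(b"\x01\x02"))    # b) buffer-like object, written as-is
        try:
            f.write("text")                 # c) text written to a binary file
        except TypeError:
            text_rejected = True
        else:
            text_rejected = False
    with open(path, "rb") as f:
        written = f.read()
finally:
    os.remove(path)
```

Cases a) and b) reach the file without any recoding, and case c) is rejected before anything is written.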
|||
msg48437 - (view) | Author: Marc-Andre Lemburg (lemburg) * | Date: 2005-07-14 08:19 | |
I've thought about this some more: I'm not sure whether it is such a good idea to try to move code from the codecs into the standard file object - after all, the codecs already support all this and do a much better job at handling error cases and the like. Furthermore, codecs support both directions: reading and writing. Your patch handles only one direction. The encoding support you currently find in the file object is only needed for printing Unicode objects - it is not used anywhere else. |
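As a rough illustration of the codecs-based alternative mentioned here, the stream writer and reader classes from the standard `codecs` module cover both directions and take an explicit error-handling policy. The sketch uses in-memory streams and Python 3 syntax; the sample strings are made up for the example.

```python
import codecs
import io

# Writing: wrap a binary stream in a StreamWriter for the chosen encoding.
raw = io.BytesIO()
writer = codecs.getwriter("utf-8")(raw)
writer.write("Gr\u00fc\u00dfe\n")
encoded = raw.getvalue()

# Reading: wrap a binary stream in the matching StreamReader.
reader = codecs.getreader("utf-8")(io.BytesIO(encoded))
decoded = reader.read()

# Error handling is selected per stream; "replace" substitutes "?"
# for characters the target encoding cannot represent.
lossy_raw = io.BytesIO()
lossy_writer = codecs.getwriter("ascii")(lossy_raw, errors="replace")
lossy_writer.write("Gr\u00fc\u00dfe\n")
lossy = lossy_raw.getvalue()
```

The underlying binary stream stays untouched by the wrapper, so code that needs raw bytes can keep writing to it directly.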
|||
msg48438 - (view) | Author: Georg Brandl (georg.brandl) * | Date: 2005-07-14 08:34 | |
I agree with you that writing Unicode objects to a binary file should raise an exception, but with the 'et#' format string, 8-bit string objects should pass through file.write() unrecoded. About your second comment: Yes, codecs is one way to do it, but then I think that the encoding handling for print should be ripped out, too. After all, that's what many people complain about: "print unistr" works, while "sys.stdout.write(unistr)" does not. As the earlier comment about bug 1099364 shows, this shows up in various locations. If this is rejected, file.write() shouldn't accept Unicode anymore, and print should behave the same way. |
|||
msg48439 - (view) | Author: Georg Brandl (georg.brandl) * | Date: 2005-08-08 06:49 | |
Rejecting. This is incomplete and will be addressed more properly in Py3k. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:56:11 | admin | set | github: 42054 |
2005-06-04 17:45:09 | birkenfeld | create |