[issue15216] Support setting the encoding on a text stream after creation
STINNER Victor
report at bugs.python.org
Tue Aug 7 04:01:37 CEST 2012
STINNER Victor added the comment:
Here is a Python implementation of TextIOWrapper.set_encoding().
The main limitation is that the encoding cannot be set on a non-seekable stream after the first read (that is, once the read buffer is not empty because there are pending decoded characters).
+        # flush read buffer, may require to seek backward in the underlying
+        # file object
+        if self._decoded_chars:
+            if not self.seekable():
+                raise UnsupportedOperation(
+                    "It is not possible to set the encoding "
+                    "of a non seekable file after the first read")
+            assert self._snapshot is not None
+            dec_flags, next_input = self._snapshot
+            offset = self._decoded_chars_used - len(next_input)
+            if offset:
+                self.buffer.seek(offset, SEEK_CUR)
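
To illustrate why the patch has to seek backward, here is a minimal sketch that uses only the public io API. It is not the patch's implementation: it emulates an encoding change by detaching the buffer and re-wrapping it, and the sample data and variable names are made up for the example. It assumes a seekable in-memory stream.

```python
import io

data = "h\xe9llo\n".encode("utf-8")      # 7 bytes: 'é' takes two bytes
raw = io.BytesIO(data)
text = io.TextIOWrapper(raw, encoding="utf-8")

first = text.read(1)                      # returns only 'h'
# The wrapper read a whole chunk ahead, so the underlying position is
# already past the single character handed back to the caller.
readahead_pos = raw.tell()

# Emulate "set the encoding": work out how many bytes were really
# consumed, detach the buffer, seek back, and wrap it again.
consumed = len(first.encode("utf-8"))
raw = text.detach()
raw.seek(consumed)
text = io.TextIOWrapper(raw, encoding="latin-1")
rest = text.read()                        # remaining bytes decoded as latin-1
```

On a non-seekable stream the seek() step is impossible, which is the limitation described above.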
--
I don't have a use case for setting the encoding of sys.stdout/stderr after startup, but I would like to be able to control the *error handler* after startup! My implementation allows changing either or both (encoding only, errors only, or encoding and errors).
For example, Lib/test/regrtest.py uses the following function to force the backslashreplace error handler on sys.stdout. It changes the error handler to avoid UnicodeEncodeError when displaying the result of tests.
def replace_stdout():
    """Set stdout encoder error handler to backslashreplace (as stderr error
    handler) to avoid UnicodeEncodeError when printing a traceback"""
    import atexit

    stdout = sys.stdout
    sys.stdout = open(stdout.fileno(), 'w',
                      encoding=stdout.encoding,
                      errors="backslashreplace",
                      closefd=False,
                      newline='\n')

    def restore_stdout():
        sys.stdout.close()
        sys.stdout = stdout
    atexit.register(restore_stdout)
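
As a rough illustration of what the error handler changes, the sketch below writes the same text through two TextIOWrapper objects over an in-memory buffer. The helper name write_with_handler and the ASCII encoding are assumptions chosen for the example; they are not part of regrtest.

```python
import io

def write_with_handler(text, errors):
    """Encode text to ASCII through a TextIOWrapper with the given error handler."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="ascii", errors=errors, newline="\n")
    stream.write(text)
    stream.flush()
    result = raw.getvalue()
    stream.detach()          # leave the BytesIO alone once we have the bytes
    return result

# With backslashreplace, the unencodable character is written as an escape.
escaped = write_with_handler("caf\xe9\n", "backslashreplace")

# With the default strict handler, the same write raises UnicodeEncodeError.
try:
    write_with_handler("caf\xe9\n", "strict")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
```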
The doctest module uses another trick to change the error handler:
save_stdout = sys.stdout
if out is None:
    encoding = save_stdout.encoding
    if encoding is None or encoding.lower() == 'utf-8':
        out = save_stdout.write
    else:
        # Use backslashreplace error handling on write
        def out(s):
            s = str(s.encode(encoding, 'backslashreplace'), encoding)
            save_stdout.write(s)
sys.stdout = self._fakeout
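
The encode/decode round trip used by doctest can be tried on its own. This is only a sketch: the function name escape_unencodable is hypothetical and the sample strings are invented.

```python
def escape_unencodable(s, encoding):
    """Return s with every character that encoding cannot represent
    replaced by a backslash escape, using the same round trip as doctest."""
    return str(s.encode(encoding, "backslashreplace"), encoding)

# Characters outside ASCII become escapes; the result is safe to write
# to a stream that uses the ASCII encoding with the strict error handler.
ascii_safe = escape_unencodable("caf\xe9 \u20ac", "ascii")

# Characters that the target encoding supports are left untouched.
latin1_safe = escape_unencodable("caf\xe9", "latin-1")
```

The cost of this trick is that every string is encoded twice: once here, and once more by the stream itself.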
----------
keywords: +patch
nosy: +haypo
Added file: http://bugs.python.org/file26715/set_encoding.patch
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue15216>
_______________________________________