[Python-Dev] Generalised String Coercion

M.-A. Lemburg mal at egenix.com
Mon Aug 8 13:06:31 CEST 2005


Michael Hudson wrote:
> "M.-A. Lemburg" <mal at egenix.com> writes:
> 
> 
>>Set the external encoding for stdin, stdout, stderr:
>>----------------------------------------------------
>>(also an example for adding encoding support to an
>>existing file object):
>>
>>import codecs, sys
>>
>>def set_sys_std_encoding(encoding):
>>    # Load encoding support
>>    (encode, decode, streamreader, streamwriter) = codecs.lookup(encoding)
>>    # Wrap using stream writers and readers
>>    sys.stdin = streamreader(sys.stdin)
>>    sys.stdout = streamwriter(sys.stdout)
>>    sys.stderr = streamwriter(sys.stderr)
>>    # Add .encoding attribute for introspection
>>    sys.stdin.encoding = encoding
>>    sys.stdout.encoding = encoding
>>    sys.stderr.encoding = encoding
>>
>>set_sys_std_encoding('rot-13')
>>
>>Example session:
>>
>>>>> print 'hello'
>>
>>uryyb
>>
>>>>> raw_input()
>>
>>hello
>>h'hello'
>>
>>>>> 1/0
>>
>>Genpronpx (zbfg erprag pnyy ynfg):
>>  Svyr "<fgqva>", yvar 1, va ?
>>MrebQvivfvbaReebe: vagrtre qvivfvba be zbqhyb ol mreb
>>
>>Note that the interactive session bypasses the sys.stdin
>>redirection, which is why you can still enter Python
>>commands in ASCII - not sure whether there's a reason
>>for this, or whether it's just a missing feature.
> 
> 
> Um, I'm not quite sure how this would be implemented.  Interactive
> input comes via PyOS_Readline which deals in FILE*s... this area of
> the code always confuses me :(

Me too.

It appears that this part of the Python code
has undergone so many iterations and patches that its
structure has suffered a lot. For example, the main() function
calls PyRun_AnyFileFlags(stdin, "<stdin>", &cf),
but the fp argument stdin is subsequently
ignored if the tok_nextc() function finds that
a prompt is set.

Anyway, hacking along the same lines, I think
the above can be had by changing tok_stdin_decode()
to use sys.stdin.decode(), if that method is available,
to decode the data read by PyOS_Readline(). This would
return Unicode, which tok_stdin_decode() could then
encode to UTF-8, the encoding that the tokenizer
works on.
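
A rough sketch of that idea, modelled in present-day Python rather than
in the C tokenizer, might look like the following. The helper name
decode_interactive_line(), the Latin1Stdin stand-in class and the
fallback to an .encoding attribute (or UTF-8) are assumptions made for
illustration only; they are not existing interpreter code.

```python
import codecs
import io


def decode_interactive_line(raw_line, stdin_obj):
    """Hypothetical model of the proposed tok_stdin_decode() behaviour.

    raw_line  -- bytes, as PyOS_Readline() might hand them over
    stdin_obj -- whatever is bound to sys.stdin; may offer .decode()
    Returns the line re-encoded as UTF-8 bytes for the tokenizer.
    """
    decode = getattr(stdin_obj, "decode", None)
    if callable(decode):
        result = decode(raw_line)
        # Codec-style decoders return a (text, bytes_consumed) tuple.
        text = result[0] if isinstance(result, tuple) else result
    else:
        # Assumed fallback: use .encoding if present, else UTF-8.
        encoding = getattr(stdin_obj, "encoding", None) or "utf-8"
        text = raw_line.decode(encoding)
    return text.encode("utf-8")


class Latin1Stdin:
    """Stand-in for a wrapped sys.stdin with a .decode() method."""

    encoding = "latin-1"

    def decode(self, data):
        return data.decode(self.encoding)


# A codecs stream reader also exposes .decode(), so it fits the same hook.
wrapped_stdin = codecs.getreader("latin-1")(io.BytesIO())
print(decode_interactive_line(b"caf\xe9\n", wrapped_stdin))
print(decode_interactive_line(b"caf\xe9\n", Latin1Stdin()))
```

The getattr() lookup mirrors the "possibly available" wording: objects
without a .decode() method simply take the fallback path.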

-- 
Marc-Andre Lemburg
eGenix.com


