[Python-Dev] bytes / unicode

P.J. Eby pje at telecommunity.com
Mon Jun 21 19:46:56 CEST 2010


At 12:56 PM 6/21/2010 -0400, Toshio Kuratomi wrote:
>One comment here -- you can also have uri's that aren't decodable into their
>true textual meaning using a single encoding.
>
>Apache will happily serve out uris that have utf-8, shift-jis, and euc-jp
>components inside of their path but the textual representation that was
>intended will be garbled (or be represented by escaped byte sequences).  For
>that matter, apache will serve requests that have no true textual
>representation as it is working on the byte level rather than the character
>level.
>
>So a complete solution really should allow the programmer to pass in uris as
>bytes when the programmer knows that they need it.

ebytes(somebytes, 'garbage'), perhaps, which would be like ascii, but
where combining with non-garbage would result in another 'garbage' ebytes?
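
A rough sketch of what that could look like, purely for illustration.
The EBytes class, the _combine helper and every rule other than the
'garbage' one are assumptions made up for this example, not an existing
API: in particular, "ascii combines with anything and takes on the other
side's encoding" and "two different non-ascii encodings refuse to
combine" are guesses at the surrounding semantics.

```python
class EBytes:
    """Hypothetical 'encoded bytes': raw bytes tagged with an encoding name."""

    def __init__(self, data, encoding):
        self.data = bytes(data)
        self.encoding = encoding

    def __add__(self, other):
        if not isinstance(other, EBytes):
            return NotImplemented
        encoding = _combine(self.encoding, other.encoding)
        return EBytes(self.data + other.data, encoding)

    def __repr__(self):
        return "EBytes(%r, %r)" % (self.data, self.encoding)


def _combine(left, right):
    """Pick the encoding tag for the concatenation of two EBytes."""
    # 'garbage' is contagious: once present, the result stays 'garbage'.
    if "garbage" in (left, right):
        return "garbage"
    if left == right:
        return left
    # Assumed rule: ascii is compatible with anything and yields the
    # other operand's encoding.
    if left == "ascii":
        return right
    if right == "ascii":
        return left
    # Assumed rule: mixing two distinct real encodings is an error.
    raise ValueError("cannot combine %r with %r" % (left, right))


slash = EBytes(b"/", "ascii")
utf8_part = EBytes(b"caf\xc3\xa9", "utf-8")
unknown = EBytes(b"\x93\xfa", "garbage")

tagged = slash + utf8_part          # tagged 'utf-8'
mixed = tagged + slash + unknown    # tagged 'garbage'
```

The bytes themselves are never touched; only the tag changes, so a URI
with undecodable components can still be passed around and served as-is.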


