A 'raw' codec for binary "strings" in Python?

Tim Roberts timr at probo.com
Tue Mar 2 02:25:43 EST 2004


Bill Janssen <janssen at parc.com> wrote:

>> You could use
>>     "\xc0".decode("iso-8859-1").encode('US-ASCII', 'replace')
>
>Yes, this is what I'm doing at the moment.  But it seems a real hack.
>The string *isn't* in Latin-1; it's binary, it's data, and there
>should be a way of saying that.

If it's binary, then it is completely meaningless to try to decode it into
Unicode in the first place.  0xd9 has absolutely no meaning without an
associated encoding.
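
To make that concrete, here is a small sketch, assuming a current Python 3
interpreter where binary data is a bytes object. The helper name decodings()
is made up for illustration. It decodes the same single byte with three
standard codecs and gets a different character from each one:

```python
# Sketch (Python 3): the byte 0xd9 only becomes a character once you
# pick a codec. decodings() is a hypothetical helper, not a library call.

def decodings(raw, codecs):
    """Map each codec name to the text that raw decodes to under it."""
    return {name: raw.decode(name) for name in codecs}

results = decodings(b"\xd9", ("iso-8859-1", "cp437", "koi8_r"))
for name, text in results.items():
    # Print the code point so the difference shows even on an ASCII terminal.
    print(name, "U+%04X" % ord(text))
```

None of the three answers is "the" right one for binary data; each is just
what that codec happens to assign to the byte.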

I would have guessed that ''.translate() is the kind of thing you want, and
it's probably more efficient than a fake decode/encode.
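
Something along these lines is what I have in mind. This is only a sketch,
and it assumes a current Python 3 where the data is a bytes object (in the
Python of this thread it would be str.translate with a 256-character table).
The names ASCII_ONLY and to_ascii are made up for the example:

```python
# Sketch (Python 3): replace every non-ASCII byte with "?" without
# pretending the data is Latin-1. Names here are illustrative only.

# 256-entry table: bytes 0x00-0x7f map to themselves, 0x80-0xff map to "?".
ASCII_ONLY = bytes(range(128)) + b"?" * 128

def to_ascii(data):
    """Return data with every byte >= 0x80 replaced by b'?'."""
    return data.translate(ASCII_ONLY)

print(to_ascii(b"abc\xc0\xd9def"))  # b'abc??def'
```

The result matches the decode/encode round trip quoted above, but no codec
is involved, so nothing claims the bytes are text in any particular encoding.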
-- 
- Tim Roberts, timr at probo.com
  Providenza & Boekelheide, Inc.


