"convert" string to bytes without changing data (encoding)

Prasad, Ramit ramit.prasad at jpmorgan.com
Wed Mar 28 15:02:41 EDT 2012


> >The right way to convert bytes to strings, and vice versa, is via
> >encoding and decoding operations.
> 
> If you want to dictate to the original poster the correct way to do
> things then you don't need to do anything more than that.  You don't
> need to pretend, like Chris Angelico, that there isn't a direct mapping
> from his Python 3 implementation's internal representation of strings
> to bytes in order to label what he's asking for as being "silly".

It might be technically possible to reproduce the internal representation,
or to get at the raw byte data. That does not mean the result will make
any sense or be interpretable in a meaningful way. I think Ian summarized
it very well:

>You can't generally just "deal with the ascii portions" without
>knowing something about the encoding.  Say you encounter a byte
>greater than 127.  Is it a single non-ASCII character, or is it the
>leading byte of a multi-byte character?  If the next character is less
>than 127, is it an ASCII character, or a continuation of the previous
>character?  For UTF-8 you could safely assume ASCII, but without
>knowing the encoding, there is no way to be sure.  If you just assume
>it's ASCII and manipulate it as such, you could be messing up
>non-ASCII characters.
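
A minimal Python 3 sketch of Ian's point, using an arbitrary two-byte
sequence picked for illustration (the variable names are mine, not from
the thread). The same bytes decode to different text depending on which
codec you assume:

```python
# Two bytes, both greater than 127; sample data chosen for illustration.
raw = bytes([0xC3, 0xA9])

as_utf8 = raw.decode("utf-8")      # read as one multi-byte character
as_latin1 = raw.decode("latin-1")  # read as two single-byte characters

print(len(as_utf8), ascii(as_utf8))
print(len(as_latin1), ascii(as_latin1))

# Treating the data as ASCII fails outright rather than guessing.
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    print("ascii:", exc.reason)
```

Nothing in the bytes themselves says which reading is the intended one.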

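Strictly speaking, ASCII only defines the values 0-127. Byte values from
128 to 255 come from 8-bit extensions such as Latin-1, and what they
represent depends on the encoding; they are not A-z letters.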
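
On the original question, a short Python 3 sketch (the sample string and
names are mine) of why a string has no single set of "the bytes": what
you get depends on the encoding you ask for.

```python
text = "caf\u00e9"  # sample string, four characters, one of them non-ASCII

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")
latin1 = text.encode("latin-1")

# Same text, three different byte sequences of three different lengths.
for name, data in [("utf-8", utf8), ("utf-16-le", utf16), ("latin-1", latin1)]:
    print(name, len(data), data)

# Decoding with the same codec that produced the bytes restores the text.
assert utf8.decode("utf-8") == text
assert utf16.decode("utf-16-le") == text
```

Whatever the interpreter stores internally is just one more such choice,
and not one that other programs can be expected to understand.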

Ramit


