Ascii to Unicode.
Ethan Furman
ethan at stoneleaf.us
Thu Jul 29 14:34:18 EDT 2010
Joe Goldthwaite wrote:
> Hi Ulrich,
>
> Ascii.csv isn't really a latin-1 encoded file. It's an ascii file with a
> few characters above the 128 range . . .
It took me a while to get this point too (if you already have "gotten
it", I apologize, but the above comment leads me to believe you haven't).
*Every* file is an encoded file... even your UTF-8 file is encoded using
the UTF-8 format. Someone correct me if I'm wrong, but I believe
ascii (which only covers 0-127) matches up to the first 128 Unicode code
points, so while those first 128 code points translate easily to ascii, ascii is
still an encoding, and if you have characters higher than 127, you don't
really have an ascii file -- you have (for example) a cp1252 file (which
also, not coincidentally, shares the first 128 characters/code points
with ascii).
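
To make that concrete, here is a minimal sketch in Python 3. The byte string
and the try_decode helper are made up for illustration (they are not from
Joe's Ascii.csv), and cp1252 is only an assumption about what his file
might really be:

```python
# Bytes 0-127 mean the same thing in ascii, cp1252 and UTF-8.
plain = b"Hello"
assert plain.decode("ascii") == plain.decode("cp1252") == plain.decode("utf-8")

# Hypothetical sample data containing byte 0xE9, which is outside ascii.
raw = b"caf\xe9"

def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None

ascii_text = try_decode(raw, "ascii")     # None: 0xE9 is not an ascii byte
cp1252_text = try_decode(raw, "cp1252")   # 'caf\u00e9': cp1252 maps 0xE9 to U+00E9
utf8_bytes = cp1252_text.encode("utf-8")  # b'caf\xc3\xa9': same text, different bytes
```

So the "ascii file with a few characters above 127" only decodes once you
name the encoding it is actually in, and writing it back out as UTF-8
changes the bytes for exactly those characters.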
Hopefully I'm not adding to the confusion. ;)
~Ethan~