a question about Chinese characters in a Python Program

John Machin sjmachin at lexicon.net
Tue Oct 21 19:10:54 EDT 2008


On Oct 21, 11:03 pm, Ben Finney <bignose+hates-s... at benfinney.id.au>
wrote:
> John Machin <sjmac... at lexicon.net> writes:
> > I don't understand the point or value of filtering out all byte values
> > greater than 127
>
> That's only done if the encoding isn't otherwise specified. In which
> case, ASCII is the documented default encoding. In which case, it
> *must* be restricted to code points 0+IBM-127, otherwise it's not ASCII.
>
> The value of doing this is to make it rapidly and repeatably apparent
> when the programmer's assumptions about character encoding are false,
> allowing the programming error to be fixed early rather than late.

"make it rapidly and repeatably apparent ..." is much better achieved
by raising an exception.
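
A minimal sketch of the difference, in Python 3 syntax. The two helper names are made up for illustration; only the bytes/str methods are real:

```python
# Hypothetical helpers, for illustration only.

def filter_high_bytes(raw: bytes) -> str:
    """Silently drop every byte value greater than 127, then decode."""
    return bytes(b for b in raw if b < 128).decode("ascii")


def strict_ascii(raw: bytes) -> str:
    """Decode as ASCII; any byte above 127 raises UnicodeDecodeError."""
    return raw.decode("ascii")


# "abc" followed by two Chinese characters encoded as UTF-8.
data = b"abc" + "\u4e2d\u6587".encode("utf-8")

# Filtering: the non-ASCII bytes vanish and nothing reports the loss.
filtered = filter_high_bytes(data)

# Strict decoding: the wrong assumption shows up at the first bad byte.
try:
    strict_ascii(data)
    raised = False
except UnicodeDecodeError as exc:
    raised = True
    error_position = exc.start  # index of the first offending byte
```

With filtering the caller gets back a shorter string and no hint that anything went missing; with the strict decode the failure is immediate and points at the offending byte.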

> This is, in my estimation, of more value than heuristic magic to
> +IBw-guess+IB0- the encoding, and the resultant debugging nightmare when
> that guesswork fails in unpredictable ways later in the program's
> life.

Was I suggesting "heuristic magic"?

What is that 0+IBM-127 +IBw-guess+IB0- gibberish in your posting?


