a question about Chinese characters in a Python Program
Ben Finney
bignose+hates-spam at benfinney.id.au
Tue Oct 21 20:07:36 EDT 2008
John Machin <sjmachin at lexicon.net> writes:
> On Oct 21, 11:03 pm, Ben Finney <bignose+hates-s... at benfinney.id.au>
> wrote:
> > John Machin <sjmac... at lexicon.net> writes:
> > > I don't understand the point or value of filtering out all byte values
> > > greater than 127
> >
> > That's only done if the encoding isn't otherwise specified. In which
> > case, ASCII is the documented default encoding. In which case, it
> > *must* be restricted to code points 0+IBM-127, otherwise it's not ASCII.
> >
> > The value of doing this is to make it rapidly and repeatably apparent
> > when the programmer's assumptions about character encoding are false,
> > allowing the programming error to be fixed early rather than late.
>
> "make it rapidly and repeatably apparent ..." is much better achieved
> by raising an exception.
Ah, I misread; I thought you were asking about the value of defaulting
to ASCII and therefore raising an exception. It seems we agree on
that, then.
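
A minimal sketch of the point we agree on, written for Python 3 rather
than the Python 2 of this thread; the sample string is only an
illustration. Decoding non-ASCII bytes as ASCII fails immediately, while
silently dropping bytes above 127 hides the mistake:

```python
# Bytes that are valid UTF-8 but not ASCII (sample text is illustrative).
data = "中文".encode("utf-8")

# Strict ASCII decoding raises at once, exposing the wrong assumption.
raised = False
try:
    data.decode("ascii")
except UnicodeDecodeError as exc:
    raised = True
    print("decode failed:", exc)

# Discarding every byte greater than 127 succeeds, but loses the text.
lossy = data.decode("ascii", errors="ignore")
print(repr(lossy))
```

The exception names the offending byte and its position, which is what
makes the error apparent early.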
> What is that 0+IBM-127 +IBw-guess+IB0- gibberish in your posting?
It wasn't in my message as sent to my news server, nor as I read the
message in comp.lang.python. The message was encoded using UTF-8.
Perhaps it's since been munged in transit to your eyeballs by any of a
number of intermediaries.
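
For what it's worth, the sequences you quote happen to be valid UTF-7
for an en dash and a pair of curly quotation marks. That some
intermediary re-encoded the message as UTF-7 is only a guess on my part,
but the decoding itself can be checked with this short Python 3 sketch:

```python
# The text as it appeared in the quoted message.
garbled = b"0+IBM-127 +IBw-guess+IB0-"

# Interpret those bytes as UTF-7 (an assumption about what happened in transit).
decoded = garbled.decode("utf-7")
print(decoded)

# Show the code points that the "+IB?-" sequences stand for.
for ch in decoded:
    if ord(ch) > 127:
        print("U+%04X" % ord(ch))
```

The non-ASCII characters recovered are U+2013, U+201C and U+201D.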
--
\ “I bought some batteries, but they weren't included; so I had |
`\ to buy them again.” —Steven Wright |
_o__) |
Ben Finney