Python 3 is killing Python
wxjmfauth at gmail.com
Wed Jul 16 09:22:46 EDT 2014
On Wednesday, 16 July 2014 15:11:26 UTC+2, Marko Rauhamaa wrote:
> Steven D'Aprano <steve+comp.lang.python at pearwood.info>:
>
> > With a few exceptions, /etc is filled with text files, not binary
> > files, and half the executables on the system are text (Python, Perl,
> > bash, sh, awk, etc.).
>
> Our debate seems to stem from a different idea of what text is. To me,
> text in the Python sense is a sequence of UCS-4 character code points.
> The opposite of text is not necessarily binary.
>
> Most of those "text" files under /etc expect ASCII. In many contexts,
> they tolerate UTF-8 or Latin-3 or whatever, but it's a bit iffy (how are
> extra-ASCII passwords encoded in the /etc/shadow?). Also, the files
> under /etc, /var/log etc should not depend on the locale since they are
> typically interpreted by daemons, which typically don't possess locales.
>
> > Relatively rare. Like, um, email, news, html, Unix config files,
> > Windows ini files, source code in just about every language ever,
> > SMSes, XML, JSON, YAML, instant messenger apps,
>
> I would be especially wary of letting Python 3 interpret those files for
> me. Python's [text] strings could be a wonderful tool on the inside of
> my program, but I definitely would like to micromanage the I/O. Do I
> obey the locale or not? That's too big (and painful) a question for
> Python to answer on its own (and pretend like everything's under
> control).
>
> > word processors... even *graphic* applications invariably have a text
> > tool.
>
> Thing is, the serious text utilities like word processors probably need
> lots of ancillary information so Python's [text] strings might be too
> naive to represent even a single character.
>
> >> More often, len(b'λ') is what I want.
> >
> > Oh really? Are you sure? What exactly is b'λ'?
>
> That's something that ought to work in the UTF-8 paradise.
> Unfortunately, Python only allows ASCII in bytes. ASCII only! In this
> day and age! Even C is not so picky:
>
> #include <stdio.h>
>
> int main()
> {
>     printf("Hyvää yötä\n");
>     return 0;
> }
>
> Marko
--------
And if you visit or keep an eye on the bug tracker,
dev lists and ... you will happily find this perpetual
way of thinking:

    if ascii:
        do stuff
    else:
        find a workaround

It's quite funny for a tool which pretends
to live in the Unicode world.
jmd
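
For readers who want to check the b'λ' point from the quoted message, here is a minimal sketch, assuming Python 3 and a UTF-8 source file. The variable names (text, utf8_bytes, literal_ok) are only illustrative. It shows that the bytes literal is rejected at compile time, and that the byte count can still be obtained by encoding the string explicitly.

```python
# Python 3 bytes literals accept only ASCII characters, so b'λ' does not compile.
# Encoding the str explicitly gives the byte count instead.
text = "λ"

utf8_bytes = text.encode("utf-8")
print(len(text))        # number of code points in the str
print(len(utf8_bytes))  # number of bytes in the UTF-8 encoding

# Compile the literal from a string so this script itself stays valid.
try:
    compile("b'λ'", "<example>", "eval")
    literal_ok = True
except SyntaxError:
    literal_ok = False

print(literal_ok)
```

The byte count depends on the encoding passed to encode(), which is the choice the quoted message says it wants to make by hand.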