python for everyday tasks
wxjmfauth at gmail.com
Mon Nov 25 11:17:34 EST 2013
On Monday, 25 November 2013 16:11:22 UTC+1, Michael Torrie wrote:
> I only respond here, as unicode in general is an important concept that
> the OP will want to make sure his students understand in Python, and I don't
> want you to dishonestly sow the seeds of uncertainty and doubt.
>
> On 11/25/2013 03:12 AM, wxjmfauth at gmail.com wrote:
> > Your paragraph is mixing different concepts.
>
> On the contrary, it appears you are the one mixing the concepts, and
> confusing a byte-encoding scheme with unicode.
>
> In an ideal world, the programmer should not need to know or care about
> what encoding scheme the language is using internally to store strings.
> And it does not matter whether the internal encoding scheme is endorsed
> by the Unicode Consortium or not, provided it can handle all the valid
> unicode constructs.
>
> A string is unicode. Period. Hence you must concern yourself with
> encoding only when reading or writing a byte stream.
>
> Inside the language itself, the encoding is irrelevant. Ideally. In
> python 3.3+ anyway. Of course reality is different in other languages,
> which is why programmers are used to worrying about things like exposed
> surrogate pairs (as in Javascript), or having to tweak their algorithms
> to deal with the fact that UTF-8 indexing is not O(1). To claim that a
> programmer has to concern himself with internal language encoding in
> Python 3 is not only untrue, it's disingenuous at best, given the OP's
> mission.
>
> > When it comes to saving memory, utf-8 is the choice. It
> > largely beats the FSR in terms of memory and in
> > terms of performance.
>
> So you would condemn everyone to use an O(n) encoding for a string when
> FSR offers full unicode compliance that optimizes both speed and memory?
>
> No, D'Aprano is correct. Python 3.3+ indeed does unicode right. It
> offers O(1) slicing, is memory efficient, and never exposes things like
> surrogate pairs.
>
> > How and why? I suggest you acquire a deeper understanding
> > of unicode.
>
> Indeed I'd say D'Aprano does have a deeper understanding of unicode.
>
> > May I remind you, it is one of the coding schemes endorsed
> > by "Unicode.org" and it is widely used. This is not
> > by chance.
>
> Yes, you keep saying this. Have you encountered a real-world situation
> where you are impacted by Python's FSR? You keep posting silly
> benchmarks that prove nothing, and continue arguing, yet presumably you
> are still using Python. Why haven't you switched to Google Go or
> another language that implements unicode strings in UTF-8?
------
Everybody has the right to an opinion. To be clear,
I respect Steven's opinion.
---
I'm aware of the utf-8 indexing "effect" (it is in fact the
answer I expected); that's why I proposed diving a little
deeper into "unicode".
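
For readers unfamiliar with that indexing "effect", here is a minimal
sketch of it. The sample string and the helper utf8_index are made up
for illustration; this is not how any particular language implements
its strings, only a picture of why finding the i-th code point in
UTF-8 data means scanning from the start.

```python
# Illustrative sketch: code-point indexing on str vs. scanning UTF-8 bytes.
# The sample string and the helper name utf8_index are hypothetical.
s = "a\u00e9\u20ac\U0001F600"  # 'a', e-acute, euro sign, one non-BMP emoji

encoded = s.encode("utf-8")

n_code_points = len(s)    # str counts code points
n_bytes = len(encoded)    # 1 + 2 + 3 + 4 bytes in UTF-8


def utf8_index(data: bytes, i: int) -> str:
    """Return the i-th code point of UTF-8 data by scanning from the start."""
    if i < 0:
        raise IndexError(i)
    count = -1
    start = None
    for pos, byte in enumerate(data):
        if byte & 0xC0 != 0x80:  # not a continuation byte: a new code point begins
            count += 1
            if count == i:
                start = pos
            elif count == i + 1:
                return data[start:pos].decode("utf-8")
    if start is None:
        raise IndexError(i)
    return data[start:].decode("utf-8")


print(n_code_points, n_bytes)
print(s[2] == utf8_index(encoded, 2))
```

The scan is linear in the position requested, whereas s[2] on a
Python 3.3+ str is a direct lookup.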
Now something else.
I practically no longer program in the sense of creating
applications; I'm mainly interested in unicode. I have "toyed" with
many tools: C#, Go, ruby2 and, my favorite, the TeX unicode engines.
I just happen to have extensive experience with Python, and I find
this FSR fascinating.
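
A rough way to observe the FSR from the interpreter, assuming CPython
3.3 or later (sys.getsizeof reports implementation-specific sizes, so
other interpreters may behave differently). The helper per_char_bytes
is made up for this sketch; it estimates the storage per code point by
comparing two strings of different lengths, so the fixed object
header cancels out.

```python
import sys


def per_char_bytes(ch: str, n: int = 1000) -> float:
    """Estimate bytes stored per code point for a string made only of ch.

    Hypothetical helper: relies on CPython's sys.getsizeof accounting.
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) / n


samples = [
    ("ASCII", "a"),
    ("Latin-1", "\u00e9"),
    ("BMP", "\u20ac"),
    ("non-BMP", "\U0001F600"),
]

for label, ch in samples:
    # The width is chosen per string from its largest code point.
    print(label, per_char_bytes(ch))
```

On CPython the estimate should grow with the largest code point in
the string, which is the flexible part of the representation.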
jmf