Newbie question about text encoding

Rustom Mody rustompmody at gmail.com
Tue Mar 3 23:45:13 EST 2015


On Wednesday, March 4, 2015 at 12:07:06 AM UTC+5:30, jmf wrote:
> On Tuesday, March 3, 2015 at 19:04:06 UTC+1, Rustom Mody wrote:
> > On Thursday, February 26, 2015 at 10:33:44 PM UTC+5:30, Terry Reedy wrote:
> > > On 2/26/2015 8:24 AM, Chris Angelico wrote:
> > > > On Thu, Feb 26, 2015 at 11:40 PM, Rustom Mody wrote:
> > > >> Wrote something up on why we should stop using ASCII:
> > > >> http://blog.languager.org/2015/02/universal-unicode.html
> > > 
> > > I think that the main point of the post, that many Unicode chars are 
> > > truly planetary rather than just national/regional, is excellent.
> > 
> > <snipped>
> > 
> > > You should add emoticons, but not call them or the above 'gibberish'.
> > > I think that this part of your post is more 'unprofessional' than the 
> > > character blocks.  It is very jarring and seems contrary to your main point.
> > 
> > Ok Done
> > 
> > References to gibberish removed from
> > http://blog.languager.org/2015/02/universal-unicode.html 
> > 
> > What I was trying to say expanded here
> > http://blog.languager.org/2015/03/whimsical-unicode.html
> > [Hope  the word 'whimsical' is less jarring and more accurate than 'gibberish']
> 
> ========
> 
> Emoji and Dingbats are now part of Unicode.
> They should be treated the same as a "1" or an "a"
> or a "mathematical alpha".
> So, there is nothing special to say about them.
> 
> jmf

Maybe you missed this section:
http://blog.languager.org/2015/03/whimsical-unicode.html#half-assed

It lists some examples of software that break or misbehave when going from
BMP-only Unicode to full Unicode 7.0, i.e. once characters outside the
Basic Multilingual Plane show up.

IOW the suggestion is that the two-way classification
- ASCII
- Unicode

is less useful and accurate than the 3-way

- ASCII
- BMP
- Unicode
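
To make the 3-way split concrete, here is a rough sketch in Python 3 (3.3 or
later assumed, so that a string is a sequence of code points). The function
name classify is made up for this mail; the only facts it relies on are that
ASCII stops at U+007F and the BMP stops at U+FFFF.

```python
def classify(ch):
    """Return 'ASCII', 'BMP' or 'Unicode' (beyond the BMP) for one character."""
    cp = ord(ch)
    if cp < 0x80:        # 7-bit ASCII: U+0000..U+007F
        return "ASCII"
    if cp <= 0xFFFF:     # rest of the Basic Multilingual Plane
        return "BMP"
    return "Unicode"     # supplementary planes: U+10000..U+10FFFF


for ch in ("a", "λ", "𝛌"):
    print(ch, classify(ch))
```

With this, "a" lands in ASCII, the Greek λ in BMP, and the math 𝛌 in the
third bucket.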

Personally I would be pleased if 𝛌 were used for the math-lambda and
λ left alone for Greek-speaking users' identifiers.
However, one should draw a line between personal preferences and a universal(izable) standard.
As of now, λ works in Blogger whereas 𝛌 breaks Blogger -- it gets replaced by �.
Similar breakages currently occur in Java, JavaScript, Emacs, MySQL, IDLE, Windows, various fonts, etc. [Only one of these is remotely connected with Python.]
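
One plausible reason for this kind of breakage (my guess at the mechanism,
not something verified for each program above) is that a non-BMP character
needs 4 bytes in UTF-8 and a surrogate pair in UTF-16, so code that assumes
"at most 3 bytes" or "one 16-bit unit per character" trips over it. A small
Python 3 sketch; encoded_sizes is a hypothetical helper name:

```python
def encoded_sizes(ch):
    """Return (UTF-8 byte count, UTF-16 code unit count) for one character."""
    utf8_bytes = len(ch.encode("utf-8"))
    # utf-16-le writes no byte-order mark, so every 2 bytes is one code unit
    utf16_units = len(ch.encode("utf-16-le")) // 2
    return utf8_bytes, utf16_units


for ch in ("λ", "𝛌"):
    print(ch, encoded_sizes(ch))
```

The Greek λ comes out as (2, 1) and the math 𝛌 as (4, 2).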

So BMP is practical, 7.0 is idealistic. You are free to pick 😏😉


