Newbie question about text encoding
Rustom Mody
rustompmody at gmail.com
Tue Mar 3 23:45:13 EST 2015
On Wednesday, March 4, 2015 at 12:07:06 AM UTC+5:30, jmf wrote:
> Le mardi 3 mars 2015 19:04:06 UTC+1, Rustom Mody a écrit :
> > On Thursday, February 26, 2015 at 10:33:44 PM UTC+5:30, Terry Reedy wrote:
> > > On 2/26/2015 8:24 AM, Chris Angelico wrote:
> > > > On Thu, Feb 26, 2015 at 11:40 PM, Rustom Mody wrote:
> > > >> Wrote something up on why we should stop using ASCII:
> > > >> http://blog.languager.org/2015/02/universal-unicode.html
> > >
> > > I think that the main point of the post, that many Unicode chars are
> > > truly planetary rather than just national/regional, is excellent.
> >
> > <snipped>
> >
> > > You should add emoticons, but not call them or the above 'gibberish'.
> > > I think that this part of your post is more 'unprofessional' than the
> > > character blocks. It is very jarring and seems contrary to your main point.
> >
> > Ok Done
> >
> > References to gibberish removed from
> > http://blog.languager.org/2015/02/universal-unicode.html
> >
> > What I was trying to say expanded here
> > http://blog.languager.org/2015/03/whimsical-unicode.html
> > [Hope the word 'whimsical' is less jarring and more accurate than 'gibberish']
>
> ========
>
> Emoji and Dingbats are now part of Unicode.
> They should be considered as well as a "1" or a "a"
> or a "mathematical alpha".
> So, there is nothing special to say about them.
>
> jmf
Maybe you missed this section:
http://blog.languager.org/2015/03/whimsical-unicode.html#half-assed
It lists some examples of software that breaks or goofs when going from BMP-only
Unicode to Unicode 7.0.
IOW the suggestion is that the two-way classification
- ASCII
- Unicode
is less useful and accurate than the 3-way
- ASCII
- BMP
- Unicode
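
As a rough illustration of the 3-way split above, here is a minimal Python 3 sketch that sorts a character into one of the three classes by code point alone. The function name classify and the label "beyond BMP" are made up for this example; the boundaries used (below U+0080 for ASCII, up to U+FFFF for the BMP) are the standard ones.

```python
def classify(ch: str) -> str:
    """Place a single character in the 3-way ASCII / BMP / Unicode split."""
    cp = ord(ch)
    if cp < 0x80:        # 7-bit range
        return "ASCII"
    if cp <= 0xFFFF:     # Basic Multilingual Plane (plane 0)
        return "BMP"
    return "beyond BMP"  # supplementary planes 1..16

# Code points are written as escapes so the example does not depend on
# the editor or terminal handling non-BMP characters correctly.
samples = (
    "a",            # Latin small letter a
    "\u03bb",       # Greek small letter lamda
    "\U0001D6CC",   # mathematical bold small lamda
    "\U0001F60F",   # smirking face emoji
)
for ch in samples:
    print(f"U+{ord(ch):04X} {classify(ch)}")
```
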
Personally I would be pleased if 𝛌 were used for the math-lambda and
λ left alone for Greek-speaking users' identifiers.
However one should draw a line between personal preferences and a universal(izable) standard.
As of now, λ works in blogger whereas 𝛌 breaks blogger -- gets replaced by �.
Similar breakages are current in Java, JavaScript, Emacs, MySQL, IDLE, Windows, various fonts etc. [Only one of these is remotely connected with Python]
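
One plausible mechanism behind this kind of breakage (an assumption for illustration, not a diagnosis of any particular program listed) is code that expects every character to fit in one 16-bit UTF-16 unit, or in at most three UTF-8 bytes. A minimal Python 3 sketch, with hypothetical helper names utf16_units and utf8_bytes, shows where λ and 𝛌 differ on both counts:

```python
def utf16_units(ch: str) -> int:
    """Number of 16-bit code units needed for ch in UTF-16."""
    return len(ch.encode("utf-16-le")) // 2

def utf8_bytes(ch: str) -> int:
    """Number of bytes needed for ch in UTF-8."""
    return len(ch.encode("utf-8"))

greek_lamda = "\u03bb"      # inside the BMP
math_lamda = "\U0001D6CC"   # outside the BMP

for name, ch in (("greek", greek_lamda), ("math", math_lamda)):
    print(name, f"U+{ord(ch):04X}",
          "utf16 units:", utf16_units(ch),
          "utf8 bytes:", utf8_bytes(ch))
```

The BMP character needs a single UTF-16 unit and two UTF-8 bytes, while the non-BMP one needs a surrogate pair and four bytes, so software that hard-codes the smaller sizes can truncate or mangle it.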
So BMP is practical, 7.0 is idealistic. You are free to pick 😏😉