Extend unicodedata with a name/pattern/regex search for character entity references?

Rustom Mody rustompmody at gmail.com
Sat Sep 10 07:24:29 EDT 2016


On Saturday, September 10, 2016 at 3:56:37 PM UTC+5:30, Veek 'this_is_not_my_name' M wrote:
> Veek 'this_is_not_my_name' M wrote:


Recursion… Self-Reference… Inversion
Heh! On the way to becoming another Gödel/Turing??

You may be interested in this collection of evidence that recursion is
one of the most central ideas in computer science:
http://blog.languager.org/2012/05/recursion-pervasive-in-cs.html

> 
> > Rustom Mody wrote:
> > 
> >> On Saturday, September 3, 2016 at 5:25:48 PM UTC+5:30, Veek. M wrote:
> >>> https://mail.python.org/pipermail//python-ideas/2014-October/029630.htm
> >>> 
> >>> Wanted to know if the idea in the above link had been implemented, and
> >>> if there's a module that accepts a pattern like 'cap' and gives you
> >>> all the instances of unicode 'CAP' characters.
> >>>  ⋂ \bigcap
> >>>  ⊓ \sqcap
> >>>  ∩ \cap
> >>>  ♑ \capricornus
> >>>  ⪸ \succapprox
> >>>  ⪷ \precapprox
> >>> 
> >>> (the above is from TeX)
> >>> 
> >>> I found two useful modules in this regard: unicode_tex, unicodedata
> >>> but unicodedata is a standard-library module which does not do globs
> >>> or regexes, so it's kind of limiting in nature.
> >>> 
> >>> Would be nice if you could search html/xml character entity
> >>> references as well.
> >> 
> >> [Not exactly an answer]
> >> 
> >> I use a number of things for this:
> >> 1. Google
> >> 2. Xah Lee’s excellent pages, which often fit my brain better than
> >> Wikipedia:
> >>    http://xahlee.info/comp/unicode_index.html
> >> 3. emacs’ function ucs-insert, recently renamed to insert-char,
> >>    i.e. [in emacs] type Alt-x insert-char.
> >>    After that some kind of TAB-globbing (case-insensitive) works.
> >>    I won't try with Cap (because the number of *CAPITAL* matches is
> >>    in the thousands!). E.g. alphaTAB gives nothing; however, *alphaTAB
> >>    gives a bunch. Narrow to "greek alpha"TAB and you get a bunch.
> >> 
> >> 
> >> We should have a series of levels for char-input, from most general
> >> and unergonomic (google) to most specific and ergonomic (special
> >> purpose keyboard). I've tried to describe these as 7 levels near the
> >> end of http://blog.languager.org/2015/01/unicode-and-universe.html
> > 
> > 
> > got dengu - i'm dead
> sorry false alarm, but i was sick enough to be awol

Given that
>>> "dengue" == "dengu"
False

you should be fine ;-)
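
Coming back to the original question (a regex search over character names
and over html entity references): a minimal sketch, assuming a brute-force
scan over every code point is acceptable. The function names
search_unicode_names and search_html_entities are made up for illustration;
only unicodedata.name and html.entities.html5 come from the standard
library. Note that Unicode names are not TeX names: ∩ is INTERSECTION in
Unicode, so a 'CAP' search finds ⊓ (SQUARE CAP) but not ∩, whereas the html
entity table does have &cap;.

```python
import re
import sys
import unicodedata
from html.entities import html5


def search_unicode_names(pattern, flags=re.IGNORECASE):
    """Yield (char, name) for each code point whose Unicode name matches.

    Hypothetical helper: scans all code points, so it is simple, not fast.
    """
    rx = re.compile(pattern, flags)
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        # unicodedata.name returns the default ("") for unnamed code points
        name = unicodedata.name(ch, "")
        if name and rx.search(name):
            yield ch, name


def search_html_entities(pattern, flags=re.IGNORECASE):
    """Yield (text, "&entity;") for each HTML5 entity whose name matches.

    Hypothetical helper: only the ';'-terminated keys of html5 are used.
    """
    rx = re.compile(pattern, flags)
    for entity, text in sorted(html5.items()):
        if entity.endswith(";") and rx.search(entity):
            yield text, "&" + entity


if __name__ == "__main__":
    # Whole-word CAP avoids the thousands of *CAPITAL* matches
    for ch, name in search_unicode_names(r"\bCAP\b"):
        print(ch, name)
    for text, entity in search_html_entities(r"cap"):
        print(text, entity)
```

Anchoring the pattern (\bCAP\b rather than plain CAP) is the regex
counterpart of narrowing the TAB-completion in emacs.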



More information about the Python-list mailing list