Letter class in re

Serhiy Storchaka storchaka at gmail.com
Mon Mar 9 09:43:00 EDT 2015


On 09.03.15 14:26, Antoon Pardon wrote:
> So if I understand correctly the following should be a regular expression for
> a python3 identifier.
>
>    (?:(?!_|\d)\w)\w+
>
> It seems odd that one should need such an ugly expression for something that is
> used rather frequently for parsing computer languages and the like.

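It is not that simple. Some characters are valid inside an identifier but cannot start one, and they are not all digits: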

>>> import sys
>>> allchars = ''.join(map(chr, range(sys.maxunicode+1)))
>>> ''.join(c for c in allchars if ('a'+c).isidentifier() and not (c+'a').isidentifier() and not c.isdigit())
'·̴̵̶̷̸̡̢̧̨̛̖̗̘̙̜̝̞̟̠̣̤̥̦̩̪̫̬̭̮̯̰̱̲̳̹̺̻̼͇͈͉͍͎̀́̂̃̄̅̆̇̈̉̊̋̌̍̎̏̐̑̒̓̔̽̾̿̀́͂̓̈́͆͊͋͌̕̚ͅ͏͓͔͕͖͙͚͐͑͒͗͛ͣͤͥͦͧͨͩͪͫͬͭͮͯ͘͜͟͢͝͞͠͡·ְֱֲֳִֵֶַָׇֹֺֻּֽֿׁׂًٌٍؘَؙُؚِّْٰܑ֑֖֛֢֣֤֥֦֧֪ׅٕٖٜٟۣ۪ۭܱܴܷܸܹܻܼܾ݂݄݆݈֚֭֮҃҄҅҆҇֒֓֔֕֗֘֙֜֝֞֟֠֡֨֩֫֬֯ׄؐؑؒؓؔؕؖؗٓٔٗ٘ٙٚٛٝٞۖۗۘۙۚۛۜ۟۠ۡۢۤۧۨ۫۬ܰܲܳܵܶܺܽܿ݀݁݃݅݇݉݊ަާިީުޫެޭޮޯްࣰࣱࣲ߲࡙࡚࡛ࣦࣩ࣭࣮࣯ࣶࣹࣺ߫߬߭߮߯߰߱߳ࠖࠗ࠘࠙ࠛࠜࠝࠞࠟࠠࠡࠢࠣࠥࠦࠧࠩࠪࠫࠬ࠭ࣤࣥࣧࣨ࣪࣫࣬ࣳࣴࣵࣷࣸࣻࣼࣽࣾࣿऀँंःऺऻ़ािीुूृॄॅॆेैॉॊोौ्ॎॏ॒॑॓॔ॕॖॗॢॣঁংঃ়ািীুূৃৄেৈোৌ্ৗৢৣਁਂਃ਼ਾਿੀੁੂੇੈੋੌ੍ੑੰੱੵઁંઃ઼ાિીુૂૃૄૅેૈૉોૌ્ૢૣଁଂଃ଼ାିୀୁୂà­
ƒà­„େୈୋୌ୍ୖୗୢୣஂாிீுூெேைொோௌ்ௗఀఁంఃాిీుూృౄెేైొోౌ్ౕౖౢౣಁಂಃ಼ಾಿೀುೂೃೄೆೇೈೊೋೌ್ೕೖೢೣഁംഃാിീുൂൃൄെേൈൊോൌ്ൗൢൣංඃ්ාැෑිීුූෘෙේෛොෝෞෟෲෳ 
ัำิีึืฺุู็่้๊๋์ํ๎ັຳິີຶືຸູົຼ່້໊໋໌ໍ༹༘༙༵༷༾༿ཱཱཱིིུུྲྀཷླྀཹེཻོཽཾཿ྄ཱྀྀྂྃ྆྇ྍྎྏྐྑྒྒྷྔྕྖྗྙྚྛྜྜྷྞྟྠྡྡྷྣྤྥྦྦྷྨྩྪྫྫྷྭྮྯྰྱྲླྴྵྶྷྸྐྵྺྻྼ࿆ါာိီုူေဲဳဴဵံ့း္်ျြွှၖၗၘၙၞၟၠၢၣၤၧၨၩၪၫၬၭၱၲၳၴႂႃႄႅႆႇႈႉႊႋႌႍႏႚႛႜႝ፝፞፟ᜒᜓ᜔ᜲᜳ᜴ᝒᝓᝲᝳ 
ាិីឹឺុូួើឿៀេែៃោៅំះៈ៉៊់៌៍៎៏័៑្៓៝ 
᠋᠌᠍ᢩᤠᤡᤢᤣᤤᤥᤦᤧᤨᤩᤪᤫᤰᤱᤲᤳᤴᤵᤶᤷᤸ᤻᤹᤺ᦰᦱᦲᦳᦴᦵᦶᦷᦸᦹᦺᦻᦼᦽᦾᦿᧀᧈᧉᨘᨗᨙᨚᨛᩕᩖᩗᩘᩙᩚᩛᩜᩝᩞ᩠ᩡᩢᩣᩤᩥᩦᩧᩨᩩᩪᩫᩬᩭᩮᩯᩰᩱᩲᩳᩴ᩿᪵᪶᪷᪸᪹᪺᪽᩵᩶᩷᩸᩹᩺᩻᩼᪰᪱᪲᪳᪴᪻᪼ᬀᬁᬂᬃᬄ᬴ᬵᬶᬷᬸᬹᬺᬻᬼᬽᬾᬿᭀᭁᭂᭃ᭄᭬᭫᭭᭮᭯᭰᭱᭲᭳ᮀᮁᮂᮡᮢᮣᮤᮥᮦᮧᮨᮩ᮪᮫ᮬᮭ᯦ᯧᯨᯩᯪᯫᯬᯭᯮᯯᯰᯱ᯲᯳ᰤᰥᰦᰧᰨᰩᰪᰫᰬᰭᰮᰯᰰᰱᰲᰳᰴᰵᰶ᳔᰷᳕᳖᳗᳘᳙᳜᳝᳞᳟᳐᳑᳒᳚᳛᳠᳡᳢᳣᳤᳥᳦᳧᳨᳭ᳲᳳ᷐᷎᷂᷊᷏᷽᷿᳴᳸᳹᷀᷁᷃᷄᷅᷆᷇᷈᷉᷋᷌᷑᷒ᷓᷔᷕᷖᷗᷘᷙᷚᷛᷜᷝᷞᷟᷠᷡᷢᷣᷤᷥᷦᷧᷨᷩᷪᷫᷬᷭᷮᷯᷰᷱᷲᷳᷴ᷵᷾᷼᷍‿⁀⁔⃒⃓⃘⃙⃚⃥⃦⃪⃫⵿⃨⃬⃭⃮⃯⃐⃑⃔⃕⃖⃗⃛⃜⃡⃧⃩⃰⳯⳰⳱ⷠⷡⷢⷣⷤⷥⷦⷧⷨⷩⷪⷫⷬⷭⷮⷯⷰⷱⷲⷳⷴⷵⷶⷷⷸ
ⷹⷺⷻⷼⷽⷾⷿ 
゙゚〪〭〮〯〫〬 
꙯ꙴꙵꙶꙷꙸꙹꙺꙻ꙼꙽ꚟ꛰꛱ꠂ꠆ꠋꠣꠤꠥꠦꠧꢀꢁꢴꢵꢶꢷꢸꢹꢺꢻꢼꢽꢾꢿꣀꣁꣂꣃ꣄꣠꣡꣢꣣꣤꣥꣦꣧꣨꣩꣪꣫꣬꣭꣮꣯꣰꣱ꤦꤧꤨꤩꤪ꤫꤬꤭ꥇꥈꥉꥊꥋꥌꥍꥎꥏꥐꥑꥒ꥓ꦀꦁꦂꦃ꦳ꦴꦵꦶꦷꦸꦹꦺꦻꦼꦽꦾꦿ꧀ꧥꨩꨪꨫꨬꨭꨮꨯꨰꨱꨲꨳꨴꨵꨶꩃꩌꩍꩻꩼꩽꪴꪰꪲꪳꪷꪸꪾ꪿꫁ꫫꫬꫭꫮꫯꫵ꫶ꯣꯤꯥꯦꯧꯨꯩꯪ꯬꯭ﬞ︀︁︂︃︄︅︆︇︈︉︊︋︌︍︎️︧︨︩︪︫︬︭︠︡︢︣︤︥︦︳︴﹍﹎﹏_゙゚ 
𐇽𐋠𐍶𐍷𐍸𐍹𐍺𐨁𐨂𐨃𐨅𐨆𐨌𐨍𐨎𐨹𐨿𐨺𐫦𐨏𐨸𐫥𑀀𑀁𑀂𑀸𑀹𑀺𑀻𑀼𑀽𑀾𑀿𑁀𑁁𑁂𑁃𑁄𑁅𑁆𑁿𑂀𑂁𑂂𑂰𑂱𑂲𑂳𑂴𑂵𑂶𑂷𑂸𑂺𑂹𑄀𑄁𑄂𑄧𑄨𑄩𑄪𑄫𑄬𑄭𑄮𑄯𑄰𑄱𑄲𑅳𑄳𑄴𑆀𑆁𑆂𑆳𑆴𑆵𑆶𑆷𑆸𑆹𑆺𑆻𑆼𑆽𑆾𑆿𑇀𑈬𑈭𑈮𑈯𑈰𑈱𑈲𑈳𑈴𑈶𑈵𑈷𑋟𑋠𑋡𑋢𑋣𑋤𑋥𑋦𑋧𑋨𑋩𑋪𑌁𑌂𑌃𑌼𑌾𑌿𑍀𑍁𑍂𑍃𑍄𑍇𑍈𑍋𑍌𑍍𑍗𑍢𑍣𑍦𑍧𑍨𑍩𑍪𑍫𑍬𑍰𑍱𑍲𑍳𑍴𑒰𑒱𑒲𑒳𑒴𑒵𑒶𑒷𑒸𑒻𑒻𑒼𑒽𑒾𑒿𑓀𑓁𑓃𑓂𑖯𑖰𑖱𑖲𑖳𑖴𑖵𑖸𑖹𑖺𑖻𑖼𑖽𑖾𑗀𑖿𑘰𑘱𑘲𑘳𑘴𑘵𑘶𑘷𑘸𑘹𑘺𑘻𑘼𑘽𑘾𑘿𑙀𑚫𑚬𑚭𑚮𑚯𑚰𑚱𑚲𑚳𑚴𑚵𖫰𖫱𖫲𖫳𖫴𑚷𑚶𖬰𖬱𖬲𖬳𖬴𖬵𖬶𖽑𖽒𖽓𖽔𖽕𖽖𖽗𖽘𖽙𖽚𖽛𖽜𖽝𖽞𖽟𖽠𖽡𖽢𖽣𖽤𖽥𖽦𖽧𖽨𖽩𖽪ð–½
«ð–½¬ð–½­ð–½®ð–½¯ð–½°ð–½±ð–½²ð–½³ð–½´ð–½µð–½¶ð–½·ð–½¸ð–½¹ð–½ºð–½»ð–½¼ð–½½ð–½¾ð–¾ð–¾ð–¾‘𖾒𛲝𛲞𝅧𝅨𝅩𝅥𝅦𝅮𝅯𝅰𝅱𝅲𝅻𝅼𝅽𝅾𝅿𝆀𝆁𝆂𝆊𝆋𞣐𞣑𞣒𞣓𞣔𞣕𞣖𝅭𝆅𝆆𝆇𝆈𝆉𝆪𝆫𝆬𝆭𝉂𝉃𝉄󠄀󠄁󠄂󠄃󠄄󠄅󠄆󠄇󠄈󠄉󠄊󠄋󠄌󠄍󠄎󠄏󠄐󠄑󠄒󠄓󠄔󠄕󠄖󠄗󠄘󠄙󠄚󠄛󠄜󠄝󠄞󠄟󠄠󠄡󠄢󠄣󠄤󠄥󠄦󠄧󠄨󠄩󠄪󠄫󠄬󠄭󠄮󠄯󠄰󠄱󠄲󠄳󠄴󠄵󠄶󠄷󠄸󠄹󠄺󠄻󠄼󠄽󠄾󠄿󠅀󠅁󠅂󠅃󠅄󠅅󠅆󠅇󠅈󠅉󠅊󠅋󠅌󠅍󠅎󠅏󠅐󠅑󠅒󠅓󠅔󠅕󠅖󠅗󠅘󠅙󠅚󠅛󠅜󠅝󠅞󠅟󠅠󠅡󠅢󠅣󠅤󠅥󠅦󠅧󠅨󠅩󠅪󠅫󠅬󠅭󠅮󠅯󠅰󠅱󠅲󠅳󠅴󠅵󠅶󠅷󠅸󠅹󠅺󠅻󠅼󠅽󠅾󠅿󠆀󠆁󠆂󠆃󠆄󠆅󠆆󠆇󠆈󠆉󠆊󠆋󠆌󠆍󠆎󠆏󠆐󠆑󠆒󠆓󠆔󠆕󠆖󠆗󠆘󠆙󠆚󠆛󠆜󠆝󠆞󠆟󠆠󠆡󠆢󠆣󠆤󠆥󠆦󠆧󠆨󠆩󠆪󠆫󠆬󠆭󠆮󠆯󠆰󠆱󠆲󠆳󠆴󠆵󠆶󠆷ó 
†¸ó †¹ó †ºó †»ó †¼ó †½ó †¾ó †¿ó ‡€ó ‡ó ‡‚󠇃󠇄󠇅󠇆󠇇󠇈󠇉󠇊󠇋󠇌󠇍󠇎󠇏󠇐󠇑󠇒󠇓󠇔󠇕󠇖󠇗󠇘󠇙󠇚󠇛󠇜󠇝󠇞󠇟󠇠󠇡󠇢󠇣󠇤󠇥󠇦󠇧󠇨󠇩󠇪󠇫󠇬󠇭󠇮󠇯'
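
A rough way to explore this further is sketched below. It wraps the check from the session above in a generator and compares the quoted pattern against str.isidentifier(). The names continue_only_chars, IDENT_RE and regex_disagreements are made up for illustration, and the limit parameter is an assumption added so that a small range can be scanned quickly.

```python
import re
import sys


def continue_only_chars(limit=sys.maxunicode + 1):
    """Yield non-digit characters that may continue an identifier but not start one.

    `limit` is a hypothetical convenience parameter: only code points
    below it are scanned.
    """
    for cp in range(limit):
        c = chr(cp)
        if ('a' + c).isidentifier() and not (c + 'a').isidentifier() and not c.isdigit():
            yield c


# The pattern quoted earlier in the thread, compiled unchanged.
IDENT_RE = re.compile(r'(?:(?!_|\d)\w)\w+')


def regex_disagreements(candidates):
    """Return the candidates on which IDENT_RE and str.isidentifier() disagree."""
    return [
        s for s in candidates
        if (IDENT_RE.fullmatch(s) is not None) != s.isidentifier()
    ]


if __name__ == '__main__':
    # Scan only Latin-1 here; the full range takes noticeably longer.
    print([hex(ord(c)) for c in continue_only_chars(0x100)])
    # A few hand-picked samples, including a combining accent and a middle dot.
    print(regex_disagreements(['abc', 'x1', '_x', 'x', 'a\u0301', 'a\u00b7']))
```

On these hand-picked samples the quoted pattern also appears to disagree with str.isidentifier() for names such as '_x' and 'x' (a leading underscore, or a single character), independently of the characters listed above.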




