From kirby.urner at gmail.com Sun May 10 19:29:06 2015
From: kirby.urner at gmail.com (kirby urner)
Date: Sun, 10 May 2015 10:29:06 -0700
Subject: [I18n-sig] basic question about collation strategies
Message-ID:

My nonprofit is only beginning to address non-Latin-1 characters in
full names in corporate listings.

My current plan is to allow a Full_name field in any script, e.g.
Devanagari, but then insist on at least single letters A-Z in the Last
and First name fields.

Some examples:

https://flic.kr/p/sqV14G
(using religious types from Wikipedia for pseudo-records)

Although I've worked in libraries which have alphabetization worked out
across multiple languages (I could return Arabic titles to their proper
place in my heyday), I am less sure of how Unicode handles collation
across all language boundaries.

It seemed easier to use the Roman alphabet to force a simple last, first
collation, whereas Full_name is not used for collation at all and may be
in any script supported by Unicode.

Given that Roman letters have phonetic value, you look up the Full_name
based on how you'd sound it out in "Romaji" (the Nipponese name for
Roman letter script, the script of Python's keywords and Standard
Library).

Is there an industry standard I should know about, and is my
simplification of alpha searching an accepted strategy?

Kirby

From kirby.urner at gmail.com Tue May 19 16:28:10 2015
From: kirby.urner at gmail.com (kirby urner)
Date: Tue, 19 May 2015 07:28:10 -0700
Subject: [I18n-sig] [ case study ] putting i18n into practice
Message-ID:

Here's a post to another listserv where I talk about putting my
Romanized Collation into practice, such that fullnames, even if not
Latin-1 in spelling, show up phonetically in a place that makes sense to
browsers / app users. Links back to this archive.

The ds.npym or ds-npym links below are science fiction at this point, as
the app in question, though partially implemented in Django-on-Heroku,
is not resident in the npym.org domain (yet).

Kirby

From: npym-it-discuss
Date: Tue, May 19, 2015 at 7:24 AM
Subject: Fwd: The "Name Tag" problem (and proposed solution)

On Monday, May 18, 2015 at 1:15:59 PM UTC-7, kirby urner wrote:
>
> This is not in specific reference to anything in the Registration
> Application.
>
> The Reg App works well based on Last to anchor household, with exceptions
> OK, then each in the household goes by First. This is "business casual"
> à la NPYM and works well for us.
>
> I'm focusing more on my Django on Heroku App, a prototype for a new
> directory service, not either / or with CD / DVD.
>
> Looking to the future, I'm anticipating a smattering of events, if not
> Annual Session, say at a Monthly Meeting, where name tags would NOT be all
> Roman letters (or Latin-1).
>
> AFSC needs non-Latin-1 a lot. Half the people I meet at the
> airport may not have Latin-1 names (just kidding, more like one fourth).
>
> As I put it to William, the Russian-speaking returned vet, now civilian:
>
> ===
>
> The "nametag problem" is listing full names alphabetically across multiple
> languages. The "When in Rome..." solution I came up with is: ask
> Russians, Chinese, Koreans etc. to sound it out and Romanize phonetically,
> in order to slot the name in using some standard A-Z collation.
>
> Picture an SQL database: Full_name, Last, First
>
> My name is U Thant in Burmese but how do I get my name to show up under
> Thant? Easy, right?
>
> https://www.flickr.com/photos/kirbyurner/17347213022/in/album-72157649301627162/
> https://mail.python.org/pipermail/i18n-sig/2015-May/002131.html
>
> ===
>
> I'm playing with the idea that when we go to:
>
> http://ds-npym.org/F
>
> to get all names starting with F, we might *only* get a listing of
> fullnames, with some placed in the collation by Romanized /
> phoneticized values in Last and First, which we do not display.
>
> In other words, if ????? ????????? appears between Joe Fergler and Linda
> Findhorne, that would make sense, because in the Last, First fields we have
> something Romanized (but not displayed).
>
> Joe Fergler (efm)
> ????? ????????? (ttlm)
> Linda Findhorne (mmm)
>
> You know these are lastname F people, so even if you don't read Cyrillic,
> if you have a mental sound for the name, you have a chance of getting to it
> this way (alphabetically, using an alphabet you already know :-D).
>
> Kirby
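
The listing idea above can be sketched in a few lines of Python. This is
only an illustration, not the Django app's actual code: the Person
class, the sort_key and listing helpers, and the Cyrillic sample name
(with its Romanization "Filatov") are all hypothetical, chosen so the
record falls between Fergler and Findhorne. Only the three field roles
(Full_name displayed, Last and First used for collation) come from the
posts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    full_name: str  # displayed; may be in any script
    last: str       # Romanized A-Z key, used only for collation
    first: str      # Romanized A-Z key, used only for collation


def sort_key(person):
    # Collate on the Romanized fields only; full_name never takes part.
    return (person.last.casefold(), person.first.casefold())


def listing(people, letter):
    """Return the full names whose Romanized last name starts with letter."""
    wanted = letter.casefold()
    matches = [p for p in people if p.last.casefold().startswith(wanted)]
    return [p.full_name for p in sorted(matches, key=sort_key)]


# Hypothetical pseudo-records; the Cyrillic entry is an invented sample.
people = [
    Person("Linda Findhorne", "Findhorne", "Linda"),
    Person("Иван Филатов", "Filatov", "Ivan"),
    Person("Joe Fergler", "Fergler", "Joe"),
    Person("U Thant", "Thant", "U"),
]

print(listing(people, "F"))
# ['Joe Fergler', 'Иван Филатов', 'Linda Findhorne']
```

Because the sort never looks at full_name, the displayed spelling can
use any script without affecting where the record lands; the placement
depends entirely on how the name was Romanized when it was entered.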