Unicode normalisation [was Re: [beginner] What's wrong?]

Ben Bacarisse ben.usenet at bsb.me.uk
Sat Apr 9 14:27:01 EDT 2016


Rustom Mody <rustompmody at gmail.com> writes:

> On Saturday, April 9, 2016 at 7:14:05 PM UTC+5:30, Ben Bacarisse wrote:
>> The problem with that theory is that 'er/re' (this is e and r in either
> >> order) is the 3rd most common pair in English but has been placed
>> together.  ou and et (in either order) are the 15th and 22nd most common
>> and they are separated by only one hammer position.  On the other hand,
>> the QWERTY layout puts jk together, but they almost never appear
>> together in English text.
>
> Where do you get this (kind of) statistical data?

It was generated by counting the pairs found in a corpus of texts taken
from Project Gutenberg.  The numbers do vary depending on what you pick
(for the complete works of Mark Twain er/re is second, for example), and
none of the texts are very modern (because of the source) but I
doubt that matters too much.
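
A minimal sketch of that kind of count, in case it helps.  This is not
the script that produced the figures above: the function name
pair_counts and the sample string are made up for illustration, and it
assumes that only the ASCII letters a-z matter and that a pair never
spans a space or punctuation mark.

```python
from collections import Counter
import string

LETTERS = set(string.ascii_lowercase)

def pair_counts(text):
    """Count adjacent letter pairs, ignoring order ('er' and 're' are one pair)."""
    counts = Counter()
    text = text.lower()
    for a, b in zip(text, text[1:]):
        # Skip any pair that includes a non-letter, so pairs never
        # cross word boundaries or punctuation.
        if a in LETTERS and b in LETTERS:
            # Sort the two letters so both orders share one key.
            counts["".join(sorted(a + b))] += 1
    return counts

# Hypothetical sample; a real run would read the corpus files instead.
sample = "there were three other readers here"
for pair, n in pair_counts(sample).most_common(5):
    print(pair, n)
```

Running the same function over each text separately is an easy way to
see how much the ranking moves from one author to another.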

-- 
Ben.


