[Python-Dev] PEP 277 (unicode filenames): please review

Aahz aahz@pythoncraft.com
Wed, 21 Aug 2002 13:31:57 -0400


[doing an archeological dig through e-mail]

On Tue, Aug 13, 2002, M.-A. Lemburg wrote:
>
> At least is good :-) NFC is NFD + canonical composition. Decomposition
> isn't all that hard (using unicodedata.decomposition()). For
> composition the situation is different: not all information is
> available in the unicodedata database (the exclusion list) and
> the database also doesn't provide the reverse mapping from
> decomposed code points to composed ones. See the Annexes to the
> tech report to get an impression of just how hard combining is...

In a message just prior to this one, you wrote:

    The recommended way of doing normalization is to go by
    Normalization Form C: Canonical Decomposition,
    followed by Canonical Composition.

So, um, which way is it?
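
For reference, the decomposition half described in the quoted text can be
sketched with unicodedata.decomposition(). This is only a rough sketch: the
helper name canonical_decompose is made up for illustration, it ignores
canonical reordering of combining marks and Hangul syllables, and it leans on
unicodedata.normalize() (assumed available, as in current Python) purely as a
cross-check for the composition half.

```python
import unicodedata


def canonical_decompose(text):
    """Recursively expand canonical decomposition mappings.

    Hypothetical helper: no canonical reordering, no Hangul handling.
    """
    out = []
    for ch in text:
        mapping = unicodedata.decomposition(ch)
        # Compatibility mappings carry a tag such as "<compat>"; leave those alone.
        if mapping and not mapping.startswith("<"):
            expanded = "".join(chr(int(cp, 16)) for cp in mapping.split())
            out.append(canonical_decompose(expanded))
        else:
            out.append(ch)
    return "".join(out)


composed = "\u00e9"                      # LATIN SMALL LETTER E WITH ACUTE
decomposed = canonical_decompose(composed)

# For this simple input the hand-rolled result agrees with NFD.
agrees_with_nfd = decomposed == unicodedata.normalize("NFD", composed)

# Composing the decomposed form again gives back the NFC form.
round_trip = unicodedata.normalize("NFC", decomposed) == composed
```

On this reading, "NFD + canonical composition" and "Canonical Decomposition,
followed by Canonical Composition" would describe the same two steps.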
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/