API for custom Unicode error handlers

Serhiy Storchaka storchaka at gmail.com
Fri Oct 4 15:35:47 EDT 2013


04.10.13 16:56, Steven D'Aprano wrote:
> I have some custom Unicode error handlers, and I'm looking for advice on
> the right API for dealing with them.
>
> I have a module containing custom Unicode error handlers. For example:
>
> # Python 3
> import unicodedata
> def namereplace_errors(exc):
>      c = exc.object[exc.start]
>      try:
>          name = unicodedata.name(c)
>      except (KeyError, ValueError):
>          n = ord(c)
>          if n <= 0xFFFF:
>              replace = "\\u%04x"
>          else:
>              assert n <= 0x10FFFF
>              replace = "\\U%08x"
>          replace = replace % n
>      else:
>          replace = "\\N{%s}" % name
>      return replace, exc.start + 1

I'm planning to build this error handler into 3.4 (see
http://comments.gmane.org/gmane.comp.python.ideas/21296).

Actually, a Python implementation should look like this:

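import unicodedata

def namereplace_errors(exc):
    # Only encoding errors are handled; anything else is re-raised.
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    replace = []
    # Handle the whole unencodable run, not just its first character.
    for c in exc.object[exc.start:exc.end]:
        try:
            replace.append(r'\N{%s}' % unicodedata.name(c))
        except ValueError:
            # unicodedata.name() raises ValueError for unnamed characters.
            n = ord(c)
            if n < 0x100:
                replace.append(r'\x%02x' % n)
            elif n < 0x10000:
                replace.append(r'\u%04x' % n)
            else:
                replace.append(r'\U%08x' % n)
    # Resume encoding after the end of the run.
    return ''.join(replace), exc.end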
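
To show how such a handler would be used, here is a minimal sketch that registers it with codecs.register_error() and encodes a string to ASCII. The handler is repeated so the example runs on its own; the registered name 'my_namereplace' and the sample string are made up for illustration.

```python
import codecs
import unicodedata

def namereplace_errors(exc):
    # A copy of the handler, repeated so this sketch is self-contained.
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    replace = []
    for c in exc.object[exc.start:exc.end]:
        try:
            replace.append(r'\N{%s}' % unicodedata.name(c))
        except ValueError:
            # unicodedata.name() raises ValueError for unnamed characters.
            n = ord(c)
            if n < 0x100:
                replace.append(r'\x%02x' % n)
            elif n < 0x10000:
                replace.append(r'\u%04x' % n)
            else:
                replace.append(r'\U%08x' % n)
    return ''.join(replace), exc.end

# 'my_namereplace' is a hypothetical name chosen for this example.
codecs.register_error('my_namereplace', namereplace_errors)

encoded = 'caf\xe9 \u20ac'.encode('ascii', errors='my_namereplace')
print(encoded)
# prints b'caf\\N{LATIN SMALL LETTER E WITH ACUTE} \\N{EURO SIGN}'
```

Characters without a name, such as U+0080, fall back to the numeric escapes.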

> Now, my question:
>
> Should the module holding the error handlers automatically register them?

This question interests me too.
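
For concreteness, the two options could be sketched as below. This only illustrates the choice and is not a recommendation; question_mark_errors, register() and the name 'questionmark' are hypothetical, used only to keep the example short.

```python
import codecs

def question_mark_errors(exc):
    # Hypothetical stand-in handler: one '?' per unencodable character.
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    return '?' * (exc.end - exc.start), exc.end

def register(name='questionmark'):
    # Option 1: explicit registration. The user calls this and may
    # choose the name, so importing the module has no side effects.
    codecs.register_error(name, question_mark_errors)

# Option 2: automatic registration. The module would simply call
# register() at import time, as this line does.
register()

print('a\u20acb'.encode('ascii', errors='questionmark'))
# prints b'a?b'
```

Explicit registration avoids silently replacing a handler that someone else registered under the same name; automatic registration saves the user one call.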




