String comparison question

Michael Spencer mahs at telcopartners.com
Mon Mar 20 02:20:22 EST 2006


Alex Martelli wrote:
> Michael Spencer <mahs at telcopartners.com> wrote:

>>
>> Here, str.translate deletes the characters in its optional second argument.
>> Note that this does not work with unicode strings.
> 
> With unicode, you could do something strictly equivalent, as follows:
> 
> nowhite = dict.fromkeys(ord(c) for c in string.whitespace)
> 
> and then
> 
>   return a.translate(nowhite) == b.translate(nowhite)
> 
> 
> Alex

Interesting!  But it is annoying to have to call unicode.translate differently
from str.translate:

import string

# Identity table for str.translate; deletion is done via its second argument.
NULL = string.maketrans("", "")
WHITE = string.whitespace
# unicode.translate takes a mapping instead: ordinal -> None means "delete".
NO_WHITE_MAP = dict.fromkeys(ord(c) for c in WHITE)

def compare(a, b):
    """Compare two basestrings, disregarding whitespace -> bool"""
    if isinstance(a, unicode):
        astrip = a.translate(NO_WHITE_MAP)
    else:
        astrip = a.translate(NULL, WHITE)
    if isinstance(b, unicode):
        bstrip = b.translate(NO_WHITE_MAP)
    else:
        bstrip = b.translate(NULL, WHITE)
    return astrip == bstrip

In fact, now that you've pointed it out, I like the unicode.translate interface 
much better than str.translate(translation_table, deletechars = None).  But it 
would also be nice if these interfaces were compatible.
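
To make the difference concrete, here is a small sketch of the mapping-style interface that unicode.translate already accepts. The sample string and the extra entries for "a" and "b" are made up for illustration; the u"" literals are only there so that the snippet uses unicode strings throughout.

```python
import string

# Keys are ordinals; a value of None means "delete this character".
table = dict.fromkeys(ord(c) for c in string.whitespace)

# Values may also be ordinals or replacement strings (illustrative entries).
table[ord(u"a")] = ord(u"A")
table[ord(u"b")] = u"<b>"

result = u"a b\tc\n".translate(table)
# Whitespace is dropped, "a" becomes "A", "b" becomes "<b>", "c" is untouched.
```

A single dict covers deletion, one-to-one mapping and replacement by a longer string, which is what makes this interface more pleasant than a 256-character table plus a separate deletechars argument.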

Perhaps str.translate could be extended to optionally accept a single mapping
as its argument, i.e., to behave like this:

def translate(self, table, deletechars=""):
    """S.translate(table [,deletechars]) -> string

    Return a copy of the string S, where all characters occurring
    in the optional argument deletechars are removed, and the
    remaining characters have been mapped through the given
    translation table, which must be either a string of length 256
    or a map of str ordinals to str ordinals, strings or None.
    Unmapped characters are left untouched. Characters mapped to None
    are deleted."""
    if hasattr(table, "keys"):
        # Mapping form: build the 256-character table and deletechars from it.
        if deletechars:
            raise ValueError("Can't specify deletechars with a mapping table")
        table_map = table
        table = ""
        deletechars = ""
        for key in range(256):
            if key in table_map:
                val = table_map[key]
                if val is None:
                    # Mapped to None: delete the character.
                    deletechars += chr(key)
                    val = chr(key)
                if not isinstance(val, str):
                    # Mapped to an ordinal: convert it to its character.
                    val = chr(val)
            else:
                # Unmapped characters are left untouched.
                val = chr(key)
            # In this sketch a str value must be exactly one character long,
            # since the table has to stay 256 characters.
            table += val

    # deletechars defaults to "" because str.translate rejects None here.
    return str.translate(self, table, deletechars)
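
As a rough, standalone sketch of the same conversion, the helper below turns an ordinal mapping into the (table, deletechars) pair that the byte-string translate expects. The name mapping_to_table is hypothetical, the code assumes a Python that provides bytearray, and it only handles values that are None or a single ordinal, since a 256-entry table cannot hold multi-character replacements.

```python
import string

def mapping_to_table(table_map):
    """Hypothetical helper: ordinal mapping -> (256-byte table, delete bytes)."""
    table = bytearray(range(256))   # identity table: every byte maps to itself
    delete = bytearray()
    for key, val in table_map.items():
        if not 0 <= key < 256:
            continue                # ordinals outside the byte range are ignored
        if val is None:
            delete.append(key)      # None means "delete this byte"
        else:
            table[key] = val        # ordinal -> ordinal
    return bytes(table), bytes(delete)

# Usage: strip whitespace from a byte string via a mapping.
no_white = dict.fromkeys(ord(c) for c in string.whitespace)
table, delete = mapping_to_table(no_white)
stripped = b"a b\tc\n".translate(table, delete)
```

Splitting the conversion out this way keeps the existing two-argument call untouched, so the mapping form would be purely an optional convenience.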

Michael



