Pattern Matching Given # of Characters and no String Input; use Regular Expressions?

Kent Johnson kent37 at tds.net
Sun Apr 17 07:31:51 EDT 2005


tiissa wrote:
> Synonymous wrote:
> 
>> Can regular expressions compare file names to one another? It seems RE
>> can only compare with input I give it, while I want it to compare
>> amongst itself and give me matches if the first x characters are
>> similar.
> 
> Do you have to use regular expressions?
> 
> If you know the number of characters to match can't you just compare 
> slices?
> 
> It seems to me you just have to compare each file to the next one (after 
> having sorted your list).
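
A rough sketch of that slice comparison, assuming the names are plain strings and n is the number of leading characters to match. The function name group_adjacent_by_prefix and the sample file names are made up for illustration:

```python
def group_adjacent_by_prefix(names, n):
    """Group names whose first n characters match, comparing each name to the previous one."""
    groups = []
    for name in sorted(names):  # sorted copy; the caller's list is left untouched
        # Extend the current group if the prefix matches the last name seen,
        # otherwise start a new group.
        if groups and groups[-1][-1][:n] == name[:n]:
            groups[-1].append(name)
        else:
            groups.append([name])
    return groups

# Hypothetical file names, for illustration only.
files = ['img002.png', 'doc_b.txt', 'notes.txt', 'img001.png', 'doc_a.txt']
print(group_adjacent_by_prefix(files, 3))
# [['doc_a.txt', 'doc_b.txt'], ['img001.png', 'img002.png'], ['notes.txt']]
```

Sorting first is what makes comparing only neighbours sufficient: names with the same prefix end up next to each other.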

itertools.groupby() can do the comparing and grouping:

  >>> import itertools
  >>> def groupbyPrefix(lst, n):
  ...   lst.sort()  # groupby() only groups adjacent items, so sort first (in place)
  ...   def key(item):
  ...     return item[:n]  # the first n characters
  ...   return [ list(items) for k, items in itertools.groupby(lst, key=key) ]
  ...
  >>> names = ['cccat', 'cccap', 'cccan', 'cccbt', 'ccddd', 'dddfa', 'dddfg', 'dddfz']
  >>> groupbyPrefix(names, 3)
  [['cccan', 'cccap', 'cccat', 'cccbt'], ['ccddd'], ['dddfa', 'dddfg', 'dddfz']]
  >>> groupbyPrefix(names, 2)
  [['cccan', 'cccap', 'cccat', 'cccbt', 'ccddd'], ['dddfa', 'dddfg', 'dddfz']]
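
If the prefix itself is wanted, or the list should not be reordered, a dictionary keyed by the prefix is one possible alternative. This is only a sketch: group_by_prefix_dict is a hypothetical helper name, and it uses collections.defaultdict instead of groupby(), so no sorting is needed.

```python
from collections import defaultdict

def group_by_prefix_dict(names, n):
    """Map each n-character prefix to the names that start with it, in input order."""
    groups = defaultdict(list)
    for name in names:
        groups[name[:n]].append(name)
    return dict(groups)

names = ['cccat', 'cccap', 'cccan', 'cccbt', 'ccddd', 'dddfa', 'dddfg', 'dddfz']
by_prefix = group_by_prefix_dict(names, 3)

# Keep only the prefixes shared by more than one name, i.e. the actual matches.
shared = {prefix: group for prefix, group in by_prefix.items() if len(group) > 1}
print(shared)
```

Filtering on the group size drops names such as 'ccddd' that match nothing else at that prefix length.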

Kent


