Pattern Matching Given # of Characters and no String Input; use Regular Expressions?
Kent Johnson
kent37 at tds.net
Sun Apr 17 07:31:51 EDT 2005
tiissa wrote:
> Synonymous wrote:
>
>> Can regular expressions compare file names to one another? It seems RE
>> can only compare with input I give it, while I want it to compare
>> amongst itself and give me matches if the first x characters are
>> similar.
>
> Do you have to use regular expressions?
>
> If you know the number of characters to match can't you just compare
> slices?
>
> It seems to me you just have to compare each file to the next one (after
> having sorted your list).
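
The slice comparison suggested above might look roughly like the sketch below. The function name group_adjacent_by_prefix is made up for illustration, and it assumes "match" means the first n characters are exactly equal:

```python
def group_adjacent_by_prefix(names, n):
    """Group names whose first n characters are equal by comparing
    each sorted name's slice with the previous name's slice."""
    groups = []
    previous_prefix = None
    for name in sorted(names):  # sorted() leaves the caller's list untouched
        prefix = name[:n]
        if groups and prefix == previous_prefix:
            # Same prefix as the previous name: extend the current group.
            groups[-1].append(name)
        else:
            # Prefix changed (or this is the first name): start a new group.
            groups.append([name])
        previous_prefix = prefix
    return groups


names = ['cccat', 'cccap', 'cccan', 'cccbt', 'ccddd', 'dddfa', 'dddfg', 'dddfz']
print(group_adjacent_by_prefix(names, 3))
```

Sorting first is what makes comparing only neighbours sufficient: names sharing a prefix end up next to each other.
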
itertools.groupby() can do the comparing and grouping:
>>> import itertools
>>> def groupbyPrefix(lst, n):
...     lst.sort()
...     def key(item):
...         return item[:n]
...     return [ list(items) for k, items in itertools.groupby(lst, key=key) ]
...
>>> names = ['cccat', 'cccap', 'cccan', 'cccbt', 'ccddd', 'dddfa', 'dddfg', 'dddfz']
>>> groupbyPrefix(names, 3)
[['cccan', 'cccap', 'cccat', 'cccbt'], ['ccddd'], ['dddfa', 'dddfg', 'dddfz']]
>>> groupbyPrefix(names, 2)
[['cccan', 'cccap', 'cccat', 'cccbt', 'ccddd'], ['dddfa', 'dddfg', 'dddfz']]
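
If sorting the list in place is not wanted (groupbyPrefix above calls lst.sort() on its argument), a dictionary keyed on the prefix is one possible alternative. This is only a sketch: matches_by_prefix is a hypothetical name, and dropping groups of one is an assumption about what "give me matches" means in the original question.

```python
from collections import defaultdict


def matches_by_prefix(names, n):
    """Map each n-character prefix to the names that start with it,
    keeping only prefixes shared by at least two names."""
    by_prefix = defaultdict(list)
    for name in names:
        by_prefix[name[:n]].append(name)
    # Assumption: a name with a unique prefix is not a "match", so drop it.
    return {prefix: group for prefix, group in by_prefix.items() if len(group) > 1}


names = ['cccat', 'cccap', 'cccan', 'cccbt', 'ccddd', 'dddfa', 'dddfg', 'dddfz']
print(matches_by_prefix(names, 3))
```

No sort is needed here because the dictionary collects names with equal prefixes regardless of where they appear in the input.
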
Kent