find sublist inside list

bearophileHUGS at lycos.com
Mon May 4 08:46:32 EDT 2009


Matthias Gallé:
> My problem is to replace all occurrences of a sublist with a new element.
> Example:
> Given ['a','c','a','c','c','g','a','c'] I want to replace all
> occurrences of ['a','c'] by 6 (result [6,6,'c','g',6]).

There are several ways to solve this problem. Representing a string as
a list of "chars" (1-strings) is fine for small strings, but for long
ones it costs too much memory and time.

This is a starting point:

>>> mapper = {"a":6, "c":6}
>>> data = 'acaccgac'
>>> [mapper.get(c, c) for c in data]
[6, 6, 6, 6, 6, 'g', 6, 6]
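
The mapping above substitutes each element on its own, so it does not
yet treat ['a','c'] as a unit. A minimal sketch of a left-to-right scan
that does, assuming matches must not overlap and the leftmost match
wins (replace_sublist is a hypothetical helper name, not something from
the standard library):

```python
def replace_sublist(seq, sub, new):
    """Return a new list where each non-overlapping occurrence of sub is replaced by new."""
    sub = list(sub)
    if not sub:
        raise ValueError("sub must not be empty")
    n = len(sub)
    result = []
    i = 0
    while i < len(seq):
        # Compare the window starting at i against the sublist.
        if list(seq[i:i + n]) == sub:
            result.append(new)
            i += n  # skip past the whole match
        else:
            result.append(seq[i])
            i += 1
    return result

data = ['a', 'c', 'a', 'c', 'c', 'g', 'a', 'c']
print(replace_sublist(data, ['a', 'c'], 6))  # [6, 6, 'c', 'g', 6]
```

Each window comparison costs up to len(sub) steps, which is fine for
short sublists but may be slow for long ones.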

If you need higher performance and have to embed a few small numbers in
a string over a well-defined alphabet, like genomic data, you can store
them as the characters with those ASCII codes, for example 6 and 11:

>>> from string import maketrans  # Python 2; Python 3 has str.maketrans
>>> tab = maketrans("acgt", "".join([chr(6), chr(6), "gt"]))
>>> data.translate(tab)
'\x06\x06\x06\x06\x06g\x06\x06'

Later in processing it's easy to tell apart normal genome bases from
those small numbers.
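
As a rough illustration of that last step, here is a sketch written for
Python 3. It swaps in str.replace (not used above) to substitute the
whole "ac" pair with chr(6), and it assumes that every character with a
code below 32 is one of the embedded numbers. The names MARKER_LIMIT
and decode are hypothetical:

```python
MARKER_LIMIT = 32  # assumption: codes below this are embedded numbers, not bases

def decode(encoded):
    """Turn control characters back into ints and leave bases as 1-strings."""
    return [ord(ch) if ord(ch) < MARKER_LIMIT else ch for ch in encoded]

data = 'acaccgac'
# str.replace substitutes non-overlapping occurrences from left to right.
encoded = data.replace('ac', chr(6))
print(repr(encoded))    # '\x06\x06cg\x06'
print(decode(encoded))  # [6, 6, 'c', 'g', 6]
```

This only works while the numbers stay outside the range of codes used
by the alphabet itself.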

Note that the standard library also offers the array.array type, and
NumPy is available as a third-party package.
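
For instance, a small sketch (Python 3) of holding the bases in an
array of unsigned bytes, assuming every base and every replacement
value fits in the range 0-255:

```python
from array import array

# Type code "B" means unsigned char: one byte per element.
bases = array("B", b"acaccgac")
print(bases.itemsize)  # bytes used per element
print(bases.tolist())  # the ASCII codes of the bases
```

A list of 1-strings stores a reference per element instead, which is
where the extra memory goes on long sequences.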

There are several other possible solutions, but I'll stop here so you
can explain your purposes and constraints better.

Bye,
bearophile


