Processing a large string

Steven D'Aprano steve+comp.lang.python at pearwood.info
Thu Aug 11 22:30:57 EDT 2011


goldtech wrote:

> Hi,
> 
> Say I have a very big string with a pattern like:
> 
> akakksssk3dhdhdhdbddb3dkdkdkddk3dmdmdmd3dkdkdkdk3asnsn.....


Define "big".

What seems big to you is probably not big to your computer.
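
One rough way to put a number on "big" is to measure it. The sketch below is only an illustration: the sample string is made up, and sys.getsizeof reports per-object sizes for the interpreter you run it on, so treat the figures as estimates.

```python
import sys

# Hypothetical sample: ten million characters with a "3" every tenth character.
text = "abcdefghi3" * 10**6
parts = text.split("3")

# Size of the original string versus the list plus all the piece objects.
string_bytes = sys.getsizeof(text)
list_bytes = sys.getsizeof(parts) + sum(sys.getsizeof(p) for p in parts)

print("string: %.1f MB" % (string_bytes / 1e6))
print("split result: %.1f MB" % (list_bytes / 1e6))
```

If both numbers are small compared with the memory you have, a plain split is probably fine.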


> I want to split the string into separate parts on the "3" and process
> each part separately. I might run into memory limitations if I use
> "split" and get a big array(?)  I wondered if there's a way I could
> read (stream?) the string from start to finish and read what's
> delimited by the "3" into a variable, process the smaller string
> variable then append/build a new string with the processed data?
> 
> Would I loop it and read it char by char till a "3"...? Or?

You could, but a character-by-character loop in Python will probably be slow
unless the 3s are close together. If the 3s are far apart, it will be better
to let str.find locate each one:

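# untested
def split(source):
    # Yield the pieces of source between "3"s, one at a time.
    start = 0
    i = source.find("3")
    while i >= 0:
        yield source[start:i]
        start = i+1
        i = source.find("3", start)
    # Yield whatever follows the last "3" (the whole string if there is none).
    yield source[start:]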


That should give you the pieces of the string one at a time, without ever
building the full list of pieces in memory.
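
To process each piece and build a new string, something along these lines might do. It is a self-contained sketch: iter_pieces and process are names made up for illustration, and process is only a stand-in for whatever you actually do to each piece.

```python
def iter_pieces(source, sep="3"):
    """Lazily yield the substrings of source between occurrences of sep."""
    start = 0
    i = source.find(sep)
    while i >= 0:
        yield source[start:i]
        start = i + len(sep)
        i = source.find(sep, start)
    # Trailing piece after the last separator.
    yield source[start:]


def process(piece):
    # Hypothetical stand-in for the real per-piece work.
    return piece.upper()


data = "akakksssk3dhdhdhdbddb3dkdkdkddk3dmdmdmd"
result = "".join(process(p) for p in iter_pieces(data))
```

Note that join still has to hold all the processed pieces before it builds the result. If the output is too large for that, writing each processed piece to a file as you go avoids it.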




-- 
Steven



