Splitting strings - by iterators?

Francis Girard francis.girard at free.fr
Fri Feb 25 15:15:45 EST 2005


Hi,

Using finditer in re module might help. I'm not sure it is lazy nor 
performant. Here's an example :

=== BEGIN SNAP
import re

reLn = re.compile(r"""[^\n]*(\n|$)""")

sStr = \
"""
This is a test string.
It is supposed to be big.
Oh well.
"""

for oMatch in reLn.finditer(sStr):
  print oMatch.group()
=== END SNAP

Regards,

Francis Girard

Le vendredi 25 Février 2005 16:55, Jeremy Sanders a écrit :
> I have a large string containing lines of text separated by '\n'. I'm
> currently using text.splitlines(True) to break the text into lines, and
> I'm iterating over the resulting list.
>
> This is very slow (when using 400000 lines!). Other than dumping the
> string to a file, and reading it back using the file iterator, is there a
> way to quickly iterate over the lines?
>
> I tried using newpos=text.find('\n', pos), and returning the chopped text
> text[pos:newpos+1], but this is much slower than splitlines.
>
> Any ideas?
>
> Thanks
>
> Jeremy




More information about the Python-list mailing list