Frankenstring

Bengt Richter bokr at oz.net
Tue Jul 12 16:59:12 EDT 2005


On Tue, 12 Jul 2005 22:08:55 +0200, Thomas Lotze <thomas at thomas-lotze.de> wrote:

>Hi,
>
>I think I need an iterator over a string of characters pulling them out
>one by one, like a usual iterator over a str does. At the same time the
>thing should allow seeking and telling like a file-like object:
>
>>>> f = frankenstring("0123456789")
>>>> for c in f:
>...     print c
>...     if c == "2":
>...         break
>... 
>0
>1
>2
>>>> f.tell()
>3L
>>>> f.seek(7)
>>>> for c in f:
>...     print c
>... 
>7
>8
>9
>>>>
>
>It's definitely no help that file-like objects are iterable; I do want
>to get a character, not a complete line, at a time.
>
>I can think of more than one clumsy way to implement the desired
>behaviour in Python; I'd rather like to know whether there's an
>implementation somewhere that does it fast. (Yes, it's me and speed
>considerations again; this is for a tokenizer at the core of a library,
>and I'd really like it to be fast.) I don't think there's anything like
>it in the standard library, at least not anything that would be obvious
>to me.
>
>I don't care whether this is more of a string iterator with seeking and
>telling, or a file-like object with a single-character iterator; as long
>as it does both efficiently, I'm happy.
>
>I'd even consider writing such a beast in C, albeit more as a learning
>exercise than as a worthwhile measure to speed up some code.
>
>Thanks for any hints.
>
I'd probably subclass file to buffer in good-sized chunks and override the
iteration to go by characters through the buffer, refilling the buffer
when you get to its end. Then override seek and tell to do the right thing
with respect to the buffer and your current position in it, so that the
character iteration via next stays consistent with them.
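
A minimal sketch of that idea, under some assumptions: it wraps a seekable
file-like object instead of subclassing file, the names CharReader and
chunk_size are made up for illustration, and offsets are assumed to count
characters (true for an in-memory StringIO, not necessarily for an
arbitrary encoded text file). It is not tuned for speed.

```python
import io


class CharReader(object):
    """Character iterator over a seekable file-like object (hypothetical name)."""

    def __init__(self, fileobj, chunk_size=8192):
        self._f = fileobj
        self._chunk_size = chunk_size
        self._buf = ""                 # current chunk
        self._pos = 0                  # index of the next character in _buf
        self._base = fileobj.tell()    # offset of _buf[0] in the underlying object

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._buf):
            # Buffer used up: advance the base offset and read the next chunk.
            self._base += len(self._buf)
            self._buf = self._f.read(self._chunk_size)
            self._pos = 0
            if not self._buf:
                raise StopIteration
        c = self._buf[self._pos]
        self._pos += 1
        return c

    next = __next__  # older spelling of the iterator method

    def tell(self):
        # Position of the next character to be returned by iteration.
        return self._base + self._pos

    def seek(self, offset):
        if self._base <= offset <= self._base + len(self._buf):
            # Target lies inside the current chunk: just move the index.
            self._pos = offset - self._base
        else:
            # Otherwise reposition the underlying object and drop the chunk.
            self._f.seek(offset)
            self._buf = ""
            self._pos = 0
            self._base = offset


f = CharReader(io.StringIO(u"0123456789"), chunk_size=4)
for c in f:
    if c == "2":
        break
position = f.tell()   # 3, as in the session quoted above
f.seek(7)
rest = list(f)        # the characters from offset 7 onward
```

The point of keeping _base and _pos separate is that tell and a seek inside
the current chunk never touch the underlying object; only a seek outside
the chunk or running off its end costs a real read.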

Regards,
Bengt Richter


