regexp search on infinite string?

John Machin sjmachin at lexicon.net
Sat Sep 15 07:22:49 EDT 2007


On Sep 15, 4:36 pm, Paddy <paddy3... at googlemail.com> wrote:
> On Sep 15, 2:57 am, James Stroud <jstr... at mbi.ucla.edu> wrote:
>
>
>
> > Paddy wrote:
> > > Let's say I have a generator running that generates successive
> > > characters of a 'string'.
> > > From what I know, if I want to do a regexp search for a pattern of
> > > characters then I would have to 'freeze' the generator and pass the
> > > characters so far to re.search.
> > > It is expensive to create successive characters, but caching could be
> > > used for past characters. Is it possible to wrap the generator in a
> > > class, possibly inheriting from string, that would allow the regexp
> > > searching of the string but without terminating the generator? In
> > > other words duck typing for the usual string object needed by
> > > re.search?
>
> > > - Paddy.
>
> > re.search & re.compile check for str or unicode types explicitly, so
> > you need to turn your data into one of those before using the module.

Aaaarbejaysus. Since when?

>>> import array
>>> ba = array.array('c', 'A hollow voice says "Plugh".')
>>> import re
>>> mobj = re.search('voi', ba)
>>> mobj
<_sre.SRE_Match object at 0x00B99598>
>>> mobj.span()
(9, 12)
>>>

>
> > buffer = []
> > while True:
> >     buffer.append(mygenerator.next())
> >     m = re.search(pattern, "".join(buffer))
> >     if m:
> >         process(m)
> >         buffer = []
>
> > James
>
> Thanks James.

So:
    buffer = array.array('c')
and drop the "".join(): pass buffer straight to re.search, and reset it
with a fresh array.array('c') instead of [].
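
For anyone reading this on Python 3, here is a minimal sketch of the same idea. It rests on assumptions that are not in the thread: the 'c' typecode no longer exists there, so a bytearray stands in for array.array('c'); the pattern has to be a bytes pattern; the generator is assumed to yield one byte value (an int) at a time; and the names scan and source are made up for illustration.

```python
import re

def scan(source, pattern):
    """Yield (matched_bytes, span) for each match found while consuming source.

    source  -- hypothetical iterator yielding one byte value (int) at a time
    pattern -- compiled bytes pattern, e.g. re.compile(b'voi')
    """
    buf = bytearray()              # growable buffer; no "".join() needed
    for byte in source:
        buf.append(byte)
        m = pattern.search(buf)    # re searches the bytearray directly
        if m:
            # Copy the match out before the buffer is replaced.
            yield bytes(m.group(0)), m.span()
            buf = bytearray()      # start over after a hit, as in James's loop

text = iter(b'A hollow voice says "Plugh". Another voice.')
hits = list(scan(text, re.compile(b'voi')))
```

Like the loop it mirrors, this rescans the whole buffer after every new character, so it only saves the join, not the repeated searching.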

HTH,
John





