The Cost of Dynamism (was Re: Python 2.x or 3.x, which is faster?)

Jussi Piitulainen jussi.piitulainen at helsinki.fi
Tue Mar 22 08:52:47 EDT 2016


BartC writes:

> Not everything fits into a for-loop you know! Why, take my own
> readtoken() function:
>
>   symbol = anything_other_than_skip_sym
>
>   while symbol != skip_sym:
>      symbol = readnextsymbol()
>
> Of course, a repeat-until or repeat-while would suit this better (but
> I don't know how it fits into Python syntax). So there's a case here
> for increasing the number of loop statements not reducing them.

Not sure why nobody seems to have responded to this part. Perhaps I just
missed it? It's true that while has its uses, or at least I think I've
used it in Python once or twice. But there's more fun to be had by
turning your data into a stream-like object.

stream = iter('   /* this is C! */') # <-- produces a character at a time

Now you can ask for the next item that satisfies a condition using a
generator expression:

next(symbol for symbol in stream if not symbol.isspace())
---> '/'

next(symbol for symbol in stream if not symbol.isspace())
---> '*'

Or collect the remaining items:

list(symbol for symbol in stream if not symbol.isspace())
---> ['t', 'h', 'i', 's', 'i', 's', 'C', '!', '*', '/']

You could also say:

for symbol in stream:
    if symbol.isspace():
        continue
    ...

But this particular stream is empty by now. I work with long streams of
tokenized and annotated sentences (which for me are streams of tokens)
that sometimes come packed in streams of paragraphs packed in streams of
texts. I build whatever stream I happen to want by nesting generator
functions and generator expressions and some related machinery. (You
could build on a character stream or a byte stream that you obtain by
opening a file for reading; I tend to read line by line through
itertools.groupby, because that's what I do.)
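
A minimal sketch of that line-by-line approach, assuming blank lines
separate paragraphs and whitespace separates tokens. The names
paragraphs() and tokens() and the sample text are made up for
illustration, and a list of strings stands in for an open file:

```python
from itertools import groupby

def paragraphs(lines):
    """Yield each paragraph as a stream of its non-blank lines."""
    # groupby splits the line stream into runs of blank and non-blank lines.
    for nonblank, group in groupby(lines, key=lambda line: bool(line.strip())):
        if nonblank:
            yield group

def tokens(paragraph):
    """Yield whitespace-separated tokens from a stream of lines."""
    return (token for line in paragraph for token in line.split())

# Stand-in for a file object, which also yields one line at a time.
text = iter(["the cat sat\n", "on the mat\n", "\n", "\n", "it purred\n"])

# Each paragraph is consumed in full before groupby moves on to the next.
result = [list(tokens(paragraph)) for paragraph in paragraphs(text)]
print(result)
```

Each group that groupby hands out shares the underlying stream, so a
paragraph has to be used up before asking for the next one.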

These things compose well.
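
The quoted readtoken() loop can be read the same way. A sketch, assuming
readnextsymbol() takes no arguments and returns skip_sym to mark the end
of a token; the ';' marker and the sample input here are invented:

```python
# Hypothetical stand-ins for the names in the quoted loop.
skip_sym = ';'
source = iter('ab;cd;')

def readnextsymbol():
    # Return the next symbol, or skip_sym when the input runs out,
    # so that the loop below always terminates.
    return next(source, skip_sym)

# iter(callable, sentinel) calls readnextsymbol() repeatedly and stops
# as soon as it returns skip_sym, which is not itself yielded.
seen = list(iter(readnextsymbol, skip_sym))
print(seen)
```

The two-argument form of iter turns the repeat-until shape into an
ordinary stream, so it can go in a for-loop or a generator expression
like the others.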


