Eliminate "extra" variable
Peter Otten
__peter__ at web.de
Sun Dec 8 09:04:13 EST 2013
Tim Chase wrote:
> On 2013-12-06 11:37, Igor Korot wrote:
>> def MyFunc(self, originalData):
>> data = {}
>> for i in xrange(0, len(originalData)):
>> dateStr, freq, source = originalData[i]
>> data[str(dateStr)] = {source: freq}
>
> this can be more cleanly/pythonically written as
>
> def my_func(self, original_data):
> for date, freq, source in original_data
> data[str(date)] = {source: freq}
>
> or even just
>
> data = dict(
> (str(date), {source: freq})
> for date, freq, source in original_data
> )
or even just
data = {str(date): {source: freq}
for date, freq, source in original_data}
But do you really need a dict with a single key? And is it even correct? If
a date occurs twice only the last source:freq pair is kept. Without knowing
the context the humble
data = {}
for date, freq, source in original_data:
source_to_freq = data.setdefault(date, {})
if source in source_to_freq:
raise ValueError(
"Multiple frequencies for one source not supported")
source_to_freq[source] = freq
appears so much more plausible...
> You're calling it a "dateStr", which suggests that it's already a
> string, so I'm not sure why you're str()'ing it. So I'd either just
> call it "date", or skip the str(date) bit if it's already a string.
> That said, do you even need to convert it to a string (as
> datetime.date objects can be used as keys in dictionaries)?
>
>> for i in xrange(0, len(dateStrs) - 1):
>> currDateStr = str(dateStrs[i])
>> nextDateStrs = str(dateStrs[i + 1])
>>
>> It seems very strange that I need the dateStrs list just for the
>> purpose of looping thru the dictionary keys.
>> Can I get rid of the "dateStrs" variable?
>
> Your code isn't actually using the data-dict at this point. If you
> were doing something with it, it might help to know what you want to
> do.
>
> Well, you can iterate over the original data, zipping them together:
>
> for (cur, _, _), (next, _, _) in zip(
> original_data[:-1],
> original_data[1:]
> ):
> do_something(cur, next)
This reminds me that I am a proponent of small dumb helper functions ;)
I find
def sliding_window(items):
a, b = itertools.tee(items)
next(b, None)
return zip(a, b)
dates = (date for date, _freq, _source in original_data)
for from_date, to_date in sliding_window(dates):
do_something(from_date, to_date)
much more accessible. Plus, I can apply arbitrary improvements to the
sliding_window() implementation or switch to a library version of that
function without fear of messing things up.
Likewise, should original_data become a sequence of namedtuples it is
straightforward to propagate this change with
dates = (item.date for item in original_data)
> If your purpose for the "data" dict is to merely look up stats from
> the next one, the whole batch of your original code can be replaced
> with:
>
> for (
> (cur_dt, cur_freq, cur_source),
> (next_dt, next_freq, next_source)
> ) in zip(original_data[:-1], original_data[1:]):
> # might need to do str(cur_dt) and str(next_dt) instead?
> do_things_with(cur_dt, cur_freq, cur_source,
> next_dt, next_freq, next_source)
>
> That eliminates the dict *and* the extra variable name. :-)
Smileys are overused ;) Anyway, with namedtuples this ... would become
for cur_item, next_item in zip(original_data, original_data[1:]):
do_things_with(cur_item, next_item)
Note that there's no need to slice the first argument as zip() ignores extra
items.
More information about the Python-list
mailing list