RE: Cleaning up conditionals

MRAB python at mrabarnett.plus.com
Sat Dec 31 19:06:02 EST 2016


On 2016-12-31 22:35, Deborah Swanson wrote:
> Peter Otten wrote:
>> Deborah Swanson wrote:
>>
>> > Here I have a real mess, in my opinion:
>>
>> [corrected code:]
>>
>> >         if len(l1[st]) == 0:
>> >             if len(l2[st]) > 0:
>> >                 l1[st] = l2[st]
>> >         elif len(l2[st]) == 0:
>> >             if len(l1[st]) > 0:
>> >                 l2[st] = l1[st]
>>
>> > Anybody know or see an easier (more pythonic) way to do this? I need
>> > to do it for four fields, and needless to say, that's a really long
>> > block of ugly code.
>>
>> By "four fields", do you mean four values of st, or four pairs of l1, l2,
>> or more elif-s with l3 and l4 -- or something else entirely?
>>
>> Usually the most obvious way to avoid repetition is to write a function,
>> and to make the best suggestion a bit more context is necessary.
>>
>
> I did write a function for this, and welcome any suggestions for
> improvement.
>
> The context is comparing 2 adjacent rows of data (in a list of real
> estate listings sorted by their webpage titles and dates) with the
> assumption that if the webpage titles are the same, they're listings for
> the same property. This assumption is occasionally bad, but in far fewer
> than one per 1000 unique listings. I'd rather just hand edit the data in
> those cases so one webpage title is slightly different than write and
> execute all the code needed to find and handle these corner cases.
> Maybe that will be a future refinement, but right now I don't really
> need it.
>
> Once two rows of listing data have been identified as different dates
> for the same property, there are 4 fields that will be identical for
> both rows. There can be up to 10 (or even more) listings identical
> except for the date, but typically I'm just adding a new one and want to
> copy the field data from its previous siblings, so the copying is just
> from the last listing to the new one.
>
> Here's the function I have so far:
>
> def comprows(l1,l2,st,ki,no):
>     ret = ''
>     labels = {st: 'st/co', ki: 'kind', no: 'notes'}
>     for v in (st,ki,no):
>         if len(l1[v]) == 0 and len(l2[v]) != 0:
>             l1[v] = l2[v]
>         elif len(l2[v]) == 0 and len(l1[v]) != 0:
>             l2[v] = l1[v]
>         elif l1[v] != l2[v]:
>             ret += (", " + labels[v] + " diff" if len(ret) > 0
>                     else labels[v] + " diff")
>     return ret
>
> The 4th field is a special case and easily dispatched in one line of
> code before this function is called for the other 3.
>
> l1 and l2 are the 2 adjacent rows of listing data, with st,ki,no holding
> codes for state/county, kind (of property) and notes. I want the
> checking and copying to go both ways because sometimes I'm backfilling
> old listings that I didn't pick up in my nightly copies on their given
> dates, but came across them later.
>
> ret is returned to a field with details to look at when I save the list
> to csv and open it in Excel. The noted diffs will need to be reconciled.
>
> I tried to use Jussi Piitulainen's suggestion to chain the conditionals,
> but just couldn't make it work for choosing list elements to assign to,
> although the approach is perfect if you're computing a value.
>
> Hope this is enough context... ;)
> D
>
Here's a slightly different way of doing it:

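def comprows(l1, l2, st, ki, no):
    ret = ''
    labels = {st: 'st/co', ki: 'kind', no: 'notes'}
    for v in (st, ki, no):
        # Distinct non-empty values of this field across the two rows.
        t = list({l1[v], l2[v]} - {''})
        if len(t) == 1:
            # Same value, or one side empty: make both rows agree.
            l1[v] = l2[v] = t[0]
        elif len(t) == 2:
            # Two different non-empty values: report the conflict.
            ret += ", " + labels[v] + " diff"
    # Drop the leading ", " (an empty string stays empty).
    return ret[2:]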


And here's a summary of what it does:

If l1[v] == l2[v], then {l1[v], l2[v]} will contain 1 string, otherwise
it'll contain 2 strings. Subtracting {''} then removes the empty string,
if there is one.

If the set now contains 1 string, then either they were the same, or one 
of them was empty; in either case, just make them the same.

On the other hand, if the set contains 2 strings, then report that they
were different. (If both were empty, the set ends up empty and nothing
is done.)
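
To see the cases side by side, here's a quick sketch run on two made-up rows. The row layout [title, st/co, kind, notes], the column indices 1, 2, 3 and the values are all invented for illustration, not taken from the real listing data:

```python
def comprows(l1, l2, st, ki, no):
    ret = ''
    labels = {st: 'st/co', ki: 'kind', no: 'notes'}
    for v in (st, ki, no):
        # Distinct non-empty values of this field across the two rows.
        t = list({l1[v], l2[v]} - {''})
        if len(t) == 1:
            l1[v] = l2[v] = t[0]
        elif len(t) == 2:
            ret += ", " + labels[v] + " diff"
    return ret[2:]

# Hypothetical rows: [title, st/co, kind, notes]
old = ['123 Main St', 'WA/King', '', 'corner lot']
new = ['123 Main St', '', 'house', 'needs roof']

# st/co is filled in from old, kind is filled in from new,
# and notes is reported because both rows have different values.
result = comprows(old, new, 1, 2, 3)
print(result)
print(old)
print(new)
```

The copying goes in whichever direction is needed, so a new listing gets filled in and an old one gets backfilled, and only the real conflicts end up in the returned string.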



