pylint woes

DFS nospam at dfs.com
Sun May 8 17:24:09 EDT 2016


On 5/8/2016 7:36 AM, Steven D'Aprano wrote:
> On Sun, 8 May 2016 11:16 am, DFS wrote:
>
>> address data is scraped from a website:
>>
>> names = tree.xpath()
>> addr  = tree.xpath()
>
> Why are you scraping the data twice?


Because it exists in 2 different sections of the document.

names     = tree.xpath('//span[@class="header_text3"]/text()')
addresses = tree.xpath('//span[@class="text3"]/text()')


I thought you were a "master who knew her tools", and I was the 
apprentice?

So why did "the master" think xpath() was magic?






> names = addr = tree.xpath()
>
> or if you prefer the old-fashioned:
>
> names = tree.xpath()
> addr = names
>
> but that raises the question, how can you describe the same set of data as
> both "names" and "addr[esses]" and have them both be accurate?
>
>
>> I want to store the data atomically,
>
> I'm not really sure what you mean by "atomically" here. I know what *I* mean
> by "atomically", which is to describe an operation which either succeeds
> entirely or fails.

That's atomicity.



 > But I don't know what you mean by it.

http://www.databasedesign-resource.com/atomic-database-values.html



>> so I parse street, city, state, and
>> zip into their own lists.
>
> None of which is atomic.

All of which are atomic.
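
To make "atomic" in the database sense concrete, here is a minimal sketch using sqlite3. The table and column names are made up for illustration. Each column holds one indivisible value, so a query can filter on state without picking a string apart:

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE addresses (street TEXT, city TEXT, state TEXT, zipcode TEXT)"
)

# One value per column, instead of one combined address string.
conn.execute(
    "INSERT INTO addresses VALUES (?, ?, ?, ?)",
    ("1250 Peachtree Rd", "Atlanta", "GA", "30303"),
)

# Filtering by state needs no string slicing.
rows = conn.execute(
    "SELECT street FROM addresses WHERE state = ?", ("GA",)
).fetchall()
print(rows)
```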



>> "1250 Peachtree Rd, Atlanta, GA 30303"
>>
>> street = [s.split(',')[0] for s in addr]
>> city   = [c.split(',')[1].strip() for c in addr]
>> state  = [s[-8:][:2] for s in addr]
>> zipcd  = [z[-5:] for z in addr]
>
> At this point, instead of iterating over the same list four times, doing the
> same thing over and over again, you should do things the old-fashioned way:
>
> streets, cities, states, zipcodes = [], [], [], []
> for word in addr:
>     items = word.split(',')
>     streets.append(items[0])
>     cities.append(items[1].strip())
>     states.append(word[-8:-6])
>     zipcodes.append(word[-5:])



That's a good one.

Chris Angelico mentioned something like that, too, and I already put it 
in place.
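
For reference, a rough sketch of the single-pass version. It is not necessarily the exact code I ended up with. The function name split_addresses is made up for illustration, and it assumes every address looks like "street, city, ST 12345" with exactly two commas. It splits the last field on whitespace instead of slicing at fixed offsets:

```python
def split_addresses(addresses):
    """Split 'street, city, ST 12345' strings into four parallel lists."""
    streets, cities, states, zipcodes = [], [], [], []
    for address in addresses:
        # Assumption: exactly two commas per address; anything else
        # raises ValueError on unpacking instead of storing bad data.
        street, city, state_zip = [part.strip() for part in address.split(',')]
        # Assumption: the last field is "<state> <zip>" separated by whitespace.
        state, zipcode = state_zip.split()
        streets.append(street)
        cities.append(city)
        states.append(state)
        zipcodes.append(zipcode)
    return streets, cities, states, zipcodes


streets, cities, states, zipcodes = split_addresses(
    ["1250 Peachtree Rd, Atlanta, GA 30303"]
)
print(streets, cities, states, zipcodes)
```

A street that contains a comma (a suite number, say) would break the two-comma assumption and need its own handling.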



> Oh, and use better names. "street" is a single street, not a list of
> streets, note plural.


I'll use whatever names I like.







