Best way to clean up list items?

DFS nospam at dfs.com
Mon May 2 14:06:34 EDT 2016


On 5/2/2016 12:57 PM, Jussi Piitulainen wrote:
> DFS writes:
>
>> Have: list1 = ['\r\n   Item 1  ','  Item 2  ','\r\n  ']
>> Want: list1 = ['Item 1','Item 2']
>>
>>
>> I wrote this, which works fine, but maybe it can be tidier?
>>
>> 1. list2 = [t.replace("\r\n", "") for t in list1]   #remove \r\n
>> 2. list3 = [t.strip(' ') for t in list2]            #trim whitespace
>> 3. list1  = filter(None, list3)                     #remove empty items
>>
>> After each step:
>>
>> 1. list2 = ['   Item 1  ','  Item 2  ','  ']   #remove \r\n
>> 2. list3 = ['Item 1','Item 2','']              #trim whitespace
>> 3. list1 = ['Item 1','Item 2']                 #remove empty items
>
> Try filter(None, (t.strip() for t in list1)). The default.

That works and drops a line of code.  Thanks.
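
For the archive, a minimal sketch of that suggestion applied to the sample list from the question. The names cleaned and cleaned2 are just illustrative, and the list() wrapper is an addition so the same lines also give a list on Python 3, where filter returns an iterator:

```python
# Sample data copied from the question above.
list1 = ['\r\n   Item 1  ', '  Item 2  ', '\r\n  ']

# strip() with no argument removes all leading/trailing whitespace,
# including \r and \n, so the separate replace() step is not needed.
stripped = (t.strip() for t in list1)

# filter(None, ...) drops falsy items (here, the empty strings).
# list() is added so the result is a list on Python 3 as well.
cleaned = list(filter(None, stripped))

# Equivalent form as a single list comprehension.
cleaned2 = [s for s in (t.strip() for t in list1) if s]

print(cleaned)   # ['Item 1', 'Item 2']
```

The comprehension form reads the same on Python 2 and 3, which may be a reason to prefer it over filter.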



> Funny-looking data you have.

I know - sadly, it's actual data:

--------------------------------------------------------------------
from lxml import html
import requests

webpage = ("http://www.usdirectory.com/ypr.aspx?fromform=qsearch&qs=TN&wqhqn=2"
           "&qc=Nashville&rg=30&qhqn=restaurant&sb=zipdisc&ap=2")

page  = requests.get(webpage)
tree  = html.fromstring(page.content)
addr1 = tree.xpath('//span[@class="text3"]/text()')
print 'Addresses: ', addr1
--------------------------------------------------------------------

I couldn't figure out a cleaner way to extract the addresses from the
HTML (maybe an XML/DOM parser would do better?)
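
One option that leaves the XPath alone is to normalize each string after extraction. This is only a sketch: clean_items is a hypothetical helper name, and the sample list is made up to resemble the xpath() result, not real output from the page. It assumes that runs of whitespace inside an item should collapse to a single space, which may not be what you want for every field:

```python
def clean_items(items):
    """Collapse runs of whitespace in each string and drop empty results."""
    # str.split() with no argument splits on any run of whitespace and
    # ignores leading/trailing whitespace, so joining the pieces with a
    # single space yields a normalized string ('' for whitespace-only input).
    normalized = (" ".join(t.split()) for t in items)
    return [s for s in normalized if s]


# Hypothetical sample shaped like the xpath() result; not real page data.
addr1 = ['\r\n   123 Main St  ', '  Suite\r\n   4  ', '\r\n  ']
print(clean_items(addr1))   # ['123 Main St', 'Suite 4']
```

Unlike plain strip(), this also handles a \r\n that lands in the middle of an item.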


