Formatting Results so that They Can be Nicely Imported into a Spreadsheet.

Jim Langston tazmaster at rocketmail.com
Sat Aug 4 22:21:17 EDT 2007


<mensanator at aol.com> wrote in message 
news:1186278638.931477.39760 at z24g2000prh.googlegroups.com...
> On Aug 4, 6:35 pm, SMERSH009 <SMERSH0... at gmail.com> wrote:
>> Hi All.
>> Let's say I have some badly formatted text called doc:
>>
>> doc=
>> """
>> friendid
>> Female
>>
>>                             23 years old
>>
>>                             Los Gatos
>>
>>                             United States
>> friendid
>> Male
>>
>>                             24 years old
>>
>>                             San Francisco, California
>>
>>                             United States
>> """
>>
>> How would I get these results to be displayed in a format similar to:
>> friendid;Female;23 years old;Los Gatos;United States
>> friendid;Male; 24 years old;San Francisco, California;United States
>>
>> The latter is a lot easier to organize and can be quickly imported
>> into Excel's column format.
>>
>> Thanks Much,
>> Sam
>
> d = doc.split('\n')
>
> f = [i.split() for i in d if i]
>
> g = [' '.join(i) for i in f]
>
> rec = []
> temprec = []
> for i in g:
>    if i:
>        if i == 'friendid':
>            rec.append(temprec)
>            temprec = [i]
>        else:
>            temprec.append(i)
> rec.append(temprec)
>
> output = [';'.join(i) for i in rec if i]
>
> for i in output: print i
>
> ##    friendid;Female;23 years old;Los Gatos;United States
> ##    friendid;Male;24 years old;San Francisco, California;United States

Also, I would suggest you use CSV format.  CSV stands for "Comma-Separated 
Values" and Excel can load such a file directly.

Instead of separating with ; separate with ,  Of course, this causes a 
problem when there is a , inside a string.  The fix is to quote the string. 
That being the case, you can just go ahead and quote all strings.  So you 
would want the output to be:

"friendid","Female","23 years old","Los Gatos","United States"
"friendid","Male","24 years old","San Francisco, California","United States"
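
Python's standard csv module can do this quoting for you, so you don't have 
to add the quote marks by hand.  A rough sketch, assuming the records have 
already been split into lists of strings the way the code quoted above does 
it (the names `records` and `to_csv` are mine, just for illustration):

```python
import csv
import io

def to_csv(records):
    """Return the records as CSV text with every field quoted."""
    buf = io.StringIO()
    # QUOTE_ALL wraps every field in double quotes, so an embedded
    # comma (as in "San Francisco, California") cannot split a column.
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerows(records)
    return buf.getvalue()

# Hypothetical sample data mirroring the two records in the question.
records = [
    ["friendid", "Female", "23 years old", "Los Gatos", "United States"],
    ["friendid", "Male", "24 years old", "San Francisco, California",
     "United States"],
]

print(to_csv(records), end="")
```

To write to a file instead of a string, pass an open file object to 
csv.writer in place of the StringIO buffer.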

Numbers should not be quoted if you wish to treat them as numeric and not 
text. 
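
If you do want numbers left unquoted, the csv module has a setting for that 
as well.  A small sketch, assuming you have already converted the age into a 
bare integer (that conversion is my assumption, the original data has 
"23 years old" as text; `rows` and `to_csv_nonnumeric` are made-up names):

```python
import csv
import io

def to_csv_nonnumeric(rows):
    """Return CSV text with strings quoted and numbers left bare."""
    buf = io.StringIO()
    # QUOTE_NONNUMERIC quotes every field that is not a number, so
    # ints and floats are written without quote marks.
    writer = csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC,
                        lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()

# Hypothetical rows where the age is stored as an int, not a string.
rows = [
    ["friendid", "Female", 23, "Los Gatos"],
    ["friendid", "Male", 24, "San Francisco, California"],
]

print(to_csv_nonnumeric(rows), end="")
```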




