[Tutor] parse text file

Norman Khine norman at khine.net
Tue Feb 2 19:39:33 CET 2010


On Tue, Feb 2, 2010 at 4:19 PM, Kent Johnson <kent37 at tds.net> wrote:
> On Tue, Feb 2, 2010 at 9:33 AM, Norman Khine <norman at khine.net> wrote:
>> On Tue, Feb 2, 2010 at 1:27 PM, Kent Johnson <kent37 at tds.net> wrote:
>>> On Tue, Feb 2, 2010 at 4:16 AM, Norman Khine <norman at khine.net> wrote:
>>>
>>>> here are the changes:
>>>>
>>>> import re
>>>> file=open('producers_google_map_code.txt', 'r')
>>>> data =  repr( file.read().decode('utf-8') )
>>>
>>> Why do you use repr() here?
>>
>> i have latin-1 chars in the 'producers_google_map_code.txt' file and
>> this is the only way to get it to read the data.
>>
>> is this incorrect?
>
> Well, the repr() call is after the file read. If your data is latin-1
> you should decode it as latin-1, not utf-8:
> data = file.read().decode('latin-1')
>
> Though if the decode('utf-8') succeeds, and you do have non-ascii
> characters in the data, they are probably encoded in utf-8, not
> latin-1. Are you sure you have latin-1?
>
> The repr() call converts back to ascii text, maybe that is what you want?
>
> Perhaps you put in the repr because you were having trouble printing?
>
> It smells of programming by guess rather than a correct solution to
> some problem. What happens if you take it out?
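
A rough sketch of the latin-1 versus utf-8 point above, written in Python 3 syntax rather than the Python 2 used in this thread. The helper name try_decode and the sample bytes are made up for illustration; they are not taken from the real file. It only shows that a latin-1 decode always "succeeds" while a utf-8 decode is strict, which is why a successful decode('utf-8') suggests the data is really utf-8.

```python
def try_decode(raw, encoding):
    """Return the decoded text, or None if raw is not valid in that encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# Hypothetical sample: the word "cafe" with an e-acute, encoded both ways.
latin1_bytes = "caf\u00e9".encode("latin-1")  # b'caf\xe9'
utf8_bytes = "caf\u00e9".encode("utf-8")      # b'caf\xc3\xa9'

# latin-1 maps every possible byte to a character, so decoding never fails.
# On utf-8 data it quietly produces two wrong characters instead of one.
wrong_guess = try_decode(utf8_bytes, "latin-1")

# utf-8 is strict: these latin-1 bytes are not a valid utf-8 sequence.
strict_result = try_decode(latin1_bytes, "utf-8")

print(repr(wrong_guess))    # two characters where the e-acute should be
print(repr(strict_result))  # None
```

If decode('utf-8') works on the whole file and the non-ascii characters look right afterwards, that is reasonable evidence the file is utf-8.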

when i take the repr() call out, i get an empty list.

whereas both
data = repr( file.read().decode('latin-1') )
and
data = repr( file.read().decode('utf-8') )

return the full list.
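
One possible explanation, not confirmed anywhere in this thread: repr() turns each real newline into the two characters backslash and n, so a pattern whose '.' has to cross a line break can match the repr() text but not the original text. The sketch below (Python 3 syntax) uses a made-up sample string and a hypothetical pattern, not the real file or the real regex, and shows re.DOTALL as an alternative to repr().

```python
import re

# Made-up sample: one record split across two lines (not the real file).
text = "point = new GLatLng(\n    1.5, 2.5);"

# Hypothetical pattern. By default '.' does not match a newline.
pattern = r"GLatLng\((.*?)\)"

# On the real text the match would have to cross a newline, so it fails.
plain = re.findall(pattern, text)

# repr() replaces the newline with a literal backslash followed by 'n',
# so the same pattern now matches, but the result contains those characters.
via_repr = re.findall(pattern, repr(text))

# re.DOTALL lets '.' match newlines, so the real text can be used directly.
with_dotall = re.findall(pattern, text, re.DOTALL)

print(plain)
print(via_repr)
print(with_dotall)
```

If this is what is happening, passing re.DOTALL (or matching the newline explicitly in the pattern) would let the repr() call be dropped without ending up with an empty list.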

here is the file
http://cdn.admgard.org/documents/producers_google_map_code.txt

>
> Kent
>

