Becoming Unicode Aware
Bengt Richter
bokr at oz.net
Wed Oct 27 19:33:52 EDT 2004
On Wed, 27 Oct 2004 12:56:32 +0200, "Diez B. Roggisch" <deetsNOSPAM at web.de> wrote:
>> My main problem with understanding unicode is what to do with
>> arbitrary text without an encoding specified. To the best of my
>> knowledge the technical term for this situation is 'buggered'. E.g. I
>> have a CGI guestbook script. Is the only way of knowing what encoding
>> the user is typing in, to ask them ?
>
>Unfortunately the HTTP standard seems to lack a specification of how the
>encoding of form data is to be transferred. But it seems that most browsers
>that understand the encoding your page is delivered in will use that
>encoding for replying.
>
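If that browser behaviour holds, the script can decode submitted form data
with the charset it served the page in. A minimal sketch in present-day
Python 3, assuming the page went out as UTF-8; PAGE_CHARSET and decode_form
are hypothetical names, and strict error handling is my choice so that a
wrong guess fails loudly rather than producing garbage:

```python
from urllib.parse import parse_qs

# Assumption: the charset the form page was served in.
PAGE_CHARSET = "utf-8"


def decode_form(query_string, charset=PAGE_CHARSET):
    """Parse URL-encoded form data, assuming it arrived in `charset`."""
    # errors="strict" raises UnicodeDecodeError if the bytes do not fit
    # the assumed charset, instead of silently replacing characters.
    fields = parse_qs(query_string, encoding=charset, errors="strict")
    # Keep only the first value of each field for simplicity.
    return {name: values[0] for name, values in fields.items()}


print(decode_form("name=J%C3%BCrgen"))
```

A browser that ignored the page charset would either trigger the
UnicodeDecodeError or, worse, decode to the wrong characters, so this is
only as reliable as the assumption behind it.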
>
>> Anyway - ConfigObj reads config files from plain text files. Is there
>> a standard for specifying the encoding within the text file ? I know
>> python scripts have a method - should I just use that ?
>
>No idea what configobj is - is it your own config parser?
>
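I don't know of a general standard for plain text files, but borrowing the
coding comment that Python source files use (PEP 263) seems workable. A rough
sketch in present-day Python 3 follows; sniff_encoding is a hypothetical
helper, not something ConfigObj is known to provide, and using '#' as the
comment character of the config format is an assumption:

```python
import re

# Pattern modelled on the one PEP 263 gives for Python source files.
CODING_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def sniff_encoding(raw_lines, default="ascii"):
    """Look for a coding comment in the first two undecoded lines."""
    for line in raw_lines[:2]:
        match = CODING_RE.match(line)
        if match:
            # Encoding names are plain ASCII, so this decode is safe.
            return match.group(1).decode("ascii")
    return default


sample = [b"# -*- coding: utf-8 -*-\n", b"greeting = gr\xc3\xbc\xc3\x9f dich\n"]
encoding = sniff_encoding(sample)
text = [line.decode(encoding) for line in sample]
print(encoding, text)
```

The file has to be read as bytes first, since the declaration can only be
found before the encoding is known.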
>> Also - suppose I know the encoding, or let the programmer specify, is
>> the following sufficient for reading the files in :
>>
>> def afunction(setoflines, encoding='ascii'):
>>     for line in setoflines:
>>         if encoding:
>>             line = line.decode(encoding)
>
>Yes, it should be - but why the if? It is unnecessary, as its condition will
>always be true - and you _want_ it that way, as the result of afunction
 ^^^^^^^^^^^^^^
     afunction(lines, None)
 would seem to be a feasible call ;-)
>should always be unicode objects, no matter what encoding was used.
>
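To make that concrete, here is a minimal sketch in present-day Python 3,
where undecoded lines are bytes objects. The name decode_lines is
hypothetical, and reading encoding=None as "pass the lines through
untouched" is my assumption about what the if was meant for:

```python
def decode_lines(lines, encoding="ascii"):
    """Return the lines decoded to text, or unchanged if encoding is falsy."""
    result = []
    for line in lines:
        # A falsy encoding (None or "") skips decoding, so the condition
        # is not always true and the caller gets the raw bytes back.
        if encoding:
            line = line.decode(encoding)
        result.append(line)
    return result


raw = [b"name = value\n", b"caf\xc3\xa9 = open\n"]
print(decode_lines(raw, "utf-8"))   # text strings
print(decode_lines(raw, None))      # bytes, untouched
```

Whether returning bytes in the None case is desirable is another matter;
it does mean the caller can no longer count on getting unicode back.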
Regards,
Bengt Richter