Becoming Unicode Aware

Bengt Richter bokr at oz.net
Wed Oct 27 19:33:52 EDT 2004


On Wed, 27 Oct 2004 12:56:32 +0200, "Diez B. Roggisch" <deetsNOSPAM at web.de> wrote:

>> My main problem with understanding unicode is what to do with
>> arbitrary text without an encoding specified. To the best of my
>> knowledge the technical term for this situation is 'buggered'. E.g. I
>> have a CGI guestbook script. Is the only way of knowing what encoding
>> the user is typing in, to ask them ?
>
>Unfortunately the HTTP standard seems to lack a specification of how form data
>encoding is to be transferred. But it seems that most browsers that
>understand the encoding your page is delivered in will use that encoding
>when replying.
>
> 
>> Anyway - ConfigObj reads config files from plain text files. Is there
>> a standard for specifying the encoding within the text file ? I know
>> python scripts have a method - should I just use that ?
>
>No idea what configobj is - is it your own config parser?
> 
>> Also - suppose I know the encoding, or let the programmer specify, is
>> the following sufficient for reading the files in :
>> 
>> def afunction(setoflines, encoding='ascii'):
>>     for line in setoflines:
>>         if encoding:
>>             line = line.decode(encoding)
>
>Yes, it should be - but why the if? It is unnecessary, as its condition will
>always be true - and you _want_ it that way, as the result of afunction
 ^^^^^^^^^^^^^^

    afunction(lines, None)

would seem to be a feasible call ;-)

>should always be unicode objects, no matter what encoding was used.
>
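
To make the point concrete, here is a minimal sketch. It assumes the lines
arrive as byte strings (plain str in the Python of this thread, bytes in later
versions). The name decode_lines and the returned list are additions for
illustration only and are not part of the function quoted above.

```python
def decode_lines(lines, encoding='ascii'):
    """Decode each byte-string line, unless the caller passes encoding=None.

    The `if encoding:` test is what makes a call with encoding=None
    meaningful: in that case the lines are returned undecoded.
    """
    decoded = []
    for line in lines:
        if encoding:
            # Turn the byte string into a unicode object.
            line = line.decode(encoding)
        decoded.append(line)
    return decoded


# Hypothetical sample data: two UTF-8 encoded config lines.
raw = [b'name = caf\xc3\xa9\n', b'answer = 42\n']

as_text = decode_lines(raw, 'utf-8')  # every item is a unicode object
as_bytes = decode_lines(raw, None)    # nothing is decoded
```

So the condition is only "always true" as long as nobody passes None or an
empty string. If the result really must always be unicode, it is probably
better to drop the test and let a missing encoding raise an error.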

Regards,
Bengt Richter


