kdialog and unicode

John Machin sjmachin at lexicon.net
Wed Apr 27 08:15:56 EDT 2005


On 26 Apr 2005 19:16:25 -0700, dmbkiwi at gmail.com wrote:

>
>John Machin wrote:
>> On 26 Apr 2005 13:39:26 -0700, dmbkiwi at gmail.com (dumbkiwi) wrote:
>>
>> >Peter Otten <__peter__ at web.de> wrote in message
>> >news:<d4l92e$9di$05$1 at news.t-online.com>...
>> >> Dumbkiwi wrote:
>> >>
>> >> >> Just encode the data in the target encoding before passing it to
>> >> >> os.popen():
>>
>> >
>> >Anyway, from your post, I've done some more digging, and found the
>> >command:
>> >
>> >sys.setappdefaultencoding()
>> >
>> >which I've used, and it's fixed the problem (I think).
>> >
>>
>> Dumb Kiwi, eh? Maybe not so dumb -- where'd you find
>> sys.setappdefaultencoding()? I'm just a dumb Aussie [1]; I looked in
>> the 2.4.1 docs and also did import sys; dir(sys) and I can't spot it.
>
>Hmmm. See post above, seems to be something generated by eric3.  So
>this may not be the fix I'm looking for.
>
>>
>> In any case, how could the magical sys.setappdefaultencoding() fix
>> your problem? From your description, your problem appeared to be that
>> you didn't know what encoding to use.
>
>I knew what encoding to use,

Would you mind telling us (a) what that encoding is (b) how you came
to that knowledge (c) why you just didn't do

test = os.popen('kdialog --inputbox %s' %(data.encode('that_encoding')))

instead of

test = os.popen('kdialog --inputbox %s' %(data.encode('utf-8'))) 

> the problem was that the text was being
>passed to kdialog as ascii.

It wasn't being passed to kdialog; there was an attempt which failed.
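
A minimal sketch of that failure, assuming the implicit conversion went through the ASCII codec. It is written for Python 3, where the conversion has to be spelled out; the sample string and the helper try_encode() are made up for illustration, since the actual data has not been shown:

```python
# Hypothetical sample text; the poster's real data was never shown.
data = "za\u017c\u00f3\u0142\u0107"  # Polish letters outside ASCII


def try_encode(text, encoding):
    """Return the encoded bytes, or None if the codec cannot represent text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


# The ASCII codec has no mapping for the Polish letters, so this fails
# before anything could be handed to an external program.
ascii_result = try_encode(data, "ascii")

# An explicit utf-8 encode always succeeds for valid Unicode text.
utf8_result = try_encode(data, "utf-8")

print("ascii:", ascii_result)
print("utf-8:", repr(utf8_result))
```

Whether the receiving program then interprets those bytes as utf-8 is a separate question.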

>  The .encode('utf-8') at least allows
>kdialog to run, but the text still looks like crap.  Using
>sys.setappdefaultencoding() seemed to help.  The text looked a bit
>better - although not entirely perfect - but I think that's because the
>font I was using didn't have the correct characters (they came up as
>square boxes).

And the font you *were* using is what? And the font you are now using
is what? What facilities do you have to use different fonts?

>>
>> What is the essential difference between
>>
>>    send(u_data.encode('polish'))
>>
>> and
>>
>>    sys.setappdefaultencoding('polish')
>>    ...
>>    send(u_data)
>
>Not sure - I'm new to character encoding, and most of this seems like
>black magic to me.

The essential difference is that setting a default encoding is a daft
idea. 
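
One way to see why, sketched in Python 3 under some loud assumptions: 'polish' above is a placeholder rather than a real codec name, so iso-8859-2 stands in for "a legacy Polish charset", and send() is a hypothetical byte-oriented sink. The same text becomes different bytes under different encodings, so the choice is best made visibly at the point where the bytes leave the program:

```python
sent = []  # stands in for whatever the real send() would write to


def send(payload):
    """Hypothetical sink that accepts bytes only, never text."""
    if not isinstance(payload, bytes):
        raise TypeError("send() needs bytes; encode the text first")
    sent.append(payload)


u_data = "\u0142\u00f3d\u017a"  # a short Polish word with three non-ASCII letters

# The encoding is named right where the bytes are produced.
send(u_data.encode("utf-8"))
send(u_data.encode("iso-8859-2"))  # assumed stand-in for 'polish'

# Same text, different byte sequences: the receiver has to know which one.
same_text_different_bytes = sent[0] != sent[1]
print(sent, same_text_different_bytes)
```

With a process-wide default, the second form in the question would silently pick one of these, and nothing at the call site says which.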


>
>>
>> [1]: Now that's *TWO* contenders for TautologyOTW :-)
>> 

Before I retract that back to one contender, I'll give it one more
shot:

1. Your data: you say it is Polish text, and is utf-8. This implies
that it is in Unicode, encoded as utf-8. What evidence do you have?
Have you been able to display it anywhere so that it "looks good"?
If it's not confidential, can you show us a dump of the first say 100
bytes of text, in an unambiguous form, like this:

print repr(open('polish.text', 'rb').read(100))

2. Your script: You say "I then manipulate the data to break it down
into text snippets" - uh-huh ... *what* manipulations? Care to tell
us? Care to show us the code?

3. kdialog: I know nothing of KDE and its toolkit. I would expect
either (a) it should take utf-8 and be able to display *any* of the
first 64K (nominal) Unicode characters, given a Unicode font or (b)
you can encode your data in a legacy charset, *AND* tell it what that
charset is, and have a corresponding font or (c) you have both
options. Which is correct, and what are the details of how you can
tell kdialog what to do -- configuration? command-line arguments?
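
For point 1, a Python 3 sketch of the same dump, plus a rough check of whether the bytes decode as utf-8. The helper names dump_head() and decodes_as_utf8() and the sample text are invented for illustration; the check only shows that the bytes are valid utf-8, not that they are the text you expect:

```python
import codecs
import os
import tempfile


def dump_head(path, n=100):
    """Return the first n bytes of the file, unmodified."""
    with open(path, "rb") as f:
        return f.read(n)


def decodes_as_utf8(raw):
    """True if raw is valid utf-8, tolerating a character cut off at the end."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False buffers an incomplete trailing sequence instead of
        # raising, since a 100-byte cut may land inside a character.
        decoder.decode(raw, final=False)
    except UnicodeDecodeError:
        return False
    return True


# Made-up Polish sample written out as utf-8, standing in for the real file.
sample = "Za\u017c\u00f3\u0142\u0107 g\u0119\u015bl\u0105 ja\u017a\u0144\n" * 10
with tempfile.NamedTemporaryFile(delete=False, suffix=".text") as tmp:
    tmp.write(sample.encode("utf-8"))
    path = tmp.name
try:
    head = dump_head(path)
finally:
    os.remove(path)

print(repr(head))
print(decodes_as_utf8(head))
```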

HTHYTHYS,

John


