Characters aren't displayed correctly

Hussein B hubaghdadi at gmail.com
Tue Mar 3 04:49:01 EST 2009


On Mar 3, 11:05 am, Hussein B <hubaghd... at gmail.com> wrote:
> On Mar 2, 5:40 pm, John Machin <sjmac... at lexicon.net> wrote:
>
>
>
> > On Mar 3, 1:50 am, Hussein B <hubaghd... at gmail.com> wrote:
>
> > > On Mar 2, 4:31 pm, John Machin <sjmac... at lexicon.net> wrote:
> > > > On Mar 2, 7:30 pm, Hussein B <hubaghd... at gmail.com> wrote:
>
> > > > > On Mar 1, 4:51 pm, Philip Semanchuk <phi... at semanchuk.com> wrote:
>
> > > > > > On Mar 1, 2009, at 8:31 AM, Hussein B wrote:
>
> > > > > > > Hey,
> > > > > > > I'm retrieving records from a MySQL database that contains
> > > > > > > non-English characters.
>
> > > > Can you reveal which language???
>
> > > Arabic
>
> > > > > > > Then I create a String that contains HTML markup and column values
> > > > > > > from the previous result set.
> > > > > > > +++++
> > > > > > > markup = u'''<table>.....'''
> > > > > > > for row in rows:
> > > > > > >     markup = markup + '<tr><td>' + row['id']
> > > > > > > markup = markup + '</table>'
> > > > > > > +++++
> > > > > > > Then I'm sending the email according to this tip:
> > > > > > > http://code.activestate.com/recipes/473810/
> > > > > > > Well, the email contains ????? characters in place of the
> > > > > > > non-English ones.
> > > > > > > Any ideas?
>
> > > > > > There are so many places where this could go wrong, and you
> > > > > > haven't narrowed down the problem.
>
> > > > > > Are the characters stored in the database correctly?
>
> > > > > Yes they are.
>
> > > > How do you KNOW that they are stored correctly? What makes you so
> > > > sure?
>
> > > Because MySQL Query Browser displays them correctly. In addition, I
> > > use BIRT as the reporting system, and it shows them correctly.
>
> > > > > > Are they stored consistently (i.e. all using the same encoding, not  
> > > > > > some using utf-8 and others using iso-8859-1)?
>
> > > > > Yes.
>
> > > > So what is the encoding used to store them?
>
> > > The tables are created with the UTF-8 encoding option.
>
> > > > > > What are you getting out of the database? Is it being converted to  
> > > > > > Unicode correctly, or at all?
>
> > > > > I don't know. How can I check this?
>
> > > > You could show us some of the output from the database query. As well
> > > > as
> > > >    print the_output
> > > > you should
> > > >    print repr(the_output)
> > > > and show us both, and also tell us what you *expect* to see.
>
> > > The result of print repr(row['name']) is '??? ??????'
> > > The '?' characters are supposed to be Arabic characters.
>
> > Are you expecting 3 Arabic characters, a space, and then 6 Arabic
> > characters?
>
> > We now have some interesting evidence: row['name'] is NOT a unicode
> > object -- otherwise the print would show u'??? ??????'; it's a str
> > object.
>
> > So: A utf8-encoded string is being decoded to unicode, and then
> > re-encoded to some other encoding, using the "replace" (with "?")
> > error-handling method. That shouldn't be hard to spot! It's about
> > time you showed us the code you are using to extract the data from
> > the database, including the print statements you have put in.
>
> This is how I retrieve the data:
>
> db = MySQLdb.connect(host="127.0.0.1", port=3306, user="username",
>                      passwd="passwd", db="reporting")
> cr = db.cursor(MySQLdb.cursors.DictCursor)
> cr.execute(sql)
> rows = cr.fetchall()
>
> Thanks all for your nice help.

Hey,
I added the use_unicode and charset keyword arguments to the connect()
call, and now I get the following:
u'\u062f\u062e\u0648\u0644 \u0633\u0631\u064a\u0639 \u0634\u0647\u0631'
So the characters are now being converted to Unicode successfully.
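For reference, here is a minimal sketch of the change described above, plus a small demonstration of why the earlier output was all question marks. The host, user, password and database name are the placeholders from the connect() call quoted earlier. The names connection_params and lossy are illustrative only, and latin-1 as the lossy target encoding is an assumption, since the thread never establishes which encoding was actually involved. The sketch is written for a current Python 3 interpreter rather than the 2.5 used in this thread, and the connect() call itself is left commented out because it needs a reachable server.

```python
# Keyword arguments for MySQLdb.connect(). use_unicode and charset are the
# two additions mentioned above; every other value is a placeholder.
connection_params = dict(
    host="127.0.0.1",
    port=3306,
    user="username",
    passwd="passwd",
    db="reporting",
    use_unicode=True,   # return text columns as unicode objects
    charset="utf8",     # character set used for the connection
)

# Requires the MySQLdb package and a running server, so it is not executed:
# import MySQLdb
# db = MySQLdb.connect(**connection_params)

# The value printed above, written with the same escapes.
name = u"\u062f\u062e\u0648\u0644 \u0633\u0631\u064a\u0639 \u0634\u0647\u0631"

# Assumption: re-encoding to a charset that has no Arabic letters, with the
# "replace" error handler, turns every Arabic letter into "?".
lossy = name.encode("latin-1", "replace")
print(repr(lossy))
```

The spaces survive the lossy step because they exist in latin-1; only the Arabic letters are replaced.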
However, when I send the mail using the same recipe as before:
http://code.activestate.com/recipes/473810/
I get the following error:

Traceback (most recent call last):
  File "HtmlMail.py", line 52, in <module>
    s.sendmail(sender, receiver , msg.as_string())
  File "/usr/lib/python2.5/email/message.py", line 131, in as_string
    g.flatten(self, unixfrom=unixfrom)
  File "/usr/lib/python2.5/email/generator.py", line 84, in flatten
    self._write(msg)
  File "/usr/lib/python2.5/email/generator.py", line 109, in _write
    self._dispatch(msg)
  File "/usr/lib/python2.5/email/generator.py", line 135, in _dispatch
    meth(msg)
  File "/usr/lib/python2.5/email/generator.py", line 201, in _handle_multipart
    g.flatten(part, unixfrom=False)
  File "/usr/lib/python2.5/email/generator.py", line 84, in flatten
    self._write(msg)
  File "/usr/lib/python2.5/email/generator.py", line 109, in _write
    self._dispatch(msg)
  File "/usr/lib/python2.5/email/generator.py", line 135, in _dispatch
    meth(msg)
  File "/usr/lib/python2.5/email/generator.py", line 201, in _handle_multipart
    g.flatten(part, unixfrom=False)
  File "/usr/lib/python2.5/email/generator.py", line 84, in flatten
    self._write(msg)
  File "/usr/lib/python2.5/email/generator.py", line 109, in _write
    self._dispatch(msg)
  File "/usr/lib/python2.5/email/generator.py", line 135, in _dispatch
    meth(msg)
  File "/usr/lib/python2.5/email/generator.py", line 178, in _handle_text
    self._fp.write(payload)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 115-118: ordinal not in range(128)


Again, any ideas, guys? :)
Thanks to you all, you rock!
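
One possible direction, offered as a sketch rather than a confirmed fix: the traceback ends in _handle_text with an 'ascii' codec error, which suggests the HTML part reached the generator as a unicode object with no charset declared. Passing a charset to MIMEText lets the email package encode the body itself. The code below targets a current Python 3 interpreter. On the Python 2.5 shown in the traceback, the usual equivalent would be to pass html.encode('utf-8') together with the 'utf-8' charset, which is untested here. The function build_html_message, the subject and the example.com addresses are hypothetical, and the message structure is simplified to a single multipart/alternative container.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_html_message(html, subject, sender, receiver):
    """Wrap an HTML string in a MIME message with a declared UTF-8 charset."""
    msg = MIMEMultipart("alternative")
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = receiver
    # The third argument declares the charset, so the body is encoded by the
    # email package instead of falling back to ASCII.
    msg.attach(MIMEText(html, "html", "utf-8"))
    return msg


# Sample value taken from the output shown earlier in this message.
name = u"\u062f\u062e\u0648\u0644 \u0633\u0631\u064a\u0639 \u0634\u0647\u0631"
html = u"<table><tr><td>" + name + u"</td></tr></table>"

msg = build_html_message(html, "Report", "sender@example.com",
                         "receiver@example.com")
wire_text = msg.as_string()

# Sending is left out on purpose; with smtplib it would look like:
# s.sendmail(sender, receiver, wire_text)
```

The serialized text contains only ASCII because the body is transfer-encoded, and the receiving mail client uses the declared charset to decode it back to Arabic.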


