[XML-SIG] [ pyxml-Bugs-573015 ] adr_parse does not properly handle utf8

noreply@sourceforge.net noreply@sourceforge.net
Mon, 24 Jun 2002 01:08:29 -0700


Bugs item #573015, was opened at 2002-06-24 10:08
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=573015&group_id=6473

Category: XBEL
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Alexandre Fayolle (afayolle)
Assigned to: Nobody/Anonymous (nobody)
Summary: adr_parse does not properly handle utf8

Initial Comment:
Please see
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=150765
for a full description of this bug.

------------------------------------

When converting my Opera 6 bookmarks to XBEL, I found what could be
considered a bug in adr_parse. Bookmarks containing non-ASCII
characters (in my case the German umlauts and sharp s, äöüß) are
stored as UTF-8 in the opera6.adr file, according to its header:

Opera Hotlist version 2.0
Options: encoding = utf8, version=3
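
For illustration, the encoding named in that header line could be read
with a few lines of Python. This is only a sketch: the function name
parse_hotlist_options is hypothetical (it is not part of adr_parse), and
it assumes the Options line is always a comma-separated list of
key = value pairs, as in the sample above.

```python
def parse_hotlist_options(line: str) -> dict:
    """Parse an 'Options: key = value, key=value' line into a dict.

    Assumption: the layout matches the single sample line quoted in
    this report; other Opera hotlist versions may differ.
    """
    prefix = "Options:"
    if not line.startswith(prefix):
        raise ValueError("not an Options line: %r" % line)
    options = {}
    for part in line[len(prefix):].split(","):
        key, sep, value = part.partition("=")
        if sep:  # skip fragments that contain no '='
            options[key.strip()] = value.strip()
    return options


opts = parse_hotlist_options("Options: encoding = utf8, version=3")
print(opts["encoding"])
```

A converter could pass the resulting encoding name on to whatever writes
the XBEL file, so that the output declaration matches the input bytes.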


Using the output of adr_parse as bookmarks in galeon reveals the
problem (adr_parse copies the UTF-8 bytes unchanged): bookmark entries
are truncated at the first non-ASCII character.

Manually editing the XML declaration in bookmarks.xbel seems to have
solved the problem:
-<?xml version="1.0"?>
+<?xml version="1.0" encoding="UTF-8"?>


This may not be a bug in adr_parse itself but in bookmark.py, which
seems to write UTF-8 encoded output without declaring the encoding.
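
The manual fix shown in the diff could be automated roughly as follows.
This is a sketch in present-day Python, not code from PyXML or
bookmark.py; declare_utf8 is a hypothetical helper name, and it assumes
the data really is UTF-8 (it raises an error otherwise).

```python
import re

# Matches an XML declaration at the very start of the data and captures
# the quote character, the version, and anything after the version
# (for example an encoding or standalone pseudo-attribute).
_XML_DECL = re.compile(
    rb'^<\?xml\s+version=(["\'])([^"\']*)\1(?P<rest>[^?]*)\?>'
)


def declare_utf8(data: bytes) -> bytes:
    """Return data with encoding="UTF-8" stated in its XML declaration."""
    data.decode("utf-8")  # raises UnicodeDecodeError if not valid UTF-8
    match = _XML_DECL.match(data)
    if match is None:
        # No declaration at all: prepend a complete one.
        return b'<?xml version="1.0" encoding="UTF-8"?>\n' + data
    if b"encoding" in match.group("rest"):
        # An encoding is already declared: leave the data untouched.
        return data
    declaration = (
        b'<?xml version="' + match.group(2) + b'" encoding="UTF-8"'
        + match.group("rest") + b"?>"
    )
    return declaration + data[match.end():]


sample = '<?xml version="1.0"?>\n<title>M\u00fcller</title>'.encode("utf-8")
print(declare_utf8(sample).decode("utf-8"))
```

Fixing the writer so that it emits the declaration itself would be
preferable to patching files afterwards; the helper only mirrors the
manual edit.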

Kai
