[XML-SIG] [ pyxml-Bugs-540663 ] ns_parse parsing error for titles

noreply@sourceforge.net
Sun, 07 Apr 2002 11:53:48 -0700


Bugs item #540663, was opened at 2002-04-07 11:53
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=540663&group_id=6473

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Alan W. Irwin (airwin)
Assigned to: Nobody/Anonymous (nobody)
Summary: ns_parse parsing error for titles

Initial Comment:
This bug has also been reported at 
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=141549&repeatmerged=yes
but the Debian maintainer suggested I forward it upstream, which is here.

Here is a fragment of a bookmarks.html file that generates the problem.

</DL><p>
<DT><H3 ADD_DATE="921214822" LAST_MODIFIED="1016325685"
ID="NC:BookmarksRoot#$ad5df158">Online Stores and Commercial Stuff</H3>
<DD>&lt;sort order=&quot;normal&quot;&gt;title&lt;/sort&gt;.
<DL><p>

Here is the xbel result fragment generated by ns_parse:

  <folder>
    <title>Online Stores and Commercial Stuf</title>
  <folder>
    <title>f</title>

The "Stuff" in the folder title has been split across two folders,
"...Stuf" and "f".  Later in the output there is an extra closing </folder>
for the "f" folder.  To work around the problem I have to hand-apply the
patch below to my specific xbel file (it is not for general use); the patch
also gives another view of the effect of the error.
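
A quick way to see where the title breaks is to measure the two pieces. This
is only a sketch using the strings quoted in this report; the variable names
are made up for illustration and nothing here touches ns_parse itself.

```python
# Strings copied from the report; the names are illustrative only.
full_title = "Online Stores and Commercial Stuff"
first_piece = "Online Stores and Commercial Stuf"
second_piece = "f"

# The two generated titles together make up the original title.
assert first_piece + second_piece == full_title

# Offset at which the title was cut.
split_at = len(first_piece)
print(split_at)
```

The first piece comes out at 33 characters, which is the length the title
appears to be limited to.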

I am a SAX newbie, but I think the effect of the
p=saxexts.SGMLParserFactory.make_parser() call in ns_parse is to let the
system choose a default parser.  The title is cut after exactly 33
characters, so it may be that this problem can be worked around by
specifying a particular parser that does not have this 33-character
limitation on the title length.  Please let me know if you have a parser
suggestion which I could try.
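
One possibility worth ruling out, and it is only an assumption since nothing
in this report confirms it, is that the parser delivers the title text in
more than one characters() call and the handler treats each call as a
complete title. SAX parsers are generally allowed to split character data
that way. The sketch below shows a handler that buffers the chunks until the
closing tag. It uses the standard library's xml.sax rather than the saxexts
API that ns_parse uses, and the TitleCollector class is hypothetical.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Collect <title> text, joining chunks from separate characters() calls."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._buffer = None  # None means we are not inside a <title>

    def startElement(self, name, attrs):
        if name == "title":
            self._buffer = []  # start collecting text for this title

    def characters(self, content):
        # May be called several times for one run of text.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title" and self._buffer is not None:
            # Only now is the title known to be complete.
            self.titles.append("".join(self._buffer))
            self._buffer = None


handler = TitleCollector()
xml.sax.parseString(
    b"<folder><title>Online Stores and Commercial Stuff</title></folder>",
    handler,
)
print(handler.titles)
```

If the real handler instead emitted a title on every characters() call, a
text run delivered in two chunks would show up as two titles, which would
look like the split described above.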


BTW, the bookmarks file that triggers the error has 3000+ bookmarks in it
and is the data that generates the Loads of Linux Links site at
loll.sf.net/linux/links.  It's 0.6 MB, uncompressed.  Let me know if you
would like me to send you a compressed version of this file by e-mail to
help verify this bug.

*********** diff between generated and hand-corrected xbel file.
--- bookmarks.xbel      Sat Apr  6 15:26:43 2002
+++ bookmarks.xbel_fixed        Sat Apr  6 15:39:41 2002
@@ -8132,9 +8132,7 @@
     </bookmark>
   </folder>
   <folder>
-    <title>Online Stores and Commercial Stuf</title>
-  <folder>
-    <title>f</title>
+    <title>Online Stores and Commercial Stuff</title>
     <bookmark href="http://www.mvista.com/company/" added="1008473434" visited="1008473426" modified="1008473426" >
       <title>MontaVista Software - Company</title>
     </bookmark>
@@ -10307,6 +10305,5 @@
     <bookmark href="http://www.linux.com/interact/links/Software/X_Window_System/" added="1010369447" modified="1013736816" >
       <title>Linux.com : Links - Software: X_Window_System</title>
     </bookmark>
-  </folder>
   </folder>
 </xbel>

***********


