[XML-SIG] [Bug #125004] 4xslt: XPath doesn't like ISO-8859-1

noreply@sourceforge.net noreply@sourceforge.net
Fri, 8 Dec 2000 08:22:59 -0800


Bug #125004, was updated on 2000-Dec-08 08:22
Here is a current snapshot of the bug.

Project: Python/XML
Category: 4Suite
Status: Open
Resolution: None
Bug Group: None
Priority: 5
Submitted by: ornicar
Assigned to : Nobody
Summary: 4xslt: XPath doesn't like ISO-8859-1

Details: 
Hello,

  Today, I'm using the 4xslt XSLT engine to transform an XML file into
another, nicer XML file. In the attached example, data.xml contains a
short description of my agenda, but I don't like this representation
because each <appointment> node holds both the time of the meeting
(before the word ' Durée: ') and its duration (after the word
' Durée: '); in French, 'Durée' means 'Duration':
    <appointment>11h00 Durée: 20mn</appointment>

  Therefore, I wrote an XSLT stylesheet to turn my agenda into a new
agenda with separate nodes for the time of the meeting and the duration
of the meeting:
     <appointment>
       <time>11h00</time>
       <duration>20mn</duration>
     </appointment>

   The XSLT stylesheet has to take the content of the text node child of
the <appointment> node and split it into two parts: what comes before
' Durée: ' and what comes after. This is easily done in XSLT with the
substring-before() and substring-after() functions:
    <xsl:value-of select="substring-before(text(),' Durée: ')" />
and
    <xsl:value-of select="substring-after(text(),' Durée: ')" />
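
For reference, the behaviour expected from these two XPath 1.0 functions can be sketched in Python. The helper names substring_before and substring_after are illustrative only and are not part of 4XPath; the sketch assumes Python 3 strings.

```python
def substring_before(s, sep):
    """XPath 1.0 substring-before(): text preceding the first occurrence
    of sep in s, or the empty string if sep does not occur."""
    if sep == "":
        return ""            # nothing precedes an empty match at position 0
    head, found, _ = s.partition(sep)
    return head if found else ""


def substring_after(s, sep):
    """XPath 1.0 substring-after(): text following the first occurrence
    of sep in s, or the empty string if sep does not occur."""
    if sep == "":
        return s             # everything follows an empty match at position 0
    _, found, tail = s.partition(sep)
    return tail if found else ""


text = "11h00 Dur\u00e9e: 20mn"   # content of one <appointment> node
sep = " Dur\u00e9e: "
print(substring_before(text, sep))
print(substring_after(text, sep))
```

With the first appointment of data.xml, the two calls are expected to yield the time and the duration that the stylesheet should place in <time> and <duration>.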

  Unfortunately, 4xslt doesn't like the "é" character in an XPath
expression (the expression inside the select attribute) and fails with
the attached stack trace, which ends with:
    xml.xpath.XPathParserBase.SyntaxException: 
    ********** Syntax Exception **********
    Exception at or near "Ã"
      Line: 0, Production Number: 0
Of course, replacing "Durée" with "Duree" in both the XML file and the
XSLT stylesheet works around the bug, but this is not very satisfying.
Another XSLT engine (e.g. Xalan) performs the transformation even with
the "é" character.

  This seems to be a bug in XPath expression processing (4xpath doesn't
like ISO-8859-1 characters).
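
The "Ã" in the error message is consistent with one possible cause, which is only a guess and not something verified in the 4XPath source: the XML parser may hand the select attribute to the XPath parser as UTF-8 bytes, which are then read one byte at a time. The sketch below (Python 3, no 4Suite code involved) shows that the UTF-8 encoding of "é" starts with byte 0xC3, which displays as "Ã" when interpreted as ISO-8859-1.

```python
# The XPath expression from the stylesheet, as a Unicode string.
select = "substring-before(text(),' Dur\u00e9e: ')"

# Assumption: the parser returns the attribute value encoded as UTF-8.
utf8_bytes = select.encode("utf-8")

# Reading those bytes as ISO-8859-1 turns the two-byte sequence for
# "é" (0xC3 0xA9) into two separate characters.
misread = utf8_bytes.decode("iso-8859-1")

print(utf8_bytes[select.index("\u00e9"):][:2])  # the two bytes encoding "é"
print(misread)                                  # expression as a byte-wise lexer would see it
```

If this guess is right, the lexer fails on the first byte it does not recognise, which would explain why the exception is reported at or near "Ã" and not at "é".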

    O. CAYROL.

PS: the attached files are below.
_________________________________________________________________________
Olivier CAYROL LOGILAB - Paris (France)
                                                 http://www.logilab.com/
For Christmas, give yourself an Intelligent Personal Assistant (free)
Pour Noël, offrez-vous un Assistant Personnel Intelligent (c'est gratuit)
_________________________________________________________________________

_________________________________________________________________________
data.xml "Initial XML file"

<?xml version="1.0" encoding="ISO-8859-1"?>

<agenda>
  <appointment>11h00 Durée: 20mn</appointment>
  <appointment>11h30 Durée: 40mn</appointment>
</agenda>
_________________________________________________________________________
transf.xslt "XSLT stylesheet"

<?xml version="1.0" encoding="ISO-8859-1"?>

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:strip-space elements="*"/>
  <xsl:output method="xml" 
              encoding="ISO-8859-1" 
              indent="yes" />

  <xsl:template match="/">
<xmlagenda>
    <xsl:apply-templates select="agenda/appointment"/>
</xmlagenda>
  </xsl:template>

  <xsl:template match="appointment">
<appointment>
<time>
    <xsl:value-of select="substring-before(text(),' Durée: ')"/>
</time>
<duration>
    <xsl:value-of select="substring-after(text(),' Durée: ')"/>
</duration>
</appointment>
  </xsl:template>

</xsl:stylesheet>
_________________________________________________________________________
agenda.xml "Expected XML output"

<?xml version="1.0" encoding="ISO-8859-1"?>
<xmlagenda>
    <appointment>
        <time>11h00</time>
        <duration>20mn</duration>
    </appointment>
    <appointment>
        <time>11h30</time>
        <duration>40mn</duration>
    </appointment>
</xmlagenda>
_________________________________________________________________________
Stacktrace

 $ 4xslt data.xml transf.xslt
Traceback (innermost last):
  File "/usr/bin/4xslt", line 5, in ?
    _4xslt.Run(sys.argv)
  File "/usr/lib/python1.5/site-packages/xml/xslt/_4xslt.py", line 85, in Run
    processor.appendStylesheetUri(sty)
  File "/usr/lib/python1.5/site-packages/xml/xslt/Processor.py", line 86, in appendStylesheetUri
    sty = self._styReader.fromUri(styleSheetUri)
  File "/usr/lib/python1.5/site-packages/Ft/Lib/ReaderBase.py", line 99, in fromUri
    rt = self.fromStream(stream, baseUri, ownerDoc, stripElements)
  File "/usr/lib/python1.5/site-packages/xml/xslt/StylesheetReader.py", line 300, in fromStream
    sheet.setup()
  File "/usr/lib/python1.5/site-packages/xml/xslt/Stylesheet.py", line 144, in setup
    curr_node.setup()
  File "/usr/lib/python1.5/site-packages/xml/xslt/ValueOfElement.py", line 34, in setup
    self.__dict__['_expr'] = parser.parseExpression(self._select)
  File "/usr/lib/python1.5/site-packages/xml/xpath/XPathParser.py", line 36, in parseExpression
    XPathParserBase.XPathParserBase.parse(self, st)
  File "/usr/lib/python1.5/site-packages/xml/xpath/XPathParserBase.py", line 60, in parse
    XPath.cvar.g_prodNum)
xml.xpath.XPathParserBase.SyntaxException: 
********** Syntax Exception **********
Exception at or near "Ã"
  Line: 0, Production Number: 0

_________________________________________________________________________



For detailed info, follow this link:
http://sourceforge.net/bugs/?func=detailbug&bug_id=125004&group_id=6473