[XML-SIG] [Bug #125004] 4xslt: XPath doesn't like ISO-8859-1
noreply@sourceforge.net
Fri, 8 Dec 2000 08:22:59 -0800
Bug #125004, was updated on 2000-Dec-08 08:22
Here is a current snapshot of the bug.
Project: Python/XML
Category: 4Suite
Status: Open
Resolution: None
Bug Group: None
Priority: 5
Submitted by: ornicar
Assigned to : Nobody
Summary: 4xslt: XPath doesn't like ISO-8859-1
Details:
Hello,
I'm using the 4xslt XSLT engine to transform an XML file into another,
nicer XML file. The attached data.xml file contains a short description
of my agenda, but I don't like its representation: each <appointment>
node holds both the time of the meeting (before the word ' Durée: ') and
the duration of the meeting (after the word ' Durée: '). In French,
'Durée' means 'Duration'. For example:
<appointment>11h00 Durée: 20mn</appointment>
So I wrote an XSLT stylesheet that turns my agenda into a new agenda
with separate nodes for the time of the meeting and the duration of the
meeting:
<appointment>
<time>11h00</time>
<duration>20mn</duration>
</appointment>
The XSLT stylesheet has to take the content of the text node child of
the <appointment> node and split it into two parts: what comes before
' Durée: ' and what comes after. This is easily done in XSLT with the
substring-before() and substring-after() functions:
<xsl:value-of select="substring-before(text(),' Durée: ')" />
and
<xsl:value-of select="substring-after(text(),' Durée: ')" />
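For readers less familiar with these two XPath 1.0 functions, here is a rough Python sketch of their behaviour as I understand it from the XPath 1.0 recommendation. The names substring_before and substring_after are illustrative helpers only; they are not part of 4xslt or 4xpath.

```python
# Illustrative helpers mimicking XPath 1.0 substring-before() and
# substring-after(); these names are hypothetical, not a 4Suite API.

def substring_before(text, marker):
    """Part of text before the first occurrence of marker, else ''."""
    index = text.find(marker)
    return text[:index] if index >= 0 else ""


def substring_after(text, marker):
    """Part of text after the first occurrence of marker, else ''."""
    index = text.find(marker)
    return text[index + len(marker):] if index >= 0 else ""


appointment = "11h00 Dur\u00e9e: 20mn"   # "\u00e9" is the character "é"
marker = " Dur\u00e9e: "

time_part = substring_before(appointment, marker)
duration_part = substring_after(appointment, marker)
print(time_part, duration_part)
```

Both functions return an empty string when the marker is absent, which is why the marker has to match the document text exactly, accent included.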
Unfortunately, 4xslt doesn't like the "é" character in an XPath
expression (the expression inside the select attribute) and fails with
the attached stack trace, which ends with:
xml.xpath.XPathParserBase.SyntaxException:
********** Syntax Exception **********
Exception at or near "Ã"
Line: 0, Production Number: 0
Of course, replacing "Durée" with "Duree" in both the XML file and the
XSLT stylesheet avoids the bug, but this is not very satisfying. Another
XSLT engine (e.g. Xalan) performs the transformation correctly even with
the "é" character.
This seems to be a bug in XPath expression processing (4xpath doesn't
accept non-ASCII ISO-8859-1 characters such as "é").
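A note on the "Ã" in the error message. One possible explanation, which is only an assumption and is not confirmed anywhere in this report, is that the stylesheet text reaches the XPath parser as UTF-8 bytes and the parser stops at the first byte of the two-byte sequence for "é". The short Python sketch below only shows that this byte, read as ISO-8859-1, is displayed as "Ã"; it says nothing about where in 4xpath the problem actually lies.

```python
# Assumption (not confirmed by the report): the XPath parser receives the
# select expression as UTF-8 bytes rather than as ISO-8859-1.
e_acute = "\u00e9"                            # the character "é"

utf8_bytes = e_acute.encode("utf-8")          # two bytes: 0xC3 0xA9
latin1_bytes = e_acute.encode("iso-8859-1")   # one byte: 0xE9

# The first UTF-8 byte, read on its own as ISO-8859-1, is U+00C3 ("Ã"),
# the character quoted in the Syntax Exception.
first_byte_shown = utf8_bytes[:1].decode("iso-8859-1")
print(first_byte_shown)
```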
O. CAYROL.
PS: the attached files are below.
_________________________________________________________________________
Olivier CAYROL LOGILAB - Paris (France)
http://www.logilab.com/
_________________________________________________________________________
_________________________________________________________________________
data.xml "Initial XML file"
<?xml version="1.0" encoding="ISO-8859-1"?>
<agenda>
<appointment>11h00 Durée: 20mn</appointment>
<appointment>11h30 Durée: 40mn</appointment>
</agenda>
_________________________________________________________________________
transf.xslt "XSLT stylesheet"
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:strip-space elements="*"/>
<xsl:output method="xml"
encoding="ISO-8859-1"
indent="yes" />
<xsl:template match="/">
<xmlagenda>
<xsl:apply-templates select="agenda/appointment"/>
</xmlagenda>
</xsl:template>
<xsl:template match="appointment">
<appointment>
<time>
<xsl:value-of select="substring-before(text(),' Durée: ')"/>
</time>
<duration>
<xsl:value-of select="substring-after(text(),' Durée: ')"/>
</duration>
</appointment>
</xsl:template>
</xsl:stylesheet>
_________________________________________________________________________
agenda.xml "Expected XML output"
<?xml version="1.0" encoding="ISO-8859-1"?>
<xmlagenda>
<appointment>
<time>11h00</time>
<duration>20mn</duration>
</appointment>
<appointment>
<time>11h30</time>
<duration>40mn</duration>
</appointment>
</xmlagenda>
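As a cross-check of the expected output, independent of any XSLT engine, the same split can be sketched in Python with the standard xml.etree.ElementTree module. The function name split_agenda and the inlined DATA bytes are illustrative only; this is not how 4xslt works internally.

```python
import xml.etree.ElementTree as ET

# data.xml from this report, inlined as ISO-8859-1 bytes ("\u00e9" is "é").
DATA = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    "<agenda>\n"
    "<appointment>11h00 Dur\u00e9e: 20mn</appointment>\n"
    "<appointment>11h30 Dur\u00e9e: 40mn</appointment>\n"
    "</agenda>\n"
).encode("iso-8859-1")


def split_agenda(xml_bytes):
    """Build an <xmlagenda> tree with separate <time> and <duration> nodes."""
    root = ET.fromstring(xml_bytes)
    result = ET.Element("xmlagenda")
    for appointment in root.findall("appointment"):
        # Split the text node around the ' Durée: ' marker.
        time_text, _, duration_text = appointment.text.partition(" Dur\u00e9e: ")
        new_appointment = ET.SubElement(result, "appointment")
        ET.SubElement(new_appointment, "time").text = time_text
        ET.SubElement(new_appointment, "duration").text = duration_text
    return result


agenda = split_agenda(DATA)
pairs = [(a.findtext("time"), a.findtext("duration")) for a in agenda]
print(pairs)
```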
_________________________________________________________________________
Stacktrace
$ 4xslt data.xml transf.xslt
Traceback (innermost last):
  File "/usr/bin/4xslt", line 5, in ?
    _4xslt.Run(sys.argv)
  File "/usr/lib/python1.5/site-packages/xml/xslt/_4xslt.py", line 85, in Run
    processor.appendStylesheetUri(sty)
  File "/usr/lib/python1.5/site-packages/xml/xslt/Processor.py", line 86, in appendStylesheetUri
    sty = self._styReader.fromUri(styleSheetUri)
  File "/usr/lib/python1.5/site-packages/Ft/Lib/ReaderBase.py", line 99, in fromUri
    rt = self.fromStream(stream, baseUri, ownerDoc, stripElements)
  File "/usr/lib/python1.5/site-packages/xml/xslt/StylesheetReader.py", line 300, in fromStream
    sheet.setup()
  File "/usr/lib/python1.5/site-packages/xml/xslt/Stylesheet.py", line 144, in setup
    curr_node.setup()
  File "/usr/lib/python1.5/site-packages/xml/xslt/ValueOfElement.py", line 34, in setup
    self.__dict__['_expr'] = parser.parseExpression(self._select)
  File "/usr/lib/python1.5/site-packages/xml/xpath/XPathParser.py", line 36, in parseExpression
    XPathParserBase.XPathParserBase.parse(self, st)
  File "/usr/lib/python1.5/site-packages/xml/xpath/XPathParserBase.py", line 60, in parse
    XPath.cvar.g_prodNum)
xml.xpath.XPathParserBase.SyntaxException:
********** Syntax Exception **********
Exception at or near "Ã"
Line: 0, Production Number: 0
_________________________________________________________________________
For detailed info, follow this link:
http://sourceforge.net/bugs/?func=detailbug&bug_id=125004&group_id=6473