[XML-SIG] 4XPath: parsing Unicode string
Tamito KAJIYAMA
kajiyama@grad.sccs.chukyo-u.ac.jp
Sun, 26 Nov 2000 06:05:21 +0900
Hi,
I am using 4Suite 0.9.2 together with Python 2.0 and PyXML 0.6.2.
I cannot pass a Unicode string containing Japanese characters to
the 4XPath parser. The following reproduces the problem:
>>> from xml.xpath import XPathParser
>>> p = XPathParser.XPathParser()
>>> path = p.parseExpression(u'substring-after("2000/10/30", "/")')
The expression above parses without any problem, but the next,
very similar one fails:
>>> path = p.parseExpression(u'substring-after("2000\u5E7410\u670830\u65E5", "\u6708")')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/opt/lib/python2.0/site-packages/_xmlplus/xpath/XPathParser.py", line 36, in parseExpression
    XPathParserBase.XPathParserBase.parse(self, st)
  File "/opt/lib/python2.0/site-packages/_xmlplus/xpath/XPathParserBase.py", line 62, in parse
    XPath.cvar.g_prodNum)
xml.xpath.XPathParserBase.SyntaxException:
********** Syntax Exception **********
While parsing substring-after("2000YY10MM30DD", "MM")
Exception at or near "10"
Line: 0, Production Number: 9
(YY, MM and DD stand for the Japanese characters \u5E74, \u6708 and
\u65E5, respectively. The error message shows them in the native
encoding, so I have replaced the actual characters with these
placeholders in the quotation above.)
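For reference, the result expected from both expressions can be
sketched in plain Python. This is only an illustration of the
substring-after() semantics described in the XPath 1.0
recommendation; substring_after is a hypothetical helper written
for this example and is not part of 4XPath or PyXML.

```python
# Illustration only: mimics XPath 1.0 substring-after() on Unicode strings.
# substring_after is a hypothetical helper, not a 4XPath or PyXML function.

def substring_after(s, sep):
    """Return the part of s that follows the first occurrence of sep.

    Returns an empty string when sep does not occur in s.
    """
    i = s.find(sep)
    if i < 0:
        return u''
    return s[i + len(sep):]

# The ASCII case that the parser accepts.
ascii_result = substring_after(u'2000/10/30', u'/')

# The Japanese case that raises the SyntaxException in the parser.
kanji_result = substring_after(u'2000\u5E7410\u670830\u65E5', u'\u6708')
```

Here ascii_result is u'10/30' and kanji_result is u'30\u65E5', so
the second expression is expected to be just as valid as the first.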
Actually, the second XPath expression is used in an XSL
stylesheet, and the same error is raised there.
What's wrong? I wonder if I am missing something trivial.
Thanks,
--
KAJIYAMA, Tamito <kajiyama@grad.sccs.chukyo-u.ac.jp>