[Expat-discuss] CRLF conversion question

Armin Bauer armin.bauer at desscon.com
Thu Sep 9 14:31:54 CEST 2004


So what fix do you think would be best?

I have a CDATA section like this:

<![CDATA[BEGIN:VCARD
VERSION:2.1
X-EVOLUTION-FILE-AS:Mike, Smith
FN:Smith Mike
N:Mike;Smith
TEL;PREF;WORK:+1 469 43220403
EMAIL;INTERNET:mike.smith at yahoo.com
TITLE:Business Developer
UID:pas-id-413DC011000000B2
END:VCARD]]>

The lines are separated by 0x0d 0x0a.
The tree builder makes two text nodes out of every line:
the first one holds the text of the line, like "VERSION:2.1";
the second one holds a 0x0a.
The 0x0d is lost.
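
The two nodes per line would be consistent with the parser handing a single run of text to its character data callback in several pieces, with the line end arriving as its own piece. As a minimal sketch of how a tree builder could join such pieces into one string, assuming a callback with the same parameter shape as an Expat character data handler (TextBuf and append_chars are hypothetical names, not part of Expat or wbxml):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; not part of Expat or wbxml. */
typedef struct {
    char  *data;
    size_t len;
    size_t cap;
} TextBuf;

/* Same parameter shape as an Expat character data handler:
 * user data, pointer to text that is NOT NUL-terminated, and its length. */
static void append_chars(void *userData, const char *s, int len)
{
    TextBuf *buf = (TextBuf *)userData;
    size_t need = buf->len + (size_t)len + 1;   /* +1 for the terminating NUL */

    if (need > buf->cap) {
        size_t cap = buf->cap ? buf->cap : 64;
        while (cap < need)
            cap *= 2;
        char *p = realloc(buf->data, cap);
        assert(p != NULL);                      /* sketch only: no real OOM handling */
        buf->data = p;
        buf->cap = cap;
    }
    memcpy(buf->data + buf->len, s, (size_t)len);
    buf->len += (size_t)len;
    buf->data[buf->len] = '\0';
}
```

Feeding it "VERSION:2.1", then "\n", then the next line yields one contiguous string. The 0x0d is still gone at this point, because it was removed before the callback ever saw the text.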

The tree is built by XML_Parse. The parser was created by
XML_ParserCreate(NULL).

The wbxml library later just concatenates these text nodes on the
assumption that they were not altered (which is wrong, since the 0x0d is
lost).

So could you give me a hint on how to fix this?
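
One possible workaround, sketched here under the assumption that the original payload used 0x0d 0x0a for every line end (as the vCard above does), would be to put the 0x0d back after the text nodes have been concatenated. The helper name lf_to_crlf is hypothetical and not part of Expat or wbxml:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated copy of `in` in which
 * every bare LF (0x0a) becomes CR LF (0x0d 0x0a). An LF that is already
 * preceded by CR is left alone. The caller frees the result.
 * Returns NULL if the allocation fails. */
static char *lf_to_crlf(const char *in)
{
    size_t n = strlen(in);
    /* Worst case: every byte is a bare LF and doubles in size. */
    char *out = malloc(2 * n + 1);
    size_t j = 0;

    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        if (in[i] == '\n' && (i == 0 || in[i - 1] != '\r'))
            out[j++] = '\r';
        out[j++] = in[i];
    }
    out[j] = '\0';
    return out;
}
```

This only restores the bytes correctly if the input never mixed bare 0x0a line ends with 0x0d 0x0a ones, since that distinction is not visible after parsing.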

Armin


On Wed, 2004-09-08 at 00:35, Fred L. Drake, Jr. wrote:
> On Tuesday 07 September 2004 06:18 pm, Armin Bauer wrote:
>  > Sorry if it's me being stupid, but why do CDATA sections contain nodes at
>  > all? As far as my understanding goes, the parser is not supposed to touch the
>  > CDATA section at all.
> 
> The node structure is defined by whatever API is providing you with nodes 
> (Expat isn't).  If you're using a DOM with wbxml (or if wbxml is using a DOM 
> internally), that's why nodes are used.
> 
> Line-end normalization is required at all times, even inside CDATA marked 
> sections.  CDATA marked sections are not intended as an escape hatch for 
> binary data.
> 
>  > Wouldn't it be correct if it created exactly one text node containing all
>  > the text as is? At the moment it creates a lot of nodes for every line
>  > etc.
> 
> What's correct depends on the API.  If these are DOM nodes, then yes, that 
> would be correct, but not required for correctness.  The series of nodes 
> would also be correct.  That's a separate issue from line-end normalization.
> 
>  > wbxml is the library that converts SyncML requests to WAP Binary XML (a
>  > form of conversion before it is sent over GPRS) :)
> 
> Cool, I guess.  Is it this one?
> 
>     http://libwbxml.aymerick.com/
> 
> The tree described in the documentation looks *very* DOMish to me, though 
> perhaps a little lighter than a full W3C DOM (hard to be heavier!).
> 
> 
>   -Fred
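
For reference, the line-end normalization described in the quoted reply can be sketched as follows. This is an illustration of the rule as the XML 1.0 specification states it (0x0d 0x0a and any lone 0x0d become a single 0x0a), not Expat's actual code; the function name normalize_line_ends is hypothetical:

```c
#include <stddef.h>
#include <string.h>

/* Illustrative only: rewrites `s` in place so that CR LF and any lone CR
 * become a single LF, and returns the new length of the string. */
static size_t normalize_line_ends(char *s)
{
    size_t r = 0, w = 0;

    while (s[r] != '\0') {
        if (s[r] == '\r') {
            s[w++] = '\n';
            r++;
            if (s[r] == '\n')   /* swallow the LF of a CR LF pair */
                r++;
        } else {
            s[w++] = s[r++];
        }
    }
    s[w] = '\0';
    return w;
}
```

Applied to the vCard lines above, each 0x0d 0x0a collapses to one 0x0a, which matches what the tree builder ends up with.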


