[Expat-discuss] CRLF conversion question

Armin Bauer armin.bauer at desscon.com
Thu Sep 9 15:57:12 CEST 2004


On Thu, 2004-09-09 at 15:18, Karl Waclawek wrote:
> ----- Original Message ----- 
> From: "Armin Bauer" <armin.bauer at desscon.com>
> To: "Fred L. Drake, Jr." <fdrake at acm.org>
> Cc: <expat-discuss at libexpat.org>
> Sent: Thursday, September 09, 2004 8:31 AM
> Subject: Re: [Expat-discuss] CRLF conversion question
> 
> 
> > So what fix do you think would be best?
> > 
> > I have a cdata section like this
> > 
> > <![CDATA[BEGIN:VCARD
> > VERSION:2.1
> > X-EVOLUTION-FILE-AS:Mike, Smith
> > FN:Smith Mike
> > N:Mike;Smith
> > TEL;PREF;WORK:+1 469 43220403
> > EMAIL;INTERNET:mike.smith at yahoo.com
> > TITLE:Business Developer
> > UID:pas-id-413DC011000000B2
> > END:VCARD]]>
> > 
> > The lines are separated by 0x0d 0x0a.
> > The tree builder makes two text nodes out of every line:
> > the first one holds the text of the line, like "VERSION:2.1";
> > the second one holds a 0x0a.
> > The 0x0d is lost.
> > 
> > the tree is built by XML_Parse. The parser was created by
> > XML_ParserCreate(NULL)
> 
> To be precise, XML_Parse does not build a tree; that
> is rather the job of the callbacks.
> 
> > The wbxml library later just concatenates these text nodes on the
> > assumption that they were not altered (which is wrong, since the 0x0d
> > is lost).
> > 
> > So could you give me a hint how to fix this?
> 
> Line breaks are reported by Expat as 0x0A, no matter what
> the input. So, depending on your platform, convert them back
> to the appropriate line break characters.

Yes, that would be one option. The only problem is how the XML document
is parsed: the vCard in the CDATA section above ends up as nodes like
this:

<cdata>
	<text>BEGIN:VCARD
	<text>0x0a
	<text>VERSION:2.1
	<text>0x0a
	...
</cdata>

So what would be the correct approach? Test whether a text node is
exactly 0x0a and, if so, replace it with 0x0d 0x0a?
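
One possible way to do this without depending on how the text is split
into nodes is to convert while appending, so every 0x0a becomes
0x0d 0x0a no matter which chunk it arrives in. The following is only a
sketch under assumptions: struct text_buf and append_with_crlf are
hypothetical names, not part of Expat or the wbxml library, and the text
is assumed to be UTF-8 (XML_Char == char).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer; not part of Expat or wbxml. */
struct text_buf {
    char  *data;
    size_t len;
    size_t cap;
};

/* Append len bytes of s to buf, writing every bare 0x0a as 0x0d 0x0a.
 * A 0x0a that already follows a 0x0d in the buffer is left alone.
 * Returns 0 on success, -1 if memory could not be allocated. */
int append_with_crlf(struct text_buf *buf, const char *s, int len)
{
    /* Worst case: every input byte is a bare 0x0a and doubles in size;
     * one extra byte is reserved for the terminating NUL. */
    size_t need = buf->len + 2 * (size_t)len + 1;
    if (need > buf->cap) {
        char *p = realloc(buf->data, need);
        if (p == NULL)
            return -1;
        buf->data = p;
        buf->cap = need;
    }
    for (int i = 0; i < len; i++) {
        int prev_is_cr = buf->len > 0 && buf->data[buf->len - 1] == '\r';
        if (s[i] == '\n' && !prev_is_cr)
            buf->data[buf->len++] = '\r';
        buf->data[buf->len++] = s[i];
    }
    buf->data[buf->len] = '\0';  /* keep the buffer usable as a C string */
    return 0;
}
```

A character data handler registered with XML_SetCharacterDataHandler
could call append_with_crlf(buf, s, len) on each chunk it receives, or
the same function could be applied when the text nodes are concatenated
later. Whether CRLF is the right target everywhere is an assumption; it
fits the vCard case above.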

> 
> Another option would be to Base64 encode your CDATA section.
> You may not even need a CDATA section then, and everything
> is preserved as if it were binary.

This won't work, unfortunately, since the wbxml library receives a SyncML
request (an XML document with a vCard in a CDATA section). To be able to
convert the vCard to Base64, the library would have to detect the CDATA
section first...
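
For what it is worth, detecting the CDATA section may be possible on the
Expat side, since Expat can report CDATA boundaries through
XML_SetCdataSectionHandler. The sketch below only shows the bookkeeping;
struct parse_state and the on_* functions are hypothetical names, and the
handlers are written as plain C functions so the logic can be read on its
own. With Expat they would additionally be declared XMLCALL and take
const XML_Char * instead of const char *.

```c
#include <stddef.h>

/* Hypothetical per-parse state, passed to Expat as the user data pointer. */
struct parse_state {
    int    in_cdata;     /* non-zero between <![CDATA[ and ]]> */
    size_t cdata_bytes;  /* bytes of text seen inside CDATA sections */
    size_t other_bytes;  /* bytes of text seen everywhere else */
};

/* Shaped like an XML_StartCdataSectionHandler. */
void on_cdata_start(void *user_data)
{
    ((struct parse_state *)user_data)->in_cdata = 1;
}

/* Shaped like an XML_EndCdataSectionHandler. */
void on_cdata_end(void *user_data)
{
    ((struct parse_state *)user_data)->in_cdata = 0;
}

/* Shaped like an XML_CharacterDataHandler (XML_Char assumed to be char). */
void on_text(void *user_data, const char *s, int len)
{
    struct parse_state *st = user_data;
    (void)s;  /* the text itself is not inspected in this sketch */
    if (st->in_cdata)
        st->cdata_bytes += (size_t)len;  /* CDATA text: fix up line breaks here */
    else
        st->other_bytes += (size_t)len;  /* ordinary text: leave untouched */
}
```

The wiring would be XML_SetUserData(parser, &st),
XML_SetCdataSectionHandler(parser, on_cdata_start, on_cdata_end) and
XML_SetCharacterDataHandler(parser, on_text). Whether the wbxml tree
builder can be extended this way is an assumption.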

> 
> Karl


