[Expat-discuss] binary in XML?

Josh Martin Josh.Martin@abq.sc.philips.com
Tue, 17 Sep 2002 22:14:50 -0600 (MDT)


> Hi all,
> 	I am aware that it is not advisable to include binary in a XML document. 
>  Yet I would like to know if and how I can use expat to parse a XML 
> document with binary in it.
> 	For example,
> 	<file>
> 		<data length=100>...</data>
> 	</file>
> 
> 	It is a file element with data of 100 bytes.  "..." is 100 bytes of binary 
> data.  Is it possible to ask expat to simply copy 100 bytes of data to a 
> buffer after it sees <data length=100>, then skips that 100 bytes and start 
> parsing the doc as text from the end tag again?
> 	Thanks,
> 
> Desmond

While expat itself has no functionality to do what you are describing, it 
would be easy to have your application do it. In your start-element handler 
(the function you register with XML_SetStartElementHandler()), when you handle 
the <data> tag and its "length" attribute, you can have your application read 
the 100 bytes of data into a buffer of its own and then process them as 
desired. Your application then resumes sending the document to expat, and 
because it starts reading again after the binary data, expat will never know 
the difference.

However, there are some things to watch out for when using this method. The 
handler only runs once expat has been given the complete start tag, and by 
that point none of the binary data may have been passed to expat in the same 
buffer. To guarantee that, you will either have to send the document to expat 
one byte at a time, or you will have to put the <data> tags and the binary 
data on separate lines and read the document in one line at a time. Note also 
that the attribute value has to be quoted for the document to be well-formed 
(as follows).
<file>
	<data length="100">
...
	</data>
</file>

These caveats are why it's generally advisable not to use inline binary 
information in an XML document, but they do not preclude the possibility.
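
For what it's worth, here is a rough sketch of the line-at-a-time loop in C. 
It is not tested against expat itself: feed_text() is a hypothetical stand-in 
for calling XML_Parse(parser, line, (int)len, 0) and having your start-element 
handler record the "length" attribute, and the names feed_state and 
feed_document() are likewise made up for illustration. The sketch does the 
fixed-length read right after the feed call returns rather than inside the 
handler; either should work as long as expat has not been handed the bytes. 
It assumes the file is opened in binary mode and that no text line is longer 
than the line buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* State shared between the feeding loop and the (stand-in) parser. */
struct feed_state {
    size_t pending;       /* bytes of raw payload to read next; 0 = none */
    unsigned char *blob;  /* most recent payload (malloc'd), or NULL */
    size_t blob_len;      /* number of bytes in blob */
    char text[1024];      /* demo only: everything the "parser" was given */
};

/*
 * Hypothetical stand-in for XML_Parse() plus a start-element handler.
 * A real handler would look up the "length" attribute in the atts array;
 * this stub just searches the line so the sketch runs without expat.
 */
static int feed_text(struct feed_state *st, const char *line, size_t len)
{
    const char *tag = strstr(line, "<data length=\"");

    if (strlen(st->text) + len >= sizeof st->text)
        return 0;                          /* demo buffer is full */
    strcat(st->text, line);
    if (tag != NULL)
        st->pending = (size_t)strtoul(tag + 14, NULL, 10);
    return 1;
}

/* Feed the document one line at a time, diverting binary payloads. */
static int feed_document(FILE *in, struct feed_state *st)
{
    char line[512];

    assert(in != NULL && st != NULL);
    while (fgets(line, sizeof line, in) != NULL) {
        if (!feed_text(st, line, strlen(line)))
            return 0;
        if (st->pending > 0) {
            /* The start tag was just handled: read the payload ourselves
               so that the parser never sees it. */
            free(st->blob);
            st->blob = malloc(st->pending);
            if (st->blob == NULL ||
                fread(st->blob, 1, st->pending, in) != st->pending)
                return 0;
            st->blob_len = st->pending;
            st->pending = 0;
            /* Skip the newline separating the payload from </data>. */
            int c = fgetc(in);
            if (c != '\n' && c != EOF)
                ungetc(c, in);
        }
    }
    return 1;
}
```

Because the payload is read by length and not by line, it can contain 
newlines, '<' or NUL bytes without confusing the loop. With a real parser you 
would finish with a final XML_Parse() call that has isFinal set.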

Good luck.

 - Josh Martin