[Expat-discuss] Bug? Illegal parameter reference error for valid document
Karl Waclawek
karl@waclawek.net
Tue Nov 6 17:50:03 2001
> Karl,
>
> I'm sorry if I was vague, but you misunderstood me. It's not the size of the
> buffer per se that matters, it's how much of the file you read in at a time and
> send to the parser. Let me see if I can make this clearer.
I think I understand you, but when I set the buffer size to 1, I automatically
read one byte at a time and send one byte at a time to the parser.
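
To make clear what I mean by "buffer size", here is a minimal sketch of the kind of read loop I have in mind. It is not my actual code: feed_fn is a hypothetical stand-in for a call such as XML_Parse(parser, data, len, isFinal), and feed_in_chunks and tally_feed are names made up for illustration. The chunk size passed in decides both how much is read and how much is sent to the parser per call.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a parser call such as
   XML_Parse(parser, data, len, is_final); returns nonzero on success. */
typedef int (*feed_fn)(void *ctx, const char *data, size_t len, int is_final);

/* Read `in` in pieces of at most `chunk` bytes and hand each piece to `feed`.
   Returns the number of feed calls made, or -1 on a read or feed error. */
long feed_in_chunks(FILE *in, size_t chunk, feed_fn feed, void *ctx)
{
    char buf[16 * 1024];
    long calls = 0;

    assert(in != NULL && feed != NULL);
    if (chunk == 0 || chunk > sizeof buf)
        chunk = sizeof buf;              /* clamp to the local buffer */

    for (;;) {
        size_t len = fread(buf, 1, chunk, in);
        if (len < chunk && ferror(in))
            return -1;                   /* read error */
        int is_final = (len < chunk);    /* short read without error: EOF */
        if (!feed(ctx, buf, len, is_final))
            return -1;                   /* the consumer reported an error */
        calls++;
        if (is_final)
            return calls;
    }
}

/* Example consumer (also hypothetical): counts bytes and final calls. */
struct tally { size_t bytes; int finals; };

int tally_feed(void *ctx, const char *data, size_t len, int is_final)
{
    struct tally *t = ctx;
    (void)data;                          /* contents are ignored here */
    t->bytes += len;
    t->finals += is_final;
    return 1;
}
```

With a chunk size of 1 and a 5-byte file, the consumer is called six times: five single-byte calls followed by one empty final call. With a 16 KB chunk size the same file goes through in a single final call.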
>
> Let's say we're parsing "test.xml":
>
> <?xml version="1.0" standalone="no"?>
> <!DOCTYPE test SYSTEM "test.dtd">
> <thing>My name is &bob.</thing>
>
> , "test.dtd":
> <!ENTITY % TESTent SYSTEM "test.ent">
> %TESTent;
> <!ENTITY bob %myname;>
> <!ELEMENT thing (#PCDATA)>
>
> and "test.ent":
> <!ENTITY % myname '"Bob"'>
>
<snip> some code </snip>
> You would think this should work. When the external parser is told to parse the
> file "test.dtd" it reads the entire file into the buffer and parses it. It then
> sees an external reference to the file "test.ent" and spawns off another
> external parser to parse it. However, when that parser returns and parsing of
> "test.dtd" continues it discards the rest of the XML stored in the buffer and
> reads in a new buffer (which is empty because the end of the file has been
> reached). This causes the entity declaration for 'bob' to go unparsed, and when
> the main parser encounters the XML containing '&bob' it will generate an error.
Actually, your example above works for me. I replaced the dot after &bob with
a semicolon, changed the DOCTYPE name to "thing", and got an error-free parse
with a buffer size of 16 KB.
>
> To get around this problem, instead of reading the file in one buffer-sized
> chunk at a time, read it in one line at a time. That way, when the external
> parser for "test.ent" is done, the next buffer that will be parsed by the
> external parser for "test.dtd" will be the line containing the entity
> declaration for 'bob'.
>
> If we do this the external parsing loop will look something like this (again,
> ignoring errors for clarity):
>
> // Parse until the file is done
> for (;;)
> {
>     // Read up to sizeof(buff) - 1 characters from the file into BUFF,
>     // stopping after a new-line character or at EOF. fgets() returns
>     // NULL at EOF, so check it before calling strlen().
>     if (fgets(buff, sizeof(buff), ext) == NULL)
>     {
>         // Nothing more to read: tell the parser this is the final call.
>         XML_Parse(e, buff, 0, 1);
>         break;
>     }
>     // Store the number of characters read into LENGTH and parse them.
>     length = strlen(buff);
>     XML_Parse(e, buff, length, 0);
> }
>
> One could also parse one character of the file at a time (possibly using
> fgetc(3S)) and be able to catch all of the external references, but this would
> probably be less efficient.
That was clearer, but I don't think I have the same problem as you.
In fact, I think your problem may not exist in my version of Expat (1.95.2).
Regards,
Karl