[Expat-bugs] [ expat-Bugs-707469 ] external param entities

SourceForge.net noreply at sourceforge.net
Fri Mar 21 08:34:21 EST 2003


Bugs item #707469, was opened at 2003-03-21 07:51
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=707469&group_id=10127

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Pavel Hlavnicka (pavel_hlavnicka)
Assigned to: Nobody/Anonymous (nobody)
Summary: external param entities

Initial Comment:

Hi all,
Perhaps this is not a bug, perhaps it is. If it is not,
please give me some explanation.

1) look at the example
2) It happens with 1.95.5 and 1.95.4 at least

We use Expat in the Sablotron XSLT processor. We support
external entity parsing, and in addition we let users
choose whether public external entities are parsed or
not. When they are not, we just return 1 from our
external entity reference handler (no entity parser is
created).
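
A minimal sketch of the handler policy described above, written
against Python's xml.parsers.expat binding rather than Sablotron's
own C code. The names make_parser, resources, parse_public and
skipped are made up for this illustration and do not come from
Sablotron or Expat.

```python
import xml.parsers.expat as expat


def make_parser(resources, parse_public=False):
    """Return (parser, skipped). `resources` maps system IDs to DTD bytes.

    Hypothetical helper for illustration only.
    """
    parser = expat.ParserCreate()
    # Without this, Expat does not ask for the external DTD subset at all.
    parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)
    skipped = []
    stack = [parser]  # the innermost parser is the one calling the handler

    def external_entity_ref(context, base, system_id, public_id):
        if public_id is not None and not parse_public:
            # Report success but read nothing: no entity parser is created.
            skipped.append((public_id, system_id))
            return 1
        child = stack[-1].ExternalEntityParserCreate(context)
        stack.append(child)
        try:
            child.Parse(resources[system_id], True)
        finally:
            stack.pop()
        return 1

    parser.ExternalEntityRefHandler = external_entity_ref
    return parser, skipped


file_dtd = b'<!ENTITY % foo PUBLIC "xxx cvv" "data-e.dtd">\n%foo;\n'
parser, skipped = make_parser({"file.dtd": file_dtd})
parser.Parse(b'<!DOCTYPE blabla SYSTEM "file.dtd">\n<blabla/>', True)
print(skipped)
```

The external subset (no public ID) is parsed, while the %foo;
reference is recorded in skipped and left unread.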

This results in strange behavior if the following scenario
happens:

- the parsed document contains <!DOCTYPE blabla SYSTEM
"file.dtd">

- file.dtd contains the declaration of external public
parameter entity

- file.dtd references this entity

- file.dtd defines other entities

This is sample file.dtd:
<!ENTITY % foo PUBLIC "xxx cvv" "data-e.dtd">
%foo;

<!ENTITY % baz "baz">
<!ENTITY % bar "somedata">
<!ENTITY %baz; "%bar;">

If %foo; is not referenced, all works fine, and &baz;
means "somedata". If %foo; is referenced, an error
occurs while parsing <!ENTITY %baz; "%bar;">; it seems
that the expansion of both %baz; and %bar; is empty.
If I replace <!ENTITY %baz; "%bar;"> with <!ENTITY baz
"%bar;">, parsing goes fine, but &baz; points to an empty
string.
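
The second variant (with <!ENTITY baz "%bar;">) can be tried out
with a sketch like the following, again using Python's
xml.parsers.expat binding. The parse function, its read_foo flag
and the stand-in content of data-e.dtd are assumptions made for
this illustration; the report does not show data-e.dtd.

```python
import xml.parsers.expat as expat

DOCUMENT = b'<!DOCTYPE blabla SYSTEM "file.dtd">\n<blabla>&baz;</blabla>'
DTDS = {
    "file.dtd": b'<!ENTITY % foo PUBLIC "xxx cvv" "data-e.dtd">\n'
                b'%foo;\n'
                b'<!ENTITY % bar "somedata">\n'
                b'<!ENTITY baz "%bar;">\n',
    # Stand-in content; the real data-e.dtd is not part of the report.
    "data-e.dtd": b'<!-- stand-in content -->',
}


def parse(read_foo):
    """Parse DOCUMENT; return (character data, skipped entity names)."""
    parser = expat.ParserCreate()
    parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)
    text, skipped = [], []
    parser.CharacterDataHandler = text.append
    parser.SkippedEntityHandler = lambda name, is_param: skipped.append(name)
    stack = [parser]  # the innermost parser is the one calling the handler

    def external_entity_ref(context, base, system_id, public_id):
        if system_id == "data-e.dtd" and not read_foo:
            return 1  # claim success without reading the entity
        child = stack[-1].ExternalEntityParserCreate(context)
        stack.append(child)
        try:
            child.Parse(DTDS[system_id], True)
        finally:
            stack.pop()
        return 1

    parser.ExternalEntityRefHandler = external_entity_ref
    parser.Parse(DOCUMENT, True)
    return "".join(text), skipped


print(parse(read_foo=True))
print(parse(read_foo=False))
```

In line with the report, the first call should yield "somedata",
and the second an empty string with baz reported through the
skipped entity handler. Take that as an expectation, not a
guarantee for every Expat version.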

I played with a debugger a bit, and what I found was
that if no entity parser is created in the entity ref.
handler, dtd.paramEntityRead is not set to true, and
dtd.keepProcessing is consequently set to false. I'm
not sure what the purpose of dtd.keepProcessing is,
but it seems that it stays in effect a bit longer than
needed. Or do you really mean that the rest of file.dtd
should be skipped if the %foo; reference was not
resolved? If so, why do I still see the compilation error?

Thank you very much and thanks for expat.


----------------------------------------------------------------------

>Comment By: Karl Waclawek (kwaclaw)
Date: 2003-03-21 11:34

Message:
Logged In: YES 
user_id=290026

The last line is processed because Expat checks it
for well-formedness. What "processed" means is 
defined in section 5.1:

<quote>
Definition: While they are not required to check the 
document for validity, they are required to process all 
the declarations they read in the internal DTD subset 
and in any parameter entity that they read, up to the 
first reference to a parameter entity that they do not 
read; that is to say, they must use the information in 
those declarations to normalize attribute values, include 
the replacement text of internal entities, and supply 
default attribute values.
</quote>

Now, check the excerpt from 5.1 that I quoted in my 
first reply.

IMO, this implies that the declaration is read and 
checked for well-formedness, but its information
is *not* used for normalizing attribute values, including 
the replacement text of internal entities, and supplying 
default attribute values.

Maybe you could also ask this question on the xml-dev
mailing list.

----------------------------------------------------------------------

Comment By: Pavel Hlavnicka (pavel_hlavnicka)
Date: 2003-03-21 11:23

Message:
Logged In: YES 
user_id=302801

Hard to tell. Another question is why the last line (the
entity declaration) is processed at all. Shouldn't it be
skipped entirely?

I can understand that it could be a hard job. As I looked
into the Expat code, it seems that the parser just
suppresses any output, but everything else works as usual.
Perhaps if the DTD is not processed (dtd.keepProcessing ==
FALSE) you could expand any entity reference to some fake
value. It looks like a dirty solution, indeed.

----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2003-03-21 11:09

Message:
Logged In: YES 
user_id=290026

As far as I can tell, the last line in data.dtd has
a reference to an undeclared entity (since the
corresponding declaration was ignored). That alone
is legal for non-validating processors - see
http://www.w3.org/TR/REC-xml#wf-entdeclared,
but the entity declaration itself would then become
mal-formed.

I think that section 5.1 does not mean that
non-processed entities are allowed to be mal-formed. 

----------------------------------------------------------------------

Comment By: Pavel Hlavnicka (pavel_hlavnicka)
Date: 2003-03-21 09:36

Message:
Logged In: YES 
user_id=302801

OK, that makes sense; I actually got lost reading these
parts of the XML spec.

But anyway... why is the error reported? If I understand
correctly, all following entity declarations should be
ignored, yet the error is still reported (it is the same
for ATTLIST).

----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2003-03-21 09:25

Message:
Logged In: YES 
user_id=290026

This has to do with section 5.1 of the spec
(http://www.w3.org/TR/REC-xml#proc-types):

<excerpt>
Except when standalone="yes", they must not process 
entity declarations or attribute-list declarations 
encountered after a reference to a parameter entity that 
is not read, since the entity may have contained 
overriding declarations.
</excerpt>

So, once %foo is not resolved/read, Expat must
not process any more entity and attribute declarations
in file.dtd. Try it again with standalone="yes".
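
As a rough illustration of that rule, here is a toy model in
Python. It is not Expat's code: processed_declarations and the
tuple layout are invented for this sketch, and the keep_processing
variable only loosely mirrors the dtd.keepProcessing flag mentioned
in the report.

```python
def processed_declarations(dtd_items, standalone=False):
    """Return the names of the declarations a non-validating processor
    would process under the section 5.1 rule quoted above.

    Each item is (kind, name, was_read): kind is "entity", "attlist"
    or "pe-ref"; was_read only matters for "pe-ref" items.
    """
    keep_processing = True
    processed = []
    for kind, name, was_read in dtd_items:
        if kind == "pe-ref":
            if not was_read:
                # After an unread parameter entity, later declarations
                # are only processed when standalone="yes".
                keep_processing = standalone
        elif keep_processing:
            processed.append(name)
    return processed


# The sample file.dtd from the report, with %foo; left unread.
FILE_DTD = [
    ("entity", "%foo", None),
    ("pe-ref", "%foo", False),
    ("entity", "%baz", None),
    ("entity", "%bar", None),
]

print(processed_declarations(FILE_DTD))
print(processed_declarations(FILE_DTD, standalone=True))
```

In this model only the declaration of %foo is processed by
default, while all three are processed with standalone="yes".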


----------------------------------------------------------------------
