[Expat-bugs] [ expat-Bugs-2723522 ] expat memory consumption issue - advise needed

SourceForge.net noreply at sourceforge.net
Tue Mar 31 17:50:47 CEST 2009


Bugs item #2723522, was opened at 2009-03-31 18:50
Message generated for change (Tracker Item Submitted) made by alexmanovbg
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=2723522&group_id=10127

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: Not a Bug
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Alex Manov (alexmanovbg)
Assigned to: Nobody/Anonymous (nobody)
Summary: expat memory consumption issue - advise needed

Initial Comment:

We have an application that uses expat to convert an XML data file into a binary version of the file. The file is currently about 600M but will grow.
We have hit a blocking problem: while parsing the file, the application uses a huge amount of memory. It needs 4G of RAM to successfully finish a 600MB file.
Our engineers explained that this is due to block memory management in expat when it builds the XML tree. They explained that our XML has a lot of tags, each of which requires a separate 4K memory page even for 3 bytes of actual data.
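
To put the reported figures side by side, here is a small back-of-the-envelope calculation. It uses only the four numbers quoted above; treating "M", "G" and "K" as 10^6, 10^9 and 4096 bytes is an assumption, and the variable names are illustrative.

```python
# Figures quoted in the report; the exact units are an assumption.
file_bytes = 600 * 10**6   # "about 600M" input file
peak_bytes = 4 * 10**9     # "4G of RAM" needed to finish

# How much larger the peak memory is than the file itself.
observed_ratio = peak_bytes / file_bytes

page_bytes = 4096          # "4K memory pages"
payload_bytes = 3          # "3 bytes of actual data"

# Overhead if one whole page really were spent on one 3-byte value.
per_value_ratio = page_bytes / payload_bytes

print(f"observed: {observed_ratio:.1f}x, page-per-value: {per_value_ratio:.0f}x")
```

The observed growth (roughly 7x) is far smaller than the page-per-value figure (over 1000x), which suggests that, if the page-per-value case occurs at all, it applies to only a fraction of the values. That is an inference from these four numbers alone, not a measurement.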

Is there any way to improve this? Could anyone suggest how we can optimize this process? Are there any settings we can use to make it work?

Here is the file structure (I am not uploading the file since it is 600M, but I can provide it).
<?xml version="1.0" encoding="utf-8" ?>
<Groups>
  <Group>
    <ID>9</ID>
    <Status>Active</Status>
    <EffectiveDate>196912311900</EffectiveDate>
    <ExpireDate>203012301700</ExpireDate>
    <Elements>
      <Element>
        <ID>2345737</ID>
        <StartDate>20000101</StartDate>
        <EndDate>20351231</EndDate>
        <StartTime>00:00</StartTime>
        <EndTime>00:00</EndTime>
        <DayOfWeek>0,1,2,3,4,5,6</DayOfWeek>
        <DayOfMonth></DayOfMonth>
        <Month></Month>
        <Data>1619</Data>
        <Value_1>0.0000</Value_1>
        <Value_2 type="RELATIVE">0.0000</Value_2>
        <Subelements>
          <Subelement>
            <ID>1</ID>
            <Value_3>0</Value_3>
            <Value_4>1</Value_4>
            <Value_5>0.0000</Value_5>
            <Value_6 type="FIXED">0.0000</Value_6>
          </Subelement>
          <Subelement>
            <ID>Default</ID>
            <Value_3>0</Value_3>
            <Value_4>1</Value_4>
            <Value_5>0.0000</Value_5>
            <Value_6 type="FIXED">0.0000</Value_6>
          </Subelement>
        </Subelements>
      </Element>
    </Elements>
  </Group>
</Groups>

There can be many Groups - in practice about 100
Each Group can have many elements - in practice about 100,000
Each Element can have many subelements - in practice about 4
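
For reference, expat is a stream-oriented parser that reports events through callbacks, so one direction worth exploring is to handle each Element as soon as its closing tag arrives instead of keeping the whole document in memory. The following is a minimal sketch of that idea using Python's standard `xml.parsers.expat` bindings. It assumes the binary converter can consume one Element record at a time; `stream_elements` and `on_element` are hypothetical names, and Subelements are skipped for brevity.

```python
import io
import xml.parsers.expat


def stream_elements(xml_file, on_element, chunk_size=64 * 1024):
    """Parse a binary file object in chunks; call on_element(dict) per <Element>."""
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True   # merge adjacent text callbacks where possible
    path = []                   # names of the currently open tags
    current = None              # fields of the <Element> being read, else None
    text = []                   # character data seen since the last tag

    def start(name, attrs):
        nonlocal current
        path.append(name)
        text.clear()
        if name == "Element":
            current = {}

    def end(name):
        nonlocal current
        path.pop()
        if current is not None:
            if name == "Element":
                on_element(current)   # hand the record off, then forget it
                current = None
            elif path[-1] == "Element" and name != "Subelements":
                # Direct child of <Element>, e.g. <ID> or <StartDate>.
                current[name] = "".join(text)
        text.clear()

    def chars(data):
        text.append(data)

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars

    while True:
        chunk = xml_file.read(chunk_size)
        if not chunk:
            break
        parser.Parse(chunk, False)
    parser.Parse(b"", True)     # tell the parser the input is complete


if __name__ == "__main__":
    demo = (b"<Groups><Group><Elements>"
            b"<Element><ID>2345737</ID><Data>1619</Data></Element>"
            b"</Elements></Group></Groups>")
    stream_elements(io.BytesIO(demo), print)
```

With this shape, memory use depends on the size of one Element plus the read buffer rather than on the size of the file. In the C API the corresponding calls are `XML_SetElementHandler`, `XML_SetCharacterDataHandler` and `XML_Parse`; whether this fits the existing converter depends on how its binary format is written, which the report does not describe.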


----------------------------------------------------------------------
