xml + mmap cross

Stefan Behnel stefan_ml at behnel.de
Fri Sep 5 01:13:37 EDT 2008


Hi,

this discussion seems pretty much off-topic for a Python list.

castironpi wrote:
> In an XML file, entries are stored in serial, sort of like this.
> 
> AAA BBB CCC DDD
> 
> Or more recognizably,
> 
> <A><B><C>something</C><D>something</D></B></A>
> 
> Point is, to change <C>something</C> to <C>something else</C>, you
> have to recopy everything after that.
> 
> AAA BBB CCC DDD
> AAA BBBb CCC DDD
> 
> requires 7 writes, 'b CCC DDD', not 1.
> 
> I want to use a simple tree structure to store:
> 
> 0 A-> None, 1
> 1 B-> None, 2
> 2 C-> 3, None
> 3 D-> None, None
> 
> Each node maps to 'Next, Child', or more accurately, 'Next Sibling,
> First Child'.

Do I understand you right: you want to access serialised XML data as a memory
mapped file and operate directly on the byte data? You would still have to
decode the complete byte sequence to parse it and build your index structure
in that case.

How do you plan to store the pointers to a node's next sibling/child? And how
do you keep them updated over insertions/deletions? That, plus the copy
overhead in a sequential file, will be very costly on each change.


> You get constant time updates to contents, and log-time searches.

Every XML tree structure gives you log-time searches. But how do you achieve
constant time updates in a sequential file?

Stefan



More information about the Python-list mailing list