dynamic allocation file buffer
Fredrik Lundh
fredrik at pythonware.com
Thu Sep 11 04:34:07 EDT 2008
Steven D'Aprano wrote:
> I'm no longer *claiming* anything, I'm *asking* whether random access to
> a 4GB XML file is something that is credible or useful. It is my
> understanding that XML is particularly ill-suited to random access once
> the amount of data is too large to fit in RAM.
An XML file doesn't contain any indexing information, so random access
to a large XML file is very inefficient. You can build (or precompute)
index information and store it in a separate file, of course, but that's
hardly something that's useful in the general case.
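
To make the "separate index" idea concrete, here is a minimal sketch. It is
not a general solution. It assumes a record-style file in which every direct
child of the root element is one record, nothing but whitespace or comments
sits between records, and the records do not depend on namespace or entity
declarations made elsewhere in the file. The names build_index and
load_record are made up for this example.

```python
import io
import xml.etree.ElementTree as ET
from xml.parsers import expat


def build_index(stream):
    """Scan a binary XML stream once and return a list of (start, end)
    byte offsets, one pair per direct child of the root element."""
    parser = expat.ParserCreate()
    starts = []                       # offset of each record's start tag
    state = {"depth": 0, "end": None}

    def start_element(name, attrs):
        state["depth"] += 1
        if state["depth"] == 2:       # a direct child of the root
            starts.append(parser.CurrentByteIndex)

    def end_element(name):
        state["depth"] -= 1
        if state["depth"] == 0:       # root end tag closes the last span
            state["end"] = parser.CurrentByteIndex

    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.ParseFile(stream)

    bounds = starts + [state["end"]]
    return list(zip(bounds, bounds[1:]))


def load_record(stream, span):
    """Seek to one indexed span and parse only that record."""
    begin, end = span
    stream.seek(begin)
    return ET.fromstring(stream.read(end - begin))


# In-memory stand-in for a large file opened in binary mode.
data = (b"<log>\n"
        b"  <rec id='1'>alpha</rec>\n"
        b"  <rec id='2'>beta</rec>\n"
        b"  <rec id='3'/>\n"
        b"</log>\n")
stream = io.BytesIO(data)
index = build_index(stream)           # this list could be saved to disk
second = load_record(stream, index[1])
```

Building the index still costs one full sequential pass, and the index is
only valid as long as the XML file is not modified afterwards.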
And as I said before, the only use case for *huge* XML files I've ever
seen in practice is storing large streams of record-style data;
data that's intended to be consumed by sequential processes (and you can
do a lot with sequential processing these days; for those interested in
this, digging up a few review papers on "data stream processing" might
be a good way to waste some time).
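
As a rough illustration of sequential processing, the sketch below walks a
record-style file with ElementTree's iterparse and discards each record once
it has been consumed, so memory use stays roughly constant. It assumes the
records are direct children of the root element; the function name
iter_records and the sample data are invented for this example.

```python
import io
import xml.etree.ElementTree as ET


def iter_records(source, tag):
    """Yield each <tag> element in document order, then drop it."""
    context = ET.iterparse(source, events=("start", "end"))
    event, root = next(context)       # first event is the root's start tag
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            yield elem
            root.clear()              # forget records already consumed


# In-memory stand-in for a large file or a network stream.
sample = b"<log><rec n='1'/><rec n='2'/><rec n='3'/></log>"
total = sum(int(rec.get("n")) for rec in iter_records(io.BytesIO(sample), "rec"))
```

Each record is only valid until the consumer asks for the next one, which is
the usual trade-off with this kind of stream processing.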
Document-style XML usually fits into memory on modern machines;
structures larger than that are usually split into different parts (e.g.
using XInclude) and stored in a container file.
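
For readers who haven't seen XInclude, here is a small sketch using the
standard library's xml.etree.ElementInclude module. The part names and their
contents are invented, and the custom loader reads from a dictionary instead
of a real container file so that the example is self-contained.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical parts; in practice these would be separate files.
PARTS = {
    "chapter1.xml": "<chapter>One</chapter>",
    "chapter2.xml": "<chapter>Two</chapter>",
}


def loader(href, parse, encoding=None):
    """Resolve an xi:include reference against the PARTS dictionary."""
    if parse == "xml":
        return ET.fromstring(PARTS[href])
    return PARTS[href]                # parse="text": return the raw string


master = ET.fromstring(
    '<book xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include href="chapter1.xml"/>'
    '<xi:include href="chapter2.xml"/>'
    '</book>'
)

# Replace each xi:include element with the parsed part, in place.
ElementInclude.include(master, loader=loader)
```

The master document stays small, and each part can be loaded or edited on
its own.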
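Random *modifications* to an arbitrary XML file cannot be done in place,
as long as you store the file in a standard file system: you cannot
insert or remove bytes in the middle of a file without rewriting
everything after them. And if you invent your own format, it's no longer
an XML file.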
</F>