In need of a virtual filesystem / archive

Steven D'Aprano steve at REMOVEMEcyber.com.au
Tue Feb 21 03:42:32 EST 2006


Enigma Curry wrote:

> I need to store a large number of files in an archive. From Python, I
> need to be able to create an archive, put files into it, modify files
> that are already in it, and delete files already in it.
> 
> The easy solution would be to use a zip file or a tar file. Python has
> good standard modules for accessing those types. However, I would tend
> to think that modifying or deleting files in the archive would require
> rewriting the entire archive.
> 
> Is there any archive format that can allow Python to modify a file in
> the archive *in place*? That is to say if my archive is 2GB large and I
> have a small text file in the archive I want to be able to modify that
> small text file (or delete it) without having to rewrite the entire
> archive to disk.

Yes. I believe your common or garden variety file 
system can handle this task, by storing files in an 
archive called "a directory". For example, many mail 
systems use the "maildir" format, which stores each 
message as a separate file in a directory, so email 
can still be accessed quickly and robustly.
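
To make the "directory as archive" idea concrete, here is a 
minimal sketch in Python. The class name DirectoryArchive and 
its methods are made up for illustration and are not a standard 
module. It assumes a reasonably modern Python 3, and it writes 
each member to a temporary file and renames it into place, 
loosely in the spirit of maildir, so a crash should not leave a 
half-written member behind.

```python
import os
import tempfile


class DirectoryArchive:
    """Hypothetical helper: a directory used as the archive, one file per member."""

    def __init__(self, root):
        self.root = os.path.abspath(root)
        os.makedirs(self.root, exist_ok=True)

    def _path(self, name):
        # Resolve the member name and refuse anything outside the archive root.
        path = os.path.normpath(os.path.join(self.root, name))
        if not path.startswith(os.path.join(self.root, "")):
            raise ValueError("name escapes archive: %r" % (name,))
        return path

    def put(self, name, data):
        """Create or overwrite one member; no other member is touched."""
        path = self._path(name)
        parent = os.path.dirname(path)
        os.makedirs(parent, exist_ok=True)
        # Write to a temporary file in the same directory, then rename it
        # over the target so readers never see a partial file.
        fd, tmp = tempfile.mkstemp(dir=parent)
        try:
            with os.fdopen(fd, "wb") as f:
                f.write(data)
            os.replace(tmp, path)
        except BaseException:
            os.unlink(tmp)
            raise

    def get(self, name):
        """Return the bytes stored under this member name."""
        with open(self._path(name), "rb") as f:
            return f.read()

    def delete(self, name):
        """Remove one member; the file system reclaims its space."""
        os.unlink(self._path(name))

    def names(self):
        """Yield member names relative to the archive root, using '/'."""
        for dirpath, _dirnames, filenames in os.walk(self.root):
            for filename in filenames:
                full = os.path.join(dirpath, filename)
                yield os.path.relpath(full, self.root).replace(os.sep, "/")
```

Modifying or deleting a small member costs about the same 
whether the archive holds ten files or 2GB of them, because 
nothing else is rewritten. What you give up is the "single file" 
property.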

Do you really need to store your files in a single 
meta-file? Do you need compression? How much overhead 
for the archive structure are you prepared to carry? Do 
you expect the archive to shrink when you delete a file 
from the middle?

I suspect you can pick any two of the following three:

1. single file
2. space used for deleted files is reclaimed
3. fast performance

Using a proper database will give you 2 and 3, but at 
the cost of a lot of overhead, and a relational 
database is typically not stored as a single file.
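
As a rough illustration of the trade-off, here is a sketch using 
SQLite through Python's sqlite3 module. This is my choice of 
example, not something the original question mentioned: SQLite 
is one database that does keep everything in a single file. The 
table name "files", its columns, and the function name are 
invented for the example. The sketch stores one large and one 
small blob, modifies the small one in place, deletes the large 
one, and reports the size of the database file. Without VACUUM 
the freed pages stay inside the file; VACUUM reclaims them, but 
does so by rebuilding the database.

```python
import os
import sqlite3
import tempfile


def file_size_after_delete(db_path, vacuum):
    """Hypothetical demo: delete a large blob and return the database file size."""
    con = sqlite3.connect(db_path)
    # Make sure freed pages are not returned automatically, so the effect
    # of an explicit VACUUM is visible regardless of build defaults.
    con.execute("PRAGMA auto_vacuum = NONE")
    con.execute("CREATE TABLE files (name TEXT PRIMARY KEY, data BLOB)")
    con.execute("INSERT INTO files VALUES (?, ?)",
                ("big.bin", b"\0" * 2000000))
    con.execute("INSERT INTO files VALUES (?, ?)", ("small.txt", b"hello"))
    con.commit()

    # Modify the small member and delete the large one.
    con.execute("UPDATE files SET data = ? WHERE name = ?",
                (b"hello, world", "small.txt"))
    con.execute("DELETE FROM files WHERE name = ?", ("big.bin",))
    con.commit()

    if vacuum:
        # Rebuilds the database file, dropping the free pages.
        con.execute("VACUUM")
    con.close()
    return os.path.getsize(db_path)


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    print("no vacuum:", file_size_after_delete(os.path.join(workdir, "a.db"), False))
    print("vacuum:   ", file_size_after_delete(os.path.join(workdir, "b.db"), True))
```

So this buys a single file and fast updates, while reclaiming 
the space of deleted files costs a rewrite, which fits the 
"pick any two" guess above.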



-- 
Steven.



