parse tar file with python

Daniel Fackrell dfackrell at DELETETHIS.linuxmail.org
Thu Jun 13 09:59:00 EDT 2002


Just tossing out some ideas here that may or may not pay off (not in any
particular order).

1. Try splitting the files between different directories like another post
mentioned.
2. Try using the tar module on one large tar file.
3. Try using the tar module on several medium-sized tar files.
4. (Was going to suggest open()ing a bunch of files at once in case there is
some delay there and separating the open() and read() calls might show some
speed improvement, but I think I won't.)
5. Consider threads?  They can help in some cases where I/O is the
speed-limiting factor.
6. Post a sample of the code you have right now so that we can see what
you're doing, perhaps including data extracted with the profile module.
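
As a rough illustration of ideas 2 and 3, here is a minimal sketch that walks
one tar file and reads each member line by line, without unpacking anything to
disk or loading a whole member into memory. It assumes the standard-library
tarfile module found in later Python versions (it postdates this post). The
function name iter_member_lines and the UTF-8 decoding are illustrative
assumptions, not taken from the original poster's code.

```python
import tarfile


def iter_member_lines(tar_path, encoding="utf-8"):
    """Yield (member_name, line) pairs for every regular file in the archive."""
    # tarfile.open() reads headers lazily; member data stays on disk until read.
    with tarfile.open(tar_path, mode="r") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, devices
            raw = archive.extractfile(member)  # binary file-like object
            for raw_line in raw:  # iterates one line at a time
                line = raw_line.decode(encoding, errors="replace")
                yield member.name, line.rstrip("\n")
```

A caller could then loop with "for name, line in iter_member_lines(path):" and
notice the start of each inside file whenever name changes.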

--
Daniel Fackrell (dfackrell at linuxmail.org)
When we attempt the impossible, we can experience true growth.

"Shagshag13" <shagshag13 at yahoo.fr> wrote in message
news:aea0hs$54qq4$1 at ID-146704.news.dfncis.de...
> In fact, I don't want to tar/untar files, and especially not in main
> memory!
>
> I wish I could read the tar-ed file line by line (f.readline) and be able
> to check when I find the beginning of an "inside file" and get some info
> about it like name, how and so on... (that's because my original files
> are plain text files, and I think that tar will leave them unchanged)
>
> In my tar file there is, for example, a kind of separator like this (but
> with everything in one long line):
>
> shag.py_0100744_0002033_0001750_00000004414_07500237361_0015314_0_ustar_00_shagshag_user_0000040_0000417_beginofmyfilehere
>
> where _ stands for a variable amount of another ASCII code that I can't
> cut/paste...
>
> Do you know what each means (for example the first one is undoubtedly the
> file name, but then...)?
> And what are the fixed positions of each of these?
> Have a clue?
>
> Many thanks,
>
> s13.
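
On the question of fixed positions: in the POSIX ustar format, each member is
preceded by a 512-byte header whose fields sit at fixed offsets and are padded
with NUL (or space) bytes. That padding is most likely what the _ characters in
the sample stand for. Read that way, the sample shows name, mode, uid, gid,
size, mtime, checksum, type flag, magic, version, user name, group name, and
the two device numbers, with the numeric fields written as octal ASCII. The
sketch below reads those headers directly. The names USTAR_FIELDS, parse_header
and iter_headers are made up for illustration, and it assumes a plain ustar
archive with no GNU or pax extensions.

```python
BLOCK = 512

# (field name, offset, length) as laid out in the POSIX ustar header.
USTAR_FIELDS = [
    ("name", 0, 100), ("mode", 100, 8), ("uid", 108, 8), ("gid", 116, 8),
    ("size", 124, 12), ("mtime", 136, 12), ("chksum", 148, 8),
    ("typeflag", 156, 1), ("linkname", 157, 100), ("magic", 257, 6),
    ("version", 263, 2), ("uname", 265, 32), ("gname", 297, 32),
    ("devmajor", 329, 8), ("devminor", 337, 8), ("prefix", 345, 155),
]


def parse_header(block):
    """Split one 512-byte header block into a dict of stripped text fields."""
    if len(block) != BLOCK:
        raise ValueError("a tar header is exactly 512 bytes")
    fields = {}
    for name, offset, length in USTAR_FIELDS:
        raw = block[offset:offset + length]
        # Each field ends at the first NUL; numeric fields may also carry spaces.
        fields[name] = raw.split(b"\0", 1)[0].decode("ascii", "replace").strip()
    return fields


def iter_headers(tar_path):
    """Yield the header dict of each member, skipping over the file data."""
    with open(tar_path, "rb") as f:
        while True:
            block = f.read(BLOCK)
            if len(block) < BLOCK or block == b"\0" * BLOCK:
                break  # an all-zero block marks the end of the archive
            header = parse_header(block)
            yield header
            size = int(header["size"], 8)  # size is stored as octal text
            # File data follows the header, padded up to a multiple of 512.
            f.seek((size + BLOCK - 1) // BLOCK * BLOCK, 1)
```

The data for each member starts right after its header, so a plain text file
stored in the archive can be read unchanged from that position for
int(header["size"], 8) bytes.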
