Most space-efficient way to store log entries

Chris Angelico rosuav at gmail.com
Wed Oct 28 21:47:33 EDT 2015


On Thu, Oct 29, 2015 at 12:09 PM, Cameron Simpson <cs at zip.com.au> wrote:
> On 29Oct2015 11:39, Chris Angelico <rosuav at gmail.com> wrote:
>>>
>>> If it's only zipped, it's not opaque.  Just `zcat` or `zgrep` and
>>> process away.  The whole base64+minus_newlines thing does opaquify
>>> and doesn't really save all that much for the trouble.
>>
>>
>> If you zip the file as a whole, yes. If you zip individual
>> pieces, you can't zcat it (at least, I don't think so?).
>
>
> If it is pure gzip, then yes you can. So this:
>
>  gunzip < file1.gz; gunzip < file2.gz
>
> and this:
>
>  cat file1.gz file2.gz | gunzip
>
> should produce the same output. I think this works at the record level too.
>
> Of course all bets are off once you wrap the records in some outer layer (I
> have a file format of little records which may have the data section
> zipped).

I was thinking in terms of having them wrapped, yes. Though I didn't
think of the possibility of merely abutting compressed streams; with a
bit of seeking and footling around, you could possibly make the file
more tailable. Lots of options, but I still think uncompressed text is
best.
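
For the curious, here's a rough sketch of the abutting idea in Python,
using only the standard library's gzip module. The helper names
(append_record, read_all) and the sample entries are made up for
illustration, and it assumes each log line becomes its own gzip member
with nothing wrapped around it:

```python
import gzip

def append_record(log, line):
    """Compress one log line as its own gzip member and abut it to the log."""
    log += gzip.compress(line.encode("utf-8") + b"\n")

def read_all(log):
    """Decompress every abutted member in one go, like `cat *.gz | gunzip`."""
    # gzip.decompress is documented to handle multi-member gzip data.
    return gzip.decompress(bytes(log)).decode("utf-8").splitlines()

entries = ["first entry", "second entry", "third entry"]
log = bytearray()
for entry in entries:
    append_record(log, entry)

raw_size = sum(len(e.encode("utf-8")) + 1 for e in entries)
print(read_all(log))
print("raw bytes:", raw_size, "compressed bytes:", len(log))
```

Every member carries its own gzip header and trailer, so with short
lines like these the "compressed" log comes out bigger than the plain
text, which is part of why I'd still go uncompressed.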

ChrisA


