[Python-Dev] PEP 3147: PYC Repository Directories

Barry Warsaw barry at python.org
Sun Feb 7 19:32:47 CET 2010


On Feb 06, 2010, at 04:02 PM, Guido van Rossum wrote:

>On Sat, Feb 6, 2010 at 3:28 PM, Barry Warsaw <barry at python.org> wrote:
>> On Feb 01, 2010, at 02:04 PM, Paul Du Bois wrote:
>>
>>>It's an interesting challenge to write the file in such a way that
>>>it's safe for a reader and writer to co-exist. Like Brett, I
>>>considered an append-only scheme, but one needs to handle the case
>>>where the bytecode for a particular magic number changes. At some
>>>point you'd need to sweep garbage from the file. All solutions seem
>>>unnecessarily complex, and unnecessary since in practice the case
>>>should not come up.
>>
>> I don't think that part's difficult.  The byte code's only going to change if
>> the source file has changed, and in that case, /all/ the byte code in the "fat
>> pyc" file will be invalidated, so the whole thing can be deleted by the first
>> writer.  I'd worked that out in the original fat pyc version of the PEP.
>
>I'm sorry, but I'm totally against fat bytecode files. They make
>things harder for all tools. The beauty of the existing bytecode
>format is that it's totally trivial: magic number, source mtime,
>marshalled code object. You can't beat the beauty of that.

Just for the record, I totally agree.  I was just explaining something I had
figured out in the original version of the PEP, which wasn't published but
which Martin had seen an early draft of.  When Martin made the suggestion of
sibling cache directories, I immediately realized that it was much cleaner,
better, and easier to implement than fat files (especially because I already
had some nasty complex code that implemented the fat files ;).  I'm beginning
to be convinced <wink> that a folder-per-folder approach is the best take on
this yet.
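
For anyone following along, the three-part layout Guido describes above (magic
number, source mtime, marshalled code object) can be sketched roughly as below.
This is only an illustration, not the interpreter's actual code: the magic
value and both function names are made up for the example, the timestamp is
assumed to be a 4-byte little-endian integer, and later CPython releases added
more header fields, so this will not parse a .pyc written by a recent Python.

```python
import marshal
import struct

# Hypothetical 4-byte magic number, for illustration only.
# Real magic numbers are specific to each interpreter version.
EXAMPLE_MAGIC = b"\xd1\xf2\r\n"


def build_skinny_pyc(source, filename, mtime, magic=EXAMPLE_MAGIC):
    """Return bytes laid out as: magic, source mtime, marshalled code object."""
    code = compile(source, filename, "exec")
    header = magic + struct.pack("<I", int(mtime) & 0xFFFFFFFF)
    return header + marshal.dumps(code)


def load_skinny_pyc(data, expected_magic=EXAMPLE_MAGIC):
    """Return (mtime, code), or None if the magic number does not match."""
    if len(data) < 8 or data[:4] != expected_magic:
        return None  # wrong interpreter version or incomplete file: ignore it
    (mtime,) = struct.unpack("<I", data[4:8])
    code = marshal.loads(data[8:])
    return mtime, code
```

A tool only has to compare four bytes to decide whether a file is usable, which
is the simplicity being defended here.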

>For the traditional "skinny" bytecode files, I believe that the
>existing algorithm which writes zeros in the place of the magic number
>first, writes the rest of the file, and then goes back to write the
>correct magic number, is correct with a single writer and multiple
>readers (assuming the readers ignore the file if its magic number is
>invalid). The creat(O_EXCL) option ensures that there won't be
>multiple writers. No rename() is necessary; POSIX rename() may be
>atomic, but it's a directory modification which makes it potentially
>slow.

Agreed, and the current approach is time and battle tested.  I don't think we
need to be mucking around with it.

My current effort on this PEP will be spent on fleshing out the
folder-per-folder approach, understanding the implications of that, and
integrating all the other great comments in this thread.

-Barry