[Python-ideas] Move optional data out of pyc files

Steve Barnes gadgetsteve at live.co.uk
Tue Apr 10 14:52:36 EDT 2018



On 10/04/2018 18:54, Zachary Ware wrote:
> On Tue, Apr 10, 2018 at 12:38 PM, Chris Angelico <rosuav at gmail.com> wrote:
>> A deployed Python distribution generally has .pyc files for all of the
>> standard library. I don't think people want to lose the ability to
>> call help(), and unless I'm misunderstanding, that requires
>> docstrings. So this will mean twice as many files and twice as many
>> file-open calls to import from the standard library. What will be the
>> impact on startup time?
> 
> What about instead of separate files turning the single file into a
> pseudo-zip file containing all of the proposed files, and provide a
> simple tool for removing whatever parts you don't want?
> 

Personally I quite like the idea of having the doc strings, and possibly
other optional components, in a zipped section after a marker for the
end of the operational code. The loader could possibly stop reading at
that point (reducing load time and memory impact) and only load and
unzip the optional section on demand.
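
A minimal sketch of that layout, purely as illustration. This is not the
real .pyc format: MARKER, pack(), load_code() and load_docs() are
hypothetical names, and the 4-byte length prefix is my own assumption so
that the loader never has to scan for the marker.

```python
import io
import json
import marshal
import struct
import zlib

MARKER = b"--OPTIONAL--"  # hypothetical end-of-code marker


def pack(source, docstrings):
    """Build a blob: length-prefixed code, marker, zlib-compressed docstrings."""
    code = marshal.dumps(compile(source, "<demo>", "exec"))
    docs = zlib.compress(json.dumps(docstrings).encode("utf-8"))
    return struct.pack("<I", len(code)) + code + MARKER + docs


def load_code(stream):
    """Read only the operational code and stop before the marker."""
    stream.seek(0)
    (size,) = struct.unpack("<I", stream.read(4))
    return marshal.loads(stream.read(size))


def load_docs(stream):
    """Skip the code section, then read and decompress the optional part."""
    stream.seek(0)
    (size,) = struct.unpack("<I", stream.read(4))
    stream.seek(size, io.SEEK_CUR)
    if stream.read(len(MARKER)) != MARKER:
        return {}  # optional section was stripped
    return json.loads(zlib.decompress(stream.read()).decode("utf-8"))


blob = pack("def f():\n    return 42\n", {"f": "Return the answer."})
stream = io.BytesIO(blob)
namespace = {}
exec(load_code(stream), namespace)   # normal import path: docs never read
docs = load_docs(stream)             # only when help() or similar needs them
```

A stripping tool would then only need to truncate the file at the
marker; load_docs() above returns an empty dict in that case.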

Zipping the doc strings should significantly reduce file sizes, but it
is worth remembering a few things:

  - Python is already one of the most compact languages for what it can
do - I have had experts demanding to know where the rest of the program
is hidden and how it is being downloaded when they noticed the size of
the installed code versus the functionality provided.
  - File size <> disk space consumed. On most file systems each file
typically occupies 1 + (file_size // allocation_size) clusters of the
drive, and as disk sizes increase the allocation_size generally
increases too. Both of my NTFS drives currently have 4096 byte
allocation sizes, but I am offered up to 2 MB allocation sizes.
Splitting a 10,052 byte .pyc file (picking a random example from my
drive) into 5,052 and 5,000 byte files will change the disk space
occupied from 3*4,096 to 4*4,096 bytes, plus the extra directory entry.
  - Where absolute file size is critical (such as on embedded
systems), you can always use the -O & -OO flags.
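
The cluster arithmetic in the second point can be checked with a few
lines. This is only a sketch of the rough estimate quoted above; the
function names are made up for illustration, and real file systems
differ in detail.

```python
def clusters(file_size, allocation_size=4096):
    """Clusters occupied, using the rough 1 + (size // allocation) estimate.

    Note: this overcounts by one cluster when file_size is an exact
    multiple of allocation_size; a ceiling division would be exact.
    """
    return 1 + (file_size // allocation_size)


def disk_space(file_size, allocation_size=4096):
    """Bytes of disk consumed by a file of the given size."""
    return clusters(file_size, allocation_size) * allocation_size


single = disk_space(10052)                   # one 10,052 byte .pyc file
split = disk_space(5052) + disk_space(5000)  # the same data in two files

print(single, split)  # 12288 16384
```

So the split costs one extra 4,096 byte cluster here, before counting
the additional directory entry.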
-- 
Steve (Gadget) Barnes
Any opinions in this message are my personal opinions and do not reflect 
those of my employer.



