crossplatform py2exe - would it be useful?

Bengt Richter bokr at oz.net
Sun Aug 10 00:39:10 EDT 2003


On Sat, 09 Aug 2003 17:11:57 GMT, Alex Martelli <aleax at aleax.it> wrote:

>Bengt Richter wrote:
>   ...
>> Is there a big difference for you between e.g.,
>> 
>>     wget <some url>/py2exefiedapp.exe
>>     py2exefiedapp
>> 
>> and
>> 
>>     wget <some url>/py2exefiedapp.uff
>>     uffunwrap --launch py2exefiedapp.uff
>
>Yes.  Specifically, in the first case (on a Unix-like system) I could
>interpose a suitable set-userid setting change such as:
>
>    sudo chmod u+s py2exefiedapp
>
>while in the second case I couldn't.

If you know enough to use sudo, I think you would find it trivial to
use uffunwrap without auto-launch (i.e., just like an installer or
untarrer or unzipper etc.) and do your sudo chmod on whatever you liked.
I don't think this is a real argument against uffunwrap. In fact, by
including a cmd/bash/sh/whatever file and launching that instead of
the main app, you could have a prompted automated sequence where you might
just have to remember the password when prompted (and maybe confirm that, yes,
you really mean it ;-)

I think there is good reason to minimize the number of times one gives control
to .exe's one hasn't very well authenticated. Imagine if all zip files came
in the form of .exe's! Wouldn't that make you nervous? Much better IMO to have
a single trusted tool that deals with them all as safely as possible.

>
>Furthermore, my cousin, who has installed no extras at all compared
>to what comes with his operating system (and runs an operating system
>without 'apt-get', 'urpmi', or similar 'download-on-demand' functionality)
>would still be able to take full advantage of the first approach w/o
>having to previously install ANY other piece of software; to take
>advantage of the second approach, he would have to first download and
>install 'uffunwrap', and he just ain't gonna do that.
For a one-time thing, I think he *should*, and you should twist his arm,
because I think it would result in a safer modus operandi. If the plan for
Python is to generate .exe's per se as self-executing distribution containers,
(other than the major wise installer distribution files, which have md5's posted)
then I really think some considerable thought ought to be expended on making
that safe. I have no problem checking md5's, but your cousin might.
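
For reference, checking a posted md5 by hand takes only a few lines. This is a
minimal sketch, assuming the distributor publishes the digest as a hex string;
the function names file_md5 and matches_posted_md5 are made up for illustration.

```python
import hashlib


def file_md5(path, chunk_size=64 * 1024):
    """Return the hex md5 digest of the file at path, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large downloads need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_posted_md5(path, posted_hex):
    """Compare the computed digest with the posted one, ignoring case and whitespace."""
    return file_md5(path) == posted_hex.strip().lower()
```

A trusted unwrapper could run the same comparison automatically before
launching anything, which is the part a casual user is unlikely to do by hand.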

I would envision uffunwrap to be a small executable that can unwrap, launch,
check integrity, and potentially do an automatic authenticity check (though
featuritis can cause growth). The packing partner program would be in Python,
say uff.py, and be able to package a uff file automatically based on a uff header
template edited to specify the requisite file sources on lines inserted following each
normal header line (which specifies an included file and where it goes in the unwrapping
context). The normal header lines also have text/binary flags and either actual sizes
and dates and optional digests, or placeholders for them to be created from specified sources.
That way an option to revalidate a new packing with only one file changed can be supported,
and only the header line for the new source should be updated.

I like short names, so maybe uffunwrap should just be uff.exe and the packer uff.py
(which also can have command line options for unpacking and various fancy stuff,
since it is Python).

But if the heart is set on a single .exe for the user experience of having an
apparent single executable to run, then as I mentioned previously, it's
not that hard, and as you mentioned, there are various .exe builders out there.

Indeed, I believe winzip can generate an auto-extracting, auto-launching exe that
it can also recognize as a zip archive. Maybe even old pkzip could do something similar.

A single exe could be built on tar, tgz, zip or most any archive format as a multi-file
container embedded as a single binary resource.

I got interested in factoring out the container. The information itself is not that different
from a zip or tar file, but I have long been bugged by the IMO severely kludgy way file content
types are represented and/or inferred through magic and/or extension hints etc. etc.

IMO there ought to be a way of associating file type and other metadata with file data
other than through file names and/or extensions or multi/nefarious magic. So I thought,
what about introducing a single open-ended magic prefix (postfix or indirection can work too)
to data that could do the job. So I got to thinking, maybe a utf-8 header having some general
structure that can identify what you'd like to say about the data itself, as opposed to the
particular file system container that happens to contain it.

My original thought was to have a metadata prefix for single files, and just tweak a file
system implementation to keep this data in the first n 512-byte blocks of what would
ordinarily be data space, but add an offset into the file definition, so that the header
could be skipped transparently for seek and open etc., and look like an ordinary file,
but allow some kind of access to the metadata, maybe by opening with an 'm' mode to include
the metadata prefix as part of the apparent file. Or maybe to exclude it, so naive opens
will see the metadata. To have international content description I thought utf-8 would work.
Then it was a matter of choosing a standard format and minimal content for the header. Then
I got interested in something else ;-)
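
The single-file metadata prefix can be imitated in user space without touching
a file system. This is only a sketch under assumptions: the header is utf-8
text ended by a blank line, NUL bytes pad it to a 512-byte boundary, and the
names pack, split and open_data are hypothetical. The 'm' mode follows the
first variant described above (metadata included only on request).

```python
import io

BLOCK = 512  # block size mentioned in the text


def pack(meta_lines, data):
    """Return utf-8 header + blank line + NUL padding to a block boundary + data."""
    header = ("\n".join(meta_lines) + "\n\n").encode("utf-8")
    padding = -len(header) % BLOCK
    return header + b"\0" * padding + data


def split(blob):
    """Return (header_text, data_offset) for bytes produced by pack()."""
    end = blob.index(b"\n\n") + 2        # first byte after the blank line
    offset = -(-end // BLOCK) * BLOCK    # round up to the next block boundary
    return blob[:end - 2].decode("utf-8"), offset


def open_data(blob, mode="d"):
    """Mode 'd' hides the header (ordinary file view); mode 'm' includes it."""
    _, offset = split(blob)
    return io.BytesIO(blob if mode == "m" else blob[offset:])
```

Because the data view starts at a fixed offset, seek and read on it behave
like they would on an ordinary file, which is the transparency being asked for.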

One reason for a universal text header is that then any file can be opened in a text editor
and you should at least see the header. Or just do head -20 some.uff to have a peek.

Well, this latest thread came up, and I thought to expand the idea to a segmented file,
with a header field for every segment. rfc2822 seemed like a possible format for the header,
other than it's supposed to be ascii. I wanted a universal format. So I'm debating utf-8 or -16,
and settled on 8 for now, because it's more readable if you see it raw.

I thought I could fairly easily implement packing methodology and at first I thought to use
the data as embedded/appended .exe resource, but had second thoughts about .exe's. Anyway,
it could obviously work as a microinstaller tool as well as a launcher. So what does it have
that wise or winzip etc don't have?

   ++ Potential for really simple and small open source code, both python and C.
    + support for unicode descriptions etc. (On windows it wouldn't be so hard to send the header
      to the clipboard for insertion into an editor that can show unicode, e.g., notepad.)
    + Potential to detect current console encoding and output header accordingly for
      localized interactive viewing w/o editor (instead of just assuming latin-1 and printing ?'s)
    + probable pretty good portability for a lot of the unwrapper.
   ++ platform independence of the .uff format (since endianness, packing, encoding, whatever can
      all be specified in the utf-8 header, and the rest is binary with specified endianness overall,
      and segment-wise also describable).
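
The console-encoding point in the list above could look roughly like this in
Python. It is a sketch, not part of any existing uff tool: render_for_console
is a hypothetical name, and characters the console cannot show are still
replaced with '?', only now based on the detected encoding instead of an
assumed one.

```python
import sys


def render_for_console(header_bytes, encoding=None):
    """Decode a utf-8 header and re-encode it for the current console."""
    # Fall back to ascii when stdout has no encoding (e.g. some pipes).
    encoding = encoding or sys.stdout.encoding or "ascii"
    text = header_bytes.decode("utf-8")
    # 'replace' substitutes '?' for anything the console encoding lacks.
    return text.encode(encoding, errors="replace")
```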

To get fast unpacking, I'd probably specify align=512 in the X-UFF-Packing: field.
To get really fast unpacking, I might spawn separate threads to copy segments to files
in parallel, but that's a future optimization. YM would vary with OS, controllers, etc.
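
What align=512 would mean for the layout can be shown with a little
arithmetic. A minimal sketch, assuming each segment simply starts at the next
multiple of the alignment; aligned_offsets is a made-up helper name, and the
exact semantics of the X-UFF-Packing: field are not defined here.

```python
def aligned_offsets(sizes, start=0, align=512):
    """Return the start offset of each segment when segments begin on align boundaries."""
    offsets = []
    pos = start
    for size in sizes:
        pos = -(-pos // align) * align  # round pos up to a multiple of align
        offsets.append(pos)
        pos += size
    return offsets
```

With a 300-byte header and segments of 456, 1024 and 10 bytes, this gives
offsets 512, 1024 and 2048, so every copy starts on a block boundary.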

Yet another self-unpacking archive is not rocket science or that interesting, but the idea
of a universal, file-system-independent, self-describing data format, seems to me the important
part. It would be like a universal bar code system for data, and would mean you could do
away with file extension associations, and you could see the original name for the data in some
native language, no matter how many times it had been contained in variously named and dated files
-- which are really only container names, not data names (except by unreliable dual name usage).
When you make new data, or modify existing data, that's when the data descriptions should change.
The file system names in use will only be temporary locator info, and are really separate
semantically.

Yada, yada ...

It wouldn't be that hard to do a single-exe version that can carry the data appended. (Though I'm
not 100% sure all .exe formats permit that any more, so it might be a matter of getting some
template pieces and faking what the linker does to include a binary resource officially
within an exe.) But you still need an unpacking function to put python.exe and theapp.py
and theextension.dll and config.txt etc. into separate files, and maybe setting environment
and path, before kicking off python.exe.
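
The "data appended" variant is commonly done with a small trailer at the very
end of the file, so the stub can find its payload without parsing the exe
format. This is a sketch of the byte layout only: the marker value, the
trailer format and both function names are assumptions, and whether a given
loader tolerates trailing bytes is exactly the open question noted above.

```python
import struct

MAGIC = b"UFFPAYLD"              # hypothetical 8-byte marker
TRAILER = struct.Struct("<8sQ")  # marker + payload length, little-endian


def append_payload(stub_bytes, payload):
    """Return stub + payload + trailer recording the payload length."""
    return stub_bytes + payload + TRAILER.pack(MAGIC, len(payload))


def extract_payload(exe_bytes):
    """Return the appended payload, or None if no valid trailer is present."""
    if len(exe_bytes) < TRAILER.size:
        return None
    magic, length = TRAILER.unpack(exe_bytes[-TRAILER.size:])
    end = len(exe_bytes) - TRAILER.size
    if magic != MAGIC or length > end:
        return None
    return exe_bytes[end - length:end]
```

The payload itself would still be an archive (uff, zip, tar) that the stub
unpacks into separate files before starting python.exe.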

You can also buy a copy of winzip to create self-extracting-and-launching zip files,
I think.

UFF is different. For one thing it is platform independent as a container format.
Of course executable binary contents destined for different platforms will be
different (and BTW the opportunity exists to package several versions selectable
at startup, even if just localization strings for a given app). But the basic
content is binary or text, and text is stored in the uff file with \n EOLs, unless
it is some special encoding, in which case it should be flagged as binary. When it
is unwrapped, a text destination file is opened with 'w' not 'wb', so it becomes
what is normal for the platform. Of course you can ship ascii as binary too.
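
The text/binary distinction on unwrapping comes down to the open mode. A
minimal sketch, assuming text segments are utf-8 and that 't' and 'b' are the
flag letters ('t' appears in the header example in this post, 'b' is a guess);
write_segment is a hypothetical name.

```python
def write_segment(path, payload, flag):
    """Write one unwrapped segment; flag is 't' (text) or 'b' (binary)."""
    if flag == "t":
        # Text mode: each '\n' stored in the uff file is written as the
        # platform's own line ending (newline=None is the default behaviour).
        with open(path, "w", encoding="utf-8", newline=None) as f:
            f.write(payload.decode("utf-8"))
    else:
        # Binary mode: bytes are copied through untouched.
        with open(path, "wb") as f:
            f.write(payload)
```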

My current concept for packing (as opposed to unwrapping) a uff file, is to
drive it using a copy of the header as a template, and just e.g., add in
a source: specification after each line that needs a file to pack into the whole, e.g.,
    ...
    X-UFF-Pkt: 3: t     456 ./myConfig.txt ...
      source:   build2/cfg.dat
    ...
where relative paths are taken re a prefix specified elsewhere, maybe a packing command line option.
Anyway, it becomes a simple and I think flexible framework for lots of possibilities. Since you
can include whatever you want and launch anything you want from the included -- or from an
assumed user environment, since it's like having an internal command line. Speaking of which,
you could possibly prefix e.g. #! uffunwrap -x and make the resulting .uff executable.
BTW, does sudo chmod u+s get the setuid effect passed on to the interpreting executable?
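
The template-driven packing described above could start from a parser along
these lines. It is a sketch built from the single example line in this post:
the field meanings (segment number, text/binary flag, size, destination path)
are inferred, anything after the destination is kept verbatim, and
parse_template and its prefix parameter are hypothetical names.

```python
import os
import re

# index, one-letter flag, size (or placeholder), destination, remaining text
PKT_RE = re.compile(r"^X-UFF-Pkt:\s*(\d+):\s*(\S)\s+(\S+)\s+(\S+)(.*)$")


def parse_template(lines, prefix="."):
    """Return one dict per X-UFF-Pkt line, with a following source: attached."""
    packets = []
    for raw in lines:
        line = raw.strip()
        m = PKT_RE.match(line)
        if m:
            packets.append({
                "index": int(m.group(1)),
                "flag": m.group(2),
                "size": m.group(3),          # kept as text: may be a placeholder
                "dest": m.group(4),
                "rest": m.group(5).strip(),  # unparsed remainder of the line
                "source": None,
            })
        elif line.startswith("source:") and packets:
            rel = line[len("source:"):].strip()
            # Relative sources are resolved against the packing prefix.
            packets[-1]["source"] = os.path.join(prefix, rel)
    return packets
```

A packer would then read each source file, fill in the real size, date and
digest, and write the header line back without its source: line.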

>
>That's two strikes against the ".uff" approach and in favour of the
>'.exe' one.  I can see potential advantages for the '.uff', too, in
>widely different scenarios; but these issues indicate to me that it
>just can't replace the '.exe'.  Therefore, I would suggest you pursue
>the .uff as a third-party alternative -- while, on the other hand,
>"makers of .exe's" have long been available as third-party alternatives,
>and the thrilling aspect of this latest round of ideas is that we seem
>to be very close to being able to integrate them in the Python standard
>distribution, with a resulting potential for an interesting boost to
>Python's popularity.  It makes a psychological difference, quite a big
>one, whether some functionality is integrated in a standard distribution
>or has to be separately downloaded and installed as a third-party add-on.
Agreed. But I don't see why uff.exe and uff.py couldn't be standard. Since
downloading and running .exe's with a big python payload is attractive to
some, why wouldn't downloading a 50k or 100k uff.exe be attractive? ;-)
>
>"Ability to build directly executable files" would make a big 'selling'
>point if it were in Python's standard distribution, while "ability to
>wrap files into an archive which still needs a separate utility to
>unwrap and run", useful as it may be, just doesn't have the same level
>of raw appeal to typical punters currently wondering about Python.
Not that hard to take a copy of uff.exe and append the payload and have your
directly executable file. I can do it, but I'm not sure it's a good idea.
It could be fine for official python stuff, just like the windows installer
.exes are fine (but I only say that because I trust the Timbot ;-)
And I can check the md5's on those.

But in general, I don't see that the executable buys me much except worry.
The final executable is prepared by some few actions in any case. I'd rather
be having a data-driven tool I trust do it than something I don't wholly trust
maybe do it on the fly. Plus if the content really is multiple files, re-executing
the original exe may mean loading the whole thing, even if it notices that it
doesn't need to repeat its initial disgorging of content.

I just don't like the work of first making sure it's really the .exe I intended to get.
If I have a separate trusted tool that makes checking and looking easy, I prefer it.
It's why I prefer zipped or tgz files to gee-whiz Installshield-prepared or any other
installation .exes.
 
You can never trust those buggers to ask you politely whether you would like
their latest and greatest to override current file associations (or which), or
replace ctl3d.dll with something newer and supposedly better, etc. (Unless,
of course, you know the timbot put it together ;-)

Regards,
Bengt Richter



