Python obfuscation

Mike Meyer mwm at mired.org
Fri Nov 11 11:17:43 EST 2005


"Ben Sizer" <kylotan at gmail.com> writes:
> Mike Meyer wrote:
>> There are ways to distribute
>> Python modules so that the user can't just open them in a text
>> editor. There are also ways to get cryptographic security for
>> distributed modules.
> As for cryptographic security, could you provide a link or reference
> for this? I am quite interested for obvious reasons. I'd be concerned
> that there's a weak link in there at the decoding stage, however.

How about some ideas: Store your code in a zip file, and add it to the
search path. That immediately takes you out of the "just open the file
with a text editor" mode. For cryptographic security, use the ihooks
module to make "import" detect and decode encrypted modules before
actually importing them. Or digitally sign the modules, and check the
signature at import time. All of these are dead simple in Python.
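
As a minimal sketch of the zip-file idea, assuming a current Python 3 interpreter: the archive name bundle.zip and the module name hidden_mod are made up for the illustration, and the archive is built on the fly only so the example is self-contained.

```python
import os
import sys
import tempfile
import zipfile

# Hypothetical module source; in practice this would be your real code.
MODULE_SOURCE = "def answer():\n    return 42\n"

# Build a zip archive containing one module (names are illustrative).
tmp_dir = tempfile.mkdtemp()
zip_path = os.path.join(tmp_dir, "bundle.zip")
with zipfile.ZipFile(zip_path, "w") as archive:
    archive.writestr("hidden_mod.py", MODULE_SOURCE)

# Putting the archive itself on the search path lets the standard
# zipimport machinery find modules stored inside it.
sys.path.insert(0, zip_path)

import hidden_mod

result = hidden_mod.answer()
module_location = hidden_mod.__file__  # a path inside bundle.zip
```

This only stops someone from opening the module directly in a text editor; anyone with an unzip tool can still read it.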

> I have considered distributing my program as open source but with
> encrypted data. Unfortunately anyone can just read the source to
> determine the decryption method and password. Maybe I could put that
> into an extension module, but that just moves the weak link along the
> chain.

This isn't a Python problem, it's a problem with what you're doing. Try
Alex's solution, and put the data on a network server that goes
through whatever authentication you want it to.

>> Yes, if you use the same methods you use in C++,
>> it's "much harder". But by the same token, if you tried to use the
>> methods you'd use in a Python program in C++, you'd find that the C++
>> version was "much harder".
> Well, I'm not sure what you mean here. A compiled C++ program is much
> harder to extract information from than a compiled Python program.

It is? Is the Python disassembler so much more advanced than the state
of the art of binary disassemblers, then? Or maybe it's the Python
decompilers that are so advanced? As far as I can tell, the only real
difference between Python bytecodes and x86 (for instance) binaries is
that Python bytecodes keep the variable names around so it can do
run-time lookups. That's not that big a difference.
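
To see the names that survive compilation, here is a small sketch using the standard dis module on a current Python 3. The function describe is invented for the illustration; the exact opcodes vary between Python versions, so the sketch only looks at names and opcode names rather than a fixed listing.

```python
import dis

def describe(items):
    # 'items' and 'count' are locals; 'len' is looked up by name at run time.
    count = len(items)
    return count

code = describe.__code__

# Local variable names are stored on the code object.
local_names = code.co_varnames

# Names resolved at run time (globals, builtins, attributes) are stored too.
runtime_names = code.co_names

# The instruction stream itself, as opcode names.
opcodes = [instruction.opname for instruction in dis.get_instructions(describe)]
```

Calling dis.dis(describe) prints the full disassembly, with those names shown next to the instructions that use them.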

As for what I meant - Python has ihooks and imp, which make it simple
to customize import behavior. Doing those kinds of things with C++
code requires building the tools to do that kind of thing from
scratch.
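
A rough sketch of that kind of import customization, checking a signature before a module is allowed to run. Assumptions, all of them mine: ihooks does not exist in Python 3 and imp was removed in 3.12, so this uses importlib instead; an HMAC with a shared key stands in for a real public-key signature; and the key, the class name SignedSourceFinder and the module names are hypothetical.

```python
import hashlib
import hmac
import importlib.abc
import importlib.util
import sys

SECRET_KEY = b"demo-key"  # hypothetical; a real scheme would verify a public-key signature

def sign(source):
    """Return a hex HMAC-SHA256 tag for the given source bytes."""
    return hmac.new(SECRET_KEY, source, hashlib.sha256).hexdigest()

class SignedSourceFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Serve modules from a dict of name -> (source bytes, tag), verifying each tag."""

    def __init__(self, modules):
        self.modules = modules

    def find_spec(self, fullname, path=None, target=None):
        if fullname in self.modules:
            return importlib.util.spec_from_loader(fullname, self)
        return None  # let the normal import machinery handle everything else

    def create_module(self, spec):
        return None  # use the default module object

    def exec_module(self, module):
        source, tag = self.modules[module.__name__]
        # Refuse to execute anything whose tag does not match its source.
        if not hmac.compare_digest(sign(source), tag):
            raise ImportError("bad signature for " + module.__name__)
        exec(compile(source, "<signed:%s>" % module.__name__, "exec"), module.__dict__)

good_source = b"def greet():\n    return 'hello'\n"
bad_source = b"def greet():\n    return 'tampered'\n"

sys.meta_path.insert(0, SignedSourceFinder({
    "signed_demo": (good_source, sign(good_source)),
    "tampered_demo": (bad_source, sign(good_source)),  # tag does not match
}))

import signed_demo
greeting = signed_demo.greet()

try:
    import tampered_demo
    tamper_rejected = False
except ImportError:
    tamper_rejected = True
```

The same exec_module hook is where a decoding step for encrypted modules would go. Either way, the key or the verification code ships with the program, so this raises the effort needed rather than making tampering impossible.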

>> Of course, as Alex pointed out, all of these are just keeping honest
>> people honest. The crooks have all the advantages in this game, so you
>> really can't expect to win.
> No, certainly not. But if you can mitigate your losses easily enough -
> without infringing upon anyone else's rights, I must add - then why not
> do so.

Elsewhere in the thread, you said:

> I'd just like to make it non-trivial to make or use additional copies.

How do you do that without infringing my fair use rights?

    <mike
-- 
Mike Meyer <mwm at mired.org>			http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.


