Obfuscator, EXE, etc. - a solution

jason willows apocalypznow at yahoo.com
Wed Apr 7 04:52:13 EDT 2004


There have been many, many discussions about obfuscating Python.  To my
dismay, most of the answers come from the frequent posters, and they
say things such as:
1) what's the point, in theory anything could eventually be decompiled
2) python is mostly used for internal stuff anyway, because it's a
"glue" language, so why bother
3) use licensing and a good lawyer, it's the ONLY way
4) many programmers seem comfortable releasing their Java and .NET and
other interpreted code products into the market, so why not you?

I found most of these comments dismissive, and sometimes quite arrogant.
Frankly, the reason anyone would want to protect their code is simple,
and it should be respected because we are all programmers: we want to
protect our hard work.

Addressing the above points:
1) Anything could eventually be decompiled.... yes that's true.  In a 
perfect world.  Have you ever tried to decompile C code and make sense 
of it?  Try a large C program.  Good luck, you philosophers.
2) I don't see Python as merely a glue language.  I see it as a serious 
language for serious applications.  Indeed, there are many commercial 
examples of this, and Python works very well and is cost-efficient to 
use.  Incidentally, IBM and Microsoft have adopted Python for various
applications.... not that that in itself should necessarily mean
anything.
3) Using licensing and a good lawyer.  I'm all for that!  But suppose
your code has been stolen, and you hire a lawyer to fight it out in
court.  Months go by, maybe years.  The law offers no guarantees,
except to lawmakers.  You've mortgaged your house to protect your
investment, and that only pays off if you win.
4) Others release their Java and .NET programs.  Many obfuscate their
code before doing so, for the very same reasons a Python programmer
would want to.

I'm sick and tired of intelligent people acting like idiots. 
Programmers should offer solutions, rather than anecdotal discussions 
based on obvious points.

Here's my solution.  It's not perfect, but it works well:
Use Pyrex, which translates your Python sources (virtually unchanged)
to .c files, which are then compiled and linked.  You get natively
compiled .pyd files (i.e., DLLs), just as though you had written a C
program and compiled and linked it yourself.  I did this for all my
source files except the one that starts my program.  On that file I ran
py2exe (latest version) to create an EXE, and it also puts all my .pyd
files into library.zip.  The result is a program that is as difficult
to understand after decompiling as a natively compiled C program,
except for the starting source file (which should contain only a very
small fraction of your program logic anyway).  I have done this on a
client-side Python program composed of over 40 .py files of between 200
and 500 lines each.  It uses wxPython widgets for the GUI, Twisted for
client/server communication, Pyro for peer-to-peer communication, and
the Crypto package for RSA public key encryption.  It runs without
problems of any kind, including any related to the GUI, Twisted, Pyro,
or Crypto, and the increase in execution speed is very obvious.
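
For anyone who wants to try the two-step build, here is a rough sketch
of what a build script could look like.  It is not my actual script.
The names pyrex_modules and build are made up for illustration, and it
assumes a flat directory (no packages) with each module copied to a
.pyx file of the same name, since that is the extension Pyrex's
distutils command looks for.  The Pyrex and py2exe imports sit inside
build() because they only work where those packages are installed.

```python
import os


def pyrex_modules(py_files, entry_script):
    """Return (module_name, pyx_path) pairs for every file except the entry script.

    Assumes a flat layout (no packages) and that each module will be copied
    to a .pyx file of the same name for Pyrex to translate.
    """
    entry = os.path.normpath(entry_script)
    pairs = []
    for path in py_files:
        if os.path.normpath(path) == entry:
            continue  # the start-up script stays plain Python for py2exe
        base = os.path.splitext(path)[0]
        pairs.append((os.path.basename(base), base + ".pyx"))
    return pairs


def build(py_files, entry_script):
    """Run both steps. Needs Pyrex and py2exe installed, so imports are local."""
    from distutils.core import setup
    from distutils.extension import Extension
    from Pyrex.Distutils import build_ext
    import py2exe  # importing it registers the "py2exe" distutils command

    # Step 1: translate each .pyx to C and compile it to an extension module.
    extensions = [Extension(name, [pyx])
                  for name, pyx in pyrex_modules(py_files, entry_script)]
    setup(
        script_args=["build_ext", "--inplace"],
        cmdclass={"build_ext": build_ext},
        ext_modules=extensions,
    )
    # Step 2: wrap the start-up script in an EXE (use console= for a console app).
    setup(script_args=["py2exe"], windows=[entry_script])
```

Calling build(["main.py", "gui.py", "net.py"], "main.py") would compile
gui and net with Pyrex and then hand main.py to py2exe.  Whether the
resulting .pyd files end up inside library.zip depends on the py2exe
version and options, so check the dist directory afterwards.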

Note on Pyrex: it can't handle "import *" or the augmented assignment
construct "x += 1".  So you may have to do a little bit of recoding,
but that is all the recoding I found I had to do.
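
To find the spots that need recoding, something along these lines may
help.  It is only a sketch: find_pyrex_blockers is a made-up name, it
relies on the standard ast module of a present-day Python, and it
parses with the syntax of whatever Python runs it, so older sources may
need adjusting first.  It looks for just the two constructs mentioned
above and nothing else.

```python
import ast


def find_pyrex_blockers(source, filename="<string>"):
    """Return sorted (line_number, description) pairs for the two constructs
    named above: augmented assignment and "from ... import *"."""
    problems = []
    for node in ast.walk(ast.parse(source, filename)):
        if isinstance(node, ast.AugAssign):
            # Covers +=, -=, *= and the other augmented operators.
            problems.append((node.lineno, "augmented assignment"))
        elif isinstance(node, ast.ImportFrom):
            if any(alias.name == "*" for alias in node.names):
                problems.append(
                    (node.lineno, "from %s import *" % (node.module or "."))
                )
    return sorted(problems)


if __name__ == "__main__":
    sample = "from os import *\nx = 0\nx += 1\ny = x + 1\n"
    for lineno, what in find_pyrex_blockers(sample):
        print("line %d: %s" % (lineno, what))
```

Each reported "x += 1" can be rewritten as "x = x + 1", and each star
import replaced with the explicit names the module actually uses.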

If you would like to discuss this constructively, email me at 
apocalypznow at yahoo.com .  I welcome a good programmer's discussion.





