Python obfuscation

Ben Sizer kylotan at gmail.com
Tue Nov 15 06:06:31 EST 2005


Mike Meyer wrote:
> "Ben Sizer" <kylotan at gmail.com> writes:
> > Decompyle (http://www.crazy-compilers.com/decompyle/ ) claims to be
> > pretty advanced. I don't know if you can download it any more to test
> > this claim though.
>
> No, it doesn't claim to be advanced. It claims to be good at what it
> does. There's no comparison with other decompilers at all. In
> particular, this doesn't give you any idea whether or not similar
> products exist for x86 or 68k binaries.

That's irrelevant. We don't require a citable source to prove the
simple fact that x86 binaries do not by default contain symbol names
whereas Python .pyc and .pyo files do contain them. So any
decompilation of (for example) C++ code is going to lose all the
readable qualities, as well as missing any symbolic constants,
enumerations, templated classes and functions, macros, #includes,
inlined functions, typedefs, some distinctions between array indexing
and pointer arithmetic, which inner scope a simple data variable is
declared in, distinctions between free functions, member functions
declared as not 'thiscall' and static member functions, const
declarations, etc.
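
To illustrate the point about symbol names, here is a rough sketch. The
source snippet, the file name "example.py" and the names check_license,
key and licensed are all invented for the example. It compiles a small
function, pushes the result through marshal (the serialisation that
compiled files are built on) and then reads the names back out:

```python
import marshal

# Hypothetical source, invented purely for this illustration.
source = '''
def check_license(key):
    licensed = key == "SECRET"
    if licensed:
        return "full version"
    return "demo"
'''

# Compile to a code object, much as happens when a compiled file is written.
module_code = compile(source, "example.py", "exec")

# Round-trip through marshal to mimic what survives on disk.
restored = marshal.loads(marshal.dumps(module_code))

# The function's own code object is stored among the module's constants.
func_code = next(c for c in restored.co_consts if hasattr(c, "co_varnames"))

print(func_code.co_name)      # the function's name
print(func_code.co_varnames)  # its argument and local variable names
```

The function name and every local variable name come back intact, which
is exactly the information a stripped x86 binary would not give you.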

> I've dealt with some very powerful disassemblers and
> decompilers, but none of them worked on modern architectures.

You can definitely extract something useful from them, but without
symbol names you're going to have to be working with a good debugger
and a decent knowledge of how to use it if you want to find anything
specific. Whereas Python could give you something pretty obvious such
as:

   6 LOAD_FAST                0 (licensed)
   9 JUMP_IF_FALSE            9 (to 21)
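
A listing like that is easy to produce yourself. The sketch below uses a
made-up function (run and its licensed argument are assumptions for the
example, not code from any real product) and captures the output of
dis.dis(). The exact opcodes and offsets differ between Python versions,
so don't expect the listing to match the two lines above exactly:

```python
import dis
import io

# Hypothetical function, invented for this illustration.
def run(licensed):
    if licensed:
        return "full"
    return "demo"

# Capture the disassembly in a string instead of printing it directly.
buf = io.StringIO()
dis.dis(run, file=buf)
listing = buf.getvalue()

# The variable name appears in the listing next to the load instruction.
print(listing)
```

Whatever the version, the name 'licensed' shows up right next to the
instruction that tests it, which tells anyone reading where to look.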

> > It makes a lot of difference when you're hunting around for something
> > or trying to understand a bit of code. Python bytecode (or at least,
> > the output from dis) is also a lot more straightforward than x86 or 68K
> > assembly to decipher.
>
> I'm not convinced of the former. I'll grant you half of the
> latter. 68K machine language is fairly straightforward. On the other
> hand, it also seems to be irrelevant. What platform are you
> developing for that's still based on the 68K?

There are several embedded/portable devices based on 68K derivatives.
That's not really the point though. I chose 68K assembly as an example
as it's considered to be simpler than x86 assembly, yet it's still
significantly more complex and less readable than the output from
dis.dis().

> > The term I should
> > probably have used was 'distribute usable additional copies'.
>
> My question still stands, though, and remains unanswered.

I'm not really sure where we're going here. I have made the point that
I am not obliged to make my software copyable to facilitate your right
to copy it any more than any given newspaper is obliged to publish your
words to facilitate your right to free speech. Therefore I find it hard to
see how anything is infringing upon a right here.

My interest lies in being able to use encrypted data (where 'data' can
also include parts of the code) so that the data can only be read by my
Python program, and specifically by a single instance of that program.
You would be able to make a backup copy (or 20), you could give the
whole lot to someone else, and so on. I would just like to make it so
that you can't stick the data file on BitTorrent and have the entire
world playing with data that was only purchased once.

> But we can be
> explicit if you want: How do you do that without requiring that your
> software be given special consideration in the disaster recovery and
> preparedness planning?

I should state that I am not at all claiming a "one size fits all"
policy for software development. Firstly, from a personal point of view
I am talking about simple consumer entertainment software which is not
mission critical or anything like it. For more important software,
there will surely be different expectations and requirements. In my
case, providing a free download of any lost executables or data upon
presentation of a legitimate license key should be adequate.

-- 
Ben Sizer.



