Python obfuscation

Anton Vredegoor anton.vredegoor at gmail.com
Fri Nov 18 09:56:38 EST 2005


Bengt Richter wrote:
> On Thu, 17 Nov 2005 10:53:24 -0800, aleax at mail.comcast.net (Alex Martelli) wrote:
>
> >Anton Vredegoor <anton.vredegoor at gmail.com> wrote:
> [...]
> >> The idea of using a webservice to hide essential secret parts of your
> >> application can only work well if one makes some random alterations to
> >> the results of the queries. Like GPS signals that are deliberately made
> >
> >I disagree on this general statement and I have already given two
> >counterexamples:
> I agree with your disagreement in general, but I think Antoon may be
> alluding to the "covert channel" problem, where sometimes randomization
> of an external observable is a defense. E.g., if a web site login process
> responds faster with a rejection of a bad user name (i.e. is not in the authorized
> user list) than it does for a valid user name and a bad password, the timing
> difference can be used over time to eke out the private user name list, and
> make subsequent password attacks that much easier.

Pardon me, but I'm Anton, not Antoon (well, maybe I am, but let's keep
this distinction in order to avoid mental hash collisions).

I agree with Alex and Bengt that my statement was too general, and I
even admit that as I wrote it down the thought of making it less
provocative crossed my mind. However, I felt safe because I wrote 'only
work *well*' rather than 'only work *if*', and what counts as working
well is open for discussion, isn't it? Further on in my post I wrote
something about adding random fluctuations making it harder to reverse
engineer a procedure, so I felt even safer. Not safe from Alex's
thorough analysis, though :-)

What was mostly on my mind (but I didn't mention it) is that for
something to be commercially viable there has to be some kind of
pricing strategy (NB in our current economic view of the world) where a
better-paying user gets a VIP interface and poor people get the
standard treatment.

Since one has to have the optimal result anyway in order to sell it to
the best payers, it would be impractical to recompute less accurate
values. Why not just add a random part to make the result less valuable
for the non-paying user? I'm thinking of things like specifying a
real-valued interval from which the user can extract data (this is also
a data compression method; see arithmetic coding for more info).
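Something along these lines (just a sketch; the interval width and the
paying flag are of course made up):

    import random

    def answer_for(exact_value, paying=False, width=0.05):
        # paying users get the exact value; everyone else gets a
        # widened interval that still contains it somewhere
        if paying:
            return (exact_value, exact_value)
        # put the true value at a random position inside the interval,
        # so the midpoint doesn't give it away
        offset = random.uniform(0.0, width)
        return (exact_value - offset, exact_value - offset + width)

    print(answer_for(3.14159, paying=True))
    print(answer_for(3.14159, paying=False))

The paying user recovers the exact number, while the non-paying user
only learns it to within `width', which is roughly the arithmetic-coding
idea of selling fewer bits of the answer.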

<snip>

> Which perhaps gets towards Antoon's point (or my projection thereof ;-) -- i.e.,
> that the answers provided in an experimental probe of an algorithm are "signal"
> for what you want to detect, and randomization may put noise in the signal to
> defeat detection (even though enough noise might make the algorithm output unsaleable ;-)
>

Yeah, sometimes people measure temperature fluctuations in the CPU in
order to get clues about how an algorithm works :-) But in fact my mind
works more like some intuitive device that suggests whether a point is
safe enough to post, without always thinking through all the details.

> >
> >a. a webservice which, for some amount X of money, gives an excellent
> >heuristic estimate of a good cutting-path for a woodcutting tool (for a
> >set of shapes to be cut out of standard-sized planks of wood by a
> >numerically driven cutter): this is a case where ESR, acting as a
> >consultant, advised his clients (who had developed a heuristic for this
> >task which saved a lot of wood compared to their competitors') to keep
> >their code closed-source, and it makes a good use case for the "hide
> >essential secret parts" in general;
> >

If the heuristic always gives the same answer to the same problem, it
is easier to predict the results. Oh no, now some mathematician will
surely prove me wrong :-)
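What I have in mind is something like this (pure speculation on my part;
it assumes the heuristic can produce several candidate cutting paths and
a waste() function scoring them, neither of which exists outside this
sketch):

    import random

    def obscured_answer(candidates, waste, tolerance=0.02):
        # pick randomly among solutions whose waste is within `tolerance'
        # of the best one, so repeated identical queries don't expose
        # the deterministic optimum
        best = min(waste(c) for c in candidates)
        near_optimal = [c for c in candidates
                        if waste(c) <= best * (1 + tolerance)]
        return random.choice(near_optimal)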

> >b. a (hypothetical) website that, given time-space coordinates (and some
> >amount Y of money), produces and returns weather predictions that are
> >better than those you can get from its competitors.
> >
> >It appears to me that any application of this kind could work well
> >without at all "making random alterations" to whatever.  Point is, if
> >you develop a better algorithm (or, more likely, heuristic) for good
> >solutions to such problems, or predictions of just about anything which
> >might have economic value to somebody, using a webservice to hide the
> >essential secret parts of your discovery is an option, and it might be a
> >preferable alternative to relying on patents (since software patents may
> >not be enforceable everywhere in the world, and even where they're
> >nominally enforceable it could prove problematic and costly to actually
> >deter all would-be competitors from undercutting you).  I do not see
> >anything in your post that contradicts this, except the bare unsupported
> >assertion that a webservice "can only work well if one makes random
> >alterations".
> Yes, IMO that was an overgeneralization of an idea that may however have
> some actual narrow applicability.

OK. Although it's a bit tricky to prove this with an example where the
randomness is already in the problem from the start (if one groups very
chaotic processes in the same category as random processes, of course).

> >> But the more one messes with the ideal output the more often the user
> >> will rather click another link. (or launch another satellite)
> >
> >Of course.  If my "better weather predictor" is in fact based not on
> >inventing some new algorithm/heuristic, but on having better or more
> >abundant raw data due to my private network of satellites or other
> >observation platforms, this doesn't change the economic situation by all
> >that much (except that patenting may not even be an option in the latter
> >case, if there's no patentable innovation in that private network); a
> >competitor *could* reach or surpass my predictions' quality by investing
> >enough to re-develop the heuristic or duplicate the sensors-network.
> >So, my pricing should probably take that risk into account.
> >
> >Deliberately giving predictions worse than I could have given, in this
> >context, seems a deliberate self-sabotage without any return.
> >

Not always; for example, with a gradient in user status according to
how much they pay. Note that I don't agree at all with such a practice,
but I'm trying to explain how money is made now instead of thinking
about how it should be made.

> >> what's the current exchange rate for clicks and dollars?
> >
> >As far as I know, it varies wildly depending on the context, but I
> >suspect you can find ranges of estimates on the web.
> >
> The growth of virtual worlds with virtual money and virtual/"real"
> currency exchange is interesting. People are actually making real
> money investing in and developing virtual real estate and selling
> virtual currency profits for real-world money ;-)
>

Yes. Someday our past will be just a variation from the ideal
development that was retroactively fitted to the state the future is
in.

 Nice to be speaking to you both.

Anton
