Pure Python Data Mangling or Encrypting

Sat Jun 27 16:16:16 EDT 2015

On Sat, Jun 27, 2015 at 11:35 AM, Steven D'Aprano <steve at pearwood.info> wrote:
> On Sun, 28 Jun 2015 01:09 am, Ian Kelly wrote:
>
>> On Sat, Jun 27, 2015 at 2:38 AM, Steven D'Aprano <steve at pearwood.info>
>> wrote:
>>> Can you [generic you] believe that attackers can *reliably* attack remote
>>> systems based on a 20µs timing difference? If you say "No", then you
>>> fail Security 101 and should step away from the computer until a security
>>> expert can be called in to review your code.
>>
>> Of course. I wouldn't bet the house on it, but with the proposed
>> substitution cipher system, I don't see why there would be any
>> measurable timing differences at all based on the choice of key.
>
> I wouldn't bet one wooden nickel on it. Not without a security audit of the
> application. And then what happens when the implementation changes and the
> audit is no longer valid?

I don't disagree about the security audit, although I think you'll
find that such things will require a greater investment of resources
than a wooden nickel.

> Despite his initial claim that he doesn't want to use AES because it's too
> slow implemented as pure Python, Randall has said that the application will
> offer AES encryption as an option.

Once again you're confusing what he said about the server with what he
said about the client. Just because he considers it too slow for data
mangling on the server doesn't make it too slow for any use.

>> The time to obfuscate a single byte is constant,
>
> Are you sure about that? Bet your house? How about your computer?
>
>
> # Python 3.3 on Linux, YMMV
>
> py> text = 'NOBODY expects the Spanish Inquisition!'*50000
> py> import string
> py> s = string.digits + string.ascii_letters
> py> t = (string.ascii_uppercase + string.digits[::-1] +
> ... string.ascii_lowercase)
> py> trans1 = str.maketrans('abcdef', 'fedcba')
> py> trans2 = str.maketrans(s, t)
> py> trans3 = str.maketrans('aZ', 'Za')
> py> with Stopwatch():
> ...     x = str.translate(text, trans1)
> ...
> time taken: 0.427513 seconds
> py> with Stopwatch():
> ...     x = str.translate(text, trans2)
> ...
> time taken: 0.228869 seconds
> py> with Stopwatch():
> ...     x = str.translate(text, trans3)
> ...
> time taken: 0.387105 seconds

Your examples are using partial keys of different sizes. It's hardly
surprising that the timing varies when you pass dicts of varying sizes
as the translation tables.
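
To make the size difference concrete, here is a small sketch that
rebuilds the three tables from your example and prints how many entries
each one holds. Nothing in it is new apart from the print call.

```python
import string

# Same alphabets as in the quoted example.
s = string.digits + string.ascii_letters
t = string.ascii_uppercase + string.digits[::-1] + string.ascii_lowercase

# str.maketrans returns a dict mapping code points to code points.
trans1 = str.maketrans('abcdef', 'fedcba')
trans2 = str.maketrans(s, t)
trans3 = str.maketrans('aZ', 'Za')

print(type(trans1).__name__, len(trans1), len(trans2), len(trans3))
# dict 6 62 2
```

Any character missing from the dict takes the lookup-failure path in
str.translate, so the three tables are not doing comparable work.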

py> import random, timeit
py> a = list(range(256))
py> b = random.sample(a, 256)
py> c = random.sample(a, 256)
py> d = random.sample(a, 256)
py> min(timeit.repeat("str.translate(text, a)",
...     "from __main__ import text, a", number=10, repeat=10))
0.9780099680647254
py> min(timeit.repeat("str.translate(text, b)",
...     "from __main__ import text, b", number=10, repeat=10))
0.9837233647704124
py> min(timeit.repeat("str.translate(text, c)",
...     "from __main__ import text, c", number=10, repeat=10))
0.9627216667868197
py> min(timeit.repeat("str.translate(text, d)",
...     "from __main__ import text, d", number=10, repeat=10))
0.9793561780825257
py> min(timeit.repeat("str.translate(text, c)",
...     "from __main__ import text, c", number=10, repeat=10))
0.9840573272667825

I ran it on c a second time to see if the 0.962 timing was systematic
or a fluke. The fact that c produced both the shortest and the longest
of the five timings, in only two runs, gives me confidence (for the
purpose of this discussion) that the variation seen in these timings is
random and not correlated with the keys used.
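
For anyone who wants to repeat the comparison with more keys, a rough
sketch follows. The helper names (make_key, time_key, compare_keys), the
seed, and the shortened text are my own choices for illustration, not
part of the proposed system. It reports the fastest and slowest timing
per key, so you can see whether the spread within one key is as large as
the spread between keys. It is a sanity check, not a substitute for the
audit discussed above.

```python
import random
import timeit

# Shorter than the text used above so the sketch runs quickly.
text = 'NOBODY expects the Spanish Inquisition!' * 5000


def make_key(rng):
    """Return a full 256-entry permutation table (a hypothetical key)."""
    return rng.sample(range(256), 256)


def time_key(key, number=5, repeat=5):
    """Return `repeat` timings, each covering `number` translations."""
    return timeit.repeat(lambda: text.translate(key),
                         number=number, repeat=repeat)


def compare_keys(n_keys=4, seed=0):
    """Time several random full-size keys; return (min, max) per key."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_keys):
        timings = time_key(make_key(rng))
        results.append((min(timings), max(timings)))
    return results


if __name__ == "__main__":
    for i, (lo, hi) in enumerate(compare_keys()):
        print(f"key {i}: min {lo:.4f}s  max {hi:.4f}s")
```

If the min-to-max range for a single key overlaps the ranges of the
other keys, the differences between keys are indistinguishable from
noise at this resolution.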