SHA-based subclass for random module
Trevor Perrin
trevp_spam at trevp.net
Wed Mar 24 01:21:03 EST 2004
Josiah Carlson wrote:
>
>> This would be easy to implement with Mersenne Twister too. Though it
>> would involve some rewriting.
>>
>> Do you think this is worthwhile?
>
>
> I don't know, but I'm leaning towards not really.
>
> def getrandstring(self, n):
> out = [struct.pack('>L', self.getrandbits(32)) \
> for i in xrange(0, n, 4)]
> out.extend([chr(self.getrandbits(4)) for i in xrange(n%4)])
> return ''.join(out)
>
> You have your getrandstring for any class that provides getrandbits, and
> is relatively optimized to deal with very long output strings.
If the generator is something like /dev/urandom or Win32's
CryptGenRandom, you don't want the repeated system call overhead of
dipping into it 32 bits at a time.
(I made the mistake of getting a byte at a time out of CryptGenRandom
once, for a few hundred bytes of RSA padding, and it was a big
performance hit).
> def getrandbits(self, n):
> out = 0
> for i in xrange(0, n+16, 16):
> out = out << 16 + struct.unpack('>H', self.getrandstring(2))
> return out >> (16-(n%16))
>
> Now you have getrandbits for any method that provides getrandstring,
But you don't need getrandbits().
Let's distinguish between the API and the SPI (service provider
interface). API is what users access, and the SPI is what new
generators implement.
I argue that getrandstring() makes more sense than getrandbits() in both
the API and SPI.
In the API, getrandstring() could be used for cryptographic keys, for
example. In contrast, getrandbits() is not a useful API function, since
the API already offers randint().
As for the SPI, currently there is random() and getrandbits(). These
require every new generator to be able to produce floats and
arbitrary-sized longs.
Most generators are oriented towards producing strings of bits or bytes
(M-T produces a sequence of 32-bit values, for example), so it would be
easier for them to implement getrandstring() and let the base class
handle conversion.
> On a relatively unrelated note, I'm leaning toward trying to get the
> benefits of both SHA and MT with the following, though I lack the heavy
> mathematical background to prove whether it is better than either.
I agree with Paul, there's no need to combine them. The only benefit of
MT vs. a SHA-based generator is speed.
Trevor
More information about the Python-list
mailing list