[Python-Dev] Fuzzing the Python standard library

Nick Coghlan ncoghlan at gmail.com
Mon Jul 30 11:11:12 EDT 2018


On 18 July 2018 at 17:49, Steve Holden <steve at holdenweb.com> wrote:
> On Tue, Jul 17, 2018 at 11:44 PM, Paul G <paul at ganssle.io> wrote:
>>
>> In many languages numeric types can't hold arbitrarily large values, and I
>> for one hadn't really previously recognized that if you read in a numeric
>> value with an exponent that it would be represented *exactly* in memory (and
>> thus one object with a very compact representation can take up huge amounts
>> of memory). It's also not *inconceivable* that under the hood Python would
>> represent fractions.Fraction("1.64E6646466664") "lazily" in some fashion so
>> that it did not consume all the memory on disk.
>>
> Sooner or later you are going to need the digits of the number to perform a
> computation. Exactly when would you propose the deferred evaluation should
> take place?
>
> There are already occasional inquiries about the effects of creation of such
> large numbers and their unexpected effects, so they aren't completely
> unknown. At the same time, this isn't exactly a mainstream "bug", as
> evidenced by the fact that such issues are relatively rare.

It does mean, though, that if Fraction is being used with untrusted
inputs, it makes sense to put a reasonable upper bound on permitted
exponents.

For example, the default decimal context caps expression results at an
exponent below 1 million (the default Emax is 999999):

  >>> +decimal.Decimal("1e999_999")
  Decimal('1E+999999')
  >>> +decimal.Decimal("1e1_000_000")
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  decimal.Overflow: [<class 'decimal.Overflow'>]

That's already large enough to result in an integer of roughly 415 kB
(one million decimal digits) that takes a minute or so for my machine
to create if I call int() on it.
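
As a rough check on that size, the arithmetic can be done without
building the integer at all. This is only a sketch: the helper name
estimated_int_bytes is made up for illustration, and it counts raw
binary digits only, so the actual CPython object will be somewhat
larger.

```python
import math


def estimated_int_bytes(exponent):
    """Approximate bytes of raw binary data needed to hold 10**exponent.

    Hypothetical helper, not part of any stdlib module. It ignores
    per-object overhead and the interpreter's internal digit layout.
    """
    bits = exponent * math.log2(10)  # each decimal digit needs ~3.32 bits
    return math.ceil(bits / 8)


# Largest exponent accepted by the default decimal context
size = estimated_int_bytes(999_999)
print(f"about {size / 1000:.0f} kB")
```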

So I think it does make sense to at least describe how to use the
decimal module to do some initial sanity checking on potentially
exponential inputs, even if the fractions module itself never gains
native support for processing untrusted inputs.
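
A minimal sketch of what such a check could look like follows. The
function name bounded_fraction and the limit of 1000 are my own
assumptions for illustration, not anything the fractions module
provides, and it only handles decimal-style strings (not the "3/4"
form that Fraction also accepts). It relies on Decimal parsing a
string into a coefficient and an exponent without expanding the
value, so the exponent can be inspected cheaply before Fraction ever
sees it.

```python
from decimal import Decimal, InvalidOperation
from fractions import Fraction

MAX_EXPONENT = 1000  # assumed limit; pick one that suits the application


def bounded_fraction(text, max_exponent=MAX_EXPONENT):
    """Convert a decimal string to a Fraction, rejecting huge exponents.

    Hypothetical helper for untrusted input. The caller is still
    expected to bound the length of the input string itself.
    """
    try:
        value = Decimal(text)  # stores coefficient and exponent only
    except InvalidOperation:
        raise ValueError(f"not a decimal number: {text!r}") from None

    if not value.is_finite():
        # Fraction cannot represent NaN or infinities anyway
        raise ValueError(f"not a finite number: {text!r}")

    exponent = value.as_tuple().exponent
    if abs(exponent) > max_exponent:
        # A large positive exponent inflates the numerator, a large
        # negative one inflates the denominator.
        raise ValueError(f"exponent {exponent} exceeds +/-{max_exponent}")

    return Fraction(value)


print(bounded_fraction("1.5"))
try:
    bounded_fraction("1.64E6646466664")
except ValueError as exc:
    print("rejected:", exc)
```

The right limit depends on the application; the point is only that
the check happens before any arbitrarily large integer is created.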

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

