Validating regexp

Cameron Simpson cs at cskk.id.au
Wed Aug 9 20:33:42 EDT 2017


On 09Aug2017 10:46, Jon Ribbens <jon+usenet at unequivocal.eu> wrote:
>On 2017-08-09, Cameron Simpson <cs at cskk.id.au> wrote:
>> On 08Aug2017 17:31, Jon Ribbens <jon+usenet at unequivocal.eu> wrote:
>>>... but bear in mind, there have been ways of doing denial-of-service
>>>attacks with valid-but-nasty regexps in the past, and I wouldn't want
>>>to rely on there not being any now.
>>
>> The ones I've seen still require some input length (I'm thinking exponential
>> rematch backoff stuff here). I suspect that if your test query matches the RE
>> against a fixed empty string it is hard to be exploited. i.e. I think most of
>> this stuff isn't expensive in terms of compiling the regexp but in
>> executing it against text.
>
>Well yes, but presumably if the OP is receiving regexps from users
>they will be executed against text sooner or later.

True, but the OP (Larry) was after validation.
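
As a rough sketch of what I mean by validation, assuming Python's re module
(the function name is_valid_regexp is made up for illustration): compile the
pattern, then run it once against a fixed empty string as suggested above.

```python
import re

def is_valid_regexp(pattern):
    """Return True if `pattern` compiles as a Python regexp.

    Hypothetical helper for illustration: it checks syntax only.
    """
    try:
        compiled = re.compile(pattern)
    except re.error:
        # Syntax problems (unbalanced parentheses, nothing to repeat, ...)
        return False
    # Exercise the compiled pattern against a fixed empty string. With no
    # input to backtrack over, this should be cheap even for nasty patterns.
    compiled.search("")
    return True

print(is_valid_regexp(r"\d+"))        # True
print(is_valid_regexp("(unclosed"))   # False
```

This only tells you the regexp is syntactically acceptable; it says nothing
about how expensive it will be when run against real text later.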

The risk then depends on the degree of trust in the user. If the user is a 
random person-from-the-internets, sure there's a risk there. However, if the 
regexp is part of some internal configuration being set up by trusted people 
(e.g. staff pursuing a goal) then validation will normally be enough.

Of course, that is a call for Larry to make, not us, but it needs to be borne
in mind by him.

Cheers,
Cameron Simpson <cs at cskk.id.au> (formerly cs at zip.com.au)


