Validating regexp

Jon Ribbens jon+usenet at unequivocal.eu
Wed Aug 9 06:46:19 EDT 2017


On 2017-08-09, Cameron Simpson <cs at cskk.id.au> wrote:
> On 08Aug2017 17:31, Jon Ribbens <jon+usenet at unequivocal.eu> wrote:
>>On 2017-08-08, Chris Angelico <rosuav at gmail.com> wrote:
>>> On Wed, Aug 9, 2017 at 2:57 AM, Larry Martell <larry.martell at gmail.com> wrote:
>>>> Yeah, it does not throw for 'A|B|' - but MySQL chokes on it with 'empty
>>>> subexpression for regexp'. I'd like to flag it before it gets to SQL.
>>>
>>> Okay, so your definition of validity is "what MySQL will accept". In
>>> that case, I'd feed it to MySQL and see if it accepts it. Regexps are
>>> sufficiently varied that you really need to use the same engine for
>>> validation as for execution.
>>
>>... but bear in mind, there have been ways of doing denial-of-service
>>attacks with valid-but-nasty regexps in the past, and I wouldn't want
>>to rely on there not being any now.
>
> The ones I've seen still require some input length (I'm thinking of
> exponential backtracking here). I suspect that if your test query matches
> the RE against a fixed empty string, it is hard to exploit. I.e. I think
> most of this stuff isn't expensive in terms of compiling the regexp but in
> executing it against text.

Well yes, but presumably if the OP is receiving regexps from users
they will be executed against text sooner or later.
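
For what it's worth, here is a minimal sketch of both points, using
Python's own re module only. It does not talk to MySQL at all, and the
helper names compiles_in_python and time_match are made up for the
illustration. The pattern (a+)+$ is just a commonly cited example of a
nested quantifier, not something taken from the OP's data.

```python
import re
import time


def compiles_in_python(pattern):
    """Return True if Python's own `re` engine accepts the pattern.

    Hypothetical helper: this says nothing about what MySQL will accept.
    """
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


def time_match(pattern, text):
    """Return the seconds Python's `re` spends matching pattern against text.

    Hypothetical helper: compilation is done before the clock starts, so
    only the execution cost is measured.
    """
    compiled = re.compile(pattern)
    start = time.perf_counter()
    compiled.match(text)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Python accepts the trailing empty alternative, so this check cannot
    # stand in for MySQL's verdict on the same pattern.
    print(compiles_in_python("A|B|"))

    # A nested-quantifier pattern: cheap to compile and cheap to test
    # against "", but the cost can grow quickly with the length of a
    # non-matching input in a backtracking engine.
    nasty = r"(a+)+$"
    print(time_match(nasty, ""))
    for n in (10, 14, 18):
        print(n, time_match(nasty, "a" * n + "b"))
```

The timings will vary by machine and Python version, so I have not
quoted any. The point is only that a check against the empty string
tells you the pattern compiles, not how it behaves on real text.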


