[issue39280] Don't allow datetime parsing to accept non-Ascii digits

Steven D'Aprano report at bugs.python.org
Thu Jan 9 17:38:43 EST 2020


Steven D'Aprano <steve+python at pearwood.info> added the comment:

> If user code were to check for uniqueness of a datetime by comparing it as a string, this is where an attacker could fool this logic, by using a non-Ascii digit.

To me, this seems like a pretty thin justification for calling this a security vulnerability.

Using the exact same reasoning, one could argue that "If user code were to check for uniqueness of a float by comparing it as a string, this is where an attacker could fool this logic, by using leading or trailing spaces, extra non-significant digits, upper- or lowercase 'E', etc."

py> float(" +00012.145000000000000099999e00 ") == float("12.145")
True

Referring specifically to strptime(), there are many format codes which break uniqueness by allowing optional leading zeroes, and month names are case-insensitive, e.g. %b accepts 'jAn' as well as 'Jan'.

https://docs.python.org/3/library/datetime.html#strftime-strptime-behavior
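
A minimal sketch of the strptime() point, assuming the default C (English) locale for month names. The format string and sample inputs are made up for illustration:

```python
from datetime import datetime

# Hypothetical inputs: three different spellings of the same date.
FORMAT = "%d %b %Y"
spellings = ["01 Jan 2020", "1 Jan 2020", "1 jAn 2020"]

parsed = [datetime.strptime(s, FORMAT) for s in spellings]

# The strings all differ, yet every one parses to the same datetime,
# so comparing the raw strings is not a reliable uniqueness check.
all_distinct_strings = len(set(spellings)) == len(spellings)
all_same_datetime = len(set(parsed)) == 1
print(all_distinct_strings, all_same_datetime)
```

Comparing the parsed datetime objects rather than the input strings avoids the problem, whichever digits or letter case the input used.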

As for the inconsistency, I think that's an argument for being less strict, not more, and allowing non-ASCII digits in more places, not just the first. Why shouldn't (let's say) a Bengali user specify the day of the month using Bengali digits?
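
For context, a small sketch showing that the built-in int() already accepts decimal digits from other scripts. It deliberately says nothing about what strptime() does with such input, since that is the behaviour under discussion and may differ between versions. The variable names are illustrative only:

```python
import unicodedata

# Bengali digits for "12": U+09E7 (one) followed by U+09E8 (two).
bengali_twelve = "\u09e7\u09e8"

# The string consists of decimal digits, but none of them are ASCII.
is_decimal = bengali_twelve.isdecimal()
is_ascii = bengali_twelve.isascii()

# int() converts Unicode decimal digits, not only ASCII 0-9.
value = int(bengali_twelve)

# unicodedata.decimal() gives the numeric value of each digit.
digit_values = [unicodedata.decimal(ch) for ch in bengali_twelve]
print(is_decimal, is_ascii, value, digit_values)
```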

----------
nosy: +steven.daprano

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue39280>
_______________________________________

