[issue11909] Doctest sees directives in strings when it should only see them in comments
Devin Jeanpierre
report at bugs.python.org
Fri Jun 24 10:40:45 CEST 2011
Devin Jeanpierre <jeanpierreda at gmail.com> added the comment:
You're right, and good catch. If a doctest starts with a "#coding:XXX" line, the current patch should break.
One option is to replace the call to tokenize.tokenize with a call to tokenize._tokenize and pass 'utf-8' as a parameter. Downside: that's a private and undocumented API. The alternative is to manually add a coding line that specifies UTF-8, so that any coding line in the doctest would be ignored.
My preferred option would be to add the ability to read unicode to the tokenize API, and then use that. I can file a separate ticket if that sounds good, since it's probably useful to others too.
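To illustrate what tokenizing unicode directly might look like, here is a rough sketch. It assumes a current Python 3 release, where tokenize.generate_tokens accepts a readline callable that returns str. The helper name comment_tokens is made up for this example and is not part of doctest or of the patch.

```python
import io
import tokenize


def comment_tokens(source):
    """Return the text of every real comment token in `source` (a str).

    Hypothetical helper for illustration only. Because the input is
    already a str, no bytes-level encoding detection is involved.
    """
    readline = io.StringIO(source).readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(readline)
        if tok.type == tokenize.COMMENT
    ]


# The directive-looking text inside the string literal is a STRING token,
# so only the trailing comment is reported.
example = 'print("# doctest: +SKIP")  # doctest: +ELLIPSIS\n'
print(comment_tokens(example))  # ['# doctest: +ELLIPSIS']
```

Only the comment tokens would then need to be searched for directives, which is the behaviour this issue asks for.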
One other thing to be worried about -- I'm not sure how doctest would treat tests with leading "coding:XXX" lines. I'd hope it ignores them; if it doesn't, this gets more complicated and the options above wouldn't work.
I'll see if I have the time to play around with this (and add more test cases to the patch, correspondingly) this weekend.
----------
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue11909>
_______________________________________