[Python-checkins] gh-107559: Argument Clinic: complain about non-ASCII chars in param docstrings (#107560)

erlend-aasland webhook-mailer at python.org
Wed Aug 2 08:40:28 EDT 2023


https://github.com/python/cpython/commit/9ff7b4af137b8028b04b52addf003c4b0607113b
commit: 9ff7b4af137b8028b04b52addf003c4b0607113b
branch: main
author: Erlend E. Aasland <erlend at python.org>
committer: erlend-aasland <erlend.aasland at protonmail.com>
date: 2023-08-02T12:40:23Z
summary:

gh-107559: Argument Clinic: complain about non-ASCII chars in param docstrings (#107560)

Previously, only function docstrings were checked for non-ASCII characters.
Also, improve the warn() message.

Co-authored-by: Alex Waygood <Alex.Waygood at Gmail.com>

files:
M Lib/test/test_clinic.py
M Tools/clinic/clinic.py

diff --git a/Lib/test/test_clinic.py b/Lib/test/test_clinic.py
index 6bdc571dd4d5a..6f53036366891 100644
--- a/Lib/test/test_clinic.py
+++ b/Lib/test/test_clinic.py
@@ -1427,6 +1427,25 @@ def test_scaffolding(self):
         actual = stdout.getvalue()
         self.assertEqual(actual, expected)
 
+    def test_non_ascii_character_in_docstring(self):
+        block = """
+            module test
+            test.fn
+                a: int
+                    á param docstring
+            docstring fü bár baß
+        """
+        with support.captured_stdout() as stdout:
+            self.parse(block)
+        # The line numbers are off; this is a known limitation.
+        expected = dedent("""\
+            Warning on line 0:
+            Non-ascii characters are not allowed in docstrings: 'á'
+            Warning on line 0:
+            Non-ascii characters are not allowed in docstrings: 'ü', 'á', 'ß'
+        """)
+        self.assertEqual(stdout.getvalue(), expected)
+
 
 class ClinicExternalTest(TestCase):
     maxDiff = None
diff --git a/Tools/clinic/clinic.py b/Tools/clinic/clinic.py
index 5f7d41e441551..1f461665003c8 100755
--- a/Tools/clinic/clinic.py
+++ b/Tools/clinic/clinic.py
@@ -785,9 +785,6 @@ def docstring_for_c_string(
             self,
             f: Function
     ) -> str:
-        if re.search(r'[^\x00-\x7F]', f.docstring):
-            warn("Non-ascii character appear in docstring.")
-
         text, add, output = _text_accumulator()
         # turn docstring into a properly quoted C string
         for line in f.docstring.split('\n'):
@@ -5266,6 +5263,11 @@ def state_parameter_docstring_start(self, line: str) -> None:
 
     def docstring_append(self, obj: Function | Parameter, line: str) -> None:
         """Add a rstripped line to the current docstring."""
+        matches = re.finditer(r'[^\x00-\x7F]', line)
+        if offending := ", ".join([repr(m[0]) for m in matches]):
+            warn("Non-ascii characters are not allowed in docstrings:",
+                 offending)
+
         docstring = obj.docstring
         if docstring:
             docstring += "\n"
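
The per-line check that this change adds to docstring_append() can be tried in isolation. The following is only a rough sketch, not the code in clinic.py: the names NON_ASCII and find_non_ascii are hypothetical helpers introduced here for illustration, and print() stands in for Argument Clinic's warn(), so the "Warning on line N:" prefix is not reproduced.

```python
import re

# Same character class as in the diff: anything outside the 7-bit ASCII range.
NON_ASCII = re.compile(r'[^\x00-\x7F]')


def find_non_ascii(line: str) -> str:
    """Return the repr of each non-ASCII character in `line`, comma-separated.

    Hypothetical helper; returns an empty string when the line is pure ASCII.
    """
    return ", ".join(repr(m[0]) for m in NON_ASCII.finditer(line))


# The docstring lines used by the new test in test_clinic.py.
for line in ["a: int", "á param docstring", "docstring fü bár baß"]:
    if offending := find_non_ascii(line):
        # print() is a stand-in for clinic's warn() here.
        print("Non-ascii characters are not allowed in docstrings:", offending)
```

Because the check runs on every appended line, parameter docstrings and function docstrings go through the same path, which is why the test expects one warning per offending line.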


