[Python-checkins] bpo-29237: Create enum for pstats sorting options (GH-5103)

Ethan Furman webhook-mailer at python.org
Thu Jan 25 23:49:59 EST 2018


https://github.com/python/cpython/commit/863b1e4d0e95036bca4e97c1b8b2ca72c19790fb
commit: 863b1e4d0e95036bca4e97c1b8b2ca72c19790fb
branch: master
author: mwidjaja <mwidj at yahoo.com>
committer: Ethan Furman <ethan at stoneleaf.us>
date: 2018-01-25T20:49:56-08:00
summary:

bpo-29237: Create enum for pstats sorting options (GH-5103)

files:
A Misc/NEWS.d/next/Library/2018-01-04-14-45-33.bpo-29237.zenYA6.rst
M Doc/library/profile.rst
M Lib/pstats.py
M Lib/test/test_pstats.py
M Misc/ACKS
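
A minimal sketch of how the new enum is meant to be used, based on the
documentation changes in the diff below. The workload function `busy` and
the helper `profile_report` are hypothetical names introduced only for this
illustration; they are not part of the commit.

```python
import cProfile
import io
import pstats
from pstats import SortKey


def busy(n):
    """Hypothetical workload that gives the profiler something to record."""
    return sum(i * i for i in range(n))


def profile_report(sort_key, limit=5):
    """Profile busy() and return the printed statistics as a string."""
    profiler = cProfile.Profile()
    profiler.enable()
    busy(10_000)
    profiler.disable()

    # Send the report to an in-memory stream instead of stdout.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.strip_dirs().sort_stats(sort_key).print_stats(limit)
    return stream.getvalue()


# The enum member and the equivalent string select the same sort order.
report = profile_report(SortKey.CUMULATIVE)
print(report)
```

Passing the string 'cumulative' to the same helper should produce a report
with the same "Ordered by" header, since sort_stats() accepts both forms.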

diff --git a/Doc/library/profile.rst b/Doc/library/profile.rst
index 48426a00c9a..a6dc56f43cb 100644
--- a/Doc/library/profile.rst
+++ b/Doc/library/profile.rst
@@ -139,6 +139,7 @@ The :mod:`pstats` module's :class:`~pstats.Stats` class has a variety of methods
 for manipulating and printing the data saved into a profile results file::
 
    import pstats
+   from pstats import SortKey
    p = pstats.Stats('restats')
    p.strip_dirs().sort_stats(-1).print_stats()
 
@@ -148,14 +149,14 @@ entries according to the standard module/line/name string that is printed. The
 :meth:`~pstats.Stats.print_stats` method printed out all the statistics.  You
 might try the following sort calls::
 
-   p.sort_stats('name')
+   p.sort_stats(SortKey.NAME)
    p.print_stats()
 
 The first call will actually sort the list by function name, and the second call
 will print out the statistics.  The following are some interesting calls to
 experiment with::
 
-   p.sort_stats('cumulative').print_stats(10)
+   p.sort_stats(SortKey.CUMULATIVE).print_stats(10)
 
 This sorts the profile by cumulative time in a function, and then only prints
 the ten most significant lines.  If you want to understand what algorithms are
@@ -164,20 +165,20 @@ taking time, the above line is what you would use.
 If you were looking to see what functions were looping a lot, and taking a lot
 of time, you would do::
 
-   p.sort_stats('time').print_stats(10)
+   p.sort_stats(SortKey.TIME).print_stats(10)
 
 to sort according to time spent within each function, and then print the
 statistics for the top ten functions.
 
 You might also try::
 
-   p.sort_stats('file').print_stats('__init__')
+   p.sort_stats(SortKey.FILENAME).print_stats('__init__')
 
 This will sort all the statistics by file name, and then print out statistics
 for only the class init methods (since they are spelled with ``__init__`` in
 them).  As one final example, you could try::
 
-   p.sort_stats('time', 'cumulative').print_stats(.5, 'init')
+   p.sort_stats(SortKey.TIME, SortKey.CUMULATIVE).print_stats(.5, 'init')
 
 This line sorts statistics with a primary key of time, and a secondary key of
 cumulative time, and then prints out some of the statistics. To be specific, the
@@ -250,12 +251,13 @@ functions:
    without writing the profile data to a file::
 
       import cProfile, pstats, io
+      from pstats import SortKey
       pr = cProfile.Profile()
       pr.enable()
       # ... do something ...
       pr.disable()
       s = io.StringIO()
-      sortby = 'cumulative'
+      sortby = SortKey.CUMULATIVE
       ps = pstats.Stats(pr, stream=s).sort_stats(sortby)
       ps.print_stats()
       print(s.getvalue())
@@ -361,60 +363,65 @@ Analysis of the profiler data is done using the :class:`~pstats.Stats` class.
    .. method:: sort_stats(*keys)
 
       This method modifies the :class:`Stats` object by sorting it according to
-      the supplied criteria.  The argument is typically a string identifying the
-      basis of a sort (example: ``'time'`` or ``'name'``).
+      the supplied criteria.  The argument can be either a string or a SortKey
+      enum identifying the basis of a sort (example: ``'time'``, ``'name'``,
+      ``SortKey.TIME`` or ``SortKey.NAME``).  A SortKey enum argument has an
+      advantage over a string argument in that it is more robust and less
+      error prone.
 
       When more than one key is provided, then additional keys are used as
       secondary criteria when there is equality in all keys selected before
-      them.  For example, ``sort_stats('name', 'file')`` will sort all the
-      entries according to their function name, and resolve all ties (identical
-      function names) by sorting by file name.
-
-      Abbreviations can be used for any key names, as long as the abbreviation
-      is unambiguous.  The following are the keys currently defined:
-
-      +------------------+----------------------+
-      | Valid Arg        | Meaning              |
-      +==================+======================+
-      | ``'calls'``      | call count           |
-      +------------------+----------------------+
-      | ``'cumulative'`` | cumulative time      |
-      +------------------+----------------------+
-      | ``'cumtime'``    | cumulative time      |
-      +------------------+----------------------+
-      | ``'file'``       | file name            |
-      +------------------+----------------------+
-      | ``'filename'``   | file name            |
-      +------------------+----------------------+
-      | ``'module'``     | file name            |
-      +------------------+----------------------+
-      | ``'ncalls'``     | call count           |
-      +------------------+----------------------+
-      | ``'pcalls'``     | primitive call count |
-      +------------------+----------------------+
-      | ``'line'``       | line number          |
-      +------------------+----------------------+
-      | ``'name'``       | function name        |
-      +------------------+----------------------+
-      | ``'nfl'``        | name/file/line       |
-      +------------------+----------------------+
-      | ``'stdname'``    | standard name        |
-      +------------------+----------------------+
-      | ``'time'``       | internal time        |
-      +------------------+----------------------+
-      | ``'tottime'``    | internal time        |
-      +------------------+----------------------+
+      them.  For example, ``sort_stats(SortKey.NAME, SortKey.FILENAME)`` will
+      sort all the entries according to their function name, and resolve all
+      ties (identical function names) by sorting by file name.
+
+      For the string argument, abbreviations can be used for any key names, as
+      long as the abbreviation is unambiguous.
+
+      The following are the valid string and SortKey arguments:
+
+      +------------------+---------------------+----------------------+
+      | Valid String Arg | Valid enum Arg      | Meaning              |
+      +==================+=====================+======================+
+      | ``'calls'``      | SortKey.CALLS       | call count           |
+      +------------------+---------------------+----------------------+
+      | ``'cumulative'`` | SortKey.CUMULATIVE  | cumulative time      |
+      +------------------+---------------------+----------------------+
+      | ``'cumtime'``    | N/A                 | cumulative time      |
+      +------------------+---------------------+----------------------+
+      | ``'file'``       | N/A                 | file name            |
+      +------------------+---------------------+----------------------+
+      | ``'filename'``   | SortKey.FILENAME    | file name            |
+      +------------------+---------------------+----------------------+
+      | ``'module'``     | N/A                 | file name            |
+      +------------------+---------------------+----------------------+
+      | ``'ncalls'``     | N/A                 | call count           |
+      +------------------+---------------------+----------------------+
+      | ``'pcalls'``     | SortKey.PCALLS      | primitive call count |
+      +------------------+---------------------+----------------------+
+      | ``'line'``       | SortKey.LINE        | line number          |
+      +------------------+---------------------+----------------------+
+      | ``'name'``       | SortKey.NAME        | function name        |
+      +------------------+---------------------+----------------------+
+      | ``'nfl'``        | SortKey.NFL         | name/file/line       |
+      +------------------+---------------------+----------------------+
+      | ``'stdname'``    | SortKey.STDNAME     | standard name        |
+      +------------------+---------------------+----------------------+
+      | ``'time'``       | SortKey.TIME        | internal time        |
+      +------------------+---------------------+----------------------+
+      | ``'tottime'``    | N/A                 | internal time        |
+      +------------------+---------------------+----------------------+
 
       Note that all sorts on statistics are in descending order (placing most
       time consuming items first), where as name, file, and line number searches
       are in ascending order (alphabetical). The subtle distinction between
-      ``'nfl'`` and ``'stdname'`` is that the standard name is a sort of the
-      name as printed, which means that the embedded line numbers get compared
-      in an odd way.  For example, lines 3, 20, and 40 would (if the file names
-      were the same) appear in the string order 20, 3 and 40.  In contrast,
-      ``'nfl'`` does a numeric compare of the line numbers.  In fact,
-      ``sort_stats('nfl')`` is the same as ``sort_stats('name', 'file',
-      'line')``.
+      ``SortKey.NFL`` and ``SortKey.STDNAME`` is that the standard name is a
+      sort of the name as printed, which means that the embedded line numbers
+      get compared in an odd way.  For example, lines 3, 20, and 40 would (if
+      the file names were the same) appear in the string order 20, 3 and 40.
+      In contrast, ``SortKey.NFL`` does a numeric compare of the line numbers.
+      In fact, ``sort_stats(SortKey.NFL)`` is the same as
+      ``sort_stats(SortKey.NAME, SortKey.FILENAME, SortKey.LINE)``.
 
       For backward-compatibility reasons, the numeric arguments ``-1``, ``0``,
       ``1``, and ``2`` are permitted.  They are interpreted as ``'stdname'``,
@@ -424,6 +431,8 @@ Analysis of the profiler data is done using the :class:`~pstats.Stats` class.
 
       .. For compatibility with the old profiler.
 
+      .. versionadded:: 3.7
+         Added the SortKey enum.
 
    .. method:: reverse_order()
 
diff --git a/Lib/pstats.py b/Lib/pstats.py
index b7a20542a39..1b57d26b5a5 100644
--- a/Lib/pstats.py
+++ b/Lib/pstats.py
@@ -25,9 +25,32 @@
 import time
 import marshal
 import re
+from enum import Enum
 from functools import cmp_to_key
 
-__all__ = ["Stats"]
+__all__ = ["Stats", "SortKey"]
+
+
+class SortKey(str, Enum):
+    CALLS = 'calls', 'ncalls'
+    CUMULATIVE = 'cumulative', 'cumtime'
+    FILENAME = 'filename', 'module'
+    LINE = 'line'
+    NAME = 'name'
+    NFL = 'nfl'
+    PCALLS = 'pcalls'
+    STDNAME = 'stdname'
+    TIME = 'time', 'tottime'
+
+    def __new__(cls, *values):
+        obj = str.__new__(cls)
+
+        obj._value_ = values[0]
+        for other_value in values[1:]:
+            cls._value2member_map_[other_value] = obj
+        obj._all_values = values
+        return obj
+
 
 class Stats:
     """This class is used for creating reports from data generated by the
@@ -49,13 +72,14 @@ class Stats:
 
     The sort_stats() method now processes some additional options (i.e., in
     addition to the old -1, 0, 1, or 2 that are respectively interpreted as
-    'stdname', 'calls', 'time', and 'cumulative').  It takes an arbitrary number
-    of quoted strings to select the sort order.
+    'stdname', 'calls', 'time', and 'cumulative').  It takes an arbitrary
+    number of quoted strings or SortKey enum members to select the sort
+    order.
 
-    For example sort_stats('time', 'name') sorts on the major key of 'internal
-    function time', and on the minor key of 'the name of the function'.  Look at
-    the two tables in sort_stats() and get_sort_arg_defs(self) for more
-    examples.
+    For example sort_stats('time', 'name') or sort_stats(SortKey.TIME,
+    SortKey.NAME) sorts on the major key of 'internal function time', and on
+    the minor key of 'the name of the function'.  Look at the two tables in
+    sort_stats() and get_sort_arg_defs(self) for more examples.
 
     All methods return self, so you can string together commands like:
         Stats('foo', 'goo').strip_dirs().sort_stats('calls').\
@@ -161,7 +185,6 @@ def dump_stats(self, filename):
               "ncalls"    : (((1,-1),              ), "call count"),
               "cumtime"   : (((3,-1),              ), "cumulative time"),
               "cumulative": (((3,-1),              ), "cumulative time"),
-              "file"      : (((4, 1),              ), "file name"),
               "filename"  : (((4, 1),              ), "file name"),
               "line"      : (((5, 1),              ), "line number"),
               "module"    : (((4, 1),              ), "file name"),
@@ -202,12 +225,19 @@ def sort_stats(self, *field):
                        0:  "calls",
                        1:  "time",
                        2:  "cumulative"}[field[0]] ]
+        elif len(field) >= 2:
+            for arg in field[1:]:
+                if type(arg) != type(field[0]):
+                    raise TypeError("Can't have mixed argument type")
 
         sort_arg_defs = self.get_sort_arg_defs()
+
         sort_tuple = ()
         self.sort_type = ""
         connector = ""
         for word in field:
+            if isinstance(word, SortKey):
+                word = word.value
             sort_tuple = sort_tuple + sort_arg_defs[word][0]
             self.sort_type += connector + sort_arg_defs[word][1]
             connector = ", "
diff --git a/Lib/test/test_pstats.py b/Lib/test/test_pstats.py
index 566b3eab771..f835ce309a6 100644
--- a/Lib/test/test_pstats.py
+++ b/Lib/test/test_pstats.py
@@ -2,6 +2,7 @@
 from test import support
 from io import StringIO
 import pstats
+from pstats import SortKey
 
 
 
@@ -33,6 +34,47 @@ def test_add(self):
         stats = pstats.Stats(stream=stream)
         stats.add(self.stats, self.stats)
 
+    def test_sort_stats_int(self):
+        valid_args = {-1: 'stdname',
+                      0: 'calls',
+                      1: 'time',
+                      2: 'cumulative'}
+        for arg_int, arg_str in valid_args.items():
+            self.stats.sort_stats(arg_int)
+            self.assertEqual(self.stats.sort_type,
+                             self.stats.sort_arg_dict_default[arg_str][-1])
+
+    def test_sort_stats_string(self):
+        for sort_name in ['calls', 'ncalls', 'cumtime', 'cumulative',
+                    'filename', 'line', 'module', 'name', 'nfl', 'pcalls',
+                    'stdname', 'time', 'tottime']:
+            self.stats.sort_stats(sort_name)
+            self.assertEqual(self.stats.sort_type,
+                             self.stats.sort_arg_dict_default[sort_name][-1])
+
+    def test_sort_stats_partial(self):
+        sortkey = 'filename'
+        for sort_name in ['f', 'fi', 'fil', 'file', 'filen', 'filena',
+                           'filenam', 'filename']:
+            self.stats.sort_stats(sort_name)
+            self.assertEqual(self.stats.sort_type,
+                             self.stats.sort_arg_dict_default[sortkey][-1])
+
+    def test_sort_stats_enum(self):
+        for member in SortKey:
+            self.stats.sort_stats(member)
+            self.assertEqual(
+                    self.stats.sort_type,
+                    self.stats.sort_arg_dict_default[member.value][-1])
+
+    def test_sort_starts_mix(self):
+        self.assertRaises(TypeError, self.stats.sort_stats,
+                          'calls',
+                          SortKey.TIME)
+        self.assertRaises(TypeError, self.stats.sort_stats,
+                          SortKey.TIME,
+                          'calls')
+
 
 if __name__ == "__main__":
     unittest.main()
diff --git a/Misc/ACKS b/Misc/ACKS
index 900604ca4c6..20210e8089b 100644
--- a/Misc/ACKS
+++ b/Misc/ACKS
@@ -1706,6 +1706,7 @@ Jeff Wheeler
 Christopher White
 David White
 Mats Wichmann
+Marcel Widjaja
 Truida Wiedijk
 Felix Wiemann
 Gerry Wiener
diff --git a/Misc/NEWS.d/next/Library/2018-01-04-14-45-33.bpo-29237.zenYA6.rst b/Misc/NEWS.d/next/Library/2018-01-04-14-45-33.bpo-29237.zenYA6.rst
new file mode 100644
index 00000000000..f903aa7001b
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2018-01-04-14-45-33.bpo-29237.zenYA6.rst
@@ -0,0 +1 @@
+Create enum for pstats sorting options
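
The behaviour added to sort_stats() in the Lib/pstats.py hunk above can be
sketched as follows: several keys of one kind (all strings or all SortKey
members) are accepted, while mixing the two kinds raises TypeError. This is
an illustration under the assumption that a small profile of the built-in
sorted() is representative; the variable names are hypothetical.

```python
import cProfile
import pstats
from pstats import SortKey

# Record a tiny profile so that a Stats object can be built from it.
profiler = cProfile.Profile()
profiler.runcall(sorted, [3, 1, 2])
stats = pstats.Stats(profiler)

# Primary key: internal time; secondary key: function name.
stats.sort_stats(SortKey.TIME, SortKey.NAME)
enum_order = stats.sort_type

# The same request expressed with strings.
stats.sort_stats('time', 'name')
string_order = stats.sort_type

# Mixing a string with a SortKey member is rejected by the new type check.
try:
    stats.sort_stats('calls', SortKey.TIME)
    mixed_error = None
except TypeError as exc:
    mixed_error = str(exc)

print(enum_order)
print(string_order)
print(mixed_error)
```

The sort_type attribute holds the human-readable description that
print_stats() shows after "Ordered by:", so comparing it is a cheap way to
confirm that both spellings resolve to the same sort order.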


