[Matplotlib-users] Plotting Lists of Strings has high CPU

Jody Klymak jklymak at uvic.ca
Thu Oct 25 19:58:45 EDT 2018


Perhaps it’s not surprising that this hasn’t been optimized, because most folks don’t have that many categories. If you have an actual use case for that many categories, submitting a bug report on GitHub would be great.

Cheers,   Jody

> On Oct 25, 2018, at  16:47 PM, Douglas Clowes <douglas.clowes at gmail.com> wrote:
> 
> > Strings are now treated as “categories” rather than cast to floats, and are plotted in the order received.
> 
> > https://matplotlib.org/gallery/lines_bars_and_markers/categorical_variables.html
> 
> > Cheers,   Jody
> 
> Thanks for that Jody, I did just "get lucky".
> 
> Some profiling shows that the high CPU usage associated with this operation is at least partly avoidable.
> 
> The majority of the CPU time, according to:
>   python3 -m cProfile -s time plotit.py -s|head -n20
> is in or under StrCategoryFormatter._text, which seems to be called far more often than I would expect: on the order of the number of categories squared in my samples, with 40K calls for 100 categories, and for 1000 categories 4M calls on mpl 2.2 and 6M on mpl 3.0. That seems high.
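
A rough check of that scaling, using only the call counts quoted above, might look like the sketch below. It assumes the 100-category count comes from the same mpl 2.2 run as the 1000-category one, and the helper name scaling_exponent is made up for illustration.

```python
import math

# Call counts reported in this thread (number of categories -> _text calls).
# Assumption: both figures are from mpl 2.2.
reported = {100: 40_000, 1000: 4_000_000}


def scaling_exponent(n1, c1, n2, c2):
    """Estimate k in calls ~ n**k from two (n, calls) measurements."""
    return math.log(c2 / c1) / math.log(n2 / n1)


(n1, c1), (n2, c2) = sorted(reported.items())
k = scaling_exponent(n1, c1, n2, c2)

print(f"estimated exponent: {k:.2f}")
print(f"calls per n**2: {c1 / n1**2:.1f} and {c2 / n2**2:.1f}")
```

An exponent near 2 with a roughly constant calls-per-n**2 ratio would point to quadratic growth rather than anything steeper.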
> 
> Within the _text function in 2.2, the most expensive operation is the NumPy version test, which is re-evaluated on every call even though its result never changes. Its cost can be significantly reduced by hoisting that constant expression to module level, with a simple change like:
> 
> diff --git a/lib/matplotlib/category.py b/lib/matplotlib/category.py
> index b135bff1c..89b1c5bd9 100644
> --- a/lib/matplotlib/category.py
> +++ b/lib/matplotlib/category.py
> @@ -28,6 +28,8 @@ import matplotlib.ticker as ticker
>  # np 1.6/1.7 support
>  from distutils.version import LooseVersion
>  
> +NP_PRE_1_7_0 = LooseVersion(np.__version__) < LooseVersion('1.7.0')
> +
>  VALID_TYPES = tuple(set(six.string_types +
>                          (bytes, six.text_type, np.str_, np.bytes_)))
>  
> @@ -158,7 +160,7 @@ class StrCategoryFormatter(ticker.Formatter):
>      def _text(value):
>          """Converts text values into `utf-8` or `ascii` strings
>          """
> -        if LooseVersion(np.__version__) < LooseVersion('1.7.0'):
> +        if NP_PRE_1_7_0:
>              if (isinstance(value, (six.text_type, np.unicode))):
>                  value = value.encode('utf-8', 'ignore').decode('utf-8')
>          if isinstance(value, (np.bytes_, six.binary_type)):
> 
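
The effect of hoisting a constant check out of a hot function can be tried in isolation with a sketch like the following. It is a stdlib-only stand-in, not matplotlib's code: FAKE_NP_VERSION, parse, text_per_call and text_hoisted are hypothetical names, and parse is a simplified replacement for LooseVersion that only handles plain dotted integers.

```python
import timeit

FAKE_NP_VERSION = "1.15.2"  # hypothetical stand-in for np.__version__


def parse(version):
    """Turn '1.15.2' into (1, 15, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def text_per_call(value):
    # Re-evaluates the version comparison on every call.
    if parse(FAKE_NP_VERSION) < parse("1.7.0"):
        value = value.encode("utf-8", "ignore").decode("utf-8")
    return value


# Evaluated once at import time, in the spirit of NP_PRE_1_7_0 in the diff.
PRE_1_7_0 = parse(FAKE_NP_VERSION) < parse("1.7.0")


def text_hoisted(value):
    # Only tests a precomputed boolean on each call.
    if PRE_1_7_0:
        value = value.encode("utf-8", "ignore").decode("utf-8")
    return value


calls = 100_000
t_per_call = timeit.timeit(lambda: text_per_call("label"), number=calls)
t_hoisted = timeit.timeit(lambda: text_hoisted("label"), number=calls)
print(f"per-call check: {t_per_call:.3f}s, hoisted check: {t_hoisted:.3f}s")
```

Both functions return the same value; only the amount of work done on each call differs, so the printed timings give a feel for how much a repeated version parse can cost when the function is called millions of times.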
