[issue26002] make statistics.median_grouped more efficient

Sun Jan 3 09:21:58 EST 2016

New submission from Upendra Kumar:

statistics.median_grouped currently uses cf=data.index(x) to find the leftmost position of x in data ( line number 445 ). Here, data.index() has linear time complexity, which may not be good for long lists.

But, here since the list 'data' is sorted beforehand, we can use binary search ( bisect_left() ) to find the leftmost occurrence of 'x' in 'data'. 

Similarly, for counting number of occurrence of 'x' in data after sorting, we can find the position of rightmost occurrence of 'x' in data[l1....len(data)], where l1 is position of leftmost occurrence of 'x' (line number 447 )

----------
components: Library (Lib)
files: median_grouped.patch
keywords: patch
messages: 257419
nosy: upendra-k14
priority: normal
severity: normal
status: open
title: make statistics.median_grouped more efficient
type: performance
versions: Python 3.4, Python 3.5, Python 3.6
Added file: http://bugs.python.org/file41484/median_grouped.patch

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue26002>
_______________________________________