[Python-ideas] Run length encoding

Neal Fultz nfultz at gmail.com
Sat Jun 10 23:14:06 EDT 2017


Thanks, that's cool.   Maybe the root problem is that the docs aren't using
the right words when I google. Run-length-encoding is particularly relevant
for spare matrices, but there's probably a library for those as well.  On
the data science side of things, there's a few hundred R packages that use
it there[1].

Can you explicate the guiding principle a bit? I'm perplexed that python
would come with zip and gzip but not rle.

[1] :
https://github.com/search?l=R&q=user%3Acran+rle&type=Code&utf8=%E2%9C%93

On Sat, Jun 10, 2017 at 7:59 PM, David Mertz <mertz at gnosis.cx> wrote:

> Here's a one-line version:
>
> from itertools import groupby
> rle_encode = lambda it: (
>     (l[0],len(l)) for g in groupby(it) for l in [list(g[1])])
>
> Since "not every one line function needs to be in the standard library" is
> a guiding principle of Python, and even moreso of `itertools`, probably
> this is a recipe in the documentation at most.  Or maybe it would have a
> home in `more_itertools`.
>
>
> On Sat, Jun 10, 2017 at 7:20 PM, Neal Fultz <nfultz at gmail.com> wrote:
>
>> Hello python-ideas,
>>
>> I am very new to this, but on a different  forum and after a couple
>> conversations, I really wished Python came with run-length encoding
>> built-in; after all, it ships with zip, which is much more complicated :)
>>
>> The general idea is to be able to go back and forth between two
>> representations of a sequence:
>>
>> [1,1,1,1,2,3,4,4,3,3,3]
>>
>> and
>>
>> [(1, 4), (2, 1), (3, 1), (4, 2), (3, 3)]
>>
>> where the first element is the data element, and the second is how many
>> times it is repeated.
>>
>> I wrote an encoder/decoder in about 20 lines (
>> https://github.com/nfultz/rle.py/blob/master/rle.py ) and would like to
>> offer it for the next version; I think it might fit in nicely in the
>> itertools module, for example. I am curious about your thoughts.
>>
>> Best,
>>
>> -Neal
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>>
>
>
> --
> Keeping medicines from the bloodstreams of the sick; food
> from the bellies of the hungry; books from the hands of the
> uneducated; technology from the underdeveloped; and putting
> advocates of freedom in prisons.  Intellectual property is
> to the 21st century what the slave trade was to the 16th.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20170610/606f0270/attachment-0001.html>


More information about the Python-ideas mailing list