dictionary comparing int keys and joins their values if two key are within a certain distance

Wed Nov 8 12:11:58 EST 2017

Daiyue Weng wrote:

> I have a nested dictionary of defaultdict(dict) whose sub dict have int
> keys and lists (list of ints) as values,
> 
> 'A' = {2092: [1573], 2093: [1576, 1575], 2094: [1577], 2095: [1574]}
> 'B' = {2098: [1], 2099: [2, 3], 2101: [4], 2102: [5]}
> 'C' = {2001: [6], 2003: [7, 8], 2004: [9], 2005: [10]}
> 
> I join two list values if their sub keys differ by 1 under the same
> outer key, e.g. A, and put the joined lists into another list. So this
> list will look like,
> 
> [1573, 1576, 1575, 1577, 1574]
> [1, 2, 3]
> [4, 5]
> [6]
> [7, 8, 9, 10]
> 
> here since 2092, 2093, 2094, 2095 are consecutive by 1, their values are
> put into a list [1573, 1576, 1575, 1577, 1574].
> 
> based on Detecting consecutive integers in a list
> <https://stackoverflow.com/questions/2361945/detecting-consecutive-integers-in-a-list>,
> a simple solution can be built when the distance between two
> neighbouring sub keys is set to 1.
> 
> from itertools import groupby
> from operator import itemgetter
> 
> results = []
> for key, sub_dict in d.items():
>     sub_dict_keys = sorted(sub_dict.keys())
>     # Consecutive keys share the same (index - key) difference.
>     for k, g in groupby(enumerate(sub_dict_keys), lambda ix: ix[0] - ix[1]):
>         consecutive_keys = list(map(itemgetter(1), g))
>         val_list = []
> 
>         for dict_key in consecutive_keys:
>             val_list.extend(sub_dict[dict_key])
> 
>         results.append(val_list)
> print(results)
> 
> However, the code only works when the difference between two keys is
> 1. I am wondering how to make it account for an arbitrary distance,
> e.g. when the distance between two consecutive keys is less than or
> equal to 2 or 3. For example, with the distance set to 2 and
> 
> 'A' = {2092: [1573], 2093: [1576, 1575], 2095: [1577], 2097: [1574]}
> 'B' = {2098: [1], 2099: [2, 3], 2101: [4], 2102: [5]}
> 'C' = {2001: [6], 2003: [7, 8], 2008: [9], 2009: [10]}
> 
> the result list will look like,
> 
> [1573, 1576, 1575, 1577, 1574]
> [1, 2, 3, 4, 5]
> [6, 7, 8]
> [9, 10]

There's a lot of noise in your question. The actual problem has nothing to
do with dictionaries, it's about dividing a list into groups.

The answer boils down to building the groups by identifying the gaps between
them instead of calculating a key that is common to the group. To find a gap
you need two adjacent items. You can get those by combining itertools.tee()
and zip(), or by constructing a stateful key function that remembers the
previous value, but I prefer to start from scratch, with the grouped()
generator replacing itertools.groupby():

$ cat group_neighbors_simple.py
from itertools import groupby

def grouped(items, is_neighbor):
    items = iter(items)
    try:
        prev = next(items)
    except StopIteration:
        return
    group = [prev]
    for value in items:
        if is_neighbor(prev, value):
            group.append(value)
        else:
            yield group
            group = [value]
        prev = value
    yield group

sample = [1, 1, 2, 4, 6, 7, 6]
print([
    [v for i, v in group] for key, group in
    groupby(enumerate(sample), lambda iv: iv[0] - iv[1])
])

print(list(grouped(sample, lambda a, b: b - a == 1)))
print(list(grouped(sample, lambda a, b: b - a == 2)))
print(list(grouped(sample, lambda a, b: abs(b - a) < 2)))
$ python3 group_neighbors_simple.py 
[[1], [1, 2], [4], [6, 7], [6]]
[[1], [1, 2], [4], [6, 7], [6]]
[[1], [1], [2, 4, 6], [7], [6]]
[[1, 1, 2], [4], [6, 7, 6]]
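
To tie this back to the original question, here is a minimal sketch of how
grouped() could be applied to the nested dictionary. The join_values() helper
and its max_distance parameter are names made up for this illustration and do
not appear in the original post; grouped() is repeated so that the script runs
on its own. The order of the result assumes that dicts preserve insertion
order (Python 3.7 and later).

```python
from collections import defaultdict


def grouped(items, is_neighbor):
    # Same generator as in group_neighbors_simple.py above.
    items = iter(items)
    try:
        prev = next(items)
    except StopIteration:
        return
    group = [prev]
    for value in items:
        if is_neighbor(prev, value):
            group.append(value)
        else:
            yield group
            group = [value]
        prev = value
    yield group


def join_values(d, max_distance):
    # Hypothetical helper: for each outer key, merge the value lists of
    # sub keys whose distance to the previous sub key is <= max_distance.
    results = []
    for sub_dict in d.values():
        sub_keys = sorted(sub_dict)
        for keys in grouped(sub_keys, lambda a, b: b - a <= max_distance):
            results.append([v for k in keys for v in sub_dict[k]])
    return results


# Sample data from the second example in the question.
d = defaultdict(dict)
d["A"] = {2092: [1573], 2093: [1576, 1575], 2095: [1577], 2097: [1574]}
d["B"] = {2098: [1], 2099: [2, 3], 2101: [4], 2102: [5]}
d["C"] = {2001: [6], 2003: [7, 8], 2008: [9], 2009: [10]}

print(join_values(d, 2))
```

With max_distance set to 2 this prints the four lists given for the second
example in the question. Because the sub keys are sorted first, b - a is never
negative, so no abs() is needed in the neighbor test.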