Mapping with continguous ranges of keys

Fri Dec 16 13:39:34 EST 2016

On Thursday, 15 December 2016 17:06:39 UTC, Steve D'Aprano  wrote:
> I have some key:value data where the keys often are found in contiguous
> ranges with identical values. For example:
> 
> {1: "foo",
>  2: "foo",
>  3: "foo",
>  # same for keys 4 through 99
>  100: "foo",
>  101: "bar",
>  102: "bar",
>  103: "foobar",
>  104: "bar",
>  105: "foo", 
> }
> ...
> -- 
> Steve

All the answers seem to rely on in-memory solutions.  But isn't the problem a classic data design problem (cf Codd) with two tables.

CREATE TABLE keys   (id    INTEGER NOT NULL PRIMARY KEY,
                     kkey  INTEGER,
                     UNIQUE (kkey) );
                        ## eg id = 999, kkey=101
CREATE TABLE values (id    INTEGER NOT NULL PRIMARY KEY,
                     k_id  INTEGER,
                     value VARCHAR,
                     UNIQUE (k_id, value),
                     FOREIGN KEY (k_id) REFERENCES keys(id));
                        ## eg k_id = 999, value = "bar"

For example, Python/SQLITE can parse the list of key:value pairs.  key is looked up in keys -- and  a keys row is added if the key is new.  The keys id is saved.

k_id--value pair is looked up -- and a row is added if the pair is new.

Some of this can be simplified by relying on SQL to handle non-UNIQUE errors.
This approach may be slower than in-memory processing, but it has almost no database size limits.