[Python-ideas] dictionary constructor should not allow duplicate keys

Chris Angelico rosuav at gmail.com
Wed May 4 10:54:22 EDT 2016


On Thu, May 5, 2016 at 12:36 AM, David Mertz <mertz at gnosis.cx> wrote:
> The thing that nudged me to -1 is code something like the following, which
> I've actually written for completely valid use-cases:
>
> def code_generate(data_source, output=open('data.py','w')):
>     """Write data to loadable Python module
>
>     data_source may contain multiple occurrences of each key.
>
>     Later occurrences are "better" than earlier ones
>
>     """
>     print("data = {", file=output)
>     for key, val in data_source:
>         print("    %s: %s," % (key, val), file=output)
>     print("}", file=output)
>
>
> I really don't want the Python interpreter to complain about my perfectly
> valid 'data.py' module that I auto-generated knowing full well it might have
> duplicate literal keys.  Yes, of course I could write a more complex
> function that explicitly pruned keys if they were duplicates before writing
> to 'data.py'.

Or, much more simply, you could just dump it straight out like this:

print("data = dict(%r)" % list(data_source), file=output)

If the dict display starts rejecting duplicates, these kinds of code
generators should be easily tweakable. The only risk would be if
duplicates can *theoretically* happen, but are incredibly rare, such
that the code runs for years without anyone noticing.
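
To make that concrete, here is a rough sketch of the tweaked generator. It assumes data_source yields (key, value) pairs whose repr() round-trips through the interpreter, and the io.StringIO buffer and the exec() call are only stand-ins for writing and then importing 'data.py':

```python
import io

def code_generate(data_source, output):
    """Write data_source (an iterable of (key, value) pairs) as a loadable module."""
    # dict() accepts a list of pairs and lets later pairs overwrite earlier
    # ones, so the generated source contains no dict display at all.
    print("data = dict(%r)" % list(data_source), file=output)

# Stand-in for the real output file (assumption for this sketch).
buf = io.StringIO()
code_generate([("a", 1), ("b", 2), ("a", 3)], buf)
generated = buf.getvalue()

# Stand-in for "import data": execute the generated source.
namespace = {}
exec(generated, namespace)
loaded = namespace["data"]  # the later ("a", 3) pair wins over ("a", 1)
```

The later occurrence of each key still wins, which matches the "later occurrences are better" rule in the original docstring.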

Rejecting duplicate keys is a way of catching compile-time-detectable errors. Python has
generally been pretty helpful with that kind of thing, even to the
point of special-casing print-used-as-a-statement to tell you to put
parens around it. While I fully understand that there are reasons
against doing this (including "that's the job of linters, not the core
language"), I do _not_ understand the apparent attitude that it's a
fundamentally bad thing to do. In most cases where a dict display is
used, it's intended to have unique keys.
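
For what it's worth, the check is cheap to do outside the compiler too. Below is a minimal sketch of what a linter-style pass might look like, using the stdlib ast module. The function name find_duplicate_keys is made up for this example, and it only looks at constant keys (names and other expressions are skipped), so it is an illustration of the idea rather than a proposal for how the core would implement it:

```python
import ast

def find_duplicate_keys(source):
    """Return (line, key) for each constant key repeated within one dict display."""
    duplicates = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Dict):
            continue
        seen = set()
        for key in node.keys:
            # key is None for ** unpacking; non-constant keys can't be
            # compared without running the code, so skip them too.
            if not isinstance(key, ast.Constant):
                continue
            if key.value in seen:
                duplicates.append((key.lineno, key.value))
            seen.add(key.value)
    return duplicates

sample = 'data = {\n    "a": 1,\n    "b": 2,\n    "a": 3,\n}\n'
found = find_duplicate_keys(sample)  # the repeated "a" on line 4 is reported
```

Because keys are compared by value, 1 and 1.0 count as the same key here, which is also how the resulting dict would treat them.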

ChrisA

