[Python-Dev] [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7)

Nick Coghlan ncoghlan at gmail.com
Sun Apr 19 01:06:41 CEST 2009


Mart Sõmermaa wrote:
> On Sat, Apr 18, 2009 at 3:41 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> Yep - Guido has pointed out in a few different API design discussions
>> that a boolean flag that is almost always set to a literal True or False
>> is a good sign that there are two functions involved rather than just
>> one. There are exceptions to that guideline (e.g. the reverse argument
>> for sorted and list.sort), but they aren't common, and even when they do
>> crop up, making them keyword-only arguments is strongly recommended.
> 
> As you yourself previously noted -- "it is often
> better to use *args for the two positional arguments - it avoids
> accidental name conflicts between the positional arguments and arbitrary
> keyword arguments" -- kwargs may cause name conflicts.

Despite what I said earlier, it is probably OK to use named parameters
on the function in this case, especially since you have 3 optional
arguments that someone may want to specify independently of each other.
If someone really wants to add a query parameter to their URL that
conflicts with one of the function parameter names then they can pass
them in the same way they would pass in parameters that don't meet the
rules for a Python identifier (i.e. using the explicit params dictionary).

Something that can be done to even further reduce the chance of
conflicts is to prefix the function parameter names with underscores:

  def add_query_params(_url, _dups, _params, _sep, **kwargs)

That said, I'm starting to wonder if an even better option may be to
just drop the kwargs support from the function and require people to
always supply a parameters dictionary. That would simplify the signature
to the quite straightforward:

  def add_query_params(url, params, allow_dups=True, sep='&')

The "keyword arguments as query parameters" style would still be
supported via dict's own constructor:

>>> add_query_params('foo', dict(bar='baz'))
'foo?bar=baz'

>>> add_query_params('http://example.com/a/b/c?a=b', dict(b='d'))
'http://example.com/a/b/c?a=b&b=d'

>>> add_query_params('http://example.com/a/b/c?a=b&c=q',
... dict(a='b', b='d', c='q'))
'http://example.com/a/b/c?a=b&c=q&a=b&c=q&b=d'

>>> add_query_params('http://example.com/a/b/c?a=b', dict(a='c', b='d'))
'http://example.com/a/b/c?a=b&a=c&b=d'

This also makes the transition to a different container type (such as
OrderedDict) cleaner, since you will already be constructing a separate
object to hold the new parameters.

> But I also agree, that the current proliferation of positional args is ugly.
> 
> add_query_params_no_dups() would be suboptimal though, as there are
> currently three different ways to handle the duplicates:
> * allow duplicates everywhere (True),
> * remove duplicate *values* for the same key (False),
> * behave like dict.update -- remove duplicate *keys*, unless
> explicitly passed a list (None).

So if we went the multiple functions route, we would have at least:

add_query_params_allow_duplicates()
add_query_params_ignore_duplicate_items()
add_query_params_ignore_duplicate_keys()

I agree that isn't a good option, but mapping True/False/None to those
specific behaviours also seems rather arbitrary (specifically, it is
difficult to remember which of "allow_dups=False" and "allow_dups=None"
means to ignore any duplicate keys and which means to ignore only
duplicate items). It also doesn't provide a clear mechanism for
extension (e.g. what if someone wanted duplicate params to trigger an
exception?)

Perhaps the extra argument should just be a key/value pair filtering
function, and we provide functions for the three existing behaviours
(i.e. allow_duplicates(), ignore_duplicate_keys(),
ignore_duplicate_items()) in the urllib.parse module.

> (See the documentation at
> http://github.com/mrts/qparams/blob/bf1b29ad46f9d848d5609de6de0bfac1200da310/qparams.py
> ).

Note that your implementation and docstring currently conflict with each
other - the docstring says "pass them via a dictionary in second
argument:" but the dictionary is currently the third argument (the
docstring also later refers to passing OrderedDictionary as the second
argument).

Phrases like "second optional argument" and "fourth optional argument"
are also ambiguous - do they refer to "the second argument, which
happens to be optional" or to "the second of the optional arguments".
The fact that changing the function signature to disallow keyword
argument would make the optional parameters easier to refer to is a big
win in my book.

> Additionally, as proposed by Antoine Pitrou, removing keys could be implemented.
> 
> It feels awkward to start a PEP for such a marginal feature, but
> clearly a couple of enlightened design decisions are required.

Probably not a PEP - just a couple of documented design decisions on a
tracker item pointing to discussion on this list for the rationale.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------


More information about the Python-Dev mailing list