[Patches] [ python-Patches-1478993 ] Take advantage of BaseException/Exception split in cookielib

SourceForge.net noreply at sourceforge.net
Mon May 8 19:48:22 CEST 2006


Patches item #1478993, was opened at 2006-04-29 16:43
Message generated for change (Comment added) made by gbrandl
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1478993&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Library (Lib)
Group: Python 2.5
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: John J Lee (jjlee)
Assigned to: Nobody/Anonymous (nobody)
Summary: Take advantage of BaseException/Exception split in cookielib

Initial Comment:
The patch takes advantage of the exception hierarchy
reorganisation to remove some ugly code in cookielib. 
It clarifies a couple of exception messages, too.


----------------------------------------------------------------------

>Comment By: Georg Brandl (gbrandl)
Date: 2006-05-08 17:48

Message:
Logged In: YES 
user_id=849994

Applied in rev. 45940.

----------------------------------------------------------------------

Comment By: John J Lee (jjlee)
Date: 2006-04-30 14:38

Message:
Logged In: YES 
user_id=261020

(...and the new patch makes a tiny fix to a
slightly-inaccurate statement in the module docstring)

----------------------------------------------------------------------

Comment By: John J Lee (jjlee)
Date: 2006-04-30 14:35

Message:
Logged In: YES 
user_id=261020

<shrug> I still think there should be documented guidelines
on this for both stdlib users and contributors (though they
should not live in the cookielib docs, of course!).

I've added a new patch to this tracker item,
cookielib_baseexception_2.patch, that adds an __all__ and
uses the name _warn_unhandled_exception (in addition to the
original patch contents).  I verified that the set of
documented module globals is identical to those listed in
the new __all__.  Let me know if I should also rename all
other non-public module globals defined in cookielib to have
initial underscores.

Just for the record, here is the list of all module globals
NOT listed in the new __all__ (as determined by doing
dir(cookielib) prior to applying the patch, and removing the
items I didn't add to __all__, so includes some noise from
things like 'import sys'):

['Absent', 'DAYS', 'DEFAULT_HTTP_PORT', 'EPOCH_YEAR',
'ESCAPED_CHAR_RE', 'HEADER_ESCAPE_RE',
'HEADER_JOIN_ESCAPE_RE', 'HEADER_QUOTED_VALUE_RE',
'HEADER_TOKEN_RE', 'HEADER_VALUE_RE', 'HTTP_PATH_SAFE',
'IPV4_RE', 'ISO_DATE_RE', 'LOOSE_HTTP_DATE_RE',
'MISSING_FILENAME_TEXT', 'MONTHS', 'MONTHS_LOWER',
'STRICT_DATE_RE', 'TIMEZONE_RE', 'UTC_ZONES', 'WEEKDAY_RE',
'__builtins__', '__doc__', '__file__', '__name__',
'_str2time', '_threading', '_timegm', 'copy', 'cut_port_re',
'debug', 'deepvalues', 'domain_match', 'eff_request_host',
'escape_path', 'http2time', 'httplib', 'is_HDN',
'is_third_party', 'iso2time', 'join_header_words',
'liberal_is_HDN', 'logging', 'lwp_cookie_str', 'month',
'offset_from_tz_string', 'parse_ns_headers', 're', 'reach',
'request_host', 'request_path', 'request_port',
'split_header_words', 'sys', 'time', 'time2isoz',
'time2netscape', 'timegm', 'unmatched',
'uppercase_escaped_char', 'urllib', 'urlparse',
'user_domain_match', 'vals_sorted_by_key',
'warn_unhandled_exception']


----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2006-04-30 02:23

Message:
Logged In: YES 
user_id=357491

You should use an underscore if you don't have an __all__
defined.  This is mostly for protection for ``from cookielib
import *`` code.  But if you define an __all__ I think you
are fine.

You do not need to document that undocumented globals should
not be relied upon.  Yes, people should know better (I
personally got nailed by the Debian folk for an undocumented
function in site.py that I changed the parameters of).  But
the suggestions Neal and I are making are to protect people
who do their doc checking from the command-line and thus
just do ``import cookielib; dir(cookielib)``.  I know Python
is for use by adults, but sometimes going a small step to
protect the adolescents is also okay.  =)

----------------------------------------------------------------------

Comment By: John J Lee (jjlee)
Date: 2006-04-30 01:31

Message:
Logged In: YES 
user_id=261020

Hmm, perhaps by "do no worse" you simply meant not to rename
the function in this tracker item to a name not beginning
with an initial underscore (since that would introduce a new
non-public module global that does not begin with an
underscore).

In which case, sorry for the rant. :-)

My questions after the rant still stand. ;-)


----------------------------------------------------------------------

Comment By: John J Lee (jjlee)
Date: 2006-04-30 01:26

Message:
Logged In: YES 
user_id=261020

Bleh.

e). Stdlib users should assume all undocumented module
globals are not part of the public API (I guess this should
be go somewhere near the library reference introduction)


----------------------------------------------------------------------

Comment By: John J Lee (jjlee)
Date: 2006-04-30 01:22

Message:
Logged In: YES 
user_id=261020

So, you (Neil) agree with my three numbered action points
below?  You repeat my suggestion that we document this as if
it were a new suggestion; did you read my comment?

Sigh, sorry for being a little grumpy about this, but it's
hard to "do no worse" for a project if that project doesn't
seem itself to be very sure what it considers "worse":

While I must say I *agree* with you that such practices are
not good, if I as somebody apparently unusually inclined to
heavy use of underscores (even in most of my module names,
in library code) actually thought, however foolishly, that I
was *following stdlib conventions* by using *fewer*
underscores (for reasons I'll try to refrain from debating
further here), it does indeed seem pretty clear we're in
need of explicit documentation on this!  So your advice to
"do no worse" is a little annoying at this point... :-)

OK, so, what should get documented, specifically?  And where
should documentation for module authors go?

a). Stdlib module authors should always use underscores for
non-public module globals.

b). Don't know about this one: should non-legacy stdlib
modules (viz, those that follow rule a)) define __all__? 
(perhaps a point against doing this is that it may encourage
import * ?).

c). Stdlib packages should use __init__ to export public names.

d). Any discrepancy between __all__ and the API
documentation is a bug.

e). Stdlib users should assume all non- (I guess this should
be go somewhere near the library reference introduction)

Finally, how about my point 2?  Should I add underscores to
cookielib module globals I consider non-public (== all
undocumented module globals), or not?

Thanks for the feedback, both of you!


----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2006-04-30 00:37

Message:
Logged In: YES 
user_id=33168

John, at this point (2.x) we should at least do no worse. 
Don't export unnecessary vars in any new code.  We should
also start documenting the situation and work towards
improving it.  For 3k, we should do better and solidify the
rules and do massive cleanup (module by module).  This will
probably involve some arm twisting of Guido. :-)

----------------------------------------------------------------------

Comment By: John J Lee (jjlee)
Date: 2006-04-30 00:27

Message:
Logged In: YES 
user_id=261020

I changed the exception detail strings to use %r to get
quotes around the filename and quoted "bad" line read from
the file.  That makes it clearer what is part of the
explanatory English text and what is part of the filename
(or part of the quoted bad line, as the case may be). 
Filenames can and do contain spaces, commas, etc.

Your other point stunned me a bit.  I don't think it had
ever even really *occurred* to me that stdlib users might
consider stdlib module globals that are not documented as
public.  Ironically, I think that's because the code from
which cookielib derives is much stricter about this, all
modules starting with '_' and package __init__ exporting a
short list of names -- I guess I thought I was following
stdlib conventions by *not* adding initial underscores all
over the place.  Looking at some other stdlib code, I see
that underscores would have been more conventional after all.

Searching for reassurance, I discovered this from one of
your old python-dev summaries that confirms that
undocumented stdlib module globals are not considered part
of the module public interface:

http://www.python.org/dev/summary/2004-07-16_2004-07-31/#use-the-docs-to-know-what-the-public-api-is-people

e.g. from Tim Peters:

"""
As you noted later, it wasn't part of keyword's documented
interface, and you *always* act at your own risk when you go
beyond the docs.
"""

However, I don't see that this is explicitly documented,
which seems unfortunate to me (even though Tim's statement
is true regardless of any convention Python might have).

So, I guess I should:

1. Write something explicit about this (along the lines of
"Use undocumented module globals at your own risk") for the
stdlib library docs -- perhaps starting from Tim's post --
and submit that as a doc patch.

2. Leave all module global names in cookielib unchanged (so
people using those functions don't suffer gratuitous
breakage, even though any such people are asking for trouble
in the long run).  However, in the thread above, Michael
Hudson disagrees with that, and suggests all such module
globals be renamed.  So suggestions are welcome here on the
best course of action.

3. As you suggest, submit a patch to add an __all__ to
cookielib.


----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2006-04-29 21:12

Message:
Logged In: YES 
user_id=357491

Overall the patch looks fine (on vacation so not up for
applying and handling any possible failures so not going to
assign to myself).  But a question and a suggestion.

Why were the error strings changed to use the repr instead
of the string representation?  What does it buy you?

And if you are going to be changing the function name, you
might want to consider using a leading underscore to prevent
people from using it or getting exported.  Otherwise I would
define __all__ for the module.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1478993&group_id=5470


More information about the Patches mailing list