[Doc-SIG] suggestions for a PEP

Tony J Ibbs (Tibs) tony@lsl.co.uk
Wed, 21 Mar 2001 10:32:36 -0000


> >> Have we considered the classic spec for labels to appear left of a
> >> colon, namely RFC 822 (e-mail headers) and its kin?  I think that
> >> basically comes down to r'\w+(-\w+)*' as regex, generally specified
> >> [...]
>
> Fine with me.

I'm assuming we're talking about paragraph labels.

I think we should just go with the English definition of a word, which
means [-A-Za-z], and leave it at that. It is *meant* to look like a
word.

Just because there is a colon there doesn't mean it is related to other
fields that happen to end with a colon.

The current default labels are::

    label_dict = {"Arguments":"arguments",
                  "Author":"author",
                  "Authors":"author",
                  "Dedication":"dedication",
                  "History":"history",
                  "Raises":"raises",
                  "References":"references",
                  "Returns":"returns",
                  "Version":"version",
                 }

If one is translating PEPs (in a slightly modified format), then one
would instead use::

    builder.label_dict = {"PEP":"pep",
                          "Title":"title",
                          "Version":"version",
                          "Author":"author",
                          "Status":"status",
                          "Type":"type",
                          "Created":"created",
                          "Post-History":"post-history",
                          "Discussions-To":"discussions-to",
                          }

I think "keep it simple" is required here - these labels are meant to be
few and simple, so English words seem sensible to me. I would thus vote
against underlines and against digits.
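
To make that concrete, here is a rough sketch of what such a check could
look like in a validation phase. The names ``LABEL_RE`` and
``is_valid_label`` are made up for illustration, and the rule that
hyphens may only sit *between* runs of letters is my own assumption
(plain [-A-Za-z] would also let "-" on its own through):

```python
import re

# Hypothetical validation pattern: runs of ASCII letters joined by single
# hyphens. No digits and no underscores are allowed.
LABEL_RE = re.compile(r"[A-Za-z]+(?:-[A-Za-z]+)*")


def is_valid_label(text):
    """Return True if text looks like a plain English word label."""
    return LABEL_RE.fullmatch(text) is not None
```

With this, "Author" and "Post-History" would pass, while "Post_History"
and "Version2" would be rejected.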

Also, validation aside, I don't *use* a regular expression - I look for
the right "shape" of paragraph (1 line, colon in it) and check what is
to the left of the colon against the dictionary. From *my* point of view
the legitimate characters idea only comes in with a validation phase (of
course, it would be different for Edward).
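
In outline, that lookup is something like the following. This is only a
sketch of the idea, not the actual code: ``find_label`` is a hypothetical
name, the dictionary is a cut-down copy of the defaults above, and I am
assuming "1 line, colon in it" means the whole paragraph is a single line
that is split at its first colon:

```python
# Hypothetical subset of the default labels, for illustration only.
label_dict = {"Author": "author", "Authors": "author", "Returns": "returns"}


def find_label(paragraph, labels=label_dict):
    """Return (tag, rest) if the paragraph is a labelled one, else None."""
    lines = paragraph.strip().splitlines()
    # The right "shape": exactly one line, with a colon somewhere in it.
    if len(lines) != 1 or ":" not in lines[0]:
        return None
    # Split at the first colon and look the left-hand side up.
    left, rest = lines[0].split(":", 1)
    tag = labels.get(left.strip())
    if tag is None:
        return None
    return tag, rest.strip()
```

Note that no character class appears anywhere: an unknown word to the
left of the colon simply fails the dictionary lookup and the paragraph is
treated as ordinary text.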

> Basically re defines '\w' = '[0-9a-zA-Z_]'

Erm - basically it doesn't - it invokes "locales" which makes life more
complex (and I have no idea what sre does about '\w').
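
As an illustration of why '\w' is not a fixed character set, here is a
small sketch. It assumes a present-day Python 3 ``re`` module (not the
re/sre being discussed above), where '\w' in a string pattern is
Unicode-aware by default and the ``re.ASCII`` flag narrows it to
[a-zA-Z0-9_]. The variable names are just for illustration:

```python
import re

word = "caf\u00e9"  # "cafe" with an accented final letter, outside A-Za-z

# Default str pattern: \w is Unicode-aware, so the accented letter matches.
unicode_match = re.fullmatch(r"\w+", word) is not None

# re.ASCII narrows \w to [a-zA-Z0-9_], so the same word no longer matches.
ascii_match = re.fullmatch(r"\w+", word, re.ASCII) is not None

# An explicit character class does not depend on flags or locale.
explicit_match = re.fullmatch(r"[-A-Za-z]+", word) is not None
```

The explicit class gives the same answer whatever flags or locale are in
force, which is another argument for spelling the allowed characters out.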