[Web-SIG] PasteDeploy 0.1

Phillip J. Eby pje at telecommunity.com
Tue Aug 23 19:08:21 CEST 2005


At 11:03 AM 8/23/2005 -0500, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>>It's important to me, and it's not intuitive to me what you envision. So 
>>>I feel a need to see services in action, replacing global configuration.
>>
>>Using a config service in a factory to get a default argument value:
>>     def some_app_factory(parent_component, **args):
>>         config = parent_component.get_service("global_config")
>>         args.setdefault('someparam', config['someparam'])
>
>So "parent_component" is some special object created by the config loader...

Each component in a "pipeline" receives the previous non-wrapper component 
in the pipeline as its parent.  The top-level parent would be an object 
whose get_service() always returns None or raises an error or something 
like that.  (I'm being vague because we haven't started nailing down a 
precise "services" spec and don't want to mix it in with the syntax 
discussion for now.)


>>Registering a config service (old syntax):
>>     [globalconfigservice from SomeEgg]
>>     someparam = "foo"
>
>...and I assume in this case globalconfigservice does something along the 
>lines:
>
>def globalconfigservice(parent_component, next, **args):
>     config = parent_component.get_service('global_config').copy()
>     config.update(args)
>     component = Component(parent_component)
>     component.save_service('global_config', config)
>     return next(component)
>
>Obviously I'm making up the component interface here.

I was thinking something more like this:

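     class globalconfigservice:
         """Non-wrapper component that provides the 'global_config' service."""

         def __init__(self, parent_component, **args):
             self._parent = parent_component
             self._data = args

         def get_service(self, key):
             # Answer for our own service; delegate everything else upward.
             if key == 'global_config':
                 return self
             return self._parent.get_service(key)

         def __getitem__(self, key):
             try:
                 return self._data[key]
             except KeyError:
                 # Not set locally: fall back to the nearest enclosing config.
                 previous = self._parent.get_service('global_config')
                 if previous is None:
                     raise
                 # Cache the inherited value so later lookups stay local.
                 result = self._data[key] = previous[key]
                 return result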

This isn't a wrapper, so it doesn't know about the "next" component, and 
doesn't need to.  Parent components can be shared by multiple 
children.  Wrappers, on the other hand, transform their child, and are not 
considered a parent component.
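
To make the acquisition behaviour concrete, here is a minimal runnable
sketch.  It repeats the class above (renamed GlobalConfigService) so that it
runs on its own.  RootComponent, the "debug" setting, and the fact that
some_app_factory returns its args are assumptions made purely for
illustration; no "services" spec exists yet.

```python
class RootComponent:
    """Hypothetical top-level parent: it offers no services at all."""

    def get_service(self, key):
        return None


class GlobalConfigService:
    """Same logic as the class above, repeated so the sketch runs alone."""

    def __init__(self, parent_component, **args):
        self._parent = parent_component
        self._data = args

    def get_service(self, key):
        if key == 'global_config':
            return self
        return self._parent.get_service(key)

    def __getitem__(self, key):
        try:
            return self._data[key]
        except KeyError:
            previous = self._parent.get_service('global_config')
            if previous is None:
                raise
            result = self._data[key] = previous[key]
            return result


def some_app_factory(parent_component, **args):
    # Returning the args dict is an assumption; a real factory would
    # build and return an application object.
    config = parent_component.get_service('global_config')
    args.setdefault('someparam', config['someparam'])
    return args


root = RootComponent()
outer = GlobalConfigService(root, someparam="foo", debug=False)
inner = GlobalConfigService(outer, debug=True)   # more local, overrides debug

settings = some_app_factory(inner)
print(settings['someparam'])            # acquired from the outer service
print(inner['debug'], outer['debug'])   # local value wins in the inner one
```

A key that no service in the chain defines still raises KeyError, because
the root's get_service() returns None and the original error is re-raised.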

>>     [...next component in the stack...]
>>The config service would respond to 'get_service("global_config")' by 
>>returning self.
>>The idea is that when you chain non-wrapper components in a pipeline, 
>>each one gets the previous component as its "parent component", so you 
>>can "acquire" services from your parents.  Components nearer to you (i.e. 
>>more local) can override more global service definitions.
>
>OK, well now I'm a bit confused... is globalconfigservice a wrapper?  I 
>assume globalconfigservice can't modify the parent_component it is passed, 
>and has to create a new one?

No. The globalconfigservice *becomes* the parent_component of the 
components that follow it, until another non-wrapper component is defined 
(which then becomes the parent of those that follow it, and so on).
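
The parent rule can be sketched as a single pass over a flat pipeline.
This is only an illustration under assumed names: assign_parents, the
"<root>" placeholder, and the "localconfigservice" entry are hypothetical,
and I'm assuming wrappers also receive the current parent even though they
never become one.

```python
def assign_parents(pipeline, root="<root>"):
    """Map each entry name to the name of the parent component it receives.

    `pipeline` is a flat list of (name, is_wrapper) pairs in file order.
    """
    parents = {}
    current_parent = root
    for name, is_wrapper in pipeline:
        parents[name] = current_parent
        if not is_wrapper:
            # A non-wrapper component becomes the parent of what follows.
            current_parent = name
    return parents


pipeline = [
    ("globalconfigservice", False),
    ("login", True),                  # wrapper: never becomes a parent
    ("localconfigservice", False),    # hypothetical second service
    ("urlmap", False),
]
parents = assign_parents(pipeline)
for name, parent in parents.items():
    print(name, "<-", parent)
```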


>Hmm... it would be nice to allow configuration filenames to be 
>variables.  Though "in" and "from" don't scream "config file" and "egg" to 
>me -- they are both equally vague terms.  I'd rather see "in egg" and "in 
>file".

I'd rather just use 'from ProjectNameHere' and 'from "config_URL_here"', 
since these two syntaxes can cover everything you or I have thus far imagined.


>>An interesting question is whether you should be able to refer to nested 
>>definitions as factory prototypes (ala your auth2/auth) or whether only 
>>top-level names should be usable.  For example in this:
>>     foo: bar from baz
>>     spam:
>>         foo: snickety from lemon
>>         scuzz: foo
>>         sprim:
>>             thingy: foo
>>Does "scuzz: foo" refer to the inner foo or the outer foo?  What about 
>>"thingy: foo"?
>>I'm inclined to say that both refer to the spam: foo rather than the 
>>outermost "foo".  i.e., more or less the same rules as Python scopes.
>
>I agree.  Will spam.foo be an unambiguous representation?  It seems like 
>it should be.  Would there be a global object, like globals.foo?

I guess there could be, but then I lean towards making a file be one object 
by default.  If you want a named top-level, you could do:

main from:
     bar is squidge from spim
     main is bar:
        foo = "whee"

That is, we could allow targetless "from" to promote a name from a new 
child context.
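
As for the scoping rule itself ("scuzz: foo" and "thingy: foo" both
resolving to the foo defined inside spam), a rough way to picture it is a
chain of mappings searched innermost-first.  The sketch below uses
collections.ChainMap as a stand-in; the real loader could be implemented
quite differently.

```python
from collections import ChainMap

# Each nesting level gets its own mapping; lookups search inner to outer.
top = ChainMap({"foo": "bar from baz"})
spam = top.new_child({"foo": "snickety from lemon"})
sprim = spam.new_child({})   # defines no foo of its own

print(spam["foo"])    # what "scuzz: foo" would see
print(sprim["foo"])   # what "thingy: foo" would see
print(top["foo"])     # the outermost definition is untouched
```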


>>One minor problem with this syntax overall, though, is that it's a bit 
>>context-dependent.  Whether "foo:" means "define foo" or "create a foo" 
>>is just a matter of alternating layers.  It would be better if the syntax 
>>were less ambiguous, e.g.:
>
>I don't see the distinction between "define" and "create".

By define I mean "bind the following to the name foo", and by create I mean 
"create an instance using the foo factory".


>   By this distinction do you mean that pieces of the loading process 
> are lazy?  Can all parts be lazy?  (I.e., the config file defines named 
> factories, the body of sections isn't evaluated until those factories are 
> invoked)

No; I was strictly speaking of the context-specific nature of that specific 
syntax, because it alternates layers of defining names and invoking 
factories, such that a given snippet of syntax can't be independently 
understood by a reader.


>>     main :=
>>         login wrapper from Paste:
>>             # blah
>>         urlmap from Paste:
>>             "/"     := static
>>             "/blog" := main in "blog.ini"
>>             "/cms"  :=
>>                  auth wrapper from Paste:
>>                      require_role = "admin"
>>                  filebrowser from FileBrowser:
>>                      document_root = static.document_root
>
>...for instance, when this was "main = pipeline:", it was clear this was 
>just another "create", except using "pipeline" to create the object, and 
>pipeline looks at the section contents.  The unnamed sections below it are 
>just like positional parameters (would named sections be ordered? -- I've 
>always wanted ordered class statements, I imagine I'd like to keep order 
>here too)

I don't really want them to be positional parameters, I want them to 
stack.  If pipelines were rare, I'd just nest them and use e.g. a 'next' 
keyword.  However, nested pipelines mean you have to indent everything 
every time you add a new wrapper, which would be like having to do "else: 
if:" instead of "elif:".
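
In other words, a flat stack should mean the same thing as the nested form.
A small sketch of that equivalence, with build_pipeline and the toy login,
auth and filebrowser callables all invented for illustration (they are not
the Paste components of the same names):

```python
from functools import reduce


def build_pipeline(wrappers, app):
    """Fold a flat, top-to-bottom list of wrappers around the final app."""
    return reduce(lambda inner, wrap: wrap(inner), reversed(wrappers), app)


def login(app):
    return lambda request: "login(" + app(request) + ")"


def auth(app):
    return lambda request: "auth(" + app(request) + ")"


def filebrowser(request):
    return "files:" + request


stacked = build_pipeline([login, auth], filebrowser)   # flat, like "elif"
nested = login(auth(filebrowser))                      # nested, like "else: if"

print(stacked("/cms"))
print(nested("/cms"))
```

Adding another wrapper to the flat form is one more list entry, with no
re-indentation of what follows.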


>I don't have any attachment to "pipeline", but I think some word is fine 
>in that position, and I don't see why this is a particularly "special" 
>construct (except of course that it should be builtin).  Would this be 
>allowed?:
>
>main = urlmap from Paste:
>   "/" = static from Paste:
>     document_root = "/home/me/htdocs"

This syntax is ambiguous, because you don't know if the thing after the '=' 
should be parsed as a Python expression or as a constructor expression, at 
least not without significant parser lookahead.  Significant lookahead 
isn't that good for a human reader, either.  That's why I think we need 
syntax to distinguish "object definition" from "value assignment".


>>But that doesn't actually seem to help visually, and makes it harder to 
>>write because you have to remember all the time whether you need ":" or 
>>":=".  Maybe this would be better:
>>     main is:
>>         login wrapper from Paste:
>>             # blah
>>         urlmap from Paste:
>>             match_mode = "longest"
>>             "/" is static
>>             "/blog" is main in "blog.ini"
>>             "/cms" is:
>>                  auth wrapper from Paste:
>>                      require_role = "admin"
>>                  filebrowser from FileBrowser:
>>                      document_root = static.document_root
>
>While I'm not attached to "pipeline", "is" is about as vague as "in" and 
>"from".

Well, I'm fine with dropping "in", so we would have only two special 
keywords, "is" and "from", and they're not interchangeable, so there's a 
minimum of ambiguity.  Also, I chose "from" because of the similarity to 
importing, and "is" implies object identity as well as definition (e.g. 
"the definition of main is...").

(One of the things I'm trying to do with this syntax, btw, is stick with 
Python's tokens and keywords, so that the tokenize module can do most of 
the heavy lifting, and I'd also prefer we didn't introduce new reserved 
words that aren't keywords in Python.)
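
As a rough illustration of how far the tokenize module gets you, the sketch
below tokenizes single config lines and labels each by its second token.
The classify function and its three labels are my own invention for this
example, not a proposed parser design.

```python
import io
import tokenize

SIGNIFICANT = (tokenize.NAME, tokenize.OP, tokenize.STRING, tokenize.NUMBER)


def classify(line):
    """Label one config line by looking only at its second token."""
    readline = io.StringIO(line.strip() + "\n").readline
    words = [tok.string for tok in tokenize.generate_tokens(readline)
             if tok.type in SIGNIFICANT]
    if len(words) < 2:
        return "other"            # blank lines, comments, bare names
    if words[1] == "is":
        return "definition"       # bind a name to what follows
    if words[1] == "=":
        return "assignment"       # plain value assignment
    return "construction"         # e.g. "urlmap from Paste:"


for line in ['main is:',
             '"/blog" is main in "blog.ini"',
             'match_mode = "longest"',
             'urlmap from Paste:',
             '# blah']:
    print(classify(line), "<-", line)
```

No lookahead beyond the second token is needed, which is the property I'm
after for human readers as well.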


>Assuming "main" was a special magic name for the primary application.  I 
>would certainly assume that reading the config file (even if I'd never seen 
>these config files before).  I, for instance, do not like Python's "if 
>__name__=='__main__'" idiom; I think using a conventional name to indicate 
>the primary function of a file is just fine.

Well, __name__=='__main__' doesn't apply here.  I see this as the 
difference between def statements and regular statements in a 
module.  Function bodies aren't executed unless they're used, so it seems 
wrong to me to have a def main.  If the magic name were __main__ I could 
accept it more, except for the fact that it would then highlight the point 
that if the idiom is common enough to need a magic name, then it's common 
enough to warrant a way of doing it without a name!


>>Okay.  The "in" syntax I gave above allows that, although I could also go 
>>for only using "from", as long as config URLs are quoted strings.  I also 
>>think the strings should be relative or absolute URLs, rather than 
>>filenames.  (So that '/' has the same meaning on all platforms.)  That 
>>will be something of a pain for Windows users who may need to include 
>>drive letters, but oh well.  We can always treat the letters A-Z as a 
>>special "file:" protocol to fix that.  :)
>
>By URLs, do you just mean that they use URL syntax, URL quoting of 
>filenames, etc?

Yes.  And that relative URLs are interpreted as relative to the URL that 
was used to load the file they're in.  But also that absolute URLs are 
allowed, which may include application- or framework-specific URLs, and the 
loading facility should be hookable to do the actual URL joining and 
retrieving.  ZConfig works like this, and PEAK hooks into it so that all of 
PEAK's special urls like "pkgfile:" and such can be used.  I've definitely 
got an eye on using this format we're discussing as a nice schema-free 
alternative to ZConfig.
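
A minimal sketch of the resolution rule, using the standard library's
urljoin as the default joiner.  The resolve function and its `join` hook
parameter are hypothetical, the file paths are made up, and the path after
"pkgfile:" is only a guess at what such a URL might look like.

```python
from urllib.parse import urljoin


def resolve(base_url, reference, join=urljoin):
    """Resolve a reference found in a config file against that file's URL.

    `join` is a hypothetical hook: a framework could pass its own joiner
    to handle schemes the standard library knows nothing about.
    """
    return join(base_url, reference)


base = "file:///etc/myapp/site.ini"

print(resolve(base, "blog.ini"))             # sibling of the loading file
print(resolve(base, "../shared/auth.ini"))   # relative, one level up
print(resolve(base, "pkgfile:myapp/defaults.ini"))   # absolute: unchanged
```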


>   That's fine by me; I normalize \ to / in paste.deploy and run 
> urllib.unquote on the result already.  I'm not sure what to do with \'s; 
> they are dumb and annoying and I hate them, but when they slip into the 
> system it should at least handle them reasonably.

I think \ should have its normal meaning in a string literal, unless a 
"raw" literal is used.


>While it is slightly annoying to keep track of it, I think it's important 
>that filenames be defined as relative to the config file that they are 
>contained in.  The current working directory is useless, and always using 
>absolute filenames makes config files very hard to reuse.

Agreed; they should be interpreted as URLs relative to the current 
file.  ZConfig (and PEAK's wrapping of it) both use this approach and it 
works well.


>>>>Conversely, if I assume that some further description is required, I 
>>>>would want to say "pypi:" or "project:" or something else of that sort, 
>>>>because "egg" isn't the essential nature of the thing; the name is a 
>>>>*project* name, while eggs are an implementation detail.
>>>
>>>egg: is an access method, just like http: or whatever.  It doesn't say 
>>>what the URI describes, just how to find it.
>>
>>Ah, but that's just it.  The project name is a URN, not a URL, precisely 
>>because it *doesn't* describe how to locate the resource, it just names 
>>the resource and tells the system to go find it.
>
>Well, sure it says how to find it -- load pkg_resources, get the package 
>by name, etc.  There's always a "system, go do stuff for me" step, that's 
>how computers work.

I'm referring here to the technical meaning of a "naming" system versus an 
"addressing" system.  An addressing system identifies a canonical "naming 
authority" that provides global uniqueness, whereas a "naming" system only 
implies the context in which the name may be understood.  You can read 
the RFCs on URNs vs. URLs (which are both subtypes of URI), or you can read 
up on JNDI, LDAP, X.500 and other "naming" services if you don't believe 
me.  An 'egg:' URI would be a URN, not a URL, and the 'egg' makes no sense 
in either case, because an egg is a resource *type*, not a naming or 
addressing scheme.  Thus, if I were to create a URI scheme for eggs, I 
would use a name like 'pypi:' or 'py-project:' or something like that, to 
denote the naming scheme.


