[Pythonmac-SIG] appscript terminology caching

has hengist.podd at virgin.net
Fri Oct 15 12:36:27 CEST 2004


Jack Jansen wrote:

>I think you want to start with something of an architecture for 
>getting terminology, something similar to Python's import mechanism: 
>there are a number of engines that allow you to get at terminology 
>data (one that reads aete resources, one that starts the application 
>and asks it,

Getting terminology from applications is the OSA's job. This is 
currently done by OSAGetAppTerminology, though eventually sdefs will 
supersede aetes, at which point a new OSA call will no doubt be added 
for getting terminology in sdef format. osaterminology should use 
this call on OSes that provide it, otherwise it should use 
OSAGetAppTerminology.


>one that uses the cache, etc), and these all use a common well-defined API.

There are several places a standard API could be located:

1. After osaterminology.aeteparser.getaete. This would capture the 
raw aete data directly returned from OSAGetAppTerminology. A cache 
module generated here would contain a list of unparsed aete strings.

2. Between the aete parser and consumers. (Note: the 0.6.0 rewrite 
has already decoupled these via a SAX-style API; see 
osaterminology.aeteparser.__init__.) A cache module generated here 
would basically be a long list of method calls that recreate the 
original parsing events when fed a consumer instance.

3. After appscript.terminology module. This would capture fully 
processed, specialised data structures containing minimal name-code 
translation information needed to build references. (See the dump() 
function I just added.)
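
To make option 2 concrete, here's a rough sketch of the record-and-replay idea. It is not osaterminology's actual API: the event method names (start_command, end_command) and both classes are made up for illustration, and a real consumer would have many more events.

```python
class EventRecorder:
    """Stands in for a parser consumer and logs every event call made on it."""

    def __init__(self):
        self.events = []  # list of (method name, positional args)

    def __getattr__(self, name):
        # Only reached for attributes that don't exist, i.e. event methods.
        if name.startswith('_'):
            raise AttributeError(name)

        def record(*args):
            self.events.append((name, args))
        return record


def replay(events, consumer):
    """Feed recorded events to a real consumer in their original order."""
    for name, args in events:
        getattr(consumer, name)(*args)


class CommandNameCollector:
    """Toy consumer (hypothetical event names) that keeps command names only."""

    def __init__(self):
        self.commands = []

    def start_command(self, code, name):
        self.commands.append(name)

    def end_command(self):
        pass


# Record once (in real use the aete parser would drive the recorder)...
recorder = EventRecorder()
recorder.start_command('aevtodoc', 'open')
recorder.end_command()

# ...then replay the saved events into any consumer without reparsing.
collector = CommandNameCollector()
replay(recorder.events, collector)
```

The recorded list is plain tuples and strings, so it could be written out as a module or pickled; which is nicer is a separate question.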


A cache inserted at any of these locations would do for remote 
scripting support. For optimising appscript performance, later is 
better. There are two performance humps affecting app object creation:

- The first call to OSAGetAppTerminology is very slow, ~0.5sec. 
Subsequent calls are very fast; it's just that first one that's the 
killer. The OSATerminology extension has to provide a scripting 
component instance to OSAGetAppTerminology, and uses the AppleScript 
component as it's about the only one available; unfortunately the AS 
component is rather slow to load. If anyone has any ideas for routing 
around this, e.g. using a faster loading dummy component or turning 
the OSATerminology extension into a permanently running daemon, I'm 
all ears.

- The time it takes to parse the terminology and turn it into Python 
data structures. This is a non-issue for the help systems which don't 
require top performance, but it's a definite problem for application 
scripting as slow starts are a real turn-off, particularly for folk 
used to AppleScript. (AS performs all terminology translation work at 
compile-time so doesn't have the same runtime overheads as appscript.)


Inserting a cache at #1 or #2 will avoid the first of these humps, 
reducing startup time by 0.5sec. Inserting it at #3 will avoid both 
and effectively eliminate startup overhead completely: importing a 
dump()ed terminology module is 10x faster than parsing that 
terminology from source (e.g. 0.02sec instead of 0.2sec).
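
For anyone wondering what a #3-style cache might look like, here's a minimal sketch. It is not the format dump() actually writes; the function names, the module layout and the sample codes are assumptions for illustration. The point is just that the tables are written out as Python source, so loading them is an ordinary import with no parsing step.

```python
import importlib.util
import os
import tempfile


def dump_tables(name_to_code, path):
    """Write name<->code tables as Python source (hypothetical layout)."""
    code_to_name = {code: name for name, code in name_to_code.items()}
    with open(path, 'w') as f:
        f.write('# Generated terminology cache; do not edit.\n')
        f.write('name_to_code = %r\n' % (name_to_code,))
        f.write('code_to_name = %r\n' % (code_to_name,))


def load_tables(path):
    """Import a dumped module from an explicit .py file path."""
    spec = importlib.util.spec_from_file_location('cached_terms', path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Usage: dump a tiny table into a scratch directory, then load it back.
cache_dir = tempfile.mkdtemp()
path = os.path.join(cache_dir, 'textedit_terms.py')
dump_tables({'document': 'docu', 'name': 'pnam'}, path)
terms = load_tables(path)
```

Note that nothing but the translation tables survives the round trip, which is exactly the limitation discussed next.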

Note that a cache at #3 will be of little general use, as the only 
data it holds is name<->AE code translation tables. For example, both 
help systems require full terminology, so when scripting remote 
applications the interactive help() would be unavailable and HTML 
terminology docs would need to be generated on the host machine then 
distributed to users. But how big an issue this would be I dunno; I 
reckon this is probably the lesser of two evils but what do others 
think?


Of course, there's no technical reason why a caching system couldn't 
support >1 of these options by supporting multiple converters and 
allowing clients to request the type of cached data they require. The 
cache system would be concerned with matching cached terminologies to 
the applications specified by the client, not with the specific 
content of individual terminology modules. I'd be a tad cautious here 
as there's obvious potential for the implementation and user 
interface to balloon out of control, but it's certainly an option to 
consider.


>You then provide a default set of engines in a default order, but 
>the users can change this order, or add their own special-purpose 
>engine to the front, etc.

I think pluggable engines would be overkill. There are only two places 
terminology data could come from: the original application or the 
cache. And there's only one resolution order that really makes sense: 
if a cache is available then get the terminology from that, otherwise 
get it from the application (or raise an error if that's not 
possible, i.e. the application is remote).
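
In code, that single resolution order is only a few lines. This is a sketch under assumptions: the cache is modelled as a plain dict, and get_terminology, fetch_from_app and TerminologyUnavailable are hypothetical names, not anything in appscript.

```python
class TerminologyUnavailable(Exception):
    """Raised when there is no cache entry and the app can't be asked."""


def get_terminology(app_id, is_remote, cache, fetch_from_app):
    """Cache first, then the application; remote apps can't be queried."""
    if app_id in cache:
        return cache[app_id]
    if is_remote:
        raise TerminologyUnavailable(
            'no cached terminology for remote application %r' % (app_id,))
    # Local application: ask it directly (hypothetical callback).
    return fetch_from_app(app_id)
```
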


>How you decide that a terminology is "right" is also an issue: if 
>you want terminology for MyApp 1.0.1 and the first importer can 
>supply it for MyApp 1.0, do you continue searching?

That's an easy one: in an automatic terminology retrieval system, the 
cached terminology must _exactly_ match the version of application 
being scripted. If it doesn't, the terminology must be retrieved from 
the application (if it's a local app) or an error raised (if it's a 
remote app). In a manual system where the user specifies the cached 
terminology to use as part of the app constructor, it would be up to 
them to ensure they used the correct one; cf. AppleScript's 'using 
terminology from' blocks.
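
A small sketch of what "exactly" means for the automatic case, assuming cache entries are keyed by application path plus version string. The key layout, the find_cached name and the sample path are all made up for illustration.

```python
def find_cached(entries, app_path, app_version):
    """Return cached tables only on an exact (path, version) match, else None."""
    return entries.get((app_path, app_version))


# A cache holding terminology for MyApp 1.0 only.
entries = {('/Applications/MyApp.app', '1.0'): {'document': 'docu'}}

hit = find_cached(entries, '/Applications/MyApp.app', '1.0')
# 1.0.1 is a different key, so the 1.0 entry is never used as a near match.
miss = find_cached(entries, '/Applications/MyApp.app', '1.0.1')
```

On a miss the caller would fall back to the application if it's local, or raise an error if it's remote.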

Admittedly, in a mostly-automatic system we might just wing it and 
assume that as long as the url/path matches, the application and the 
cached terminology are up to date. Rather less safe than checking 
version numbers, mind you, so some folk are bound to trip up 
following system/application updates since the cache system won't 
pick up on the change itself. I'd be very reluctant to make cache 
generation fully automatic in this case; I'd rather leave it to the 
user to add to and update the cache directory themselves.


>And you probably have to think of *why* you want caching. The choice 
>of how to organise things depends on whether the main goal is to 
>make remote scripting possible, or to speed up access to local apps, 
>or whatever other reason there may be.

Indeed. Keep thrashing. :)

Thanks,

has

-- 
http://freespace.virgin.net/hamish.sanderson/

