XML vs Python?

Mike Brenner mikeb at mitre.org
Sun Jan 19 07:57:09 EST 2003


I agree on some of the points, but disagree on two points.


Point 1. Python structures have development and execution speed.

I agree. Python can get even better than a list
of dictionaries, using a structure which is either
a LIst or a diCTionary at each level (a "lict").
Just recursively get the children, and notice
whether the attributes are unique to decide
whether to make a list or a dictionary.
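
A minimal sketch of the "lict" idea, under some assumptions that are
not in the original post: it uses the standard library's
xml.etree.ElementTree, it treats child tag names (rather than XML
attributes) as the keys whose uniqueness is checked, and the name
to_lict is made up for illustration.

```python
import xml.etree.ElementTree as ET


def to_lict(element):
    """Return a dict when child tags are unique, otherwise a list (hypothetical helper)."""
    children = list(element)
    if not children:
        # Leaf node: keep the text, or fall back to the attributes if there is none.
        text = (element.text or "").strip()
        return text if text else dict(element.attrib)
    tags = [child.tag for child in children]
    if len(set(tags)) == len(tags):
        # Every tag occurs once, so each tag names exactly one child: use a dictionary.
        return {child.tag: to_lict(child) for child in children}
    # Tags repeat, so keys would collide: keep the children in order as a list.
    return [to_lict(child) for child in children]


# Illustrative document, invented for this example.
doc = ET.fromstring(
    "<db><meta><name>demo</name><version>1</version></meta>"
    "<rows><row>a</row><row>b</row></rows></db>"
)
lict = to_lict(doc)
print(lict["meta"]["name"], lict["rows"][1])
```

Each level can then be read with plain indexing, by key where the
level is a dictionary and by position where it is a list.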


Point 2. Objects are more obvious than Python dictionaries.

I disagree. The one thing in mathematics one needs
to learn in order to do topology, category theory,
abstract algebra, homology, cohomology, braids, and K-Theory
is: the arrow. An arrow from A to B means that
"for each a in set A, there is exactly one b in set B."
From this all more advanced mathematical concepts form.
The two most important of these mathematical concepts
which are based on the arrow diagram are:

	- the SQL dependency model
	- the Python dictionary model
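
As a rough illustration of the arrow, here is a small sketch. The data
and the helper name is_arrow are invented for this example. A Python
dictionary is an arrow by construction, since each key has exactly one
value, and the same check on pairs of column values is what a
dependency between two columns amounts to.

```python
# An arrow from countries to capitals: each key maps to exactly one value.
capital_of = {"France": "Paris", "Japan": "Tokyo"}


def is_arrow(pairs):
    """Return True if no left-hand value is paired with two different right-hand values."""
    seen = {}
    for a, b in pairs:
        # setdefault stores b the first time a is seen, then returns the stored value.
        if seen.setdefault(a, b) != b:
            return False
    return True


# Two keys may share a value, but one key may not have two values.
print(is_arrow([("x", 1), ("y", 1)]))
print(is_arrow([("x", 1), ("x", 2)]))
```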

In my opinion, the chain of consecutive Python dictionary references:

	database["table_name"]["row_key"]["column_name"]

is easier to work with than:

	database.get_cell(database.get_table("table_name"),
	                  "row_key",
	                  database.get_column("table_name", "column_name"))

And very similar abbreviations can be used for efficiency in both cases.

	table=database["table_name"]
	row=table["row_key"]
	cell=row["column_name"]

		vs. 

	table=database.get_table("table_name")
	column_id=table.get_column("column_name")
	row=table.get_row("row_key")
	cell=row.get_column(column_id)
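
For concreteness, the dictionary form can be run as-is on a small
nested dictionary. The table, row keys, and column names below are
made up for illustration.

```python
# A tiny "database": table name -> row key -> column name -> cell value.
database = {
    "employees": {
        "e42": {"name": "Ada", "dept": "R&D"},
        "e43": {"name": "Grace", "dept": "Navy"},
    }
}

# Chained lookup: one dictionary reference per level.
cell = database["employees"]["e42"]["name"]

# Abbreviated form: keep the intermediate levels for reuse.
table = database["employees"]
row = table["e42"]
same_cell = row["name"]

print(cell, same_cell)
```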

			  
Point 3. A "standard" object model might be better.

I disagree. For example, in Java, the new JDOM 
model makes it one level less deep to access the 
nodes in an XML document. Since Java came up with
a new "standard" model that is easier to work with
and also faster at runtime than DOM, it follows
that current standards are not perfect. JDOM is
just one step better than DOM. There are many more
steps of improvement still to be made.

The same holds in Python. We are nowhere near a good
object model, nor do RDF, SVG, or any of the
other complex applications based on XML come 
close to the level of the model we will shortly 
be needing.

We should always be looking to improve our data
models, and an object model is just one aspect
of our data model. Even objects themselves
should be reevaluated and thrown away when 
something better is discovered.

For example, bottom-up object-oriented design
replaced top-down design just about everywhere
except at the requirements level where the
objects come together to actually solve a problem.

That happened because we need to test software
to make it work, but people don't like to test,
and bottom-up requires less testing. Therefore,
the replacement for object-oriented design will
be something that requires even less testing than
objects.

Of course, the answer is arrows. The replacement
for objects will be dataflow diagrams with
their constraints expressed as mathematical
dependencies between sets (that is, arrow
diagrams). UML use cases, dataflow diagrams,
hierarchy diagrams, finite state machines,
and Sequence Diagrams are just the beginning.

Someday someone will modify UML and Object
Oriented Design by:

	- adding "arrow-dependencies" strong
	  enough to express geometric and
	  semantical constraints
	- changing the finite state machines
	  to colored Petri nets
	- combining the sequence diagrams with
	  timing charts
	- adding sprite graphics to SVG
	- adding XLINK, XSQL, etc., to browsers
	- standardizing JavaScript across browsers
	- permitting browsers to print multi-page diagrams
	- enhancing activities to include the whole
	  2D hierarchy of processes-activities-tasks
	  and requirements-activities-outputs,
	  annotating them with the tools used
	- adding a default web rendering of data
	  as relational 2D tables like spreadsheets
	  with a SQL query in each cell, with
	  mouse adjustable-width cells, floating
	  headers, and updating just like a spreadsheet
	  only over the web (any other rendering
	  would require a few minutes of rendering
	  programming)

after which objects will be obsolete. This could happen
as early as 5 years from now.



Paul wrote: 
> Before the XML heavyweights get in on this, I would make the following points:
>   * A lists and dictionaries approach certainly has its merits:
>     speed (at least according to the PyRXP people), familiarity,
>     interoperability with other Python stuff.
>
>  * Once you start to look into more advanced features, it would
>    seem to me that the lists and dictionaries model approaches
>    such a level of complexity that a "proper" object model would
>    be better. That is because you would need to find more
>    efficient ways (in terms of expression) of requesting an
>    attribute with a given namespace, for example, than examining
>    raw dictionaries. Certainly, it appears to me that...
>
>    attr = element.getAttributeNS(some_namespace, some_name)
>
>    ...is more obvious than...
>
>      attr = element[some_namespace][some_name]
>
>    ...especially if it's hidden in lots of heavy XML processing
>    code.
>
>  * You could write your own object model. Fredrik Lundh's
>    ElementTree is like this. I'm not 100% convinced that it's
>    beneficial to drop a standard object model that is widely
>    understood for another which is more Pythonic, although this
>    is a tradeoff that might be appropriate for some situations.





