[DB-SIG] Re: [Python-Dev] database APIs

Luke Kenneth Casson Leighton lkcl@samba-tng.org
Mon, 3 Feb 2003 16:17:32 +0000


On Mon, Feb 03, 2003 at 09:48:54AM -0500, Kevin Jacobs wrote:
> On Mon, 3 Feb 2003, Luke Kenneth Casson Leighton wrote:
> >  the more complex requirements that i have are more like a substitute
> >  for SQL "VIEWs".  proper views, not just read-only VIEWs like some
> >  SQL database have.
> 
> Logical read/write views don't really solve your problem either.  

 you are right.

 if you assume that certain steps need to be performed manually,
 it covers a good proportion of the job, but definitely not
 enough.

 my goal ultimately is to be able to write a file, maybe in
 XML or other suitable text file format, that specifies all the
 reports i want, that they are just tabular display of records,
 generation of an input form, or generation of create / delete
 input form.

 and for there to be like a _single_ cgi program that can
 take such a specification file and just... do everything.

 clearly, the validation of some variable values as you specify
 in 3) below cannot be entirely done in an automated manner:
 some things are going to be too complex to specify and you'd
 just have to put a function call out to do the validation.

 ... unless you have one of those really neato fancy SQL
 databases that can do CONSTRAINTS etc and can specify logic
 tests on the input data ;)
 

> The heart
> of your situation involves (roughly) these processing steps:
 
  for an update / data input process, yes.

  the process is slightly different and much simpler
  for report generation.


>   1) Extract and unencode variables from the CGI enviornment

 yep.

>   2) Decode the variable names and map them to schema elements
>   3) Validate the variable values against the schema and business logic
>   4) Group variables into transactional batches and sub-batches
>      based on the physical schema
>   5) Render and execute SQL to perform updates


 what do you define as schema elements?
 i think i can work out what you mean by schema by context.

 against business logic, yep.

 i am marginally lost but not entirely lost in terminology on
 2, 3 and 4.

	OKAY, i think i know what you mean in 4).
	
 in the code i have written, i have grouped
 variables into dictionaries, where the field name is
 the key and the value is the 

 i also have a second input on the "update" functions
 from which the WHERE clause of an UPDATE statement
 is generated; the first argument is the dictionary
 of fields from which the SET clause is generated.



 assuming that code to do all the above is auto-generated from
 either a SQL file (with CREATE and ALTER TABLE statements
 including FOREIGN KEY statements from which JOINs can be
 auto-generated) or from a separate IDL file, then code can
 be generated for the following:

 1) the types of variable fields from the database definition
    can auto-generate type-checking and also data entry forms
    and their corresponding input CGIs
    (dates, ints, floats, char, all can be got from the
	 SQL CREATE statement)

 2) variable names for the viewing, generation of data entry
    forms and their corresponding input CGIs can all be done
	too.

	however, this is where i believe the SQL file alone is
	insufficient to auto-generate such code: you don't
	necessarily want _all_ of the fields to be presented
	in a single input form; you don't necessarily want all
	of the fields to be viewed in a report.

 4) grouping variables into transactional batches

	very easy to auto-generate, if you have the IDL file
	specifying things.

	what i have done at present, which is where i am getting
	in a muddle, is that the grouping is done by looking
	for the name of the table as the first part of the
	CGI POST arguments.

 	only those CGI arguments with the first part of the table
	and then the field name as the second part of any given CGI
	argument is used to generate the transactional batch
	input for stage 5).


 5) render and execute SQL statements.

    here, again, this is very easy: the logic is that you receive
	a dictionary of field names and their values and the
	SET part of the SQL statement can be generated from it;
	you also receive some secondary input from which the
	WHERE clause of the UPDATE can be generated.

	easy :)



> Most of the above can be done fairly generically in Python, with minimal
> code needed for each particular form you are processing.  However, there is
> no short-cut magic bullet "save" routine or "View" object that can do all of
> the above.  

 ? ;)

 i invite you to take a look at what i have started on - pysqldb
 contains some of the base classes and functions, following on
 from that is sqlgen.py and sqlhtmlgen.py which take a SQL file
 and auto-generate enough python code to do exactly the above.

 where it breaks down is that the SQL file itself is not sufficient
 to generate:

 - complex sub-reports on one subset of fields from one table JOINed
   against another subset of fields in another

 - business logic
 
 what i HAVEN'T added is logic to auto-generate from CONSTRAINTS
 because my present SQL knowledge runs out at MySQL which doesn't
 obey constraints, doesn't have VIEWs, and the MySQL authors are
 arrogant enough to even spell out in their documentation that
 they don't believe foreign key constraints are necessary.  grr.




> I recommend focusing on implementing concise and powerful
> abstractions for each step seperately.

 *vigorous nods*


> >  one of the input forms has now got so complex that the update
> >  function is 160 lines of code and the add function is 100.
> >  and that's simply turning HTML POST cgi form dictionaries / lists
> >  into dictionaries
> 
> Then you need to write smarter code!  

 :) i realise this :)

 i have a neat system to generate html forms and tables, a
 neat system to do database queries (if you can call single
 function to do a single query neat :) i use the python cgi
 function SvFormContentDict() to get me the HTML POST, and
 then it goes haywire after that.


 basically as i am always refactoring code that i write more
 than twice or three times at the most, this is the one area
 where i haven't put in enough thought, and it's getting a 
 little out of hand.


 regarding posting code: i'll locate the view cvs page for you...

 http://cvs.sf.net/cgi-bin/viewcvs.cgi/custom/custom/cgi/src/python.py

 line 714 is where check_update_item() starts.

 line 877 is where check_add_item() starts.

 the add and update functions basically parse _any_ of the forms
 involving POSTs: if the HTTP post has "mode=add" then check_add_item()
 gets called, if it has mode=update, update gets called.

 there are several forms that stock.py generates, all of which
 will call one of these two functions.

 where things have broken down with this over-simplistic approach
 is that the check_add_item() function needs to find out why it's
 been called, from the dictionary of cgi args.



 ... but in some ways this is all a distraction / just a snapshot
 of the bigger picture (so i went back and filled in a response
 to the earlier parts of your email).

 l.