MS SQL Server/ODBC package for Python

Peter Herndon tpherndon at gmail.com
Mon Apr 18 13:08:27 EDT 2005


> > Next I modified the benchmark to reflect my particular circumstances
> > more accurately (?Maybe?  Comments invited).  I changed the query to
> > one of the queries in regular use in my application.  This query
> > selects 26 columns from 3 joined tables, with a where clause "where
> > f476 = ?", and I provide the missing value as a tuple in the execute
> > statement.  Note that, as I mentioned in my reply to M-A, the f476
> > field is not indexed, and is a long varchar.  Again, the system is
> > bought, so I have no control over the schema. ;)
> >
> > The other change I made was to reduce the number of iterations from 100
> > to 10.  Since there are 128000 records in the main table, the wait for
> > 100 iterations was too long for my patience.  Under these
> > circumstances, mx.ODBC's numbers are 188.49 seconds and 377.56 seconds
> > respectively, and adodbapi's times are 111.15 seconds and 223.55
> > seconds respectively.
> 
> Just curious: are you timing just the time it takes to
> complete the .execute() or do you also fetch the complete
> result or only part of it ?
> 

I have included the .fetchall() in the loop, yes.  The only other
change I made to the benchmark is to use ODBC.Windows.DriverConnect
rather than .Connect, as that's what I'd used before in my
application.  Is there a difference between the two, other than the
argument syntax?

> The mxODBC 2.0 release fetches this number after every .execute().
> If adodbapi avoids this (which we'll also integrate into mxODBC 2.1),
> then this would explain the differences you see.
> 

I look forward to retesting with 2.1 -- I'm curious to see what
differences arise.

> Another reason could indeed be related to the longvarchar
> field: these fields are fetched in multiple chunks if the
> ODBC driver doesn't provide proper size information - each
> of these chunks will require a network access which slows
> down the data fetching.

The datum in the field is uniformly short, 8 characters or less.  It's
actually our internal identifier.  Again, I'd use a varchar and index
the field if I had control over the schema.  However, the small size
of the datum argues against multiple chunks, unless the default chunk
size is really small.

> 
> Since mxODBC supports Unicode, but defaults to returning
> 8-bit strings, it is also possible that your longvarchar
> column is sent as Unicode and has to be converted to
> an 8-bit string first. Thus, allowing mxODBC to return Unicode
> could make a difference as well (see the docs on how this is
> done).

I'll take a look into this, as I have a vague memory of adodbapi
returning Unicode by default.

> Note that it often also pay off checking the ODBC driver
> settings, esp. if you have a networked setup - ODBC drivers
> often pre-fetch result sets and changing the defaults they
> use for this can make a huge difference in response times.
> 
> Unfortunately, mxODBC doesn't have control over these
> settings and there's no standard for them, so you'll
> have to check the ODBC driver documentation for details
> on the best settings can be found and set.

I'll have to dig into this.  There don't seem to be too many directly
available settings in ODBC Administrator, but there seem to be some
spots for inserting settings directly into a command line, so I'll
need to dig up the Sybase documentation on what's possible.

> > Regards,
> --
> Marc-Andre Lemburg
> eGenix.com

Thank you very much for your patience and insight.

Cheers,

---Peter Herndon



More information about the Python-list mailing list