pyodbc data corruption problem

Joe Salmeri JoeSalmeri at hotmail.com
Thu May 24 13:49:55 EDT 2007


I have done some additiona investigate into this problem and found the 
following:

As described below the problem does not begin to appear until the return 
value size is > 2048.

Once the return value is greater than 2048 bytes the value returned by 
pyodbc is 2 times the actual size of the return value data.

The return value data is  padded by exactly the same number of null 
characters as there are in the actual data size.

In other words if the actual database value is 4868 bytes, then pyodbc will 
return a value that is 9736 bytes long.  The first 4868 bytes will be the 
real data, followed by 4868 bytes of nulls.

I did a second test where the actual data size was 11,109 bytes.  In that 
case pyodbc returned a value that was 22,218 bytes long.  The first 11,109 
bytes are the real data, followed by 11,109 null bytes.

This seems to confirm the bug.

"Joe Salmeri" <JoeSalmeri at hotmail.com> wrote in message 
news:EPidnR__UJAnp9PbnZ2dnUVZ_u6rnZ2d at comcast.com...
>I have found a data corruption problem with pyodbc.
>
> OS = Windows XP SP2
> DB = Microsoft Access XP
>
> PROBLEM:
>
> When selecting columns from a table that are of type Memo the value
> returned is padded with a bunch of null characters at the end.
>
> The problems does not seem to occur until the length of the Memo column
> exceeds 2048 bytes.
>
> I have attached several scripts to help demonstrate the problem.
>
> To recreate the problem:
>
> 1. Create a blank Access database named test.
> 2. Create a ODBC DSN named test for that database
> 3. Run the createtable.py script to create the table
> and load it with the dummy data
> 4. Run the broke.py script to show the problem.
>
> The issue is when the data value is > 2048 bytes.
>
> The size in the createtable.py is 2046 bytes plus 3 bytes at the end that
> contain "JOE" for a total of 2049 bytes.
>
> If you change it from 2046 to 2045 (or less) then the problem does not
> occur.
>
> #
> # createtable.py script
> #
>
> import pyodbc
>
> dbs = pyodbc.connect('dsn=test')
>
> c   = dbs.cursor()
>
> try:
>    sql = 'drop table test_memo'
>    c.execute(sql)
>    dbs.commit()
> except:
>    # ignore drop table failure
>    pass
>
> sql = 'create table test_memo (c1 int not null, c2 memo not null)'
>
> c.execute(sql)
>
> dbs.commit()
>
> sql = 'insert into test_memo values(1, ?)'
>
> c2_value = '1' * 2046
>
> c2_value = '%sJOE' % (c2_value)
>
> c.execute(sql, (c2_value,))
>
> dbs.commit()
>
> c.close()
> dbs.close()
>
> #
> # broke.py script
> #
>
>
> import pyodbc
>
> dbs = pyodbc.connect('dsn=test')
>
> c   = dbs.cursor()
>
> sql = 'select c2, len(c2) as c2_db_len from test_memo where c1 = 1'
>
> c.execute(sql)
>
> row = c.fetchone()
>
> (
>    c2,
>    c2_db_len
> ) = row
>
> print repr(c2)
>
> print 'c2 length        :', len(c2)
> print 'c2_db_length     :', c2_db_len
>
> print 'before nul length:', len(c2[0:c2.find('\x00')])
>
> c.close()
> dbs.close()
>
>
> 





More information about the Python-list mailing list