From floydophone at gmail.com Fri Sep 3 20:02:22 2010 From: floydophone at gmail.com (Pete Hunt) Date: Fri, 3 Sep 2010 14:02:22 -0400 Subject: [DB-SIG] ANN: PyMySQL 0.3 Message-ID: I'm proud to announce the release of PyMySQL 0.3. For those of you unfamiliar with PyMySQL, it is a pure-Python drop-in replacement for MySQLdb, with an emphasis on compatibility with MySQLdb and on support for various Python implementations. I started working on the project out of frustration with getting MySQLdb working on Snow Leopard. PyMySQL has been tested on CPython 2.3+, Jython, IronPython and PyPy, and we have an unreleased Python 3.0 branch in Subversion. I encourage anyone hoping to connect to MySQL from Python to check it out and report any bugs you might find! Our current focus has been on compatibility with SQLAlchemy and Django, and we have by and large achieved that goal with a high level of performance. Check it out at http://www.pymysql.org/. From anthony.tuininga at gmail.com Fri Sep 10 06:37:46 2010 From: anthony.tuininga at gmail.com (Anthony Tuininga) Date: Thu, 9 Sep 2010 22:37:46 -0600 Subject: [DB-SIG] cx_OracleTools 8.0 Message-ID: What is cx_OracleTools? cx_OracleTools is a set of Python scripts that handle Oracle database development tasks in a cross-platform manner and improve (in my opinion) on the tools that are available by default in an Oracle client installation. Those who use cx_Oracle may also be interested in this project, if only as sample code. Binaries for Windows and Linux are provided for those who do not have a Python installation. Where do I get it? http://cx-oracletools.sourceforge.net What's new? 1) In DescribeObject, added option --show-synonyms which enables display of synonyms that reference the object. The default value for this option is false. 2) In DescribeObject, DescribeSchema, ExportObjects and RebuildTable, added support for Oracle context objects. 
3) In DescribeSchema, ExportObjects and RecompileSource, added option --name-file which acts in the same fashion as the --name option except that the value of the option refers to a file containing a list of names, one name per line. 4) In DescribeObject, DescribeSchema and ExportObjects, added option --include-view-columns which enables specification of the column names when creating a view. 5) In DescribeObject and DescribeSchema, added support for eliminating tablespace quotas when generating create user statements. 6) In DescribeObject, DescribeSchema and ExportObjects, added options --as-of-timestamp and --as-of-scn which enable flashback queries when performing describes. This can be very useful for recovering those accidentally issued DDL commands! 7) In DumpCSV, make use of the built-in csv module and the standard option --schema; in addition, allow the file name to be specified as "-" or not at all, in which case the output goes to stdout. 8) In DumpData, added support for dumping CLOB, BLOB and binary data values correctly. A commit statement is also appended to the output now as a convenience. 9) In ExportXML, added option --sort-by which allows the result set to be sorted before exporting. In addition, the source can be a query instead of simply a table name. 10) In GeneratePatch, switched to the new, more intelligent parser. 11) In ImportXML, now use cElementTree rather than a home-grown XML processing library. 12) In RebuildTable, removed SQL*Plus-specific statements since, by default, connect statements are issued, which only works properly with PatchDB. 13) In RecompileSource, added option --connect-as-owner and removed option --password. The new option specifies that when invalid objects are recompiled, a connection to the owner of the invalid object is established using the password of the current connection. The default value is false since this is an uncommon situation; the option is retained only to support product development at Computronix. 
14) Replaced CompileSource with PatchDB, which uses a much more advanced parser and is set up to handle additional commands besides executing SQL scripts. 15) Added setup.py for building with cx_Freeze, which means that MSI packages for Windows and RPM packages for Linux are now available. 16) Other changes required to keep up with changes in Python, dependent libraries and Oracle (including up to Oracle 11.2). Anthony Tuininga From ethan at stoneleaf.us Sat Sep 18 08:11:01 2010 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 17 Sep 2010 23:11:01 -0700 Subject: [DB-SIG] dbf files and compact indices Message-ID: <4C9457F5.2070303@stoneleaf.us> Does anybody have any pointers, tips, web pages, already-written routines, etc., on parsing *.cdx files? I have found the pages on MS's site for FoxPro, but they neglect to describe the compaction algorithm used, and my Google-fu has failed to find any sites with that information. Any and all help greatly appreciated! -- ~Ethan~ From carl at personnelware.com Sat Sep 18 15:21:41 2010 From: carl at personnelware.com (Carl Karsten) Date: Sat, 18 Sep 2010 08:21:41 -0500 Subject: [DB-SIG] dbf files and compact indices In-Reply-To: <4C9457F5.2070303@stoneleaf.us> References: <4C9457F5.2070303@stoneleaf.us> Message-ID: On Sat, Sep 18, 2010 at 1:11 AM, Ethan Furman wrote: > Does anybody have any pointers, tips, web pages, already-written routines, > etc., on parsing *.cdx files? I have found the pages on MS's site for > FoxPro, but they neglect to describe the compaction algorithm used, and my > Google-fu has failed to find any sites with that information. > > Any and all help greatly appreciated! > "Compound Index File Structure (.cdx)" http://msdn.microsoft.com/en-us/library/k35b9hs2%28v=VS.80%29.aspx which basically links to: http://msdn.microsoft.com/en-us/library/s8tb8f47%28v=VS.80%29.aspx Is that what you need? 
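[Editor's note: the fixed header described on the MSDN pages linked above is the easy part of the format. As a starting point, here is a minimal sketch of parsing it with the stdlib struct module; the offsets are transcribed from that structure description and should be treated as an assumption to verify against real .idx/.cdx files.]

```python
import struct

def read_cdx_header(raw):
    """Parse the leading fields of a compact-index (.idx/.cdx) file header.

    Offsets follow the MSDN "Compound Index File Structure" page;
    double-check them against a real file before relying on this.
    """
    root_node, free_list = struct.unpack_from("<ii", raw, 0)       # bytes 0-7
    key_length, options, signature = struct.unpack_from("<HBB", raw, 12)
    return {
        "root_node": root_node,       # file offset of the root node
        "free_node_list": free_list,  # -1 when there are no free nodes
        "key_length": key_length,
        "options": options,           # bit flags: 1=unique, 8=FOR clause, 32=compact, 64=compound
        "signature": signature,
    }

# Synthetic header bytes for illustration (not from a real file):
sample = struct.pack("<ii", 1024, -1) + b"\x00" * 4 + struct.pack("<HBB", 10, 0x60, 1)
info = read_cdx_header(sample)
```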
-- Carl K From ethan at stoneleaf.us Sat Sep 18 18:16:12 2010 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 18 Sep 2010 09:16:12 -0700 Subject: [DB-SIG] dbf files and compact indices In-Reply-To: References: <4C9457F5.2070303@stoneleaf.us> Message-ID: <4C94E5CC.7060809@stoneleaf.us> Carl Karsten wrote: > On Sat, Sep 18, 2010 at 1:11 AM, Ethan Furman wrote: > >>Does anybody have any pointers, tips, web-pages, already written routines, >>etc, on parsing *.cdx files? I have found the pages on MS's site for >>FoxPro, but they neglect to describe the compaction algorithm used, and my >>Google-fu has failed to find any sites with that information. >> >>Any and all help greatly appreciated! >> > > > "Compound Index File Structure (.cdx)" > > http://msdn.microsoft.com/en-us/library/k35b9hs2%28v=VS.80%29.aspx > > which basically links to: > http://msdn.microsoft.com/en-us/library/s8tb8f47%28v=VS.80%29.aspx > > Is that what you need? Thanks for the link; unfortunately, I am already familiar with the page. What I need help with is the first sentence of the note at the bottom: Each entry consists of the record number, duplicate byte count and trailing byte count, all compacted. The key text is placed at the logical end of the node, working backwards, allowing for previous key entries. Here's a dump of the last interior node: ----- node type: 2 number of keys: 57 free space: 1 (or 256) (and is this bits, bytes, keys, what?) 
-- record number mask: c8 0e 40 b0 duplicate byte count mask: 28 trailing byte count mask: 00 -- bits used for record number: 178 bits used for duplicate count: 29 bits used for trail count: 64 bytes used for rec num, dup count, trail count: 192 ----- 12 00 ff 3f 00 00 1f 1f 0e 05 05 03 01 00 c8 0e 40 b0 28 00 b2 1d 40 c0 29 00 d0 42 40 d0 54 80 c0 43 40 a8 14 40 b8 40 40 c8 02 40 d0 08 00 b0 4c 80 b0 3a 40 a0 50 80 d0 3b 40 a8 09 40 b8 0a 80 88 3c 80 c0 2a 00 d8 21 c0 c0 3d 40 c0 4a 80 b0 26 40 b8 2b 40 c0 2c 00 c0 41 40 b8 4d 80 c8 37 00 c0 04 40 c8 44 80 c0 1b 40 c8 15 80 c8 27 40 c8 16 00 a8 2d c0 c8 51 80 b8 2e 40 c0 1e 00 b0 17 40 b8 46 40 b0 2f 80 c8 4f 80 a8 13 00 c8 59 00 c8 31 00 c8 1f 00 a8 3e 40 c0 22 40 a8 07 00 c8 23 80 d0 32 80 b0 52 80 c0 34 80 b0 20 40 b0 24 40 c0 47 80 c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4e 44 45 4e 49 44 53 4f 4e 43 43 41 4d 4d 4f 4e 54 54 48 45 57 53 53 4c 45 4e 52 54 49 4e 45 5a 4e 4e 4d 41 47 45 45 49 45 42 45 52 4d 41 4e 45 57 49 4e 53 4c 41 56 45 4e 42 45 52 47 4b 41 56 41 4e 4a 4f 4e 45 53 49 52 49 53 48 53 54 45 54 4c 45 52 52 41 4e 4f 4c 53 54 45 49 4e 45 41 44 4c 45 59 48 41 54 48 41 57 41 59 52 49 4d 45 53 45 41 53 4f 4e 53 53 47 4c 41 44 53 54 4f 4e 45 55 52 52 59 4f 53 54 52 49 4e 4b 52 42 45 53 4f 4c 45 59 46 49 4c 45 4e 45 4e 49 53 4e 47 4c 55 4e 44 45 42 45 52 4c 45 4f 44 53 4f 4e 49 4e 47 4c 45 52 4d 41 52 45 53 54 45 43 4b 45 52 54 4f 4e 44 41 59 57 47 45 52 52 4e 45 49 4c 2d 53 55 4e 44 54 4f 4f 4b 53 45 59 4c 45 4e 44 45 4e 49 4e 55 4e 48 49 41 50 50 45 54 54 41 52 4e 41 48 41 4e 43 41 4c 44 57 45 4c 4c 55 54 54 52 55 43 45 4f 43 41 52 44 45 4c 4f 4f 4d 42 45 52 47 4e 53 45 4c 45 45 52 42 41 43 48 55 47 55 53 54 4e 44 45 52 53 4f 4e 41 4c 4c 41 4e ----- The last half (roughly) consists of last names compressed together, while the first half consists of 57 (in this case) entries of the record number, duplicate byte count and trailing byte count, all compacted -- how do I uncompact them? 
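[Editor's note: one common scheme for "compacted" entries like these is to treat each fixed-size entry as a single little-endian integer and mask off the three fields by their advertised bit widths. A minimal sketch, assuming the record number sits in the low bits, then the duplicate count, then the trail count; that ordering is an assumption to verify against real data:]

```python
def unpack_entry(raw, recno_bits, dup_bits, trail_bits):
    """Unpack one bit-packed index entry.

    Assumes the entry bytes form a little-endian integer with the
    record number in the lowest bits, the duplicate byte count next,
    and the trailing byte count above that -- verify this ordering
    against known data before relying on it.
    """
    value = int.from_bytes(raw, "little")
    recno = value & ((1 << recno_bits) - 1)
    value >>= recno_bits
    dup = value & ((1 << dup_bits) - 1)
    value >>= dup_bits
    trail = value & ((1 << trail_bits) - 1)
    return recno, dup, trail

# Hypothetical 4-byte entry: 12 bits of record number, 4 bits each for
# the duplicate and trailing counts (made-up widths for illustration).
recno, dup, trail = unpack_entry(bytes([0x2A, 0x10, 0x31, 0x00]), 12, 4, 4)
```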
-- ~Ethan~ From carl at personnelware.com Sat Sep 18 19:15:22 2010 From: carl at personnelware.com (Carl Karsten) Date: Sat, 18 Sep 2010 12:15:22 -0500 Subject: [DB-SIG] dbf files and compact indices In-Reply-To: <4C94E5CC.7060809@stoneleaf.us> References: <4C9457F5.2070303@stoneleaf.us> <4C94E5CC.7060809@stoneleaf.us> Message-ID: On Sat, Sep 18, 2010 at 11:16 AM, Ethan Furman wrote: > Carl Karsten wrote: >> >> On Sat, Sep 18, 2010 at 1:11 AM, Ethan Furman wrote: >> >>> Does anybody have any pointers, tips, web-pages, already written >>> routines, >>> etc, on parsing *.cdx files? ?I have found the pages on MS's sight for >>> Foxpro, but they neglect to describe the compaction algorithm used, and >>> my >>> Google-fu has failed to find any sites with that information. >>> >>> Any and all help greatly appreciated! >>> >> >> >> "Compound Index File Structure (.cdx)" >> >> http://msdn.microsoft.com/en-us/library/k35b9hs2%28v=VS.80%29.aspx >> >> which basiclly links to: >> http://msdn.microsoft.com/en-us/library/s8tb8f47%28v=VS.80%29.aspx >> >> Is that what you need? > > Thanks for the link, unfortunately I am already familiar with the page. > ?What I need help with is the first sentence of the note at the bottom: > > Each entry consists of the record number, duplicate byte count and > trailing byte count, all compacted. The key text is placed at the > logical end of the node, working backwards, allowing for previous key > entries. > > Here's a dump of the last interior node: > > ----- > node type: 2 > number of keys: 57 > free space: 1 (or 256) (and is this bits, bytes, keys, what?) 
> -- > record number mask: c8 0e 40 b0 > duplicate byte count mask: 28 > trailing byte count mask: 00 > -- > bits used for record number: 178 > bits used for duplicate count: 29 > bits used for trail count: 64 > bytes used for rec num, dup count, trail count: 192 > ----- > 12 00 ff 3f 00 00 1f 1f 0e 05 05 03 01 00 c8 0e 40 b0 28 00 > b2 1d 40 c0 29 00 d0 42 40 d0 54 80 c0 43 40 a8 14 40 b8 40 > 40 c8 02 40 d0 08 00 b0 4c 80 b0 3a 40 a0 50 80 d0 3b 40 a8 > 09 40 b8 0a 80 88 3c 80 c0 2a 00 d8 21 c0 c0 3d 40 c0 4a 80 > b0 26 40 b8 2b 40 c0 2c 00 c0 41 40 b8 4d 80 c8 37 00 c0 04 > 40 c8 44 80 c0 1b 40 c8 15 80 c8 27 40 c8 16 00 a8 2d c0 c8 > 51 80 b8 2e 40 c0 1e 00 b0 17 40 b8 46 40 b0 2f 80 c8 4f 80 > a8 13 00 c8 59 00 c8 31 00 c8 1f 00 a8 3e 40 c0 22 40 a8 07 > 00 c8 23 80 d0 32 80 b0 52 80 c0 34 80 b0 20 40 b0 24 40 c0 > 47 80 c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00 4e 44 45 4e 49 44 53 4f 4e 43 43 41 4d 4d 4f 4e 54 54 48 > 45 57 53 53 4c 45 4e 52 54 49 4e 45 5a 4e 4e 4d 41 47 45 45 > 49 45 42 45 52 4d 41 4e 45 57 49 4e 53 4c 41 56 45 4e 42 45 > 52 47 4b 41 56 41 4e 4a 4f 4e 45 53 49 52 49 53 48 53 54 45 > 54 4c 45 52 52 41 4e 4f 4c 53 54 45 49 4e 45 41 44 4c 45 59 > 48 41 54 48 41 57 41 59 52 49 4d 45 53 45 41 53 4f 4e 53 53 > 47 4c 41 44 53 54 4f 4e 45 55 52 52 59 4f 53 54 52 49 4e 4b > 52 42 45 53 4f 4c 45 59 46 49 4c 45 4e 45 4e 49 53 4e 47 4c > 55 4e 44 45 42 45 52 4c 45 4f 44 53 4f 4e 49 4e 47 4c 45 52 > 4d 41 52 45 53 54 45 43 4b 45 52 54 4f 4e 44 41 59 57 47 45 > 52 52 4e 45 49 4c 2d 53 55 4e 44 54 4f 4f 4b 53 45 59 4c 45 > 4e 44 45 4e 49 4e 55 4e 48 49 41 50 50 45 54 54 41 52 4e 41 > 48 41 4e 43 41 4c 44 57 45 4c 4c 55 54 54 52 55 43 45 4f 43 > 41 52 44 45 4c 4f 4f 4d 42 45 52 47 4e 53 45 4c 45 45 52 42 > 41 43 48 55 47 55 53 54 4e 44 45 52 53 4f 4e 41 4c 4c 41 4e > ----- > > The last half (roughly) consists of last names compressed together, > while the first half consists of 57 (in this case) entries of the record > number, duplicate 
byte count and trailing byte count, all compacted -- > how do I uncompact them? > huh, I see what you mean. What are you working on? I know a few people that may have the answer, but it would help to explain why it is being worked on. -- Carl K From ethan at stoneleaf.us Sat Sep 18 19:44:06 2010 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 18 Sep 2010 10:44:06 -0700 Subject: [DB-SIG] dbf files and compact indices In-Reply-To: References: <4C9457F5.2070303@stoneleaf.us> <4C94E5CC.7060809@stoneleaf.us> Message-ID: <4C94FA66.80008@stoneleaf.us> Carl Karsten wrote: > On Sat, Sep 18, 2010 at 11:16 AM, Ethan Furman wrote: > >>Carl Karsten wrote: >> >>>On Sat, Sep 18, 2010 at 1:11 AM, Ethan Furman wrote: >>> >>> >>>>Does anybody have any pointers, tips, web-pages, already written >>>>routines, >>>>etc, on parsing *.cdx files? I have found the pages on MS's sight for >>>>Foxpro, but they neglect to describe the compaction algorithm used, and >>>>my >>>>Google-fu has failed to find any sites with that information. >>>> >>>>Any and all help greatly appreciated! >>>> >>> >>> >>>"Compound Index File Structure (.cdx)" >>> >>>http://msdn.microsoft.com/en-us/library/k35b9hs2%28v=VS.80%29.aspx >>> >>>which basiclly links to: >>>http://msdn.microsoft.com/en-us/library/s8tb8f47%28v=VS.80%29.aspx >>> >>>Is that what you need? >> >>Thanks for the link, unfortunately I am already familiar with the page. >> What I need help with is the first sentence of the note at the bottom: >> >>Each entry consists of the record number, duplicate byte count and >>trailing byte count, all compacted. The key text is placed at the >>logical end of the node, working backwards, allowing for previous key >>entries. >> >>Here's a dump of the last interior node: >> >>----- >>node type: 2 >>number of keys: 57 >>free space: 1 (or 256) (and is this bits, bytes, keys, what?) 
>>-- >>record number mask: c8 0e 40 b0 >>duplicate byte count mask: 28 >>trailing byte count mask: 00 >>-- >>bits used for record number: 178 >>bits used for duplicate count: 29 >>bits used for trail count: 64 >>bytes used for rec num, dup count, trail count: 192 >>----- >>12 00 ff 3f 00 00 1f 1f 0e 05 05 03 01 00 c8 0e 40 b0 28 00 >>b2 1d 40 c0 29 00 d0 42 40 d0 54 80 c0 43 40 a8 14 40 b8 40 >>40 c8 02 40 d0 08 00 b0 4c 80 b0 3a 40 a0 50 80 d0 3b 40 a8 >>09 40 b8 0a 80 88 3c 80 c0 2a 00 d8 21 c0 c0 3d 40 c0 4a 80 >>b0 26 40 b8 2b 40 c0 2c 00 c0 41 40 b8 4d 80 c8 37 00 c0 04 >>40 c8 44 80 c0 1b 40 c8 15 80 c8 27 40 c8 16 00 a8 2d c0 c8 >>51 80 b8 2e 40 c0 1e 00 b0 17 40 b8 46 40 b0 2f 80 c8 4f 80 >>a8 13 00 c8 59 00 c8 31 00 c8 1f 00 a8 3e 40 c0 22 40 a8 07 >>00 c8 23 80 d0 32 80 b0 52 80 c0 34 80 b0 20 40 b0 24 40 c0 >>47 80 c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >>00 4e 44 45 4e 49 44 53 4f 4e 43 43 41 4d 4d 4f 4e 54 54 48 >>45 57 53 53 4c 45 4e 52 54 49 4e 45 5a 4e 4e 4d 41 47 45 45 >>49 45 42 45 52 4d 41 4e 45 57 49 4e 53 4c 41 56 45 4e 42 45 >>52 47 4b 41 56 41 4e 4a 4f 4e 45 53 49 52 49 53 48 53 54 45 >>54 4c 45 52 52 41 4e 4f 4c 53 54 45 49 4e 45 41 44 4c 45 59 >>48 41 54 48 41 57 41 59 52 49 4d 45 53 45 41 53 4f 4e 53 53 >>47 4c 41 44 53 54 4f 4e 45 55 52 52 59 4f 53 54 52 49 4e 4b >>52 42 45 53 4f 4c 45 59 46 49 4c 45 4e 45 4e 49 53 4e 47 4c >>55 4e 44 45 42 45 52 4c 45 4f 44 53 4f 4e 49 4e 47 4c 45 52 >>4d 41 52 45 53 54 45 43 4b 45 52 54 4f 4e 44 41 59 57 47 45 >>52 52 4e 45 49 4c 2d 53 55 4e 44 54 4f 4f 4b 53 45 59 4c 45 >>4e 44 45 4e 49 4e 55 4e 48 49 41 50 50 45 54 54 41 52 4e 41 >>48 41 4e 43 41 4c 44 57 45 4c 4c 55 54 54 52 55 43 45 4f 43 >>41 52 44 45 4c 4f 4f 4d 42 45 52 47 4e 53 45 4c 45 45 52 42 >>41 43 48 55 47 55 53 54 4e 44 45 52 53 4f 4e 41 4c 4c 41 4e >>----- >> >>The last half (roughly) consists of last names compressed together, >>while the first half consists of 57 (in this case) entries of the record >>number, duplicate 
byte count and trailing byte count, all compacted -- >>how do I uncompact them? >> > > > huh, I see what you mean. > > What are you working on? > > I know a few people that may have the answer, but it would help to > explain why it is being worked on. > > I have a pure-python module to read db3 and vfp 6 dbf files, and I find that I need to read (and write) the idx and cdx index files that foxpro generates. We are in the process of switching from homegrown foxpro apps to homegrown python apps, but I have to support the foxpro file formats until the switch is complete. Once I have the index files down, I'll publish another release of it (an older version can be found on PyPI). Thanks for your help! -- ~Ethan~ From ethan at stoneleaf.us Sun Sep 19 06:23:21 2010 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 18 Sep 2010 21:23:21 -0700 Subject: [DB-SIG] dbf files and compact indices In-Reply-To: References: <4C9457F5.2070303@stoneleaf.us> <4C94E5CC.7060809@stoneleaf.us> <4C94FA66.80008@stoneleaf.us> Message-ID: <4C959039.5070700@stoneleaf.us> Vernon Cole wrote: > Ethan: > I cannot see where you mentioned your operating system, I am assuming > Windows. > > Perhaps you have already investigated this ... I have no way to test it > ... but you might try: > ADO can access almost any data source, and a quick look seems to show > that .dbf is supported using the JET driver or a FoxPro driver. > > 1) upload pywin32 > 2) import adodbapi > 3) find an appropriate connection string for your data source > http://connectionstrings.com suggests that perhaps: > Driver={Microsoft Visual FoxPro > Driver};SourceType=DBF;SourceDB=c:\myvfpdbfolder;Exclusive=No; > Collate=Machine;NULL=NO;DELETED=NO;BACKGROUNDFETCH=NO; > may be a good sample to start with -- there are other variations, check > their site. > 4) do your data input/output using standard Python db-api calls. 
> > see python\lib\site-packages\adodbapi\test\ for usage examples > > You can get pywin32 from http://sourceforge.net/projects/pywin32 Thanks for the suggestion, but I don't want to be tied to Foxpro, which means I need to be able to parse these files directly. I have the dbf files, now I need the idx and cdx files. -- ~Ethan~ From carl at personnelware.com Sun Sep 19 06:36:54 2010 From: carl at personnelware.com (Carl Karsten) Date: Sat, 18 Sep 2010 23:36:54 -0500 Subject: [DB-SIG] dbf files and compact indices In-Reply-To: <4C959039.5070700@stoneleaf.us> References: <4C9457F5.2070303@stoneleaf.us> <4C94E5CC.7060809@stoneleaf.us> <4C94FA66.80008@stoneleaf.us> <4C959039.5070700@stoneleaf.us> Message-ID: On Sat, Sep 18, 2010 at 11:23 PM, Ethan Furman wrote: > Vernon Cole wrote: >> >> Ethan: >> I cannot see where you mentioned your operating system, I am assuming >> Windows. >> >> Perhaps you have already investigated this ... I have no way to test it >> ... but you might try: >> ADO can access almost any data source, and a quick look seems to show that >> .dbf is supported using the JET driver or a FoxPro driver. >> >> 1) upload pywin32 >> 2) import adodbapi >> 3) find an appropriate connection string for your data source >> http://connectionstrings.com suggests that perhaps: >> Driver={Microsoft Visual FoxPro >> Driver};SourceType=DBF;SourceDB=c:\myvfpdbfolder;Exclusive=No; >> Collate=Machine;NULL=NO;DELETED=NO;BACKGROUNDFETCH=NO; >> may be a good sample to start with -- there are other variations, check >> their site. >> 4) do your data input/output using standard Python db-api calls. >> >> see python\lib\site-packages\adodbapi\test\ for usage examples >> >> You can get pywin32 from http://sourceforge.net/projects/pywin32 > > Thanks for the suggestion, but I don't want to be tied to Foxpro, which > means I need to be able to parse these files directly. ?I have the dbf > files, now I need the idx and cdx files. What do you mean "tied" ? 
-- Carl K From ethan at stoneleaf.us Sun Sep 19 06:51:56 2010 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 18 Sep 2010 21:51:56 -0700 Subject: [DB-SIG] dbf files and compact indices In-Reply-To: References: <4C9457F5.2070303@stoneleaf.us> <4C94E5CC.7060809@stoneleaf.us> <4C94FA66.80008@stoneleaf.us> Message-ID: <4C9596EC.40906@stoneleaf.us> Dennis Lee Bieber wrote: > On Sat, 18 Sep 2010 10:44:06 -0700, Ethan Furman > declaimed the following in gmane.comp.python.general: > > >>I have a pure-python module to read db3 and vfp 6 dbf files, and I find >>that I need to read (and write) the idx and cdx index files that foxpro >>generates. We are in the process of switching from homegrown foxpro >>apps to homegrown python apps, but I have to support the foxpro file >>formats until the switch is complete. Once I have the index files down, >>I'll publish another release of it (an older version can be found on PyPI). >> > > Seems odd that you'd have to do all that low-level processing... Do > you have the VFP ODBC driver on the systems? That would permit just > using ODBC queries to operate on the data. Hmmm. I may look at that. When I first started this project, I was brand new to Python and unaware of all the cool stuff out there. It was a great learning project -- iterators, magic methods, emulating sequences and attributes, properties, class methods, inheritance, packages... it has been quite an adventure! It's probably a safe bet that I don't /need/ to, but I certainly /want/ to. It would be a finished project, and a sense of accomplishment. Thanks for the suggestion. 
-- ~Ethan~ From ethan at stoneleaf.us Sun Sep 19 07:10:16 2010 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 18 Sep 2010 22:10:16 -0700 Subject: [DB-SIG] dbf files and compact indices In-Reply-To: References: <4C9457F5.2070303@stoneleaf.us> <4C94E5CC.7060809@stoneleaf.us> <4C94FA66.80008@stoneleaf.us> <4C959039.5070700@stoneleaf.us> Message-ID: <4C959B38.3040108@stoneleaf.us> Carl Karsten wrote: > On Sat, Sep 18, 2010 at 11:23 PM, Ethan Furman wrote: >>Thanks for the suggestion, but I don't want to be tied to Foxpro, which >>means I need to be able to parse these files directly. I have the dbf >>files, now I need the idx and cdx files. > > > > What do you mean "tied" ? I meant having to have Foxpro installed. I just learned from another reply that I may not have to have Foxpro installed as long as I have the Foxpro ODBC driver, because then I could use odbc instead of fiddling with the files directly. While I may switch over to odbc in the future, I would still like to have the idx/cdx components. -- ~Ethan~ From mal at egenix.com Sun Sep 19 13:04:40 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Sun, 19 Sep 2010 13:04:40 +0200 Subject: [DB-SIG] dbf files and compact indices In-Reply-To: <4C959B38.3040108@stoneleaf.us> References: <4C9457F5.2070303@stoneleaf.us> <4C94E5CC.7060809@stoneleaf.us> <4C94FA66.80008@stoneleaf.us> <4C959039.5070700@stoneleaf.us> <4C959B38.3040108@stoneleaf.us> Message-ID: <4C95EE48.2030303@egenix.com> Ethan Furman wrote: > Carl Karsten wrote: >> On Sat, Sep 18, 2010 at 11:23 PM, Ethan Furman >> wrote: >>> Thanks for the suggestion, but I don't want to be tied to Foxpro, which >>> means I need to be able to parse these files directly. I have the dbf >>> files, now I need the idx and cdx files. >> >> >> >> What do you mean "tied" ? > > I meant having to have Foxpro installed. 
I just learned from another > reply that I may not have to have Foxpro installed as long as I have the > Foxpro ODBC driver, because then I could use odbc instead of fiddling > with the files directly. > > While I may switch over to odbc in the future, I would still like to > have the idx/cdx components. If you are working on Windows, you can install the MS MDAC package to get a hold of the MS FoxPro ODBC drivers. They are usually already installed in Vista and 7; in XP they come with MS SQL Server and MS Office as well. mxODBC can then provide Python access on Windows, mxODBC Connect on other platforms. If you want direct file access on other platforms, you can use http://pypi.python.org/pypi/dbf/ or http://dbfpy.sourceforge.net/. If you want to add support for index files (which the above two don't support), you could also have a look at this recipe for some inspiration: http://code.activestate.com/recipes/362715-dbf-reader-and-writer/ -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 19 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From ethan at stoneleaf.us Sun Sep 19 20:45:54 2010 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 19 Sep 2010 11:45:54 -0700 Subject: [DB-SIG] dbf files and compact indices In-Reply-To: <4C95EE48.2030303@egenix.com> References: <4C9457F5.2070303@stoneleaf.us> <4C94E5CC.7060809@stoneleaf.us> <4C94FA66.80008@stoneleaf.us> <4C959039.5070700@stoneleaf.us> <4C959B38.3040108@stoneleaf.us> <4C95EE48.2030303@egenix.com> Message-ID: <4C965A62.3040804@stoneleaf.us> M.-A. Lemburg wrote: > If you are working on Windows, you can install the MS MDAC package to > get a hold of the MS FoxPro ODBC drivers. They are usually already installed > in Vista and 7; in XP they come with MS SQL Server and MS Office as > well. mxODBC can then provide Python access on Windows, mxODBC Connect > on other platforms. > > If you want direct file access on other platforms, you can use > http://pypi.python.org/pypi/dbf/ ^--- I'm the author of this package :) > or http://dbfpy.sourceforge.net/. ^--- From a quick skim of the code, I think mine does more at this point (memos, adding/deleting/renaming fields in existing tables, in-memory indexes, unicode support, export to csv, tab, and fixed formats, field access via attribute/dictionary/index style (e.g. table.fullname or table['fullname'] or table[0] if fullname is the first field), very rudimentary sql support, etc.) > If you want to add support for index files (which the above two don't > support), you could also have a look at this recipe for some > inspiration: > > http://code.activestate.com/recipes/362715-dbf-reader-and-writer/ I didn't see anything regarding the .idx or .cdx files in this recipe. :( Thanks for your time, though! 
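[Editor's note: for readers following this thread who want to parse dbf tables directly, the fixed 32-byte table header is well documented and easy to read with the stdlib struct module. A minimal sketch of the dBASE III layout; the dict keys are my own naming, and FoxPro variants store extra flags later in the header:]

```python
import struct

def read_dbf_header(raw):
    """Parse the fixed portion of a dbf table header.

    Layout (shared by dBASE III and FoxPro): version byte, last-update
    date as years-since-1900/month/day, record count (32-bit LE),
    header length and record length (16-bit LE each).
    """
    version, yy, mm, dd, nrecords, header_len, record_len = struct.unpack(
        "<4BIHH", raw[:12]
    )
    return {
        "version": version,
        "last_update": (1900 + yy, mm, dd),
        "records": nrecords,
        "header_length": header_len,
        "record_length": record_len,
    }

# A synthetic dBASE III header: updated 2010-09-18, 7 records,
# 97-byte header, 20-byte records (values chosen for illustration).
sample = struct.pack("<4BIHH", 0x03, 110, 9, 18, 7, 97, 20) + b"\x00" * 20
info = read_dbf_header(sample)
```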
-- ~Ethan~ From rnpydbsig at wonderclown.net Mon Sep 20 18:03:57 2010 From: rnpydbsig at wonderclown.net (Randall Nortman) Date: Mon, 20 Sep 2010 12:03:57 -0400 Subject: [DB-SIG] When must transactions begin? Message-ID: <20100920160357.GP20992@li4-40.members.linode.com> PEP 249 says that transactions end on commit() or rollback(), but it doesn't explicitly state when transactions should begin, and there is no begin() method. I think the implication is that transactions begin on the first execute(), but that's not explicitly stated. At least one driver, pysqlite2/sqlite3, does not start a transaction for a SELECT statement. It waits for a DML statement (INSERT, UPDATE, DELETE) before opening a transaction. Other drivers open transactions on any statement, including SELECT. My question for the DB-SIG is: Can I call it a bug in pysqlite2 that it does not open transactions on SELECT? Should the spec be amended to make this explicit? Or are both behaviors acceptable, in which case perhaps a begin() method needs to be added for when the user wants control over opening transactions? TIA, Randall Nortman From mal at egenix.com Mon Sep 20 18:49:29 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 20 Sep 2010 18:49:29 +0200 Subject: [DB-SIG] When must transactions begin? In-Reply-To: <20100920160357.GP20992@li4-40.members.linode.com> References: <20100920160357.GP20992@li4-40.members.linode.com> Message-ID: <4C979099.40500@egenix.com> Randall Nortman wrote: > PEP 249 says that transactions end on commit() or rollback(), but it > doesn't explicitly state when transactions should begin, and there is > no begin() method. Transactions start implicitly after you connect and after you call .commit() or .rollback(). They are not started for each statement. > I think the implication is that transactions begin > on the first execute(), but that's not explicitly stated. At least > one driver, pysqlite2/sqlite3, does not start a transaction for a > SELECT statement. 
It waits for a DML statement (INSERT, UPDATE, > DELETE) before opening a transaction. Other drivers open transactions > on any statement, including SELECT. > > My question for the DB-SIG is: Can I call it a bug in pysqlite2 that > it does not open transactions on SELECT? Should the spec be amended > to make this explicit? Or are both behaviors acceptable, in which > case perhaps a begin() method needs to be added for when the user > wants control over opening transactions? I should probably add a note to PEP 249 about this. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 20 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From farcepest at gmail.com Mon Sep 20 20:49:55 2010 From: farcepest at gmail.com (Andy Dustman) Date: Mon, 20 Sep 2010 14:49:55 -0400 Subject: [DB-SIG] When must transactions begin? In-Reply-To: <4C979099.40500@egenix.com> References: <20100920160357.GP20992@li4-40.members.linode.com> <4C979099.40500@egenix.com> Message-ID: On Mon, Sep 20, 2010 at 12:49 PM, M.-A. Lemburg wrote: > > > Randall Nortman wrote: >> PEP 249 says that transactions end on commit() or rollback(), but it >> doesn't explicitly state when transactions should begin, and there is >> no begin() method. > > Transactions start implicitly after you connect and after you call > .commit() or .rollback(). They are not started for each statement. 
Did the transaction exist before the first statement, or did executing the statement cause it to be created? Doesn't matter. Or does it? From a server (implementation) perspective, I am pretty sure that executing a statement starts a transaction. Otherwise you would have open transactions for an extended period of time, even when the client has not executed statements, and that has implications for concurrency. And this is an effect that *would* be noticeable by clients. How to test this: Connect to the database with two clients. In one, insert a row and commit. In the other, try to select them. If transactions begin at connect time, the selecting client should *not* be able to see them, because they didn't exist at the start of the transaction. Test two: Connect to the database with two clients. In one, select some rows from a table, but don't commit or rollback. In the other, insert a row and commit. The first client should not be able to see the inserted row until it does a commit or rollback, even though it hasn't modified any data. The above of course depends on your isolation level, but I typically get a bug report/question every few months from someone who has a loop where they try to select records newly inserted by another client, and they never show up, and it's because they never closed their transaction. (MySQLdb with InnoDB tables) In MySQL, some statements (primarily DDL, i.e. CREATE TABLE and pals) implicitly commit a transaction. -- Question the answers From mal at egenix.com Mon Sep 20 21:04:06 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 20 Sep 2010 21:04:06 +0200 Subject: [DB-SIG] When must transactions begin? In-Reply-To: References: <20100920160357.GP20992@li4-40.members.linode.com> <4C979099.40500@egenix.com> Message-ID: <4C97B026.2020705@egenix.com> Andy Dustman wrote: > On Mon, Sep 20, 2010 at 12:49 PM, M.-A. 
Lemburg wrote: >> >> >> Randall Nortman wrote: >>> PEP 249 says that transactions end on commit() or rollback(), but it >>> doesn't explicitly state when transactions should begin, and there is >>> no begin() method. >> >> Transactions start implicitly after you connect and after you call >> .commit() or .rollback(). They are not started for each statement. > > Did the transaction exist before the first statement, or did executing > the statement cause it to be created? Doesn't matter. Or does it? The above is the explanation on the logical level (and a lot easier to understand, IMHO, since you don't have to explain the existence of non-transactional behavior on a connection). The implementation can optimize this in whatever way is necessary or required by the backend. I just wanted to make the point that a transaction is not started for each SELECT you execute on the connection. > From a server (implementation) perspective, I am pretty sure that > executing a statement starts a transaction. Otherwise you would have > open transactions for an extended period of time, even when the client > has not executed statements, and that has implications for > concurrency. And this is an effect that *would* be noticeable to > clients. > > How to test this: Connect to the database with two clients. In one, > insert some rows and commit. In the other, try to select them. If > transactions begin at connect time, the selecting client should *not* > be able to see them, because they didn't exist at the start of the > transaction. > > Test two: Connect to the database with two clients. In one, select > some rows from a table, but don't commit or rollback. In the other, > insert a row and commit. The first client should not be able to see > the inserted row until it does a commit or rollback, even though it > hasn't modified any data.
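Test two can be sketched with sqlite3 in WAL mode, whose snapshot reads stand in here for InnoDB's REPEATABLE READ (a hedged illustration, not the MySQLdb setup the bug reports concern):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None puts both connections in autocommit mode, so the
# only transaction in play is the one the reader opens explicitly.
writer = sqlite3.connect(path, isolation_level=None)
writer.execute("PRAGMA journal_mode=WAL")   # WAL gives readers snapshot reads
writer.execute("CREATE TABLE t (x INTEGER)")

reader = sqlite3.connect(path, isolation_level=None)
reader.execute("BEGIN")                     # open a transaction, never closed below
before = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]  # snapshot taken

writer.execute("INSERT INTO t VALUES (1)")  # committed immediately (autocommit)

# The reader's still-open transaction pins the old snapshot:
# the committed row stays invisible to it.
stale = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]

reader.execute("ROLLBACK")                  # closing the transaction...
fresh = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]  # ...reveals the row
print(before, stale, fresh)                 # 0 0 1
```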
> > The above of course depends on your isolation level, but I typically > get a bug report/question every few months from someone who has a loop > where they try to select newly inserted records by another client, and > they never show up, and it's because they never closed their > transaction. (MySQLdb with InnoDB tables) True. The various isolation levels can have interesting side-effects on what you see in your application. This is database specific, though, and cannot be dealt with in the DB-API. I can add a footnote, though, if you think that would help. > In MySQL, some statements (primarily DDL, i.e. CREATE TABLE and pals) > implicitly commit a transaction. Yep. Other databases insist that you do this explicitly and refuse to run any other statement until you do (IIRC, PostgreSQL is one such database). Yet other databases don't have such limitations and even allow dropping tables in a transaction without affecting the database until you commit the change (e.g. MaxDB). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 20 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From daniele.varrazzo at gmail.com Thu Sep 23 02:24:53 2010 From: daniele.varrazzo at gmail.com (Daniele Varrazzo) Date: Thu, 23 Sep 2010 01:24:53 +0100 Subject: [DB-SIG] DBAPI two phase commit implementation in psycopg2 Message-ID: Hello, I've recently joined the db-sig ML, and I've read the threads about the two phase commit interface design of Jan 2008. I'd like to implement the DBAPI TPC extension in psycopg2: I'm considering the best way to overcome the slight model difference between the XA-inspired DBAPI and the PostgreSQL commands. The DBAPI xid structure has members (format_id, gtrid, bqual). In PostgreSQL, PREPARE TRANSACTION only takes a string "tid". So it will be the driver's responsibility to map between the xid triple and the tid string. I may come up with a separator (e.g. "|") and concatenate the three parts into a tid, escaping the separator. Going the other way, it wouldn't be a problem to perform the inverse split, but I should take into consideration transactions whose tid doesn't follow the pattern (e.g. created by a non-XA-oriented application) and is just an arbitrary string. tpc_recover would then create an xid with format_id = 0 and bqual = "", which seem like reasonable default values from reading the XA specs. Is Postgres the only database with this xid mapping issue? If other dbs have similar issues, how is the xid - string mapping usually performed? Any other suggestion about the matter would be appreciated. Thank you. -- Daniele From mal at egenix.com Fri Sep 24 10:48:08 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 24 Sep 2010 10:48:08 +0200 Subject: [DB-SIG] DBAPI two phase commit implementation in psycopg2 In-Reply-To: References: Message-ID: <4C9C65C8.8090606@egenix.com> Daniele Varrazzo wrote: > Hello, > > I've recently joined the db-sig ML, and I've read the threads about > the two phase commit interface design of Jan 2008.
> > I'd like to implement the DBAPI TPC extension in psycopg2: I'm > considering the best way to overcome the slight model difference > between the XA-inspired DBAPI and the PostgreSQL commands. > > The DBAPI xid structure has members (format_id, gtrid, bqual). In > postgresql PREPARE TRANSACTION only takes a string "tid". So it will > be the driver's responsibility to map between the xid triple and the > tid string. I may come out with a separator (e.g. "|") and concatenate > the three parts into a tid, escaping the separator. On the other way > round, it wouldn't be a problem to perform the inverse split, but I > should take in consideration transactions whose tid doesn't follow the > pattern (e.g. created by a non-XA-oriented application) and is just > composed by a string. tpc_recover would then create xid with format_id > = 0 and bqual = "", that seem reasonable default values reading the XA > specs. Both sound like reasonable ways to map the xid requirements onto PGs single string approach. I'd set the branch qualifier to something like 'pgsql' or the database name, since each resource in a global transaction should have its own branch qualifier. I'd also look around to check how other tools that interoperate with PG in two-phase commits handle this. XA is a widely used standard in the industry, so I assume the problem must have popped up elsewhere as well. Note that the TM will usually create the xid and only the TM has to be able to recognize its own xids for the purpose of managing different transactions. > Is postgres the only database with this xid mapping issue? If other > dbs have similar issues, how is the xid - string mapping usually > performed? All the big ones (Oracle, DB2, Sybase, etc.) use XA for this. > Any other suggestion about the matter would be appreciated. Thank you. 
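The escaped-separator mapping described above might look like the following sketch (the "|" separator, escape character, and function names are all illustrative, not psycopg2's actual scheme):

```python
SEP = "|"   # illustrative separator between xid parts
ESC = "\\"  # illustrative escape character

def xid_to_tid(format_id, gtrid, bqual):
    """Join the DBAPI xid triple into a single PostgreSQL tid string."""
    def esc(s):
        return s.replace(ESC, ESC + ESC).replace(SEP, ESC + SEP)
    return SEP.join([str(format_id), esc(gtrid), esc(bqual)])

def tid_to_xid(tid):
    """Split a tid back into the triple; fall back to default values
    for tids not produced by xid_to_tid (e.g. from a non-XA app)."""
    parts, buf, it = [], [], iter(tid)
    for ch in it:
        if ch == ESC:
            buf.append(next(it, ""))      # unescape the next character
        elif ch == SEP:
            parts.append("".join(buf)); buf = []
        else:
            buf.append(ch)
    parts.append("".join(buf))
    if len(parts) == 3 and parts[0].isdigit():
        return int(parts[0]), parts[1], parts[2]
    # Defaults suggested by the XA-spec reading above: format_id 0, empty bqual.
    return 0, tid, ""

# Round trip, including escaped separators inside the parts:
tid = xid_to_tid(42, "tx|1", "branch\\A")
assert tid_to_xid(tid) == (42, "tx|1", "branch\\A")
# A foreign tid maps to the default triple:
assert tid_to_xid("manual-tx") == (0, "manual-tx", "")
```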
Some references: http://download.oracle.com/javase/1.5.0/docs/api/javax/transaction/xa/Xid.html http://dev.mysql.com/doc/refman/5.0/en/xa-statements.html The XA spec: http://www.opengroup.org/bookstore/catalog/c193.htm -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 24 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From daniele.varrazzo at gmail.com Fri Sep 24 18:04:08 2010 From: daniele.varrazzo at gmail.com (Daniele Varrazzo) Date: Fri, 24 Sep 2010 17:04:08 +0100 Subject: [DB-SIG] DBAPI two phase commit implementation in psycopg2 In-Reply-To: <4C9C65C8.8090606@egenix.com> References: <4C9C65C8.8090606@egenix.com> Message-ID: On Fri, Sep 24, 2010 at 9:48 AM, M.-A. Lemburg wrote: > I'd also look around to check how other tools that interoperate with > PG in two-phase commits handle this. XA is a widely used standard > in the industry, so I assume the problem must have popped up > elsewhere as well. MySQL uses an XA model, so the mapping is direct. Other high-profile databases hide the commands they use under layers of APIs, and I've not been able to find implementation details. I did find, however, the JDBC implementation of the XA - PG mapper [1]: it is probably a good idea to use the same format to allow some form of interoperation between tools. They use str(format_id) + '_' + Base64(gtrid) + '_' + Base64(bqual), and on recover they refuse to work on anything that doesn't follow this model.
I don't agree on the latter point because I think the driver should allow the user to leverage everything the database permits, so I will probably find a way to parse back a generic string into a XID. But I guess this is an implementation detail better discussed on the psycopg mailing lists. Thank you very much. Have a nice weekend. -- Daniele [1]: http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/jdbc/pgjdbc/org/postgresql/xa/RecoveredXid.java?rev=1.3&content-type=text/x-cvsweb-markup From james at jamesh.id.au Tue Sep 28 09:09:39 2010 From: james at jamesh.id.au (James Henstridge) Date: Tue, 28 Sep 2010 15:09:39 +0800 Subject: [DB-SIG] DBAPI two phase commit implementation in psycopg2 In-Reply-To: References: <4C9C65C8.8090606@egenix.com> Message-ID: On Sat, Sep 25, 2010 at 12:04 AM, Daniele Varrazzo wrote: > On Fri, Sep 24, 2010 at 9:48 AM, M.-A. Lemburg wrote: > >> I'd also look around to check how other tools that interoperate with >> PG in two-phase commits handle this. XA is a widely used standard >> in the industry, so I assume the problem must have popped up >> elsewhere as well. > > MySQL uses a XA model, so the mapping is direct. Other high profile > databases hide the commands they use under tons of APIs and I've not > been able to find implementation details. > > I've found instead the JDBC implementation of the XA - PG mapper [1]: > it is probably a good idea to use the same format to allow some form > of interoperation between tools. They use str(format_id) + '_' + > Base64(gtrid) + '_' + Base64(bqual) and on recover they refuse to work > on anything that doesn't follow this model. I don't agree on the > latter point because I think the driver should allow the user to > leverage everything the database permits, so I will probably find a > way to parse back a generic string into a XID. But I guess this is an > implementation detail better discussed on the psycopg mailing lists. > > Thank you very much. Have a nice weekend. 
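The pgjdbc-compatible format, together with the lenient fallback argued for above, could be sketched as follows (the helper names are invented; only the on-the-wire format follows the RecoveredXid scheme quoted above):

```python
import base64

def xid_to_tid(format_id, gtrid, bqual):
    """Encode an xid triple in the pgjdbc-compatible format:
    formatId_base64(gtrid)_base64(bqual)."""
    return "%d_%s_%s" % (
        format_id,
        base64.b64encode(gtrid.encode()).decode(),
        base64.b64encode(bqual.encode()).decode(),
    )

def tid_to_xid(tid):
    """Decode a pgjdbc-style tid; unlike pgjdbc, fall back to
    (0, tid, "") for tids that don't follow the pattern."""
    # "_" cannot occur inside standard base64, so splitting is safe.
    parts = tid.split("_")
    if len(parts) == 3:
        try:
            return (int(parts[0]),
                    base64.b64decode(parts[1]).decode(),
                    base64.b64decode(parts[2]).decode())
        except (ValueError, UnicodeDecodeError):
            pass  # not our format: treat it as a foreign tid
    return 0, tid, ""

tid = xid_to_tid(1, "transaction-1", "server-A")
assert tid_to_xid(tid) == (1, "transaction-1", "server-A")
# A transaction prepared outside the XA mapping still recovers:
assert tid_to_xid("prepared by hand") == (0, "prepared by hand", "")
```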
When writing the TPC additions for the spec, I did take PostgreSQL into account (I thought I'd have time to write the psycopg2 implementation back then). The reasoning for using the three part identifiers was that the XA-style identifiers were quite common and it was easier to losslessly encode the three part identifiers as a string than vice versa. The spec should allow you to manage identifiers that don't match your mapping though. The user can only get references to transaction ID objects via method calls on the connection. So while transaction IDs are required to provide tuple like behaviour, the adapter doesn't have to use actual tuple objects. A custom object type could easily be used here to round trip the foreign IDs between tpc_recover() and tpc_commit()/tpc_abort(). James. From daniele.varrazzo at gmail.com Tue Sep 28 11:07:50 2010 From: daniele.varrazzo at gmail.com (Daniele Varrazzo) Date: Tue, 28 Sep 2010 10:07:50 +0100 Subject: [DB-SIG] DBAPI two phase commit implementation in psycopg2 In-Reply-To: References: <4C9C65C8.8090606@egenix.com> Message-ID: On Tue, Sep 28, 2010 at 8:09 AM, James Henstridge wrote: > When writing the TPC additions for the spec, I did take PostgreSQL > into account (I thought I'd have time to write the psycopg2 > implementation back then). The reasoning for using the three part > identifiers was that the XA-style identifiers were quite common and it > was easier to losslessly encode the three part identifiers as a string > than vice versa. > > The spec should allow you to manage identifiers that don't match your > mapping though. The user can only get references to transaction ID > objects via method calls on the connection. So while transaction IDs > are required to provide tuple like behaviour, the adapter doesn't have > to use actual tuple objects. A custom object type could easily be > used here to round trip the foreign IDs between tpc_recover() and > tpc_commit()/tpc_abort(). Thank you James.
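The custom tuple-like object suggested above might be sketched like this (illustrative only, not psycopg2's actual class; the underscore-joined on-the-wire format is a stand-in for whatever encoding the driver settles on):

```python
class Xid(tuple):
    """Tuple-like transaction ID that can also carry an unparsed foreign
    tid, so tpc_recover() results round-trip losslessly to
    tpc_commit()/tpc_abort()."""

    def __new__(cls, format_id, gtrid, bqual, tid=None):
        self = super().__new__(cls, (format_id, gtrid, bqual))
        # If the tid didn't come from our own encoder, remember it verbatim.
        self.tid = tid
        return self

    @classmethod
    def from_tid(cls, tid):
        # Hypothetical on-the-wire format: "formatId_gtrid_bqual".
        parts = tid.split("_")
        if len(parts) == 3 and parts[0].isdigit():
            return cls(int(parts[0]), parts[1], parts[2])
        # Foreign tid: expose it as (0, tid, "") but keep the raw string.
        return cls(0, tid, "", tid=tid)

    def as_tid(self):
        # Prefer the remembered foreign tid when there is one.
        if self.tid is not None:
            return self.tid
        return "%d_%s_%s" % self


xid = Xid.from_tid("made by another tool")
assert tuple(xid) == (0, "made by another tool", "")
assert xid.as_tid() == "made by another tool"   # round-trips unchanged
```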
I've found your code on Launchpad with the Xid implementation, and it seems like a great starting point. I know well enough how state is handled in psycopg connections, so there should be no problem completing the implementation. Regards, -- Daniele