Python plain-text database or library that supports joins?

askel dummy666 at mail.ru
Fri Jun 22 17:20:04 EDT 2007


On Jun 22, 1:18 pm, felciano <felci... at gmail.com> wrote:
> Hello --
>
> Is there a convention, library or Pythonic idiom for performing
> lightweight relational operations on flatfiles? I frequently find
> myself writing code to do simple SQL-like operations between flat
> files, such as appending columns from one file to another, linked
> through a common id. For example, take a list of addresses and append
> a 'district' field by looking up a congressional district from a
> second file that maps zip codes to districts.
>
> Conceptually this is a simple database operation with a join on a
> common field (zip code in the above example). Other case use other
> relational operators (projection, cross-product, etc) so I'm really
> looking for something SQL-like in functionality. However, the data is
> in flat-files, the file structure changes frequently, the files are
> dynamically generated from a range of sources, are short-lived in
> nature, and otherwise not warrant the hassle of a database setup. So
> I've been looking around for a nice, Pythonic, zero-config (no
> parsers, no setup/teardown, etc) solution for simple queries that
> handles a database of csv-files-with-headers automatically. There are
> number of solutions that are close, but in the end come up short:
>
> - KirbyBase 1.9 (latest Python version) is the closest that I could
> find, as it lets you keep your data in flatfiles and perform
> operations using the field names from those text-based tables, but it
> doesn't support joins (the more recent Ruby version seems to).
> - Buzhug and Sqlite have their data structures w no automatic .tab
> or .csv parsing (unless sqlite includes a way to map flatfiles to
> sqlite virtual tables that I don't know about).
> -http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/159974is
> heading in the right direction, as it shows how to perform relational
> operations on lists and are index based rather than field-name based.
> -http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/498130andhttp://furius.ca/pubcode/pub/conf/common/bin/csv-db-import.html
> provide ways of automatically populating DBs but not the reverse
> (persist changes back out to the data files)
>
> The closest alternatives I've found are the GNU textutils that support
> join, cut, merge, etc but I need to add additional logic they don't
> support, nor do they allow field-level write operations from Python
> (UPDATE ... WHERE ...). Normally I'd jump right in and start coding
> but this seems like something so common that I would have expected
> someone else to have solved, so in the interest of not re-inventing
> the wheel I thought I'd see if anyone had any other suggestions. Any
> thoughts?
>
> Thanks!
>
> Ramon

ramon,

i don't think that using flat text files as a database is common these
days. if you need relational database features what stops you from
using rdbms? if the only reason for that is some legacy system then
i'd still use in-memory sqlite database for all relational operations.
import, process, export back to text if you need to.




More information about the Python-list mailing list