SQL and CSV

Matimus mccredie at gmail.com
Tue May 5 14:04:00 EDT 2009


On May 5, 9:25 am, Nick <nic... at gmail.com> wrote:
> On May 5, 5:19 pm, Tim Golden <m... at timgolden.me.uk> wrote:
>
>
>
> > Nick wrote:
> > > I have a requirement to read a CSV file. Normally, no problem, just
> > > import CSV and slurp the file up.
>
> > > However, in this case I want to filter out lines that have fields set
> > > to particular values.
>
> > > It would be neat to be able to do something like this.
>
> > > select * from test.csv where status <> "Canceled"
>
> > > Using adodb I can do this, so long as I don't have the where clause. :-
> > > (
>
> > > Is there a reasonable lightweight way of doing this in Python?
>
> > > I could write some python code that is used to filter rows, and inport
> > > that from config, but it's not quite as elegant as an SQL route.
>
> > Not entirely clear what you are and aren't prepared to try here, but...
> > the most obvious Python-based way to do this is treating the csv reader
> > as an iterator and filtering there. Your last line suggests that's not
> > what you want but just in case I've misunderstood:
>
> > <test.csv>
> > id,code,status
> > 1,"ONE","Active"
> > 2,"TWO","Cancelled"
> > 3,"THREE","Active"
> > </test.csv>
>
> > <code>
> > import csv
>
> > for row in csv.DictReader (open ("c:/temp/test.csv", "rb")):
> >   if row['status'] != 'Cancelled':
> >     print row
>
> > </code>
>
> > Doesn't seem too onerous, and could obviously be wrapped in
> > some useful class/module.
>
> > But if you really want to go the SQL route, I believe there are
> > ODBC adapters for CSV which, combined with PyODBC or CeODBC,
> > would probably take you where you want to go.
>
> > TJG
>
> Part of the problem is that the 'selection' needs to be in a config
> file. I can put the if row['status'] != 'Cancelled': return True into
> a config, read it and eval it, but its not quite as clean as an sql
> route.
>
> Nick

Well, if you are using 2.5.x you could always stuff it into a sqlite
in-memory database, and then execute a SQL query. Heck, you don't even
_need_ 2.5, but in 2.5 sqlite is part of the distribution.

Matt



More information about the Python-list mailing list