Code suggestion - List comprehension

Chris Angelico rosuav at gmail.com
Thu Dec 12 16:46:56 EST 2013


On Fri, Dec 13, 2013 at 7:40 AM, Shyam Parimal Katti <spk265 at nyu.edu> wrote:
> A semi-colon in the string value  indicates the termination of a sql query.
> So the expected out come is a conversion to a list of valid sql queries:
> ['drop table sample_table;', 'create table sample_test (col1 int);', 'select
> col1 from sample_test;']

Hmm. Just to clarify, does a semicolon _anywhere_ in the string
terminate the query? If so, you have a problem of parsing, and the
best thing to do is to enable multi-query processing on your database
connection and send them all across. (You'd have to skip any
semicolons inside quoted strings, for instance.) Your current code
depends on the semi being at the end of one of the strings, which is a
much safer proposition.

If you really mean to split it anywhere, the easiest is to simply join
the whole lot and then split on the semi:

sample = ['drop table sample_table;', 'create table sample_test',
'(col1 int);', 'select col1 from', ' sample_test;']
pure_sqls = ''.join(sample).split(';')[:-1]

Note that the last element from the split is NOT a valid query. If
all's well, it should be an empty string (as it will be in this
sample), but if it's not empty, it's a partial query. I don't know
what you want to do with those; your code above will simply ignore
them, so I've done the same here, applying the "trim off the last
element" smiley operator [:-1] before assigning to pure_sqls.

Parenthesis: I just asked the members of Threshold RPG what they
thought [:-1] meant. 15 seconds after I asked, three responses came in
almost simultaneously.

Zeala: pothead smoking a roach
Claudas: fallen jaw?
Tharl: constipated, but happy.

I don't know what that means for the OP's code. Somehow it doesn't
seem a good omen. End parenthesis.

For what you're doing, a list comp isn't really appropriate. Broadly,
a list comp should be creating zero or one elements from each element
of the source list; what you're trying to do here is stitching things
together, which requires state. You can't do that with a list comp.
The best way is probably what you already have, but if you'd like it
to be shorter, you need simply invent a split marker that can't
possibly exist in your queries, and use that. Let's suppose "\0" can't
ever occur (that's likely, given that you're working with SQL).

sample = ['drop table sample_table;', 'create table sample_test',
'(col1 int);', 'select col1 from', ' sample_test;']
pure_sqls = [s.replace('\0','') for s in
'\0'.join(sample+['']).split(';\0') if s!='']

Assuming the exact sequence ";\0" never comes up in your text, this
will work perfectly. You could change out the replace call to put a
delimiter in, if that made sense:
pure_sqls = [s.replace('\0',' ') for s in
'\0'.join(sample+['']).split(';\0') if s!='']

This is also a reasonable example of a filtered list comp, as it'll
suppress any blank entries in the result. Whether this is better or
worse than trimming off the last unit depends on how you want to treat
text after the last semicolon.

ChrisA



More information about the Python-list mailing list