generating SQL - problem

Alex Martelli aleaxit at yahoo.com
Wed Aug 8 17:51:44 EDT 2001


"Lee" <lee.reilly at ntlworld.com> wrote in message
news:3B71AA2D.E9D219B7 at ntlworld.com...
> I wonder if someone could offer me a little advice. I have a large
> (155KB) text file holding MP3 data in the format:
>
> Filename with path   ; Title    ; Artistd
> d:\\SRS\\Breed.mp3      ; Breed         ; Snake River Conspiracy
>
> I want to parse this file and create a single SQL insert statement, but

I wouldn't be surprised if your SQL engine failed to accept a 200K
statement, but that's another issue.  Also, the string you're building
is not a single insert statement... it's a lot of them separated by
semicolons (you COULD build a single insert if you wanted, since
an insert can insert many records in a table).

> I am having a few problems. I have the following script, which works
> fine (may not be pretty, but it works ;-)
>
> #
>
,---------------------------------------------------------------------------
--+
>
> # | createMP3SQL.py
> # |
> # | import string
> # |
> # | fileobject=open('newdata.txt', 'r')  # open the playlist
> # | line = fileobject.readline()          # read the first line
> # |
> # | FullSQL = ''  # the final query string
> # | end = 'false' # test condition - true when end of file is reached
> # | while (end!='true'):
> # |   try:
> # |     line = fileobject.readline()
> # |     trackdata=string.split(line, ";")
> # |     file=string.rstrip(trackdata[0])
> # |     title=string.lstrip(string.rstrip(trackdata[1]))
> # |     artist=string.lstrip(string.rstrip(trackdata[2]))
> # |     SQL = 'insert into MP3s values (\'' + file + '\', \'' + title +
> '\', \''
> # |     + artist + '\')'
> # |     FullSQL = FullSQL + SQL + ";"
> # |   except Exception:
> # |     end='true'

It's VERY inefficient to build up a huge string by concatenating
many small ones, one by one.  Here's a far faster way:

result = []
for line in fileobject.readlines():
    trackdata = line.split(',')
    file, title, artist = map(string.strip, trackdata[:3])
    result.append("insert into MP3 values('%s','%s','%s');"%
        (file, title, artist))
fullSQL = ''.join(result)

It still has issues (if you have any artist named "O'Malley" you're
going to be in SQL syntax trouble:-) but I think it's not too bad
now.

> When I run this script is takes ~5 secs to process, and when I type in

I bet my version's faster...?

> 'FullSQL' to display the text (the SQL string) in IDLE it halts
> completely.

I think that's the problem -- _IDLE_ may well have limits to
the length of strings it can display.  Try showing len(fullSQL)
first, then fullSQL[-99:] to see just the last 99 characters, &c.


Alex






More information about the Python-list mailing list