Import a textfile to MS SQL with python

Tim Golden tim.golden at viacom-outdoor.co.uk
Wed Sep 6 05:53:54 EDT 2006


[joel.sjoo at gmail.com]

| I'm a dba for SQL server and I Will import a textfile to SQL. For
| example I use a file with 3 columns. ID, Name and Surname and the
| columns are tab separated. I don't know much about programming.
| Anyway, I use this code below. It works, but it will not split the
| columns. 

Split the problem into two parts:

1) Determine the correct row/column values from your tab-separated file
2) Write the values into your database table

The first part is probably best handled by the built-in csv
module. While you can roll your own there are quite a few
gotchas you have to dodge - embedded delimiters and so on. 

Something like this:

<code>
import csv

# by default reader uses "," as delimiter; specify tab instead

reader = csv.reader (open ("test.tsv"), delimiter="\t")
data = []
for line in reader:
  data.append (line)

# or data = list (reader)

print data
#
# Something like:
# [[1, "Tim", "Golden"], [2, "Fred", "Smith"], ...]
#

</code>

OK, now you've got a list of lists, each entry being one
row in your original file, each item one column. To get
it into your database, you'll need something like the
following -- ignoring the possibility of executemany.

<code>
# uses data from above
import <database module> # pymssql, oracle, sqlite, etc.

db = <database module>.connect (... whatever you need ...)
q = db.cursor ()
for row in data:
  q.execute (
    "INSERT INTO python (id, namn, efternamn) VALUES (?, ?, ?)",
    row
  )

db.commit () # if needed etc.
db.close ()
</code>

This works because the DB-API says that an .execute takes
as its first parameter the SQL command plus any parameters 
as "?" (or something else depending on the paramstyle, 
but this is probably the most common). Then as the second
parameter you pass a list/tuple containing as many items
as the number of "?" in the command. You don't need to worry
about quoting for strings etc; the db interface module should
take care of that.

Behind the scenes, this code will be doing something like this
for you:

INSERT INTO python (id, namn, efternamn) VALUES (1, 'Tim', 'Golden')
INSERT INTO python (id, namn, efternamn) VALUES (2, 'Fred', 'Smith')

and so on, for all the rows in your original data.

Some db interface modules implement .executemany, which means that
you specify the statement once and pass the whole list at one go.
Whether it's more efficient than looping yourself depends on what's
happening behind the scenes. It's certainly a touch tidier.

Hope all that is intelligble and helpful
TJG

________________________________________________________________________
This e-mail has been scanned for all viruses by Star. The
service is powered by MessageLabs. For more information on a proactive
anti-virus service working around the clock, around the globe, visit:
http://www.star.net.uk
________________________________________________________________________



More information about the Python-list mailing list