Performance Problems when selecting HUGE amounts of data from MySQL dbs

Skip Montanaro skip at pobox.com
Wed Jan 9 11:14:06 EST 2002


    Gabriel>       data=db.fetchall()
    Gabriel>       data="".join(data)?
    Gabriel> which I assume should be way faster as it doesn't need to copy
    Gabriel> the data to another list first (which is nice from a memory
    Gabriel> requirement point of view as well)?

I was just about to post something like that.  Yes, that would work,
assuming db.fetchall() returns a list of strings.  On my system (using a
recent version of MySQLdb), however, it always returns a sequence of
one-element tuples:

    >>> c.execute("select name from performers where name != '' limit 10")
    10L
    >>> data = c.fetchall()
    >>> data
    ((' & Seville',),
     (' A Special Gues" Kathy Mattea Trio',),
     (' A Special Guest Dspatch',),
     (' A Special Guest Eddie From Ohio',),
     (' A Special Guest G.Love andSpecial Sauce',),
     (' A Special Guest Marshall Tucker Band',),
     (' A Special Gues Nelly Furado',),
     (" A Special Guest Slash's Snake Pit",),
     (' A Special Guest Tsla',),
     (' A Tribute To The Beatles',))

so I'd do something like

    data = "".join([x[0] for x in c.fetchall()])

to create the giant string.
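
If memory is the real concern, you can also avoid materializing the whole
result at once.  Here's a minimal sketch using the standard DB-API
fetchmany() call (the batch size of 1000 is an arbitrary choice, not
anything MySQLdb requires):

    chunks = []
    while 1:
        rows = c.fetchmany(1000)
        if not rows:
            break
        # each row is a one-element tuple; keep just the string
        chunks.extend([row[0] for row in rows])
    data = "".join(chunks)

Note that with the default cursor class MySQL still buffers the entire
result set on the client side, so this mainly saves the intermediate
tuple-of-tuples; if your MySQLdb has MySQLdb.cursors.SSCursor, passing
that as cursorclass streams rows from the server instead.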

    Gabriel> I was thinking about using that particular solution, but then
    Gabriel> mysql-python wouldn't provide me with fetchall() and
    Gabriel> fetchallDict() in the same instance but maybe this works in
    Gabriel> 0.9.1 to which I just upgraded...

Most of the time I tell MySQLdb to create a DictCursorNW (dictionary cursor
that ignores warnings):

    self.db = MySQLdb.Connection(host=self.host,
                                 user=self.user,
                                 passwd=self.password,
                                 db=self.database,
                                 cursorclass=MySQLdb.cursors.DictCursorNW)
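
With a dictionary cursor each row comes back as a dict keyed by column
name, so the join above would be spelled with the column name instead of
an index (assuming the same performers query, where the selected column
is "name"):

    c = self.db.cursor()
    c.execute("select name from performers where name != '' limit 10")
    data = "".join([row["name"] for row in c.fetchall()])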

-- 
Skip Montanaro (skip at pobox.com - http://www.mojam.com/)



