sqlite user-defined functions & unicode issue

Ben Wolfson wolfson at gmail.com
Wed Jan 24 00:24:52 EST 2007


I've got a db some of whose elements have been created automatically
from filesystem data (whose encoding is iso-8859-1).  If I try to
select one of those elements using a standard SQL construct, things
work fine:

>>> sqlite.register_converter("text", unicode)
>>> db = sqlite.connect(".mdb/database", detect_types=sqlite.PARSE_COLNAMES)
>>> cursor = db.cursor()
>>> cursor.execute("SELECT * FROM artists WHERE name LIKE ?", ("John
Butcher%",)).fetchall()
[(1653, u'John Butcher & Eddie Pr\xe9vost')]
>>>

But if I use the pysqlite recipe for searching with a regexp, things don't work:
>>> db.create_function("regexp", 2, regexp)
>>> cursor.execute("SELECT * FROM artists WHERE REGEXP(?, name)",
('John Butcher',)).fetchall()

Traceback (most recent call last):
  File "<pyshell#14>", line 1, in -toplevel-
    cursor.execute("SELECT * FROM artists WHERE REGEXP(?, name)",
('John Butcher',)).fetchall()
OperationalError: user-defined function raised exception
>>>

Testing reveals that "item" is passed as None:
>>> def regexp(expr, item):
	if item is None: return False
	r = re.compile(expr, re.I)
	return r.search(item) is not None

>>> db = sqlite.connect(".mdb/database", detect_types=sqlite.PARSE_COLNAMES)
>>> db.create_function("regexp", 2, regexp)
>>> cursor.execute("SELECT * FROM artists WHERE REGEXP(?, name)",
('John Butcher',)).fetchall()
[]
>>>

How can I get around this?  I really want to be able to search by
regexp, and not just the standard SQL %-pattern.

-- 
Ben Wolfson
"However, identifying what we call 'time' or even 'space', which I
shall mention soon, is a very difficult problem, and a philosopher
would say that it is an extremely annoying subject."
(Soseki Natsume, "The Philosophical Foundations of Literature")



More information about the Python-list mailing list