Regular expressions

Tim Chase python.list at tim.thechases.com
Tue Nov 3 22:12:08 EST 2015


On 2015-11-03 19:04, Michael Torrie wrote:
> Grep can use regular expressions (and I do so with it regularly),
> but it's default mode is certainly not regular expressions, and it
> is still very powerful.

I suspect you're thinking of `fgrep` (AKA "grep -F") which uses fixed
strings rather than regular expressions.  By default, `grep` certainly
does use regular expressions:

  tim at linux$ seq 5 | grep "1*"
  tim at bsd$ jot 5 | grep "1*"

will output the entire input, not just lines containing a "1"
followed by an asterisk.

> I've never used regular expressions in a database query language;
> until this moment I didn't know any supported such things in their
> queries.  Good to know.  How you would index for regular
> expressions in queries I don't know.

At least PostgreSQL allows for creating indexes on a particular
regular expression.  E.g. (shooting from the hip so I might have
missed something):

  CREATE TABLE contacts (
   -- ...
   phonenumber VARCHAR(15),
   -- ...
   )
  CREATE INDEX contacts_just_phone_digits_idx
   ON contacts((regexp_replace(phonenumber, '[^0-9]', '')));

  INSERT INTO contacts(..., phonenumber, ...)
   VALUES (..., '800-555-1212', ...)

  SELECT *
  FROM contacts
  WHERE -- should use contacts_just_phone_digits_idx
   regexp_replace(phonenumber, '[^0-9]', '') = '8005551212';

It's not as helpful as one might hope because you're stuck using a
fixed regexp rather than an arbitrary regexp, but if you have a
particular regexp you search for frequently, you can index it.
Otherwise, you'd be doing full table-scans (or at least a full scan
of whatever subset the active non-regexp'ed index yields) which can
be pretty killer on performance.

You'd have to research on other DB engines.

-tkc







More information about the Python-list mailing list