regular expressions eliminating filenames of type foo.thumbnail.jpg

oscartheduck oscartheduck at gmail.com
Mon Jun 25 00:48:53 EDT 2007


Hi folks,

I'm trying to alter a program I posted about a few days ago. It
creates thumbnail images from master images. Nice and simple.

To make sure I can match all variations in spelling of jpeg, and
different cases, I'm using regular expressions.


The code is currently:

-----

#!/usr/bin/env python
from PIL import Image
import glob, os, re

size = 128, 128

def thumbnailer(dir, filenameRx):
     for picture in [p for p in os.listdir(dir)
                     if os.path.isfile(os.path.join(dir, p))
                     and filenameRx.match(p)]:
         file, ext = os.path.splitext(picture)
         im = Image.open (picture)
         im.thumbnail(size, Image.ANTIALIAS)
         im.save(file + ".thumbnail" + ext)

jpg = re.compile(".*\.(jpg|jpeg)", re.IGNORECASE)
thumbnailer(".", jpg)

-----

The problem is this: the code outputs foo.thumbnail.jpg when run, and
when run again it creates foo.thumbnail.thumbnail.jpg, and so on,
filling the directory.
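
The repeated matching can be reproduced without touching any images. This is a minimal sketch using the same pattern as the script above; the file name is made up for illustration:

```python
import os
import re

# Same pattern as in the script above.
jpg = re.compile(r".*\.(jpg|jpeg)", re.IGNORECASE)

# Hypothetical file name left behind by a first run.
name = "foo.thumbnail.jpg"

# The pattern only looks at the extension, so the thumbnail matches too.
matched = jpg.match(name) is not None

# splitext() splits off only the last extension, so the new name
# gains a second ".thumbnail".
base, ext = os.path.splitext(name)
second_run_name = base + ".thumbnail" + ext

print(matched, second_run_name)
```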

The obvious solution is to filter out any name that contains the term
"thumbnail", which I can once again do with a regular expression. My
problem is the construction of this expression.

The relevant page in the library docs is: http://docs.python.org/lib/re-syntax.html

It lists (?<!...) as the syntax for a negative lookbehind assertion, with the example given being
 >>> m = re.search('(?<!abc)def', 'abcdef')
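
For reference, here is how that documentation example behaves when run. The second search, with a made-up 'xyzdef' string, is added only for contrast:

```python
import re

# Documentation example: "def" is preceded by "abc", so the
# negative lookbehind rejects it and there is no match.
m = re.search('(?<!abc)def', 'abcdef')
blocked = m is None

# With a different prefix the same pattern does match "def".
m2 = re.search('(?<!abc)def', 'xyzdef')
found = m2.group(0)

print(blocked, found)
```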


I tried adding something like that to my original regex, but it added
a third argument, which re.compile can't accept.

jpg = re.compile("(?<!thumbnail).*\.(jpg|jpeg)", ".*\.(jpg|jpeg)",
re.IGNORECASE)

I tried it with re.search instead and received a lot of errors.


 So I tried this:

jpg = re.compile(".*\.(jpg|jpeg)", re.IGNORECASE)
jpg = re.search("(?<!*thumbnail).jpg", "jpg")
thumbnailer(".", jpg)


Two assignments, but I receive more errors telling me this:

[james@devil ~/pictures]$ ./thumbnail.2.py
Traceback (most recent call last):
  File "./thumbnail.2.py", line 15, in ?
    jpg = re.search("(?<!*thumbnail).jpg", "jpg")
  File "/usr/local/lib/python2.4/sre.py", line 134, in search
    return _compile(pattern, flags).search(string)
  File "/usr/local/lib/python2.4/sre.py", line 227, in _compile
    raise error, v # invalid expression
sre_constants.error: nothing to repeat
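
The message appears to come from the `*`. It is a repetition operator, and directly after `(?<!` there is nothing in front of it to repeat. The sketch below reproduces this in current Python 3 (the traceback above is from Python 2.4). It also tries `.*thumbnail`, on the assumption that this was the intended form. That form is rejected as well, because the `re` module only accepts lookbehind patterns of fixed width. `compile_error` is a hypothetical helper written for this illustration.

```python
import re

def compile_error(pattern):
    """Return the error message from compiling pattern, or None if it compiles."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return None

# "*" directly after "(?<!" has nothing in front of it to repeat.
err_star = compile_error(r"(?<!*thumbnail).jpg")

# ".*" can match any number of characters, which a lookbehind cannot accept.
err_width = compile_error(r"(?<!.*thumbnail).jpg")

print(err_star)
print(err_width)
```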




--------------------------------------------


I'm stuck on where to go from here. The code which
produced the above error is:

---------
#!/usr/bin/env python
from PIL import Image
import glob, os, re

size = 128, 128

def thumbnailer(dir, filenameRx):
     for picture in [p for p in os.listdir(dir)
                     if os.path.isfile(os.path.join(dir, p))
                     and filenameRx.match(p)]:
         file, ext = os.path.splitext(picture)
         im = Image.open (picture)
         im.thumbnail(size, Image.ANTIALIAS)
         im.save(file + ".thumbnail" + ext)

jpg = re.compile(".*\.(jpg|jpeg)", re.IGNORECASE)
jpg = re.search("(?<!*thumbnail).jpg", "jpg")
thumbnailer(".", jpg)
png = re.compile(".*\.png", re.IGNORECASE)
thumbnailer(".", png)
gif = re.compile(".*\.gif", re.IGNORECASE)
thumbnailer(".", gif)

---------


I'd like to know where I can find more information about regexes and
how to think with them, as well as some idea of the solution to this
problem. As it stands, I can solve it with a simple os.system call and
allow my OS to do the hard work, but I'd like the code to be portable.
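
One possible direction, as a sketch only: a negative lookahead at the start of the pattern can reject any name containing "thumbnail" before the extension is checked. It uses nothing but the `re` module, so it should be as portable as the rest of the script. The pattern, the `masters` helper, and the directory listing are all assumptions made up for illustration. Note that this would also skip a master image whose own name happens to contain "thumbnail".

```python
import re

# Hypothetical pattern: reject any name containing "thumbnail", then
# require a .jpg or .jpeg extension at the very end of the name.
jpg = re.compile(r"(?!.*thumbnail).*\.(jpg|jpeg)$", re.IGNORECASE)

def masters(names, filename_rx):
    """Return the names that filename_rx accepts (hypothetical helper)."""
    return [n for n in names if filename_rx.match(n)]

# Made-up directory listing for illustration.
listing = ["foo.jpg", "foo.thumbnail.jpg", "BAR.JPEG",
           "bar.thumbnail.thumbnail.jpeg", "notes.txt"]

selected = masters(listing, jpg)
print(selected)
```

A compiled pattern like this could be passed to thumbnailer() in place of the original jpg pattern, since only match() is called on it.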



