scanf style parsing

Duncan Booth duncan at NOSPAMrcp.co.uk
Thu Sep 27 07:32:35 EDT 2001


tim at vegeta.ath.cx (Tim Hammerquist) wrote in 
news:slrn9r61oo.uim.tim at vegeta.ath.cx:

> But don't think regexes are disposable just because Python's string type
> is more convenient.  Consider the following:
> 
>     # perl
>     if ($filename =~ /\.([ps]?html?|cgi|php[\d]?|pl)$/) { ... }
>     # python
>     re_web_files = re.compile(r'\.([ps]?html?|cgi|php[\d]?|pl)$')
>     m = re_web_files.search(filename)
>     if m:
>         ...
> 
> This is a very complicated (but relatively efficient) way to match files
> with all the following extensions:
>     .htm    .html   .shtm   .shtml  .phtm   .phtml
>     .cgi
>     .php    .php2   .php3   .php4
>     .pl

Wouldn't you be happier with this?

   import os

   extensions = ['.htm', '.html', '.shtm', '.shtml', '.phtm',
        '.phtml', '.cgi', '.php', '.php2', '.php3', '.php4', '.pl']
   ext = os.path.splitext(filename)[1]
   if ext in extensions:
      ...

which has the arguable advantage of matching what your description says 
instead of what your original code does.
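
To make the difference concrete, here is a minimal sketch that runs both checks side by side. The file names are made up for illustration, and the helper names matches_regex and matches_list are hypothetical, not something from the quoted code:

```python
import os.path
import re

# Pattern copied from the quoted message.
re_web_files = re.compile(r'\.([ps]?html?|cgi|php[\d]?|pl)$')

# Explicit collection matching the written description
# (a set here, purely for quick membership tests).
EXTENSIONS = {'.htm', '.html', '.shtm', '.shtml', '.phtm', '.phtml',
              '.cgi', '.php', '.php2', '.php3', '.php4', '.pl'}


def matches_regex(filename):
    """True if the quoted regex finds a web-file extension."""
    return re_web_files.search(filename) is not None


def matches_list(filename):
    """True if the file's extension is one of the listed ones."""
    return os.path.splitext(filename)[1] in EXTENSIONS


# Hypothetical file names, chosen to show where the two checks disagree.
for name in ['index.html', 'form.php3', 'form.php9']:
    print(name, matches_regex(name), matches_list(name))
```

The two agree on index.html and form.php3, but the regex also accepts form.php9 (php[\d]? allows any single digit), which the description never mentions and the explicit list rejects.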

regexes are wonderful: in moderation.

-- 
Duncan Booth                                             duncan at rcp.co.uk
int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3"
"\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?


