Simulating call-by-reference

Bengt Richter bokr at oz.net
Thu Nov 17 22:47:24 EST 2005


On Thu, 17 Nov 2005 10:03:50 GMT, Rikard Bosnjakovic <bos at REMOVETHIShack.org> wrote:

>I'm tidying up some code. Basically, the code runs a bunch of 
>regexp-searches (> 10) on a text and stores the match in a different variable.
>
>Like this:
>
>     re1 = r' ..(.*).. '
>     re2 = r' .... '
>     re3 = r' .(.*).. '
>     ...
>     m = re.search(re1, data)
>     if m:
>       myclass.bar = m.group(1)
>
>     m = re.search(re2, data)
>     if m:
>       myclass.foo = m.group(1)
>
>     m = re.search(re3, data)
>     if m:
>       myclass.baz = m.group(1)
>
>
>While this code works, it's not very good looking.
>
>What I want is to rewrite it to something like this:
>
>    l = [ (re1, myclass.bar),
>          (re2, myclass.foo),
>          (re3, myclass.baz),
>        ]
>
>    for (x,y) in l:
>      m = re.search(x, data)
>      if m:
>           y = m.group(1)
>
>But since Python doesn't work that way, that idea is doomed. What I'm 
>looking for are other (better) ways or pointers to accomplish this task of 
>cleanup.
>
You could tag your regexes with the foo, bar, and baz names as named groups,
and pre-compile them. Then you could do something like

 >>> import re
 >>> re1 = re.compile(r'(?P<bar>\d+)') # find an int
 >>> re2 = re.compile(r'(?P<foo>[A-Z]+)') # find a cap seq
 >>> re3 = re.compile(r'(?P<baz>[a-z]+)') # find a lower case seq
 >>>
 >>> data = 'abc12 34CAPS lowercase'
 >>>
 >>> class myclass(object): pass # ??
 ...
 >>> class myotherclass(object): pass # ???
 ...
 >>> L = [ (re1, myclass),
 ...       (re2, myclass),
 ...       (re3, myotherclass),
 ...     ]
 >>> for (rx, cls) in L:
 ...     m = rx.search(data)
 ...     if m:
 ...         setattr(cls, *m.groupdict().items()[0])
 ...
 >>> myclass.bar
 '12'
 >>> myclass.foo
 'CAPS'
 >>> myotherclass.baz
 'abc'

Of course, this only picks up a single named group per regex, so this specific
code might not work for other searches you might like to do. Also, if you don't
need an alternate myotherclass, DRY says don't repeat the class in L. I.e., you
could write (spelling myclass more conventionally)

 >>> class MyClass(object): pass # ??
 ...
 >>> L = (re1, re2, re3)
 >>> for (rx) in L:
 ...     m = rx.search(data)
 ...     if m: setattr(MyClass, *m.groupdict().items()[0])
 ...
 >>> for k,v in MyClass.__dict__.items(): print '%15s: %r'%(k,v)
 ...
      __module__: '__main__'
             bar: '12'
             baz: 'abc'
        __dict__: <attribute '__dict__' of 'MyClass' objects>
             foo: 'CAPS'
     __weakref__: <attribute '__weakref__' of 'MyClass' objects>
         __doc__: None
 >>> for it in (it for it in MyClass.__dict__.items() if not it[0].startswith('_')): print '%15s: %r'%it
 ...
             bar: '12'
             baz: 'abc'
             foo: 'CAPS'
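
The sessions above are Python 2: there dict.items() returns a list, so
.items()[0] works, and print is a statement. A rough sketch of the same loop
for Python 3, assuming the same three patterns and data (the names patterns
and public are only for illustration), might look like this:

```python
import re

# Same patterns as above; each carries exactly one named group.
patterns = (
    re.compile(r'(?P<bar>\d+)'),     # find an int
    re.compile(r'(?P<foo>[A-Z]+)'),  # find a cap seq
    re.compile(r'(?P<baz>[a-z]+)'),  # find a lower case seq
)

data = 'abc12 34CAPS lowercase'

class MyClass(object):
    pass

for rx in patterns:
    m = rx.search(data)
    if m:
        # Dict views are not indexable in Python 3, so take the first
        # (name, value) pair with next(iter(...)) instead of .items()[0].
        name, value = next(iter(m.groupdict().items()))
        setattr(MyClass, name, value)

# Show only the attributes we set, skipping the underscore names.
public = {k: v for k, v in vars(MyClass).items() if not k.startswith('_')}
print(sorted(public.items()))
```

The last line should print the bar, baz, and foo pairs, matching the filtered
listing above.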

The
    setattr(MyClass, *m.groupdict().items()[0])

just assigns whatever comes out as the first of the name-tagged matches (of
which there has to be at least one here, too). The * unpacks that (name, value)
pair into the second and third arguments of setattr. If you want several
name-tagged matches in a single regex, you could do that and do a setattr for
each item in m.groupdict().items().
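
As a minimal sketch of the several-groups-per-regex idea (Python 3 syntax; the
combined pattern below is made up for the sample data and is not from the
original code), one setattr per item of groupdict() could look like this:

```python
import re

# Hypothetical pattern: three named groups in a single regex.
rx = re.compile(r'(?P<baz>[a-z]+)(?P<bar>\d+)\s+\d+(?P<foo>[A-Z]+)')

data = 'abc12 34CAPS lowercase'

class MyClass(object):
    pass

m = rx.search(data)
if m:
    # groupdict() maps every group name to the text it matched.
    for name, value in m.groupdict().items():
        setattr(MyClass, name, value)

print(MyClass.bar, MyClass.foo, MyClass.baz)
```

Note that a named group which did not take part in the match shows up in
groupdict() with the value None, so with optional groups you may want to skip
those before calling setattr.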

What else you can do is only limited by your imagination ;-)

Regards,
Bengt Richter


