basic question: target assignment in for loop

Carl Banks imbosol-1046071041 at aerojockey.com
Mon Feb 24 02:34:19 EST 2003


Kawaldeep Grewal wrote:
> hello,
> 
> this may be a faq, and if it is, I would appreciate a pointer in the 
> right direction.
> 
> I'm using python to edit some text/html with the re module. I want to do 
> this:
> 
> html = htmlFile.readlines()
> for line in html:
>       line = re.sub("regexString", functionReturningString, line)
> 
> but python assigns the target by value and not by reference, which (in 
> my mind) breaks the abstraction. So, I have to resort to this:
> 
> html = htmlFile.readlines()
> i = 0
> while i < len(html):
>        html[i] = re.sub("regexString", functionReturningString, html[i])
>        i = i + 1
> 
> this code is decidedly not elegant, and looks very C-ish. As I'm new to 
> python, can anyone tell me whether I'm just confused or that this is the 
> way to do things?


Sometimes it's best to just build up a separate list.  For example:

    oldhtml = htmlFile.readlines()
    newhtml = []
    for line in oldhtml:
        newhtml.append(re.sub("regexString", functionReturningString, line))
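
The reason your first loop appears to do nothing is that "line = ..." only
rebinds the name line to a new string; it never touches the list element it
came from. Here is a minimal sketch of that, next to the separate-list
version. The sample lines and the shout function are made up for
illustration, standing in for your file contents and functionReturningString:

```python
import re

def shout(match):
    # Hypothetical replacement function: upper-case whatever matched.
    return match.group(0).upper()

lines = ["<b>one</b>\n", "<b>two</b>\n"]

# Rebinding the loop variable leaves the list itself untouched.
for line in lines:
    line = re.sub("<b>", shout, line)
unchanged = list(lines)

# Appending each result to a separate list keeps the new strings.
newlines = []
for line in lines:
    newlines.append(re.sub("<b>", shout, line))
```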


You can get the same effect in a single expression with a list
comprehension, which is more compact but takes a little getting used to:

    html = [ re.sub("regexString", functionReturningString, line)
             for line in htmlFile.readlines() ]


And if you would rather use an index, you would find the xrange
function useful; it produces the indices one at a time rather than
building a whole list of them:

    html = htmlFile.readlines()
    for i in xrange(len(html)):
        html[i] = re.sub("regexString", functionReturningString, html[i])
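
If your Python has enumerate (added in 2.3), you can get the index and the
line together and skip the html[i] lookup on the right-hand side. The sketch
below should also run on Python 3, where xrange is gone and range takes its
place. The sample data and the bracket function are invented for the example:

```python
import re

def bracket(match):
    # Hypothetical replacement function: wrap the match in square brackets.
    return "[" + match.group(0) + "]"

html = ["a1 b22\n", "no digits\n", "c333\n"]

# enumerate yields (index, item) pairs; assigning to html[i]
# replaces the element in place, so the list really changes.
for i, line in enumerate(html):
    html[i] = re.sub(r"\d+", bracket, line)
```

Replacing elements this way is safe during iteration because the length of
the list never changes.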


A couple of other points.  First, it will probably benefit you speedwise
to use a compiled regular expression, since the same pattern is applied
to every line.  Do this by using re.compile:

    pattern = re.compile("regexString")
    html = htmlFile.readlines()
    for i in xrange(len(html)):
        html[i] = pattern.sub(functionReturningString, html[i])
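
For a self-contained version you can run, here is a sketch with a made-up
pattern and replacement function (to_strong is hypothetical, as is the sample
data). Note that the re module keeps its own cache of recently compiled
patterns, so the gain from compiling by hand is mostly in skipping that
lookup on every call; how much it matters depends on your data:

```python
import re

# Compile once, outside the loop.
pattern = re.compile(r"<(/?)b>")

def to_strong(match):
    # Hypothetical replacement: turn <b> and </b> into <strong> and </strong>.
    # group(1) is "/" for a closing tag and "" for an opening tag.
    return "<" + match.group(1) + "strong>"

html = ["<b>bold</b> text\n", "plain\n"]

# The compiled pattern's sub method takes the replacement first, then the string.
html = [pattern.sub(to_strong, line) for line in html]
```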


Second, if your goal is simply to replace one string (or regex) with
another everywhere in a file, and then write it out, this can be done
much more efficiently with one sweep over the entire buffer (provided
the pattern does not depend on where the lines begin and end):

    pattern = re.compile("regexString")
    buf = htmlFile.read()
    buf = pattern.sub(functionReturningString, buf)
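
One thing to watch when switching from lines to a whole buffer: anchors such
as ^ and $ behave differently unless you pass re.MULTILINE. A small sketch
with invented sample text, comparing the whole-buffer result against the
line-by-line one:

```python
import re

buf = "# first\nkeep # this\n# second\n"

# Without re.MULTILINE, ^ matches only at the very start of the buffer.
plain = re.compile(r"^# ")
once = plain.sub("// ", buf)

# With re.MULTILINE, ^ also matches right after each newline,
# which reproduces what a line-by-line loop would do.
multi = re.compile(r"^# ", re.MULTILINE)
whole = multi.sub("// ", buf)

# Line-by-line version for comparison; splitlines(True) keeps the newlines.
per_line = "".join([plain.sub("// ", line) for line in buf.splitlines(True)])
```

Here once changes only the first line, while whole and per_line agree.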


-- 
CARL BANKS



