Question about concatenation error

colonel thecamel at camelrichard.org
Wed Sep 7 12:34:25 EDT 2005


I am new to Python and I am confused about why concatenating
3 strings isn't working properly.

Here is the code:

------------------------------------------------------------------------------------------
import string
import sys
import re
import urllib

linkArray = []
srcArray = []
website = sys.argv[1]

urllib.urlretrieve(website, 'getfile.txt')           

filename = "getfile.txt"                            
input = open(filename, 'r')                          
reg1 = re.compile('href=".*"')                       
reg3 = re.compile('".*?"')                           
reg4 = re.compile('http')                            
Line = input.readline()                             

while Line:                                          
    searchstring1 = reg1.search(Line)                
    if searchstring1:                                
        rawlink = searchstring1.group()              
        link = reg3.search(rawlink).group()          
        link2 = link.split('"')                      
        cleanlink = link2[1:2]                       
        fullink = reg4.search(str(cleanlink))
        if fullink:
            linkArray.append(cleanlink)                          
        else:
            cleanlink2 = str(website) + "/" + str(cleanlink)
            linkArray.append(cleanlink2)
    Line = input.readline()                                   

print linkArray
-----------------------------------------------------------------------------------------------

I get this:

["http://www.slugnuts.com/['index.html']",
"http://www.slugnuts.com/['movies.html']",
"http://www.slugnuts.com/['ramblings.html']",
"http://www.slugnuts.com/['sluggies.html']",
"http://www.slugnuts.com/['movies.html']"]

instead of this:

["http://www.slugnuts.com/index.html",
"http://www.slugnuts.com/movies.html",
"http://www.slugnuts.com/ramblings.html",
"http://www.slugnuts.com/sluggies.html",
"http://www.slugnuts.com/movies.html"]

The concatenation isn't working the way I expected it to.  I suspect
that I am screwing up by mixing types, but I can't see where...
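
To narrow down where the types might diverge, here is a minimal sketch
that isolates the split-and-slice step from the code above. The sample
value of link is made up for illustration (it stands in for one match
from the page), and the names as_slice, as_item, from_slice and
from_item are not in the original script. It appears that link2[1:2]
is a slice, which gives back a one-element list rather than a string,
so str() on it produces the brackets and quotes seen in the output:

```python
# Hypothetical sample value standing in for one matched link from the page.
link = '"index.html"'
website = "http://www.slugnuts.com"

link2 = link.split('"')   # splitting on the quote gives ['', 'index.html', '']
as_slice = link2[1:2]     # a slice returns a list: ['index.html']
as_item = link2[1]        # an index returns the element itself: 'index.html'

# str() of a list is its printed form, brackets and quotes included.
from_slice = website + "/" + str(as_slice)
# Concatenating the string element directly gives a plain URL.
from_item = website + "/" + as_item

print(from_slice)
print(from_item)
```

If that reading is right, indexing with link2[1] instead of slicing with
link2[1:2] would give the plain string, and the str() call around it
would no longer be needed.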

I would appreciate any advice or pointers.

Thanks.


