Question about concatenation error

colonel thecamel at camelrichard.org
Wed Sep 7 12:42:27 EDT 2005


On Wed, 07 Sep 2005 16:34:25 GMT, colonel <thecamel at camelrichard.org>
wrote:

>I am new to Python and I am confused as to why, when I try to
>concatenate 3 strings, it isn't working properly.
>
>Here is the code:
>
>------------------------------------------------------------------------------------------
>import string
>import sys
>import re
>import urllib
>
>linkArray = []
>srcArray = []
>website = sys.argv[1]
>
>urllib.urlretrieve(website, 'getfile.txt')           
>
>filename = "getfile.txt"                            
>input = open(filename, 'r')                          
>reg1 = re.compile('href=".*"')                       
>reg3 = re.compile('".*?"')                           
>reg4 = re.compile('http')                            
>Line = input.readline()                             
>
>while Line:                                          
>    searchstring1 = reg1.search(Line)                
>    if searchstring1:                                
>        rawlink = searchstring1.group()              
>        link = reg3.search(rawlink).group()          
>        link2 = link.split('"')                      
>        cleanlink = link2[1:2]                       
>        fullink = reg4.search(str(cleanlink))
>        if fullink:
>            linkArray.append(cleanlink)                          
>        else:
>            cleanlink2 = str(website) + "/" + str(cleanlink)
>            linkArray.append(cleanlink2)
>    Line = input.readline()                                   
>
>print linkArray
>-----------------------------------------------------------------------------------------------
>
>I get this:
>
>["http://www.slugnuts.com/['index.html']",
>"http://www.slugnuts.com/['movies.html']",
>"http://www.slugnuts.com/['ramblings.html']",
>"http://www.slugnuts.com/['sluggies.html']",
>"http://www.slugnuts.com/['movies.html']"]
>
>instead of this:
>
>["http://www.slugnuts.com/index.html",
>"http://www.slugnuts.com/movies.html",
>"http://www.slugnuts.com/ramblings.html",
>"http://www.slugnuts.com/sluggies.html",
>"http://www.slugnuts.com/movies.html"]
>
>The concatenation isn't working the way I expected it to.  I suspect
>that I am screwing up by mixing types, but I can't see where...
>
>I would appreciate any advice or pointers.
>
>Thanks.


Okay.  It works if I change:

        fullink = reg4.search(str(cleanlink))
        if fullink:
            linkArray.append(cleanlink)                          
        else:
            cleanlink2 = str(website) + "/" + str(cleanlink)

to

        fullink = reg4.search(cleanlink[0])
        if fullink:
            linkArray.append(cleanlink[0])                          
        else:
            cleanlink2 = str(website) + "/" + cleanlink[0]


So can anyone tell me why "cleanlink" gets converted to a list?  Is it
during the slicing?
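
A minimal sketch of what seems to be going on, assuming the regex step
yields something like '"index.html"' (the sample value below is made up
for illustration and is not taken from the real page). A slice such as
link2[1:2] returns a list even when it covers a single element, while an
index such as link2[1] returns the element itself, and str() of a list
keeps the brackets and quotes:

```python
# Hypothetical sample value standing in for what reg3 matches in the script.
link = '"index.html"'
link2 = link.split('"')        # ['', 'index.html', '']

sliced = link2[1:2]            # slicing returns a list: ['index.html']
indexed = link2[1]             # indexing returns the string: 'index.html'

website = "http://www.slugnuts.com"
bad = website + "/" + str(sliced)    # str() of a list keeps [' and ']
good = website + "/" + indexed       # plain string concatenation

print(bad)
print(good)
```

If that is right, writing cleanlink = link2[1] in the original script
should make the later cleanlink[0] unnecessary.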


Thanks.


