Question about concatenation error

Steve Holden steve at holdenweb.com
Wed Sep 7 13:10:54 EDT 2005


colonel wrote:
> On Wed, 07 Sep 2005 16:34:25 GMT, colonel <thecamel at camelrichard.org>
> wrote:
> 
> 
>>I am new to python and I am confused as to why when I try to
>>concatenate 3 strings, it isn't working properly. 
>>
>>Here is the code:
>>
>>------------------------------------------------------------------------------------------
>>import string
>>import sys
>>import re
>>import urllib
>>
>>linkArray = []
>>srcArray = []
>>website = sys.argv[1]
>>
>>urllib.urlretrieve(website, 'getfile.txt')           
>>
>>filename = "getfile.txt"                            
>>input = open(filename, 'r')                          
>>reg1 = re.compile('href=".*"')                       
>>reg3 = re.compile('".*?"')                           
>>reg4 = re.compile('http')                            
>>Line = input.readline()                             
>>
>>while Line:                                          
>>   searchstring1 = reg1.search(Line)                
>>   if searchstring1:                                
>>       rawlink = searchstring1.group()              
>>       link = reg3.search(rawlink).group()          
>>       link2 = link.split('"')                      
>>       cleanlink = link2[1:2]                       
>>       fullink = reg4.search(str(cleanlink))
>>       if fullink:
>>           linkArray.append(cleanlink)                          
>>       else:
>>           cleanlink2 = str(website) + "/" + str(cleanlink)
>>           linkArray.append(cleanlink2)
>>   Line = input.readline()                                   
>>
>>print linkArray
>>-----------------------------------------------------------------------------------------------
>>
>>I get this:
>>
>>["http://www.slugnuts.com/['index.html']",
>>"http://www.slugnuts.com/['movies.html']",
>>"http://www.slugnuts.com/['ramblings.html']",
>>"http://www.slugnuts.com/['sluggies.html']",
>>"http://www.slugnuts.com/['movies.html']"]
>>
>>instead of this:
>>
>>["http://www.slugnuts.com/index.html",
>>"http://www.slugnuts.com/movies.html",
>>"http://www.slugnuts.com/ramblings.html",
>>"http://www.slugnuts.com/sluggies.html",
>>"http://www.slugnuts.com/movies.html"]
>>
>>The concatenation isn't working the way I expected it to.  I suspect
>>that I am screwing up by mixing types, but I can't see where...
>>
>>I would appreciate any advice or pointers.
>>
>>Thanks.
> 
> 
> 
> Okay.  It works if I change:
> 
>         fullink = reg4.search(str(cleanlink))
>         if fullink:
>             linkArray.append(cleanlink)                          
>         else:
>             cleanlink2 = str(website) + "/" + str(cleanlink)
> 
> to
> 
>         fullink = reg4.search(cleanlink[0])
>         if fullink:
>             linkArray.append(cleanlink[0])                          
>         else:
>             cleanlink2 = str(website) + "/" + cleanlink[0]
> 
> 
> so can anyone tell me why "cleanlink" gets converted to a list?  Is it
> during the slicing?
> 
> 
> Thanks.

The statement

     cleanlink = link2[1:2]

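results in a list of one element, because slicing a list always returns
a list, even when the slice covers a single item. Calling str() on that
list then gives you its printed form, brackets and quotes included. If
you want to access element one (the second in the list) then use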

     cleanlink = link2[1]
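
To see the difference concretely, here is a minimal sketch. The sample
string is made up to mirror what the script gets back from
reg3.search(rawlink).group(), and the names sliced, indexed, bad and good
are only for illustration:

```python
# Hypothetical sample value, standing in for the quoted link text
link = '"index.html"'
link2 = link.split('"')      # ['', 'index.html', '']

sliced = link2[1:2]          # a slice is always a list: ['index.html']
indexed = link2[1]           # an index gives the element: 'index.html'

website = "http://www.slugnuts.com"   # site taken from the original post

# str() of a list is its printed form, so the brackets end up in the URL
bad = website + "/" + str(sliced)
# a plain string concatenates cleanly
good = website + "/" + indexed

print(bad)
print(good)
```

The first line printed carries the ['...'] wrapper seen in the output
above; the second is the plain URL.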

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC             http://www.holdenweb.com/



