Replace and inserting strings within .txt files with the use of regex

MRAB python at mrabarnett.plus.com
Mon Aug 9 09:52:42 EDT 2010


Νίκος wrote:
> On 8 Aug, 17:59, Thomas Jollans <tho... at jollans.com> wrote:
> 
>> Two problems here:
>>
>> str.replace doesn't use regular expressions. You'll have to use the re
>> module to use regexps. (the re.sub function to be precise)
>>
>> '.'  matches a single character. Any character, but only one.
>> '.*' matches as many characters as possible. This is not what you want,
>> since it will match everything between the *first* <? and the *last* ?>.
>> You want non-greedy matching.
>>
>> '.*?' is the same thing, without the greed.
> 
> Thank you,
> 
> So I guess this needs to be written as:
> 
> src_data = re.sub( '<?(.*?)?>', '', src_data )
> 
In a regex '?' is a special character, so if you want a literal '?' you
need to escape it. Therefore:

     src_data = re.sub(r'<\?(.*?)\?>', '', src_data)
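
A quick sketch of the difference, in case it helps. The sample src_data
strings below are made up for illustration, and the re.DOTALL flag is my
assumption about what you'd want if a <? ... ?> block spans several lines
(by default '.' does not match a newline):

```python
import re

# Hypothetical sample input; the real src_data comes from your file.
src_data = "<html><? echo 1 ?><body>Hello<? echo 2 ?></body></html>"

# Greedy: '.*' runs from the first '<?' to the last '?>',
# so '<body>Hello' is swallowed along with both blocks.
greedy = re.sub(r'<\?(.*)\?>', '', src_data)

# Non-greedy: '.*?' stops at the nearest '?>', so each block goes separately.
# The r prefix keeps the backslashes intact for the regex engine.
lazy = re.sub(r'<\?(.*?)\?>', '', src_data)

# Unescaped: '<?' means "an optional '<'", so this strips far more
# than the <? ... ?> blocks.
unescaped = re.sub('<?(.*?)?>', '', src_data)

# Multi-line block (assumed case): '.' needs re.DOTALL to cross newlines.
multi = "a<?\n echo 1\n?>b"
single_line_only = re.sub(r'<\?(.*?)\?>', '', multi)
with_dotall = re.sub(r'<\?(.*?)\?>', '', multi, flags=re.DOTALL)

print(greedy)
print(lazy)
print(with_dotall)
```

Only the non-greedy, escaped version leaves the surrounding HTML alone.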

> The 'r' prefix doesn't need to be inserted before the regex here,
> since the regex doesn't contain backslashes.
> 
>> You will have to find the </body> tag before inserting the string.
>> str.find should help -- or you could use str.replace and replace the
>> </body> tag with your counter line, plus a new </body>.
> 
> Ah yes! Damn, why didn't I think of it... str.replace should do the
> trick. I was stuck trying to figure out regexes.
> 
> So, I guess that should work:
> 
>  src_data = src_data.replace('</body>',
>      '<br><br><h4><font color=green> Αριθμός Επισκεπτών: '
>      '%(counter)d </font></h4></body>')
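
For what it's worth, here is a minimal sketch of that replacement. The
page contents and the counter value are hypothetical, and I'm assuming
the %(counter)d placeholder is meant to be filled in at this point
rather than later in the script:

```python
# Hypothetical page and visitor count, for illustration only.
src_data = "<html><body><p>Hello</p></body></html>"
counter = 42

# The counter line, with a named placeholder for the number.
counter_line = ('<br><br><h4><font color=green> Αριθμός Επισκεπτών: '
                '%(counter)d </font></h4>')

# Fill in the placeholder, then put the line just before the closing tag.
filled = counter_line % {'counter': counter}
src_data = src_data.replace('</body>', filled + '</body>')

print(src_data)
```

If the placeholder is filled in somewhere else, drop the % step and
insert counter_line unchanged.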
> 
>> No it's not. You're just giving up too soon.
> 
> Yes, you are right; your hints keep me going, and thank you for that.



