[Tutor] problem in replacing regex

Kumar hihiren1 at gmail.com
Wed Apr 8 14:43:38 CEST 2009


Hello Denis/Kent,

Sorry if I created any confusion.

Here is my value:
myString = """ https://hello.com/accid/12345-12
/12345-12
http://sadfsdf.com/12345-12
http://sadfsdf.com/asdf/asdf/asdf/12345-12
12345-12 """

So the lines above together are the value of myString (a single string).

I couldn't use Denis' suggestion because I have to replace all occurrences
of 12345-12 in one go.

Here I am trying to convert 12345-12 into
<a href="http://mysite.com/12345-12">12345-12</a>
so that every occurrence of 12345-12 becomes a hyperlink.

Now I want to exclude existing hyperlinks that already contain 12345-12 (e.g.
http://sadfsdf.com/[^/] and http://sadfsdf.com/asdf/asdf/asdf/12345-12 and
the like).

I have applied something like

re.sub('([^/][0-9]{6}-[0-9]{2})', r'<a href="http://mysite.com/acid=\1">\1</a>;', text)

but I think, as Kent mentioned, I may be wrong about how
([^/][0-9]{6}-[0-9]{2}) works. I assumed that while replacing it would ignore
any 12345-12 that has '/' prefixed (i.e. /12345-12); this is just an example.
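
A small check of what the [^/] prefix actually does may help here. This is only a sketch: it assumes five digits before the dash (to fit the sample value 12345-12, whereas the expression above asks for six), and the names pattern, no_slash and text are made up for the illustration. The point is that [^/] must match one real character, which then becomes part of the match, so the leading space is captured and a number at the very start of the string is missed.

```python
import re

# Assumption: five digits before the dash, to match the sample value 12345-12.
pattern = re.compile(r'([^/][0-9]{5}-[0-9]{2})')

text = "12345-12 and 12345-12 and /12345-12"

# [^/] consumes one character, so the match includes the leading space,
# and the number at the start of the string (no preceding character) is skipped.
matches = pattern.findall(text)
print(matches)

# A negative look-behind only inspects the preceding character; it does not
# consume it and it also succeeds at the start of the string.
no_slash = re.compile(r'(?<!/)\b[0-9]{5}-[0-9]{2}')
clean_matches = no_slash.findall(text)
print(clean_matches)
```

With the look-behind form, the replacement text no longer has to deal with a stray leading character.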

I also tried ([^(http).*][0-9]{6}-[0-9]{2}), assuming it would ignore every
12345-12 whose word begins with http. I may be wrong in this assumption.
Please help: how can I exclude numbers that are part of a string starting
with http?
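
One possible way to do this in a single pass, sketched under assumptions: a single regex matches either a whole URL or a bare number, and a replacement function returns URLs unchanged while wrapping bare numbers in a link. The names TOKEN and linkify are hypothetical, the five-digit form is assumed from the sample value, and a "URL" is taken to be any run of non-whitespace characters starting with http:// or https://.

```python
import re

# Hypothetical pattern: group 1 is a whole URL (left alone),
# group 2 is a bare number such as 12345-12 (turned into a link).
TOKEN = re.compile(r'(https?://\S+)|\b([0-9]{5}-[0-9]{2})\b')

def linkify(text):
    """Link bare numbers, leaving numbers inside existing URLs untouched."""
    def repl(match):
        if match.group(1):
            # The URL alternative matched first: return it unchanged.
            return match.group(1)
        ident = match.group(2)
        return '<a href="http://mysite.com/%s">%s</a>' % (ident, ident)
    return TOKEN.sub(repl, text)

myString = (" https://hello.com/accid/12345-12\n"
            "/12345-12\n"
            "http://sadfsdf.com/12345-12\n"
            "http://sadfsdf.com/asdf/asdf/asdf/12345-12\n"
            "12345-12 ")
result = linkify(myString)
print(result)
```

Because the URL alternative comes first and swallows the whole URL, the number inside it is never seen by the second alternative. In this sketch the non-URL /12345-12 is still linked, which seems to be what is wanted.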

Regards,
Kumar

On Wed, Apr 8, 2009 at 4:04 PM, Kent Johnson <kent37 at tds.net> wrote:

> > Kumar <hihiren1 at gmail.com> wrote:
> >
> >> Hi Denis,
> >>
> >> Just to be more specific: I can add [^/] to my expression, which will
> >> successfully work for a URL (i.e. http://sdfs/123-34), but it will also
> >> work for a non-URL (i.e. /123-34), so I am just trying to make it
> >> specific to URLs only.
> >> I have already tried [^(http).*/] but that also failed, and it didn't
> >> work on /123-34
>
> The [^] don't do what you think they do.
>
> In a regex, [abc] means, match a single character that is either a, b,
> or c. [^abc] means match a single character that is not a, not b and
> not c. It does *not* mean to match a three-letter string that is not
> abc, or anything like that.
>
> I think you are trying to write a negative look-behind, something like
> (?<!http://) but it's hard to be sure without seeing your code.
>
> Did you try Denis' suggestion of a single regex to match both patterns?
>
> Kent
>
> PS Please use Reply All to reply to the list.
>
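
Kent's two points can be checked in the interpreter. The sketch below is illustrative only: the variable names are made up, and the five-digit number form is assumed from the sample value. It shows that a bracket expression always matches a single character, and that a look-behind in Python's re module is fixed-width, so (?<!http://) only looks at the seven characters immediately before the number.

```python
import re

# [^abc] matches exactly one character that is not a, b, or c.
single = re.findall(r'[^abc]', 'abcd')
print(single)

# [^(http).*] is still one character: anything except ( h t p ) . *
not_http = re.findall(r'[^(http).*]', 'http://x')
print(not_http)

# The look-behind rejects a number only when "http://" sits directly before it.
lookbehind = re.compile(r'(?<!http://)[0-9]{5}-[0-9]{2}')
direct = lookbehind.findall('http://12345-12')
deeper = lookbehind.findall('http://sadfsdf.com/12345-12')
print(direct, deeper)
```

So a look-behind alone would not exclude a number that appears further along in a URL path.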