Marking hyperlinks in a Text widget

Thu Jun 8 22:53:36 EDT 2000

* André Dahlqvist <andre at beta.telenordia.se> wrote:
| 
| [snip] So instead of doing this, which would be pretty ugly, I 
| would prefer to use regular expressions for this. How would I use
| regular expressions to only find the links themselves, and not the
| stuff just in front of it?

If I'm understanding you correctly, you can use a group to match
the entire link, instead of just the prefix.  For example:

import re

urlfinder = re.compile(r"("                    # begin group
                       r"(http://|ftp://)"     # match either prefix,
                       r"[^\s)]+"              # then one or more
                                               # non-whitespace chars,
                                               # excluding ')'
                       r")")                   # end group

urlfinder.search("blah (http://www.cornell.edu) blah").group(0)

 => 'http://www.cornell.edu'

| Anyway, here's the code that I am not proud of which finds the URLs:
| 
| for word in string.split(text):
|     if string.split(word, "://")[0] in ("http", "ftp"):
|         pos = helpwin.textwidget.search(word, 1.0, END)
|         # Add a tag from the start to the end of the URL
|         textwidget.tag_add('url', pos, pos + " + " + `len(word)` + " chars")

Have you tried this on something with more than one URL in it?  It looks
like it will only tag the first URL, because when you search for "http"
or "ftp", you start over from position 1.0 each time.

I think it'd be easier to find all the URLs in the raw text first, then
insert into the text widget, tagging as you go along.
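
Roughly what I have in mind, as a sketch rather than tested application
code. It is written for a current Python 3; split_links and
insert_with_links are names I made up, the pattern is the same idea as
urlfinder above, and the only widget call it assumes is
Text.insert(index, chars, *tags).

```python
import re

# Same idea as urlfinder: a prefix, then non-whitespace chars, excluding ')'
URL_RE = re.compile(r"(?:http://|ftp://)[^\s)]+")


def split_links(text):
    """Split text into (chunk, is_url) pairs, in document order."""
    pieces = []
    last = 0
    for match in URL_RE.finditer(text):
        if match.start() > last:
            # Plain text between the previous URL and this one
            pieces.append((text[last:match.start()], False))
        pieces.append((match.group(0), True))
        last = match.end()
    if last < len(text):
        # Trailing plain text after the final URL
        pieces.append((text[last:], False))
    return pieces


def insert_with_links(widget, text, tag="url"):
    """Append text to a Text-like widget, tagging each URL as it goes in."""
    for chunk, is_url in split_links(text):
        if is_url:
            widget.insert("end", chunk, tag)
        else:
            widget.insert("end", chunk)
```

Since every URL is tagged at the moment it is inserted, there is no
second search pass and no question of where the search starts.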

| And finally the code that has to find what URL we clicked on, and start
| the browser. I have not yet changed this to check for already running 
| browsers, but I will do that later. If there is an easier way of finding 
| these URLs I would love to hear it.
| 
| def browser(event):
|     import bisect
|     common_browsers = ['mozilla', 'netscape', 'lynx', 'w3m']
| 
|     # Find the ranges that have been tagged with the 'url' tag
|     ranges = event.widget.tag_ranges('url')
|     cursor_position = event.widget.index("@%d,%d" % (event.x, event.y))
|     # Find out in which range the cursor was clicked
|     slot = bisect.bisect(ranges, cursor_position)
|     # Get the url
|     hyperlink = event.widget.get(ranges[slot-1], ranges[slot])

This looks like it'd be the easiest way to do it, short of what I
suggested before (storing the URL with the tag itself).
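
For comparison, here is a rough sketch of the store-the-URL-with-the-tag
idea: each link gets its own tag, and a dict maps tag names back to
URLs, so the click handler needs no range arithmetic. LinkTable and the
"link-N" tag names are my own invention, and the handler shown in the
comment assumes Text.tag_names(index); treat it as an illustration only.

```python
class LinkTable:
    """Hand out one unique tag name per link and remember its URL."""

    def __init__(self):
        self._urls = {}

    def add(self, url):
        """Register url and return the tag name to apply to its text."""
        tag = "link-%d" % len(self._urls)
        self._urls[tag] = url
        return tag

    def url_for(self, tag_names):
        """Return the URL for the first known tag in tag_names, or None."""
        for tag in tag_names:
            if tag in self._urls:
                return self._urls[tag]
        return None


# Hypothetical use inside a click handler:
#   tags = event.widget.tag_names("@%d,%d" % (event.x, event.y))
#   hyperlink = links.url_for(tags)
```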

|     for browser in common_browsers:
|         if not os.system(browser + " " + hyperlink):
|             # Found browser, leave
|             break

You could probably search for browsers at the start of your application,
then cache the results, instead of looping through trying all possible
browsers each time.
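
One way the caching might look, as a sketch in current Python 3. Note
that it swaps in a PATH lookup via shutil.which instead of checking the
exit status of os.system as your loop does; find_browser and its `which`
parameter are hypothetical names of mine.

```python
import shutil

_cache = {}


def find_browser(candidates=("mozilla", "netscape", "lynx", "w3m"),
                 which=shutil.which):
    """Return the path of the first candidate found on PATH, or None.

    The result is cached per candidate list, so PATH is searched only once.
    """
    key = tuple(candidates)
    if key not in _cache:
        found = None
        for name in key:
            path = which(name)
            if path:
                found = path
                break
        _cache[key] = found
    return _cache[key]
```

Call find_browser() once at startup (or lazily on the first click); every
later click reuses the cached path.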

-- 
cliff crawford    -><-    http://www.people.cornell.edu/pages/cjc26/
                          Synaesthesia now!            icq 68165166