Using Beautiful Soup to entangle bookmarks.html

Adam Jones ajones1 at gmail.com
Thu Sep 7 17:30:25 EDT 2006


Francach wrote:
> Hi,
>
> I'm trying to use the Beautiful Soup package to parse through the
> "bookmarks.html" file which Firefox exports all your bookmarks into.
> I've been struggling with the documentation trying to figure out how to
> extract all the urls. Has anybody got a couple of longer examples using
> Beautiful Soup I could play around with?
>
> Thanks,
> Martin.

If the only thing you want out of the document is the URLs, why not
search for href="..."? A regular expression that matches that is
pretty easy to write. I think this should just about get you there,
but my regular expressions have gotten very rusty.

/href="[^"]+"/i
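
For what it's worth, here is a rough Python sketch of the same idea
using the standard re module. It assumes the export writes each link as
<A HREF="..."> with uppercase attribute names (hence the
case-insensitive flag), and it stops at the closing quote instead of
using .+, which would run on to the last quote on the line. The name
extract_urls and the sample text are made up for illustration.

```python
import re

# Assumption: links appear as <A HREF="...">; match case-insensitively
# and capture everything up to the next double quote.
HREF_RE = re.compile(r'href="([^"]+)"', re.IGNORECASE)


def extract_urls(html):
    """Return every href value found in the given HTML text, in order."""
    return HREF_RE.findall(html)


# Hypothetical sample in the style of an exported bookmarks file.
sample = (
    '<DT><A HREF="http://www.python.org/" ADD_DATE="1157664625">Python</A>\n'
    '<DT><A HREF="http://www.crummy.com/software/BeautifulSoup/">Soup</A>\n'
)

for url in extract_urls(sample):
    print(url)
```

To run it on the real file, something like
extract_urls(open("bookmarks.html").read()) should do. A regex like this
will also pick up anything that merely looks like an href attribute, so
a real parser is the safer choice if the file gets messy.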
