how to strip the domain name in python?

Marko.Cain.23 at gmail.com Marko.Cain.23 at gmail.com
Sat Apr 14 11:36:17 EDT 2007


On Apr 14, 12:02 am, Michael Bentley <mich... at jedimindworks.com>
wrote:
> On Apr 13, 2007, at 11:49 PM, Marko.Cain... at gmail.com wrote:
>
>
>
> > Hi,
>
> > I have a list of url names like this, and I am trying to strip out the
> > domain name using the following code:
>
> >http://www.cnn.com
> >www.yahoo.com
> >http://www.ebay.co.uk
>
> > pattern = re.compile("http:\\\\(.*)\.(.*)", re.S)
> > match = re.findall(pattern, line)
>
> > if (match):
> >         s1, s2 = match[0]
>
> >         print s2
>
> > but none of the sites matched. Can you please tell me what I am
> > missing?
>
> change re.compile("http:\\\\(.*)\.(.*)", re.S) to
> re.compile("http:\/\/(.*)\.(.*)", re.S)

Thanks, I tried this:

    pattern = re.compile("http:\/\/(.*)\.(.*)", re.S)
    match = re.findall(pattern, line)

    if (match):
        s1, s2 = match[0]
        print s2

But when 'line' is http://www.cnn.com, 's2' is 'com'. I want 'cnn.com'
(everything after the first '.'). How can I do that?
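
A minimal sketch of one way to keep everything after the first '.'.
The greedy (.*) in front of \. runs to the last dot in the line, which
is why 's2' comes out as 'com'. Matching the first label with [^.]*
stops at the first dot instead. The function name after_first_dot and
the optional http:// prefix are assumptions made for illustration, not
part of the original code, and print is written with parentheses so the
sketch runs on current Python.

```python
import re

# Hypothetical pattern: an optional "http://", then the first label
# (any run of characters that are not dots), a literal dot, and the rest.
PATTERN = re.compile(r"^(?:http://)?([^.]*)\.(.*)$")


def after_first_dot(line):
    """Return everything after the first '.', or None if nothing matches."""
    match = PATTERN.match(line.strip())  # strip() drops a trailing newline
    if match is None:
        return None
    return match.group(2)


for line in ["http://www.cnn.com", "www.yahoo.com", "http://www.ebay.co.uk"]:
    print(after_first_dot(line))
```

This only handles bare host names like the three listed. A URL with a
path (for example a trailing /index.html) would keep the path in the
result.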



