Extracting real-domain-name (without sub-domains) from a given URL

Terry Reedy tjreedy at udel.edu
Tue Jan 13 19:11:44 EST 2009


S.Selvam Siva wrote:

>> I doubt anyone's created a general ready-made solution for this, you'd
>> have to code it yourself.
>> To handle the common case, you can cheat and just .split() at the
>> periods and then slice and rejoin the list of domain parts, ex:
>> '.'.join(domain.split('.')[-2:])
>>
>> Cheers,
>> Chris
> 
> 
> Thank you, Chris Rebert.
>   Actually, I tried domain-specific logic, using 200 TLDs such as
> .com, .co.in and .co.uk, and tried to extract the domain name.
>   But my boss wants a more reliable solution than this method. Anyway, I
> will try to find some alternative solution.

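I would make a dict mapping each TLD to the number of trailing parts to
strip off:

parts = {
    'com': 1,   # example.com   -> strip 'com'
    'in': 2,    # example.co.in -> strip 'co.in'
    'org': 1,
    'uk': 2,    # example.co.uk -> strip 'co.uk'
    # ... and so on for the other TLDs
}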
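
A minimal sketch of how such a table might be used, assuming the hostname has already been pulled out of the URL. The names SUFFIX_PARTS and registered_domain are made up for illustration, and the table only holds the four entries listed above; a real one would need an entry for every TLD you care about.

```python
# Hypothetical table: TLD -> number of trailing labels to strip off.
# Only the four example entries from the post; extend as needed.
SUFFIX_PARTS = {
    "com": 1,
    "in": 2,
    "org": 1,
    "uk": 2,
}


def registered_domain(hostname):
    """Return the suffix plus the one label in front of it."""
    labels = hostname.lower().rstrip(".").split(".")
    # Raises KeyError for a TLD that is not in the table.
    n = SUFFIX_PARTS[labels[-1]]
    # Keep the n suffix labels and the single label preceding them.
    return ".".join(labels[-(n + 1):])


print(registered_domain("www.example.co.uk"))
print(registered_domain("mail.google.com"))
```

Note that a fixed count of 2 for 'uk' or 'in' assumes every name under that TLD has a two-part suffix, which is exactly the kind of case a special function would have to cover.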

If certain TLDs need special handling, define a function first, map that
TLD to the function, and then switch on the type of the value (int or
function) when you look it up.
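
A rough sketch of that int-or-function lookup, under some assumptions of mine: uk_rule, RULES and extract_domain are hypothetical names, and the set of second-level names inside uk_rule is only an illustrative guess, not a complete list.

```python
def uk_rule(labels):
    """Hypothetical special case: return how many labels to strip for .uk."""
    # Assumed (incomplete) set of second-level names under .uk.
    second_level = {"co", "org", "ac", "gov"}
    if len(labels) >= 2 and labels[-2] in second_level:
        return 2
    return 1


# A value is either an int (labels to strip) or a function that computes it.
RULES = {
    "com": 1,
    "org": 1,
    "in": 2,
    "uk": uk_rule,
}


def extract_domain(hostname):
    labels = hostname.lower().rstrip(".").split(".")
    rule = RULES[labels[-1]]  # KeyError for TLDs missing from the table
    # Switch on the type of the value: call it if it is a function.
    n = rule(labels) if callable(rule) else rule
    return ".".join(labels[-(n + 1):])


print(extract_domain("www.bbc.co.uk"))
print(extract_domain("docs.python.org"))
```

Keeping the plain ints for the common TLDs means only the awkward ones need any code at all.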



