Help with Regex for domain names

Tim Daneliuk tundra at tundraware.com
Thu Jul 30 11:47:06 EDT 2009


Feyo wrote:
> I'm trying to figure out how to efficiently write a regex for
> domain names with a particular top level domain. Let's say, I want to
> grab all domain names with country codes .us, .au, and .de.
> 
> I could create three different regexes that would work:
> regex = re.compile(r'[\w\-\.]+\.us')
> regex = re.compile(r'[\w\-\.]+\.au')
> regex = re.compile(r'[\w\-\.]+\.de')
> 
> How would I write one to accommodate all three, or, better yet, to
> accommodate a list of them that I can pass into a method call? Thanks!
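
On the question itself, one possible approach is to join the TLDs into
a single alternation group. The following is only a sketch: the helper
name compile_tld_regex and the sample text are made up for
illustration, and the character class is carried over from the
patterns above rather than being a complete host name grammar.

```python
import re

def compile_tld_regex(tlds):
    """Build one pattern matching host names that end in any of the given TLDs."""
    # Escape each entry so that a '.' in something like 'co.uk' stays literal.
    alternation = "|".join(re.escape(tld) for tld in tlds)
    # The (?![\w-]) lookahead keeps '.us' from matching inside '.used'.
    return re.compile(r"[\w.-]+\.(?:" + alternation + r")(?![\w-])")

regex = compile_tld_regex(["us", "au", "de"])
print(regex.findall("see example.us, www.site.au and foo.com"))
# prints ['example.us', 'www.site.au']
```

Because the group is non-capturing, findall() returns the whole
matched name. Note that this sketch would still pick "example.us" out
of "example.us.com", so anchor the pattern if that matters for your
input.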

Just a point of interest:  A correctly formed (fully qualified) domain
name may end with a period after the TLD [1].  Example:

       foo.bar.com.

Though you do not often see this, it's worth accommodating "just in
case"...
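
As a rough sketch of what accommodating it could look like (the names
fqdn_regex and normalize are invented here, and the three TLDs are the
ones from the question), an optional "\.?" after the TLD accepts both
spellings, and stripping the dot afterwards lets them compare equal:

```python
import re

# The \.? after the TLD optionally consumes the trailing root dot.
fqdn_regex = re.compile(r"[\w.-]+\.(?:us|au|de)\.?(?![\w-])")

def normalize(name):
    """Lowercase the name and drop one trailing root dot, if present."""
    name = name.lower()
    return name[:-1] if name.endswith(".") else name

for candidate in ["foo.bar.de", "foo.bar.de.", "foo.bar.com."]:
    match = fqdn_regex.fullmatch(candidate)
    print(candidate, "->", normalize(match.group(0)) if match else None)
```

One caveat: when searching running prose instead of validating a
single string, the optional dot will also swallow a sentence-ending
period, which is harmless if you normalize the result as above.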


[1] http://homepages.tesco.net/J.deBoynePollard/FGA/web-fully-qualified-domain-name.html



-- 
----------------------------------------------------------------------------
Tim Daneliuk     tundra at tundraware.com
PGP Key:         http://www.tundraware.com/PGP/


