Which method to check if string index is equal to character.

Avi Gross avigross at verizon.net
Mon Dec 28 20:37:42 EST 2020


Thanks, Chris,

I am not up to date on such messaging issues, but I am not shocked at
what you wrote. Years ago, I recall, most messages going out of my workplace
looked like machine!machine2!ihnp4!more!evenmore!user with no @ in sight. And,
as you mention, you may want to send to a domain and have it forward to a
subdomain, so multiple @ signs may make sense. I note that some places, like
groups.io, disguise the @ in your original email address: you can still see
who a message is from, even though it is in some sense from them, but to
actually use the address in your own mailer you need to substitute the @
back in.

I think we all agree that, unless there is further standardization, an email
address that is perfectly usable in some context can easily be rejected, and
one in proper format (by some definition) can still fail in that context.
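
As a rough illustration of how permissive a structural check has to be to
respect the points Chris raised (an @ may appear in the first part, and the
domain need not contain a dot), here is a minimal sketch. The function name
looks_routable is made up for this example, and the check says nothing about
whether mail to the address would actually be delivered.

```python
def looks_routable(address: str) -> bool:
    """Loose structural check only (hypothetical helper, not a validator).

    Splits on the LAST @ so that an @ inside the local part is tolerated,
    and requires something non-empty on both sides. It deliberately does
    not require a dot in the domain.
    """
    local, sep, domain = address.rpartition("@")
    return bool(sep and local and domain)


if __name__ == "__main__":
    for candidate in ("user@example.com", "a@b@host", "localuser", "@host", "user@"):
        print(candidate, looks_routable(candidate))
```

Anything stricter than this risks rejecting an address that is usable
somewhere, which is the trade-off described above.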

The original question actually focused more narrowly on a good way to find
out whether a character exists in a string, for which regular expressions are
not needed. Most email addresses are short enough that techniques to speed up
the search are unlikely to help, unless all the program does is search
millions of email addresses for that character.
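
For that narrow question, a minimal sketch using only built-in string
operations might look like the following. The function names are invented for
illustration; the `in` operator and `str.find` are standard Python.

```python
def has_at_sign(address: str) -> bool:
    # The `in` operator does a plain substring search; no regex needed.
    return "@" in address


def at_sign_position(address: str) -> int:
    # str.find returns the index of the first match, or -1 if there is none.
    return address.find("@")


if __name__ == "__main__":
    print(has_at_sign("user@example.com"))
    print(has_at_sign("localuser"))
    print(at_sign_position("user@example.com"))
```

To test one specific position rather than search the whole string, comparing
address[i] == "@" works as long as i is a valid index.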

Dropping out, ...

-----Original Message-----
From: Python-list <python-list-bounces+avigross=verizon.net at python.org> On
Behalf Of Chris Angelico
Sent: Monday, December 28, 2020 8:02 PM
To: Python <python-list at python.org>
Subject: Re: Which method to check if string index is equal to character.

On Tue, Dec 29, 2020 at 10:08 AM Avi Gross via Python-list
<python-list at python.org> wrote:
>
> This may be a nit, but can we agree all valid email addresses as used 
> today have more than an @ symbol?
>
> I see it as requiring at least one character before the @ that comes 
> from a list of allowed characters (perhaps not ASCII) but does not 
> include the symbol @ again. It is normally followed by some minimal 
> number of characters and maybe a period and one of the currently 
> valid domains like .com or .it but the latter gets tricky as it can 
> look like user at abd.def.att.com or other long variations where only the 
> final component must be testable in the program.

There can be an @ in the first part of the address, and the domain may well
not have a dot.

> The lack of an at-sign suggests it is not an email address. The lack 
> of anything before or after also seems to disqualify it. You may be 
> able to add more conditions but as noted, having more than one at-sign 
> may also disqualify it.

Lack of an at sign means it's a local address that can't be routed over the
internet, and in many contexts, it's reasonable to exclude those. But two
isn't illegal.

> I am sure someone has some complex regular expressions that they think 
> matches only potentially valid strings but, of course, as noted by 
> Chris, to really validate that an address works might require sending 
> something and validating a human replied and that can be quite a task.
>

Yes, many such regexes exist, and they are *all wrong*. Without exception. I
don't think it's actually possible for a regex to perfectly match all
(syntactically) valid email addresses and nothing else.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list