[Tutor] about regular expression
Alan Trautman
ATrautman@perryjudds.com
Wed Mar 26 10:39:02 2003
Hate to argue but if you are parsing MLA formatted documents there is to be
only one space after all periods. Is there a formal style guide for the text
you are accepting? Then Sean's idea would work well. It also matters which
generation of style guide is used. MLA used 2 spaces after sentences until
1998. I think the AP (journalism) may use one space too.
Of course people following those standards who are not in formal
academic/professional life are rare as unicorns as well.
Hate to make it harder but English is hard. Maybe a non-capitalized word
followed by a period, then by space(s), and then a capital letter? I'm sure I
might be missing something (you would need a way to catch questions such as
"What time is lunch" said Mary?) but maybe it will find more words for you.
HTH,
Alan
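
A rough sketch of the heuristic described above (a non-capitalized word, a
period, one or more spaces, then a capital letter), using Python's re module.
The pattern, the function name find_sentence_ends and the sample text are
illustrative assumptions only, not a tested solution. It will miss sentences
ending in "?" or "!" and the last sentence of a text, since nothing follows it.

```python
import re

# Assumed pattern: a word starting with a lowercase letter, a period,
# one or more whitespace characters, then a capital letter. The capital
# letter is only checked by the lookahead, not consumed.
SENTENCE_END = re.compile(r"\b([a-z]\w*)\.\s+(?=[A-Z])")

def find_sentence_ends(text):
    """Return the words that the heuristic treats as ending a sentence."""
    return SENTENCE_END.findall(text)

sample = "He got married last week. Al. Brown was there. It rained."
print(find_sentence_ends(sample))  # "Al." is skipped because it is capitalized
```

Because "Al" starts with a capital letter, the period after it is not reported
as a full stop. A capitalized word that really does end a sentence (a name, for
instance) would be missed the same way.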
-----Original Message-----
From: Steegness [mailto:steegness@hotmail.com]
Sent: Wednesday, March 26, 2003 8:25 AM
To: tutor@python.org
Subject: Re: [Tutor] about regular expression
In standard well-formed English, a full stop period is usually followed by
two spaces, whereas abbreviations are only followed by one. Perhaps this
will serve the purpose for the regex.
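
A minimal sketch of that idea in Python, assuming the input really does follow
the two-space convention. The pattern names, the function classify_periods and
the sample text are made up for illustration.

```python
import re

# Assumption: sentences end with a period plus two (or more) spaces, or the
# period is the last thing in the text; abbreviations end with a period
# followed by exactly one space.
FULL_STOP = re.compile(r"(\w+)\.(?= {2,}|\s*$)")
ABBREVIATION = re.compile(r"(\w+)\. (?! )")

def classify_periods(text):
    """Return (words before full stops, words before abbreviation periods)."""
    return FULL_STOP.findall(text), ABBREVIATION.findall(text)

sample = "He got married last week.  Al. Brown was there."
full_stops, abbreviations = classify_periods(sample)
print(full_stops)
print(abbreviations)
```

Text typed with a single space after every period would have each
mid-text period classified as an abbreviation, so this only helps when the
spacing convention is known.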
Sean
> Date: Wed, 26 Mar 2003 03:07:06 -0800 (PST)
> From: Abdirizak abdi <a_abdi406@yahoo.com>
> To: tutor@python.org
> Subject: [Tutor] about regular expression
>
> Hi everyone
>
> Can anyone suggest a regular expression that can distinguish between a
> full stop (e.g. he got married last week.) and an abbreviated word (e.g. Al.
> Brown)? I have tried some but I couldn't get them right. Please give an
> example of each of these two.
>
> thanks in advance
_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor