[Tutor] Learning Regular Expressions

Terry--gmail terry.kemmerer at gmail.com
Mon May 23 18:08:45 EDT 2016


Running Linux Mint
I am following the Sentdex video tutorials on YouTube.
He is working in Python 3.4 and I am running Python 3.4.3.

He's demonstrating some Regular Expressions which I wanted to test out. 
On these test scripts, for future reference, I have been putting my 
notes in triple quotes and naming the scripts descriptively so I can 
find them again when I need to review. However, this time, when I 
copied in a simple script below my RE notes and ran it from IDLE (and 
from the console), I got the following error:

SyntaxError:  EOF while scanning triple-quoted string literal

Now, there was also a triple-quoted string I had assigned to a variable in 
my script, so I thought the problem was in the active part of the script! But 
eventually, through the process of elimination, I discovered the 
script worked great without the notes! I'd like to know what it is in 
the triple-quoted section below that is causing me this problem, if 
anyone recognizes it. In IDLE's script file, _it's all colored green_, 
which I thought meant Python was going to ignore everything between the 
triple quotes! But if I run just the below portion of the script in 
its own file, I get the same "EOF while scanning triple-quoted string 
literal" error.

#!/usr/bin/env python3

'''
Regular Expressions - or at least some

Identifiers:

\d  any digit
\D  anything but a digit
\s  whitespace (space, tab, newline)
\S  anything but whitespace
\w  any word character (letter, digit, or underscore)
\W  anything but a word character
.   any character except a newline (use \. for a literal period)
a   search for just the letter 'a'
\b  a word boundary (the edge between a word character and a non-word 
character)

Modifiers
{x}    we are expecting "x" number of something
{1,3}  we're expecting 1-3 in length of something, so for digits we 
write  \d{1,3}
+  means Match 1 or more
?  means Match 0 or 1
*   Match 0 or more
$  Match the end of a string
^  Match the beginning of a string
|   Match either or   - so you might write  \d{1,3}|\w{5,6}
[ ]  a range or "variance": matches one character from the set, such as 
             [A-Z] (any one Cap letter) or [A-Za-z] (any one letter)
             or [1-5a-qA-Z] (one digit 1-5, one lower case a-q, or one 
             Cap letter). For a Cap 1st letter followed by lower case, 
             write [A-Z][a-z]+

White Space Characters  (may not be seen):
\n  new line
\s  space
\t   tab
\e  escape
\f  form feed
\r  return

DON'T FORGET!:
.  +  *  ?  [  ]  $  ^  (  )  {  }  |  \   if you really want to use 
these, you must escape them '\'

'''
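
A few of the identifiers and modifiers in the notes above can be tried out with a minimal sketch like the one below. The sample sentence and the variable names are made up for illustration. It assumes the comma form {1,3} for a range, which is what Python's re module expects, and it writes each pattern as a raw string (r"...") so that Python passes the backslashes through to the regex engine untouched.

```python
import re

# Hypothetical sample text, made up for this test.
sample = "Jessica is 15 years old, and Daniel is 27 years old."

# \d{1,3} matches runs of one to three digits.
ages = re.findall(r"\d{1,3}", sample)

# [A-Z][a-z]+ matches a capital letter followed by lower case letters.
names = re.findall(r"[A-Z][a-z]+", sample)

# \. matches a literal period; a bare . would match any character.
periods = re.findall(r"\.", sample)

print(ages)
print(names)
print(periods)
```

re.findall returns every non-overlapping match as a list, which makes it handy for checking what each pattern actually picks up.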

Thanks for your thoughts!
--Terry

