[Tutor] Regex help

Bill Burns billburns at pennswoods.net
Mon Oct 10 06:10:58 CEST 2005


I'm looking to get the page size (width, length) of a PDF file. Every PDF
file has a 'tag' (in the file) that looks similar to this:

Example #1
MediaBox [0 0 612 792]

or this

Example #2
MediaBox [ 0 0 612 792 ]

I figured a regex might be a good way to get this data, but the
whitespace (or lack of it) after the left bracket has me stumped.

If I do this

pattern = re.compile(r'MediaBox \[\d+ \d+ \d+ \d+')

I can find the MediaBox in Example #1 but I have to do this

pattern = re.compile(r'MediaBox \[ \d+ \d+ \d+ \d+')

to find it for Example #2.

How can I make *one* regex that will match both cases?

Thanks for the help,

Bill
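
One possible way to cover both cases, offered as a sketch rather than a definitive answer: make the whitespace optional with \s* after the left bracket and before the right bracket, and use \s+ between the numbers. The name MEDIABOX and the helper mediabox_size are hypothetical, chosen only for this illustration. The pattern assumes integer values, as in the two examples; real PDF files may use decimal numbers or other layouts that this does not handle.

```python
import re

# \s* allows zero or more whitespace characters after "[" and before "]";
# \s+ requires at least one between the four numbers.
# Assumption: all four values are non-negative integers, as in the examples.
MEDIABOX = re.compile(r'MediaBox\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]')

def mediabox_size(text):
    """Return (width, length) from the first MediaBox found, or None."""
    match = MEDIABOX.search(text)
    if match is None:
        return None
    x0, y0, x1, y1 = map(int, match.groups())
    # The box is given as two corners, so subtract to get the dimensions.
    return (x1 - x0, y1 - y0)

for sample in ('MediaBox [0 0 612 792]', 'MediaBox [ 0 0 612 792 ]'):
    print(mediabox_size(sample))  # (612, 792) for both examples
```

Capturing the four numbers as groups avoids a second parsing step, and subtracting the corners keeps the result correct if the lower-left corner is not at 0 0.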


