Noob script needs some input: CVS PatchMaker
John Machin
sjmachin at lexicon.net
Fri Jun 16 20:13:21 EDT 2006
On 16/06/2006 7:28 PM, Holger wrote:
> Well, that was an excellent opportunity to get some python practice, so
> below is my first shot at the problem.
>
> Any feedback on what would be "the pythonic way" to do this would be
> much appreciated!
>
> #!/usr/bin/env python
> # Copyright 2006 Holger Lindeberg Bille
>
> import sys, re, os
> import popen2
>
> workingfile = re.compile("^Working file: *(.*)$")
> revision = re.compile("^revision *(.*)$")
> fileend =
> re.compile("^===========================================================================")
> details = re.compile("^date: *")
> entryend = re.compile("^----------------------------")
> branches = re.compile("^branches:( *(.*);)*")
>
> class LogEntry:
> def __init__(self):
> self.rev = 0
> self.prevrev = 0
> self.text = []
>
> def setName(self, name):
> self.name = name
>
> def read(self, file):
> done = 0
> for line in file:
> regx = details.search(line)
> if regx:
> pass
> else:
> if entryend.search(line):
> break
> else:
> if fileend.search(line):
> done = 1
> break
> else:
> self.text.append(line.strip())
> return done
IMHO that flight of geese heading equatorwards for winter is not Xic for
any language X. Compare with:
| def read(self, file):
| done = 0
| for line in file:
| regx = details.search(line)
| if regx:
| pass
| elif entryend.search(line):
| break
| elif fileend.search(line):
| done = 1
| break
| else:
| self.text.append(line.strip())
| return done
2nd comment: Make a habit of NOT using the names of built-ins like
"file" for your own names. Pretend they are reserved words. Doesn't
matter in this case, but will save you grief some day soon.
3rd comment: Read the section in the re manual that explains the
difference between search and match. Searching for "^foo" will give the
same results as using match() with "foo" or the redundantly anchored
"^foo". However some regex engines when presented with
re.search("^foo", "x" * 10000)
will note that there is no joy at offset 0, there is no point (given the
anchor "^") of looking at offset 1, and return almost immediately.
Others (cough, cough) will check at offset 1, 2, ...
Ponder these results:
python -mtimeit -s"import re;rx=re.compile('^foo');txt='x'*10000"
"rx.match(txt)"
100000 loops, best of 3: 1.2 usec per loop
python -mtimeit -s"import re;rx=re.compile('foo');txt='x'*10000" "
rx.search(txt)"
10000 loops, best of 3: 19.8 usec per loop
python -mtimeit -s"import re;rx=re.compile('^foo');txt='x'*10000"
"rx.search(txt)"
1000 loops, best of 3: 201 usec per loop
4th comment: what you have called "regx" is a match object. "mobj" might
be a better choice. The term "regex" is applied to a pattern, or
sometimes to the compiled re object.
> def GuessPrevRev(self):
> pass
>
> def filter(self, filter):
Ugh. THREE filters: the built-in, the argument, and the method.
In any case, this method doesn't perform a filtering operation, and the
arg is not a filter, it's an re pattern. Suggestion:
def anyLinesMatch(self, pattern):
> found = 0
> for line in self.text:
> if filter.search(line):
> found = 1
> break
> return found
[snip]
> class FileLog:
> def __init__(self):
> self.revs = []
>
[snip]
>
> def filter(self, filter):
> found = 0
> newrevs = []
> for rev in self.revs:
> if rev.filter(filter):
Waahhh! The filter count has now hit 4.
> found = 1
> newrevs.append(rev)
> self.revs = newrevs
> return found
>
[snip]
>
> class LogDB:
> def __init__(self):
> self.flogs = []
[snip]
> def filter(self, filter):
> newflogs = []
> for flog in self.flogs:
> if flog.filter(filter):
See above.
> newflogs.append(flog)
> self.flogs = newflogs
>
[snip]
HTH,
John
More information about the Python-list
mailing list