[Chicago] Python Development in Chicago
Kevin L. Stern
kevin.l.stern at gmail.com
Fri Oct 26 17:54:50 CEST 2007
Kumar,
Thank you for the reply. I'll keep the __main__ thing in mind. I had
actually read about this at one point but skipped it in my code since I was
hacking it together quickly and only using it in one place. It certainly is
a good practice, though.
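For the archive, here is a minimal sketch of how I understand the guard would look in a script like mine. The names convert and main are placeholders made up for illustration, and the inline sample data stands in for the real input file; none of this is taken from the original script.

```python
import sys


def convert(lines):
    """Hypothetical stand-in for the real conversion: count edge lines."""
    return sum(1 for line in lines if "--" in line)


def main():
    # Hypothetical entry point; the real script would open its input
    # and output files here instead of using inline sample data.
    sample = ["graph g {", "n1 -- n2;", "n2 -- n3;", "}"]
    sys.stdout.write("edges: %d\n" % convert(sample))


# True only when the file is run as a script, so importing the module
# (from pydoc, a test runner, or other code) does not trigger main().
if __name__ == "__main__":
    main()
```

With this layout, another module can import convert and call it directly without the file handling in main ever running.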
To answer your last question, I am not new to Chicago and am somewhat new to
Python (but not new to software).
Kevin
On 10/26/07, Kumar McMillan <kumar.mcmillan at gmail.com> wrote:
>
> On 10/26/07, Kevin L. Stern <kevin.l.stern at gmail.com> wrote:
> > I'm doing some research in graph theory and I have a set of graphs (known as
> > the AT&T set) that is in a seemingly proprietary format - I want to swap it
> > into the graphml XML format. I hacked together a little Python script for
> > this. Would you folks say that this is 'pythonic', or does this look
> > newbie'ish?
>
> there is something in python that is somewhat wart-like and easy to
> miss in the docs (yet criticized often). Whenever you want to run
> some code in a module, say, if you want to treat that module like a
> command line script (like you are), you should really always put that
> code within this if statement:
>
> if __name__ == '__main__':
>     # code goes here that you want to run from the command line
>
> why? There is no good reason for this so my first inclination is to
> answer like this: Just do it!
>
> For the non-conformists out there, here's really why. Python assigns
> the magical value '__main__' to <module>.__name__ when it is the
> top-level script, or as the lib ref at
> http://docs.python.org/lib/module-main.html says :
>
> "This module represents the (otherwise anonymous) scope in which the
> interpreter's main program executes -- commands read either from
> standard input, from a script file, or from an interactive prompt. It
> is this environment in which the idiomatic ``conditional script''
> stanza causes a script to run: " [see above]
>
> So, without that if statement your code will obviously still run but
> the code will also run *whenever* the module is imported. Might not
> be a big deal today, but tomorrow when you refactor your code or run
> anything that imports modules at will (pydoc, pudge, nosetests, the
> list goes on) then you run the dangerous risk of executing your main
> code at unexpected times.
>
> I will refrain from linking to the python list complaints about how
> unintuitive and obscure this is but rest assured it has been discussed
> many times with many alternatives offered, yet no resolution.
>
> K
>
> PS. are you new to the Chicago area or just new to Python or both?
>
> >
> > ____________________________________________________________________
> >
> > import re, mmap, os
> >
> > class token:
> >     def __init__(self):
> >         self.type = None
> >         self.data = None
> >
> > class tokenizer:
> >     def __init__(self, inmap, out):
> >         self.inmap = inmap
> >         self.out = out
> >
> >     def nextToken(self):
> >         line = inmap.readline()
> >         if re.search("^graph\s.*\s{$", line):
> >             ident = line[6:len(line)-3]
> >             result = token()
> >             result.type = 'graph'
> >             result.data = ident
> >             return result
> >         elif re.search("^\s*subgraph.*{$", line):
> >             parse = re.search("subgraph\s.*\s{$", line).group()
> >             ident = line[10:len(line)-3]
> >             result = token()
> >             result.type = 'subgraph'
> >             result.data = ident
> >             return result
> >         elif re.search("^\s*}$", line):
> >             result = token()
> >             result.type = 'endgroup'
> >             return result
> >         elif re.search("^\s*n\d+\s--\sn\d+;$", line):
> >             parse = re.search("n\d+\s--\sn\d+", line).group()
> >             split = parse.partition('--')
> >             first = re.search("\d+", split[0]).group()
> >             last = re.search("\d+", split[2]).group()
> >             result = token()
> >             result.type = 'edge'
> >             result.data = [first,last]
> >             return result
> >         return None
> >
> >     def processToken(self, t):
> >         if not t:
> >             return
> >         if t.type == 'graph':
> >             self.out.write('<graph id="%s">\n' % t.data)
> >         elif t.type == 'subgraph':
> >             self.out.write('<graph id="%s">\n' % t.data)
> >             self.sg += 1
> >         elif t.type == 'endgroup':
> >             self.out.write('</graph>\n')
> >             if self.sg > 0:
> >                 self.sg -= 1
> >         elif t.type == 'edge':
> >             self.out.write('<edge source="%s" target="%s"/>\n' %
> >                            (t.data[0], t.data[1]))
> >
> >     def go(self):
> >         self.sg = 0
> >         self.out.write("""<?xml version="1.0" encoding="UTF-8"?>
> > <graphml>
> > """)
> >         while self.inmap.tell() < self.inmap.size():
> >             lex.processToken(lex.nextToken())
> >
> >         self.out.write("</graphml>")
> >
> > try:
> >     infile = "ug.txt"
> >     insize = os.path.getsize(infile)
> >     fd = open(infile, "r+")
> >     inmap = mmap.mmap(fd.fileno(), insize, None, mmap.ACCESS_READ)
> >     outfile = "out.txt"
> >     out = open(outfile, "r+")
> >     lex = tokenizer(inmap, out)
> >     lex.go()
> > except IOError:
> >     print "IO Error Occurred"
> > finally:
> >     inmap.close()
> >     out.close()
> >
> >
> > _______________________________________________
> > Chicago mailing list
> > Chicago at python.org
> > http://mail.python.org/mailman/listinfo/chicago
> >
> >