[Tutor] MemoryError

Liam Clarke cyresse at gmail.com
Fri Dec 10 13:07:06 CET 2004


Hi Kent, 

Thanks for the help, it worked third time around!

The final product is here if you have an interest  - 
http://www.rafb.net/paste/results/XCYthC70.html

But, I think I found a new best friend for this sort of thing - 
(?P<text>.*?)

Being able to label stuff is brilliant.
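
For anyone reading along, here is a minimal sketch of what the labelling
buys you. The pattern and sample string are made up for illustration and
are not the ones from my script:

```python
import re

# Hypothetical pattern and sample text, for illustration only.
link_re = re.compile(r'<a href="(?P<url>.*?)">(?P<text>.*?)</a>')

sample = '<a href="http://www.python.org">Python</a>'
match = link_re.search(sample)

# Named groups can be read by label instead of by position.
url = match.group('url')
text = match.group('text')

# groupdict() returns every named group in one dictionary.
labelled = match.groupdict()
print(url, text, labelled)
```

The labels also keep working if another group is later added in front of
them, which is not true of numbered groups.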

But yeah, thanks for the help, especially that sub method.
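
A rough, self-contained sketch of that sub() trick, with a function as the
replacement. The applet markup, the url attribute and the name replace_tag
are assumptions made for the example, so treat it as an outline rather
than the code from the paste above:

```python
import re

# Hypothetical applet markup; the real tags in toolkit.txt may differ.
applet_re = re.compile(
    r'<applet\b(?P<attrs>[^>]*)>(?P<text>.*?)</applet>', re.DOTALL)
url_re = re.compile(r'url="(?P<url>[^"]*)"')

def replace_tag(match):
    """Return the replacement string for one matched applet."""
    # Fall back to a default label when the applet has no text.
    text = match.group('text').strip() or 'Default'
    url_match = url_re.search(match.group('attrs'))
    if url_match is None:
        return ''  # no URL: treat it as a placeholder and delete it
    return '<a href="%s">%s</a>' % (url_match.group('url'), text)

page = ('<p><applet url="a.html">First</applet> '
        '<applet code="x.class"></applet> '
        '<applet url="b.html"></applet></p>')

# sub() calls replace_tag once per match and splices in what it returns.
result = applet_re.sub(replace_tag, page)
print(result)
```

Because sub() does the splicing itself, there is no need to collect spans
and rebuild the string by hand.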

Regards,

Liam Clarke
On Thu, 09 Dec 2004 19:38:12 -0500, Kent Johnson <kent37 at tds.net> wrote:
> Liam,
> 
> Here's a nifty re trick for you. The sub() method can take a function as the replacement parameter.
> Instead of replacing with a fixed string, the function is called with the match object. Whatever
> string the function returns, is substituted for the match. So you can simplify your code a bit,
> something like this:
> 
> def replaceTag(item):   # item is a match object
>     # This is exactly your code
> 
>     # Will try and stick to string methods for this, but I'll see.
>     text = gettextFunc(item.group())
>     if not text:
>         # Gives the href a text value, so some lucky human can change it.
>         text = "Default"
>     # The simpler the better, and so far re has been the simplest.
>     url = geturlFunc(item.group())
>     if not url:
>         # This deletes the applet, as some applets are just placeholders.
>         href = ""
>     else:
>         href = '<a href="%s">%s</a>' % (url, text)
> 
>     # Now return href
>     return href
> 
> now your loop and replacements get replaced by the single line
> codeSt = reObj.sub(replaceTag, codeSt)
> 
> :-)
> 
> Kent
> 
> 
> 
> 
> Liam Clarke wrote:
> > Hi all,
> >
> > Yeah, I should've written this in functions from the get go, but I
> > thought it would be a simple script. :/
> >
> > I'll come back to that script when I've had some sleep, my son was
> > recently born and it's amazing how dramatically lack of sleep affects
> > my acuity. But, I want to figure out what's going wrong.
> >
> > That said, the re path is bearing fruit. I love the method finditer(),
> >  as I can reduce my overly complicated string methods from my original
> > code to
> >
> > x=open("toolkit.txt",'r')
> > codeSt=x.read()
> > x.close()
> > appList=[]
> >
> > regExIter=reObj.finditer(codeSt) #Here's a re obj I compiled earlier.
> >
> > for item in regExIter:
> >    #Will try and stick to string methods for this, but I'll see.
> >    text=gettextFunc(item.group())
> >    if not text:
> >       #Gives the href a text value, so some lucky human can change it.
> >       text="Default"
> >    #The simpler the better, and so far re has been the simplest.
> >    url=geturlFunc(item.group())
> >    if not url:
> >      #This deletes the applet, as some applets are just placeholders.
> >      href=""
> >    else:
> >      href='<a href="%s">%s</a>' % (url, text)
> >
> >    appList.append((item.span(), href)) #append() takes one argument
> >
> > appList.reverse() #Work from the end so earlier spans stay valid.
> >
> > for ((start, end), href) in appList:
> >      #Splice by position; replace() would hit every identical match.
> >      codeSt=codeSt[:start]+href+codeSt[end:]
> >
> >
> > Of course, that's just a rough draft, but it seems a whole lot
> > simpler to me. S'pose code needs a modicum of planning.
> >
> > Oh, and I d/led BeautifulSoup, but I couldn't work it right, so I
> > tried re, and it suits my needs.
> >
> > Thanks for all the help.
> >
> > Regards,
> >
> > Liam Clarke
> > On Thu, 09 Dec 2004 11:53:46 -0800, Jeff Shannon <jeff at ccvcorp.com> wrote:
> >
> >>Liam Clarke wrote:
> >>
> >>
> >>>So, I'm going to throw caution to the wind, and try an re approach. It
> >>>can't be any more unwieldy and ugly than what I've got going at the
> >>>moment.
> >>
> >>If you're going to try a new approach, I'd strongly suggest using a
> >>proper html/xml parser instead of re's.  You'll almost certainly have
> >>an easier time using a tool that's designed for your specific problem
> >>domain than you will trying to force a more general tool to work.
> >>Since you're specifically trying to find (and replace) certain html
> >>tags and attributes, and that's exactly what html parsers *do*, well,
> >>the conclusion seems obvious (to me at least). ;)
> >>
> >>There are lots of html parsing tools available in Python (though I've
> >>never needed one myself). I've heard lots of good things about
> >>BeautifulSoup...
> >>
> >>
> >>
> >>Jeff Shannon
> >>Technician/Programmer
> >>Credit International
> >>
> >>_______________________________________________
> >>Tutor maillist  -  Tutor at python.org
> >>http://mail.python.org/mailman/listinfo/tutor
> >>
> >
> >
> > 


-- 
'There is only one basic human right, and that is to do as you damn well please.
And with it comes the only basic human duty, to take the consequences.'

