[IronPython] Slow Performance of CPython libs?
Birsch
birsch at gmail.com
Thu Feb 21 17:30:12 CET 2008
I took on Cooper's advice and profiled with ANTS. Here are the top methods:
Namespace Method name Time (sec.) Time with children (sec.) Hit count Source
file System.Text.RegularExpressions RegexInterpreter.Go() 37.0189 94.4676
13689612
System.Text.RegularExpressions RegexInterpreter.Operator() 6.2411 6.2411
131146274
System.Text.RegularExpressions RegexInterpreter.Advance(int i) 5.9264
8.7202 66000263
System.Text.RegularExpressions RegexInterpreter.SetOperator(int op) 5.5750
5.5750 131146274
System.Text.RegularExpressions RegexInterpreter.Backtrack() 5.5692 9.4895
37781343
IronPython.Runtime.Operations Ops.CallWithContext(ICallerContext context,
object func, object arg0, object arg1) 5.5572 114.5245 79754
IronPython.Runtime.Calls Method.Call(ICallerContext context, object arg0)
4.9052 114.8251 50886
IronPython.Runtime.Calls PythonFunction.CallInstance(ICallerContext
context, object arg0, object arg1) 4.8876 114.8059 50886
IronPython.Runtime.Calls Function2.Call(ICallerContext context, object
arg0, object arg1) 4.6400 114.5471 47486
IronPython.Runtime.Operations Ops.CallWithContext(ICallerContext context,
object func, object arg0) 4.2344 114.1604 146658
System.Text.RegularExpressions RegexBoyerMoore.Scan(string text, int index,
int beglimit, int endlimit) 3.6465 3.6465 13678131
System.Text.RegularExpressions RegexCharClass.CharInClassRecursive(char ch,
string set, int start) 3.6288 5.7113 31508162
System.Text.RegularExpressions RegexInterpreter.Goto(int newpos) 3.2058
5.1470 27364668
System.Text.RegularExpressions RegexInterpreter.Operand(int i) 3.1923
3.1923 73230687
System.Text.RegularExpressions RegexRunner.EnsureStorage() 3.0803 3.0803
51474823
System.Text.RegularExpressions RegexCharClass.CharInClass(char ch, string
set) 3.0713 8.7827 31508162
IronPython.Runtime.Calls Method.Call(ICallerContext context, object arg0,
object arg1) 2.9821 7.8675 15012
IronPython.Runtime.Calls PythonFunction.CallInstance(ICallerContext
context, object arg0, object arg1, object arg2) 2.9794 7.8639 15012
System.Text.RegularExpressions RegexInterpreter.Forwardcharnext() 2.8852
2.8852 62865185
System.Text.RegularExpressions RegexInterpreter.Forwardchars() 2.8279
2.8279 59436277
System.Text.RegularExpressions RegexCharClass.CharInClassInternal(char ch,
string set, int start, int mySetLength, int myCategoryLength) 2.0632 2.0826
31508162
System.Text.RegularExpressions RegexRunner.Scan(Regex regex, string text,
int textbeg, int textend, int textstart, int prevlen, bool quick) 1.8376
101.7226 43009
System.Text.RegularExpressions RegexInterpreter.FindFirstChar() 1.6405
5.3456 13701755
IronPython.Runtime.Types OldClass.TryLookupSlot(SymbolId name, out object
ret) 1.5573 2.8124 389516
IronPython.Runtime.Operations Ops.GetAttr(ICallerContext context, object o,
SymbolId name) 1.5365 5.3456 558524
System.Text.RegularExpressions RegexInterpreter.Textpos() 1.4020 1.4020
32648926
System.Text.RegularExpressions RegexInterpreter.Advance() 1.1916 2.9526
13703950
System.Text.RegularExpressions RegexInterpreter.Textto(int newpos) 1.1218
1.1218 24120890
System.Text.RegularExpressions RegexInterpreter.TrackPeek() 1.0579 1.0579
24120894
System.Text.RegularExpressions Regex.Run(bool quick, int prevlen, string
input, int beginning, int length, int startat) 0.7280 102.4644 43009
System.Text.RegularExpressions RegexInterpreter.TrackPush(int I1) 0.6834
0.6834 13745149
System.Text.RegularExpressions RegexInterpreter.StackPush(int I1) 0.6542
0.6542 13703955
System.Text.RegularExpressions RegexInterpreter.TrackPop() 0.6068 0.6068
13663035
System.Text.RegularExpressions RegexInterpreter.TrackPush() 0.6049 0.6049
13708230
System.Text.RegularExpressions RegexInterpreter.StackPop() 0.5836 0.5836
13703956
System.Text.RegularExpressions RegexInterpreter.Bump() 0.4987 0.4987
10472790
System.Text.RegularExpressions RegexInterpreter.TrackPush(int I1, int I2)
0.4864 0.4864 10472790
System.Text.RegularExpressions RegexInterpreter.TrackPeek(int i) 0.4663
0.4663 10457859
System.Text.RegularExpressions RegexInterpreter.TrackPop(int framesize)
0.4396 0.4396 10457859
Moving up the stack of regex.Go(), most calls originate from sgmllib's
parse_starttag.
HTH,
-Birsch
On Thu, Feb 21, 2008 at 2:35 PM, Birsch <birsch at gmail.com> wrote:
> Thanks Michael and Dino.
>
> I'll prof and send update. Got a good profiler recommendation for .Net?
> Meanwhile I noticed the sample site below causes BeautifulSoup to generate
> quite a few [python] exceptions during __init__. Does IronPython handle
> exceptions significantly slower than CPtyhon?
>
> Repro code is simple (just build a BeautifulSoup obj with mininova's home
> page).
> Here are the .py and .cs I used to time the diffs:
>
> *bstest.py:*
> #Bypass CPython default socket implementation with IPCE/FePy
> import imp, os, sys
> sys.modules['socket'] = module = imp.new_module('socket')
> execfile('socket.py', module.__dict__)
>
> from BeautifulSoup import BeautifulSoup
> from urllib import urlopen
> import datetime
>
> def getContent(url):
> #Download html data
> startTime = datetime.datetime.now()
> print "Getting url", url
> html = urlopen(url).read()
> print "Time taken:", datetime.datetime.now() - startTime
>
> #Make soup
> startTime = datetime.datetime.now()
> print "Making soup..."
> soup = BeautifulSoup(markup=html)
> print "Time taken:", datetime.datetime.now() - startTime
>
> if __name__ == "__main__":
> print getContent("www.mininova.org")
>
>
> *C#:*
> using System;
> using System.Collections.Generic;
> using System.Text;
> using IronPython.Hosting;
>
> namespace IronPythonBeautifulSoupTest
> {
> public class Program
> {
> public static void Main(string[] args)
> {
> //Init
> System.Console.WriteLine("Starting...");
> DateTime start = DateTime.Now;
> PythonEngine engine = new PythonEngine();
>
> //Add paths:
> //BeautifulSoup.py, socket.py, bstest.py located on exe dir
> engine.AddToPath(@".");
> //CPython Lib (replace with your own)
> engine.AddToPath(@"D:\Dev\Python\Lib");
>
> //Import and load
> TimeSpan span = DateTime.Now - start;
> System.Console.WriteLine("[1] Import: " + span.TotalSeconds);
> DateTime d = DateTime.Now;
> engine.ExecuteFile(@"bstest.py");
> span = DateTime.Now - d;
> System.Console.WriteLine("[2] Load: " + span.TotalSeconds);
>
> //Execute
> d = DateTime.Now;
> engine.Execute("getContent(\"http://www.mininova.org\")");
> span = DateTime.Now - d;
> System.Console.WriteLine("[3] Execute: " + span.TotalSeconds);
> span = DateTime.Now - start;
> System.Console.WriteLine("Total: " + span.TotalSeconds);
>
> }
> }
> }
>
>
>
> On Wed, Feb 20, 2008 at 6:57 PM, Dino Viehland <
> dinov at exchange.microsoft.com> wrote:
>
> > We've actually had this issue reported once before a long time ago -
> > it's a very low CodePlex ID -
> > http://www.codeplex.com/IronPython/WorkItem/View.aspx?WorkItemId=651
> >
> > We haven't had a chance to investigate the end-to-end scenario. If
> > someone could come up with a smaller simpler repro that'd be great.
> > Otherwise we haven't forgotten about it we've just had more immediately
> > pressing issues to work on :(.
> >
> > -----Original Message-----
> > From: users-bounces at lists.ironpython.com [mailto:
> > users-bounces at lists.ironpython.com] On Behalf Of Michael Foord
> > Sent: Wednesday, February 20, 2008 5:20 AM
> > To: Discussion of IronPython
> > Subject: Re: [IronPython] Slow Performance of CPython libs?
> >
> > Birsch wrote:
> > > Hi - We've been using IronPython successfully to allow extensibility
> > > of our application.
> > >
> > > Overall we are happy with the performance, with the exception of
> > > BeautifulSoup which seems to run very slowly: x5 or more time to
> > > execute compared to CPython.
> > >
> > > Most of the time seems to be spent during __init__() of BS, where the
> > > markup is parsed.
> > >
> > > We suspect this has to do with the fact that our CPython env is
> > > executing .pyc files and can precompile its libs, while the IronPython
> > > environment compiles each iteration. We couldn't find a way to
> > > pre-compile the libs and then introduce them into the code, but in any
> > > case this will result in a large management overhead since the amount
> > > of CPython libs we expose to our users contains 100's of modules.
> > >
> > > Any ideas on how to optimize?
> >
> > I think it is worth doing real profiling to find out where the time is
> > being spent during parsing.
> >
> > If it is spending most of the time in '__init__' then the time is
> > probably not spent in importing - so compilation isn't relevant and it
> > is a runtime performance issue. (Importing is much slower with
> > IronPython and at Resolver Systems we do use precompiled binaries - but
> > strangely enough it doesn't provide much of a performance gain.)
> >
> > Michael
> > http://www.manning.com/foord
> >
> > >
> > > Thanks,
> > > -Birsch
> > >
> > > Note: we're using FePy/IPCE libs with regular IP v1.1.1 runtime DLLs
> > > (this was done to overcome library incompatibilities and network
> > > errors). However, the relevant slow .py code (mainly SGMLParser and
> > > BeautifulSoup) is the same.
> > >
> > ------------------------------------------------------------------------
> > >
> > > _______________________________________________
> > > Users mailing list
> > > Users at lists.ironpython.com
> > > http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
> > >
> >
> > _______________________________________________
> > Users mailing list
> > Users at lists.ironpython.com
> > http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
> > _______________________________________________
> > Users mailing list
> > Users at lists.ironpython.com
> > http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
> >
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/ironpython-users/attachments/20080221/744d61e7/attachment.html>
More information about the Ironpython-users
mailing list