[Cython] GSoC Proposal - Reimplement C modules in CPython's standard library in Cython.
Arthur de Souza Ribeiro
arthurdesribeiro at gmail.com
Sun Apr 17 20:07:38 CEST 2011
2011/4/15 Stefan Behnel <stefan_ml at behnel.de>
> [please avoid top-posting]
>
> Arthur de Souza Ribeiro, 15.04.2011 04:31:
>
>> I've created the .pyx files and they passed all the Python tests.
>>
>
> Fine.
>
> As far as I can see, you only added static types in some places. Did you
> test if they are actually required (maybe using "cython -a")? Some of them
> look rather counterproductive and should lead to a major slow-down.
In fact, I hadn't, but after you suggested it I ran cython -a and removed
some unnecessary types.
> I added comments to your initial commit.
>
Hi Stefan, about your first comment, "And it's better to let Cython know
that this name refers to a function.", on line 69 of encoder.pyx: I didn't
quite understand what it means. Could you explain it a bit more?
About the other comments, I think I have addressed them all. If there is any
problem with them, or with anything else, please tell me and I'll try to fix it.
> Note that it's not obvious from your initial commit what you actually
> changed. It would have been better to import the original file first, rename
> it to .pyx, and then commit your changes.
>
I created a directory named 'Diff files' where I put the output of the
'diff' command that I ran on my computer. If you think it would still be
better for me to commit the original files first and then my changes, that
is no problem for me.
>
> It appears that you accidentally added your .c and .so files to your repo.
>
>
> https://github.com/arthursribeiro/JSON-module
>
>
Removed them.
>
>> To test them, as I said, I copied the .py test files to my project
>> directory, generated the .so files, import them instead of python modules
>> and run. I run every test file and it passed in all of them. To run the
>> tests, run the file 'run-tests.sh'
>>
>> I used just .pyx in this module, should I reimplement it using pxd with
>> the normal .py?
>>
>
> Not at this point. I think it's more important to get some performance
> numbers to see how your module behaves compared to the C accelerator module
> (_json.c). I think the best approach to this project would actually be to
> start with profiling the Python implementation to see where performance
> problems occur (or to look through _json.c to see what the CPython
> developers considered performance critical), and then put the focus on
> trying to speed up only those parts of the Python implementation, by adding
> static types and potentially even rewriting them in a way that Cython can
> optimise them better.
>
I've profiled the module I created and the json module that ships with
Python 3.2. The Cython module spent about 73% less time than Python's one
(0.268 s versus 1.016 s for nested_dict, 0.140 s versus 0.580 s for ustring).
The behavior of my module and Python's seems to be the same, which I think is
the way it should be. The profiler output was as follows ("JSONModule" is the
Cython module, "json" is Python's):
JSONModule nested_dict
         10004 function calls in 0.268 seconds
   Ordered by: standard name
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    10000    0.196    0.000    0.196    0.000 :0(dumps)
        1    0.000    0.000    0.268    0.268 :0(exec)
        1    0.000    0.000    0.000    0.000 :0(setprofile)
        1    0.072    0.072    0.268    0.268 <string>:1(<module>)
        1    0.000    0.000    0.268    0.268 profile:0(for ii in range(10000): fun(thing))
        0    0.000             0.000          profile:0(profiler)
json nested_dict
         60004 function calls in 1.016 seconds
   Ordered by: standard name
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    1.016    1.016 :0(exec)
    20000    0.136    0.000    0.136    0.000 :0(isinstance)
    10000    0.120    0.000    0.120    0.000 :0(join)
        1    0.000    0.000    0.000    0.000 :0(setprofile)
        1    0.088    0.088    1.016    1.016 <string>:1(<module>)
    10000    0.136    0.000    0.928    0.000 __init__.py:180(dumps)
    10000    0.308    0.000    0.792    0.000 encoder.py:172(encode)
    10000    0.228    0.000    0.228    0.000 encoder.py:193(iterencode)
        1    0.000    0.000    1.016    1.016 profile:0(for ii in range(10000): fun(thing))
        0    0.000             0.000          profile:0(profiler)
JSONModule ustring
         10004 function calls in 0.140 seconds
   Ordered by: standard name
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    10000    0.072    0.000    0.072    0.000 :0(dumps)
        1    0.000    0.000    0.140    0.140 :0(exec)
        1    0.000    0.000    0.000    0.000 :0(setprofile)
        1    0.068    0.068    0.140    0.140 <string>:1(<module>)
        1    0.000    0.000    0.140    0.140 profile:0(for ii in range(10000): fun(thing))
        0    0.000             0.000          profile:0(profiler)
json ustring
         40004 function calls in 0.580 seconds
   Ordered by: standard name
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    10000    0.092    0.000    0.092    0.000 :0(encode_basestring_ascii)
        1    0.004    0.004    0.580    0.580 :0(exec)
    10000    0.060    0.000    0.060    0.000 :0(isinstance)
        1    0.000    0.000    0.000    0.000 :0(setprofile)
        1    0.100    0.100    0.576    0.576 <string>:1(<module>)
    10000    0.152    0.000    0.476    0.000 __init__.py:180(dumps)
    10000    0.172    0.000    0.324    0.000 encoder.py:172(encode)
        1    0.000    0.000    0.580    0.580 profile:0(for ii in range(10000): fun(thing))
        0    0.000             0.000          profile:0(profiler)
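For anyone who wants to reproduce numbers of this kind, here is a minimal sketch of a harness that would produce similar reports. It is an assumption about the setup, not the actual script from the repository: the statement shape ("for ii in range(...): fun(thing)") is taken from the profile headers above, but the helper names profile_calls and relative_reduction and the sample data are hypothetical, and cProfile is used here instead of the pure-Python profile module that the headers suggest.

```python
import cProfile
import io
import json
import pstats


def profile_calls(fun, thing, iterations=10000):
    """Profile repeated calls of fun(thing); return (total_seconds, report_text)."""
    profiler = cProfile.Profile()
    # Same statement shape as in the profile headers of the reports.
    profiler.runctx(
        "for ii in range(iterations): fun(thing)",
        {},
        {"fun": fun, "thing": thing, "iterations": iterations},
    )
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # "stdname" gives the "Ordered by: standard name" layout.
    stats.strip_dirs().sort_stats("stdname").print_stats()
    return stats.total_tt, buffer.getvalue()


def relative_reduction(baseline_seconds, candidate_seconds):
    """Percentage of the baseline time saved by the candidate."""
    return 100.0 * (baseline_seconds - candidate_seconds) / baseline_seconds


if __name__ == "__main__":
    # Hypothetical input; the real nested_dict test data is not shown in this mail.
    nested_dict = {"key": {"inner": [1, 2, {"deep": "value"}]}}
    total, report = profile_calls(json.dumps, nested_dict)
    print(report)
    # Totals quoted in the reports: Python's json versus the Cython module.
    print("nested_dict: %.1f%% less time" % relative_reduction(1.016, 0.268))
    print("ustring: %.1f%% less time" % relative_reduction(0.580, 0.140))
```

To compare two builds, the same call would be made once with the stdlib json.dumps and once with the dumps function of the compiled module, and the two totals passed to relative_reduction.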
The code is updated in the repository. If you have any comments, please
let me know. Thank you very much for your feedback.
Best Regards.
[]s
Arthur
>
> Stefan
>