Python good for data mining?

Bruno Desthuilliers bdesth.quelquechose at free.quelquepart.fr
Sun Nov 4 16:36:35 EST 2007


Jens wrote:
> I'm starting a project in data mining, and I'm considering Python and
> Java as possible platforms.
> 
> I'm concerned about performance. Most benchmarks report that Java is
> about 10-15 times faster than Python,

Benchmarking is difficult, and most benchmarks are easily skewed toward
a chosen result. (Pure) Python is slower than Java for some tasks, and as
fast as C for some others. For the tasks where it is slower, it's quite
possible that a C-based package already exists.
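
As a rough illustration of the point above, the sketch below times the same
summation done with an interpreted loop and with the built-in sum(), which
is implemented in C in CPython. The function names and the data size are
made up for the example, and the timings depend entirely on the machine and
interpreter, so treat the numbers as indicative only.

```python
import timeit


def python_loop_sum(values):
    """Sum the values with an explicit loop run by the interpreter."""
    total = 0
    for v in values:
        total += v
    return total


def builtin_sum(values):
    """Sum the values with the built-in sum() (C code in CPython)."""
    return sum(values)


if __name__ == "__main__":
    # Hypothetical workload: 200,000 integers, each function timed over 10 runs.
    data = list(range(200_000))
    for fn in (python_loop_sum, builtin_sum):
        seconds = timeit.timeit(lambda: fn(data), number=10)
        print(f"{fn.__name__}: {seconds:.3f}s for 10 runs")
```

Both functions return the same result; only the place where the loop runs
differs.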

> and my own experiments confirm
> this. 

<bis mode="Benchmarking is difficult">
If you go that way, Java is way slower than C++ - and let's not talk 
about resources...
</bis>

> I could imagine this to become a problem for very large
> datasets.

If you have very large datasets, you're probably using a serious RDBMS,
which will do most of the job.
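
A minimal sketch of what "letting the database do the job" can look like:
the counting and sorting happen inside the SQL engine, and Python only
receives the small result set. The standard-library sqlite3 module is used
purely as a stand-in so the example runs without a server; the table,
columns and rows are invented for the illustration. Other DB-API drivers
follow the same connect/execute/fetch pattern, though the placeholder
syntax may differ.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database with made-up purchase records."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE purchases (customer TEXT, item TEXT)")
    rows = [
        ("ann", "bread"), ("ann", "milk"),
        ("bob", "bread"), ("bob", "beer"),
        ("cat", "bread"), ("cat", "milk"),
    ]
    conn.executemany("INSERT INTO purchases VALUES (?, ?)", rows)
    conn.commit()
    return conn


def top_items(conn, limit=3):
    """Return the most frequently bought items, aggregated by the database."""
    cur = conn.execute(
        "SELECT item, COUNT(*) AS n FROM purchases "
        "GROUP BY item ORDER BY n DESC, item ASC LIMIT ?",
        (limit,),
    )
    return cur.fetchall()
```

Calling top_items(make_demo_db(), 2) returns a list of (item, count)
tuples, without Python ever iterating over the raw rows.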

> How good is the integration with MySQL in Python?

Pretty good - but I wouldn't call MySQL a serious RDBMS.

> What about user interfaces? How easy is it to use Tkinter for
> developing a user interface without an IDE? And with an IDE? (which
> IDE?)

If your GUI is complex and important enough to need a GUI builder (which
I guess is what you mean by IDE), then forget about Tkinter, and go for
either PyGTK, PyQt or wxPython.

> What if I were to use my Python libraries with a web site written in
> PHP, Perl or Java - how do I integrate with Python?

HTTP is language-agnostic: expose the Python code as an HTTP service and
call it from whatever the web site is written in.
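
One way this could look, as a sketch using only the standard library: a
small HTTP endpoint that wraps a Python function and answers in JSON, which
PHP, Perl or Java clients can all parse. The mean() function, the "x" query
parameter and the port number are hypothetical placeholders for a real
library call, and a production setup would more likely sit behind a proper
web server.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def mean(values):
    """Placeholder for the real Python library function being exposed."""
    return sum(values) / len(values)


class MeanHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Read repeated "x" query parameters, e.g. /mean?x=1&x=2&x=3
        query = parse_qs(urlparse(self.path).query)
        try:
            values = [float(x) for x in query.get("x", [])]
            payload, status = {"mean": mean(values)}, 200
        except (ValueError, ZeroDivisionError):
            # Non-numeric input or no values at all
            payload, status = {"error": "pass one or more numeric x values"}, 400
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default per-request logging to stderr
        pass


def make_server(port=8000):
    """Create the server bound to localhost; port 0 picks a free port."""
    return HTTPServer(("127.0.0.1", port), MeanHandler)
```

Running make_server(8000).serve_forever() starts the service; any client
that can issue a GET request and parse JSON can then use it, regardless of
its own language.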


