[SciPy-User] Trying hand-writing recognition with scikits.learn

Gael Varoquaux gael.varoquaux at normalesup.org
Mon May 9 09:32:46 EDT 2011


On Mon, May 09, 2011 at 01:46:45PM +0200, Klonuo Umom wrote:
> If you could point to some source how 'digits.csv.gz' was distilled from 
> 'http://archive.ics.uci.edu/ml/machine-learning-databases/pendigits/' 
> data, or some similar example, I could probably start wandering around 
> and maybe ask smarter questions on the scikits.learn mailing list 

Your wishes are 

https://github.com/scikit-learn/scikit-learn/commit/348d9aa6cab8fe0c0819514fc0cc00c32f6abba1
https://github.com/scikit-learn/scikit-learn/commit/1c7c01145eb997ae3a95513ac0854d9db8105b1e

(I did spend an hour on this).

> I tried to look from other side, like 'reusing of existing data from 
> http://mlcomp.org', but I can't find my common denominator with their 
> provided datasets. 

Yes, and this is no surprise. In general, one can face data with
arbitrary shape, size, structure... In my years of experience with data
processing, there is always at least an hour or so to spend massaging
new data into shape before being able to use it.
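
To give an idea of what that massaging tends to look like, here is a
minimal sketch. It is not the code from the commits above. It assumes a
comma-separated text file in which every row holds the feature values
followed by the class label in the last column. The rows in RAW are made
up for illustration, and split_features_labels is a hypothetical helper
name.

```python
import io

import numpy as np

# Made-up rows for illustration only (not real dataset values):
# four integer features followed by a class label in the last column.
RAW = """\
47,100,27,81,8
0,89,27,100,2
0,57,31,68,1
"""


def split_features_labels(text):
    """Parse comma-separated rows into a feature matrix and a label vector."""
    # loadtxt reads every field as a float; atleast_2d keeps a single-row
    # file from collapsing into a 1-D array.
    table = np.atleast_2d(np.loadtxt(io.StringIO(text), delimiter=","))
    X = table[:, :-1]             # every column except the last: features
    y = table[:, -1].astype(int)  # last column: integer class label
    return X, y


X, y = split_features_labels(RAW)
print(X.shape, y)
```

With a real file, the io.StringIO wrapper would be replaced by the file
name. From there on, X and y have the (n_samples, n_features) and
(n_samples,) shapes that the estimators expect.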

Cheers,

Gael


