looking for "optimal weighting" algorithm

Magnus Lie Hetland mlh at furu.idi.ntnu.no
Sat May 31 21:57:27 EDT 2003


In article <3ed93a31$1 at ham>, Joel Neely wrote:
>Alex Martelli wrote:
>> 
>> The problem: I need to design a "decision criterion" to classify
>> "observations".  For each observation I measure the values of a
>> number N of features, x1, x2, ... xN; the desired design criterion
>> is a set of weights w1, w2, ... wN such that for any observation I 
>> will then just compute a weighted sum
>>   S = w1*x1 + w2*x2 + ... + wN*xN
>> and classify the observation as Black if S<=1, White if S>1...

To find an optimal solution (one that separates the two classes with
the largest possible margin), you should take a look at linear support
vector machines. Neural network training in general gives no such
guarantees.
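
To make "margin" concrete, here is a minimal sketch in plain Python of
the decision rule quoted above (Black if S<=1, White if S>1) together
with the geometric margin of a given weight vector. It does not train
an SVM; it only shows the quantity a linear SVM would maximize. The
toy data and the names weighted_sum, classify, margin, w_a and w_b are
made up for illustration.

```python
import math

def weighted_sum(w, x):
    """S = w1*x1 + w2*x2 + ... + wN*xN."""
    return sum(wi * xi for wi, xi in zip(w, x))

def classify(w, x):
    """Apply the rule from the original question."""
    return "White" if weighted_sum(w, x) > 1 else "Black"

def margin(w, data):
    """Smallest signed distance from the hyperplane w.x = 1 to any
    labelled point; it is negative if some point is misclassified."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    distances = []
    for x, label in data:
        sign = 1 if label == "White" else -1
        distances.append(sign * (weighted_sum(w, x) - 1) / norm)
    return min(distances)

# Hypothetical toy observations with two features each.
data = [
    ((0.0, 0.0), "Black"),
    ((0.5, 0.0), "Black"),
    ((2.0, 2.0), "White"),
    ((3.0, 1.0), "White"),
]

w_a = (0.5, 0.5)  # candidate weights using both features
w_b = (1.0, 0.0)  # candidate weights using only the first feature

for w in (w_a, w_b):
    print(w, [classify(w, x) for x, _ in data], margin(w, data))
```

Both candidates classify the toy data correctly, but w_a leaves more
room between the decision boundary and the nearest observation, which
is the sense in which one solution can be "more optimal" than another.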

OTOH: If the classifier doesn't have to be based on a linear
discriminant (decision criterion), you can use almost any machine
learning/pattern recognition method that has ever been devised... (See
Weka, http://www.cs.waikato.ac.nz/ml/weka, or Orange,
http://magix.fri.uni-lj.si/orange/default.asp, for some examples.)

The choice would most likely be guided by your knowledge about the
domain (for example, if the dimensions are independent, given the
class, you might want to go for a naive Bayesian classifier, because
it is so simple).  Non-linear support vector machines would probably
be my first choice if I knew nothing of the domain -- there are
existing implementations that give excellent results in most cases.
For example, LibSVM (http://www.csie.ntu.edu.tw/~cjlin/libsvm/) has
won several competitions in classification/prediction, and has a
Python interface, which is, of course, useful if you're planning on
working in Python :]
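
As an illustration of how simple the naive Bayesian option is, here is
a minimal sketch in plain Python. It assumes each feature is Gaussian
within each class (an assumption of this sketch, not something stated
in the original question), and the tiny constant added to each variance
is likewise just a guard against division by zero. The function names
and the toy data are hypothetical.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(samples, labels):
    """Estimate a prior and per-feature (mean, variance) for each class."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    model = {}
    for y, rows in by_class.items():
        stats = []
        for column in zip(*rows):  # one column per feature
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column)
            stats.append((mean, var + 1e-9))  # assumed smoothing term
        model[y] = (len(rows) / len(samples), stats)
    return model

def predict_gaussian_nb(model, x):
    """Return the class with the highest log-posterior for x."""
    def log_posterior(y):
        prior, stats = model[y]
        total = math.log(prior)
        # Independence given the class: the log-likelihoods simply add up.
        for value, (mean, var) in zip(x, stats):
            total += -0.5 * math.log(2 * math.pi * var)
            total -= (value - mean) ** 2 / (2 * var)
        return total
    return max(model, key=log_posterior)

# Hypothetical toy observations with two features each.
samples = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
           (2.0, 2.1), (2.2, 1.9), (1.9, 2.3)]
labels = ["Black", "Black", "Black", "White", "White", "White"]

model = fit_gaussian_nb(samples, labels)
print(predict_gaussian_nb(model, (0.1, 0.1)))
print(predict_gaussian_nb(model, (2.0, 2.0)))
```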

You might also want to check out AdaBoost (and related methods), which
incrementally builds a linear combination of simple classifiers (each
of which may, I believe, be based on just the value of a single
dimension), focusing increasingly on the 'hard' examples in the data
set.
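
A minimal sketch of the AdaBoost idea in plain Python follows. It
assumes the simple classifiers are threshold tests on a single feature
("decision stumps"), which is one common choice rather than the only
one, and it encodes Black as -1 and White as +1. All names and the toy
data are hypothetical, and the clamping of the error is only there to
keep the logarithm finite.

```python
import math

def stump_predict(x, feature, threshold, polarity):
    """Simple classifier: compare one feature against a threshold."""
    return polarity if x[feature] > threshold else -polarity

def best_stump(samples, labels, weights):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = None
    for feature in range(len(samples[0])):
        for threshold in sorted({x[feature] for x in samples}):
            for polarity in (1, -1):
                error = sum(
                    w for x, y, w in zip(samples, labels, weights)
                    if stump_predict(x, feature, threshold, polarity) != y
                )
                if best is None or error < best[0]:
                    best = (error, feature, threshold, polarity)
    return best

def adaboost(samples, labels, rounds=5):
    """Return a list of (alpha, feature, threshold, polarity) tuples."""
    n = len(samples)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, f, t, p = best_stump(samples, labels, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # keep log() finite
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, f, t, p))
        # Misclassified ('hard') examples gain weight, the rest lose it.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, f, t, p))
            for x, y, w in zip(samples, labels, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Sign of the weighted vote of all stumps."""
    score = sum(a * stump_predict(x, f, t, p) for a, f, t, p in ensemble)
    return 1 if score > 0 else -1

# Hypothetical toy data: the second feature separates the classes,
# the first one is noise. -1 stands for Black, +1 for White.
samples = [(0.3, 0.1), (0.9, 0.2), (0.1, 0.4),
           (0.8, 1.5), (0.2, 1.7), (0.7, 1.9)]
labels = [-1, -1, -1, 1, 1, 1]

ensemble = adaboost(samples, labels)
print([predict(ensemble, x) for x in samples])
```

The result is again a weighted sum compared against a threshold, only
over the outputs of the stumps rather than over the raw features.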

-- 
Magnus Lie Hetland                "In this house we obey the laws of
http://hetland.org                 thermodynamics!"    Homer Simpson



