[SciPy-dev] GSoC weekly report

Fri Jul 6 08:05:30 EDT 2007

Hi all,
today's topics:
1. Automatic gradient checks had been implemented
2. f,c,h patterns: need more time & efforts (I describe the problems 
here according to Alan Isaac's request)
3. Letters from ALGENCAN developers.
4. My optimization department staff have explained me their own opinion 
about OpenOpt renaming, that is insisted by Jarrod Millman and Alan 
Isaac (and maybe some other scipy developers?).

1. This week automatic gradient checks for f->min, c[i](x)<=0, h[j](x)=0 
had been implemented. I guess it's done in much more convenient way that 
MATLAB fmincon do, isn't it?
See 
http://openopt.blogspot.com/2007/07/automatic-gradient-check-ready.html 
for more details, and/or see the example below:
...
OpenOpt checks user-supplied gradient df (size: (15,))
according to:
prob.diffInt = 1e-07# step for numerical gradient obtaining
prob.check.maxViolation = 1e-07# lines where difference is less than the 
number will not be shown, default 1e-5
df num         user-supplied     numerical         difference
     0             +6.000e+00     +6.000e+00     -2.122e-07
     2             -1.666e+01     -1.666e+01     +1.048e-06
     3             -2.584e+01     -2.584e+01     +1.400e-06
     4             -2.046e+01     -2.046e+01     +9.434e-07
     5             -5.461e+00     -5.461e+00     +1.512e-06
     6             +5.363e+00     +5.363e+00     +4.905e-07
     9             -2.458e+01     -2.458e+01     +5.861e-07
     10            -2.343e+01     -2.343e+01     +1.162e-06
     11            -9.929e+00     -9.929e+00     +1.588e-07
     12            +3.502e+00     +3.502e+00     +7.354e-07
     14            -7.812e+00     -7.812e+00     -8.602e-07
max(abs(df_user - df_numerical)) = 1.51219384126e-06
(is registered in df number 5)
sum(abs(df_user - df_numerical)) = 9.37081698371e-06
========================
OpenOpt checks user-supplied gradient dc (size: (15, 2))
according to:
prob.diffInt = 1e-07
prob.check.maxViolation = 1e-07
dc num   i,j:dc[i]/dx[j]   user-supplied     numerical         difference
     0              0 / 0         +4.096e+03     +4.096e+03     -4.787e-05
     2              1 / 0         +0.000e+00     -3.638e-05     +3.638e-05
     3              1 / 1         +8.645e+00     +8.645e+00     -1.301e-07
     4              2 / 0         +0.000e+00     -3.638e-05     +3.638e-05
     5              2 / 1         -6.658e+00     -6.658e+00     -1.350e-07
     6              3 / 0         +0.000e+00     -3.638e-05     +3.638e-05
     8              4 / 0         +0.000e+00     -3.638e-05     +3.638e-05
     10             5 / 0         +0.000e+00     -3.638e-05     +3.638e-05
     12             6 / 0         +0.000e+00     -3.638e-05     +3.638e-05
     14             7 / 0         +0.000e+00     -3.638e-05     +3.638e-05
     16             8 / 0         +0.000e+00     -3.638e-05     +3.638e-05
     18             9 / 0         +0.000e+00     -3.638e-05     +3.638e-05
     20            10 / 0         +0.000e+00     -3.638e-05     +3.638e-05
     22            11 / 0         +0.000e+00     -3.638e-05     +3.638e-05
     24            12 / 0         +0.000e+00     -3.638e-05     +3.638e-05
     26            13 / 0         +0.000e+00     -3.638e-05     +3.638e-05
     28            14 / 0         +0.000e+00     -3.638e-05     +3.638e-05
max(abs(dc_user - dc_numerical)) = 4.78662564092e-05
(is registered in dc number 0)
sum(abs(dc_user - dc_numerical)) = 0.000557448428236
========================
OpenOpt checks user-supplied gradient dh (size: (15, 2))
according to:
prob.diffInt = 1e-07
prob.check.maxViolation = 1e-07
dh num   i,j:dh[i]/dx[j]   user-supplied     numerical         difference
     27            13 / 1         +7.642e+02     +7.642e+02     -2.108e-05
     28            14 / 0         +3.312e+03     +3.312e+03     -5.292e-03
max(abs(dh_user - dh_numerical)) = 0.00529207172031
(is registered in dh number 28)
sum(abs(dh_user - dh_numerical)) = 0.00531315227033
========================

(to see the messages you need to turn prob.check.df=1, prob.check.dh=1, 
prob.check.dc=1)

2. Patterns: I should decide which way of implementing those ones is the 
best.
For example, consider the problem
n,m,s = 10000,1000,1000
((x+15)^2).sum() -> min# size(x)=n
acoording to constraints
x1^2 + x2^2 + ... + xm^2 <=c1
x2^2 + x3^2 + ... + x[m+1]^2 <=c2
...
xs^2+x[s+1]^2+... + x[m+s]^2 <= c[s]

so cPattern will look like
1 1 1...1 0 0 0 0
0 1 1 ..1 1 0 0 0
0 0 1 ..1 1 1 0 0
...
0 0 0 ..0 0 0 0 0
0 0 0 ..0 0 0 0 0

if I will store dc as
1) dense matrix
2) sparse matrix
it will require
1) O(n x s) ~= 10000*1000*sizeof(float) memory bytes
2) O(m x s) = 1000*1000*(sizeof(float) + 2*sizeof(int)) memory bytes
However, lots of solvers require just dL/dx = df/dx + dc/dx + dh/dx = 
grad(f) + sum(dot(mu,c(x))) + sum(dot(lambda,h(x)))
so, if dc/dx will be obtained sequentially as dc[i]/dx or dc/dx[i], we 
can just summarize these ones step by step and amount of memory needed 
will be O(n) , approximately sizeof(float)*n
However, unfortunately not all solvers need just dL/dx, or L is not just 
f + <mu,c> + <lambda, h>, like for example in Augmented Lagrangian - 
related solvers. On the other hand, seems like it should always work 
with fPattern, because f = Fwhole is always just mere sum of f[j]
One more problem is how to implement data storing. The most obvious 
approach is of course double cycle
for i in xrange(n)
  for j in xrange(len(c))#number of c inequalities
    dc[indexes related to the vector (and its' length) returned from 
c[j](x) evaluation] = (c[j](x)-c0[j])/p.diffInt
('cause c[j] can return several numbers, if we consider the most general 
case)
However, double cycle is too slow for Python. I intend to write single 
cycle, based on
for i in xrange(n)
  j = where(cPattern[i])
 C = <somehow evaluate all c[j](x)>
 dc[some ind] = (C[some ind] - C0[some ind])/p.diffInt
here some ind could be something like [1,2,3] + [4,5] + [8,9,10] + ... + 
[100, 101] - values obtained from those j: cPattern[j]=1

of couse in code I will divide dcmatrix only once, using ufunc /, when 
whole matrix dc is obtained

However, some problems still exist.

3.This week I've received (and answered to) some letters from ALGENCAN 
developers, they are interested in connecting their solvers to Python 
and OO. Their solvers often works better than IPOPT, some results are 
attached in their articles 
http://www.ime.usp.br/~egbirgin/tango/publications.php
(Their software is Augmented Lagrangian - based)
+ They informed me that they had changed lots of their solvers licenses 
from "free for noncommercial usage" (that are not OSI-approved) to GPL 
(some their software remain commercial).

4. So, I have informed my department about your intention to change 
OpenOpt name.
They have answered me:
Dmitrey, we had agreed to make OpenOpt our department's vendor, like 
TomOpt, PENOPT, IPOPT, SolvOpt, CVXOPT are, as well as construct (in 
future) a website like openopt.net and create a banner like
"member of
OpenOpt
project"
for link exchange and/or our partners.
So you propose to turn it  into something like "member of 
scikits.optimize project"?
So it's your own decision, but please, don't annoy with your questions 
anymore - we have our own work, and we do not see any benefits of 
spending our time with your one, moreover, for free. Please, just finish 
your education as quickly as it can be and go away - we need people 
working in our projects.
Particularly, it means that I will not be able to implement the QP/QPQC 
solver that I told about (this one is needed as default lincher QP 
subproblem solver with BSD license).
Also, unfortunately it means problems with consulting about global 
GRASP-based solver (and implementing this one into 
scikits.optimization), because I still misunderstood some dr. Shilo's 
algorithm steps.

In brief that's all for the week. BTW it's last one before 1st GSoC 
milestone (July 9th) and I had done all mentioned in 
http://projects.scipy.org/scipy/scipy/wiki/OptimizationProblems
except patterns.

Regards, D.