[MATRIX-SIG] Much ado about nothingness.

Hoon Yoon - IPT Quant hyoon@nyptsrv1.etsd.ml.com
Wed, 9 Jul 1997 10:01:08 -0400


Gentlemen:

  I know some people are a bit frustrated with this, and my earlier posts
have been somewhat compromised by my poor grasp of the underlying mechanics.
   First of all, I added the None Python object to my array not because I
thought adding this object was appropriate, but because it was the closest
thing to NaN or missing in Gauss, Ox, and Splus (my matrix langs)
that I could find. I actually thought it was a bad idea, because it changed
my matrix to complex when it should have stayed basic Int type. The cost of
this overhead was clearly high.
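
   To illustrate the kind of type change described above, here is a minimal
sketch using present-day NumPy. That is an assumption: it is not the 1997
release discussed in this thread, whose type promotion may have differed.
In current NumPy, a single None turns the whole array into generic Python
objects, while NaN at least keeps it numeric (as floating point):

```python
import numpy as np

ints = np.array([1, 2, 3])            # plain integer array
with_none = np.array([1, 2, None])    # None cannot be stored as an integer
with_nan = np.array([1, 2, np.nan])   # NaN only exists as a float

print(ints.dtype.kind)       # 'i' (integer)
print(with_none.dtype.kind)  # 'O' (generic Python objects)
print(with_nan.dtype.kind)   # 'f' (floating point)
```

Either way the basic Int type is lost, which is the overhead complained
about above.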
   As Andrew mentioned:
   
All the other stuff like filling in the last known value,
'hot-record-replacement', statistical inference via EM algorithm, robust
pseudo-values,
etc., are useful in their place and _mutually contradictory_.
(snip)

   How I would deal with missing data once identified depends completely
on the kind of modeling I do. I was illustrating some basic ways I would
use this missing identifier. Once a missing has been specified, I should
know what to do with it. As I have said, no package out there deals with
missings well enough, but there is also no stat package that I know of
without a missing identifier inside an array. (I don't know Matlab, but the
ones mentioned above plus SAS have one.) As everyone here realizes, there
are a lot of missings, and some can easily be controlled before they go in.
Some can't be dealt with until they screw up your model. A very large
number, often created by dividing by something near zero, frequently
creeps in. To generate such a number you just need to do an operation on
two seemingly innocent numbers.
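
   A small sketch of how such a number appears, in plain Python. The
values 0.1, 0.2 and 0.3 are only illustrative and are not taken from any
real data set:

```python
# Two innocent-looking quantities that "should" cancel exactly.
a = 0.1 + 0.2
b = 0.3

denominator = a - b        # tiny rounding residue, not exactly zero
ratio = 1.0 / denominator  # no exception is raised, just a huge number

print(denominator)
print(ratio)
print(ratio > 1e15)  # True
```

Nothing blows up here, so the huge value flows silently into whatever
model uses it next.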
   Missing in other packages is an error code: it can be operated on, but
the operation simply returns the missing error code instead of blowing up
the program.
msg > 1 ? = msg
msg==msg ? = msg
msg+1 = msg
msg/1 = msg
1/msg = msg
add.reduce([1, 3, 4, msg, 4]) = msg 
(If I want the answer, msg should either be taken out or turned into some
other number. As many said, this should be done beforehand, but too often
how this is done is completely conditional on other matrices, and I also
have to update those other matrices as well.)
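
   The behaviour listed above can be sketched in plain Python with a
hypothetical marker class. The name _Missing and everything in it are
assumptions made for illustration; this is not an existing NumPy feature,
and it handles scalars only, not arrays:

```python
from functools import reduce
import operator


class _Missing:
    """Hypothetical missing-value marker: any operation on it returns it."""

    def __repr__(self):
        return "msg"

    def _propagate(self, *_args):
        return self

    # Arithmetic, in both operand orders (msg+1, 1/msg, ...).
    __add__ = __radd__ = __sub__ = __rsub__ = _propagate
    __mul__ = __rmul__ = __truediv__ = __rtruediv__ = _propagate
    # Comparisons also yield msg rather than True or False.
    __lt__ = __le__ = __gt__ = __ge__ = __eq__ = __ne__ = _propagate
    # Defining __eq__ removes the default hash; restore the identity hash.
    __hash__ = object.__hash__


msg = _Missing()

print(msg > 1)      # msg
print(msg == msg)   # msg
print(msg + 1)      # msg
print(msg / 1)      # msg
print(1 / msg)      # msg
print(reduce(operator.add, [1, 3, 4, msg, 4]))  # msg
```

The sketch leaves open what `if msg > 1:` should do; a real design would
have to decide whether that is an error or counts as false.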
   Any operation done on msg will turn up msg, because it's an error
condition. And this goes back to the garbage collection and ease of
development issues. I am not sure a [0,1] shadow matrix could deal with
this adequately. I guess I still have not thought enough, or do not know
enough, about these methods.
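
   For comparison, a minimal sketch of the [0,1] shadow matrix idea,
assuming present-day NumPy. The placeholder -999 and the variable names
are arbitrary choices for illustration:

```python
import numpy as np

data = np.array([1, 3, 4, -999, 4])   # -999 is an arbitrary placeholder
missing = np.array([0, 0, 0, 1, 0])   # shadow array: 1 marks a missing cell

# The integer type is preserved, but every reduction has to consult the mask.
valid = missing == 0
total_of_valid = np.add.reduce(data[valid])
print(total_of_valid)  # 12

# Elementwise work leaves the mask unchanged, yet the two arrays must be
# kept together by hand; nothing propagates "missing" automatically.
shifted = data + 1
print(shifted[valid])  # [2 4 5 5]
```

The data array stays Int, but the bookkeeping moves to the user: forget
the mask once and the placeholder is treated as a real value.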
   I have 256MB on my Sparc and Pentium; the reason I have it is that
financial data, especially realtime, is extremely vast, with a lot of
missing data points.
   I hope some people are not suggesting that NumPy is for computer scientist
types only, and that whatever I need for my own use I have to build myself.
As I have said, every matrix language I have ever used or looked at has had
missing. If someone told me that a particular package does not have some
specific statistical method like Logit, I would understand and build my own,
but missing is way too low level for someone like me.
Frankly, I would love to pay someone to do this; however, the lack of an
organized structure around Python is a problem. I believe Python is a
wonderful language and, due to its free and open nature, I would like to be
able to pay someone to get this going and contribute it back to the
community. This would benefit both my employer and the community quite well.
Merrill, however, will not deal with individuals and has a hard time dealing
with small operators off the Web. I contacted the Ox developer and we
recently agreed on a very nice fat support contract, only to have it turned
down when purchasing called the good professor and found out that he works
by himself in his office.
   Never mind the fact that Ox is extremely fast and the developer was
incredibly helpful, and that the only reason we agreed on a nice support
contract was that I could not get a consulting contract. Purchasing
basically said: we do not deal with individuals, look for something else.
I would love to have someone come up with a support company for Python with
some affiliation to some established entity. I would pay that person to add
needed functionality and contribute general-purpose code back to the
community. I can reasonably do this at the beginning of every fiscal year
under a support contract. Any takers? ;)
   I guess without the missing issues settled, I will probably need to 
link in some other package for analytics and graphics to Python if
possible and only use NumPy for Derivative Calcs, where the inputs are
less uncertain than Portfolio Trading. It's not a perfect solution, but
nothing is so uncertain.

Hoon,

P.S.: I still welcome any more input on adding NA as an additional class.
I don't think I am really capable yet, but I would like to contribute
something back if I can. Oh, my earlier posts about using Gauss-like
operators in Python were clearly whining. I should just get used to my new
environment, but, unless NumPy is not for social science and financial
modeling, I believe I have a justifiable complaint given so many other
languages I've used in the past.

_______________
MATRIX-SIG  - SIG on Matrix Math for Python

send messages to: matrix-sig@python.org
administrivia to: matrix-sig-request@python.org
_______________