[MATRIX-SIG] Much ado about nothingness.

Hoon Yoon - IPT Quant hyoon@nyptsrv1.etsd.ml.com
Tue, 8 Jul 1997 10:18:43 -0400


Hello fellow snake handlers,

  I am attempting to do some trading-related (stocks) number crunching
  in Python. I am having quite a hard time porting my old Gauss logic
  to Python. It should probably be handled in a different way; however,
  my mind is still stuck in the old 2D complete-matrix mode, so I am
  finding it difficult to work out how to do what I want in Python.

  I have some x periods of data in Python lists/arrays. Of course, as
  with any financial data, there are missing values. So I stored them as
  None, thinking that None is the closest thing to a missing (msg) value,
  but, as some readers will have realized, none of nonzero, choose,
  equal, where, etc. can be performed on an array or list containing
  None values. I get:
  	TypeError: function not supported for these types, and can't
  	coerce to supported types
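For illustration only, here is a minimal sketch of one way around the None problem. It assumes present-day NumPy (not the Numeric package this post refers to) and assumes that IEEE NaN is an acceptable stand-in for a missing value; the sample prices are made up.

```python
import numpy as np

# Hypothetical price series with a gap stored as None, as described above.
raw_prices = [101.5, None, 102.25, 103.0]

# Without a dtype, NumPy falls back to a generic object array,
# on which most numeric functions are of little use.
as_objects = np.array(raw_prices)

# Asking for a float array converts None to NaN, so the array stays numeric.
prices = np.array(raw_prices, dtype=float)

# NaN-aware test for the gaps, evaluated on the whole array at once.
is_missing = np.isnan(prices)
```

The design choice here is to mark gaps inside the float array itself rather than to carry a separate list of None entries, so that whole-array functions keep working.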
  Now some might say: why not just use either zeros or negative values?
  Well, if I were dealing only with prices and volumes that would be okay,
  but I am also dealing with other data like earnings, which can easily
  be negative.
  Mathematics and databases at least distinguish between missing,
  undefined, and not available. This is a very thorny issue in numerical
  computing. Is msg == msg? (This is undefined.) What is truly missing
  vs. undefined? This is quite an issue. For people brave enough, I
  recommend Joe Celko's articles in DBMS with the same title, which can
  be obtained through DBMS. He really does a good job of explaining the
  issues. There are other issues, like +/- infinity, NA, and so on, which
  Joe does not get into, but overall the article serves as a good
  introduction to the concepts.
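As a small illustration of the "is msg == msg?" question, here is how IEEE floating-point NaN behaves in Python. Treating NaN as the missing marker is an assumption made for this sketch; NaN does not by itself distinguish between missing, undefined, and not available.

```python
import math

# Assumption: a missing value is represented by IEEE NaN.
msg = float("nan")

# NaN compares unequal to everything, including itself.
equal_to_itself = (msg == msg)

# So an explicit test is needed to detect it.
detected = math.isnan(msg)

# Infinities are distinct from NaN and do compare normally.
pos_inf = float("inf")
neg_inf = float("-inf")
ordered = neg_inf < 0.0 < pos_inf
```

Because the comparison is always false, a test such as `x == msg` can never find missing values; a dedicated predicate like `math.isnan` is required.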
  Maybe all these issues have been considered and Python has a way to
  handle them, but I could not find it. I would really like to see
  find_msg & index_msg kinds of functions. The most often used ways to
  deal with missing values are:
    a. Just copy over the value from the last period, or make some estimate
  based on the others, which is a bit dangerous because it introduces
  serial correlation.
    b. Get rid of the entire row of the observation, which means throwing
  the whole stock or the whole day out of the model. Less dangerous, but
  it could be problematic, especially if you don't have too many
  observations to play with.
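Both approaches, (a) and (b), can be sketched without explicit loops. This assumes present-day NumPy rather than the Numeric package of the time, NaN as the missing marker, and a made-up layout in which rows are days and columns are stocks.

```python
import numpy as np

# Hypothetical data: rows are days, columns are stocks, NaN marks a gap.
data = np.array([[10.0, 5.0],
                 [np.nan, 5.5],
                 [11.0, np.nan],
                 [11.5, 6.0]])

missing = np.isnan(data)

# (a) Copy over from the last period, column by column.
# For each cell, record its own row index if observed, else 0 ...
row_index = np.arange(data.shape[0])[:, None]
last_valid = np.where(missing, 0, row_index)
# ... then a running maximum gives the most recent observed row so far.
last_valid = np.maximum.accumulate(last_valid, axis=0)
filled = data[last_valid, np.arange(data.shape[1])]

# (b) Throw away every day (row) that has at least one gap.
complete_days = data[~missing.any(axis=1)]
```

A gap in the very first row has no earlier period to copy from, so in this sketch it stays NaN. The serial-correlation caveat for (a) applies regardless of how the fill is computed.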

  find_msg should return a matrix of the same dimensions filled with 0s
  and 1s. index_msg should only work on 1 dimension and give me back the
  indices into an array. I can use find_msg for a take kind of function,
  or the reverse of take.
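A rough sketch of what such functions might look like follows. The names find_msg and index_msg are taken from the description above; everything else is an assumption, in particular the use of present-day NumPy and of NaN as the missing marker. No such functions exist in the library under these names.

```python
import numpy as np

def find_msg(a):
    """Hypothetical helper: same-shape array with 1 where missing, 0 elsewhere."""
    a = np.asarray(a, dtype=float)
    return np.isnan(a).astype(int)

def index_msg(v):
    """Hypothetical helper: indices of the missing entries of a 1-D array."""
    v = np.asarray(v, dtype=float)
    if v.ndim != 1:
        raise ValueError("index_msg only works on 1-dimensional input")
    return np.flatnonzero(np.isnan(v))

# Made-up earnings series; negative values are legitimate data here.
earnings = np.array([0.5, np.nan, -1.2, np.nan, 0.3])

mask = find_msg(earnings)
idx = index_msg(earnings)

# "take" with the indices picks out the missing entries ...
picked = np.take(earnings, idx)
# ... and the mask gives the reverse: everything that was observed.
observed = earnings[mask == 0]
```

Both helpers operate on the whole array at once, so no explicit loop is needed.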

  Hopefully I can do this without loops!

Thanks for reading,

Hoon,

_______________
MATRIX-SIG  - SIG on Matrix Math for Python

send messages to: matrix-sig@python.org
administrivia to: matrix-sig-request@python.org
_______________