[SciPy-User] Molecular Index Calculation

Sun Apr 16 12:58:01 EDT 2017

On 04/16/2017 11:58 AM, David Mikolas wrote:
> wrong: I = Zij*trangle.sum()
>
> better: I = (Zij*trangle).sum()
>
>
> On Sun, Apr 16, 2017 at 11:56 PM, David Mikolas
> <david.mikolas1 at gmail.com <mailto:david.mikolas1 at gmail.com>> wrote:
>
>     I escaped a cluttered Fortran/C/Pascal/Basic/Perl jumble by "living"
>     in stackoverflow. Mostly by reading questions and answers, and
>     learning how not to get my questions closed in milliseconds. Break
>     your program into small pieces, and look for how each piece might be
>     done differently.
>
>     If you try to find ways to do things without ever manually checking
>     the sizes of arrays, you'll naturally be guided towards "pythonic"
>     methods and techniques. There is almost always a better way.
>
>     Computers are so fast these days, don't worry about ultimate speed
>     until you really hit a brick wall. Letting python, numpy, or scipy
>     methods handle housekeeping for you means you are usually defaulting
>     to very well written compiled routines.
>
>     So for example, let python do the thinking...
>
>     r = np.sqrt((diff ** 2).sum(2))
>     r[r < 0.0001] = 0.0001
>     # or just
>     r = np.maximum(np.sqrt((diff ** 2).sum(2)), 0.0001)
>
>     # and
>
>     Zij = Z * Z[:, None]
>     triangle = np.triu(np.ones_like(Zij), k=1)
>     I = Zij*trangle.sum()
>
>     # or just
>
>     I = np.triu(Z * Z[:, None], k=1).sum()
>
>
>     On Sun, Apr 16, 2017 at 8:25 PM, Stephen P. Molnar
>     <s.molnar at sbcglobal.net <mailto:s.molnar at sbcglobal.net>> wrote:
>
>         On 04/15/2017 04:07 PM, Hjalmar Turesson wrote:
>
>             Hi Stephen,
>
>
>             I see that N = 10 (ie. the length of Z), and that r is
>             indexed by i and
>             j, which run up to N. However, r is only a 5x5 array. Is
>             this correct?
>
>             Otherwise, something like this might work:
>
>             import numpy as np
>
>             start=1
>             finish=31
>             points=300
>             s = np.linspace(start, finish, points)
>
>             MASS =  np.array([12.011, 1.008, 79.9, 35.453, 18.998])
>
>             r = np.array([[0., 2.059801, 3.60937686, 3.32591826,
>             2.81569212],
>                             [2.059801, 0., 4.71452879, 4.45776445,
>             4.00467382],
>                             [3.60937686, 4.71452879, 0., 5.66500917,
>             5.26602175],
>                             [3.32591826, 4.45776445, 5.66500917, 0.,
>             5.02324896],
>                             [2.81569212, 4.00467382, 5.26602175,
>             5.02324896, 0.]])
>
>             r[r == 0.] = 0.0001
>
>             Z = np.array([6.0, 210.0, 102.0, 54.0, 35.0, 17.0, 9.0,
>             595.0, 315.0,
>             153.0])
>
>             N = Z.size
>
>             I = np.zeros(s.shape)
>
>             for i in range(1, N):
>                   for j in range(j):
>                       I += Z[i] * Z[j] * np.sin(s * r[i, j])/(s * r[i, j])
>
>
>             Do you have an example of the correct output?
>
>             Best,
>             Hjalmar
>
>             On Sat, Apr 15, 2017 at 6:29 AM, Stephen P. Molnar
>             <s.molnar at sbcglobal.net <mailto:s.molnar at sbcglobal.net>
>             <mailto:s.molnar at sbcglobal.net
>             <mailto:s.molnar at sbcglobal.net>>> wrote:
>
>                  I am a newcomer to Python and am attempting to
>             translate a FORTRAN
>                  program that I wrote about 20 years ago into Python,
>             specifically v-3.6.
>
>                  Let me say that I am not asking someone to write the
>             program for me,
>                  but only to point me is the right direction.
>
>                  So far, I have bumbled my way to the point that I can
>             get all of the
>                  input data resulting from a quantum mechanical
>             calculation of a very
>                  simple organic molecule in to a Python program, but am
>             encountering
>                  problems with processing the data.
>
>                  The equation that I a want to evaluate (the one that I
>             programmed in
>                  FORTRAN) is equation (7) in the attached scan of one of
>             the pages of
>                  the following document.
>
>                  Stephen P. Molnar and James W. King, Theory and
>             Applications of the
>                  Integrated Molecular Transform and the Normalized
>             Molecular Moment
>                  Structure Descriptors: QSAR and QSPR Paradigms, Int. J
>             Quantum
>                  Chem., 85, 662 (2001).
>
>                  the equation is
>
>                  I(s) = sum from i=2toN sum from j=1toi-1 Z_sub_i*Z_sub_j *
>
>                      sin  s*r_sub_ij
>                     -----------------  Equation (7)
>                          s*r_sub_ij
>
>                  What I have managed to do so far is for a simple
>             organic molecule
>                  containing 5 atoms (CHFClBr):
>
>                  import numpy as np
>                  import numpy, pandas, scipy.spatial.distance as dist
>
>                  r=[]        #distances between pairs of atoms
>                  z = []      #atomic numbers of atoms
>                  Z_dist = [] #products of successive atomic numbers
>                  I = []      #vector calculated in equation  (7)
>
>                  start=1
>                  finish=31
>                  points=300
>                  s = np.linspace(start,finish,points)
>
>
>                  name = input("Enter Molecule ID: ")
>                  name = str(name)
>                  name_in = name+'.dat'
>
>                  df = pandas.read_table(name_in, skiprows=2, sep=" ",
>                  skipinitialspace=True)
>                  z = numpy.array(df["ZA"])
>                  N =  numpy.ma.size(z)  ## of atoms in molecule
>                  a = numpy.array(df[["X", "Y", "Z"]])
>                  dist.squareform(dist.pdist(a, "euclidean"))
>                  anrows, ancols = np.shape(a)
>                  a_new = a.reshape(anrows, 1, ancols)
>                  diff = a_new - a
>
>                  r = (diff ** 2).sum(2)
>                  r = np.sqrt(r)
>
>                  for j in range(0,N-1):
>                        for k in range(j+1,N):
>                            Z_diff = (z[j]*z[k])
>
>
>                  For the test molecule I get the following:
>
>                  MASS =  [ 12.011   1.008  79.9    35.453  18.998]
>             (Atomic weights
>                  of atoms)
>
>                  r:  [ 0.          2.059801    3.60937686  3.32591826
>             2.81569212]
>                        [ 2.059801    0.          4.71452879  4.45776445
>             4.00467382]
>                        [ 3.60937686  4.71452879  0.          5.66500917
>             5.26602175]
>                        [ 3.32591826  4.45776445  5.66500917  0.
>             5.02324896]
>                        [ 2.81569212  4.00467382  5.26602175  5.02324896 0
>                  <tel:02324896%20%200>.        ]
>
>
>                  z:
>
>                  6.0
>                  210.0
>                  102.0
>                  54.0
>                  35.0
>                  17.0
>                  9.0
>                  595.0
>                  315.0
>                  153.0
>
>                  I have checked these calculations with a spreadsheet
>             calculation,
>                  consequently I'm correct as far as I have gotten.
>
>                  However, my problem is calculating the value of I for
>             the 300
>                  x-coordinate values required for the molecular index
>             calculation.
>
>                  The problem that I am currently having is calculating
>             the product of
>                  the pre-sine function and the sine term for the 300
>             values of s.
>
>                  Unfortunately, the wealth of  function in Python makes
>             it difficult
>                  for me to ascertain just how to proceed.  It seems that
>             I have the
>                  problem of seeing the trees because of the size of the
>             forest.  At
>                  this point Goggling only makes my confusion deeper.
>             Any pointers in
>                  the correct direction will be greatly appreciated.
>
>                  Thanks in advance.
>
>                  --
>                  Stephen P. Molnar, Ph.D.                Life is a fuzzy set
>             www.molecular-modeling.net
>             <http://www.molecular-modeling.net>
>             <http://www.molecular-modeling.net
>             <http://www.molecular-modeling.net>>
>                           Stochastic and multivariate
>             (614)312-7528 <tel:%28614%29312-7528> (c)
>                  Skype: smolnar1
>
>
>                  _______________________________________________
>                  SciPy-User mailing list
>             SciPy-User at python.org <mailto:SciPy-User at python.org>
>             <mailto:SciPy-User at python.org <mailto:SciPy-User at python.org>>
>             https://mail.python.org/mailman/listinfo/scipy-user
>             <https://mail.python.org/mailman/listinfo/scipy-user>
>                  <https://mail.python.org/mailman/listinfo/scipy-user
>             <https://mail.python.org/mailman/listinfo/scipy-user>>
>
>
>
>
>             _______________________________________________
>             SciPy-User mailing list
>             SciPy-User at python.org <mailto:SciPy-User at python.org>
>             https://mail.python.org/mailman/listinfo/scipy-user
>             <https://mail.python.org/mailman/listinfo/scipy-user>
>
>         Hjalmar
>
>         Thanks for your response.
>
>         I'm sorry that my original request wasn't a bit clearer.  I
>         presume that the paper I attached to the request help was
>         deleted before the email was posted to the list.
>
>         Here is how I incorporated your suggestion in the code.
>
>         import numpy as np
>         import numpy, pandas, scipy.spatial.distance as dist
>
>         r = []
>         s=[]
>         z = []
>         Z = []
>         I = []
>
>         start=1
>         finish=31
>         points=300
>         s = np.linspace(start,finish,points)
>         np.savetxt('Number of Points',s,delimiter=' ')
>
>         name = input("Enter Molecule ID: ")
>         name = str(name)
>         name_in = name+'.dat'
>
>         df = pandas.read_table(name_in, skiprows=2, sep=" ",
>         skipinitialspace=True)
>         Z = numpy.array(df["ZA"])
>         print('z = ',z)
>         N =  numpy.ma.size(z)
>         a = numpy.array(df[["X", "Y", "Z"]])
>         dist.squareform(dist.pdist(a, "euclidean"))
>         anrows, ancols = np.shape(a)
>         a_new = a.reshape(anrows, 1, ancols)
>         diff = a_new - a
>
>         r = (diff ** 2).sum(2)
>         r = np.sqrt(z)
>
>         r[r == 0.] = 0.0001
>
>         N = Z.size
>
>         I = np.zeros(s.shape)
>
>         for i in range(1, N):
>              for j in range(j):
>                  I += Z[i] * Z[j] * np.sin(s * r[i, j])/(s * r[i, j])
>                  print('I: ',I)
>
>         np.savetxt('I',I,delimiter=' ')
>
>         The for j in range(j): has a marginal note (I'm using the Spyder
>         IDE) "undefined name 'y' " and I is a vector with three hundred
>         0.0 terms.
>
>         also, please note that I removed 'Z = np.array([6.0, 210.0,
>         102.0, 54.0, 35.0, 17.0, 9.0, 595.0, 315.0, 153.0])' from your
>         suggestion as that is the result of the per-sine term 'Z[i] * Z[j]'
>
>         Regards,
>
>                  Steve
>
>
>         --
>         Stephen P. Molnar, Ph.D.                Life is a fuzzy set
>         www.molecular-modeling.net <http://www.molecular-modeling.net>
>                      Stochastic and multivariate
>         (614)312-7528 <tel:%28614%29312-7528> (c)
>         Skype: smolnar1
>         _______________________________________________
>         SciPy-User mailing list
>         SciPy-User at python.org <mailto:SciPy-User at python.org>
>         https://mail.python.org/mailman/listinfo/scipy-user
>         <https://mail.python.org/mailman/listinfo/scipy-user>
>
>
>
>
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at python.org
> https://mail.python.org/mailman/listinfo/scipy-user
>
Great suggestions!

Many thanks.

-- 
Stephen P. Molnar, Ph.D.		Life is a fuzzy set
www.molecular-modeling.net		Stochastic and multivariate
(614)312-7528 (c)
Skype: smolnar1