[SciPy-User] Molecular Index Calculation
Stephen P. Molnar
s.molnar at sbcglobal.net
Sun Apr 16 12:58:01 EDT 2017
On 04/16/2017 11:58 AM, David Mikolas wrote:
> wrong: I = Zij*trangle.sum()
>
> better: I = (Zij*trangle).sum()
>
>
> On Sun, Apr 16, 2017 at 11:56 PM, David Mikolas
> <david.mikolas1 at gmail.com <mailto:david.mikolas1 at gmail.com>> wrote:
>
> I escaped a cluttered Fortran/C/Pascal/Basic/Perl jumble by "living"
> in stackoverflow. Mostly by reading questions and answers, and
> learning how not to get my questions closed in milliseconds. Break
> your program into small pieces, and look for how each piece might be
> done differently.
>
> If you try to find ways to do things without ever manually checking
> the sizes of arrays, you'll naturally be guided towards "pythonic"
> methods and techniques. There is almost always a better way.
>
> Computers are so fast these days, don't worry about ultimate speed
> until you really hit a brick wall. Letting python, numpy, or scipy
> methods handle housekeeping for you means you are usually defaulting
> to very well written compiled routines.
>
> So for example, let python do the thinking...
>
> r = np.sqrt((diff ** 2).sum(2))
> r[r < 0.0001] = 0.0001
> # or just
> r = np.maximum(np.sqrt((diff ** 2).sum(2)), 0.0001)
>
> # and
>
> Zij = Z * Z[:, None]
> triangle = np.triu(np.ones_like(Zij), k=1)
> I = Zij*trangle.sum()
>
> # or just
>
> I = np.triu(Z * Z[:, None], k=1).sum()
>
>
> On Sun, Apr 16, 2017 at 8:25 PM, Stephen P. Molnar
> <s.molnar at sbcglobal.net <mailto:s.molnar at sbcglobal.net>> wrote:
>
> On 04/15/2017 04:07 PM, Hjalmar Turesson wrote:
>
> Hi Stephen,
>
>
> I see that N = 10 (ie. the length of Z), and that r is
> indexed by i and
> j, which run up to N. However, r is only a 5x5 array. Is
> this correct?
>
> Otherwise, something like this might work:
>
> import numpy as np
>
> start=1
> finish=31
> points=300
> s = np.linspace(start, finish, points)
>
> MASS = np.array([12.011, 1.008, 79.9, 35.453, 18.998])
>
> r = np.array([[0., 2.059801, 3.60937686, 3.32591826,
> 2.81569212],
> [2.059801, 0., 4.71452879, 4.45776445,
> 4.00467382],
> [3.60937686, 4.71452879, 0., 5.66500917,
> 5.26602175],
> [3.32591826, 4.45776445, 5.66500917, 0.,
> 5.02324896],
> [2.81569212, 4.00467382, 5.26602175,
> 5.02324896, 0.]])
>
> r[r == 0.] = 0.0001
>
> Z = np.array([6.0, 210.0, 102.0, 54.0, 35.0, 17.0, 9.0,
> 595.0, 315.0,
> 153.0])
>
> N = Z.size
>
> I = np.zeros(s.shape)
>
> for i in range(1, N):
> for j in range(j):
> I += Z[i] * Z[j] * np.sin(s * r[i, j])/(s * r[i, j])
>
>
> Do you have an example of the correct output?
>
> Best,
> Hjalmar
>
> On Sat, Apr 15, 2017 at 6:29 AM, Stephen P. Molnar
> <s.molnar at sbcglobal.net <mailto:s.molnar at sbcglobal.net>
> <mailto:s.molnar at sbcglobal.net
> <mailto:s.molnar at sbcglobal.net>>> wrote:
>
> I am a newcomer to Python and am attempting to
> translate a FORTRAN
> program that I wrote about 20 years ago into Python,
> specifically v-3.6.
>
> Let me say that I am not asking someone to write the
> program for me,
> but only to point me is the right direction.
>
> So far, I have bumbled my way to the point that I can
> get all of the
> input data resulting from a quantum mechanical
> calculation of a very
> simple organic molecule in to a Python program, but am
> encountering
> problems with processing the data.
>
> The equation that I a want to evaluate (the one that I
> programmed in
> FORTRAN) is equation (7) in the attached scan of one of
> the pages of
> the following document.
>
> Stephen P. Molnar and James W. King, Theory and
> Applications of the
> Integrated Molecular Transform and the Normalized
> Molecular Moment
> Structure Descriptors: QSAR and QSPR Paradigms, Int. J
> Quantum
> Chem., 85, 662 (2001).
>
> the equation is
>
> I(s) = sum from i=2toN sum from j=1toi-1 Z_sub_i*Z_sub_j *
>
> sin s*r_sub_ij
> ----------------- Equation (7)
> s*r_sub_ij
>
> What I have managed to do so far is for a simple
> organic molecule
> containing 5 atoms (CHFClBr):
>
> import numpy as np
> import numpy, pandas, scipy.spatial.distance as dist
>
> r=[] #distances between pairs of atoms
> z = [] #atomic numbers of atoms
> Z_dist = [] #products of successive atomic numbers
> I = [] #vector calculated in equation (7)
>
> start=1
> finish=31
> points=300
> s = np.linspace(start,finish,points)
>
>
> name = input("Enter Molecule ID: ")
> name = str(name)
> name_in = name+'.dat'
>
> df = pandas.read_table(name_in, skiprows=2, sep=" ",
> skipinitialspace=True)
> z = numpy.array(df["ZA"])
> N = numpy.ma.size(z) ## of atoms in molecule
> a = numpy.array(df[["X", "Y", "Z"]])
> dist.squareform(dist.pdist(a, "euclidean"))
> anrows, ancols = np.shape(a)
> a_new = a.reshape(anrows, 1, ancols)
> diff = a_new - a
>
> r = (diff ** 2).sum(2)
> r = np.sqrt(r)
>
> for j in range(0,N-1):
> for k in range(j+1,N):
> Z_diff = (z[j]*z[k])
>
>
> For the test molecule I get the following:
>
> MASS = [ 12.011 1.008 79.9 35.453 18.998]
> (Atomic weights
> of atoms)
>
> r: [ 0. 2.059801 3.60937686 3.32591826
> 2.81569212]
> [ 2.059801 0. 4.71452879 4.45776445
> 4.00467382]
> [ 3.60937686 4.71452879 0. 5.66500917
> 5.26602175]
> [ 3.32591826 4.45776445 5.66500917 0.
> 5.02324896]
> [ 2.81569212 4.00467382 5.26602175 5.02324896 0
> <tel:02324896%20%200>. ]
>
>
> z:
>
> 6.0
> 210.0
> 102.0
> 54.0
> 35.0
> 17.0
> 9.0
> 595.0
> 315.0
> 153.0
>
> I have checked these calculations with a spreadsheet
> calculation,
> consequently I'm correct as far as I have gotten.
>
> However, my problem is calculating the value of I for
> the 300
> x-coordinate values required for the molecular index
> calculation.
>
> The problem that I am currently having is calculating
> the product of
> the pre-sine function and the sine term for the 300
> values of s.
>
> Unfortunately, the wealth of function in Python makes
> it difficult
> for me to ascertain just how to proceed. It seems that
> I have the
> problem of seeing the trees because of the size of the
> forest. At
> this point Goggling only makes my confusion deeper.
> Any pointers in
> the correct direction will be greatly appreciated.
>
> Thanks in advance.
>
> --
> Stephen P. Molnar, Ph.D. Life is a fuzzy set
> www.molecular-modeling.net
> <http://www.molecular-modeling.net>
> <http://www.molecular-modeling.net
> <http://www.molecular-modeling.net>>
> Stochastic and multivariate
> (614)312-7528 <tel:%28614%29312-7528> (c)
> Skype: smolnar1
>
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at python.org <mailto:SciPy-User at python.org>
> <mailto:SciPy-User at python.org <mailto:SciPy-User at python.org>>
> https://mail.python.org/mailman/listinfo/scipy-user
> <https://mail.python.org/mailman/listinfo/scipy-user>
> <https://mail.python.org/mailman/listinfo/scipy-user
> <https://mail.python.org/mailman/listinfo/scipy-user>>
>
>
>
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at python.org <mailto:SciPy-User at python.org>
> https://mail.python.org/mailman/listinfo/scipy-user
> <https://mail.python.org/mailman/listinfo/scipy-user>
>
> Hjalmar
>
> Thanks for your response.
>
> I'm sorry that my original request wasn't a bit clearer. I
> presume that the paper I attached to the request help was
> deleted before the email was posted to the list.
>
> Here is how I incorporated your suggestion in the code.
>
> import numpy as np
> import numpy, pandas, scipy.spatial.distance as dist
>
> r = []
> s=[]
> z = []
> Z = []
> I = []
>
> start=1
> finish=31
> points=300
> s = np.linspace(start,finish,points)
> np.savetxt('Number of Points',s,delimiter=' ')
>
> name = input("Enter Molecule ID: ")
> name = str(name)
> name_in = name+'.dat'
>
> df = pandas.read_table(name_in, skiprows=2, sep=" ",
> skipinitialspace=True)
> Z = numpy.array(df["ZA"])
> print('z = ',z)
> N = numpy.ma.size(z)
> a = numpy.array(df[["X", "Y", "Z"]])
> dist.squareform(dist.pdist(a, "euclidean"))
> anrows, ancols = np.shape(a)
> a_new = a.reshape(anrows, 1, ancols)
> diff = a_new - a
>
> r = (diff ** 2).sum(2)
> r = np.sqrt(z)
>
> r[r == 0.] = 0.0001
>
> N = Z.size
>
> I = np.zeros(s.shape)
>
> for i in range(1, N):
> for j in range(j):
> I += Z[i] * Z[j] * np.sin(s * r[i, j])/(s * r[i, j])
> print('I: ',I)
>
> np.savetxt('I',I,delimiter=' ')
>
> The for j in range(j): has a marginal note (I'm using the Spyder
> IDE) "undefined name 'y' " and I is a vector with three hundred
> 0.0 terms.
>
> also, please note that I removed 'Z = np.array([6.0, 210.0,
> 102.0, 54.0, 35.0, 17.0, 9.0, 595.0, 315.0, 153.0])' from your
> suggestion as that is the result of the per-sine term 'Z[i] * Z[j]'
>
> Regards,
>
> Steve
>
>
> --
> Stephen P. Molnar, Ph.D. Life is a fuzzy set
> www.molecular-modeling.net <http://www.molecular-modeling.net>
> Stochastic and multivariate
> (614)312-7528 <tel:%28614%29312-7528> (c)
> Skype: smolnar1
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at python.org <mailto:SciPy-User at python.org>
> https://mail.python.org/mailman/listinfo/scipy-user
> <https://mail.python.org/mailman/listinfo/scipy-user>
>
>
>
>
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at python.org
> https://mail.python.org/mailman/listinfo/scipy-user
>
Great suggestions!
Many thanks.
--
Stephen P. Molnar, Ph.D. Life is a fuzzy set
www.molecular-modeling.net Stochastic and multivariate
(614)312-7528 (c)
Skype: smolnar1
More information about the SciPy-User
mailing list