[Tutor] Pointers Towards Appropriate Python Methods

Stephen P. Molnar s.molnar at sbcglobal.net
Mon Sep 30 07:28:34 EDT 2019


Please see my interspersed comments.

On 09/29/2019 03:13 PM, David L Neil via Tutor wrote:
> On 30/09/19 7:28 AM, Stephen P. Molnar wrote:
>> First, let me state that this is not a homework problem. 
>
> You are retired, but not working at home - on 'home work'?
homework within the educational definition. My study with three 
networked computers is my laboratory.
>
>
>> I happen to
>> be a retired Research Chemist
>
> Is that a confession?
merely a statement of fact
>
>
>> whose rather meager programming skills are
>> in FORTRAN.
>
> With such a pedigree, can you do any wrong?
Actually, quite a bit!!!!!
>
>
>> I have managed to write, and with help from the list debug, a very 
>> short Python script to extract a column of data from an ASCII file:
>>
>> #!/usr/bin/env python3
>> # -*- coding: utf-8 -*-
>> """
>>
>> Created on Tue Sep 24 07:51:11 2019
>>
>> """
>> import numpy as np
>>
>> fileName = []
>>
>> name = input("Enter Molecule ID: ")
>>
>> name_in = name+'_apo-1acl.RMSD'
>>
>> data = np.genfromtxt(name_in, usecols=(3), dtype=None, skip_header=8, 
>> skip_footer=1, encoding=None)
>>
>>
>> I have uploaded the script and an example of the input file to my 
>> Dropbox account in order to avoid scrambling of the file format.
>>
>> https://www.dropbox.com/sh/xwsv17vkh48tsaa/AAAfIe0miWrrk49hqZCkxe-aa?dl=0 
>>
>>
>> My problem is that I have a large number of data files that I wish to 
>> process for input to several other different Python scripts that I 
>> use as part of my Computational Chemistry research program.   I've 
>> also uploaded a bash script that illustrates what I want to do in 
>> Python.
>>
>> At this point what I would like are pointers towards Python methods 
>> for processing a large number of data files. I'm not asking anyone to 
>> write the code for me.
>>
>> Thanks in advance.
>
>
> Have I understood you correctly? You have (sensibly) constructed a 
> processor which works on a single file, and now want to expand its 
> scope to process a series of similarly-formatted files?
The files I wish to process are all extracted from the same molecular 
modeling software and share the same format.
> (alternately: that the various files are in different formats?)
>
> One of the (many) beauties of the Python eco-system is that it has 
> "batteries included" (or pip-include-able) enabling an extremely wide 
> variety of tasks. In this case, there is no need to separate 'Python 
> work' from 'File system/BASH work' - it can ALL be done by Python!
I included the bash script in an attempt to illustrate what I would like 
to do with Python.
>
> Rather than devolving the file system work to BASH, perhaps review 
> "pathlib" from the "PSL" (Python Standard Library - 
> https://docs.python.org/3/library/pathlib.html).
> For example, if the files to be processed are collected into a single 
> directory (or 'directory tree'), pathlib will accept the (top-level) 
> directory name and then "iterdir" (iterate through all the files in that 
> directory/tree). Code this into a loop (or a Python "generator") and the 
> already-coded process could be serially applied to each file. This saves 
> (a) BASH code, and (b) the "command-line interface" between BASH and 
> Python.

This is, in fact, the case. I have a series of scripts to use the 
software that produces the files that I wish to process further. 
Eventually I will most likely chain them together in some fashion.
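
A minimal sketch of the loop described above, assuming all the data 
files sit in one directory and share the '_apo-1acl.RMSD' suffix used in 
the script. The names find_data_files and process_all, and the extract 
argument, are hypothetical; they are not part of pathlib or of the 
original script.

```python
from pathlib import Path


def find_data_files(directory, suffix="_apo-1acl.RMSD"):
    """Yield each file in `directory` whose name ends with `suffix`.

    The suffix is an assumption taken from the single-file script;
    files are yielded in sorted name order so runs are repeatable.
    """
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.name.endswith(suffix):
            yield path


def process_all(directory, extract, suffix="_apo-1acl.RMSD"):
    """Apply `extract` to every matching file.

    `extract` is any callable that takes a Path and returns a result.
    Returns a dict mapping the molecule ID (the file name with the
    suffix removed) to whatever `extract` returned for that file.
    """
    results = {}
    for path in find_data_files(directory, suffix):
        molecule_id = path.name[: -len(suffix)]
        results[molecule_id] = extract(path)
    return results
```

Here extract could be a small function wrapping the existing 
np.genfromtxt call, which accepts a pathlib.Path as its first argument. 
For a directory tree rather than a single directory, Path.rglob is one 
alternative to iterdir.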

>
> At the risk of causing cognitive-overload, may I also suggest reading 
> (some on-line articles/book-chapters) about "logging". If you plan to 
> follow the FORTRAN tradition of long-running batch programs, then this 
> is an ideal way to record progress, results, and errors. (IMHO logging 
> is sadly under-rated, but then much code these days is neither "batch" 
> nor server-oriented)
I have found that running a log script when working is very valuable.
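
A rough sketch of what logging might look like in a batch run, using 
only the standard library's logging module. The logger name 
"rmsd_batch", the default file name "batch.log", and the functions 
configure_logging and run_batch are assumptions made for illustration.

```python
import logging


def configure_logging(log_file="batch.log"):
    """Return a logger that appends timestamped records to `log_file`."""
    logger = logging.getLogger("rmsd_batch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    # Drop handlers left over from an earlier call so that records
    # are not written twice.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()
    handler = logging.FileHandler(log_file, encoding="utf-8")
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


def run_batch(paths, process, logger):
    """Call `process` on each item of the list `paths`.

    A failure on one file is logged with its traceback and the run
    continues with the next file. Returns the number of successes.
    """
    succeeded = 0
    for path in paths:
        try:
            process(path)
        except Exception:
            logger.exception("Failed on %s", path)
        else:
            logger.info("Processed %s", path)
            succeeded += 1
    logger.info("Finished: %d of %d files succeeded", succeeded, len(paths))
    return succeeded
```

Catching the exception per file means that one malformed input does not 
stop a long run, and the log file records which inputs need a second 
look.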
>
>
> Apologies if I'm off on the wrong-track - having solved a long-time 
> issue I've had with pathlib incorrectly processing European-language 
> fileNMs, yesterday; this morning I'm re-factoring a bunch of programs 
> which 'walk a directory tree', to use a common/utility core 'walker' - 
> and "to a man with a hammer..."
I am most appreciative of your helpful comments.

Many thanks,

-- 
Stephen P. Molnar, Ph.D.
www.molecular-modeling.net
614.312.7528 (c)
Skype:  smolnar1


