[Tutor] Pointers Towards Appropriate Python Methods

David bouncingcats at gmail.com
Sun Sep 29 22:56:54 EDT 2019


On Mon, 30 Sep 2019 at 04:28, Stephen P. Molnar <s.molnar at sbcglobal.net> wrote:

> First, let me state that this is not a homework problem.   I happen to
> be a retired Research Chemist whose rather meager programming skills are
> in FORTRAN.
>
> I have managed to write, and with help from the list debug, a very short
> Python script to extract a column of data from an ASCII file:

[...]

> I have uploaded the script and an example of the input file to my
> Dropbox account in order to avoid scrambling of the file format.

[...]

> At this point what I would like are pointers towards Python methods for
> processing a large number of data files. I'm not asking anyone to write
> the code for me.

Hi Stephen

Maybe it's just me but I frequently detect a "fish out of water" tone
in programming questions that I've seen you ask. And frequently
underneath the surface complexity that you present, the solution is
simple. And I imagine that the chemistry that you work on is vastly more
complex than any of the programming questions you ask, so I feel that
you're probably well smart enough to handle these basic programming
questions yourself. And I'm writing this message in an attempt to help
and encourage you with that. Please excuse me if I am misreading your
situation; I'm only trying a possible approach that might help you.

People who have already replied here have responded at a higher level
of complexity than I will. That might be appropriate, I can't tell, but
in case it helps you I am going to respond as if this is a beginner-level
question (no offense intended) from someone who barely has any
ability to write anything at all in Python. Please correct me if I am wrong.

For anyone else reading, here's Stephen's "bash script that
illustrates what I want to do in Python.":

for d in $(cat ligand.list) ; do
  cd "${d}_apo-1acl"
  echo "${d}_apo-1acl"
  echo "${d}_apo-1acl.dpf"
  /home/comp/Apps/Autodock/autodock4 -p "${d}_apo-1acl.dpf" \
    -l "${d}_apo-1acl.dlg"
  cd ..
done

So it appears that you just want to apply some data
processing function or script to various datasets in files whose names
you know. This is an elementary question in any programming language.
No computer would be useful if we couldn't do this.
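
For comparison, the bash loop above could be sketched in Python roughly
as follows. The autodock4 path, the -p/-l flags and the "_apo-1acl"
naming are copied from the bash script. The function names (read_ligands,
build_command, run_all) are made up for illustration, and I am assuming
that ligand.list holds whitespace-separated names, which is how
$(cat ligand.list) treats it.

```python
import subprocess
from pathlib import Path

AUTODOCK = "/home/comp/Apps/Autodock/autodock4"  # path taken from the bash script
SUFFIX = "_apo-1acl"                             # naming taken from the bash script


def read_ligands(list_file):
    """Return the names in list_file, split on whitespace like $(cat ...)."""
    return Path(list_file).read_text().split()


def build_command(ligand):
    """Build the autodock4 argument list for one ligand."""
    stem = ligand + SUFFIX
    return [AUTODOCK, "-p", stem + ".dpf", "-l", stem + ".dlg"]


def run_all(ligands, runner=subprocess.run):
    """Run autodock4 once per ligand, inside that ligand's directory."""
    for ligand in ligands:
        workdir = ligand + SUFFIX
        print(workdir)
        print(workdir + ".dpf")
        # cwd= replaces the "cd ... ; cd .." pair in the bash loop
        runner(build_command(ligand), cwd=workdir, check=True)
```

Calling run_all(read_ligands("ligand.list")) from the directory that
holds the ligand folders should then do the same job as the loop. One
difference to be aware of: check=True makes Python stop with an error if
autodock4 fails, which the bash loop does not do.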

As an aside, you mention that the data processing functions are in
individual scripts. But given the simplicity and awkward nature of the
above bash script, I wonder how complicated those scripts are, and if
it is necessary for them to be separate scripts. What is the line
count of each of those Python scripts you mentioned?

I suggest that if you spent a day or two learning some basic Python then
you might realise how simple this could all be for you. We are here to
help if you want to do that. You could read a tutorial, write some toy
scripts, and then you might realise how simple this stuff is, at least
at the level of the problems you wish to solve.

Basically I am suggesting that I think it would be easy for you to reach a
level of competency in Python such that much of this would become
enjoyable and trivial for you, instead of difficult.

If the question boils down to 2 things:
  - There is a list of filenames (or maybe just one filename).
  - There is processing that we want to apply to each file's contents.
then pseudo code for this can look like:

for a_file_name in list_of_file_names:
    file_we_are_reading = open_the_file(a_file_name)
    data = read_stuff_from(file_we_are_reading)
    close_the_file(file_we_are_reading)
    do_something_with(data)

And the nice thing about the Python language is that what I
have written there is already valid Python syntax; it will run
once the functions it names are defined.
Python syntax can be much closer to human readable text
than the Fortran you mentioned you had experience with!
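
To show how close the pseudo code above is to the real thing, here is
one way to fill in the made-up names so that it runs. Treat it as a
sketch: process_files, and the idea of collecting one result per file,
are my assumptions, and do_something_with stands for whatever
processing you actually need.

```python
def process_files(list_of_file_names, do_something_with):
    """Apply do_something_with to the lines of each named file."""
    results = []
    for a_file_name in list_of_file_names:
        # "with" closes the file for us, even if reading fails
        with open(a_file_name) as file_we_are_reading:
            data = file_we_are_reading.readlines()
        results.append(do_something_with(data))
    return results
```

For example, process_files(["a.txt", "b.txt"], len) would return the
line count of each file (the file names here are placeholders).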

Have you ever looked at the official Python tutorial?
  https://docs.python.org/3/tutorial/index.html

It covers the things you need for this task:
- use of lists:
  https://docs.python.org/3/tutorial/introduction.html#lists
  https://docs.python.org/3/tutorial/datastructures.html#more-on-lists
- using files:
  https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files
- doing something to data:
  https://docs.python.org/3/tutorial/controlflow.html#defining-functions
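
As a small illustration of those three topics working together, here is
a sketch of pulling one column out of lines of text. I have not assumed
anything about your real file format beyond whitespace-separated
columns, and extract_column is a hypothetical name, so adjust it to
match your data.

```python
def extract_column(lines, column_index):
    """Return the column_index-th field (counting from 0) of each line."""
    column = []
    for line in lines:
        fields = line.split()  # split on any run of whitespace
        if len(fields) > column_index:  # skip blank or short lines
            column.append(fields[column_index])
    return column
```

The values come back as strings; wrap them in float() if you need
numbers.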

In addition, there are a vast number of resources for learning Python,
suitable for both beginners and experienced programmers:
  https://wiki.python.org/moin/BeginnersGuide

