XML Considered Harmful

Avi Gross avigross at verizon.net
Tue Sep 28 14:23:26 EDT 2021


I replied to Michael privately but am intrigued by his words here:

"The thing that creates realistic test cases is my brain."

I consider the extensions to my brain to include a language like Python on
my computer, used in particular to take a model I have in mind and
instantiate it. Lots of people have shared modules that can be tweaked to do
all kinds of simulations using a skeleton you provide that guides random
number usage. Some people generate lots of cases, stare at them, and use
their brains to narrow them down to the realistic ones. For example, in
designing a car, you might let miles per gallon range randomly between 10
and 100 while engine size ranges from this to that, and so on. It may then
turn out that large engines don't go well with large numbers for miles per
gallon.
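
A minimal sketch of that generate-then-filter idea, using only the standard
library. The 10 to 100 mpg range comes from the paragraph above; the engine
size range, the dictionary keys, and the plausibility threshold are all
made up for illustration.

```python
import random

def random_car(rng):
    """Draw one candidate car from deliberately naive, independent ranges."""
    return {
        "mpg": rng.uniform(10, 100),             # range mentioned in the text
        "engine_litres": rng.uniform(1.0, 8.0),  # assumed range, illustration only
    }

def is_plausible(car):
    """Hypothetical rule of thumb: big engines do not get high mileage."""
    return car["mpg"] * car["engine_litres"] <= 120  # made-up threshold

rng = random.Random(42)  # fixed seed so runs are repeatable
candidates = [random_car(rng) for _ in range(1000)]
realistic = [car for car in candidates if is_plausible(car)]
print(f"kept {len(realistic)} of {len(candidates)} candidates")
```

In practice the filter is where the human judgment goes; the random part only
supplies raw material to look at.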

I have worked on projects where a set of guides then created hundreds of
thousands of fairly realistic scenarios using every combination of an
assortment of categorical variables and the rest of the program sliced and
diced the results and did all kinds of statistical calculations and then
generated all kinds of graphs. There was no real data but there was a
generator that was based on the kinds of distributions previously published
in the field that helped guide parameters to be somewhat realistic.
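
Roughly, the "every combination" part can be sketched with
itertools.product. The variable names, categories, and the normal
distribution below are placeholders I made up, not what those projects
actually used.

```python
import itertools
import random

# Hypothetical categorical variables; a real study would supply its own.
factors = {
    "region": ["north", "south"],
    "season": ["summer", "winter"],
    "size": ["small", "medium", "large"],
}

def scenarios(factors, rng):
    """Yield one scenario per combination of categories, plus a random draw."""
    names = list(factors)
    for combo in itertools.product(*(factors[n] for n in names)):
        scenario = dict(zip(names, combo))
        # Assumed distribution: a normal draw stands in for a published one.
        scenario["value"] = rng.gauss(100.0, 15.0)
        yield scenario

rng = random.Random(0)
all_scenarios = list(scenarios(factors, rng))
print(len(all_scenarios))  # 2 * 2 * 3 = 12 combinations
```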

In your case, I understand you will decide how to do it. I just note that
you used language with multiple meanings, which misled a few of us into
thinking you had a Python function in mind, built in one of the several ways
Python provides for what it calls generators, such as one that efficiently
yields the next prime number when asked. Your explanation now shows you plan
on making a handful of data sets by hand using an editor like vi. Fair
enough. No need to write complex software if your mind is easily able to
just make half a dozen variations in files. And, frankly, I am not sure why
you need XML or much of anything else. It obviously depends on how much data
you are working with and how variable it is. For simpler things, you can
hard-code your data structure directly into your program, run an analysis,
change the variables to your second simulation and repeat.
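
For anyone unsure what the Python sense of "generator" looks like, here is a
small sketch of one that hands out the next prime each time it is asked. It
uses plain trial division against the primes found so far, which is simple
rather than fast; the function name is my own.

```python
import itertools

def primes():
    """Yield primes one at a time by trial division against earlier primes."""
    found = []
    for n in itertools.count(2):
        # Only primes up to sqrt(n) need to be tried as divisors.
        small = itertools.takewhile(lambda p: p * p <= n, found)
        if all(n % p for p in small):
            found.append(n)
            yield n

gen = primes()
first_five = [next(gen) for _ in range(5)]
print(first_five)  # [2, 3, 5, 7, 11]
```

Nothing is computed until next() is called, which is the whole point of the
construct.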

I am afraid that I, like a few others here, assumed a more abstract and much
more complex need to be addressed. Yours may be complex in other parts but
may need nothing much for the part we are talking about. It sounds like you
do want something easier to create while editing.

-----Original Message-----
From: Python-list <python-list-bounces+avigross=verizon.net at python.org> On
Behalf Of Michael F. Stemper
Sent: Tuesday, September 28, 2021 11:38 AM
To: python-list at python.org
Subject: Re: XML Considered Harmful

On 27/09/2021 20.01, Avi Gross wrote:
> Michael,
> 
> Given your further explanation, indeed reading varying numbers of 
> points in using a CSV is not valid, albeit someone might just make N 
> columns (maybe a few more than 7) to handle a hopefully worst case. 
> Definitely it makes more sense to read in a list or other data structure.
> 
> You keep talking about generators, though. If the generators are 
> outside of your program, then yes, you need to read in whatever they
> produce.

My original post (which is as the snows of yesteryear) made explicit the
fact that when I refer to a generator, I'm talking about something made from
tons of iron and copper that is oil-filled and rotates at 1800 rpm.
(In most of the world other than North America, they rotate at 1500 rpm.)

Nothing to do with the similarly-named python construct. Sorry for the
ambiguity.

> But if
> your data generator is within your own program,

The data is created in my mind, and approximates typical physical
characteristics of real generators.

> My impression is you may not be using your set of data points for any 
> other purposes except when ready to draw a spline.

Nope, the points give a piecewise-linear curve, and values between two
consecutive points are found by linear interpolation. It's industry standard
practice.
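
A minimal sketch of that kind of lookup, assuming the curve is a list of at
least two (x, y) pairs with strictly increasing x. The function name and the
sample points are invented for illustration and are not real generator data.

```python
from bisect import bisect_right

def interpolate(points, x):
    """Linearly interpolate y at x on a curve given as sorted (x, y) pairs."""
    xs = [px for px, _ in points]
    if not xs[0] <= x <= xs[-1]:
        raise ValueError(f"{x} is outside the range {xs[0]}..{xs[-1]}")
    # Index of the segment's right-hand end point, clamped so x == xs[-1] works.
    i = min(bisect_right(xs, x), len(xs) - 1)
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

# Made-up points, for illustration only.
curve = [(50.0, 600.0), (100.0, 1100.0), (200.0, 2300.0)]
print(interpolate(curve, 75.0))  # halfway along the first segment
```

Values outside the first and last points raise an error here rather than
being extrapolated; that is a choice of this sketch, not something stated in
the thread.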


> Can I just ask if by a generator, you do NOT mean the more typical use 
> of "generator" as used in python

Nope; I mean something that weighs 500 tons and rotates, producing
electrical energy.

>   Do you mean something that creates
> realistic test cases to simulate a real-world scenario?

The thing that creates realistic test cases is my brain.

>   These often can
> create everything at once and often based on random numbers.

I have written such, but not in the last thirty years. At that time, I
needed to make up data for fifty or one hundred generators, along with tie
lines and loads.

What I'm working on now only needs a handful of generators at a time; just
enough to test my hypothesis. (Theoretically, I could get by with two, but
that offends my engineering sensibilities.)

> create everything at once and often based on random numbers. Again, if 
> you have or build such code, it is not clear it needs to be written to 
> disk and then read back.

Well, I could continue to hard-code the data into one of the test programs,
but that would mean that every time that I wanted to look at a different
scenario, I'd need to modify a program. And when I discover anomalous
behavior, I'd need to copy the hard-coded data into another program.

Having the data in a separate file means that I can provide a function to
read that file and return a list of generators (or fuels) to a program.
Multiple test cases are then just multiple files, all of which are available
to multiple programs.
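
As a sketch of that arrangement: one reader function that any test program
can import. The file format is not settled in this thread, so JSON is used
here purely as a stand-in, and the Generator class, its fields, and the
"generators" key are all hypothetical.

```python
import json
from dataclasses import dataclass
from pathlib import Path

@dataclass
class Generator:
    name: str
    points: list  # (x, y) pairs describing the piecewise-linear curve

def read_generators(path):
    """Read one test-case file and return its generators as a list."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return [
        Generator(name=g["name"], points=[tuple(p) for p in g["points"]])
        for g in data["generators"]  # assumed top-level key
    ]
```

Each test program would then call read_generators() on whichever case file
it wants, and a new scenario is just a new file.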

>   You may of course want to save it, perhaps as a log, to show what 
> your program was working on.

That's another benefit of having the data in external files.

--
Michael F. Stemper
A preposition is something you should never end a sentence with.
--
https://mail.python.org/mailman/listinfo/python-list


