Sharing: File Reader Generator with & w/o Policy

Chris Angelico rosuav at gmail.com
Sun Mar 16 01:41:53 EDT 2014


On Sun, Mar 16, 2014 at 3:47 PM, Mark H Harris <harrismh777 at gmail.com> wrote:
> On 3/15/14 10:48 PM, Chris Angelico wrote:
>>
>> There's a cost to refactoring. Suddenly there's a new primitive on the
>> board - a new piece of language . . . Splitting out all sorts of things
>> into generators when you could use well-known primitives like enumerate
>> gets expensive fast {snip}
>>
>>
>> [1] https://en.wikipedia.org/wiki/Rule_of_three_(computer_programming)
>
>
> Very good to remember. I am finding the temptation to make all kinds of
> generators (as you noted above). It's just that the python generator makes it
> so easy to define a function that maintains state between calls (of next()
> in this case) and so it's also so easy to want to use them... almost
> forgetting about primitives!

General rule of thumb: Every object in the same namespace should be
readily distinguishable by name alone. And if doing that makes your
names so long that the function signature is longer than the function
body, it might be better to not have that as a function :)

Also, I'd consider it a code smell if a name is used in only one
place. Maybe not so much with local variables, as there are other
reasons to separate things out, but a function that gets called from
only one place probably doesn't need to exist at top-level. (Of
course, a published API will often seem to have unused or little-used
functions, because they're being provided to the caller. They don't
count.)

> And the rule of three is one of those things that sneaks up on oneself. I
> have actually coded about seven (7) such cases when I discovered that they
> were all identical. I am noticing that folks code the same file reader cases
> "with open() as fh: yadda yadda"  and I've noticed that they are all pretty
> close to the same. Wouldn't it be nice to have one simpler getline() or
> getnumline() name that does this one simple thing once and for all? But as
> simple as it is, it isn't. Well, as you say, use cases need to determine
> code refactoring.

If getline() is doing nothing that the primitive doesn't, and
getnumline is just enumerate, then they're not achieving anything
beyond shielding you from the primitives.
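
A minimal sketch of that point, assuming getline() and getnumline()
are thin wrappers of the kind described in the quoted text. Both
definitions are hypothetical, written only for illustration, and
io.StringIO stands in for an open file:

```python
import io

def getline(fh):
    """Hypothetical wrapper: yields each line, just as iterating fh does."""
    for line in fh:
        yield line

def getnumline(fh):
    """Hypothetical wrapper: yields (index, line) pairs, just as enumerate does."""
    for pair in enumerate(fh):
        yield pair

text = "alpha\nbeta\ngamma\n"

# Each wrapper is compared against the primitive it hides.
wrapped_lines = list(getline(io.StringIO(text)))
plain_lines = list(io.StringIO(text))
wrapped_pairs = list(getnumline(io.StringIO(text)))
plain_pairs = list(enumerate(io.StringIO(text)))
```

Under these assumptions the wrapped and plain results are equal, so
the wrappers add a name to learn without adding behaviour.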

> The other thing I'm tempted to do is to find names (even new names) that
> read like English closely (whatever I mean by that) so that there is no
> question about what is going on to a non expert.
>
> for line in getnumline(file):
>       {whatever}

The trouble is that your idea of getnumline(file) might well differ
from someone else's idea of getnumline(file). Using Python's
primitives removes that confusion - if you see enumerate(file), you
know exactly what it's doing, even in someone else's code.
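
As a sketch of how two readings of the same name could diverge, here
are two hypothetical versions of getnumline (the _a and _b suffixes
and both bodies are invented for illustration), next to an explicit
enumerate call:

```python
import io

def getnumline_a(fh):
    # One plausible reading: zero-based numbering, newline kept.
    return enumerate(fh)

def getnumline_b(fh):
    # Another plausible reading: one-based numbering, trailing newline stripped.
    return ((n, line.rstrip("\n")) for n, line in enumerate(fh, start=1))

text = "first\nsecond\n"
result_a = list(getnumline_a(io.StringIO(text)))
result_b = list(getnumline_b(io.StringIO(text)))

# The primitive states its numbering choice at the call site.
explicit = list(enumerate(io.StringIO(text), start=1))
```

Neither reading is wrong, but a reader has to open the definition to
find out which one applies; with enumerate(file, start=1) the choice
is visible where it is used.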

> Well, what if there were a project called SimplyPy, or some such, that
> boiled the python language down to a (Rexx like) or (BASIC like) syntax and
> usage so that ordinary folks could code out problems (like they did in 1964)
> and expert users could use it too including everything else they know about
> python? Would it be good?
>
> A SimplyPy coder would use constructs similar to other procedural languages
> (like Rexx, Pascal, even C) and without knowing the plethora of Python
> intrinsics could solve problems, yet not be an "expert".
>
> SimplyPy would be a structured subset of the normal language for learning
> and use (very small book/tutorial/ think the Rexx handbook, or the K&R).
>
> It's a long way off, and I'm just now experimenting. I'm trying to get my
> hands around context managers (and other things). This is an idea I got from
> Anthony Briggs' Hello Python! (foreword by Steve Holden) from Manning books.
> It's very small, lightweight, handles real work, but it's still too big. I am
> wanting to condense it even further, providing the minimal basic core
> language as an end application product rather than the "expert" computer
> science language that will run under it.
>
> or, over it, as you like.
>
> (you think this is a nutty idea?)

To be quite frank, yes I do think it's a nutty idea. Like most nutty
things, there's a kernel of something good in it, but that's not
enough to build a system on :)

Python is already pretty simple. The trouble with adding a layer of
indirection is that you'll generally be limiting what the code can do,
which is usually a bad idea for a general purpose programming
language, and also forcing you to predict everything the programmer
might want to do. Or you might have an "escape clause" that lets the
programmer drop to "real Python"... but as soon as you allow that, you
suddenly force the subsequent reader to comprehend all of Python,
defeating the purpose.

We had a discussion along these lines a little while ago, about
designing a DSL [1] for window creation. On one side of the debate was
"hey look how much cleaner the code is if I use this DSL", and on the
other side was "hey look how much work you don't have to do if you
just write code directly". The more cushioning between the programmer
and the language, the more the cushion has to be built to handle
everything the programmer might want to do. Python is a buffer between
me and C. C is itself a buffer between me and assembly language. Each
of them provides something that I want, but each of them has to be so
extensive as to be able to handle _anything_ I might want to write.
(Or, pretty much anything. Sometimes I find that a high level language
lacks some little thing - recently I was yearning for a beep feature -
and find that I can shell out to some external utility to do it for
me.) Creating SimplyPy would put the onus on you to make it possible
to write general code in it, and I think you'll find it's just not
worth trying - more and more you'll want to add features from Python
itself, until you achieve the inner-platform effect. [2]

Note that there are times when this sort of cushioning and limiting
are absolutely appropriate. Sometimes you want to limit end users to
certain pieces of functionality only, and the easiest way to do it is
to create a special-purpose language that is interpreted by a (say)
Python script. Or maybe you want to write a dice roller that takes a
specific notation like "2d8 + d6 (fire) + 2d6 (sneak attack) + 4
(STR)" [3] and interprets that appropriately. But the idea isn't to
simplify general programming, then, and if you're doing that sort of
thing, you still might want to consider a general-purpose language
(Lua's good for that, as is JavaScript/ECMAScript). That's not what
you're suggesting.
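
For a sense of scale, a special-purpose notation like the dice
example can be interpreted in a few lines of Python. This is only a
rough sketch under stated assumptions, not the Minstrel Hall
implementation: the grammar (terms joined by "+", each term NdM or a
constant, with an optional parenthesised tag) and the names TERM and
roll are invented here.

```python
import random
import re

# Assumed grammar: "NdM" (N optional) or a plain integer, then an optional "(tag)".
TERM = re.compile(r"\s*(?:(\d*)d(\d+)|(\d+))\s*(?:\(([^)]*)\))?\s*")

def roll(notation, rng=random):
    """Return (total, parts); parts holds (term text, tag, value) per term."""
    total = 0
    parts = []
    for term in notation.split("+"):
        m = TERM.fullmatch(term)
        if m is None:
            raise ValueError("cannot parse term: %r" % term)
        count, sides, constant, tag = m.groups()
        if constant is not None:
            value = int(constant)
        else:
            # An omitted count, as in "d6", means one die.
            value = sum(rng.randint(1, int(sides)) for _ in range(int(count or 1)))
        parts.append((term.strip(), tag, value))
        total += value
    return total, parts

total, parts = roll("2d8 + d6 (fire) + 2d6 (sneak attack) + 4 (STR)")
```

The notation can only express dice and constants, which is the kind
of deliberate limit described above; it makes no attempt to be a
general-purpose language.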

ChrisA

[1] Domain-specific language. I'm never sure whether to footnote these
kinds of acronyms, but I want to clarify that I am not talking about a
copper-based internet connection here.
[2] http://thedailywtf.com/Articles/The_Inner-Platform_Effect.aspx and
https://en.wikipedia.org/wiki/Inner-platform_effect
[3] I actually have a dice roller that does exactly that as part of
Minstrel Hall - http://minstrelhall.com/


