python 2.7.12 on Linux behaving differently than on Windows

Steven D'Aprano steve+comp.lang.python at pearwood.info
Wed Dec 7 22:41:00 EST 2016


On Thursday 08 December 2016 12:15, BartC wrote:

> On 08/12/2016 00:09, Steve D'Aprano wrote:
>> On Thu, 8 Dec 2016 02:48 am, BartC wrote:
> 
>>> You make it sound like a big deal. Here's a program (in my language not
>>> Python) which prints the list of files matching a file-spec:
>>>
>>> println dirlist(cmdparams[2])
>>
>> Does dirlist provide an escaping mechanism so that you can disable filename
>> globbing?
> 
> No. It simply produces a list of files in a path matching a particular
> wildcard pattern using * and ?.
> 
> That's all. I know the value of keeping things straightforward instead
> of throwing in everything you can think of. The file-matching is done by
> WinAPI functions. The Linux version was done last week and I stopped
> when I proved the concept (of having a dirlist() that worked - to
> certain specs - on both OSes, 32- and 64-bit, without change).

So you're happy with the fact that there are legitimate file names that your 
program simply has no way of dealing with?

Thank you for making my point for me! If you leave the implementation of 
metacharacter expansion up to the individual program, then you get the 
situation where individual programs will fail to support certain features, or 
even basic functionality.

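Python's fnmatch library is a good example. It has, or at least had, no 
backslash-style escaping of metacharacters; the only way to get a literal match 
is to wrap each metacharacter in brackets, as in [*]. Anyone who relies on the 
fnmatch and glob modules without knowing that workaround will be unable to 
handle legitimate file names.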
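
To make the escaping point concrete, here is a minimal sketch. It assumes 
Python 3.4 or later, where glob.escape() exists (it is not in 2.7), and the 
list of file names is made up for illustration:

```python
import fnmatch
import glob

# Hypothetical directory listing, including a file literally named "*.*".
names = ["*.*", "a.txt", "notes", "b.*"]

def match(names, pattern):
    """Return the names matching a shell-style pattern (case-sensitive)."""
    return [n for n in names if fnmatch.fnmatchcase(n, pattern)]

# Unescaped: "*.*" is a wildcard matching any name that contains a dot.
wild = match(names, "*.*")

# Escaped: glob.escape("*.*") gives "[*].[*]", which matches only the
# file whose name is literally "*.*".
literal = match(names, glob.escape("*.*"))

print(wild)
print(literal)
```

Without some escape of this kind, the second query simply cannot be expressed.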



>> In another post, you claimed that in your programs, you wouldn't
>> use anything as clumsy and ambiguous as globbing.
> 
>>     My program wouldn't need anything so crude. The input syntax
>>     would be more carefully designed so as to not have the ambiguity.
> 
> I meant designing a CLI where *.* could be used in different parameters
> to mean different things.
> 
> Reading that post again, you presumably meant either *.* used to match
> all files (with an embedded "." in Linux, with or without in Windows;
> another difference), or *.* used to match a specific file called *.*.

Correct.

It doesn't matter whether metacharacters are expanded by the shell or by the 
program itself: there needs to be an escaping mechanism for those cases where 
you want the metacharacters to be treated literally.



>> So presumably your dirlist() command can distinguish between the file called
>> literally "*.*" and the file spec "*.*" that should be expanded,
> 
> No. I won't support that (not unless it's supported by Posix's
> fnmatch()). Because it's the thin end of the wedge. I can show a few
> lines of code and then you will say, Ah, but what about this...
> 
> And I first used this function in early 90s I think, I don't recall it
> ever not working.

If you can't use it to specify a file called literally "*.*", then it's not 
working.


>> And of course your program is also capable of variable and arithmetic
>> expansion, right?
> 
> Um, dirlist() is used within a language, of course it can do all that.
> If you mean within the file-pattern string submitted to dirlist, then I
> don't even know what that would mean.

I showed you an example in another post:

[steve@ando ~]$ export base="thefile"
[steve@ando ~]$ ls -l "$base$((1000 + 234))"
-rw-rw-r-- 1 steve steve 0 Dec  8 10:51 thefile1234


Of course it is silly to write 1000 + 234 as a literal constant like that, but 
either or both of those could just as easily be variables.

The point here is not that *you* should build a calculator into your version of 
ls. The complete opposite: the point is that application authors don't have to 
build in more and more non-core functionality into their own applications, 
because the shell handles all this peripheral functionality for them.

The user then can choose whether to use the shell's extra functionality or not, 
and your application doesn't need to care one way or the other.
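
As a rough sketch of the application's side of that bargain, here is a 
program that does nothing but report the arguments it was handed. The 
describe() helper and the script name used below are hypothetical, chosen 
only for illustration:

```python
import sys

def describe(args):
    """Report the arguments exactly as the program received them."""
    lines = ["%d argument(s)" % len(args)]
    lines.extend("  %r" % a for a in args)
    return "\n".join(lines)

if __name__ == "__main__":
    # By the time we get here, the shell has already done any variable,
    # arithmetic or glob expansion. The program sees only the results.
    print(describe(sys.argv[1:]))
```

Saved as, say, showargs.py and run from bash with "$base$((1000 + 234))" as 
its argument (and base set to "thefile"), it receives the single string 
thefile1234. It contains no expansion code of its own.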



>>> (Here are implementations of that dirlist() for both OSes:
>>> http://pastebin.com/knciRULE
> 
> [Sorry about that name; I didn't choose it!]
> 
>> I have no idea how good the Windows globbing support is, or whether it can
>> be escaped.
> 
> Why such a preoccupation with 'globbing'? It's something I may use from
> time to time with DIR or COPY or something, and that's it.

Right -- and that's why you're not the target audience for Unix shells. Unix 
sys admins use globs and other shell expansion features extensively, probably 
hundreds of times a day. Even I, hardly a sys admin at all, use them dozens of 
times a day.

For example, a quick and simple way of backing up a file with a date stamp:

steve@runes:~$ cp alloc.txt{,-`date +%Y%m%d`}
steve@runes:~$ ls alloc.txt*
alloc.txt  alloc.txt-20161208
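
The brace expansion turns alloc.txt{,-20161208} into the two words alloc.txt 
and alloc.txt-20161208, so cp sees an ordinary source and destination. For 
comparison, here is a sketch of what a program would have to do to offer the 
same thing itself. The function name backup_with_datestamp is hypothetical, 
and the demo uses a scratch directory rather than real files:

```python
import datetime
import os
import shutil
import tempfile

def backup_with_datestamp(path, today=None):
    """Copy path to path-YYYYMMDD and return the new file name."""
    today = today or datetime.date.today()
    target = "%s-%s" % (path, today.strftime("%Y%m%d"))
    shutil.copy(path, target)
    return target

if __name__ == "__main__":
    # Work in a scratch directory so no real files are touched.
    workdir = tempfile.mkdtemp()
    original = os.path.join(workdir, "alloc.txt")
    with open(original, "w") as f:
        f.write("example contents\n")
    backup_with_datestamp(original, datetime.date(2016, 12, 8))
    print(sorted(os.listdir(workdir)))
```

That is a dozen lines to replace one short command, and every program that 
wanted the feature would need its own copy.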


> I just didn't know someone could also use it with any user program and
> that program's command line would then fill with loads of unexpected
> files that can mess things up.

You're treating this as some unknown third party dumping junk into the command 
line. It's not. It's me, the user, *choosing* to use another program (the 
shell) to expand a bunch of metacharacters, to save me having to type things 
out by hand.

You're worried that 

    program *

will expand to a million file names? Okay, but what if I, the user, typed out a 
million file names by hand? I'd hate to do that, but if I need to process a 
million file names, by Jove that's what I'll do. (With tab completion, it won't 
be *quite* as awful as it seems.)

Or... I'll write a macro or script to generate a million file names, and pass 
them to your program.

Or... I'll make use of an existing program, the shell, and use that. Because 
I'm not an idiot and there's no way on Earth I'm actually going to type out a 
million file names, not when I have a computer that will do it for me.

So what's the difference? Why should you prohibit me from using automation to 
make my life easier?

Bart, we get it that you were taken by surprise by shell expansion, because in 
X decades of professional IT work you'd never seen it before. Okay, you were 
surprised. Oops, sorry. That's the risk you take when you move from a platform 
you know and are comfortable with onto an unfamiliar platform that breaks your 
expectations, and I do sympathise with your sense of shock.

But it's not a bug, it's a feature, and Linux system admins use that feature, 
and others like it, *extensively*. It's not a feature you care for, but it's 
not designed for you.

So get over it.

[...]
>> Looks like you have a lot of wheels that need re-inventing before you come
>> even close to parity with the features of the Linux shell.
> 
> These are wheels I don't *want* to re-invent! I'm not writing a shell.

Indeed, and nobody wants to force you to.

But the people who wrote sh, bash, zsh, etc ARE writing shells, and they've 
given them the features that shell users demand so that you, the application 
author, don't have to.




-- 
Steven
"Ever since I learned about confirmation bias, I've been seeing 
it everywhere." - Jon Ronson



