Unicode Chars in Windows Path

Chris Angelico rosuav at gmail.com
Thu Apr 3 21:16:58 EDT 2014


On Fri, Apr 4, 2014 at 11:15 AM, David <bouncingcats at gmail.com> wrote:
> On 4 April 2014 01:17, Chris Angelico <rosuav at gmail.com> wrote:
>>
>> -- Get info on all .pyc files in a directory and all its subdirectories --
>> C:\>dir some_directory\*.pyc /s
>> $ ls -l `find some_directory -name \*.pyc`
>>
>> Except that the ls version there can't handle names with spaces in
>> them, so you need to faff around with null termination and stuff.
>
> Nooo, that stinks! There's no need to abuse 'find' like that, unless
> the version you have is truly ancient. Null termination is only
> necessary to pass 'find' results *via the shell*. Instead, ask 'find'
> to invoke the task itself.
>
> The simplest way is:
>
>     find some_directory -name '*.pyc' -ls
>
> 'find' is the tool to use for *finding* things, not 'ls', which is
> intended for terminal display of directory information.

I used ls only as a first example, and then picked up an extremely
common next example (deleting files). It so happens that find can
'-delete' its found files, but my point is that on DOS/Windows, every
command has to explicitly support subdirectories. If, instead, the
'find' command has to explicitly support everything you might want to
do to files, that's even worse! So we need an execution form...

> If you require a particular feature of 'ls', or any other command, you
> can ask 'find' to invoke it directly (not via a shell):
>
>     find some_directory -name '*.pyc' -exec ls -l {} \;

... which this looks like, but it's not equivalent. That will execute
'ls -l' once for each file. You can tell, because the columns aren't
aligned; for anything more complicated than simply 'ls -l', you
potentially destroy any chance at bulk operations. No, to be
equivalent it *must* pass all the args to a single invocation of the
program. You need to instead use xargs if you want it to be
equivalent, and it's now getting to be quite an incantation:

find some_directory -name \*.pyc -print0|xargs -0 ls -l

And *that* is equivalent to the original, but it's way *way* longer
and less convenient, which was my point. Plus, it's extremely tempting
to shorten that, because this will almost always work:

find some_directory -name \*.pyc|xargs ls -l

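But it'll fail if you have whitespace (spaces, tabs, newlines) or quote
characters in file names, because plain xargs splits its input on
blanks as well as newlines. It'd probably work every time you try it,
and then you'll put that in a script and boom,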
it stops working. (That's what I meant by "faffing about with null
termination". You have to go through an extra level of indirection,
making the command fairly unwieldy.)
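
To make the difference between these forms concrete, here is a rough
sketch that counts how many times the command runs and how many
arguments it receives. The scratch directory, the file names, and the
small "sh -c" argument counter standing in for 'ls -l' are all made up
for illustration. The last form, '-exec ... {} +', is not mentioned
anywhere in this thread; it is the POSIX way to have find batch the
names itself, shown here only for comparison.

```shell
#!/bin/sh
# Scratch tree with hypothetical names; one file name contains a space.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/pkg/sub"
touch "$demo_dir/a.pyc" "$demo_dir/pkg/b.pyc" "$demo_dir/pkg/sub/my module.pyc"

# Stand-in for 'ls -l': each invocation prints how many arguments it got.
# -exec ... \; starts the command once per file, so we count output lines.
per_file_calls=$(find "$demo_dir" -name '*.pyc' \
    -exec sh -c 'echo "$#"' sh {} \; | wc -l | tr -d ' ')

# Null-terminated pipeline: one invocation that receives every name intact.
safe_args=$(find "$demo_dir" -name '*.pyc' -print0 |
    xargs -0 sh -c 'echo "$#"' sh)

# Plain pipeline: xargs splits on blanks, so "my module.pyc" becomes two args.
unsafe_args=$(find "$demo_dir" -name '*.pyc' |
    xargs sh -c 'echo "$#"' sh)

# POSIX alternative (not from the thread): find batches the names itself.
plus_args=$(find "$demo_dir" -name '*.pyc' -exec sh -c 'echo "$#"' sh {} +)

printf 'per-file -exec: %s separate calls\n' "$per_file_calls"
printf 'print0 | xargs -0: %s args in one call\n' "$safe_args"
printf 'plain xargs: %s args in one call\n' "$unsafe_args"
printf '%s\n' "-exec {} +: $plus_args args in one call"

rm -rf "$demo_dir"
```

With three files on disk, the plain pipeline reports four arguments,
which is exactly the breakage that only shows up once a name with a
space sneaks in. Note that -print0 and -0 are extensions (GNU and BSD
both have them), so this assumes one of those toolsets.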

> I know this is off-topic but because I learn so much from the
> countless terrific contributions to this list from Chris (and others)
> with wide expertise, I am motivated to give something back when I can.

Definitely! This is how we all learn :) And thank you, glad to hear that.

> And given that in the past I spent a little time and effort and
> eventually understood this, I summarise it here hoping it helps
> someone else. The unix-style tools are far more capable than the
> Microsoft shell when used as intended.

More specifically, the Unix model ("each tool should do one thing and
do it well") tends to make for more combinable tools. The DOS style
requires every program to reimplement the same directory-search
functionality, and then requires the user to figure out how it's been
written in this form ("zip -r" (or is it "zip -R"...), "dir /s", "del
/s", etc, etc). The Unix style requires applications to accept
arbitrary numbers of arguments (which they probably would anyway), and
then requires the user to learn some incantations that will then work
anywhere. If you're writing a script, you should probably use the
-print0|xargs -0 method (unless you already require bash for some
other reason); interactively, you more likely want to enable globstar
and use the much shorter double-star notation. Either way, it works
for any program, and that is indeed "far more capable".
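
For the interactive case, a minimal sketch of the globstar form,
assuming bash 4 or later (older versions do not have the option). The
scratch directory and file names are invented for the demo; at a
prompt the whole thing boils down to "shopt -s globstar" followed by
"ls -l some_directory/**/*.pyc".

```shell
#!/usr/bin/env bash
# Assumes bash 4+; the directory layout below is hypothetical.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/pkg/sub"
touch "$demo_dir/a.pyc" "$demo_dir/pkg/b.pyc" "$demo_dir/pkg/sub/my module.pyc"

# With globstar on, ** matches zero or more directory levels.
shopt -s globstar

# The shell itself expands the pattern into one argument per file, so
# names containing spaces arrive intact and ls runs exactly once.
pyc_files=("$demo_dir"/**/*.pyc)
pyc_count=${#pyc_files[@]}

ls -l "${pyc_files[@]}"
printf 'matched %d files\n' "$pyc_count"

rm -rf "$demo_dir"
```

Because the expansion happens in the shell, the same pattern works
with any command that accepts multiple file arguments, not just ls.
The usual caveat applies: a very large tree can exceed the system's
argument-length limit, which is where the find and xargs form still
earns its keep.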

ChrisA


