Behavior of the for-else construct

Chris Angelico rosuav at gmail.com
Fri Mar 4 12:46:14 EST 2022


 On Sat, 5 Mar 2022 at 02:02, Tim Chase <python.list at tim.thechases.com> wrote:
>
> On 2022-03-04 11:55, Chris Angelico wrote:
> > In MS-DOS, it was perfectly possible to have spaces in file names
>
> DOS didn't allow space (0x20) in filenames unless you hacked it by
> hex-editing your filesystem (which I may have done a couple times).
> However it did allow you to use 0xFF in filenames which *appeared* as
> a space in most character-sets.

Hmm, I'm not sure which APIs worked which way, but I do believe that I
messed something up at one point and made a file with an included
space (not FF, an actual 20) in it. Maybe it's something to do with
the (ancient) FCB-based calls. It was tricky to get rid of that file,
though I think it turned out that it could be removed by globbing,
putting a question mark where the space was.

(Of course, internally, MS-DOS considered that the base name was
padded to eight with spaces, and the extension padded to three with
spaces, so "READ.ME" would be "READ\x20\x20\x20\x20ME\x20", but that
doesn't count, since anything that enumerates the contents of a
directory would translate that into the way humans think of it.)

> I may have caused a mild bit of consternation in school computer labs
> doing this. ;-)

Nice :)

> > Windows forbade a bunch of characters in file names
>
> Both DOS and Windows also had certain reserved filenames
>
> https://www.howtogeek.com/fyi/windows-10-still-wont-let-you-use-these-file-names-reserved-in-1974/
>
> that could cause issues if passed to programs.

Yup. All because, way back in the day, they didn't want to demand the
colon. If you actually *want* to use the printer device, for instance,
you could get a hard copy of a directory listing like this:

DIR >LPT1:

and it's perfectly clear that you don't want to create a file called
"LPT1", you want to send it to the printer. But noooooo it had to be
that you could just write "LPT1" and it would go to the printer.

> To this day, if you poke around on microsoft.com and change random
> bits of URLs to include one of those reserved filenames in the GET
> path, you'll often trigger a 5xx error rather than a 404 that you
> receive with random jibberish in the same place.
>
>   https://microsoft.com/…/asdfjkl → 404
>   https://microsoft.com/…/lpt1 → 5xx
>   https://microsoft.com/…/asdfjkl/some/path → 404
>   https://microsoft.com/…/lpt1/some/path → 5xx
>
> Just in case you aspire to stir up some trouble.
>

In theory, file system based URLs could be parsed such that, if you
ever hit one of those, it returns "Directory not found". In
practice... apparently they didn't do that.

As a side point, I've been increasingly avoiding any sort of system
whereby I take anything from the user and hand it to the file system.
The logic is usually more like:

If path matches "/static/%s":
1) Get a full directory listing of the declared static-files directory
2) Search that for the token given
3) If not found, return 404
4) Return the contents of the file, with cache markers

Since Windows will never return "lpt1" in that directory listing, I
would simply never find it, never even try to open it. This MIGHT be
an issue with something that accepts file *uploads*, but I've been
getting paranoid about those too, so, uhh... my file upload system now
creates URLs that look like this:

https://sikorsky.rosuav.com/static/upload-49497888-6bede802d13c8d2f7b92ca9fac7c

That was uploaded as "pie.gif" but stored on the file system as
~/stillebot/httpstatic/uploads/49497888-6bede802d13c8d2f7b92ca9fac7c
with some metadata stored elsewhere about the user-specified file
name. So hey, if you were to try to upload a file that had an NTFS
invalid character in it, I wouldn't even notice.

Maybe I'm *too* paranoid, but at least I don't have to worry about
file system attacks.

ChrisA


More information about the Python-list mailing list