EOF while scanning triple-quoted string literal

Grant Edwards invalid at invalid.invalid
Fri Oct 15 16:07:37 EDT 2010


On 2010-10-15, Seebs <usenet-nospam at seebs.net> wrote:
> On 2010-10-15, Grant Edwards <invalid at invalid.invalid> wrote:

>> Yes, all of the Unix syscalls use NULL-terminated path parameters
>> (AKA "C strings").  What I don't know is whether the underlying
>> filesystem code also uses NULL-terminated strings for filenames or if
>> they have explicit lengths.  If the latter, there might be some way
>> to bypass the normal Unix syscalls and actually create a file with a
>> NULL in its name -- a file that then couldn't be accessed via the
>> normal Unix system calls.  My _guess_ is that the underlying
>> filesystem code in most all Unices also uses NULL-terminated strings,
>> but I haven't looked yet.
>
> There's some dire magic there.  The classic V7 or so filesystem had
> 16-byte file names which were null terminated unless they were 16
> characters, in which case they weren't but were still only 16
> characters.  Apart from that, though, so far as I know everything is
> always null terminated.

I've just verified in the Linux sources that filenames passed to the
Linux syscall API (as opposed to the C library API) are indeed null
terminated.  Even if they're not stored as null-terminated strings in
the actual filesystem data structures, there's no way to get a null
byte in there using standard syscalls.  I have found a few places in
the filesystem code where there's a structure field that seems to be a
"name length", but it's not obvious at first glance whether it holds
the length of a file name.
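
From the Python side, a minimal sketch of what "no way to get a null
byte in there" looks like in practice.  It assumes a current Python 3,
where the interpreter itself refuses a path with an embedded null
before any syscall is made (older interpreters may raise a different
exception type, hence the two types caught below).  The helper name
try_create is made up for illustration.

```python
import os
import tempfile


def try_create(name):
    """Try to create a file called `name` in a scratch directory.

    Returns a short string describing what happened.
    """
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, name)
        try:
            with open(path, "w"):
                pass
        except (ValueError, TypeError) as exc:
            # Assumption: the interpreter rejects the embedded null
            # itself, so the path never reaches the kernel.
            return "rejected: %s" % type(exc).__name__
        return "created"


print(try_create("ok.txt"))
print(try_create("bad\0name"))
```

Note that this only shows the interpreter's check; it says nothing
about how any particular filesystem stores names on disk.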

> The weird special case is slashes; you can never have a slash in a
> file name, but at least one NFS implementation was able to create
> file names containing slashes, and if you had a Mac client (where
> slash was valid in file names), it could then create files with names
> that you could never use on the Unix side, because the path
> resolution code kept trying to find directories instead.

Fun!
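
A rough sketch of why such a name is unusable from the Unix side: the
slashes are consumed by path resolution, so asking for a file called
"10/15/10" really asks for a file "10" inside a directory "15" inside
a directory "10".  The function name and the sample date are
hypothetical, and the ENOENT result assumes a fresh directory where no
"10" entry exists.

```python
import errno
import os
import tempfile


def create_dated_file(directory, name="10/15/10"):
    """Try to create `name` inside `directory`; return the outcome."""
    try:
        with open(os.path.join(directory, name), "w"):
            pass
    except OSError as exc:
        # Path resolution looks for a directory "10", then "15",
        # and fails because they do not exist.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return "created"


with tempfile.TemporaryDirectory() as scratch:
    print(create_dated_file(scratch))
```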

> This was, worse yet, common, because so many people used "mm/dd/yy"
> in file names!  Later implementations changed to silently translating
> between colons and slashes.  (I think this still happened under the
> hood in at least some OS X, because the HFS filesystem really uses
> colons somewhere down in there.)
>
> ... But so far as I know, there's never been a Unix-type system where
> it was actually possible to get a null byte into a file name. Spaces,
> newlines, sure.

And even more fun, backspaces and ANSI escape sequences. :)

Back in the day when everybody was sitting at a terminal (or at least
an xterm), you could confuse somebody for days with judicious use of
filenames containing escape sequences.  Not that I'd ever do such a
thing.
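
For the curious, a small sketch of how such names can be created and
then inspected safely.  It assumes a Unix-like filesystem that accepts
control characters in names (most do), and the sample names are
invented for illustration.  Printing with repr() shows the raw bytes
instead of letting the terminal act on them.

```python
import os
import tempfile

# Hypothetical examples of names that confuse a terminal listing.
SNEAKY_NAMES = [
    "secret\b\b\b\b\b\bboring",  # backspaces overwrite "secret" on screen
    "\x1b[2Jcleared",            # ANSI "erase display" sequence
    "two\nlines",                # embedded newline
]


def make_and_list(names):
    """Create empty files with the given names and return the listing."""
    with tempfile.TemporaryDirectory() as scratch:
        for name in names:
            with open(os.path.join(scratch, name), "w"):
                pass
        return sorted(os.listdir(scratch))


for name in make_and_list(SNEAKY_NAMES):
    # repr() escapes the control characters instead of emitting them.
    print(repr(name))
```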

> Slashes, under rare and buggy circumstances. But I've never heard of
> a null byte in a file name.

Nor I, which is why I was confused by the statement that in the "Unix
world" a lot of programs misbehaved when presented with files whose
names contained a null byte.

-- 
Grant Edwards               grant.b.edwards        Yow! I want my nose in
                                  at               lights!
                              gmail.com            


