EOF while scanning triple-quoted string literal

Grant Edwards invalid at invalid.invalid
Fri Oct 15 15:59:13 EDT 2010


On 2010-10-15, Martin Gregorie <martin at address-in-sig.invalid> wrote:
>> On 2010-10-15, Martin Gregorie <martin at address-in-sig.invalid> wrote:
>>> On Fri, 15 Oct 2010 17:02:07 +0000, Grant Edwards wrote:
>>>> On 2010-10-15, Steven D'Aprano <steve at REMOVE-THIS-cybersource.com.au>:
>>>> 
>>>>> In the Unix world, which includes OS X, text tools tend to have
>>>>> difficulty with tabs. Or try naming a file with a newline or carriage
>>>>> return in the file name, or a NULL byte.
>>>> 
>>>> How do you create a file with a name that contains a NULL byte?
>>>
>>> Use a language or program that doesn't use null-terminated strings.
>>>
>>> It's quite easy in many BASICs, [...]
>> 
>> I don't see what the in-program string representation has to do with
>> it. The Unix system calls that create files only accept NULL
>> terminated strings for the path parameter.
>
> Well, obviously you can't have null in a filename if the program is
> using null-terminated strings.

Obviously.

Just as obviously, you can't have a null in a filename if the OS
filesystem API uses null-terminated strings -- which the Linux
filesystem API does.  I just verified that by looking at the kernel
sources -- I can post the relevant code if you like.

I'm pretty sure all the other Unices are the same.  I've got BSD
sources lying around somewhere...

>> Are you saying that there are BASIC implementations for Unix that
>> create Unix files by directly accessing the disk rather than using
>> the Unix system calls?
>
> I'm saying that the only BASIC implementations I've looked at the
> guts of have used count-delimited strings. None were on *nixen but
> it's a safe bet that if they were ported to a UNIX they'd retain their
> count-delimited nature.

And I'm saying _that_doesn't_matter_.  The _OS_ uses NULL-terminated
strings.  You can use a language that represents strings as braille
images encoded as in-memory PNG files if you want.  That still doesn't
let you create a Unix file whose name contains a NULL byte.
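
Python itself stores strings with an explicit length, so it makes a
handy test case.  A minimal sketch, assuming Python 3 on a Unix box
(the helper name try_create is made up for illustration): the
interpreter refuses to pass a name with an embedded NUL down to the OS
at all.

```python
import os


def try_create(path):
    """Try to create `path`; return the exception type name, or None on success."""
    try:
        fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)
    except (ValueError, TypeError) as exc:
        # Recent Python 3 raises ValueError for an embedded null byte;
        # older releases raised TypeError instead.
        return type(exc).__name__
    os.close(fd)
    return None


# The string holds the NUL without complaint; the rejection happens
# only when the name is handed to the file-creation call.
name = "foo\0bar"
print(len(name), try_create(name))
```

The in-program representation holds the byte fine; it is the hand-off
to the filesystem API where it stops.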

> Another language that will certainly do this is COBOL, which only
> uses fixed length, and therefore undelimited, strings. 

Again, what difference does it make?

If the OS uses null-terminated strings for filenames, what difference
does it make how the user-space program represents filenames internally?

> The point I'm making is that in both fixed length and counted string 
> representations you can put any character value at all into the
> string unless whatever mechanism you're using to read in the values
> recognises something, e.g. TAB, CR, LF, CRLF, as a delimiter, and even
> then the program can generate a string containing arbitrary
> gibberish. 

I don't care how the program represents strings.

The OS doesn't care.

The filesystem doesn't care.

Please explain how to pass a filename containing a NULL byte to a Unix
syscall like creat() or open().  You don't even have to use the C
library API -- feel free to use the real syscall API for whatever Unix
on whatever architecture you want.
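
To make the question concrete, here is a rough sketch that sidesteps
Python's own check and calls the C library's creat() through ctypes.
It assumes a Unix-like system where ctypes.CDLL(None) exposes libc and
where mode_t can be passed as an unsigned int; create_with_raw_name is
a hypothetical helper, not anything standard.  The path is read only up
to the first NUL, so the bytes after it never reach the filesystem.

```python
import ctypes
import os
import tempfile

# Assumption: on Linux/macOS, CDLL(None) gives access to libc symbols.
libc = ctypes.CDLL(None, use_errno=True)
libc.creat.argtypes = [ctypes.c_char_p, ctypes.c_uint]  # mode_t assumed to fit
libc.creat.restype = ctypes.c_int


def create_with_raw_name(directory, name_bytes):
    """Call creat() with raw bytes and return the resulting directory listing."""
    path = os.fsencode(directory) + b"/" + name_bytes
    fd = libc.creat(path, 0o600)
    if fd < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    os.close(fd)
    return sorted(os.listdir(directory))


with tempfile.TemporaryDirectory() as scratch:
    # Ask for "foo\0bar" and see what name actually ends up on disk.
    print(create_with_raw_name(scratch, b"foo\0bar"))
```

Whatever follows the NUL is simply not part of the name as far as
creat() is concerned.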

> If you then use the string as a file name you can end up with a file
> that can't be accessed or deleted if the name flouts the OS's file
> naming conventions. I've done it in the past with BASIC programs and
> finger trouble under FLEX09 and CP/M. In both cases I had to use a
> disk editor to fix the file name before the file could be deleted or
> accessed.

We're talking about Unix.

We're not talking about CP/M, DOS, RSX-11m, Apple-SOS, etc.

-- 
Grant Edwards               grant.b.edwards        Yow! I put aside my copy
                                  at               of "BOWLING WORLD" and
                              gmail.com            think about GUN CONTROL
                                                   legislation...


