EOF while scanning triple-quoted string literal

Martin Gregorie martin at address-in-sig.invalid
Fri Oct 15 15:13:16 EDT 2010


On Fri, 15 Oct 2010 18:14:13 +0000, Grant Edwards wrote:

> On 2010-10-15, Martin Gregorie <martin at address-in-sig.invalid> wrote:
>> On Fri, 15 Oct 2010 17:02:07 +0000, Grant Edwards wrote:
>>
>>> On 2010-10-15, Steven D'Aprano <steve at REMOVE-THIS-cybersource.com.au>
>>> wrote:
>>> 
>>>> In the Unix world, which includes OS X, text tools tend to have
>>>> difficulty with tabs. Or try naming a file with a newline or carriage
>>>> return in the file name, or a NULL byte.
>>> 
>>> How do you create a file with a name that contains a NULL byte?
>>
>> Use a language or program that doesn't use null-terminated strings.
>>
>> It's quite easy in many BASICs, which often delimit strings by
>> preceding them with a byte count, and you hit Ctrl-SPACE by
>> accident....
> 
> I don't see what the in-program string representation has to do with it.
> The Unix system calls that create files only accept NULL-terminated
> strings for the path parameter.
>
Well, obviously you can't have a null in a filename if the program is using 
null-terminated strings.
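
For what it's worth, Python is a handy way to see the effect under
discussion. Its strings are not null-terminated, yet the name still has to
be handed to the OS as a C string, so an embedded null is refused before
any system call is made. The following is only a sketch; the helper name
rejected_before_syscall is made up for illustration, and the exact
exception type has varied between Python versions, hence catching both.

```python
def rejected_before_syscall(name):
    """Return True if Python refuses the name before asking the OS.

    Hypothetical helper, for illustration only.
    """
    try:
        open(name).close()
    except (ValueError, TypeError):
        # Raised while converting the name to a C string
        # (an embedded null byte cannot be represented).
        return True
    except OSError:
        # The name reached the OS, which then reported an error
        # (for example, no such file).
        return False
    return False


print(rejected_before_syscall("spam\0eggs"))
```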

> Are you saying that there are BASIC implementations for Unix that create
> Unix files by directly accessing the disk rather than using the Unix
> system calls?
>
I'm saying that the only BASIC implementations I've looked at the guts of 
have used count-delimited strings. None were on *nixen, but it's a safe bet 
that if they were ported to a UNIX they'd retain their count-delimited 
nature.

Another language that will certainly do this is COBOL, which only uses 
fixed-length, and therefore undelimited, strings.

The point I'm making is that in both fixed-length and counted string 
representations you can put any character value at all into the string 
unless whatever mechanism you're using to read in the values recognises 
something, e.g. TAB, CR, LF or CRLF, as a delimiter, and even then the 
program can generate a string containing arbitrary gibberish.
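
To make the difference concrete, here is a rough sketch of a counted
string with a one-byte length prefix, next to what a null-terminated
reader would see of the same bytes. The function names and the one-byte
count are my own assumptions for illustration, not the layout of any
particular BASIC.

```python
import struct


def pack_counted(data: bytes) -> bytes:
    """Store data as a one-byte length followed by the raw bytes."""
    if len(data) > 255:
        raise ValueError("a one-byte count holds at most 255 bytes")
    return struct.pack("B", len(data)) + data


def unpack_counted(buf: bytes) -> bytes:
    """Read back exactly as many bytes as the count says."""
    count = buf[0]
    return buf[1:1 + count]


def as_c_string(data: bytes) -> bytes:
    """What a null-terminated reader sees: everything before the first NUL."""
    return data.split(b"\0", 1)[0]


raw = b"AB\x00CD"                       # a null in the middle
print(unpack_counted(pack_counted(raw)))  # all five bytes survive
print(as_c_string(raw))                   # cut short at the null
```

The counted form carries the null through untouched; it is only when the
bytes are handed to something expecting a terminator that the tail is lost.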

If you then use the string as a file name you can end up with a file that 
can't be accessed or deleted if the name flouts the OS's file naming 
conventions. I've done it in the past with BASIC programs and finger 
trouble under FLEX09 and CP/M. In both cases I had to use a disk editor 
to fix the file name before the file could be deleted or accessed.
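
A program that builds file names from arbitrary bytes can guard against
this by checking the name first. The sketch below assumes Unix-style
rules (no null, no slash within a single name component) and also flags
other control characters, which are legal there but awkward for text
tools. The function name posix_name_problems is invented for this
example.

```python
def posix_name_problems(name: bytes) -> list:
    """Return a list of reasons the name is unusable or troublesome.

    Hypothetical checker assuming Unix-style naming rules.
    """
    problems = []
    if not name:
        problems.append("empty name")
    if b"\0" in name:
        problems.append("contains a null byte")
    if b"/" in name:
        problems.append("contains a slash")
    # Control characters other than null are permitted on Unix,
    # but tabs and newlines confuse many text tools.
    if any((0 < b < 0x20) or b == 0x7F for b in name):
        problems.append("contains a control character")
    return problems


print(posix_name_problems(b"report.txt"))
print(posix_name_problems(b"bad\x00name\n"))
```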


-- 
martin@   | Martin Gregorie
gregorie. | Essex, UK
org       |


