BOM should be ignored by Python

Donn Cave donn at oz.net
Thu May 4 01:40:42 EDT 2000


Quoth "Mark Hammond" <mhammond at skippinet.com.au>:
| "Andrew MacIntyre" <andymac at bullseye.apana.org.au> wrote in message
| news:Pine.OS2.3.95.1000503080812.374B-100000 at CENTRAL...
|
|> I can see this happening on Win2K systems, but on Unix where the #! hack
|> is extensively entrenched and unicode oblivious AFAIK (and likely to be
|> so for some time), I'm wondering where that would leave script
|> portability.
|
| Wont this simply indicate the file is ASCII?
|
| Note that editors etc do _not_ show you the BOM - so the Unix shell could
| still easily support it - when it opens the file, no BOM==text,
| otherwise==Unicode - first line after the BOM then is the directive.

The #! mechanism isn't really a shell feature; it belongs to the UNIX
exec mechanism, the same one that handles your binary executables.
An executable file in this system identifies its type in its first
two bytes.  Exec could find ('#', '!') there for a script, or ('\314',
'\0') for example in a binary executable.  Of course in theory it's
possible to account for the Unicode BOM in that check, as long as
the BOM bytes don't happen to conflict with some other type code, and
I guess that's what you're saying.  But in the near term, obviously
the BOM would lose on UNIX: exec would not find '#!' in the first two
bytes and so would not recognize the file as a script.
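
To make the two-byte check concrete, here is a rough sketch in Python of
the kind of test being discussed.  It is only an illustration, not what
any real kernel does: classify_header and its return strings are names
made up for this example, and it looks only at the UTF-8 and UTF-16 BOMs.

```python
import codecs

# BOMs from the standard codecs module, paired with the encoding in which
# a following "#!" would be written.  UTF-32 is left out of this sketch.
BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def classify_header(header):
    """Classify a file from its leading bytes (hypothetical helper)."""
    # What a two-byte exec check would accept as a script today.
    if header[:2] == b"#!":
        return "script"
    # What a BOM-aware check would additionally have to do.
    for bom, encoding in BOMS:
        if header.startswith(bom):
            marker = "#!".encode(encoding)
            if header[len(bom):].startswith(marker):
                return "bom-then-script"
            return "bom-no-script"
    return "other"
```

A plain script is recognized from its first two bytes, while the same
script saved with a BOM reaches the "#!" only after the BOM has been
skipped, which is the extra step the exec check would have to learn.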

	Donn Cave, donn at oz.net


