Python's 8-bit cleanness deprecated?

Wed Feb 5 15:41:26 EST 2003

Paul Rubin <phr-n2003b at NOSPAMnightsong.com> wrote in 
news:7x1y2mo80b.fsf at ruckus.brouhaha.com:

> Brian Quinlan <brian at sweetapp.com> writes:
>> > I think it's more in the Python tradition (although this particular
>> > tradition is one that I don't like) to use a variable:
>> > 
>> > __encoding__ = "latin-1"
>> 
>> It has to be a bit more special than that because the encoding must be
>> detected before the grammar is parsed. Variable assignment would be
>> acceptable, I guess, except that the assignment:
>> 
>> 1. would have to use a simplified grammar
>> 2. it would have to be near the top of the file
> 
> Yes, there are similar constraints for "from __future__" declarations.
> This would be similar.

Anyway, if the PEP proposes to search for a regexp in comments, then it can
do so equally well over the rest of the source, meaning that

regexp1 = r"__encoding__[\t ]*=[\t ]*[\"']+([\w.-]+)[\"']+"
regexp2 = r"from[\t ]+__encoding__[\t ]+import[\t ]+[\"']+([\w.-]+)[\"']+"

can be searched for before the grammar is parsed. Both are valid Python code
that need no language extensions, and they still work after all comments
have been removed.
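As a rough sketch of that pre-parse search (not anything the PEP specifies): the function name detect_encoding, the "ascii" default and the two-line limit are assumptions made up for illustration, and the pattern is a variant of regexp1 whose capture group is widened to [\w.-]+ so that hyphenated names such as "latin-1" match.

```python
import re

# Variant of regexp1, compiled as a bytes pattern because the file cannot
# be decoded until its encoding is known. The [\w.-]+ group is an
# assumption so that names like "latin-1" are accepted.
ENCODING_RE = re.compile(
    rb"^[\t ]*__encoding__[\t ]*=[\t ]*[\"']([\w.-]+)[\"']",
    re.MULTILINE,
)


def detect_encoding(raw, default="ascii", max_lines=2):
    """Return the encoding declared near the top of raw source bytes.

    Only the first max_lines lines are searched (hypothetical limit),
    mirroring the "near the top of the file" constraint.
    """
    head = b"\n".join(raw.splitlines()[:max_lines])
    match = ENCODING_RE.search(head)
    if match is None:
        return default
    return match.group(1).decode("ascii")


# The declaration is found on the raw bytes, before any parsing happens.
source = b'__encoding__ = "latin-1"\nname = "caf\xe9"\n'
encoding = detect_encoding(source)
text = source.decode(encoding)
```

The search runs on undecoded bytes, so it needs neither the tokenizer nor the grammar; only once the name is known is the whole source decoded.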

Actually I like the regexp1 way. You can even retrieve the encoding at
runtime, and if the encoding matters during parsing and executing, it also
matters during runtime; otherwise we would not need the PEP, right?
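A small sketch of the runtime side, assuming the variable form were adopted: __encoding__ is then just an ordinary module-level name, and the helper decode_literal is made up for illustration.

```python
import codecs

__encoding__ = "latin-1"  # the declaration form that regexp1 looks for


def decode_literal(raw):
    """Decode bytes using the encoding this module declares (hypothetical helper)."""
    return raw.decode(__encoding__)


# The same name the pre-parse search found is readable by running code.
codec_info = codecs.lookup(__encoding__)
greeting = decode_literal(b"gr\xfc\xdf dich")
```

A declaration hidden in a comment would not be visible to the running program this way.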

chris

a comment is a comment and should stay a comment...
-- 
Chris <cliechti at gmx.net>