detecting newline character
Thomas 'PointedEars' Lahn
PointedEars at web.de
Sun Apr 24 08:50:14 EDT 2011
Daniel Geržo wrote:
> Thomas 'PointedEars' Lahn wrote:
>> It is clear now that codecs.open() would not support universal newlines
>> from at least Python 2.6 forward as it is *documented* that it opens
>> files in *binary mode* only. The source code that I have posted shows
>> that it therefore actively removes 'U' from the mode string when the
>> `encoding' argument was passed, and always appends 'b' to the mode if not
>> present. As a result, __builtin__.open() is called without 'U' in the
>> `mode' argument, which is *documented* to set file.newlines to None
>> (regardless of whether Python was compiled with universal newline support).
>
> OK, it makes much more sense now, thanks for the explanation. I didn't
> understand it when reading the docs. The io module seems to be a good
> choice for my use case so I switched to using that for now.
ACK
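For the archive, a minimal sketch of how the io module can report the newline convention of a file. The helper name detect_newlines() and the temporary demo file are my own illustration, not code from pysublib; it assumes the file is read to the end in universal newlines mode (newline=None) before the `newlines' attribute is inspected.

```python
import io
import os
import tempfile


def detect_newlines(path, encoding="utf-8"):
    """Return the newline convention(s) seen while reading `path`.

    Hypothetical helper: reads the whole file in universal newlines
    mode and then inspects the `newlines` attribute of the text stream.
    """
    with io.open(path, "r", encoding=encoding, newline=None) as f:
        f.read()  # the attribute only reflects newlines read so far
        return f.newlines


# Demo: write a file with CRLF line endings in binary mode, then detect them.
fd, demo_path = tempfile.mkstemp()
os.close(fd)
with io.open(demo_path, "wb") as f:
    f.write(b"first\r\nsecond\r\n")

detected = detect_newlines(demo_path)
os.remove(demo_path)
print(repr(detected))
```

If a file mixes conventions, `newlines' is a tuple of all those seen, so callers should be prepared for None, a string, or a tuple.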
> What is still a little confusing for me is that you stated "WFM", which
> I interpreted as "Works For Me", in one of your previous replies for
> both with and without encoding specified.
This is now[1] easily explained by a typo in my quick-hacked test module,
where it said
  if __name__ == "main":
      CodecsTest()
instead of the proper
  if __name__ == "__main__":
      CodecsTest()
Testing too superficially, I concluded that because no exception was thrown
there, there was no problem. However, in fact, no exception was thrown
there because the method in question was never called. Sorry.
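A small sketch of why the typo went unnoticed: with the wrong string the guard is simply false, so the test body never runs and nothing can be thrown. The names run_guarded() and codecs_test() are hypothetical stand-ins for my test module; the module name is passed in explicitly only so that both cases can be shown side by side.

```python
calls = []


def codecs_test():
    """Stand-in for the real test; it only records that it was called."""
    calls.append("ran")


def run_guarded(module_name, test):
    """Run `test` only if `module_name` matches the script entry-point name."""
    if module_name == "__main__":
        test()


run_guarded("main", codecs_test)      # the typo: guard is false, nothing runs
after_typo = len(calls)

run_guarded("__main__", codecs_test)  # the proper name: the test is called
after_fix = len(calls)

print(after_typo, after_fix)
```

Recording that the test actually ran (or checking the test runner's count of executed tests) would have caught this sooner than the absence of an exception did.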
> I also have to state that it must have been changed sometime during 2.6
> line, because I started developing pysublib ca. 20 months ago on python
> 2.6 (don't know the minor version) and I am quite sure my tests were
> passing back in that time...
Yes, I have subsequently found the changelogs saying:
| What's New in Python 2.7 alpha 4?
| =================================
|
| *Release date: 2010-03-06*
|
| […]
| Library
| -------
|
| […]
| - Issue #691291: ``codecs.open()`` should not convert end of lines on
|   reading and writing.
|
| What's New in Python 2.6.5 rc 1?
| ================================
|
| *Release date: 2010-03-01*
|
| […]
| Library
| -------
|
| […]
| - Issue #691291: codecs.open() should not convert end of lines on reading
|   and writing.
See also <http://bugs.python.org/issue691291> for the rationale.
I have python2.6_2.6.6-8+b1_i386 and python2.7_2.7.1-8_i386 installed.
Fixing the typo above, both throw the exception under said circumstances, as
expected.
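The behaviour described in the changelog entries can be checked with a small experiment. This is only a sketch under my own assumptions (a temporary file with CRLF line endings, UTF-8 encoding); it compares what codecs.open() and io.open() return for the same bytes and does not reproduce the exception discussed in this thread.

```python
import codecs
import io
import os
import tempfile

# Create a temporary file with CRLF line endings, written in binary mode
# so that nothing is translated on the way in.
fd, path = tempfile.mkstemp()
os.close(fd)
with io.open(path, "wb") as f:
    f.write(b"a\r\nb\r\n")

# codecs.open() opens the underlying file in binary mode, so the line
# endings come back exactly as stored.
with codecs.open(path, "r", encoding="utf-8") as f:
    raw_text = f.read()

# io.open() in text mode uses universal newlines by default and
# translates line endings to "\n" on reading.
with io.open(path, "r", encoding="utf-8") as f:
    translated_text = f.read()

os.remove(path)
print(repr(raw_text), repr(translated_text))
```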
That is why I suggested RTSL (which should not be that hard to do; see
below).
[1] Aside: I had already noticed that PyDev would show me the 2.6 source
code of codecs.py when Ctrl-clicking e.g. `codecs' or `open' in a PyDev
project where the grammar was set to Python 2.7. Now I know why: The
project's interpreter setting was set to "Default". However, apparently
"Default" refers to the first interpreter in the list in the Preferences,
and that was Python 2.6 (as I added 2.7 later). I have found that out by
placing the Python 2.7 interpreter entry at the top of the list, clicking
Apply twice, thereby "restoring the interpreter"; I got the mentioned
exception then. Although the logic of validating against e.g. the Python
3.0 grammar and using a Python 2.7 interpreter escapes me, it should be
noted that you should set *both* settings if unsure (using PyDev
2.0.0.2011040403).
> Thank you for your help it's very appreciated.
You're welcome.
--
PointedEars