From jython-checkins at python.org Mon May 1 12:34:20 2017
From: jython-checkins at python.org (stefan.richthofer)
Date: Mon, 01 May 2017 16:34:20 +0000
Subject: [Jython-checkins] jython: Merged PR #73 -- Update NEWS
Message-ID: <20170501163419.34187.21508.27B150A9@psf.io>

https://hg.python.org/jython/rev/a2bf481be3d1
changeset:   8076:a2bf481be3d1
user:        James Mudd
date:        Mon May 01 18:34:02 2017 +0200
summary:
  Merged PR #73 -- Update NEWS

files:
  NEWS | 1 +
  1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/NEWS b/NEWS
--- a/NEWS
+++ b/NEWS
@@ -4,6 +4,7 @@
 Jython 2.7.1rc1
   Bugs fixed
+    - [ 2313 ] test_jython_initializer failure on Windows
     - [ 2399 ] test_sort failure on Java 8
     - [ 2309 ] test_classpathimporter fails on Windows.
     - [ 2318 ] test_zipimport_jy failure on Windows

--
Repository URL: https://hg.python.org/jython

From jython-checkins at python.org Mon May 1 12:45:14 2017
From: jython-checkins at python.org (stefan.richthofer)
Date: Mon, 01 May 2017 16:45:14 +0000
Subject: [Jython-checkins] jython: Merged PR #74 -- Update copyright dates
Message-ID: <20170501164514.33405.69136.44CBEE46@psf.io>

https://hg.python.org/jython/rev/b163e369ae97
changeset:   8077:b163e369ae97
user:        James Mudd
date:        Mon May 01 18:45:03 2017 +0200
summary:
  Merged PR #74 -- Update copyright dates

files:
  ACKNOWLEDGMENTS | 2 +-
  src/org/python/core/PySystemState.java | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/ACKNOWLEDGMENTS b/ACKNOWLEDGMENTS
--- a/ACKNOWLEDGMENTS
+++ b/ACKNOWLEDGMENTS
@@ -3,7 +3,7 @@
 Jython: Python for the Java Platform
-Copyright (c) 2000-2016 Jython Developers.
+Copyright (c) 2000-2017 Jython Developers.
 All rights reserved.
 Copyright (c) 2000 BeOpen.com.
diff --git a/src/org/python/core/PySystemState.java b/src/org/python/core/PySystemState.java
--- a/src/org/python/core/PySystemState.java
+++ b/src/org/python/core/PySystemState.java
@@ -95,7 +95,7 @@
      */
     public static final PyObject copyright = Py.newString(
-        "Copyright (c) 2000-2016 Jython Developers.\n" + "All rights reserved.\n\n" +
+        "Copyright (c) 2000-2017 Jython Developers.\n" + "All rights reserved.\n\n" +
         "Copyright (c) 2000 BeOpen.com.\n" + "All Rights Reserved.\n\n" +
         "Copyright (c) 2000 The Apache Software Foundation.\n" + "All rights reserved.\n\n" +
         "Copyright (c) 1995-2000 Corporation for National Research Initiatives.\n"

--
Repository URL: https://hg.python.org/jython

From jython-checkins at python.org Thu May 4 19:31:06 2017
From: jython-checkins at python.org (stefan.richthofer)
Date: Thu, 04 May 2017 23:31:06 +0000
Subject: [Jython-checkins] jython: Merged PR #75 / PR #77 -- Fix #2585 test_ssl test failure
Message-ID: <20170504233105.36628.A7C3A97375250315@psf.io>

https://hg.python.org/jython/rev/7ad8eedaa14a
changeset:   8078:7ad8eedaa14a
user:        James Mudd
date:        Fri May 05 01:30:47 2017 +0200
summary:
  Merged PR #75 / PR #77 -- Fix #2585 test_ssl test failure

files:
  Lib/_socket.py | 4 ++++
  NEWS | 1 +
  build.xml | 2 --
  3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/Lib/_socket.py b/Lib/_socket.py
--- a/Lib/_socket.py
+++ b/Lib/_socket.py
@@ -355,6 +355,10 @@
     # need to work around.
     if isinstance(java_exception, java.net.ConnectException):
         mapped_exception = _exception_map.get(java.net.ConnectException)
+    # Netty AnnotatedNoRouteToHostException extends NoRouteToHostException
+    # so also needs work around.
+    elif isinstance(java_exception, java.net.NoRouteToHostException):
+        mapped_exception = _exception_map.get(java.net.NoRouteToHostException)
     else:
         mapped_exception = _exception_map.get(java_exception.__class__)
     if mapped_exception:
diff --git a/NEWS b/NEWS
--- a/NEWS
+++ b/NEWS
@@ -4,6 +4,7 @@
 Jython 2.7.1rc1
   Bugs fixed
+    - [ 2585 ] test_ssl failure due to Netty exception mapping
     - [ 2313 ] test_jython_initializer failure on Windows
     - [ 2399 ] test_sort failure on Java 8
     - [ 2309 ] test_classpathimporter fails on Windows.
diff --git a/build.xml b/build.xml
--- a/build.xml
+++ b/build.xml
@@ -1110,7 +1110,6 @@
-
@@ -1129,7 +1128,6 @@
-

--
Repository URL: https://hg.python.org/jython

From jython-checkins at python.org Tue May 16 16:13:45 2017
From: jython-checkins at python.org (darjus.loktevic)
Date: Tue, 16 May 2017 20:13:45 +0000
Subject: [Jython-checkins] jython: Add a Java 8 linux build targets on Travis
Message-ID: <20170516201345.40721.9210.959724F8@psf.io>

https://hg.python.org/jython/rev/425b612a86cf
changeset:   8079:425b612a86cf
user:        James Mudd
date:        Tue May 16 13:13:38 2017 -0700
summary:
  Add a Java 8 linux build targets on Travis

files:
  .travis.yml | 3 +++
  1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/.travis.yml b/.travis.yml
--- a/.travis.yml
+++ b/.travis.yml
@@ -16,6 +16,7 @@
     - CUSTOM_JDK="default"
     - CUSTOM_JDK="oraclejdk7"
     - CUSTOM_JDK="openjdk7"
+    - CUSTOM_JDK="oraclejdk8"
 matrix:
   exclude:
@@ -24,6 +25,8 @@
       env: CUSTOM_JDK="oraclejdk7"
     - os: osx
       env: CUSTOM_JDK="openjdk7"
+    - os: osx
+      env: CUSTOM_JDK="oraclejdk8"
     # On Linux, run with specific JDKs only.
     - os: linux
       env: CUSTOM_JDK="default"

--
Repository URL: https://hg.python.org/jython

From jython-checkins at python.org Thu May 18 19:32:47 2017
From: jython-checkins at python.org (stefan.richthofer)
Date: Thu, 18 May 2017 23:32:47 +0000
Subject: [Jython-checkins] jython: Fixed #2570 by monkeypatching setuptools.command.easy_install.py in case of OSX.
Message-ID: <20170518233247.39165.591B0D9315D9D6C2@psf.io>

https://hg.python.org/jython/rev/aed63ef4113d
changeset:   8080:aed63ef4113d
user:        Stefan Richthofer
date:        Fri May 19 01:32:32 2017 +0200
summary:
  Fixed #2570 by monkeypatching setuptools.command.easy_install.py in case of OSX.

files:
  Lib/_fix_jython_setuptools_osx.py | 33 +++++++++++++++++++
  src/org/python/core/imp.java | 7 ++++
  2 files changed, 40 insertions(+), 0 deletions(-)

diff --git a/Lib/_fix_jython_setuptools_osx.py b/Lib/_fix_jython_setuptools_osx.py
new file mode 100644
--- /dev/null
+++ b/Lib/_fix_jython_setuptools_osx.py
@@ -0,0 +1,33 @@
+'''
+Import of this module is triggered by org.python.core.imp.import_next
+on first import of setuptools.command. It essentially restores a
+Jython specific fix for OSX shebang line via monkeypatching.
+
+See http://bugs.jython.org/issue2570
+Related: http://bugs.jython.org/issue1112
+'''
+
+from setuptools.command import easy_install as ez
+
+_as_header = ez.CommandSpec.as_header
+
+def _jython_as_header(self):
+    '''Workaround Jython's sys.executable being a .sh (an invalid
+    shebang line interpreter)
+    '''
+    if not ez.is_sh(self[0]):
+        return _as_header(self)
+
+    if self.options:
+        # Can't apply the workaround, leave it broken
+        log.warn(
+            "WARNING: Unable to adapt shebang line for Jython,"
+            " the following script is NOT executable\n"
+            " see http://bugs.jython.org/issue1112 for"
+            " more information.")
+        return _as_header(self)
+
+    items = ['/usr/bin/env'] + self + list(self.options)
+    return self._render(items)
+
+ez.CommandSpec.as_header = _jython_as_header
diff --git a/src/org/python/core/imp.java b/src/org/python/core/imp.java
--- a/src/org/python/core/imp.java
+++ b/src/org/python/core/imp.java
@@ -36,6 +36,8 @@
     // imports unless `from __future__ import absolute_import`
     public static final int DEFAULT_LEVEL = -1;
+    private static final boolean IS_OSX = PySystemState.getNativePlatform().equals("darwin");
+
     public static class CodeData {
         private final byte[] bytes;
@@ -847,6 +849,11 @@
         } else {
             ret = modules.__finditem__(fullName);
         }
+        if (IS_OSX && fullName.equals("setuptools.command")) {
+            // On OSX we currently have to monkeypatch setuptools.command.easy_install.
+            // See http://bugs.jython.org/issue2570
+            load("_fix_jython_setuptools_osx");
+        }
         return ret;
     }

--
Repository URL: https://hg.python.org/jython

From jython-checkins at python.org Thu May 18 21:31:01 2017
From: jython-checkins at python.org (stefan.richthofer)
Date: Fri, 19 May 2017 01:31:01 +0000
Subject: [Jython-checkins] jython: Fixed #2579. Updated NEWS.
Message-ID: <20170519013101.82729.A78947870B50CAAA@psf.io>

https://hg.python.org/jython/rev/c382818607a0
changeset:   8081:c382818607a0
user:        Stefan Richthofer
date:        Fri May 19 03:30:48 2017 +0200
summary:
  Fixed #2579. Updated NEWS.

files:
  NEWS | 2 +
  src/org/python/compiler/Module.java | 24 +++++++++++++++-
  2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
--- a/NEWS
+++ b/NEWS
@@ -4,6 +4,8 @@
 Jython 2.7.1rc1
   Bugs fixed
+    - [ 2579 ] Pyc files are not loading for too large modules if path contains __pyclasspath__
+    - [ 2570 ] Wrong shebang set for OS X installation of Jython
     - [ 2585 ] test_ssl failure due to Netty exception mapping
     - [ 2313 ] test_jython_initializer failure on Windows
     - [ 2399 ] test_sort failure on Java 8
diff --git a/src/org/python/compiler/Module.java b/src/org/python/compiler/Module.java
--- a/src/org/python/compiler/Module.java
+++ b/src/org/python/compiler/Module.java
@@ -12,6 +12,7 @@
 import java.io.File;
 import java.io.BufferedReader;
 import java.io.InputStreamReader;
+import java.net.URL;
 import java.util.ArrayList;
 import java.util.Enumeration;
 import java.util.Hashtable;
@@ -28,10 +29,12 @@
 import org.python.antlr.ast.Str;
 import org.python.antlr.ast.Suite;
 import org.python.antlr.base.mod;
+import org.python.core.ClasspathPyImporter;
 import org.python.core.CodeBootstrap;
 import org.python.core.CodeFlag;
 import org.python.core.CodeLoader;
 import org.python.core.CompilerFlags;
+import org.python.core.imp;
 import org.python.core.Py;
 import org.python.core.PyCode;
 import org.python.core.PyBytecode;
@@ -717,6 +720,23 @@
     private
static PyBytecode loadPyBytecode(String filename, boolean try_cpython)
             throws RuntimeException
     {
+        if (filename.startsWith(ClasspathPyImporter.PYCLASSPATH_PREFIX)) {
+            ClassLoader cld = Py.getSystemState().getClassLoader();
+            if (cld == null) {
+                cld = imp.getParentClassLoader();
+            }
+            URL py_url = cld.getResource(filename.replace(
+                    ClasspathPyImporter.PYCLASSPATH_PREFIX, ""));
+            if (py_url != null) {
+                filename = py_url.getPath();
+            } else {
+                // Should never happen, but let's play it safe and treat this case.
+                throw new RuntimeException(
+                        "\nEncountered too large method code in \n"+filename+",\n"+
+                        "but couldn't resolve that filename within classpath.\n"+
+                        "Make sure the source file is at a proper location.");
+            }
+        }
         String cpython_cmd_msg =
                 "\n\nAlternatively provide proper CPython 2.7 execute command via"+
                 "\ncpython_cmd property, e.g. call "+
@@ -875,7 +895,7 @@
      *
      * Note that this approach is provisional. In future, Jython might contain
      * the bytecode directly as bytecode-objects. The current approach was
-     * feasible with much less complicated JVM bytecode-manipulation, but needs
+     * feasible with far less complicated JVM bytecode-manipulation, but needs
      * special treatment after class-loading.
      */
     private static void insert_code_str_to_classfile(String name, String code_str, Module module)
@@ -935,7 +955,7 @@
                 } else {
                     // If a function needs to be represented as CPython bytecode, we create
                     // all inner PyCode-items (classes, functions, methods) also as CPython
-                    // bytecode implicitly, so no need so look at them individually.
+                    // bytecode implicitly, so no need to look at them individually.
                     // Maybe we can later optimize this such that inner methods can be
                     // JVM-bytecode as well (if not oversized themselves).
                     for (PyObject item: bcode.co_consts) {

--
Repository URL: https://hg.python.org/jython

From jython-checkins at python.org Sun May 21 05:01:59 2017
From: jython-checkins at python.org (jeff.allen)
Date: Sun, 21 May 2017 09:01:59 +0000
Subject: [Jython-checkins] jython: Tweak build.xml to allow for non-ascii paths on Windows.
Message-ID: <20170521090142.111472.E1372D84621627DF@psf.io>

https://hg.python.org/jython/rev/8b6113558573
changeset:   8082:8b6113558573
parent:      8072:b051f30c4cd4
user:        Jeff Allen
date:        Sun Mar 26 08:32:21 2017 +0100
summary:
  Tweak build.xml to allow for non-ascii paths on Windows.

Previously ANTLR (write) and javac (read) were given different
information about the encoding of files, and javac would choke on (the
comments in) ANTLR's output, where the ANTLR source file is mentioned.
This is fixed, and we add an encoding attribute to the javadoc target
for good measure.

files:
  build.xml | 3 +++
  1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/build.xml b/build.xml
--- a/build.xml
+++ b/build.xml
@@ -236,6 +236,7 @@
   output.dir = '${output.dir}'
   compile.dir = '${compile.dir}'
   exposed.dir = '${exposed.dir}'
+  gensrc.dir = '${gensrc.dir}'
   dist.dir = '${dist.dir}'
   apidoc.dir = '${apidoc.dir}'
   templates.dir = '${templates.dir}'
@@ -434,6 +435,7 @@
+
@@ -704,6 +706,7 @@

https://hg.python.org/jython/rev/4ebf44457697
changeset:   8093:4ebf44457697
user:        Jeff Allen
date:        Sun May 21 09:18:19 2017 +0100
summary:
  Update to NEWS concerning (fixes #1839, #2356)

files:
  NEWS | 8 ++++++++
  1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/NEWS b/NEWS
--- a/NEWS
+++ b/NEWS
@@ -4,6 +4,8 @@
 Jython 2.7.1rc1
   Bugs fixed
+    - [ 2356 ] java.lang.IllegalArgumentException on startup on Windows if username not ASCII
+    - [ 1839 ] sys.getfilesystemencoding() is None (now utf-8)
     - [ 2579 ] Pyc files are not loading for too large modules if path contains __pyclasspath__
     - [ 2570 ] Wrong shebang set for OS X installation of
Jython
     - [ 2585 ] test_ssl failure due to Netty exception mapping
@@ -111,6 +113,12 @@
     - Fixed platform.mac_ver to provide actual info on Mac OS similar to CPython behavior.
     - Added uname function to posix module. The mostly Java-based implementation even works to some extend on non-posix systems (e.g. Windows).
+    - There is now support for non-ascii paths in all (home, installation, temporary)
+      directories, which previously caused failures. sys.getplatformencoding() returns
+      'utf-8' as the nominal file-system encoding, irrespective of localization. This may
+      differ from what CPython reports on the same OS. In Jython a file path presented
+      as bytes is the UTF-8 encoding of the unicode file path as Java sees it. (See issues
+      #1839 and #2356.) This matter is unrelated to file.encoding or the console.
 Jython 2.7.1b3
   Bugs fixed

--
Repository URL: https://hg.python.org/jython

From jython-checkins at python.org Sun May 21 05:01:59 2017
From: jython-checkins at python.org (jeff.allen)
Date: Sun, 21 May 2017 09:01:59 +0000
Subject: [Jython-checkins] jython: Fix regressions caused (bugs revealed) by UTF-8 file paths. Fixes #2356.
Message-ID: <20170521090144.71645.AFE1C0A6871E8F2E@psf.io>

https://hg.python.org/jython/rev/147fe05920a4
changeset:   8086:147fe05920a4
user:        Jeff Allen
date:        Sun Apr 30 08:13:12 2017 +0100
summary:
  Fix regressions caused (bugs revealed) by UTF-8 file paths. Fixes #2356.

Wide-ranging change set fixing residual test failures induced by the
uniform adoption of UTF-8 for non-ascii paths when found as a str/bytes.
"ant regrtest" runs without errors.

files:
  Lib/javashell.py | 2 +-
  Lib/sysconfig.py | 6 +
  Lib/test/test_java_visibility.py | 11 +-
  Lib/test/test_jser.py | 4 +-
  Lib/test/test_ssl.py | 8 +-
  Lib/test/test_support.py | 2 +-
  Lib/test/test_zipimport_jy.py | 6 +-
  src/org/python/core/Py.java | 2 +-
  src/org/python/core/PySystemState.java | 28 +++-
  src/org/python/core/__builtin__.java | 8 +-
  src/org/python/core/imp.java | 15 ++-
  src/org/python/core/io/FileIO.java | 10 +-
  src/org/python/modules/_imp.java | 53 +++++++--
  src/org/python/modules/_py_compile.java | 36 ++++--
  src/org/python/modules/zipimport/zipimporter.java | 8 +-
  15 files changed, 140 insertions(+), 59 deletions(-)

diff --git a/Lib/javashell.py b/Lib/javashell.py
--- a/Lib/javashell.py
+++ b/Lib/javashell.py
@@ -55,7 +55,7 @@
         env = self._formatEnvironment( self.environment )
         try:
-            p = Runtime.getRuntime().exec( shellCmd, env, File(os.getcwd()) )
+            p = Runtime.getRuntime().exec( shellCmd, env, File(os.getcwdu()) )
             return p
         except IOException, ex:
             raise OSError(
diff --git a/Lib/sysconfig.py b/Lib/sysconfig.py
--- a/Lib/sysconfig.py
+++ b/Lib/sysconfig.py
@@ -5,6 +5,11 @@
 import os
 from os.path import pardir, realpath
+def fileSystemEncode(path):
+    if isinstance(path, unicode):
+        return path.encode(sys.getfilesystemencoding())
+    return path
+
 _INSTALL_SCHEMES = {
     'posix_prefix': {
         'stdlib': '{base}/lib/python{py_version_short}',
@@ -116,6 +121,7 @@
 def _safe_realpath(path):
     try:
+        path = fileSystemEncode(path)
         return realpath(path)
     except OSError:
         return path
diff --git a/Lib/test/test_java_visibility.py b/Lib/test/test_java_visibility.py
--- a/Lib/test/test_java_visibility.py
+++ b/Lib/test/test_java_visibility.py
@@ -13,6 +13,7 @@
 from org.python.tests.multihidden import BaseConnection
 class VisibilityTest(unittest.TestCase):
+
     def test_invisible(self):
         for item in dir(Invisible):
             self.assert_(not item.startswith("package"))
@@ -178,6 +179,7 @@
 class JavaClassTest(unittest.TestCase):
+
     def test_class_methods_visible(self):
         self.assertFalse(HashMap.isInterface(), 'java.lang.Class methods should be visible on Class instances')
@@ -198,6 +200,7 @@
         self.assertEquals(3, s.b, "Defined fields should take precedence")
 class CoercionTest(unittest.TestCase):
+
     def test_int_coercion(self):
         c = Coercions()
         self.assertEquals("5", c.takeInt(5))
@@ -234,6 +237,7 @@
         self.assertEquals(c.tellClassNameObject(ht), "class java.util.Hashtable")
 class RespectJavaAccessibilityTest(unittest.TestCase):
+
     def run_accessibility_script(self, script, error=AttributeError):
         fn = test_support.findfile(script)
         self.assertRaises(error, execfile, fn)
@@ -254,6 +258,7 @@
         self.run_accessibility_script("call_overridden_method.py")
 class ClassloaderTest(unittest.TestCase):
+
     def test_loading_classes_without_import(self):
         cl = test_support.make_jar_classloader("../callbacker_test.jar")
         X = cl.loadClass("org.python.tests.Callbacker")
@@ -265,11 +270,13 @@
         self.assertEquals(None, called[0])
 def test_main():
-    test_support.run_unittest(VisibilityTest,
+    test_support.run_unittest(
+            VisibilityTest,
             JavaClassTest,
             CoercionTest,
             RespectJavaAccessibilityTest,
-            ClassloaderTest)
+            ClassloaderTest
+            )
 if __name__ == "__main__":
     test_main()
diff --git a/Lib/test/test_jser.py b/Lib/test/test_jser.py
--- a/Lib/test/test_jser.py
+++ b/Lib/test/test_jser.py
@@ -15,7 +15,9 @@
 class JavaSerializationTests(unittest.TestCase):
     def setUp(self):
-        self.sername = os.path.join(sys.prefix, "test.ser")
+        name = os.path.join(sys.prefix, "test.ser")
+        # As we are using java.io directly, ensure file name is a unicode
+        self.sername = name.decode(sys.getfilesystemencoding())
     def tearDown(self):
         os.remove(self.sername)
diff --git a/Lib/test/test_ssl.py b/Lib/test/test_ssl.py
--- a/Lib/test/test_ssl.py
+++ b/Lib/test/test_ssl.py
@@ -27,7 +27,13 @@
 HOST = support.HOST
 def data_file(*name):
-    return os.path.join(os.path.dirname(__file__), *name)
+    file = os.path.join(os.path.dirname(__file__), *name)
+    # Ensure we return unicode path.
This tweak is not a divergence:
+    # CPython 2.7.13 fails the same way for a non-ascii location.
+    if isinstance(file, unicode):
+        return file
+    else:
+        return file.decode(sys.getfilesystemencoding())
 # The custom key and certificate files used in test_ssl are generated
 # using Lib/test/make_ssl_certs.py.
diff --git a/Lib/test/test_support.py b/Lib/test/test_support.py
--- a/Lib/test/test_support.py
+++ b/Lib/test/test_support.py
@@ -509,7 +509,7 @@
 if is_jython:
     # Jython disallows @ in module names
     TESTFN = '$test'
-    TESTFN_UNICODE = "$test-\xe0\xf2"
+    TESTFN_UNICODE = u"$test-\u87d2\u86c7" # = test python (Chinese)
     TESTFN_ENCODING = sys.getfilesystemencoding()
 elif os.name == 'riscos':
     TESTFN = 'testfile'
diff --git a/Lib/test/test_zipimport_jy.py b/Lib/test/test_zipimport_jy.py
--- a/Lib/test/test_zipimport_jy.py
+++ b/Lib/test/test_zipimport_jy.py
@@ -51,8 +51,10 @@
         A(path).somevar = 1
 def test_main():
-    test_support.run_unittest(SyspathZipimportTest)
-    test_support.run_unittest(ZipImporterDictTest)
+    test_support.run_unittest(
+        SyspathZipimportTest,
+        ZipImporterDictTest
+    )
 if __name__ == "__main__":
     test_main()
diff --git a/src/org/python/core/Py.java b/src/org/python/core/Py.java
--- a/src/org/python/core/Py.java
+++ b/src/org/python/core/Py.java
@@ -223,7 +223,7 @@
     }
     public static PyException IOError(Constant errno, String filename) {
-        return new PyException(Py.IOError, Py.fileSystemEncode(filename)); // XXX newStringOrUnicode?
+        return IOError(errno, Py.fileSystemEncode(filename));
     }
     public static PyException IOError(Constant errno, PyObject filename) {
diff --git a/src/org/python/core/PySystemState.java b/src/org/python/core/PySystemState.java
--- a/src/org/python/core/PySystemState.java
+++ b/src/org/python/core/PySystemState.java
@@ -117,8 +117,21 @@
     private static PyObject defaultExecutable; // bytes or unicode or None
     public static Properties registry; // = init_registry();
-    public static PyObject prefix; // bytes or unicode
-    public static PyObject exec_prefix = Py.EmptyString; // bytes or unicode
+    /**
+     * A string giving the site-specific directory prefix where the platform independent Python
+     * files are installed; by default, this is based on the property python.home or
+     * the location of the Jython JAR. The main collection of Python library modules is installed in
+     * the directory prefix/Lib. This object should contain bytes in the file system
+     * encoding for consistency with use in the standard library (see sysconfig.py).
+     */
+    public static PyObject prefix;
+    /**
+     * A string giving the site-specific directory prefix where the platform-dependent Python files
+     * are installed; by default, this is the same as {@link #exec_prefix}. This object should
+     * contain bytes in the file system encoding for consistency with use in the standard library
+     * (see sysconfig.py).
+     */
+    public static PyObject exec_prefix = Py.EmptyString;
     public static final PyString byteorder = new PyString("big");
     public static final int maxint = Integer.MAX_VALUE;
@@ -843,10 +856,10 @@
             }
         }
         if (prefix != null) {
-            PySystemState.prefix = Py.newStringOrUnicode(prefix);
+            PySystemState.prefix = Py.fileSystemEncode(prefix);
         }
         if (exec_prefix != null) {
-            PySystemState.exec_prefix = Py.newStringOrUnicode(exec_prefix);
+            PySystemState.exec_prefix = Py.fileSystemEncode(exec_prefix);
         }
         try {
             String jythonpath = System.getenv("JYTHONPATH");
@@ -1158,7 +1171,8 @@
         }
         cachedir = new File(props.getProperty(PYTHON_CACHEDIR, CACHEDIR_DEFAULT_NAME));
         if (!cachedir.isAbsolute()) {
-            cachedir = new File(prefix == null ? null : prefix.toString(), cachedir.getPath());
+            String prefixString = prefix == null ? null : Py.fileSystemDecode(prefix);
+            cachedir = new File(prefixString, cachedir.getPath());
         }
     }
@@ -1357,7 +1371,7 @@
         addPaths(path, props.getProperty("python.path", ""));
         if (prefix != null) {
             String libpath = new File(Py.fileSystemDecode(prefix), "Lib").toString();
-            path.append(Py.fileSystemEncode(libpath)); // XXX or newStringOrUnicode or newUnicode?
+            path.append(Py.fileSystemEncode(libpath)); // XXX or newUnicode?
         }
         if (standalone) {
             // standalone jython: add the /Lib directory inside JYTHON_JAR to the path
@@ -1401,7 +1415,7 @@
         StringTokenizer tok = new StringTokenizer(pypath, java.io.File.pathSeparator);
         while (tok.hasMoreTokens()) {
             // Use unicode object if necessary to represent the element
-            path.append(Py.newStringOrUnicode(tok.nextToken().trim()));
+            path.append(Py.newStringOrUnicode(tok.nextToken().trim())); // XXX or newUnicode?
         }
     }
diff --git a/src/org/python/core/__builtin__.java b/src/org/python/core/__builtin__.java
--- a/src/org/python/core/__builtin__.java
+++ b/src/org/python/core/__builtin__.java
@@ -85,7 +85,7 @@
             case 18:
                 return __builtin__.eval(arg1);
             case 19:
-                __builtin__.execfile(arg1.asString());
+                __builtin__.execfile(Py.fileSystemDecode(arg1));
                 return Py.None;
             case 23:
                 return __builtin__.hex(arg1);
@@ -141,7 +141,7 @@
             case 18:
                 return __builtin__.eval(arg1, arg2);
             case 19:
-                __builtin__.execfile(arg1.asString(), arg2);
+                __builtin__.execfile(Py.fileSystemDecode(arg1), arg2);
                 return Py.None;
             case 20:
                 return __builtin__.filter(arg1, arg2);
@@ -191,7 +191,7 @@
             case 18:
                 return __builtin__.eval(arg1, arg2, arg3);
             case 19:
-                __builtin__.execfile(arg1.asString(), arg2, arg3);
+                __builtin__.execfile(Py.fileSystemDecode(arg1), arg2, arg3);
                 return Py.None;
             case 21:
                 return __builtin__.getattr(arg1, arg2, arg3);
@@ -1629,7 +1629,7 @@
                 "dont_inherit"}, 3);
         PyObject source = ap.getPyObject(0);
-        String filename = ap.getString(1);
+        String filename = Py.fileSystemDecode(ap.getPyObject(1));
         String mode = ap.getString(2);
         int flags = ap.getInt(3, 0);
         boolean dont_inherit = ap.getPyObject(4, Py.False).__nonzero__();
diff --git a/src/org/python/core/imp.java b/src/org/python/core/imp.java
--- a/src/org/python/core/imp.java
+++ b/src/org/python/core/imp.java
@@ -294,6 +294,7 @@
         return compileSource(name, makeStream(file), sourceFilename, mtime);
     }
+    /** Remove the last three characters of a file name and add the compiled suffix "$py.class". */
     public static String makeCompiledFilename(String filename) {
         return filename.substring(0, filename.length() - 3) + "$py.class";
     }
@@ -639,7 +640,7 @@
             compiledFile = new File(dirName, compiledName);
         } else {
             PyModule m = addModule(modName);
-            PyObject filename = Py.newStringOrUnicode(new File(displayDirName, name).getPath()); // XXX fileSystemEncode?
+            PyObject filename = Py.newStringOrUnicode(new File(displayDirName, name).getPath());
             m.__dict__.__setitem__("__path__", new PyList(new PyObject[] {filename}));
         }
@@ -927,9 +928,6 @@
                 }
             }
         }
-        if (name.indexOf(File.separatorChar) != -1) {
-            throw Py.ImportError("Import by filename is not supported.");
-        }
         PyObject modules = Py.getSystemState().modules;
         PyObject pkgMod = null;
         String pkgName = null;
@@ -973,6 +971,13 @@
         return mod;
     }
+    /** Defend against attempt to import by filename (withdrawn feature). */
+    private static void checkNotFile(String name){
+        if (name.indexOf(File.separatorChar) != -1) {
+            throw Py.ImportError("Import by filename is not supported.");
+        }
+    }
+
     private static void ensureFromList(PyObject mod, PyObject fromlist, String name) {
         ensureFromList(mod, fromlist, name, false);
     }
@@ -1015,6 +1020,7 @@
      * @return an imported module (Java or Python)
      */
     public static PyObject importName(String name, boolean top) {
+        checkNotFile(name);
         PyUnicode.checkEncoding(name);
         ReentrantLock importLock = Py.getSystemState().getImportLock();
         importLock.lock();
@@ -1035,6 +1041,7 @@
      */
     public static PyObject importName(String name, boolean top, PyObject modDict,
             PyObject fromlist, int level) {
+        checkNotFile(name);
         PyUnicode.checkEncoding(name);
         ReentrantLock importLock = Py.getSystemState().getImportLock();
         importLock.lock();
diff --git a/src/org/python/core/io/FileIO.java b/src/org/python/core/io/FileIO.java
--- a/src/org/python/core/io/FileIO.java
+++ b/src/org/python/core/io/FileIO.java
@@ -64,10 +64,10 @@
     private boolean emulateAppend;
     /**
-     * @see #FileIO(String name, String mode)
+     * @see #FileIO(PyString name, String mode)
      */
-    public FileIO(PyString name, String mode) {
-        this(Py.fileSystemDecode(name), mode);
+    public FileIO(String name, String mode) {
+        this(Py.newUnicode(name), mode);
     }
     /**
@@ -80,9 +80,9 @@
      * @param name the name of the file
      * @param mode a raw io file mode String
      */
-    public FileIO(String name, String mode) {
+    public
FileIO(PyString name, String mode) {
         parseMode(mode);
-        File absPath = new RelativeFile(name);
+        File absPath = new RelativeFile(Py.fileSystemDecode(name));
         try {
             if ((appending && !(reading || plus)) || (writing && !reading && !plus)) {
diff --git a/src/org/python/modules/_imp.java b/src/org/python/modules/_imp.java
--- a/src/org/python/modules/_imp.java
+++ b/src/org/python/modules/_imp.java
@@ -75,7 +75,7 @@
     static ModuleInfo findFromSource(String name, String entry, boolean findingPackage,
                                      boolean preferSource) {
         String sourceName = "__init__.py";
-        String compiledName = makeCompiledFilename(sourceName);
+        String compiledName = imp.makeCompiledFilename(sourceName);
         String directoryName = PySystemState.getPathLazy(entry);
         // displayDirName is for identification purposes: when null it
         // forces java.io.File to be a relative path (e.g. foo/bar.py
@@ -97,7 +97,7 @@
         } else {
             Py.writeDebug("import", "trying source " + dir.getPath());
             sourceName = name + ".py";
-            compiledName = makeCompiledFilename(sourceName);
+            compiledName = imp.makeCompiledFilename(sourceName);
             sourceFile = new File(directoryName, sourceName);
             compiledFile = new File(directoryName, compiledName);
         }
@@ -152,8 +152,7 @@
             throw Py.TypeError("must be a file-like object");
         }
         PySystemState sys = Py.getSystemState();
-        String compiledFilename =
-            makeCompiledFilename(sys.getPath(filename));
+        String compiledFilename = imp.makeCompiledFilename(sys.getPath(filename));
         mod = imp.createFromSource(modname.intern(), (InputStream)o,
                                    filename, compiledFilename);
         PyObject modules = sys.modules;
@@ -161,15 +160,38 @@
         return mod;
     }
-    public static PyObject load_compiled(String name, String pathname) {
-        return load_compiled(name, pathname, new PyFile(pathname, "rb", -1));
-    }
-
     public static PyObject reload(PyObject module) {
         return __builtin__.reload(module);
     }
-    public static PyObject load_compiled(String name, String pathname, PyObject file) {
+    /**
+     * Return a module with the given name, the result of executing
the compiled code
+     * at the given pathname. If this path is a PyUnicode, it is used
+     * exactly; if it is a PyString it is taken to be file-system encoded.
+     *
+     * @param name the module name
+     * @param pathname to the compiled module (becomes __file__)
+     * @return the module called name
+     */
+    public static PyObject load_compiled(String name, PyString pathname) {
+        String _pathname = Py.fileSystemDecode(pathname);
+        return _load_compiled(name, _pathname, new PyFile(_pathname, "rb", -1));
+    }
+
+    /**
+     * Return a module with the given name, the result of executing the compiled code
+     * in the given file stream.
+     *
+     * @param name the module name
+     * @param pathname a file path that is not null (becomes __file__)
+     * @param file stream from which the compiled code is taken
+     * @return the module called name
+     */
+    public static PyObject load_compiled(String name, PyString pathname, PyObject file) {
+        return _load_compiled(name, Py.fileSystemDecode(pathname), file);
+    }
+
+    private static PyObject _load_compiled(String name, String pathname, PyObject file) {
         InputStream stream = (InputStream) file.__tojava__(InputStream.class);
         if (stream == Py.NoConversion) {
             throw Py.TypeError("must be a file-like object");
@@ -230,7 +252,7 @@
         // XXX: This should load the accompanying byte code file instead, if it exists
         String resolvedFilename = sys.getPath(filenameString);
-        compiledName = makeCompiledFilename(resolvedFilename);
+        compiledName = imp.makeCompiledFilename(resolvedFilename);
         if (name.endsWith(".__init__")) {
             name = name.substring(0, name.length() - ".__init__".length());
         } else if (name.equals("__init__")) {
@@ -247,7 +269,7 @@
                         filenameString, compiledName, mtime);
                 break;
             case PY_COMPILED:
-                mod = load_compiled(name, filenameString, file);
+                mod = _load_compiled(name, filenameString, file);
                 break;
             case PKG_DIRECTORY:
                 PyModule m = imp.addModule(name);
@@ -268,8 +290,13 @@
         return mod;
     }
-    public static String makeCompiledFilename(String filename) {
-        return
imp.makeCompiledFilename(filename);
+    /**
+     * Variant of {@link imp#makeCompiledFilename(String)} dealing with encoded bytes. In the context
+     * where this is used from Python, a result in encoded bytes is preferable.
+     */
+    public static PyString makeCompiledFilename(PyString filename) {
+        filename = Py.fileSystemEncode(filename);
+        return Py.newString(imp.makeCompiledFilename(filename.getString()));
     }
     public static PyObject get_magic() {
diff --git a/src/org/python/modules/_py_compile.java b/src/org/python/modules/_py_compile.java
--- a/src/org/python/modules/_py_compile.java
+++ b/src/org/python/modules/_py_compile.java
@@ -12,22 +12,30 @@
 public class _py_compile {
     public static PyList __all__ = new PyList(new PyString[] { new PyString("compile") });
-    public static boolean compile(String filename, String cfile, String dfile) {
-        // Resolve relative path names. dfile is only used for error messages and should not be
-        // resolved
+    /**
+     * Java wrapper on the module compiler in support of of py_compile.compile. Filenames here will
+     * be interpreted as Unicode if they are PyUnicode, and as byte-encoded names if they only
+     * PyString.
+ * + * @param fileName actual source file name + * @param compiledName compiled filename + * @param displayName displayed source filename, only used for error messages (and not resolved) + * @return true if successful + */ + public static boolean compile(PyString fileName, PyString compiledName, PyString displayName) { + // Resolve source path and check it exists PySystemState sys = Py.getSystemState(); - filename = sys.getPath(filename); - cfile = sys.getPath(cfile); + String file = sys.getPath(Py.fileSystemDecode(fileName)); + File f = new File(file); + if (!f.exists()) { + throw Py.IOError(Errno.ENOENT, file); + } - File file = new File(filename); - if (!file.exists()) { - throw Py.IOError(Errno.ENOENT, Py.newString(filename)); - } - String name = getModuleName(file); - - byte[] bytes = org.python.core.imp.compileSource(name, file, dfile, cfile); - org.python.core.imp.cacheCompiledSource(filename, cfile, bytes); - + // Convert file in which to put the byte code and display name (each may be null) + String c = (compiledName == null) ? null : sys.getPath(Py.fileSystemDecode(compiledName)); + String d = (displayName == null) ? 
null : Py.fileSystemDecode(displayName); + byte[] bytes = org.python.core.imp.compileSource(getModuleName(f), f, d, c); + org.python.core.imp.cacheCompiledSource(file, c, bytes); return bytes.length > 0; } diff --git a/src/org/python/modules/zipimport/zipimporter.java b/src/org/python/modules/zipimport/zipimporter.java --- a/src/org/python/modules/zipimport/zipimporter.java +++ b/src/org/python/modules/zipimport/zipimporter.java @@ -20,6 +20,7 @@ import org.python.core.PySystemState; import org.python.core.PyTuple; import org.python.core.PyType; +import org.python.core.PyUnicode; import org.python.core.Traverseproc; import org.python.core.Visitproc; import org.python.core.util.FileUtil; @@ -80,7 +81,7 @@ @ExposedMethod final void zipimporter___init__(PyObject[] args, String[] kwds) { ArgParser ap = new ArgParser("__init__", args, kwds, new String[] {"path"}); - String path = ap.getString(0); + String path = Py.fileSystemDecode(ap.getPyObject(0)); zipimporter___init__(path); } @@ -113,10 +114,11 @@ pathFile = parentFile; } if (archive != null) { - files = zipimport._zip_directory_cache.__finditem__(archive); + PyUnicode archivePath = Py.newUnicode(archive); + files = zipimport._zip_directory_cache.__finditem__(archivePath); if (files == null) { files = readDirectory(archive); - zipimport._zip_directory_cache.__setitem__(archive, files); + zipimport._zip_directory_cache.__setitem__(archivePath, files); } } else { throw zipimport.ZipImportError("not a Zip file: " + path); -- Repository URL: https://hg.python.org/jython From jython-checkins at python.org Sun May 21 05:01:59 2017 From: jython-checkins at python.org (jeff.allen) Date: Sun, 21 May 2017 09:01:59 +0000 Subject: [Jython-checkins] =?utf-8?q?jython=3A_Handle_paths_used_in_impor?= =?utf-8?q?t_and_exception_text_as_String_or_Unicode=2E?= Message-ID: <20170521090144.71025.F7342049B9474ED0@psf.io> https://hg.python.org/jython/rev/3a01ed2a8554 changeset: 8085:3a01ed2a8554 user: Jeff Allen date: Wed Apr 26 22:26:16 
2017 +0100 summary: Handle paths used in import and exception text as String or Unicode. This change set is part of a sequence necessary to handle non-ascii file and path names correctly. See notes to issue #2356. PyString file paths appearing during import are handled consistently as UTF-8. Exception processing in Java allows for non-ascii paths appearing in the traceback, by handling everything as Java String, although it is not possible to expose this as unicode because the standard library too often needs bytes. files: Lib/test/test_exceptions.py | 3 - Lib/test/test_exceptions_jy.py | 5 +- Lib/test/test_httpservers.py | 3 + src/org/python/core/Py.java | 169 ++++++--- src/org/python/core/PyBaseException.java | 17 +- src/org/python/core/PyException.java | 25 +- src/org/python/core/PyString.java | 6 +- src/org/python/core/PyUnicode.java | 4 +- src/org/python/core/StdoutWrapper.java | 3 +- src/org/python/core/SyspathArchive.java | 2 +- src/org/python/core/SyspathJavaLoader.java | 55 +- src/org/python/core/packagecache/PathPackageManager.java | 14 +- src/org/python/util/jython.java | 4 +- 13 files changed, 188 insertions(+), 122 deletions(-) diff --git a/Lib/test/test_exceptions.py b/Lib/test/test_exceptions.py --- a/Lib/test/test_exceptions.py +++ b/Lib/test/test_exceptions.py @@ -524,7 +524,6 @@ self.check_same_msg(Exception(), '') - @unittest.skipIf(is_jython, "FIXME: not working in Jython") def test_0_args_with_overridden___str__(self): """Check same msg for exceptions with 0 args and overridden __str__""" # str() and unicode() on an exception with overridden __str__ that @@ -550,7 +549,6 @@ self.assertRaises(UnicodeEncodeError, str, e) self.assertEqual(unicode(e), u'f\xf6\xf6') - @unittest.skipIf(is_jython, "FIXME: not working in Jython") def test_1_arg_with_overridden___str__(self): """Check same msg for exceptions with overridden __str__ and 1 arg""" # when __str__ is overridden and __unicode__ is not implemented @@ -575,7 +573,6 @@ for args in argslist: 
self.check_same_msg(Exception(*args), repr(args)) - @unittest.skipIf(is_jython, "FIXME: not working in Jython") def test_many_args_with_overridden___str__(self): """Check same msg for exceptions with overridden __str__ and many args""" # if __str__ returns an ascii string / ascii unicode string diff --git a/Lib/test/test_exceptions_jy.py b/Lib/test/test_exceptions_jy.py --- a/Lib/test/test_exceptions_jy.py +++ b/Lib/test/test_exceptions_jy.py @@ -70,11 +70,12 @@ # But the exception hook, via Py#displayException, does not fail when attempting to __str__ the exception args with test_support.captured_stderr() as s: sys.excepthook(RuntimeError, u"Drink \u2615", None) - self.assertEqual(s.getvalue(), "RuntimeError\n") + # At minimum, it tells us what kind of exception it was + self.assertEqual(s.getvalue()[:12], "RuntimeError") # It is fine with ascii values, of course with test_support.captured_stderr() as s: sys.excepthook(RuntimeError, u"Drink java", None) - self.assertEqual(s.getvalue(), "RuntimeError: Drink java\n") + self.assertEqual(s.getvalue(), "RuntimeError: Drink java\n") def test_main(): diff --git a/Lib/test/test_httpservers.py b/Lib/test/test_httpservers.py --- a/Lib/test/test_httpservers.py +++ b/Lib/test/test_httpservers.py @@ -378,6 +378,9 @@ @unittest.skipIf(hasattr(os, 'geteuid') and os.geteuid() == 0, "This test can't be run reliably as root (issue #13308).") +@unittest.skipIf((not hasattr(os, 'symlink')) and + sys.executable.encode('ascii', 'replace') != sys.executable, + "Executable path is not pure ASCII.") # these fail for CPython too class CGIHTTPServerTestCase(BaseTestCase): class request_handler(NoLogRequestHandler, CGIHTTPRequestHandler): pass diff --git a/src/org/python/core/Py.java b/src/org/python/core/Py.java --- a/src/org/python/core/Py.java +++ b/src/org/python/core/Py.java @@ -2,6 +2,7 @@ package org.python.core; import java.io.ByteArrayOutputStream; +import java.io.CharArrayWriter; import java.io.File; import
java.io.FileDescriptor; import java.io.FileNotFoundException; @@ -10,7 +11,7 @@ import java.io.InputStream; import java.io.ObjectStreamException; import java.io.OutputStream; -import java.io.PrintStream; +import java.io.PrintWriter; import java.io.Serializable; import java.io.StreamCorruptedException; import java.lang.reflect.InvocationTargetException; @@ -25,7 +26,14 @@ import java.util.List; import java.util.Set; +import org.python.antlr.base.mod; +import org.python.core.adapter.ClassicPyObjectAdapter; +import org.python.core.adapter.ExtensiblePyObjectAdapter; +import org.python.modules.posix.PosixModule; +import org.python.util.Generic; + import com.google.common.base.CharMatcher; + import jline.console.UserInterruptException; import jnr.constants.Constant; import jnr.constants.platform.Errno; @@ -33,14 +41,6 @@ import jnr.posix.POSIXFactory; import jnr.posix.util.Platform; -import org.python.antlr.base.mod; -import org.python.core.adapter.ClassicPyObjectAdapter; -import org.python.core.adapter.ExtensiblePyObjectAdapter; -import org.python.core.Traverseproc; -import org.python.core.Visitproc; -import org.python.modules.posix.PosixModule; -import org.python.util.Generic; - /** Builtin types that are used to setup PyObject. * * Resolve circular dependency with some laziness. */ @@ -130,7 +130,6 @@ public final static long TPFLAGS_IS_ABSTRACT = 1L << 20; - /** A unique object to indicate no conversion is possible in __tojava__ methods **/ public final static Object NoConversion = new PySingleton("Error"); @@ -1175,11 +1174,11 @@ } Py.getSystemState().callExitFunc(); } - //XXX: this needs review to make sure we are cutting out all of the Java - // exceptions. + + //XXX: this needs review to make sure we are cutting out all of the Java exceptions. 
private static String getStackTrace(Throwable javaError) { - ByteArrayOutputStream buf = new ByteArrayOutputStream(); - javaError.printStackTrace(new PrintStream(buf)); + CharArrayWriter buf = new CharArrayWriter(); + javaError.printStackTrace(new PrintWriter(buf)); String str = buf.toString(); int index = -1; @@ -1272,31 +1271,55 @@ ts.exception = null; } - public static void displayException(PyObject type, PyObject value, PyObject tb, - PyObject file) { + /** + * Print the description of an exception as a big string. The arguments are closely equivalent + * to the tuple returned by Python sys.exc_info, on standard error or a given + * byte-oriented file. Compare with Python traceback.print_exception. + * + * @param type of exception + * @param value the exception parameter (second argument to raise) + * @param tb traceback of the call stack where the exception originally occurred + * @param file to print encoded string to, or null meaning standard error + */ + public static void displayException(PyObject type, PyObject value, PyObject tb, PyObject file) { + + // Output is to standard error, unless a file object has been given. StdoutWrapper stderr = Py.stderr; if (file != null) { stderr = new FixedFileWrapper(file); } flushLine(); + // The creation of the report operates entirely in Java String (to support Unicode). + String formattedException = exceptionToString(type, value, tb); + stderr.print(formattedException); + } + + /** + * Format the description of an exception as a big string. The arguments are closely equivalent + * to the tuple returned by Python sys.exc_info. Compare with Python + * traceback.format_exception. 
+ * + * @param type of exception + * @param value the exception parameter (second argument to raise) + * @param tb traceback of the call stack where the exception originally occurred + * @return string representation of the traceback and exception + */ + static String exceptionToString(PyObject type, PyObject value, PyObject tb) { + + // Compose the stack dump, syntax error, and actual exception in this buffer: + StringBuilder buf; + if (tb instanceof PyTraceback) { - stderr.print(((PyTraceback) tb).dumpStack()); + buf = new StringBuilder(((PyTraceback)tb).dumpStack()); + } else { + buf = new StringBuilder(); } + if (__builtin__.isinstance(value, Py.SyntaxError)) { - PyObject filename = value.__findattr__("filename"); - PyObject text = value.__findattr__("text"); - PyObject lineno = value.__findattr__("lineno"); - stderr.print(" File \""); - stderr.print(filename == Py.None || filename == null ? - "" : filename.toString()); - stderr.print("\", line "); - stderr.print(lineno == null ? Py.newString("0") : lineno); - stderr.print("\n"); - if (text != Py.None && text != null && text.__len__() != 0) { - printSyntaxErrorText(stderr, value.__findattr__("offset").asInt(), - text.toString()); - } + // The value part of the exception is a syntax error: first emit that. + appendSyntaxError(buf, value); + // Now supersede it with just the syntax error message for the next phase. 
value = value.__findattr__("msg"); if (value == null) { value = Py.None; @@ -1305,26 +1328,53 @@ if (value.getJavaProxy() != null) { Object javaError = value.__tojava__(Throwable.class); - if (javaError != null && javaError != Py.NoConversion) { - stderr.println(getStackTrace((Throwable) javaError)); + // The value is some Java Throwable: append that too + buf.append(getStackTrace((Throwable)javaError)); } } + + // Be prepared for formatting the value part to fail (fall back to just the type) try { - stderr.println(formatException(type, value)); + buf.append(formatException(type, value)); } catch (Exception ex) { - stderr.println(formatException(type, Py.None)); + buf.append(formatException(type, Py.None)); + } + buf.append('\n'); + + return buf.toString(); + } + + /** + * Helper to {@link #tracebackToString(PyObject, PyObject)} when the value in an exception turns + * out to be a syntax error. + */ + private static void appendSyntaxError(StringBuilder buf, PyObject value) { + + PyObject filename = value.__findattr__("filename"); + PyObject text = value.__findattr__("text"); + PyObject lineno = value.__findattr__("lineno"); + + buf.append(" File \""); + buf.append(filename == Py.None || filename == null ? "" : filename.toString()); + buf.append("\", line "); + buf.append(lineno == null ? Py.newString('0') : lineno); + buf.append('\n'); + + if (text != Py.None && text != null && text.__len__() != 0) { + appendSyntaxErrorText(buf, value.__findattr__("offset").asInt(), text.toString()); } } + /** - * Print the two lines showing where a SyntaxError was caused. + * Generate two lines showing where a SyntaxError was caused. 
* - * @param out StdoutWrapper to print to + * @param buf to append with generated message text * @param offset the offset into text - * @param text a source code String line + * @param text a source code line */ - private static void printSyntaxErrorText(StdoutWrapper out, int offset, String text) { + private static void appendSyntaxErrorText(StringBuilder buf, int offset, String text) { if (offset >= 0) { if (offset > 0 && offset == text.length()) { offset--; @@ -1352,19 +1402,21 @@ text = text.substring(i, text.length()); } - out.print(" "); - out.print(text); + buf.append(" "); + buf.append(text); if (text.length() == 0 || !text.endsWith("\n")) { - out.print("\n"); + buf.append('\n'); } if (offset == -1) { return; } - out.print(" "); + + // The indicator line " ^" + buf.append(" "); for (offset--; offset > 0; offset--) { - out.print(" "); + buf.append(' '); } - out.print("^\n"); + buf.append("^\n"); } public static String formatException(PyObject type, PyObject value) { @@ -1384,7 +1436,7 @@ if (moduleName == null) { buf.append(""); } else { - String moduleStr = Py.fileSystemDecode(moduleName); + String moduleStr = moduleName.toString(); if (!moduleStr.equals("exceptions")) { buf.append(moduleStr); buf.append("."); @@ -1392,19 +1444,34 @@ } buf.append(className); } else { - buf.append(useRepr ? type.__repr__() : type.__str__()); + // Never happens since Python 2.7? Do something sensible anyway. + buf.append(asMessageString(type, useRepr)); } + if (value != null && value != Py.None) { - // only print colon if the str() of the object is not the empty string - PyObject s = useRepr ? 
value.__repr__() : value; - if (!(s instanceof PyString) || s.__len__() != 0) { - buf.append(": "); + String s = asMessageString(value, useRepr); + // Print colon and object (unless it renders as "") + if (s.length() > 0) { + buf.append(": ").append(s); } - buf.append(s); } + return buf.toString(); } + /** Defensive method to avoid exceptions from decoding (or import encodings) */ + private static String asMessageString(PyObject value, boolean useRepr) { + if (useRepr) + value = value.__repr__(); + if (value instanceof PyUnicode) { + return value.asString(); + } else { + // Carefully avoid decoding errors that would swallow the intended message + String s = value.__str__().getString(); + return PyString.encode_UnicodeEscape(s, false); + } + } + public static void writeUnraisable(Throwable unraisable, PyObject obj) { PyException pye = JavaError(unraisable); stderr.println(String.format("Exception %s in %s ignored", diff --git a/src/org/python/core/PyBaseException.java b/src/org/python/core/PyBaseException.java --- a/src/org/python/core/PyBaseException.java +++ b/src/org/python/core/PyBaseException.java @@ -169,12 +169,17 @@ @ExposedMethod(doc = BuiltinDocs.BaseException___str___doc) final PyString BaseException___str__() { switch (args.__len__()) { - case 0: - return Py.EmptyString; - case 1: - return args.__getitem__(0).__str__(); - default: - return args.__str__(); + case 0: + return Py.EmptyString; + case 1: + PyObject arg = args.__getitem__(0); + if (arg instanceof PyString) { + return (PyString)arg; + } else { + return arg.__str__(); + } + default: + return args.__str__(); } } diff --git a/src/org/python/core/PyException.java b/src/org/python/core/PyException.java --- a/src/org/python/core/PyException.java +++ b/src/org/python/core/PyException.java @@ -62,21 +62,31 @@ } private boolean printingStackTrace = false; + @Override public void printStackTrace() { Py.printException(this); } + @Override public Throwable fillInStackTrace() { return 
Options.includeJavaStackInExceptions ? super.fillInStackTrace() : this; } + @Override public synchronized void printStackTrace(PrintStream s) { if (printingStackTrace) { super.printStackTrace(s); } else { try { + /* + * Ensure that non-ascii characters are made printable. IOne would prefer to emit + * Unicode, but the output stream too often only accepts bytes. (s is not + * necessarily a console, e.g. during a doctest.) + */ + PyFile err = new PyFile(s); + err.setEncoding("ascii", "backslashreplace"); printingStackTrace = true; - Py.displayException(type, value, traceback, new PyFile(s)); + Py.displayException(type, value, traceback, err); } finally { printingStackTrace = false; } @@ -92,12 +102,9 @@ } } + @Override public synchronized String toString() { - ByteArrayOutputStream buf = new ByteArrayOutputStream(); - if (!printingStackTrace) { - printStackTrace(new PrintStream(buf)); - } - return buf.toString(); + return Py.exceptionToString(type, value, traceback); } /** @@ -332,10 +339,11 @@ public static String exceptionClassName(PyObject obj) { return obj instanceof PyClass ? ((PyClass)obj).__name__ : ((PyType)obj).fastGetName(); } - - + + /* Traverseproc support */ + @Override public int traverse(Visitproc visit, Object arg) { int retValue; if (type != null) { @@ -357,6 +365,7 @@ return 0; } + @Override public boolean refersDirectlyTo(PyObject ob) { return ob != null && (type == ob || value == ob || traceback == ob); } diff --git a/src/org/python/core/PyString.java b/src/org/python/core/PyString.java --- a/src/org/python/core/PyString.java +++ b/src/org/python/core/PyString.java @@ -79,7 +79,7 @@ } PyString(StringBuilder buffer) { - this(TYPE, new String(buffer)); + this(TYPE, buffer.toString()); } /** @@ -3998,9 +3998,9 @@ * Implements PEP-3101 {}-formatting methods str.format() and * unicode.format(). When called with enclosingIterator == null, this * method takes this object as its formatting string. 
The method is also called (calls itself) - * to deal with nested formatting sepecifications. In that case, enclosingIterator + * to deal with nested formatting specifications. In that case, enclosingIterator * is a {@link MarkupIterator} on this object and value is a substring of this - * object needing recursive transaltion. + * object needing recursive translation. * * @param args to be interpolated into the string * @param keywords for the trailing args diff --git a/src/org/python/core/PyUnicode.java b/src/org/python/core/PyUnicode.java --- a/src/org/python/core/PyUnicode.java +++ b/src/org/python/core/PyUnicode.java @@ -89,7 +89,7 @@ } PyUnicode(StringBuilder buffer) { - this(TYPE, new String(buffer)); + this(TYPE, buffer.toString()); } private static StringBuilder fromCodePoints(Iterator iter) { @@ -713,7 +713,7 @@ for (Iterator iter = newSubsequenceIterator(start, stop, step); iter.hasNext();) { buffer.appendCodePoint(iter.next()); } - return createInstance(new String(buffer)); + return createInstance(buffer.toString()); } @ExposedMethod(type = MethodType.CMP, doc = BuiltinDocs.unicode___getslice___doc) diff --git a/src/org/python/core/StdoutWrapper.java b/src/org/python/core/StdoutWrapper.java --- a/src/org/python/core/StdoutWrapper.java +++ b/src/org/python/core/StdoutWrapper.java @@ -105,8 +105,7 @@ String s; if (o instanceof PyUnicode) { // Use the encoding and policy defined for the stream. (Each may be null.) - s = ((PyUnicode)o).encode(file.encoding, "replace"); //FIXME: back to ... 
- // s = ((PyUnicode)o).encode(file.encoding, file.errors); + s = ((PyUnicode)o).encode(file.encoding, file.errors); } else { s = o.__str__().toString(); } diff --git a/src/org/python/core/SyspathArchive.java b/src/org/python/core/SyspathArchive.java --- a/src/org/python/core/SyspathArchive.java +++ b/src/org/python/core/SyspathArchive.java @@ -4,7 +4,7 @@ import java.util.zip.*; @Untraversable -public class SyspathArchive extends PyString { +public class SyspathArchive extends PyUnicode { private ZipFile zipFile; public SyspathArchive(String archiveName) throws IOException { diff --git a/src/org/python/core/SyspathJavaLoader.java b/src/org/python/core/SyspathJavaLoader.java --- a/src/org/python/core/SyspathJavaLoader.java +++ b/src/org/python/core/SyspathJavaLoader.java @@ -26,20 +26,20 @@ public SyspathJavaLoader(ClassLoader parent) { super(parent); } - - /** + + /** * Returns a byte[] with the contents read from an InputStream. - * + * * The stream is closed after reading the bytes. - * - * @param input The input stream + * + * @param input The input stream * @param size The number of bytes to read - * + * * @return an array of byte[size] with the contents read * */ private byte[] getBytesFromInputStream(InputStream input, int size) { - try { + try { byte[] buffer = new byte[size]; int nread = 0; while(nread < size) { @@ -56,9 +56,9 @@ } } } - + private byte[] getBytesFromDir(String dir, String name) { - try { + try { File file = getFile(dir, name); if (file == null) { return null; @@ -71,7 +71,7 @@ } } - + private byte[] getBytesFromArchive(SyspathArchive archive, String name) { String entryname = name.replace('.', SLASH_CHAR) + ".class"; ZipEntry ze = archive.getEntry(entryname); @@ -79,7 +79,7 @@ return null; } try { - return getBytesFromInputStream(archive.getInputStream(ze), + return getBytesFromInputStream(archive.getInputStream(ze), (int)ze.getSize()); } catch (IOException e) { return null; @@ -98,11 +98,11 @@ } return pkg; } - + @Override protected Class 
findClass(String name) throws ClassNotFoundException { PySystemState sys = Py.getSystemState(); - ClassLoader sysClassLoader = sys.getClassLoader(); + ClassLoader sysClassLoader = sys.getClassLoader(); if (sysClassLoader != null) { // sys.classLoader overrides this class loader! return sysClassLoader.loadClass(name); @@ -114,13 +114,10 @@ PyObject entry = replacePathItem(sys, i, path); if (entry instanceof SyspathArchive) { SyspathArchive archive = (SyspathArchive)entry; - buffer = getBytesFromArchive(archive, name); + buffer = getBytesFromArchive(archive, name); } else { - if (!(entry instanceof PyUnicode)) { - entry = entry.__str__(); - } - String dir = entry.toString(); - buffer = getBytesFromDir(dir, name); + String dir = Py.fileSystemDecode(entry); + buffer = getBytesFromDir(dir, name); } if (buffer != null) { definePackageForClass(name); @@ -130,7 +127,7 @@ // couldn't find the .class file on sys.path throw new ClassNotFoundException(name); } - + @Override protected URL findResource(String res) { PySystemState sys = Py.getSystemState(); @@ -157,10 +154,7 @@ } continue; } - if (!(entry instanceof PyUnicode)) { - entry = entry.__str__(); - } - String dir = sys.getPath(entry.toString()); + String dir = sys.getPath(Py.fileSystemDecode(entry)); try { File resource = new File(dir, res); if (!resource.exists()) { @@ -179,7 +173,7 @@ throws IOException { List resources = new ArrayList(); - + PySystemState sys = Py.getSystemState(); res = deslashResource(res); @@ -204,10 +198,7 @@ } continue; } - if (!(entry instanceof PyUnicode)) { - entry = entry.__str__(); - } - String dir = sys.getPath(entry.toString()); + String dir = sys.getPath(Py.fileSystemDecode(entry)); try { File resource = new File(dir, res); if (!resource.exists()) { @@ -220,7 +211,7 @@ } return Collections.enumeration(resources); } - + static PyObject replacePathItem(PySystemState sys, int idx, PyList paths) { PyObject path = paths.__getitem__(idx); if (path instanceof SyspathArchive) { @@ -229,9 +220,9 
@@ } try { - // this has the side affect of adding the jar to the PackageManager during the + // this has the side effect of adding the jar to the PackageManager during the // initialization of the SyspathArchive - path = new SyspathArchive(sys.getPath(path.toString())); + path = new SyspathArchive(sys.getPath(Py.fileSystemDecode(path))); } catch (Exception e) { return path; } diff --git a/src/org/python/core/packagecache/PathPackageManager.java b/src/org/python/core/packagecache/PathPackageManager.java --- a/src/org/python/core/packagecache/PathPackageManager.java +++ b/src/org/python/core/packagecache/PathPackageManager.java @@ -40,12 +40,9 @@ + name; for (int i = 0; i < path.__len__(); i++) { + // Each entry in the path may be byte-encoded or unicode PyObject entry = path.pyget(i); - if (!(entry instanceof PyUnicode)) { - entry = entry.__str__(); - } - String dir = entry.toString(); - + String dir = Py.fileSystemDecode(entry); File f = new RelativeFile(dir, child); try { if (f.isDirectory() && imp.caseok(f, name)) { @@ -103,11 +100,8 @@ String child = jpkg.__name__.replace('.', File.separatorChar); for (int i = 0; i < path.__len__(); i++) { - PyObject entry = path.pyget(i); - if (!(entry instanceof PyUnicode)) { - entry = entry.__str__(); - } - String dir = entry.toString(); + // Each entry in the path may be byte-encoded or unicode + String dir = Py.fileSystemDecode(path.pyget(i)); if (dir.length() == 0) { dir = null; diff --git a/src/org/python/util/jython.java b/src/org/python/util/jython.java --- a/src/org/python/util/jython.java +++ b/src/org/python/util/jython.java @@ -341,8 +341,8 @@ } else { try { interp.globals.__setitem__(new PyString("__file__"), - new PyString(opts.filename)); - + // Note that __file__ is widely expected to be encoded bytes + Py.fileSystemEncode(opts.filename)); FileInputStream file; try { file = new FileInputStream(new RelativeFile(opts.filename)); -- Repository URL: https://hg.python.org/jython From jython-checkins at python.org 
Sun May 21 05:01:59 2017 From: jython-checkins at python.org (jeff.allen) Date: Sun, 21 May 2017 09:01:59 +0000 Subject: [Jython-checkins] =?utf-8?q?jython=3A_Handle_filenames_consisten?= =?utf-8?q?tly_as_bytes_or_unicode_fixes_=231839_and_=232356=2E?= Message-ID: <20170521090146.111299.68B0BABDB1011FE4@psf.io> https://hg.python.org/jython/rev/cb01e444e8e2 changeset: 8090:cb01e444e8e2 user: Jeff Allen date: Sun May 14 05:28:44 2017 +0100 summary: Handle filenames consistently as bytes or unicode fixes #1839 and #2356. This change completes the switch to utf-8 as the nominal file-system encoding addressing #1839 and #2356. test.regrtest now passes on Windows with Chinese localisation and with non-ascii cwd and TMP. The solution principles are:
1. The file-system encoding (FS encoding) is utf-8, whatever the localisation.
2. File names in bytes/str objects are always FS-encoded (most Python API).
3. File names in unicode objects are not encoded (pure Java API).
4. Python API implemented in Java declare file names as PyString/PyObject, not String. An argument is interpreted as FS-encoded if it is not PyUnicode.
5. File names exposed as java.lang.String (pure Java API) must be unicode.
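The principles listed in this commit message can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Jython's implementation: `fs_decode` and `fs_encode` are hypothetical helpers that loosely stand in for the `Py.fileSystemDecode` and `Py.fileSystemEncode` calls seen in the diffs, and the encoding is fixed at utf-8 as principle 1 states. The sketch is written to run on both Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
# Hypothetical sketch of the "bytes are FS-encoded, unicode is not" rule.
# The helper names below are illustrative and are not part of Jython.

FS_ENCODING = "utf-8"  # principle 1: fixed, whatever the localisation


def fs_decode(name):
    """Return the unicode form of a file name.

    A byte string (str in Python 2, bytes in Python 3) is taken to be
    FS-encoded and is decoded; a unicode object is returned unchanged.
    """
    if isinstance(name, bytes):
        return name.decode(FS_ENCODING)
    return name


def fs_encode(name):
    """Return the FS-encoded byte form of a file name.

    A unicode object is encoded; a byte string is assumed to be
    FS-encoded already and is returned unchanged.
    """
    if isinstance(name, bytes):
        return name
    return name.encode(FS_ENCODING)


# A non-ascii name survives the trip through both representations.
unicode_name = u"caf\u00e9.py"
byte_name = fs_encode(unicode_name)
round_trip = fs_decode(byte_name)
```

The design point is that the type of the object, not its content, decides whether decoding is needed, which is why principle 4 asks Java-implemented Python API to accept PyString/PyObject rather than a plain String.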
files: Lib/lib2to3/tests/test_main.py | 155 ++++++++++ Lib/test/test_bytecodetools_jy.py | 10 +- Lib/test/test_java_integration.py | 15 +- Lib/test/test_os_jy.py | 85 ++++- Lib/test/test_support.py | 7 +- Lib/test/test_sys.py | 2 - Lib/test/test_zipimport_support.py | 20 +- src/org/python/core/Py.java | 77 +++- src/org/python/core/PyBaseCode.java | 8 +- src/org/python/core/PyFile.java | 2 +- src/org/python/core/PyJavaPackage.java | 8 +- src/org/python/core/PyString.java | 47 ++- src/org/python/core/PySyntaxError.java | 8 +- src/org/python/core/PySystemState.java | 3 +- src/org/python/core/StdoutWrapper.java | 35 +- src/org/python/core/imp.java | 3 +- src/org/python/core/io/FileIO.java | 9 +- src/org/python/core/packagecache/PathPackageManager.java | 6 +- src/org/python/modules/_imp.java | 3 +- src/org/python/modules/posix/PosixModule.java | 12 +- src/org/python/modules/zipimport/zipimporter.java | 11 +- src/org/python/util/jython.java | 6 +- 22 files changed, 423 insertions(+), 109 deletions(-) diff --git a/Lib/lib2to3/tests/test_main.py b/Lib/lib2to3/tests/test_main.py new file mode 100644 --- /dev/null +++ b/Lib/lib2to3/tests/test_main.py @@ -0,0 +1,155 @@ +# -*- coding: utf-8 -*- +import sys +import codecs +import logging +import os +import re +import shutil +import StringIO +import sys +import tempfile +import unittest + +from lib2to3 import main + + +TEST_DATA_DIR = os.path.join(os.path.dirname(__file__), "data") +PY2_TEST_MODULE = os.path.join(TEST_DATA_DIR, "py2_test_grammar.py") + + +class TestMain(unittest.TestCase): + + if not hasattr(unittest.TestCase, 'assertNotRegex'): + # This method was only introduced in 3.2. + def assertNotRegex(self, text, regexp, msg=None): + import re + if not hasattr(regexp, 'search'): + regexp = re.compile(regexp) + if regexp.search(text): + self.fail("regexp %s MATCHED text %r" % (regexp.pattern, text)) + + def setUp(self): + self.temp_dir = None # tearDown() will rmtree this directory if set. 
+ + def tearDown(self): + # Clean up logging configuration down by main. + del logging.root.handlers[:] + if self.temp_dir: + shutil.rmtree(self.temp_dir) + + def run_2to3_capture(self, args, in_capture, out_capture, err_capture): + save_stdin = sys.stdin + save_stdout = sys.stdout + save_stderr = sys.stderr + sys.stdin = in_capture + sys.stdout = out_capture + sys.stderr = err_capture + try: + return main.main("lib2to3.fixes", args) + finally: + sys.stdin = save_stdin + sys.stdout = save_stdout + sys.stderr = save_stderr + + def test_unencodable_diff(self): + input_stream = StringIO.StringIO(u"print 'nothing'\nprint u'über'\n") + out = StringIO.StringIO() + out_enc = codecs.getwriter("ascii")(out) + err = StringIO.StringIO() + ret = self.run_2to3_capture(["-"], input_stream, out_enc, err) + self.assertEqual(ret, 0) + output = out.getvalue() + self.assertTrue("-print 'nothing'" in output) + self.assertTrue("WARNING: couldn't encode <stdin>'s diff for " + "your terminal" in err.getvalue()) + + def setup_test_source_trees(self): + """Setup a test source tree and output destination tree.""" + self.temp_dir = tempfile.mkdtemp() # tearDown() cleans this up. + + # Make the directory names unicode, in case the temporary directory has + # a non-ascii name, since refactor.py uses unicode strings internally. + # (Added for Jython but is test failure in CPython 2.7.13 too.) + self.temp_dir = self.temp_dir.decode(sys.getfilesystemencoding()) + + self.py2_src_dir = os.path.join(self.temp_dir, "python2_project") + self.py3_dest_dir = os.path.join(self.temp_dir, "python3_project") + os.mkdir(self.py2_src_dir) + os.mkdir(self.py3_dest_dir) + # Turn it into a package with a few files.
+ self.setup_files = [] + open(os.path.join(self.py2_src_dir, "__init__.py"), "w").close() + self.setup_files.append("__init__.py") + shutil.copy(PY2_TEST_MODULE, self.py2_src_dir) + self.setup_files.append(os.path.basename(PY2_TEST_MODULE)) + self.trivial_py2_file = os.path.join(self.py2_src_dir, "trivial.py") + self.init_py2_file = os.path.join(self.py2_src_dir, "__init__.py") + with open(self.trivial_py2_file, "w") as trivial: + trivial.write("print 'I need a simple conversion.'") + self.setup_files.append("trivial.py") + + def test_filename_changing_on_output_single_dir(self): + """2to3 a single directory with a new output dir and suffix.""" + self.setup_test_source_trees() + out = StringIO.StringIO() + err = StringIO.StringIO() + suffix = "TEST" + ret = self.run_2to3_capture( + ["-n", "--add-suffix", suffix, "--write-unchanged-files", + "--no-diffs", "--output-dir", + self.py3_dest_dir, self.py2_src_dir], + StringIO.StringIO(""), out, err) + self.assertEqual(ret, 0) + stderr = err.getvalue() + self.assertIn(" implies -w.", stderr) + self.assertIn( + "Output in %r will mirror the input directory %r layout" % ( + self.py3_dest_dir, self.py2_src_dir), stderr) + self.assertEqual(set(name+suffix for name in self.setup_files), + set(os.listdir(self.py3_dest_dir))) + for name in self.setup_files: + self.assertIn("Writing converted %s to %s" % ( + os.path.join(self.py2_src_dir, name), + os.path.join(self.py3_dest_dir, name+suffix)), stderr) + sep = re.escape(os.sep) + self.assertRegexpMatches( + stderr, r"No changes to .*/__init__\.py".replace("/", sep)) + self.assertNotRegex( + stderr, r"No changes to .*/trivial\.py".replace("/", sep)) + + def test_filename_changing_on_output_two_files(self): + """2to3 two files in one directory with a new output dir.""" + self.setup_test_source_trees() + err = StringIO.StringIO() + py2_files = [self.trivial_py2_file, self.init_py2_file] + expected_files = set(os.path.basename(name) for name in py2_files) + ret = 
self.run_2to3_capture( + ["-n", "-w", "--write-unchanged-files", + "--no-diffs", "--output-dir", self.py3_dest_dir] + py2_files, + StringIO.StringIO(""), StringIO.StringIO(), err) + self.assertEqual(ret, 0) + stderr = err.getvalue() + self.assertIn( + "Output in %r will mirror the input directory %r layout" % ( + self.py3_dest_dir, self.py2_src_dir), stderr) + self.assertEqual(expected_files, set(os.listdir(self.py3_dest_dir))) + + def test_filename_changing_on_output_single_file(self): + """2to3 a single file with a new output dir.""" + self.setup_test_source_trees() + err = StringIO.StringIO() + ret = self.run_2to3_capture( + ["-n", "-w", "--no-diffs", "--output-dir", self.py3_dest_dir, + self.trivial_py2_file], + StringIO.StringIO(""), StringIO.StringIO(), err) + self.assertEqual(ret, 0) + stderr = err.getvalue() + self.assertIn( + "Output in %r will mirror the input directory %r layout" % ( + self.py3_dest_dir, self.py2_src_dir), stderr) + self.assertEqual(set([os.path.basename(self.trivial_py2_file)]), + set(os.listdir(self.py3_dest_dir))) + + +if __name__ == '__main__': + unittest.main() diff --git a/Lib/test/test_bytecodetools_jy.py b/Lib/test/test_bytecodetools_jy.py --- a/Lib/test/test_bytecodetools_jy.py +++ b/Lib/test/test_bytecodetools_jy.py @@ -69,7 +69,11 @@ """ProxyDebugDirectory used to be the only way to save proxied classes""" def setUp(self): - self.tmpdir = tempfile.mkdtemp() + tmp = tempfile.mkdtemp() + # Ensure Unicode since derived file paths are used in Java calls + if isinstance(tmp, bytes): + tmp = tmp.decode(sys.getfilesystemencoding()) + self.tmpdir = tmp def tearDown(self): test_support.rmtree(self.tmpdir) @@ -82,7 +86,7 @@ class C(Callable): def call(self): return 47 - + self.assertEqual(C().call(), 47) proxy_dir = os.path.join(self.tmpdir, "org", "python", "proxies") # If test script is run outside of regrtest, the first path is used; @@ -93,7 +97,7 @@ self.assertRegexpMatches( proxy_classes[0], r'\$C\$\d+.class$') - + def 
test_main(): test_support.run_unittest( diff --git a/Lib/test/test_java_integration.py b/Lib/test/test_java_integration.py --- a/Lib/test/test_java_integration.py +++ b/Lib/test/test_java_integration.py @@ -485,8 +485,11 @@ # script must lie within python.home for this test to work return policy = test_support.findfile("python_home.policy") - self.assertEquals(subprocess.call([sys.executable, "-J-Dpython.cachedir.skip=true", - "-J-Djava.security.manager", "-J-Djava.security.policy=%s" % policy, script]), + self.assertEquals( + subprocess.call([sys.executable, + "-J-Dpython.cachedir.skip=true", + "-J-Djava.security.manager", + "-J-Djava.security.policy=%s" % policy, script]), 0) def test_import_signal_fails_with_import_error_using_security(self): @@ -693,7 +696,9 @@ def test_proxy_serialization(self): # Proxies can be deserializable in a fresh JVM, including being able # to "findPython" to get a PySystemState. - tempdir = tempfile.mkdtemp() + # tempdir gets combined with unicode paths derived from class names, + # so make it a unicode object. + tempdir = tempfile.mkdtemp().decode(sys.getfilesystemencoding()) old_proxy_debug_dir = org.python.core.Options.proxyDebugDirectory try: # Generate a proxy for Cat class; @@ -738,7 +743,9 @@ @unittest.skipUnless(find_executable('jar'), 'Need the jar command to run') def test_custom_proxymaker(self): # Verify custom proxymaker supports direct usage of Python code in Java - tempdir = tempfile.mkdtemp() + # tempdir gets combined with unicode paths derived from class names, + # so make it a unicode object. 
+ tempdir = tempfile.mkdtemp().decode(sys.getfilesystemencoding()) try: SerializableProxies.serialized_path = tempdir import bark diff --git a/Lib/test/test_os_jy.py b/Lib/test/test_os_jy.py --- a/Lib/test/test_os_jy.py +++ b/Lib/test/test_os_jy.py @@ -198,14 +198,41 @@ def test_env(self): with test_support.temp_cwd(name=u"tempcwd-??"): + # os.environ is constructed with FS-encoded values (as in CPython), + # but it will accept unicode additions. newenv = os.environ.copy() - newenv["TEST_HOME"] = u"??" - p = subprocess.Popen([sys.executable, "-c", - 'import sys,os;' \ - 'sys.stdout.write(os.getenv("TEST_HOME").encode("utf-8"))'], - stdout=subprocess.PIPE, - env=newenv) - self.assertEqual(p.stdout.read().decode("utf-8"), u"??") + newenv["TEST_HOME"] = expected = u"??" + # Environment passed as UTF-16 String[] by Java, arrives FS-encoded. + for encoding in ('utf-8', 'gbk'): + # Emit the value of TEST_HOME explicitly encoded. + p = subprocess.Popen( + [sys.executable, "-c", + 'import sys, os;' \ + 'sys.stdout.write(os.getenv("TEST_HOME")' \ + '.decode(sys.getfilesystemencoding())' \ + '.encode("%s"))' \ + % encoding], + stdout=subprocess.PIPE, + env=newenv) + # Decode with chosen encoding + self.assertEqual(p.stdout.read().decode(encoding), u"??") + + def test_env_naively(self): + with test_support.temp_cwd(name=u"tempcwd-??"): + # os.environ is constructed with FS-encoded values (as in CPython), + # but it will accept unicode additions. + newenv = os.environ.copy() + newenv["TEST_HOME"] = expected = u"??" + # Environment passed as UTF-16 String[] by Java, arrives FS-encoded. + # However, emit TEST_HOME without thinking about the encoding. + p = subprocess.Popen( + [sys.executable, "-c", + 'import sys, os;' \ + 'sys.stdout.write(os.getenv("TEST_HOME"))'], + stdout=subprocess.PIPE, + env=newenv) + # Decode with default encoding utf-8 (because ... ?) 
+ self.assertEqual(p.stdout.read().decode('utf-8'), expected) def test_getcwd(self): with test_support.temp_cwd(name=u"tempcwd-??") as temp_cwd: @@ -216,38 +243,46 @@ self.assertEqual(p.stdout.read().decode("utf-8"), temp_cwd) def test_listdir(self): - # It is hard to avoid Unicode paths on systems like OS X. Use - # relative paths from a temp CWD to work around this + # It is hard to avoid Unicode paths on systems like OS X. Use relative + # paths from a temp CWD to work around this. But when you don't, + # it behaves like this ... with test_support.temp_cwd() as new_cwd: - unicode_path = os.path.join(".", "unicode") - self.assertIs(type(unicode_path), str) - chinese_path = os.path.join(unicode_path, u"??") + + basedir = os.path.join(".", "unicode") + self.assertIs(type(basedir), bytes) + chinese_path = os.path.join(basedir, u"??") self.assertIs(type(chinese_path), unicode) home_path = os.path.join(chinese_path, u"??") os.makedirs(home_path) + FS = sys.getfilesystemencoding() + with open(os.path.join(home_path, "test.txt"), "w") as test_file: test_file.write("42\n") - # Verify works with str paths, returning Unicode as necessary - entries = os.listdir(unicode_path) - self.assertIn(u"??", entries) + # listdir(bytes) includes encoded form of ?? + entries = os.listdir(basedir) + self.assertIn(u"??".encode(FS), entries) + for entry in entries: + self.assertIs(type(entry), bytes) - # Verify works with Unicode paths + # listdir(unicode) includes unicode form of ?? 
entries = os.listdir(chinese_path) self.assertIn(u"??", entries) + for entry in entries: + self.assertIs(type(entry), unicode) # glob.glob builds on os.listdir; note that we don't use - # Unicode paths in the arg to glob + # Unicode paths in the arg to glob so the result is bytes self.assertEqual( glob.glob(os.path.join("unicode", "*")), - [os.path.join(u"unicode", u"??")]) + [os.path.join(u"unicode", u"??").encode(FS)]) self.assertEqual( glob.glob(os.path.join("unicode", "*", "*")), - [os.path.join(u"unicode", u"??", u"??")]) + [os.path.join(u"unicode", u"??", u"??").encode(FS)]) self.assertEqual( glob.glob(os.path.join("unicode", "*", "*", "*")), - [os.path.join(u"unicode", u"??", u"??", "test.txt")]) + [os.path.join(u"unicode", u"??", u"??", "test.txt").encode(FS)]) # Now use a Unicode path as well as in the glob arg self.assertEqual( @@ -263,11 +298,15 @@ # Verify Java integration. But we will need to construct # an absolute path since chdir doesn't work with Java # (except for subprocesses, like below in test_env) - for entry in entries: + for entry in entries: # list(unicode) + # new_cwd is bytes while chinese_path is unicode. + # But new_cwd is not guaranteed to be just ascii, so decode it. 
+ new_cwd = new_cwd.decode(FS) entry_path = os.path.join(new_cwd, chinese_path, entry) f = File(entry_path) - self.assertTrue(f.exists(), "File %r (%r) should be testable for existence" % ( - f, entry_path)) + self.assertTrue(f.exists(), + "File %r (%r) should be testable for existence" % + (f, entry_path)) class LocaleTestCase(unittest.TestCase): diff --git a/Lib/test/test_support.py b/Lib/test/test_support.py --- a/Lib/test/test_support.py +++ b/Lib/test/test_support.py @@ -490,8 +490,13 @@ def make_jar_classloader(jar): import os from java.net import URL, URLClassLoader + from java.io import File - url = URL('jar:file:%s!/' % jar) + if isinstance(jar, bytes): # Java will expect a unicode file name + jar = jar.decode(sys.getfilesystemencoding()) + jar_url = File(jar).toURI().toURL().toString() + url = URL(u'jar:%s!/' % jar_url) + if is_jython_nt: # URLJarFiles keep a cached open file handle to the jar even # after this ClassLoader is GC'ed, disallowing Windows tests diff --git a/Lib/test/test_sys.py b/Lib/test/test_sys.py --- a/Lib/test/test_sys.py +++ b/Lib/test/test_sys.py @@ -253,8 +253,6 @@ self.assert_(vi[3] in ("alpha", "beta", "candidate", "final")) self.assert_(isinstance(vi[4], int)) - @unittest.skipIf(test.test_support.is_jython_nt, - "FIXME: fails probably due to issue 2312") def test_ioencoding(self): # from v2.7 test import subprocess,os env = dict(os.environ) diff --git a/Lib/test/test_zipimport_support.py b/Lib/test/test_zipimport_support.py --- a/Lib/test/test_zipimport_support.py +++ b/Lib/test/test_zipimport_support.py @@ -240,6 +240,14 @@ print data self.assertIn(expected, data) + def assertNormalisedIn(self, target, data): + # bdb/pdb applies normcase to its filename before displaying. + # Also, it emerges as FS-encoded bytes, so do the same to the target. 
+ target = os.path.normcase(target) + if not isinstance(target, bytes): + target = target.encode(sys.getfilesystemencoding()) + self.assertIn(target, data) + def test_pdb_issue4201(self): test_src = textwrap.dedent("""\ def f(): @@ -248,22 +256,22 @@ import pdb pdb.runcall(f) """) + with temp_dir() as d: script_name = make_script(d, 'script', test_src) p = spawn_python(script_name) p.stdin.write('l\n') data = kill_python(p) - # bdb/pdb applies normcase to its filename before displaying - # See CPython Issue 14255 (back-ported for Jython) - self.assertIn(os.path.normcase(script_name.encode('utf-8')), data) + # Back-port from CPython 3 (see CPython Issue 14255). + self.assertNormalisedIn(script_name, data) + zip_name, run_name = make_zip_script(d, "test_zip", script_name, '__main__.py') p = spawn_python(zip_name) p.stdin.write('l\n') data = kill_python(p) - # bdb/pdb applies normcase to its filename before displaying - # See CPython Issue 14255 (back-ported for Jython) - self.assertIn(os.path.normcase(run_name.encode('utf-8')), data) + # Back-port from CPython 3 (see CPython Issue 14255). 
+ self.assertNormalisedIn(run_name, data) def test_main(): diff --git a/src/org/python/core/Py.java b/src/org/python/core/Py.java --- a/src/org/python/core/Py.java +++ b/src/org/python/core/Py.java @@ -239,7 +239,7 @@ } if (ioe instanceof FileNotFoundException) { PyTuple args = new PyTuple(Py.newInteger(Errno.ENOENT.intValue()), - Py.newString("File not found - " + message)); + Py.newStringOrUnicode("File not found - " + message)); return new PyException(err, args); } return new PyException(err, message); @@ -721,9 +721,10 @@ public static String fileSystemDecode(PyObject filename) { if (filename instanceof PyString) { return fileSystemDecode((PyString)filename); - } else + } else { throw Py.TypeError(String.format("coercing to Unicode: need string, %s type found", filename.getType().fastGetName())); + } } /** @@ -1285,14 +1286,66 @@ // Output is to standard error, unless a file object has been given. StdoutWrapper stderr = Py.stderr; + + // As we format the exception in Unicode, we deal with encoding in this method + String encoding, errors = codecs.REPLACE; + if (file != null) { + // Ostensibly writing to a file: assume file content encoding (file.encoding) stderr = new FixedFileWrapper(file); + encoding = codecs.getDefaultEncoding(); + } else { + // Not a file, assume we should encode for the console + encoding = getAttr(Py.getSystemState().__stderr__, "encoding", null); } + + // But if the stream can tell us directly, of course we use that answer. + encoding = getAttr(stderr.myFile(), "encoding", encoding); + errors = getAttr(stderr.myFile(), "errors", errors); + flushLine(); // The creation of the report operates entirely in Java String (to support Unicode). 
- String formattedException = exceptionToString(type, value, tb); - stderr.print(formattedException); + try { + // Be prepared for formatting or printing to fail + PyString bytes = exceptionToBytes(type, value, tb, encoding, errors); + stderr.print(bytes); + } catch (Exception ex) { + // Looks like that exception just won't convert or print + value = Py.newString(""); + PyString bytes = exceptionToBytes(type, value, tb, encoding, errors); + stderr.print(bytes); + } + } + + /** Get a String attribute from an object or a return a default. */ + private static String getAttr(PyObject target, String internedName, String def) { + PyObject attr = target.__findattr__(internedName); + if (attr == null) { + return def; + } else if (attr instanceof PyUnicode) { + return ((PyUnicode)attr).getString(); + } else { + return attr.__str__().getString(); + } + } + + /** + * Helper for {@link #displayException(PyObject, PyObject, PyObject, PyObject)}, falling back to + * US-ASCII as the last resort encoding. 
+ */ + private static PyString exceptionToBytes(PyObject type, PyObject value, PyObject tb, + String encoding, String errors) { + String string = exceptionToString(type, value, tb); + String bytes; // not UTF-16 + try { + // Format the exception and stack-trace in all its glory + bytes = codecs.encode(Py.newUnicode(string), encoding, errors); + } catch (Exception ex) { + // Sometimes a working codec is just too much to ask + bytes = codecs.PyUnicode_EncodeASCII(string, string.length(), codecs.REPLACE); + } + return Py.newString(bytes); } /** @@ -1334,14 +1387,8 @@ } } - // Be prepared for formatting the value part to fail (fall back to just the type) - try { - buf.append(formatException(type, value)); - } catch (Exception ex) { - buf.append(formatException(type, Py.None)); - } - buf.append('\n'); - + // Formatting the value may raise UnicodeEncodeError: client must deal + buf.append(formatException(type, value)).append('\n'); return buf.toString(); } @@ -1366,7 +1413,6 @@ } } - /** * Generate two lines showing where a SyntaxError was caused. 
* @@ -1478,7 +1524,6 @@ formatException(pye.type, pye.value, true), obj)); } - /* Equivalent to Python's assert statement */ public static void assert_(PyObject test, PyObject message) { if (!test.__nonzero__()) { @@ -1755,8 +1800,8 @@ PySystemState sys = Py.getSystemState(); String value = pye.value.__getattr__("args").__getitem__(0).toString(); List path = fileSystemDecode(sys.path); - throw Py.ImportError( - String.format(IMPORT_SITE_ERROR, value, path, PySystemState.prefix)); + String prefix = fileSystemDecode(PySystemState.prefix); + throw Py.ImportError(String.format(IMPORT_SITE_ERROR, value, path, prefix)); } else { throw pye; } diff --git a/src/org/python/core/PyBaseCode.java b/src/org/python/core/PyBaseCode.java --- a/src/org/python/core/PyBaseCode.java +++ b/src/org/python/core/PyBaseCode.java @@ -170,7 +170,7 @@ } return call(state, frame, closure); } - + @Override public PyObject call(ThreadState state, PyObject arg1, PyObject arg2, PyObject arg3, PyObject arg4, PyObject globals, @@ -309,8 +309,10 @@ } public String toString() { - return String.format("", - co_name, Py.idstr(this), co_filename, co_firstlineno); + // Result must be convertible to a str (for __repr__()), but let's make it fully printable. 
+ String filename = PyString.encode_UnicodeEscape(co_filename, '"'); + return String.format("", + co_name, Py.idstr(this), filename, co_firstlineno); } protected abstract PyObject interpret(PyFrame f, ThreadState ts); diff --git a/src/org/python/core/PyFile.java b/src/org/python/core/PyFile.java --- a/src/org/python/core/PyFile.java +++ b/src/org/python/core/PyFile.java @@ -175,7 +175,7 @@ } private void file___init__(RawIOBase raw, String name, String mode, int bufsize) { - file___init__(raw, new PyString(name), mode, bufsize); + file___init__(raw, Py.newStringOrUnicode(name), mode, bufsize); } private void file___init__(RawIOBase raw, PyObject name, String mode, int bufsize) { diff --git a/src/org/python/core/PyJavaPackage.java b/src/org/python/core/PyJavaPackage.java --- a/src/org/python/core/PyJavaPackage.java +++ b/src/org/python/core/PyJavaPackage.java @@ -138,9 +138,8 @@ if (name == "__dict__") return __dict__; if (name == "__mgr__") return Py.java2py(__mgr__); if (name == "__file__") { - if (__file__ != null) return new PyString(__file__); - - return Py.None; + // Stored as UTF-16 for Java but expected as bytes in Python + return __file__ == null ? Py.None : Py.fileSystemEncode(__file__); } return null; @@ -157,7 +156,8 @@ return; } if (attr == "__file__") { - __file__ = value.__str__().toString(); + // Stored as UTF-16 for Java but presented as bytes from Python + __file__ = Py.fileSystemDecode(value); return; } diff --git a/src/org/python/core/PyString.java b/src/org/python/core/PyString.java --- a/src/org/python/core/PyString.java +++ b/src/org/python/core/PyString.java @@ -302,19 +302,51 @@ private static char[] hexdigit = "0123456789abcdef".toCharArray(); public static String encode_UnicodeEscape(String str, boolean use_quotes) { + char quote = use_quotes ? '?' : 0; + return encode_UnicodeEscape(str, quote); + } + + /** + * The inner logic of the string __repr__ producing an ASCII representation of the target + * string, optionally in quotations. 
The caller can determine whether the returned string will + * be wrapped in quotation marks, and whether Python rules are used to choose them through + * quote. + * + * @param str + * @param quoteChar '"' or '\'' use that, '?' = let Python choose, 0 or anything = no quotes + * @return encoded string (possibly the same string if unchanged) + */ + public static String encode_UnicodeEscape(String str, char quote) { + + // Choose whether to quote and the actual quote character + boolean use_quotes; + switch (quote) { + case '?': + use_quotes = true; + // Python rules + quote = str.indexOf('\'') >= 0 && str.indexOf('"') == -1 ? '"' : '\''; + break; + case '"': + case '\'': + use_quotes = true; + break; + default: + use_quotes = false; + break; + } + + // Allocate a buffer for the result (25% bigger and room for quotes) int size = str.length(); - StringBuilder v = new StringBuilder(str.length()); - - char quote = 0; + StringBuilder v = new StringBuilder(size + (size >> 2) + 2); if (use_quotes) { - quote = str.indexOf('\'') >= 0 && str.indexOf('"') == -1 ? '"' : '\''; v.append(quote); } + // Now chunter through the original string a character at a time for (int i = 0; size-- > 0;) { int ch = str.charAt(i++); - /* Escape quotes */ + // Escape quotes and backslash if ((use_quotes && ch == quote) || ch == '\\') { v.append('\\'); v.append((char)ch); @@ -368,10 +400,13 @@ v.append((char)ch); } } + if (use_quotes) { v.append(quote); } - return v.toString(); + + // Return the original string if we didn't quote or escape anything + return v.length() > size ? 
v.toString() : str; } private static ucnhashAPI pucnHash = null; diff --git a/src/org/python/core/PySyntaxError.java b/src/org/python/core/PySyntaxError.java --- a/src/org/python/core/PySyntaxError.java +++ b/src/org/python/core/PySyntaxError.java @@ -16,17 +16,15 @@ String filename; - public PySyntaxError(String s, int line, int column, String text, - String filename) + public PySyntaxError(String s, int line, int column, String text, String filename) { super(Py.SyntaxError); - //XXX: null text causes Java error, though I bet I'm not supposed to - // get null text. + //XXX: null text causes Java error, though I bet I'm not supposed to get null text. if (text == null) { text = ""; } PyObject[] tmp = new PyObject[] { - new PyString(filename), new PyInteger(line), + Py.fileSystemEncode(filename), new PyInteger(line), new PyInteger(column), new PyString(text) }; diff --git a/src/org/python/core/PySystemState.java b/src/org/python/core/PySystemState.java --- a/src/org/python/core/PySystemState.java +++ b/src/org/python/core/PySystemState.java @@ -1191,7 +1191,8 @@ PyList argv = new PyList(); if (args != null) { for (String arg : args) { - argv.append(Py.newStringOrUnicode(arg)); // XXX or always newUnicode? + // For consistency with CPython and the standard library, sys.argv is FS-encoded. + argv.append(Py.fileSystemEncode(arg)); } } return argv; diff --git a/src/org/python/core/StdoutWrapper.java b/src/org/python/core/StdoutWrapper.java --- a/src/org/python/core/StdoutWrapper.java +++ b/src/org/python/core/StdoutWrapper.java @@ -102,28 +102,33 @@ } private String printToFile(PyFile file, PyObject o) { - String s; + // We must ensure o is a byte string before we write it to the stream + String bytes; + if (!(o instanceof PyUnicode)) { + o = o.__str__(); + } + // o is now a PyString, but it might be unicode or bytes if (o instanceof PyUnicode) { // Use the encoding and policy defined for the stream. (Each may be null.) 
- s = ((PyUnicode)o).encode(file.encoding, file.errors); + bytes = ((PyUnicode)o).encode(file.encoding, file.errors); } else { - s = o.__str__().toString(); + bytes = ((PyString)o).getString(); } - file.write(s); - return s; + file.write(bytes); + return bytes; } private String printToFileWriter(PyFileWriter file, PyObject o) { - // since we are outputting directly to a character stream, - // avoid doing an encoding - String s; - if (o instanceof PyString) { - s = ((PyString) o).getString(); + // since we are outputting directly to a character stream, avoid encoding + String chars; + if (o instanceof PyUnicode) { + chars = ((PyString) o).getString(); } else { - s = o.__str__().toString(); + // Bytes here are assumed to be code points, as in PyFileWriter.write() + chars = o.__str__().getString(); } - file.write(s); - return s; + file.write(chars); + return chars; } private void printToFileObject(PyObject file, PyObject o) { @@ -248,11 +253,11 @@ } public void print(String s) { - print(Py.newStringOrUnicode(s), false, false); + print(Py.newUnicode(s), false, false); } public void println(String s) { - print(Py.newStringOrUnicode(s), false, true); + print(Py.newUnicode(s), false, true); } public void print(PyObject o) { diff --git a/src/org/python/core/imp.java b/src/org/python/core/imp.java --- a/src/org/python/core/imp.java +++ b/src/org/python/core/imp.java @@ -622,8 +622,9 @@ if (caseok(dir, name) && (sourceFile.isFile() || compiledFile.isFile())) { pkg = true; } else { + String printDirName = PyString.encode_UnicodeEscape(displayDirName, '\''); Py.warning(Py.ImportWarning, String.format( - "Not importing directory '%s': missing __init__.py", dirName)); + "Not importing directory %s: missing __init__.py", printDirName)); } } } catch (SecurityException e) { diff --git a/src/org/python/core/io/FileIO.java b/src/org/python/core/io/FileIO.java --- a/src/org/python/core/io/FileIO.java +++ b/src/org/python/core/io/FileIO.java @@ -71,11 +71,12 @@ } /** - * Construct a 
FileIO instance for the specified file name. + * Construct a FileIO instance for the specified file name, which will be decoded using the + * nominal Jython file system encoding if it is a str/bytes rather than a + * unicode. * - * The mode can be 'r', 'w' or 'a' for reading (default), writing - * or appending. Add a '+' to the mode to allow simultaneous - * reading and writing. + * The mode can be 'r', 'w' or 'a' for reading (default), writing or appending. Add a '+' to the + * mode to allow simultaneous reading and writing. * * @param name the name of the file * @param mode a raw io file mode String diff --git a/src/org/python/core/packagecache/PathPackageManager.java b/src/org/python/core/packagecache/PathPackageManager.java --- a/src/org/python/core/packagecache/PathPackageManager.java +++ b/src/org/python/core/packagecache/PathPackageManager.java @@ -216,10 +216,8 @@ * true if path refers to a jar. */ public void addClassPath(String path) { - PyList paths = new PyString(path).split(java.io.File.pathSeparator); - - for (int i = 0; i < paths.__len__(); i++) { - String entry = paths.pyget(i).toString(); + String[] paths = path.split(java.io.File.pathSeparator); + for (String entry: paths) { if (entry.endsWith(".jar") || entry.endsWith(".zip")) { addJarToPackages(new File(entry), true); } else { diff --git a/src/org/python/modules/_imp.java b/src/org/python/modules/_imp.java --- a/src/org/python/modules/_imp.java +++ b/src/org/python/modules/_imp.java @@ -228,7 +228,8 @@ continue; } return new PyTuple(mi.file, - Py.newStringOrUnicode(mi.filename), + // File names generally expected in the FS encoding + Py.fileSystemEncode(mi.filename), new PyTuple(Py.newString(mi.suffix), Py.newString(mi.mode), Py.newInteger(mi.type))); diff --git a/src/org/python/modules/posix/PosixModule.java b/src/org/python/modules/posix/PosixModule.java --- a/src/org/python/modules/posix/PosixModule.java +++ b/src/org/python/modules/posix/PosixModule.java @@ -57,6 +57,7 @@ import 
org.python.core.PyString; import org.python.core.PySystemState; import org.python.core.PyTuple; +import org.python.core.PyUnicode; import org.python.core.Untraversable; import org.python.core.imp; import org.python.core.io.FileIO; @@ -677,9 +678,16 @@ throw Py.OSError("listdir(): an unknown error occurred: " + path); } + // Return names as bytes or unicode according to the type of the original argument PyList list = new PyList(); - for (String name : names) { - list.append(Py.newStringOrUnicode(path, name)); + if (path instanceof PyUnicode) { + for (String name : names) { + list.append(Py.newUnicode(name)); + } + } else { + for (String name : names) { + list.append(Py.fileSystemEncode(name)); + } } return list; } diff --git a/src/org/python/modules/zipimport/zipimporter.java b/src/org/python/modules/zipimport/zipimporter.java --- a/src/org/python/modules/zipimport/zipimporter.java +++ b/src/org/python/modules/zipimport/zipimporter.java @@ -174,11 +174,12 @@ */ @Override public String get_data(String path) { - return zipimporter_get_data(path); + return zipimporter_get_data(Py.newUnicode(path)); } @ExposedMethod - final String zipimporter_get_data(String path) { + final String zipimporter_get_data(PyObject opath) { + String path = Py.fileSystemDecode(opath); int len = archive.length(); if (len < path.length() && path.startsWith(archive + File.separator)) { path = path.substring(len + 1); @@ -248,7 +249,8 @@ final PyObject zipimporter_get_filename(String fullname) { ModuleCodeData moduleCodeData = getModuleCode(fullname); if (moduleCodeData != null) { - return Py.newStringOrUnicode(moduleCodeData.path); + // File names generally expected in the FS encoding at the Python level + return Py.fileSystemEncode(moduleCodeData.path); } return Py.None; } @@ -399,7 +401,8 @@ ZipEntry zipEntry = zipEntries.nextElement(); String name = zipEntry.getName().replace('/', File.separatorChar); - PyObject __file__ = Py.newStringOrUnicode(archive + File.separator + name); + // File 
names generally expected in the FS encoding at the Python level + PyObject __file__ = Py.fileSystemEncode(archive + File.separator + name); PyObject compress = Py.newInteger(zipEntry.getMethod()); PyObject data_size = new PyLong(zipEntry.getCompressedSize()); PyObject file_size = new PyLong(zipEntry.getSize()); diff --git a/src/org/python/util/jython.java b/src/org/python/util/jython.java --- a/src/org/python/util/jython.java +++ b/src/org/python/util/jython.java @@ -196,7 +196,7 @@ try { PyObject runpy = imp.importName("runpy", true); PyObject runmodule = runpy.__findattr__("_run_module_as_main"); - runmodule.__call__(Py.newStringOrUnicode(moduleName), Py.newBoolean(set_argv0)); + runmodule.__call__(Py.fileSystemEncode(moduleName), Py.newBoolean(set_argv0)); } catch (Throwable t) { Py.printException(t); interp.cleanup(); @@ -206,7 +206,7 @@ private static boolean runMainFromImporter(InteractiveConsole interp, String filename) { // Support http://bugs.python.org/issue1739468 - Allow interpreter to execute a zip file or directory - PyString argv0 = Py.newStringOrUnicode(filename); + PyString argv0 = Py.fileSystemEncode(filename); PyObject importer = imp.getImporter(argv0); if (!(importer instanceof PyNullImporter)) { /* argv0 is usable as an import source, so @@ -323,7 +323,7 @@ if (path == null) { path = ""; } - Py.getSystemState().path.insert(0, Py.newStringOrUnicode(path)); + Py.getSystemState().path.insert(0, Py.fileSystemEncode(path)); if (opts.jar) { try { runJar(opts.filename); -- Repository URL: https://hg.python.org/jython From jython-checkins at python.org Sun May 21 05:01:59 2017 From: jython-checkins at python.org (jeff.allen) Date: Sun, 21 May 2017 09:01:59 +0000 Subject: [Jython-checkins] jython (merge default -> default): Merge work on non-ascii file/user names to trunk. Message-ID: <20170521090145.81833.3AB34B6980EB2FE7@psf.io> https://hg.python.org/jython/rev/060e4e4a06d8 changeset: 8087:060e4e4a06d8 parent: 
8075:0a00982f6ea5
parent: 8086:147fe05920a4
user: Jeff Allen
date: Sun Apr 30 23:07:30 2017 +0100
summary: Merge work on non-ascii file/user names to trunk.

files:
  CPythonLib.includes | 1 +
  Lib/javashell.py | 2 +-
  Lib/ntpath.py | 560 ----------
  Lib/subprocess.py | 38 +-
  Lib/sysconfig.py | 6 +
  Lib/test/test_exceptions.py | 3 -
  Lib/test/test_exceptions_jy.py | 5 +-
  Lib/test/test_httpservers.py | 3 +
  Lib/test/test_java_visibility.py | 11 +-
  Lib/test/test_jser.py | 4 +-
  Lib/test/test_jython_launcher.py | 8 +-
  Lib/test/test_ssl.py | 8 +-
  Lib/test/test_support.py | 2 +-
  Lib/test/test_zipimport_jy.py | 6 +-
  build.xml | 3 +
  src/org/python/core/Py.java | 297 ++++-
  src/org/python/core/PyBaseException.java | 17 +-
  src/org/python/core/PyBytecode.java | 9 +-
  src/org/python/core/PyException.java | 25 +-
  src/org/python/core/PyFile.java | 4 -
  src/org/python/core/PyNullImporter.java | 13 +-
  src/org/python/core/PyString.java | 6 +-
  src/org/python/core/PySystemState.java | 65 +-
  src/org/python/core/PyTableCode.java | 6 +-
  src/org/python/core/PyUnicode.java | 4 +-
  src/org/python/core/SyspathArchive.java | 2 +-
  src/org/python/core/SyspathJavaLoader.java | 55 +-
  src/org/python/core/__builtin__.java | 8 +-
  src/org/python/core/imp.java | 26 +-
  src/org/python/core/io/FileIO.java | 4 +-
  src/org/python/core/packagecache/PathPackageManager.java | 14 +-
  src/org/python/modules/_imp.java | 81 +-
  src/org/python/modules/_py_compile.java | 36 +-
  src/org/python/modules/posix/PosixModule.java | 18 +-
  src/org/python/modules/zipimport/zipimporter.java | 8 +-
  src/org/python/util/jython.java | 4 +-
  src/shell/jython.exe | Bin
  src/shell/jython.py | 314 +++--
  38 files changed, 733 insertions(+), 943 deletions(-)

diff --git a/CPythonLib.includes b/CPythonLib.includes
--- a/CPythonLib.includes
+++ b/CPythonLib.includes
@@ -110,6 +110,7 @@
 netrc.py
 nntplib.py
 numbers.py
+ntpath.py
 nturl2path.py
 opcode.py
 optparse.py
diff --git a/Lib/javashell.py b/Lib/javashell.py
--- a/Lib/javashell.py
+++ b/Lib/javashell.py
@@ -55,7 +55,7 @@ env = self._formatEnvironment( self.environment ) try: - p = Runtime.getRuntime().exec( shellCmd, env, File(os.getcwd()) ) + p = Runtime.getRuntime().exec( shellCmd, env, File(os.getcwdu()) ) return p except IOException, ex: raise OSError( diff --git a/Lib/ntpath.py b/Lib/ntpath.py deleted file mode 100644 --- a/Lib/ntpath.py +++ /dev/null @@ -1,560 +0,0 @@ -# Module 'ntpath' -- common operations on WinNT/Win95 pathnames -"""Common pathname manipulations, WindowsNT/95 version. - -Instead of importing this module directly, import os and refer to this -module as os.path. -""" - -import os -import sys -import stat -import genericpath -import warnings - -from genericpath import * - -__all__ = ["normcase","isabs","join","splitdrive","split","splitext", - "basename","dirname","commonprefix","getsize","getmtime", - "getatime","getctime", "islink","exists","lexists","isdir","isfile", - "ismount","walk","expanduser","expandvars","normpath","abspath", - "splitunc","curdir","pardir","sep","pathsep","defpath","altsep", - "extsep","devnull","realpath","supports_unicode_filenames","relpath"] - -# strings representing various path-related bits and pieces -curdir = '.' -pardir = '..' -extsep = '.' -sep = '\\' -pathsep = ';' -altsep = '/' -defpath = '.;C:\\bin' -if 'ce' in sys.builtin_module_names: - defpath = '\\Windows' -elif 'os2' in sys.builtin_module_names: - # OS/2 w/ VACPP - altsep = '/' -devnull = 'nul' - -# Normalize the case of a pathname and map slashes to backslashes. -# Other normalizations (such as optimizing '../' away) are not done -# (this is done by normpath). - -def normcase(s): - """Normalize case of pathname. - - Makes all characters lowercase and all slashes into backslashes.""" - return s.replace("/", "\\").lower() - - -# Return whether a path is absolute. -# Trivial in Posix, harder on the Mac or MS-DOS. 
-# For DOS it is absolute if it starts with a slash or backslash (current -# volume), or if a pathname after the volume letter and colon / UNC resource -# starts with a slash or backslash. - -def isabs(s): - """Test whether a path is absolute""" - s = splitdrive(s)[1] - return s != '' and s[:1] in '/\\' - - -# Join two (or more) paths. - -def join(a, *p): - """Join two or more pathname components, inserting "\\" as needed. - If any component is an absolute path, all previous path components - will be discarded.""" - path = a - for b in p: - b_wins = 0 # set to 1 iff b makes path irrelevant - if path == "": - b_wins = 1 - - elif isabs(b): - # This probably wipes out path so far. However, it's more - # complicated if path begins with a drive letter: - # 1. join('c:', '/a') == 'c:/a' - # 2. join('c:/', '/a') == 'c:/a' - # But - # 3. join('c:/a', '/b') == '/b' - # 4. join('c:', 'd:/') = 'd:/' - # 5. join('c:/', 'd:/') = 'd:/' - if path[1:2] != ":" or b[1:2] == ":": - # Path doesn't start with a drive letter, or cases 4 and 5. - b_wins = 1 - - # Else path has a drive letter, and b doesn't but is absolute. - elif len(path) > 3 or (len(path) == 3 and - path[-1] not in "/\\"): - # case 3 - b_wins = 1 - - if b_wins: - path = b - else: - # Join, and ensure there's a separator. - assert len(path) > 0 - if path[-1] in "/\\": - if b and b[0] in "/\\": - path += b[1:] - else: - path += b - elif path[-1] == ":": - path += b - elif b: - if b[0] in "/\\": - path += b - else: - path += "\\" + b - else: - # path is not empty and does not end with a backslash, - # but b is empty; since, e.g., split('a/') produces - # ('a', ''), it's best if join() adds a backslash in - # this case. - path += '\\' - - return path - - -# Split a path in a drive specification (a drive letter followed by a -# colon) and the path specification. -# It is always true that drivespec + pathspec == p -def splitdrive(p): - """Split a pathname into drive and path specifiers. 
Returns a 2-tuple -"(drive,path)"; either part may be empty""" - if p[1:2] == ':': - return p[0:2], p[2:] - return '', p - - -# Parse UNC paths -def splitunc(p): - """Split a pathname into UNC mount point and relative path specifiers. - - Return a 2-tuple (unc, rest); either part may be empty. - If unc is not empty, it has the form '//host/mount' (or similar - using backslashes). unc+rest is always the input path. - Paths containing drive letters never have an UNC part. - """ - if p[1:2] == ':': - return '', p # Drive letter present - firstTwo = p[0:2] - if firstTwo == '//' or firstTwo == '\\\\': - # is a UNC path: - # vvvvvvvvvvvvvvvvvvvv equivalent to drive letter - # \\machine\mountpoint\directories... - # directory ^^^^^^^^^^^^^^^ - normp = normcase(p) - index = normp.find('\\', 2) - if index == -1: - ##raise RuntimeError, 'illegal UNC path: "' + p + '"' - return ("", p) - index = normp.find('\\', index + 1) - if index == -1: - index = len(p) - return p[:index], p[index:] - return '', p - - -# Split a path in head (everything up to the last '/') and tail (the -# rest). After the trailing '/' is stripped, the invariant -# join(head, tail) == p holds. -# The resulting head won't end in '/' unless it is the root. - -def split(p): - """Split a pathname. - - Return tuple (head, tail) where tail is everything after the final slash. - Either part may be empty.""" - - d, p = splitdrive(p) - # set i to index beyond p's last slash - i = len(p) - while i and p[i-1] not in '/\\': - i = i - 1 - head, tail = p[:i], p[i:] # now tail has no slashes - # remove trailing slashes from head, unless it's all slashes - head2 = head - while head2 and head2[-1] in '/\\': - head2 = head2[:-1] - head = head2 or head - return d + head, tail - - -# Split a path in root and extension. -# The extension is everything starting at the last dot in the last -# pathname component; the root is everything before that. -# It is always true that root + ext == p. 
- -def splitext(p): - return genericpath._splitext(p, sep, altsep, extsep) -splitext.__doc__ = genericpath._splitext.__doc__ - - -# Return the tail (basename) part of a path. - -def basename(p): - """Returns the final component of a pathname""" - return split(p)[1] - - -# Return the head (dirname) part of a path. - -def dirname(p): - """Returns the directory component of a pathname""" - return split(p)[0] - -# Is a path a symbolic link? -# This will always return false on systems where posix.lstat doesn't exist. - -def islink(path): - """Test for symbolic link. - On WindowsNT/95 and OS/2 always returns false - """ - return False - -# alias exists to lexists -lexists = exists - -# Is a path a mount point? Either a root (with or without drive letter) -# or an UNC path with at most a / or \ after the mount point. - -def ismount(path): - """Test whether a path is a mount point (defined as root of drive)""" - unc, rest = splitunc(path) - if unc: - return rest in ("", "/", "\\") - p = splitdrive(path)[1] - return len(p) == 1 and p[0] in '/\\' - - -# Directory tree walk. -# For each directory under top (including top itself, but excluding -# '.' and '..'), func(arg, dirname, filenames) is called, where -# dirname is the name of the directory and filenames is the list -# of files (and subdirectories etc.) in the directory. -# The func may modify the filenames list, to implement a filter, -# or to impose a different order of visiting. - -def walk(top, func, arg): - """Directory tree walk with callback function. - - For each directory in the directory tree rooted at top (including top - itself, but excluding '.' and '..'), call func(arg, dirname, fnames). - dirname is the name of the directory, and fnames a list of the names of - the files and subdirectories in dirname (excluding '.' and '..'). func - may modify the fnames list in-place (e.g. 
via del or slice assignment), - and walk will only recurse into the subdirectories whose names remain in - fnames; this can be used to implement a filter, or to impose a specific - order of visiting. No semantics are defined for, or required of, arg, - beyond that arg is always passed to func. It can be used, e.g., to pass - a filename pattern, or a mutable object designed to accumulate - statistics. Passing None for arg is common.""" - warnings.warnpy3k("In 3.x, os.path.walk is removed in favor of os.walk.", - stacklevel=2) - try: - names = os.listdir(top) - except os.error: - return - func(arg, top, names) - for name in names: - name = join(top, name) - if isdir(name): - walk(name, func, arg) - - -# Expand paths beginning with '~' or '~user'. -# '~' means $HOME; '~user' means that user's home directory. -# If the path doesn't begin with '~', or if the user or $HOME is unknown, -# the path is returned unchanged (leaving error reporting to whatever -# function is called with the expanded path as argument). -# See also module 'glob' for expansion of *, ? and [...] in pathnames. -# (A function should also be defined to do full *sh-style environment -# variable expansion.) - -def expanduser(path): - """Expand ~ and ~user constructs. - - If user or $HOME is unknown, do nothing.""" - if path[:1] != '~': - return path - i, n = 1, len(path) - while i < n and path[i] not in '/\\': - i = i + 1 - - if 'HOME' in os.environ: - userhome = os.environ['HOME'] - elif 'USERPROFILE' in os.environ: - userhome = os.environ['USERPROFILE'] - elif not 'HOMEPATH' in os.environ: - return path - else: - try: - drive = os.environ['HOMEDRIVE'] - except KeyError: - drive = '' - userhome = join(drive, os.environ['HOMEPATH']) - - if i != 1: #~user - userhome = join(dirname(userhome), path[1:i]) - - return userhome + path[i:] - - -# Expand paths containing shell variable substitutions. 
-# The following rules apply: -# - no expansion within single quotes -# - '$$' is translated into '$' -# - '%%' is translated into '%' if '%%' are not seen in %var1%%var2% -# - ${varname} is accepted. -# - $varname is accepted. -# - %varname% is accepted. -# - varnames can be made out of letters, digits and the characters '_-' -# (though is not verifed in the ${varname} and %varname% cases) -# XXX With COMMAND.COM you can use any characters in a variable name, -# XXX except '^|<>='. - -def expandvars(path): - """Expand shell variables of the forms $var, ${var} and %var%. - - Unknown variables are left unchanged.""" - if '$' not in path and '%' not in path: - return path - import string - varchars = string.ascii_letters + string.digits + '_-' - res = '' - index = 0 - pathlen = len(path) - while index < pathlen: - c = path[index] - if c == '\'': # no expansion within single quotes - path = path[index + 1:] - pathlen = len(path) - try: - index = path.index('\'') - res = res + '\'' + path[:index + 1] - except ValueError: - res = res + path - index = pathlen - 1 - elif c == '%': # variable or '%' - if path[index + 1:index + 2] == '%': - res = res + c - index = index + 1 - else: - path = path[index+1:] - pathlen = len(path) - try: - index = path.index('%') - except ValueError: - res = res + '%' + path - index = pathlen - 1 - else: - var = path[:index] - if var in os.environ: - res = res + os.environ[var] - else: - res = res + '%' + var + '%' - elif c == '$': # variable or '$$' - if path[index + 1:index + 2] == '$': - res = res + c - index = index + 1 - elif path[index + 1:index + 2] == '{': - path = path[index+2:] - pathlen = len(path) - try: - index = path.index('}') - var = path[:index] - if var in os.environ: - res = res + os.environ[var] - else: - res = res + '${' + var + '}' - except ValueError: - res = res + '${' + path - index = pathlen - 1 - else: - var = '' - index = index + 1 - c = path[index:index + 1] - while c != '' and c in varchars: - var = var + c - index 
= index + 1 - c = path[index:index + 1] - if var in os.environ: - res = res + os.environ[var] - else: - res = res + '$' + var - if c != '': - index = index - 1 - else: - res = res + c - index = index + 1 - return res - - -# Normalize a path, e.g. A//B, A/./B and A/foo/../B all become A\B. -# Previously, this function also truncated pathnames to 8+3 format, -# but as this module is called "ntpath", that's obviously wrong! - -def normpath(path): - """Normalize path, eliminating double slashes, etc.""" - # Preserve unicode (if path is unicode) - backslash, dot = (u'\\', u'.') if isinstance(path, unicode) else ('\\', '.') - if path.startswith(('\\\\.\\', '\\\\?\\')): - # in the case of paths with these prefixes: - # \\.\ -> device names - # \\?\ -> literal paths - # do not do any normalization, but return the path unchanged - return path - path = path.replace("/", "\\") - prefix, path = splitdrive(path) - # We need to be careful here. If the prefix is empty, and the path starts - # with a backslash, it could either be an absolute path on the current - # drive (\dir1\dir2\file) or a UNC filename (\\server\mount\dir1\file). It - # is therefore imperative NOT to collapse multiple backslashes blindly in - # that case. - # The code below preserves multiple backslashes when there is no drive - # letter. This means that the invalid filename \\\a\b is preserved - # unchanged, where a\\\b is normalised to a\b. It's not clear that there - # is any better behaviour for such edge cases. 
- if prefix == '': - # No drive letter - preserve initial backslashes - while path[:1] == "\\": - prefix = prefix + backslash - path = path[1:] - else: - # We have a drive letter - collapse initial backslashes - if path.startswith("\\"): - prefix = prefix + backslash - path = path.lstrip("\\") - comps = path.split("\\") - i = 0 - while i < len(comps): - if comps[i] in ('.', ''): - del comps[i] - elif comps[i] == '..': - if i > 0 and comps[i-1] != '..': - del comps[i-1:i+1] - i -= 1 - elif i == 0 and prefix.endswith("\\"): - del comps[i] - else: - i += 1 - else: - i += 1 - # If the path is now empty, substitute '.' - if not prefix and not comps: - comps.append(dot) - return prefix + backslash.join(comps) - - -# Return an absolute path. -try: - from nt import _getfullpathname - -except ImportError: # no built-in nt module - maybe it's Jython ;) - - if os._name == 'nt' : - # on Windows so Java version of sys deals in NT paths - def abspath(path): - """Return the absolute version of a path.""" - try: - if isinstance(path, unicode): - # Result must be unicode - if path: - path = sys.getPath(path) - else: - # Empty path must return current working directory - path = os.getcwdu() - else: - # Result must be bytes - if path: - path = sys.getPath(path).encode('latin-1') - else: - # Empty path must return current working directory - path = os.getcwd() - except EnvironmentError: - pass # Bad path - return unchanged. 
- return normpath(path) - - else: - # not running on Windows - mock up something sensible - def abspath(path): - """Return the absolute version of a path.""" - try: - if isinstance(path, unicode): - # Result must be unicode - if path: - path = join(os.getcwdu(), path) - else: - # Empty path must return current working directory - path = os.getcwdu() - else: - # Result must be bytes - if path: - path = join(os.getcwd(), path) - else: - # Empty path must return current working directory - path = os.getcwd() - except EnvironmentError: - pass # Bad path - return unchanged. - return normpath(path) - -else: # use native Windows method on Windows - def abspath(path): - """Return the absolute version of a path.""" - - if path: # Empty path must return current working directory. - try: - path = _getfullpathname(path) - except WindowsError: - pass # Bad path - return unchanged. - elif isinstance(path, unicode): - path = os.getcwdu() - else: - path = os.getcwd() - return normpath(path) - -# realpath is a no-op on systems without islink support -realpath = abspath -# Win9x family and earlier have no Unicode filename support. 
-supports_unicode_filenames = (hasattr(sys, "getwindowsversion") and - sys.getwindowsversion()[3] >= 2) - -def _abspath_split(path): - abs = abspath(normpath(path)) - prefix, rest = splitunc(abs) - is_unc = bool(prefix) - if not is_unc: - prefix, rest = splitdrive(abs) - return is_unc, prefix, [x for x in rest.split(sep) if x] - -def relpath(path, start=curdir): - """Return a relative version of a path""" - - if not path: - raise ValueError("no path specified") - - start_is_unc, start_prefix, start_list = _abspath_split(start) - path_is_unc, path_prefix, path_list = _abspath_split(path) - - if path_is_unc ^ start_is_unc: - raise ValueError("Cannot mix UNC and non-UNC paths (%s and %s)" - % (path, start)) - if path_prefix.lower() != start_prefix.lower(): - if path_is_unc: - raise ValueError("path is on UNC root %s, start on UNC root %s" - % (path_prefix, start_prefix)) - else: - raise ValueError("path is on drive %s, start on drive %s" - % (path_prefix, start_prefix)) - # Work out how much of the filepath is shared by start and path. - i = 0 - for e1, e2 in zip(start_list, path_list): - if e1.lower() != e2.lower(): - break - i += 1 - - rel_list = [pardir] * (len(start_list)-i) + path_list[i:] - if not rel_list: - return curdir - return join(*rel_list) diff --git a/Lib/subprocess.py b/Lib/subprocess.py --- a/Lib/subprocess.py +++ b/Lib/subprocess.py @@ -438,6 +438,7 @@ import java.nio.ByteBuffer import org.python.core.io.RawIOBase import org.python.core.io.StreamIO + from org.python.core.Py import fileSystemDecode else: import select _has_poll = hasattr(select, 'poll') @@ -779,7 +780,7 @@ maintain those byte values (which may be butchered as Strings) for the subprocess if they haven't been modified. 
""" - # Determine what's safe to merge + # Determine what's necessary to merge (new or different) merge_env = dict((key, value) for key, value in env.iteritems() if key not in builder_env or builder_env.get(key) != value) @@ -789,8 +790,10 @@ for entry in entries: if entry.getKey() not in env: entries.remove() - - builder_env.putAll(merge_env) + # add anything new or different in env + for key, value in merge_env.iteritems(): + # If the new value is bytes, assume it to be FS-encoded + builder_env.put(key, fileSystemDecode(value)) class Popen(object): @@ -1308,9 +1311,6 @@ args = _cmdline2listimpl(args) else: args = list(args) - # NOTE: CPython posix (execv) will str() any unicode - # args first, maybe we should do the same on - # posix. Windows passes unicode through, however if any(not isinstance(arg, (str, unicode)) for arg in args): raise TypeError('args must contain only strings') args = _escape_args(args) @@ -1321,6 +1321,11 @@ if executable is not None: args[0] = executable + # NOTE: CPython posix (execv) will FS-encode any unicode args, but + # pass on bytes unchanged, because that's what the system expects. + # Java expects unicode, so we do the converse: leave unicode + # unchanged but FS-decode any supplied as bytes. + args = [fileSystemDecode(arg) for arg in args] builder = java.lang.ProcessBuilder(args) if stdin is None: @@ -1330,16 +1335,20 @@ if stderr is None: builder.redirectError(java.lang.ProcessBuilder.Redirect.INHERIT) - # os.environ may be inherited for compatibility with CPython + # os.environ may be inherited for compatibility with CPython. + # Elements taken from os.environ are FS-decoded to unicode. _setup_env(dict(os.environ if env is None else env), builder.environment()) + # The current working directory must also be unicode. 
if cwd is None: - cwd = os.getcwd() - elif not os.path.exists(cwd): - raise OSError(errno.ENOENT, os.strerror(errno.ENOENT), cwd) - elif not os.path.isdir(cwd): - raise OSError(errno.ENOTDIR, os.strerror(errno.ENOTDIR), cwd) + cwd = os.getcwdu() + else: + cwd = fileSystemDecode(cwd) + if not os.path.exists(cwd): + raise OSError(errno.ENOENT, os.strerror(errno.ENOENT), cwd) + elif not os.path.isdir(cwd): + raise OSError(errno.ENOTDIR, os.strerror(errno.ENOTDIR), cwd) builder.directory(java.io.File(cwd)) # Let Java manage redirection of stderr to stdout (it's more @@ -1890,9 +1899,10 @@ args = _cmdline2listimpl(command) args = _escape_args(args) args = _shell_command + args - cwd = os.getcwd() + cwd = os.getcwdu() - + # Python supplies FS-encoded arguments while Java expects String + args = [fileSystemDecode(arg) for arg in args] builder = java.lang.ProcessBuilder(args) builder.directory(java.io.File(cwd)) diff --git a/Lib/sysconfig.py b/Lib/sysconfig.py --- a/Lib/sysconfig.py +++ b/Lib/sysconfig.py @@ -5,6 +5,11 @@ import os from os.path import pardir, realpath +def fileSystemEncode(path): + if isinstance(path, unicode): + return path.encode(sys.getfilesystemencoding()) + return path + _INSTALL_SCHEMES = { 'posix_prefix': { 'stdlib': '{base}/lib/python{py_version_short}', @@ -116,6 +121,7 @@ def _safe_realpath(path): try: + path = fileSystemEncode(path) return realpath(path) except OSError: return path diff --git a/Lib/test/test_exceptions.py b/Lib/test/test_exceptions.py --- a/Lib/test/test_exceptions.py +++ b/Lib/test/test_exceptions.py @@ -524,7 +524,6 @@ self.check_same_msg(Exception(), '') - @unittest.skipIf(is_jython, "FIXME: not working in Jython") def test_0_args_with_overridden___str__(self): """Check same msg for exceptions with 0 args and overridden __str__""" # str() and unicode() on an exception with overridden __str__ that @@ -550,7 +549,6 @@ self.assertRaises(UnicodeEncodeError, str, e) self.assertEqual(unicode(e), u'f\xf6\xf6') - 
@unittest.skipIf(is_jython, "FIXME: not working in Jython") def test_1_arg_with_overridden___str__(self): """Check same msg for exceptions with overridden __str__ and 1 arg""" # when __str__ is overridden and __unicode__ is not implemented @@ -575,7 +573,6 @@ for args in argslist: self.check_same_msg(Exception(*args), repr(args)) - @unittest.skipIf(is_jython, "FIXME: not working in Jython") def test_many_args_with_overridden___str__(self): """Check same msg for exceptions with overridden __str__ and many args""" # if __str__ returns an ascii string / ascii unicode string diff --git a/Lib/test/test_exceptions_jy.py b/Lib/test/test_exceptions_jy.py --- a/Lib/test/test_exceptions_jy.py +++ b/Lib/test/test_exceptions_jy.py @@ -70,11 +70,12 @@ # But the exception hook, via Py#displayException, does not fail when attempting to __str__ the exception args with test_support.captured_stderr() as s: sys.excepthook(RuntimeError, u"Drink \u2615", None) - self.assertEqual(s.getvalue(), "RuntimeError\n") + # At minimum, it tells us what kind of exception it was + self.assertEqual(s.getvalue()[:12], "RuntimeError") # It is fine with ascii values, of course with test_support.captured_stderr() as s: sys.excepthook(RuntimeError, u"Drink java", None) - self.assertEqual(s.getvalue(), "RuntimeError: Drink java\n") + self.assertEqual(s.getvalue(), "RuntimeError: Drink java\n") def test_main(): diff --git a/Lib/test/test_httpservers.py b/Lib/test/test_httpservers.py --- a/Lib/test/test_httpservers.py +++ b/Lib/test/test_httpservers.py @@ -378,6 +378,9 @@ @unittest.skipIf(hasattr(os, 'geteuid') and os.geteuid() == 0, "This test can't be run reliably as root (issue #13308).") +@unittest.skipIf((not hasattr(os, 'symlink')) and + sys.executable.encode('ascii', 'replace') != sys.executable, + "Executable path is not pure ASCII.") # these fail for CPython too class CGIHTTPServerTestCase(BaseTestCase): class request_handler(NoLogRequestHandler, CGIHTTPRequestHandler): pass diff --git
a/Lib/test/test_java_visibility.py b/Lib/test/test_java_visibility.py --- a/Lib/test/test_java_visibility.py +++ b/Lib/test/test_java_visibility.py @@ -13,6 +13,7 @@ from org.python.tests.multihidden import BaseConnection class VisibilityTest(unittest.TestCase): + def test_invisible(self): for item in dir(Invisible): self.assert_(not item.startswith("package")) @@ -178,6 +179,7 @@ class JavaClassTest(unittest.TestCase): + def test_class_methods_visible(self): self.assertFalse(HashMap.isInterface(), 'java.lang.Class methods should be visible on Class instances') @@ -198,6 +200,7 @@ self.assertEquals(3, s.b, "Defined fields should take precedence") class CoercionTest(unittest.TestCase): + def test_int_coercion(self): c = Coercions() self.assertEquals("5", c.takeInt(5)) @@ -234,6 +237,7 @@ self.assertEquals(c.tellClassNameObject(ht), "class java.util.Hashtable") class RespectJavaAccessibilityTest(unittest.TestCase): + def run_accessibility_script(self, script, error=AttributeError): fn = test_support.findfile(script) self.assertRaises(error, execfile, fn) @@ -254,6 +258,7 @@ self.run_accessibility_script("call_overridden_method.py") class ClassloaderTest(unittest.TestCase): + def test_loading_classes_without_import(self): cl = test_support.make_jar_classloader("../callbacker_test.jar") X = cl.loadClass("org.python.tests.Callbacker") @@ -265,11 +270,13 @@ self.assertEquals(None, called[0]) def test_main(): - test_support.run_unittest(VisibilityTest, + test_support.run_unittest( + VisibilityTest, JavaClassTest, CoercionTest, RespectJavaAccessibilityTest, - ClassloaderTest) + ClassloaderTest + ) if __name__ == "__main__": test_main() diff --git a/Lib/test/test_jser.py b/Lib/test/test_jser.py --- a/Lib/test/test_jser.py +++ b/Lib/test/test_jser.py @@ -15,7 +15,9 @@ class JavaSerializationTests(unittest.TestCase): def setUp(self): - self.sername = os.path.join(sys.prefix, "test.ser") + name = os.path.join(sys.prefix, "test.ser") + # As we are using java.io directly, ensure 
file name is a unicode + self.sername = name.decode(sys.getfilesystemencoding()) def tearDown(self): os.remove(self.sername) diff --git a/Lib/test/test_jython_launcher.py b/Lib/test/test_jython_launcher.py --- a/Lib/test/test_jython_launcher.py +++ b/Lib/test/test_jython_launcher.py @@ -31,7 +31,6 @@ # by the installer return executable - def get_uname(): _uname = None try: @@ -49,9 +48,8 @@ class TestLauncher(unittest.TestCase): - + def get_cmdline(self, cmd, env): - output = subprocess.check_output(cmd, env=env).rstrip() if is_windows: return subprocess._cmdline2list(output) @@ -76,7 +74,7 @@ k, v = arg[2:].split("=") props[k] = v return props - + def test_classpath_env(self): env = self.get_newenv() env["CLASSPATH"] = some_jar @@ -207,7 +205,7 @@ def test_file(self): self.assertCommand(['test.py']) - + def test_dash(self): self.assertCommand(['-i']) diff --git a/Lib/test/test_ssl.py b/Lib/test/test_ssl.py --- a/Lib/test/test_ssl.py +++ b/Lib/test/test_ssl.py @@ -27,7 +27,13 @@ HOST = support.HOST def data_file(*name): - return os.path.join(os.path.dirname(__file__), *name) + file = os.path.join(os.path.dirname(__file__), *name) + # Ensure we return unicode path. This tweak is not a divergence: + # CPython 2.7.13 fails the same way for a non-ascii location. + if isinstance(file, unicode): + return file + else: + return file.decode(sys.getfilesystemencoding()) # The custom key and certificate files used in test_ssl are generated # using Lib/test/make_ssl_certs.py. 
diff --git a/Lib/test/test_support.py b/Lib/test/test_support.py --- a/Lib/test/test_support.py +++ b/Lib/test/test_support.py @@ -509,7 +509,7 @@ if is_jython: # Jython disallows @ in module names TESTFN = '$test' - TESTFN_UNICODE = "$test-\xe0\xf2" + TESTFN_UNICODE = u"$test-\u87d2\u86c7" # = test python (Chinese) TESTFN_ENCODING = sys.getfilesystemencoding() elif os.name == 'riscos': TESTFN = 'testfile' diff --git a/Lib/test/test_zipimport_jy.py b/Lib/test/test_zipimport_jy.py --- a/Lib/test/test_zipimport_jy.py +++ b/Lib/test/test_zipimport_jy.py @@ -51,8 +51,10 @@ A(path).somevar = 1 def test_main(): - test_support.run_unittest(SyspathZipimportTest) - test_support.run_unittest(ZipImporterDictTest) + test_support.run_unittest( + SyspathZipimportTest, + ZipImporterDictTest + ) if __name__ == "__main__": test_main() diff --git a/build.xml b/build.xml --- a/build.xml +++ b/build.xml @@ -236,6 +236,7 @@ output.dir = '${output.dir}' compile.dir = '${compile.dir}' exposed.dir = '${exposed.dir}' + gensrc.dir = '${gensrc.dir}' dist.dir = '${dist.dir}' apidoc.dir = '${apidoc.dir}' templates.dir = '${templates.dir}' @@ -434,6 +435,7 @@ + @@ -694,6 +696,7 @@ String), decoded if necessary + * from a Python bytes object, using the file system encoding. In Jython, this + * encoding is UTF-8, irrespective of the OS platform. This method is comparable with Python 3 + * os.fsdecode, but for Java use, in places such as the os module. If + * the argument is not a PyUnicode, it will be decoded using the nominal Jython + * file system encoding. If the argument is a PyUnicode, its + * String is returned. 
+ * + * @param filename as bytes to decode, or already as unicode + * @return unicode version of path + */ + public static String fileSystemDecode(PyString filename) { + String s = filename.getString(); + if (filename instanceof PyUnicode || CharMatcher.ascii().matchesAllOf(s)) { + // Already encoded or usable as ASCII + return s; + } else { + // It's bytes, so must decode properly + assert "utf-8".equals(PySystemState.FILE_SYSTEM_ENCODING.toString()); + return codecs.PyUnicode_DecodeUTF8(s, null); + } + } + + /** + * As {@link #fileSystemDecode(PyString)} but raising ValueError if not a + * str or unicode. + * + * @param filename as bytes to decode, or already as unicode + * @return unicode version of the file name + */ + public static String fileSystemDecode(PyObject filename) { + if (filename instanceof PyString) { + return fileSystemDecode((PyString)filename); + } else + throw Py.TypeError(String.format("coercing to Unicode: need string, %s type found", + filename.getType().fastGetName())); + } + + /** + * Return a PyString object we can use as a file name or file path in places where Python + * expects a bytes (that is a str) object in the file system encoding. + * In Jython, this encoding is UTF-8, irrespective of the OS platform. + *

+ * This is subtly different from CPython's use of "file system encoding", which tracks the + * platform's choice so that OS services may be called that have a bytes interface. Jython's + * interaction with the OS occurs via Java using String arguments representing Unicode values, + * so we have no need to match the encoding actually chosen by the platform (e.g. 'mbcs' on + * Windows). Rather we need a nominal Jython file system encoding, for use where the standard + * library forces byte paths on us (in Python 2). There is no reason for this choice to vary + * with OS platform. Methods receiving paths as bytes will + * {@link #fileSystemDecode(PyString)} them again for Java. + * + * @param filename as unicode to encode, or already as bytes + * @return encoded bytes version of path + */ + public static PyString fileSystemEncode(String filename) { + if (CharMatcher.ascii().matchesAllOf(filename)) { + // Just wrap it as US-ASCII is a subset of the file system encoding + return Py.newString(filename); + } else { + // It's non just US-ASCII, so must encode properly + assert "utf-8".equals(PySystemState.FILE_SYSTEM_ENCODING.toString()); + return Py.newString(codecs.PyUnicode_EncodeUTF8(filename, null)); + } + } + + /** + * Return a PyString object we can use as a file name or file path in places where Python + * expects a bytes (that is, str) object in the file system encoding. + * In Jython, this encoding is UTF-8, irrespective of the OS platform. This method is comparable + * with Python 3 os.fsencode. If the argument is a PyString, it is returned + * unchanged. If the argument is a PyUnicode, it is converted to a bytes using the + * nominal Jython file system encoding. + * + * @param filename as unicode to encode, or already as bytes + * @return encoded bytes version of path + */ + public static PyString fileSystemEncode(PyString filename) { + return (filename instanceof PyUnicode) ? 
fileSystemEncode(filename.getString()) : filename; + } + + /** + * Convert a PyList path to a list of Java String objects decoded from + * the path elements to strings guaranteed usable in the Java API. + * + * @param path a Python search path + * @return equivalent Java list + */ + private static List fileSystemDecode(PyList path) { + List list = new ArrayList<>(path.__len__()); + for (PyObject filename : path.getList()) { + list.add(fileSystemDecode(filename)); + } + return list; + } + public static PyStringMap newStringMap() { // enable lazy bootstrapping (see issue #1671) if (!PyType.hasBuilder(PyStringMap.class)) { @@ -1073,11 +1174,11 @@ } Py.getSystemState().callExitFunc(); } - //XXX: this needs review to make sure we are cutting out all of the Java - // exceptions. + + //XXX: this needs review to make sure we are cutting out all of the Java exceptions. private static String getStackTrace(Throwable javaError) { - ByteArrayOutputStream buf = new ByteArrayOutputStream(); - javaError.printStackTrace(new PrintStream(buf)); + CharArrayWriter buf = new CharArrayWriter(); + javaError.printStackTrace(new PrintWriter(buf)); String str = buf.toString(); int index = -1; @@ -1170,31 +1271,55 @@ ts.exception = null; } - public static void displayException(PyObject type, PyObject value, PyObject tb, - PyObject file) { + /** + * Print the description of an exception as a big string. The arguments are closely equivalent + * to the tuple returned by Python sys.exc_info, on standard error or a given + * byte-oriented file. Compare with Python traceback.print_exception. 
+ * + * @param type of exception + * @param value the exception parameter (second argument to raise) + * @param tb traceback of the call stack where the exception originally occurred + * @param file to print encoded string to, or null meaning standard error + */ + public static void displayException(PyObject type, PyObject value, PyObject tb, PyObject file) { + + // Output is to standard error, unless a file object has been given. StdoutWrapper stderr = Py.stderr; if (file != null) { stderr = new FixedFileWrapper(file); } flushLine(); + // The creation of the report operates entirely in Java String (to support Unicode). + String formattedException = exceptionToString(type, value, tb); + stderr.print(formattedException); + } + + /** + * Format the description of an exception as a big string. The arguments are closely equivalent + * to the tuple returned by Python sys.exc_info. Compare with Python + * traceback.format_exception. + * + * @param type of exception + * @param value the exception parameter (second argument to raise) + * @param tb traceback of the call stack where the exception originally occurred + * @return string representation of the traceback and exception + */ + static String exceptionToString(PyObject type, PyObject value, PyObject tb) { + + // Compose the stack dump, syntax error, and actual exception in this buffer: + StringBuilder buf; + if (tb instanceof PyTraceback) { - stderr.print(((PyTraceback) tb).dumpStack()); + buf = new StringBuilder(((PyTraceback)tb).dumpStack()); + } else { + buf = new StringBuilder(); } + if (__builtin__.isinstance(value, Py.SyntaxError)) { - PyObject filename = value.__findattr__("filename"); - PyObject text = value.__findattr__("text"); - PyObject lineno = value.__findattr__("lineno"); - stderr.print(" File \""); - stderr.print(filename == Py.None || filename == null ? - "" : filename.toString()); - stderr.print("\", line "); - stderr.print(lineno == null ? 
Py.newString("0") : lineno); - stderr.print("\n"); - if (text != Py.None && text != null && text.__len__() != 0) { - printSyntaxErrorText(stderr, value.__findattr__("offset").asInt(), - text.toString()); - } + // The value part of the exception is a syntax error: first emit that. + appendSyntaxError(buf, value); + // Now supersede it with just the syntax error message for the next phase. value = value.__findattr__("msg"); if (value == null) { value = Py.None; @@ -1203,26 +1328,53 @@ if (value.getJavaProxy() != null) { Object javaError = value.__tojava__(Throwable.class); - if (javaError != null && javaError != Py.NoConversion) { - stderr.println(getStackTrace((Throwable) javaError)); + // The value is some Java Throwable: append that too + buf.append(getStackTrace((Throwable)javaError)); } } + + // Be prepared for formatting the value part to fail (fall back to just the type) try { - stderr.println(formatException(type, value)); + buf.append(formatException(type, value)); } catch (Exception ex) { - stderr.println(formatException(type, Py.None)); + buf.append(formatException(type, Py.None)); + } + buf.append('\n'); + + return buf.toString(); + } + + /** + * Helper to {@link #tracebackToString(PyObject, PyObject)} when the value in an exception turns + * out to be a syntax error. + */ + private static void appendSyntaxError(StringBuilder buf, PyObject value) { + + PyObject filename = value.__findattr__("filename"); + PyObject text = value.__findattr__("text"); + PyObject lineno = value.__findattr__("lineno"); + + buf.append(" File \""); + buf.append(filename == Py.None || filename == null ? "" : filename.toString()); + buf.append("\", line "); + buf.append(lineno == null ? Py.newString('0') : lineno); + buf.append('\n'); + + if (text != Py.None && text != null && text.__len__() != 0) { + appendSyntaxErrorText(buf, value.__findattr__("offset").asInt(), text.toString()); } } + /** - * Print the two lines showing where a SyntaxError was caused. 
+ * Generate two lines showing where a SyntaxError was caused. * - * @param out StdoutWrapper to print to + * @param buf to append with generated message text * @param offset the offset into text - * @param text a source code String line + * @param text a source code line */ - private static void printSyntaxErrorText(StdoutWrapper out, int offset, String text) { + private static void appendSyntaxErrorText(StringBuilder buf, int offset, String text) { if (offset >= 0) { if (offset > 0 && offset == text.length()) { offset--; @@ -1250,19 +1402,21 @@ text = text.substring(i, text.length()); } - out.print(" "); - out.print(text); + buf.append(" "); + buf.append(text); if (text.length() == 0 || !text.endsWith("\n")) { - out.print("\n"); + buf.append('\n'); } if (offset == -1) { return; } - out.print(" "); + + // The indicator line " ^" + buf.append(" "); for (offset--; offset > 0; offset--) { - out.print(" "); + buf.append(' '); } - out.print("^\n"); + buf.append("^\n"); } public static String formatException(PyObject type, PyObject value) { @@ -1290,19 +1444,34 @@ } buf.append(className); } else { - buf.append(useRepr ? type.__repr__() : type.__str__()); + // Never happens since Python 2.7? Do something sensible anyway. + buf.append(asMessageString(type, useRepr)); } + if (value != null && value != Py.None) { - // only print colon if the str() of the object is not the empty string - PyObject s = useRepr ? 
value.__repr__() : value.__str__(); - if (!(s instanceof PyString) || s.__len__() != 0) { - buf.append(": "); + String s = asMessageString(value, useRepr); + // Print colon and object (unless it renders as "") + if (s.length() > 0) { + buf.append(": ").append(s); } - buf.append(s); } + return buf.toString(); } + /** Defensive method to avoid exceptions from decoding (or import encodings) */ + private static String asMessageString(PyObject value, boolean useRepr) { + if (useRepr) + value = value.__repr__(); + if (value instanceof PyUnicode) { + return value.asString(); + } else { + // Carefully avoid decoding errors that would swallow the intended message + String s = value.__str__().getString(); + return PyString.encode_UnicodeEscape(s, false); + } + } + public static void writeUnraisable(Throwable unraisable, PyObject obj) { PyException pye = JavaError(unraisable); stderr.println(String.format("Exception %s in %s ignored", @@ -1565,6 +1734,16 @@ } } + private static final String IMPORT_SITE_ERROR = "" + + "Cannot import site module and its dependencies: %s\n" + + "Determine if the following attributes are correct:\n" // + + " * sys.path: %s\n" + + " This attribute might be including the wrong directories, such as from CPython\n" + + " * sys.prefix: %s\n" + + " This attribute is set by the system property python.home, although it can\n" + + " be often automatically determined by the location of the Jython jar file\n\n" + + "You can use the -S option or python.import.site=false to not import the site module"; + public static boolean importSiteIfSelected() { if (Options.importSite) { try { @@ -1574,18 +1753,10 @@ } catch (PyException pye) { if (pye.match(Py.ImportError)) { PySystemState sys = Py.getSystemState(); - throw Py.ImportError(String.format("" - + "Cannot import site module and its dependencies: %s\n" - + "Determine if the following attributes are correct:\n" - + " * sys.path: %s\n" - + " This attribute might be including the wrong directories, such as from 
CPython\n" - + " * sys.prefix: %s\n" - + " This attribute is set by the system property python.home, although it can\n" - + " be often automatically determined by the location of the Jython jar file\n\n" - + "You can use the -S option or python.import.site=false to not import the site module", - pye.value.__getattr__("args").__getitem__(0), - sys.path, - sys.prefix)); + String value = pye.value.__getattr__("args").__getitem__(0).toString(); + List path = fileSystemDecode(sys.path); + throw Py.ImportError( + String.format(IMPORT_SITE_ERROR, value, path, PySystemState.prefix)); } else { throw pye; } @@ -2266,7 +2437,7 @@ } /* Here we would actually like to call cls.__findattr__("__metaclass__") * rather than cls.getType(). However there are circumstances where the - * metaclass doesn't show up as __metaclass__. On the other hand we need + * metaclass doesn't show up as __metaclass__. On the other hand we need * to avoid that checker refers to builtin type___subclasscheck__ or * type___instancecheck__. Filtering out checker-instances of * PyBuiltinMethodNarrow does the trick. 
We also filter out PyMethodDescr diff --git a/src/org/python/core/PyBaseException.java b/src/org/python/core/PyBaseException.java --- a/src/org/python/core/PyBaseException.java +++ b/src/org/python/core/PyBaseException.java @@ -169,12 +169,17 @@ @ExposedMethod(doc = BuiltinDocs.BaseException___str___doc) final PyString BaseException___str__() { switch (args.__len__()) { - case 0: - return Py.EmptyString; - case 1: - return args.__getitem__(0).__str__(); - default: - return args.__str__(); + case 0: + return Py.EmptyString; + case 1: + PyObject arg = args.__getitem__(0); + if (arg instanceof PyString) { + return (PyString)arg; + } else { + return arg.__str__(); + } + default: + return args.__str__(); } } diff --git a/src/org/python/core/PyBytecode.java b/src/org/python/core/PyBytecode.java --- a/src/org/python/core/PyBytecode.java +++ b/src/org/python/core/PyBytecode.java @@ -116,11 +116,13 @@ throw Py.AttributeError(name); } + @Override public void __setattr__(String name, PyObject value) { // no writable attributes throwReadonly(name); } + @Override public void __delattr__(String name) { throwReadonly(name); } @@ -137,6 +139,7 @@ return new PyTuple(pystr); } + @Override public PyObject __findattr_ex__(String name) { // have to craft co_varnames specially if (name == "co_varnames") { @@ -149,7 +152,7 @@ return toPyStringTuple(co_freevars); } if (name == "co_filename") { - return new PyString(co_filename); + return Py.fileSystemEncode(co_filename); // bytes object expected by clients } if (name == "co_name") { return new PyString(co_name); @@ -1156,7 +1159,7 @@ "zap" this information, to prevent END_FINALLY from re-raising the exception. (But non-local gotos should still be resumed.) 
- */ + */ PyObject exit; PyObject u = stack.pop(), v, w; if (u == Py.None) { @@ -1350,7 +1353,7 @@ if (why != Why.RETURN) { retval = Py.None; } - } else { + } else { // store the stack in the frame for reentry from the yield; f.f_savedlocals = stack.popN(stack.size()); } diff --git a/src/org/python/core/PyException.java b/src/org/python/core/PyException.java --- a/src/org/python/core/PyException.java +++ b/src/org/python/core/PyException.java @@ -62,21 +62,31 @@ } private boolean printingStackTrace = false; + @Override public void printStackTrace() { Py.printException(this); } + @Override public Throwable fillInStackTrace() { return Options.includeJavaStackInExceptions ? super.fillInStackTrace() : this; } + @Override public synchronized void printStackTrace(PrintStream s) { if (printingStackTrace) { super.printStackTrace(s); } else { try { + /* + * Ensure that non-ascii characters are made printable. IOne would prefer to emit + * Unicode, but the output stream too often only accepts bytes. (s is not + * necessarily a console, e.g. during a doctest.) + */ + PyFile err = new PyFile(s); + err.setEncoding("ascii", "backslashreplace"); printingStackTrace = true; - Py.displayException(type, value, traceback, new PyFile(s)); + Py.displayException(type, value, traceback, err); } finally { printingStackTrace = false; } @@ -92,12 +102,9 @@ } } + @Override public synchronized String toString() { - ByteArrayOutputStream buf = new ByteArrayOutputStream(); - if (!printingStackTrace) { - printStackTrace(new PrintStream(buf)); - } - return buf.toString(); + return Py.exceptionToString(type, value, traceback); } /** @@ -332,10 +339,11 @@ public static String exceptionClassName(PyObject obj) { return obj instanceof PyClass ? 
((PyClass)obj).__name__ : ((PyType)obj).fastGetName(); } - - + + /* Traverseproc support */ + @Override public int traverse(Visitproc visit, Object arg) { int retValue; if (type != null) { @@ -357,6 +365,7 @@ return 0; } + @Override public boolean refersDirectlyTo(PyObject ob) { return ob != null && (type == ob || value == ob || traceback == ob); } diff --git a/src/org/python/core/PyFile.java b/src/org/python/core/PyFile.java --- a/src/org/python/core/PyFile.java +++ b/src/org/python/core/PyFile.java @@ -168,10 +168,6 @@ ArgParser ap = new ArgParser("file", args, kwds, new String[] {"name", "mode", "buffering"}, 1); PyObject name = ap.getPyObject(0); - if (!(name instanceof PyString)) { - throw Py.TypeError("coercing to Unicode: need string, '" + name.getType().fastGetName() - + "' type found"); - } String mode = ap.getString(1, "r"); int bufsize = ap.getInt(2, -1); file___init__(new FileIO((PyString) name, parseMode(mode)), name, mode, bufsize); diff --git a/src/org/python/core/PyNullImporter.java b/src/org/python/core/PyNullImporter.java --- a/src/org/python/core/PyNullImporter.java +++ b/src/org/python/core/PyNullImporter.java @@ -20,7 +20,7 @@ public PyNullImporter(PyObject pathObj) { super(); - String pathStr = asPath(pathObj); + String pathStr = Py.fileSystemDecode(pathObj); if (pathStr.equals("")) { throw Py.ImportError("empty pathname"); } @@ -42,17 +42,6 @@ return Py.None; } - // FIXME Refactoring move helper function to a central util library - // FIXME Also can take in account working in zip file systems - - private static String asPath(PyObject pathObj) { - if (!(pathObj instanceof PyString)) { - throw Py.TypeError(String.format("coercing to Unicode: need string, %s type found", - pathObj.getType().fastGetName())); - } - return pathObj.toString(); - } - private static boolean isDir(String pathStr) { if (pathStr.equals("")) { return false; diff --git a/src/org/python/core/PyString.java b/src/org/python/core/PyString.java --- 
a/src/org/python/core/PyString.java +++ b/src/org/python/core/PyString.java @@ -79,7 +79,7 @@ } PyString(StringBuilder buffer) { - this(TYPE, new String(buffer)); + this(TYPE, buffer.toString()); } /** @@ -3998,9 +3998,9 @@ * Implements PEP-3101 {}-formatting methods str.format() and * unicode.format(). When called with enclosingIterator == null, this * method takes this object as its formatting string. The method is also called (calls itself) - * to deal with nested formatting sepecifications. In that case, enclosingIterator + * to deal with nested formatting specifications. In that case, enclosingIterator * is a {@link MarkupIterator} on this object and value is a substring of this - * object needing recursive transaltion. + * object needing recursive translation. * * @param args to be interpolated into the string * @param keywords for the trailing args diff --git a/src/org/python/core/PySystemState.java b/src/org/python/core/PySystemState.java --- a/src/org/python/core/PySystemState.java +++ b/src/org/python/core/PySystemState.java @@ -82,6 +82,9 @@ public final static PyString float_repr_style = Py.newString("short"); + /** Nominal Jython file system encoding (as sys.getfilesystemencoding()) */ + static final PyString FILE_SYSTEM_ENCODING = Py.newString("utf-8"); + public static boolean py3kwarning = false; public final static Class flags = Options.class; @@ -109,12 +112,25 @@ public static PackageManager packageManager; private static File cachedir; - private static PyList defaultPath; - private static PyList defaultArgv; - private static PyObject defaultExecutable; + private static PyList defaultPath; // list of bytes or unicode + private static PyList defaultArgv; // list of bytes or unicode + private static PyObject defaultExecutable; // bytes or unicode or None public static Properties registry; // = init_registry(); + /** + * A string giving the site-specific directory prefix where the platform independent Python + * files are installed; by default, this 
is based on the property python.home or + * the location of the Jython JAR. The main collection of Python library modules is installed in + * the directory prefix/Lib. This object should contain bytes in the file system + * encoding for consistency with use in the standard library (see sysconfig.py). + */ public static PyObject prefix; + /** + * A string giving the site-specific directory prefix where the platform-dependent Python files + * are installed; by default, this is the same as {@link #exec_prefix}. This object should + * contain bytes in the file system encoding for consistency with use in the standard library + * (see sysconfig.py). + */ public static PyObject exec_prefix = Py.EmptyString; public static final PyString byteorder = new PyString("big"); @@ -504,7 +520,7 @@ } public PyObject getfilesystemencoding() { - return Py.None; + return FILE_SYSTEM_ENCODING; } @@ -840,10 +856,10 @@ } } if (prefix != null) { - PySystemState.prefix = Py.newString(prefix); + PySystemState.prefix = Py.fileSystemEncode(prefix); } if (exec_prefix != null) { - PySystemState.exec_prefix = Py.newString(exec_prefix); + PySystemState.exec_prefix = Py.fileSystemEncode(exec_prefix); } try { String jythonpath = System.getenv("JYTHONPATH"); @@ -1155,7 +1171,8 @@ } cachedir = new File(props.getProperty(PYTHON_CACHEDIR, CACHEDIR_DEFAULT_NAME)); if (!cachedir.isAbsolute()) { - cachedir = new File(prefix == null ? null : prefix.toString(), cachedir.getPath()); + String prefixString = prefix == null ? null : Py.fileSystemDecode(prefix); + cachedir = new File(prefixString, cachedir.getPath()); } } @@ -1174,16 +1191,16 @@ PyList argv = new PyList(); if (args != null) { for (String arg : args) { - argv.append(Py.newStringOrUnicode(arg)); + argv.append(Py.newStringOrUnicode(arg)); // XXX or always newUnicode? } } return argv; } /** - * Determine the default sys.executable value from the registry. 
- * If registry is not set (as in standalone jython jar), will use sys.prefix + /bin/jython(.exe) and the file may - * not exist. Users can create a wrapper in it's place to make it work in embedded environments. + * Determine the default sys.executable value from the registry. If registry is not set (as in + * standalone jython jar), we will use sys.prefix + /bin/jython(.exe) and the file may not + * exist. Users can create a wrapper in it's place to make it work in embedded environments. * Only if sys.prefix is null, returns Py.None * * @param props a Properties registry @@ -1191,26 +1208,26 @@ */ private static PyObject initExecutable(Properties props) { String executable = props.getProperty("python.executable"); - if (executable == null) { + File executableFile; + if (executable != null) { + // The executable from the registry is a Unicode String path + executableFile = new File(executable); + } else { if (prefix == null) { return Py.None; } else { - executable = prefix.asString() + File.separator + "bin" + File.separator; - if (Platform.IS_WINDOWS) { - executable += "jython.exe"; - } else { - executable += "jython"; - } + // The prefix is a unicode or encoded bytes object + executableFile = new File(Py.fileSystemDecode(prefix), + Platform.IS_WINDOWS ? 
"bin\\jython.exe" : "bin/jython"); } } - File executableFile = new File(executable); try { executableFile = executableFile.getCanonicalFile(); } catch (IOException ioe) { executableFile = executableFile.getAbsoluteFile(); } - return new PyString(executableFile.getPath()); + return Py.newStringOrUnicode(executableFile.getPath()); // XXX always bytes in CPython } /** @@ -1353,8 +1370,8 @@ PyList path = new PyList(); addPaths(path, props.getProperty("python.path", "")); if (prefix != null) { - String libpath = new File(prefix.toString(), "Lib").toString(); - path.append(new PyString(libpath)); + String libpath = new File(Py.fileSystemDecode(prefix), "Lib").toString(); + path.append(Py.fileSystemEncode(libpath)); // XXX or newUnicode? } if (standalone) { // standalone jython: add the /Lib directory inside JYTHON_JAR to the path @@ -1397,7 +1414,8 @@ private static void addPaths(PyList path, String pypath) { StringTokenizer tok = new StringTokenizer(pypath, java.io.File.pathSeparator); while (tok.hasMoreTokens()) { - path.append(new PyString(tok.nextToken().trim())); + // Use unicode object if necessary to represent the element + path.append(Py.newStringOrUnicode(tok.nextToken().trim())); // XXX or newUnicode? 
} } @@ -1540,6 +1558,7 @@ closer.cleanup(); } + @Override public void close() { cleanup(); } public static class PySystemStateCloser { diff --git a/src/org/python/core/PyTableCode.java b/src/org/python/core/PyTableCode.java --- a/src/org/python/core/PyTableCode.java +++ b/src/org/python/core/PyTableCode.java @@ -66,6 +66,7 @@ // co_lnotab, co_stacksize }; + @Override public PyObject __dir__() { PyString members[] = new PyString[__members__.length]; for (int i = 0; i < __members__.length; i++) @@ -80,11 +81,13 @@ throw Py.AttributeError(name); } + @Override public void __setattr__(String name, PyObject value) { // no writable attributes throwReadonly(name); } + @Override public void __delattr__(String name) { throwReadonly(name); } @@ -99,6 +102,7 @@ return new PyTuple(pystr); } + @Override public PyObject __findattr_ex__(String name) { // have to craft co_varnames specially if (name == "co_varnames") { @@ -111,7 +115,7 @@ return toPyStringTuple(co_freevars); } if (name == "co_filename") { - return new PyString(co_filename); + return Py.fileSystemEncode(co_filename); // bytes object expected by clients } if (name == "co_name") { return new PyString(co_name); diff --git a/src/org/python/core/PyUnicode.java b/src/org/python/core/PyUnicode.java --- a/src/org/python/core/PyUnicode.java +++ b/src/org/python/core/PyUnicode.java @@ -89,7 +89,7 @@ } PyUnicode(StringBuilder buffer) { - this(TYPE, new String(buffer)); + this(TYPE, buffer.toString()); } private static StringBuilder fromCodePoints(Iterator iter) { @@ -713,7 +713,7 @@ for (Iterator iter = newSubsequenceIterator(start, stop, step); iter.hasNext();) { buffer.appendCodePoint(iter.next()); } - return createInstance(new String(buffer)); + return createInstance(buffer.toString()); } @ExposedMethod(type = MethodType.CMP, doc = BuiltinDocs.unicode___getslice___doc) diff --git a/src/org/python/core/SyspathArchive.java b/src/org/python/core/SyspathArchive.java --- a/src/org/python/core/SyspathArchive.java +++ 
b/src/org/python/core/SyspathArchive.java @@ -4,7 +4,7 @@ import java.util.zip.*; @Untraversable -public class SyspathArchive extends PyString { +public class SyspathArchive extends PyUnicode { private ZipFile zipFile; public SyspathArchive(String archiveName) throws IOException { diff --git a/src/org/python/core/SyspathJavaLoader.java b/src/org/python/core/SyspathJavaLoader.java --- a/src/org/python/core/SyspathJavaLoader.java +++ b/src/org/python/core/SyspathJavaLoader.java @@ -26,20 +26,20 @@ public SyspathJavaLoader(ClassLoader parent) { super(parent); } - - /** + + /** * Returns a byte[] with the contents read from an InputStream. - * + * * The stream is closed after reading the bytes. - * - * @param input The input stream + * + * @param input The input stream * @param size The number of bytes to read - * + * * @return an array of byte[size] with the contents read * */ private byte[] getBytesFromInputStream(InputStream input, int size) { - try { + try { byte[] buffer = new byte[size]; int nread = 0; while(nread < size) { @@ -56,9 +56,9 @@ } } } - + private byte[] getBytesFromDir(String dir, String name) { - try { + try { File file = getFile(dir, name); if (file == null) { return null; @@ -71,7 +71,7 @@ } } - + private byte[] getBytesFromArchive(SyspathArchive archive, String name) { String entryname = name.replace('.', SLASH_CHAR) + ".class"; ZipEntry ze = archive.getEntry(entryname); @@ -79,7 +79,7 @@ return null; } try { - return getBytesFromInputStream(archive.getInputStream(ze), + return getBytesFromInputStream(archive.getInputStream(ze), (int)ze.getSize()); } catch (IOException e) { return null; @@ -98,11 +98,11 @@ } return pkg; } - + @Override protected Class findClass(String name) throws ClassNotFoundException { PySystemState sys = Py.getSystemState(); - ClassLoader sysClassLoader = sys.getClassLoader(); + ClassLoader sysClassLoader = sys.getClassLoader(); if (sysClassLoader != null) { // sys.classLoader overrides this class loader! 
return sysClassLoader.loadClass(name); @@ -114,13 +114,10 @@ PyObject entry = replacePathItem(sys, i, path); if (entry instanceof SyspathArchive) { SyspathArchive archive = (SyspathArchive)entry; - buffer = getBytesFromArchive(archive, name); + buffer = getBytesFromArchive(archive, name); } else { - if (!(entry instanceof PyUnicode)) { - entry = entry.__str__(); - } - String dir = entry.toString(); - buffer = getBytesFromDir(dir, name); + String dir = Py.fileSystemDecode(entry); + buffer = getBytesFromDir(dir, name); } if (buffer != null) { definePackageForClass(name); @@ -130,7 +127,7 @@ // couldn't find the .class file on sys.path throw new ClassNotFoundException(name); } - + @Override protected URL findResource(String res) { PySystemState sys = Py.getSystemState(); @@ -157,10 +154,7 @@ } continue; } - if (!(entry instanceof PyUnicode)) { - entry = entry.__str__(); - } - String dir = sys.getPath(entry.toString()); + String dir = sys.getPath(Py.fileSystemDecode(entry)); try { File resource = new File(dir, res); if (!resource.exists()) { @@ -179,7 +173,7 @@ throws IOException { List resources = new ArrayList(); - + PySystemState sys = Py.getSystemState(); res = deslashResource(res); @@ -204,10 +198,7 @@ } continue; } - if (!(entry instanceof PyUnicode)) { - entry = entry.__str__(); - } - String dir = sys.getPath(entry.toString()); + String dir = sys.getPath(Py.fileSystemDecode(entry)); try { File resource = new File(dir, res); if (!resource.exists()) { @@ -220,7 +211,7 @@ } return Collections.enumeration(resources); } - + static PyObject replacePathItem(PySystemState sys, int idx, PyList paths) { PyObject path = paths.__getitem__(idx); if (path instanceof SyspathArchive) { @@ -229,9 +220,9 @@ } try { - // this has the side affect of adding the jar to the PackageManager during the + // this has the side effect of adding the jar to the PackageManager during the // initialization of the SyspathArchive - path = new SyspathArchive(sys.getPath(path.toString())); + path = 
new SyspathArchive(sys.getPath(Py.fileSystemDecode(path))); } catch (Exception e) { return path; } diff --git a/src/org/python/core/__builtin__.java b/src/org/python/core/__builtin__.java --- a/src/org/python/core/__builtin__.java +++ b/src/org/python/core/__builtin__.java @@ -85,7 +85,7 @@ case 18: return __builtin__.eval(arg1); case 19: - __builtin__.execfile(arg1.asString()); + __builtin__.execfile(Py.fileSystemDecode(arg1)); return Py.None; case 23: return __builtin__.hex(arg1); @@ -141,7 +141,7 @@ case 18: return __builtin__.eval(arg1, arg2); case 19: - __builtin__.execfile(arg1.asString(), arg2); + __builtin__.execfile(Py.fileSystemDecode(arg1), arg2); return Py.None; case 20: return __builtin__.filter(arg1, arg2); @@ -191,7 +191,7 @@ case 18: return __builtin__.eval(arg1, arg2, arg3); case 19: - __builtin__.execfile(arg1.asString(), arg2, arg3); + __builtin__.execfile(Py.fileSystemDecode(arg1), arg2, arg3); return Py.None; case 21: return __builtin__.getattr(arg1, arg2, arg3); @@ -1629,7 +1629,7 @@ "dont_inherit"}, 3); PyObject source = ap.getPyObject(0); - String filename = ap.getString(1); + String filename = Py.fileSystemDecode(ap.getPyObject(1)); String mode = ap.getString(2); int flags = ap.getInt(3, 0); boolean dont_inherit = ap.getPyObject(4, Py.False).__nonzero__(); diff --git a/src/org/python/core/imp.java b/src/org/python/core/imp.java --- a/src/org/python/core/imp.java +++ b/src/org/python/core/imp.java @@ -294,6 +294,7 @@ return compileSource(name, makeStream(file), sourceFilename, mtime); } + /** Remove the last three characters of a file name and add the compiled suffix "$py.class". 
*/ public static String makeCompiledFilename(String filename) { return filename.substring(0, filename.length() - 3) + "$py.class"; } @@ -418,7 +419,8 @@ } if (moduleLocation != null) { - module.__setattr__("__file__", new PyString(moduleLocation)); + // Standard library expects __file__ to be encoded bytes + module.__setattr__("__file__", Py.fileSystemEncode(moduleLocation)); } else if (module.__findattr__("__file__") == null) { // Should probably never happen (but maybe with an odd custom builtins, or // Java Integration) @@ -543,10 +545,8 @@ return loadFromLoader(loader, moduleName); } } - if (!(p instanceof PyUnicode)) { - p = p.__str__(); - } - ret = loadFromSource(sys, name, moduleName, p.toString()); + // p could be unicode or bytes (in the file system encoding) + ret = loadFromSource(sys, name, moduleName, Py.fileSystemDecode(p)); if (ret != null) { return ret; } @@ -606,7 +606,7 @@ // display names are for identification purposes (e.g. __file__): when entry is // null it forces java.io.File to be a relative path (e.g. foo/bar.py instead of // /tmp/foo/bar.py) - String displayDirName = entry.equals("") ? null : entry.toString(); + String displayDirName = entry.equals("") ? 
null : entry; String displaySourceName = new File(new File(displayDirName, name), sourceName).getPath(); String displayCompiledName = new File(new File(displayDirName, name), compiledName).getPath(); @@ -640,7 +640,7 @@ compiledFile = new File(dirName, compiledName); } else { PyModule m = addModule(modName); - PyObject filename = new PyString(new File(displayDirName, name).getPath()); + PyObject filename = Py.newStringOrUnicode(new File(displayDirName, name).getPath()); m.__dict__.__setitem__("__path__", new PyList(new PyObject[] {filename})); } @@ -928,9 +928,6 @@ } } } - if (name.indexOf(File.separatorChar) != -1) { - throw Py.ImportError("Import by filename is not supported."); - } PyObject modules = Py.getSystemState().modules; PyObject pkgMod = null; String pkgName = null; @@ -974,6 +971,13 @@ return mod; } + /** Defend against attempt to import by filename (withdrawn feature). */ + private static void checkNotFile(String name){ + if (name.indexOf(File.separatorChar) != -1) { + throw Py.ImportError("Import by filename is not supported."); + } + } + private static void ensureFromList(PyObject mod, PyObject fromlist, String name) { ensureFromList(mod, fromlist, name, false); } @@ -1016,6 +1020,7 @@ * @return an imported module (Java or Python) */ public static PyObject importName(String name, boolean top) { + checkNotFile(name); PyUnicode.checkEncoding(name); ReentrantLock importLock = Py.getSystemState().getImportLock(); importLock.lock(); @@ -1036,6 +1041,7 @@ */ public static PyObject importName(String name, boolean top, PyObject modDict, PyObject fromlist, int level) { + checkNotFile(name); PyUnicode.checkEncoding(name); ReentrantLock importLock = Py.getSystemState().getImportLock(); importLock.lock(); diff --git a/src/org/python/core/io/FileIO.java b/src/org/python/core/io/FileIO.java --- a/src/org/python/core/io/FileIO.java +++ b/src/org/python/core/io/FileIO.java @@ -67,7 +67,7 @@ * @see #FileIO(PyString name, String mode) */ public FileIO(String name, 
String mode) { - this(Py.newString(name), mode); + this(Py.newUnicode(name), mode); } /** @@ -82,7 +82,7 @@ */ public FileIO(PyString name, String mode) { parseMode(mode); - File absPath = new RelativeFile(name.toString()); + File absPath = new RelativeFile(Py.fileSystemDecode(name)); try { if ((appending && !(reading || plus)) || (writing && !reading && !plus)) { diff --git a/src/org/python/core/packagecache/PathPackageManager.java b/src/org/python/core/packagecache/PathPackageManager.java --- a/src/org/python/core/packagecache/PathPackageManager.java +++ b/src/org/python/core/packagecache/PathPackageManager.java @@ -40,12 +40,9 @@ + name; for (int i = 0; i < path.__len__(); i++) { + // Each entry in the path may be byte-encoded or unicode PyObject entry = path.pyget(i); - if (!(entry instanceof PyUnicode)) { - entry = entry.__str__(); - } - String dir = entry.toString(); - + String dir = Py.fileSystemDecode(entry); File f = new RelativeFile(dir, child); try { if (f.isDirectory() && imp.caseok(f, name)) { @@ -103,11 +100,8 @@ String child = jpkg.__name__.replace('.', File.separatorChar); for (int i = 0; i < path.__len__(); i++) { - PyObject entry = path.pyget(i); - if (!(entry instanceof PyUnicode)) { - entry = entry.__str__(); - } - String dir = entry.toString(); + // Each entry in the path may be byte-encoded or unicode + String dir = Py.fileSystemDecode(path.pyget(i)); if (dir.length() == 0) { dir = null; diff --git a/src/org/python/modules/_imp.java b/src/org/python/modules/_imp.java --- a/src/org/python/modules/_imp.java +++ b/src/org/python/modules/_imp.java @@ -68,14 +68,14 @@ * This needs to be consolidated with the code in (@see org.python.core.imp). 
* * @param name module name - * @param entry a path String + * @param entry a path String (Unicode file or directory name) * @param findingPackage if looking for a package only try to locate __init__ * @return null if no module found otherwise module information */ static ModuleInfo findFromSource(String name, String entry, boolean findingPackage, boolean preferSource) { String sourceName = "__init__.py"; - String compiledName = makeCompiledFilename(sourceName); + String compiledName = imp.makeCompiledFilename(sourceName); String directoryName = PySystemState.getPathLazy(entry); // displayDirName is for identification purposes: when null it // forces java.io.File to be a relative path (e.g. foo/bar.py @@ -97,7 +97,7 @@ } else { Py.writeDebug("import", "trying source " + dir.getPath()); sourceName = name + ".py"; - compiledName = makeCompiledFilename(sourceName); + compiledName = imp.makeCompiledFilename(sourceName); sourceFile = new File(directoryName, sourceName); compiledFile = new File(directoryName, compiledName); } @@ -152,8 +152,7 @@ throw Py.TypeError("must be a file-like object"); } PySystemState sys = Py.getSystemState(); - String compiledFilename = - makeCompiledFilename(sys.getPath(filename)); + String compiledFilename = imp.makeCompiledFilename(sys.getPath(filename)); mod = imp.createFromSource(modname.intern(), (InputStream)o, filename, compiledFilename); PyObject modules = sys.modules; @@ -161,15 +160,38 @@ return mod; } - public static PyObject load_compiled(String name, String pathname) { - return load_compiled(name, pathname, new PyFile(pathname, "rb", -1)); - } - public static PyObject reload(PyObject module) { return __builtin__.reload(module); } - public static PyObject load_compiled(String name, String pathname, PyObject file) { + /** + * Return a module with the given name, the result of executing the compiled code + * at the given pathname. 
If this path is a PyUnicode, it is used + * exactly; if it is a PyString it is taken to be file-system encoded. + * + * @param name the module name + * @param pathname to the compiled module (becomes __file__) + * @return the module called name + */ + public static PyObject load_compiled(String name, PyString pathname) { + String _pathname = Py.fileSystemDecode(pathname); + return _load_compiled(name, _pathname, new PyFile(_pathname, "rb", -1)); + } + + /** + * Return a module with the given name, the result of executing the compiled code + * in the given file stream. + * + * @param name the module name + * @param pathname a file path that is not null (becomes __file__) + * @param file stream from which the compiled code is taken + * @return the module called name + */ + public static PyObject load_compiled(String name, PyString pathname, PyObject file) { + return _load_compiled(name, Py.fileSystemDecode(pathname), file); + } + + private static PyObject _load_compiled(String name, String pathname, PyObject file) { InputStream stream = (InputStream) file.__tojava__(InputStream.class); if (stream == Py.NoConversion) { throw Py.TypeError("must be a file-like object"); @@ -190,8 +212,10 @@ public static PyObject find_module(String name, PyObject path) { if (path == Py.None && PySystemState.getBuiltin(name) != null) { - return new PyTuple(Py.None, Py.newString(name), - new PyTuple(Py.EmptyString, Py.EmptyString, + return new PyTuple(Py.None, + Py.newString(name), + new PyTuple(Py.EmptyString, + Py.EmptyString, Py.newInteger(C_BUILTIN))); } @@ -199,14 +223,14 @@ path = Py.getSystemState().path; } for (PyObject p : path.asIterable()) { - ModuleInfo mi = findFromSource(name, p.toString(), false, true); + ModuleInfo mi = findFromSource(name, Py.fileSystemDecode(p), false, true); if(mi == null) { continue; } return new PyTuple(mi.file, - new PyString(mi.filename), - new PyTuple(new PyString(mi.suffix), - new PyString(mi.mode), + Py.newStringOrUnicode(mi.filename), + new 
PyTuple(Py.newString(mi.suffix), + Py.newString(mi.mode), Py.newInteger(mi.type))); } throw Py.ImportError("No module named " + name); @@ -216,7 +240,8 @@ PyObject mod = Py.None; PySystemState sys = Py.getSystemState(); int type = data.__getitem__(2).asInt(); - while(mod == Py.None) { + String filenameString = Py.fileSystemDecode(filename); + while (mod == Py.None) { String compiledName; switch (type) { case PY_SOURCE: @@ -226,8 +251,8 @@ } // XXX: This should load the accompanying byte code file instead, if it exists - String resolvedFilename = sys.getPath(filename.toString()); - compiledName = makeCompiledFilename(resolvedFilename); + String resolvedFilename = sys.getPath(filenameString); + compiledName = imp.makeCompiledFilename(resolvedFilename); if (name.endsWith(".__init__")) { name = name.substring(0, name.length() - ".__init__".length()); } else if (name.equals("__init__")) { @@ -241,19 +266,20 @@ } mod = imp.createFromSource(name.intern(), (InputStream)o, - filename.toString(), compiledName, mtime); + filenameString, compiledName, mtime); break; case PY_COMPILED: - mod = load_compiled(name, filename.toString(), file); + mod = _load_compiled(name, filenameString, file); break; case PKG_DIRECTORY: PyModule m = imp.addModule(name); m.__dict__.__setitem__("__path__", new PyList(new PyObject[] {filename})); m.__dict__.__setitem__("__file__", filename); - ModuleInfo mi = findFromSource(name, filename.toString(), true, true); + ModuleInfo mi = findFromSource(name, filenameString, true, true); type = mi.type; file = mi.file; - filename = new PyString(mi.filename); + filenameString = mi.filename; + filename = Py.newStringOrUnicode(filenameString); break; default: throw Py.ImportError("No module named " + name); @@ -264,8 +290,13 @@ return mod; } - public static String makeCompiledFilename(String filename) { - return imp.makeCompiledFilename(filename); + /** + * Variant of {@link imp#makeCompiledFilename(String)} dealing with encoded bytes. 
In the context + * where this is used from Python, a result in encoded bytes is preferable. + */ + public static PyString makeCompiledFilename(PyString filename) { + filename = Py.fileSystemEncode(filename); + return Py.newString(imp.makeCompiledFilename(filename.getString())); } public static PyObject get_magic() { diff --git a/src/org/python/modules/_py_compile.java b/src/org/python/modules/_py_compile.java --- a/src/org/python/modules/_py_compile.java +++ b/src/org/python/modules/_py_compile.java @@ -12,22 +12,30 @@ public class _py_compile { public static PyList __all__ = new PyList(new PyString[] { new PyString("compile") }); - public static boolean compile(String filename, String cfile, String dfile) { - // Resolve relative path names. dfile is only used for error messages and should not be - // resolved + /** + * Java wrapper on the module compiler in support of of py_compile.compile. Filenames here will + * be interpreted as Unicode if they are PyUnicode, and as byte-encoded names if they only + * PyString. 
+ * + * @param fileName actual source file name + * @param compiledName compiled filename + * @param displayName displayed source filename, only used for error messages (and not resolved) + * @return true if successful + */ + public static boolean compile(PyString fileName, PyString compiledName, PyString displayName) { + // Resolve source path and check it exists PySystemState sys = Py.getSystemState(); - filename = sys.getPath(filename); - cfile = sys.getPath(cfile); + String file = sys.getPath(Py.fileSystemDecode(fileName)); + File f = new File(file); + if (!f.exists()) { + throw Py.IOError(Errno.ENOENT, file); + } - File file = new File(filename); - if (!file.exists()) { - throw Py.IOError(Errno.ENOENT, Py.newString(filename)); - } - String name = getModuleName(file); - - byte[] bytes = org.python.core.imp.compileSource(name, file, dfile, cfile); - org.python.core.imp.cacheCompiledSource(filename, cfile, bytes); - + // Convert file in which to put the byte code and display name (each may be null) + String c = (compiledName == null) ? null : sys.getPath(Py.fileSystemDecode(compiledName)); + String d = (displayName == null) ? 
null : Py.fileSystemDecode(displayName); + byte[] bytes = org.python.core.imp.compileSource(getModuleName(f), f, d, c); + org.python.core.imp.cacheCompiledSource(file, c, bytes); return bytes.length > 0; } diff --git a/src/org/python/modules/posix/PosixModule.java b/src/org/python/modules/posix/PosixModule.java --- a/src/org/python/modules/posix/PosixModule.java +++ b/src/org/python/modules/posix/PosixModule.java @@ -486,7 +486,8 @@ "getcwd() -> path\n\n" + "Return a string representing the current working directory."); public static PyObject getcwd() { - return Py.newStringOrUnicode(Py.getSystemState().getCurrentWorkingDir()); + // The return value is bytes in the file system encoding + return Py.fileSystemEncode(Py.getSystemState().getCurrentWorkingDir()); } public static PyString __doc__getcwdu = new PyString( @@ -1343,25 +1344,24 @@ return environ; } for (Map.Entry entry : env.entrySet()) { + // The shell restricts names to a subset of ASCII and values are encoded byte strings. environ.__setitem__( - Py.newStringOrUnicode(entry.getKey()), - Py.newStringOrUnicode(entry.getValue())); + Py.newString(entry.getKey()), + Py.fileSystemEncode(entry.getValue())); } return environ; } /** - * Return a path as a String from a PyObject + * Return a path as a String from a PyObject, which must be str or + * unicode. If the path is a str (that is, bytes), it is + * interpreted into Unicode using the file system encoding. 
* * @param path a PyObject, raising a TypeError if an invalid path type * @return a String path */ private static String asPath(PyObject path) { - if (path instanceof PyString) { - return path.toString(); - } - throw Py.TypeError(String.format("coercing to Unicode: need string, %s type found", - path.getType().fastGetName())); + return Py.fileSystemDecode(path); } /** diff --git a/src/org/python/modules/zipimport/zipimporter.java b/src/org/python/modules/zipimport/zipimporter.java --- a/src/org/python/modules/zipimport/zipimporter.java +++ b/src/org/python/modules/zipimport/zipimporter.java @@ -20,6 +20,7 @@ import org.python.core.PySystemState; import org.python.core.PyTuple; import org.python.core.PyType; +import org.python.core.PyUnicode; import org.python.core.Traverseproc; import org.python.core.Visitproc; import org.python.core.util.FileUtil; @@ -80,7 +81,7 @@ @ExposedMethod final void zipimporter___init__(PyObject[] args, String[] kwds) { ArgParser ap = new ArgParser("__init__", args, kwds, new String[] {"path"}); - String path = ap.getString(0); + String path = Py.fileSystemDecode(ap.getPyObject(0)); zipimporter___init__(path); } @@ -113,10 +114,11 @@ pathFile = parentFile; } if (archive != null) { - files = zipimport._zip_directory_cache.__finditem__(archive); + PyUnicode archivePath = Py.newUnicode(archive); + files = zipimport._zip_directory_cache.__finditem__(archivePath); if (files == null) { files = readDirectory(archive); - zipimport._zip_directory_cache.__setitem__(archive, files); + zipimport._zip_directory_cache.__setitem__(archivePath, files); } } else { throw zipimport.ZipImportError("not a Zip file: " + path); diff --git a/src/org/python/util/jython.java b/src/org/python/util/jython.java --- a/src/org/python/util/jython.java +++ b/src/org/python/util/jython.java @@ -341,8 +341,8 @@ } else { try { interp.globals.__setitem__(new PyString("__file__"), - new PyString(opts.filename)); - + // Note that __file__ is widely expected to be encoded bytes 
+ Py.fileSystemEncode(opts.filename)); FileInputStream file; try { file = new FileInputStream(new RelativeFile(opts.filename)); diff --git a/src/shell/jython.exe b/src/shell/jython.exe index 7c9cbe9eec239c5768c17f873726220b09966341..b7500204c603274a6bdb9ec15064bd27f31c14ac GIT binary patch [stripped] diff --git a/src/shell/jython.py b/src/shell/jython.py --- a/src/shell/jython.py +++ b/src/shell/jython.py @@ -20,19 +20,68 @@ is_windows = os.name == "nt" or (os.name == "java" and os._name == "nt") +# A note about encoding: +# +# A major motivation for this program is to launch Jython on Windows, where +# console and file encoding may be different. Command-line arguments and +# environment variables are presented in Python 2.7 as byte-data, encoded +# "somehow". It becomes important to know which decoding to use as soon as +# paths may contain non-ascii characters. It is not the console encoding. +# Experiment shows that sys.getfilesystemencoding() is generally applicable +# to arguments, environment variables and spawning a subprocess. +# +# On a Windows 10 box, this comes up with pseudo-codec 'mbcs'. This supports +# European accented characters pretty well. +# +# When localised to Chinese(simplified) the FS encoding mbcs includes many +# more points than cp936 (the console encoding), although it still struggles +# with European accented characters. 
+ +ENCODING = sys.getfilesystemencoding() or "utf-8" + + +def get_env(envvar, default=None): + """ Return the named environment variable, decoded to Unicode.""" + v = os.environ.get(envvar, default) + # Tolerate default given as bytes, as we're bound to forget sometimes + if isinstance(v, bytes): + v = v.decode(ENCODING) + # Remove quotes sometimes necessary around the value + if v is not None and v.startswith('"') and v.endswith('"'): + v = v[1:-1] + return v + +def encode_list(args, encoding=ENCODING): + """ Convert list of Unicode strings to list of encoded byte strings.""" + r = [] + for a in args: + if not isinstance(a, bytes): a = a.encode(encoding) + r.append(a) + return r + +def decode_list(args, encoding=ENCODING): + """ Convert list of byte strings to list of Unicode strings.""" + r = [] + for a in args: + if not isinstance(a, unicode): a = a.decode(encoding) + r.append(a) + return r def parse_launcher_args(args): + """ Process the given argument list into two objects, the first part being + a namespace of checked arguments to the interpreter itself, and the rest + being the Python program it will run and its arguments. 
+ """ class Namespace(object): pass parsed = Namespace() - parsed.java = [] - parsed.properties = OrderedDict() - parsed.boot = False - parsed.jdb = False - parsed.help = False - parsed.print_requested = False - parsed.profile = False - parsed.jdb = None + parsed.boot = False # --boot flag given + parsed.jdb = False # --jdb flag given + parsed.help = False # --help or -h flag given + parsed.print_requested = False # --print flag given + parsed.profile = False # --profile flag given + parsed.properties = OrderedDict() # properties to give the JVM + parsed.java = [] # any other arguments to give the JVM it = iter(args) next(it) # ignore sys.argv[0] @@ -42,11 +91,11 @@ arg = next(it) except StopIteration: break - if arg.startswith("-D"): - k, v = arg[2:].split("=") + if arg.startswith(u"-D"): + k, v = arg[2:].split(u"=") parsed.properties[k] = v i += 1 - elif arg in ("-J-classpath", "-J-cp"): + elif arg in (u"-J-classpath", u"-J-cp"): try: next_arg = next(it) except StopIteration: @@ -55,24 +104,24 @@ bad_option("Bad option for -J-classpath") parsed.classpath = next_arg i += 2 - elif arg.startswith("-J-Xmx"): + elif arg.startswith(u"-J-Xmx"): parsed.mem = arg[2:] i += 1 - elif arg.startswith("-J-Xss"): + elif arg.startswith(u"-J-Xss"): parsed.stack = arg[2:] i += 1 - elif arg.startswith("-J"): + elif arg.startswith(u"-J"): parsed.java.append(arg[2:]) i += 1 - elif arg == "--print": + elif arg == u"--print": parsed.print_requested = True i += 1 - elif arg in ("-h", "--help"): + elif arg in (u"-h", u"--help"): parsed.help = True - elif arg in ("--boot", "--jdb", "--profile"): + elif arg in (u"--boot", u"--jdb", u"--profile"): setattr(parsed, arg[2:], True) i += 1 - elif arg == "--": + elif arg == u"--": i += 1 break else: @@ -92,13 +141,13 @@ if hasattr(self, "_uname"): return self._uname if is_windows: - self._uname = "windows" + self._uname = u"windows" else: uname = subprocess.check_output(["uname"]).strip().lower() if uname.startswith("cygwin"): - self._uname = 
"cygwin" + self._uname = u"cygwin" else: - self._uname = uname + self._uname = uname.decode(ENCODING) return self._uname @property @@ -114,22 +163,23 @@ return self._java_command def setup_java_command(self): + """ Sets java_home and java_command according to environment and parsed + launcher arguments --jdb and --help. + """ if self.args.help: self._java_home = None - self._java_command = "java" + self._java_command = u"java" return - - if "JAVA_HOME" not in os.environ: - self._java_home = None - self._java_command = "jdb" if self.args.jdb else "java" + + command = u"jdb" if self.args.jdb else u"java" + + self._java_home = get_env("JAVA_HOME") + if self._java_home is None or self.uname == u"cygwin": + # Assume java or jdb on the path + self._java_command = command else: - self._java_home = os.environ["JAVA_HOME"] - if self.uname == "cygwin": - self._java_command = "jdb" if self.args.jdb else "java" - else: - self._java_command = os.path.join( - self.java_home, "bin", - "jdb" if self.args.jdb else "java") + # Assume java or jdb in JAVA_HOME/bin + self._java_command = os.path.join(self._java_home, u"bin", command) @property def executable(self): @@ -139,28 +189,37 @@ # Modified from # http://stackoverflow.com/questions/3718657/how-to-properly-determine-current-script-directory-in-python/22881871#22881871 if getattr(sys, "frozen", False): # py2exe, PyInstaller, cx_Freeze - path = os.path.abspath(sys.executable) + # Frozen. Let it go with the executable path. + bytes_path = sys.executable else: - def inspect_this(): pass - path = inspect.getabsfile(inspect_this) - self._executable = os.path.realpath(path) + # Not frozen. Any object defined in this file will do. + bytes_path = inspect.getfile(JythonCommand) + # Python 2 thinks in bytes. Carefully normalise in Unicode. + path = os.path.realpath(bytes_path.decode(ENCODING)) + try: + # If possible, make this relative to the CWD. + # This helps manage multi-byte names in installation location. 
+ path = os.path.relpath(path, os.getcwdu()) + except ValueError: + # Many reasons why this might be impossible: use an absolute path. + path = os.path.abspath(path) + self._executable = path return self._executable @property def jython_home(self): if hasattr(self, "_jython_home"): return self._jython_home - if "JYTHON_HOME" in os.environ: - self._jython_home = os.environ["JYTHON_HOME"] - else: - self._jython_home = os.path.dirname(os.path.dirname(self.executable)) - if self.uname == "cygwin": - self._jython_home = subprocess.check_output(["cygpath", "--windows", self._jython_home]).strip() + self._jython_home = get_env("JYTHON_HOME") or os.path.dirname( + os.path.dirname(self.executable)) + if self.uname == u"cygwin": + # Even on Cygwin, we need a Windows-style path for this + home = unicode_subprocess(["cygpath", "--windows", home]) return self._jython_home @property def jython_opts(): - return os.environ.get("JYTHON_OPTS", "") + return get_env("JYTHON_OPTS", "") @property def classpath_delimiter(self): @@ -179,11 +238,9 @@ else: jars.append(os.path.join(self.jython_home, "javalib", "*")) elif not os.path.exists(os.path.join(self.jython_home, "jython.jar")): - bad_option("""{jython_home} contains neither jython-dev.jar nor jython.jar. + bad_option(u"""{} contains neither jython-dev.jar nor jython.jar. 
Try running this script from the 'bin' directory of an installed Jython or -setting {envvar_specifier}JYTHON_HOME.""".format( - jython_home=self.jython_home, - envvar_specifier="%" if self.uname == "windows" else "$")) +setting JYTHON_HOME.""".format(self.jython_home)) else: jars = [os.path.join(self.jython_home, "jython.jar")] self._jython_jars = jars @@ -194,14 +251,14 @@ if hasattr(self.args, "classpath"): return self.args.classpath else: - return os.environ.get("CLASSPATH", ".") + return get_env("CLASSPATH", ".") @property def java_mem(self): if hasattr(self.args, "mem"): return self.args.mem else: - return os.environ.get("JAVA_MEM", "-Xmx512m") + return get_env("JAVA_MEM", "-Xmx512m") @property def java_stack(self): @@ -213,7 +270,7 @@ @property def java_opts(self): return [self.java_mem, self.java_stack] - + @property def java_profile_agent(self): return os.path.join(self.jython_home, "javalib", "profile.jar") @@ -222,68 +279,84 @@ if "JAVA_ENCODING" not in os.environ and self.uname == "darwin" and "file.encoding" not in self.args.properties: self.args.properties["file.encoding"] = "UTF-8" - def convert(self, arg): - if sys.stdout.encoding: - return arg.encode(sys.stdout.encoding) - else: - return arg - def make_classpath(self, jars): return self.classpath_delimiter.join(jars) def convert_path(self, arg): - if self.uname == "cygwin": - if not arg.startswith("/cygdrive/"): - new_path = self.convert(arg).replace("/", "\\") + if self.uname == u"cygwin": + if not arg.startswith(u"/cygdrive/"): + return arg.replace(u"/", u"\\") else: - new_path = subprocess.check_output(["cygpath", "-pw", self.convert(arg)]).strip() - return new_path + arg = arg.replace('*', r'\*') # prevent globbing + return unicode_subprocess(["cygpath", "-pw", arg]) else: - return self.convert(arg) + return arg + + def unicode_subprocess(self, unicode_command): + """ Launch a command with subprocess.check_output() and read the + output, except everything is expected to be in Unicode. 
+ """ + cmd = [] + for c in unicode_command: + if isinstance(c, bytes): + cmd.append(c) + else: + cmd.append(c.encode(ENCODING)) + return subprocess.check_output(cmd).strip().decode(ENCODING) @property def command(self): + # Set default file encoding for just for Darwin (?) self.set_encoding() + + # Begin to build the Java part of the ultimate command args = [self.java_command] args.extend(self.java_opts) args.extend(self.args.java) + # Get the class path right (depends on --boot) classpath = self.java_classpath jython_jars = self.jython_jars if self.args.boot: - args.append("-Xbootclasspath/a:%s" % self.convert_path(self.make_classpath(jython_jars))) + args.append(u"-Xbootclasspath/a:%s" % self.convert_path(self.make_classpath(jython_jars))) else: classpath = self.make_classpath(jython_jars) + self.classpath_delimiter + classpath - args.extend(["-classpath", self.convert_path(classpath)]) + args.extend([u"-classpath", self.convert_path(classpath)]) if "python.home" not in self.args.properties: - args.append("-Dpython.home=%s" % self.convert_path(self.jython_home)) + args.append(u"-Dpython.home=%s" % self.convert_path(self.jython_home)) if "python.executable" not in self.args.properties: - args.append("-Dpython.executable=%s" % self.convert_path(self.executable)) + args.append(u"-Dpython.executable=%s" % self.convert_path(self.executable)) if "python.launcher.uname" not in self.args.properties: - args.append("-Dpython.launcher.uname=%s" % self.uname) - # Determines whether running on a tty for the benefit of + args.append(u"-Dpython.launcher.uname=%s" % self.uname) + + # Determine whether running on a tty for the benefit of # running on Cygwin. This step is needed because the Mintty # terminal emulator doesn't behave like a standard Microsoft # Windows tty, and so JNR Posix doesn't detect it properly. 
if "python.launcher.tty" not in self.args.properties: - args.append("-Dpython.launcher.tty=%s" % str(os.isatty(sys.stdin.fileno())).lower()) - if self.uname == "cygwin" and "python.console" not in self.args.properties: - args.append("-Dpython.console=org.python.core.PlainConsole") + args.append(u"-Dpython.launcher.tty=%s" % str(os.isatty(sys.stdin.fileno())).lower()) + if self.uname == u"cygwin" and "python.console" not in self.args.properties: + args.append(u"-Dpython.console=org.python.core.PlainConsole") + if self.args.profile: - args.append("-XX:-UseSplitVerifier") - args.append("-javaagent:%s" % self.convert_path(self.java_profile_agent)) + args.append(u"-XX:-UseSplitVerifier") + args.append(u"-javaagent:%s" % self.convert_path(self.java_profile_agent)) + for k, v in self.args.properties.iteritems(): - args.append("-D%s=%s" % (self.convert(k), self.convert(v))) - args.append("org.python.util.jython") + args.append(u"-D%s=%s" % (k, v)) + + args.append(u"org.python.util.jython") + if self.args.help: - args.append("--help") + args.append(u"--help") + args.extend(self.jython_args) return args def bad_option(msg): - print >> sys.stderr, """ + print >> sys.stderr, u""" {msg} usage: jython [option] ... [-c cmd | -m mod | file | -] [arg] ... Try `jython -h' for more information. @@ -312,19 +385,24 @@ """ def support_java_opts(args): + """ Generator from options intended for the JVM. Options beginning -D go + through unchanged, others are prefixed with -J. + """ + # Input is expected to be Unicode, but just in case ... 
+ if isinstance(args, bytes): args = args.decode(ENCODING) it = iter(args) while it: arg = next(it) - if arg.startswith("-D"): + if arg.startswith(u"-D"): yield arg - elif arg in ("-classpath", "-cp"): - yield "-J" + arg + elif arg in (u"-classpath", u"-cp"): + yield u"-J" + arg try: yield next(it) except StopIteration: bad_option("Argument expected for -classpath option in JAVA_OPTS") else: - yield "-J" + arg + yield u"-J" + arg # copied from subprocess module in Jython; see @@ -378,37 +456,36 @@ return argv - -def decode_args(sys_args): - args = [sys_args[0]] - - def get_env_opts(envvar): - opts = os.environ.get(envvar, "") - if is_windows: - return cmdline2list(opts) - else: - return shlex.split(opts) - - java_opts = get_env_opts("JAVA_OPTS") - jython_opts = get_env_opts("JYTHON_OPTS") - - args.extend(support_java_opts(java_opts)) - args.extend(sys_args[1:]) - - if sys.stdout.encoding: - if sys.stdout.encoding.lower() == "cp65001": - sys.exit("""Jython does not support code page 65001 (CP_UTF8). -Please try another code page by setting it with the chcp command.""") - args = [arg.decode(sys.stdout.encoding) for arg in args] - jython_opts = [arg.decode(sys.stdout.encoding) for arg in jython_opts] - - return args, jython_opts - +def get_env_opts(envvar): + """ Return a list of the values in the named environment variable, + split according to shell conventions, and decoded to Unicode. + """ + opts = os.environ.get(envvar, "") # bytes at this point + if is_windows: + opts = cmdline2list(opts) + else: + opts = shlex.split(opts) + return decode_list(opts) def main(sys_args): - sys_args, jython_opts = decode_args(sys_args) + # The entire program must work in Unicode + sys_args = decode_list(sys_args) + + # sys_args[0] is this script (which we'll replace with 'java' eventually). + # Insert options for the java command from the environment. 
+ sys_args[1:1] = support_java_opts(get_env_opts("JAVA_OPTS")) + + # Parse the composite arguments (yes, even the ones from JAVA_OPTS), + # and return the "unparsed" tail considered arguments for Jython itself. args, jython_args = parse_launcher_args(sys_args) + + # Build the data from which we can generate the command ultimately. + # Jython options supplied from the environment stand in front of the + # unparsed tail from the command line. + jython_opts = get_env_opts("JYTHON_OPTS") jython_command = JythonCommand(args, jython_opts + jython_args) + + # This is the "fully adjusted" command to launch, but still as Unicode. command = jython_command.command if args.profile and not args.help: @@ -416,23 +493,32 @@ os.unlink("profile.txt") except OSError: pass + if args.print_requested and not args.help: - if jython_command.uname == "windows": - print subprocess.list2cmdline(jython_command.command) + if jython_command.uname == u"windows": + # Add escapes and quotes necessary to Windows. + # Normally used for a byte strings but Python is tolerant :) + command_line = subprocess.list2cmdline(command) else: - print " ".join(pipes.quote(arg) for arg in jython_command.command) + # Just concatenate with spaces + command_line = u" ".join(command) + # It is possible the Unicode cannot be encoded for the console + enc = sys.stdout.encoding or 'ascii' + sys.stdout.write(command_line.encode(enc, 'replace')) else: - if not (is_windows or not hasattr(os, "execvp") or args.help or jython_command.uname == "cygwin"): + if not (is_windows or not hasattr(os, "execvp") or args.help or + jython_command.uname == u"cygwin"): # Replace this process with the java process. # # NB such replacements actually do not work under Windows, # but if tried, they also fail very badly by hanging. # So don't even try! 
+ command = encode_list(command) os.execvp(command[0], command[1:]) else: result = 1 try: - result = subprocess.call(command) + result = subprocess.call(encode_list(command)) if args.help: print_help() except KeyboardInterrupt: -- Repository URL: https://hg.python.org/jython From jython-checkins at python.org Sun May 21 05:06:50 2017 From: jython-checkins at python.org (jeff.allen) Date: Sun, 21 May 2017 09:06:50 +0000 Subject: [Jython-checkins] =?utf-8?q?jython=3A_Rework_launcher_jython=2Ep?= =?utf-8?q?y_to_allow_for_non-ascii_paths_on_Windows=2E?= Message-ID: <20170521090143.40106.B76D4E5A4C7B609D@psf.io> https://hg.python.org/jython/rev/977e34a69fda changeset: 8083:977e34a69fda user: Jeff Allen date: Sun Apr 16 23:31:23 2017 +0100 summary: Rework launcher jython.py to allow for non-ascii paths on Windows. The launcher now works internally in Unicode. jython.exe has been regenerated from it using PyInstaller 3.2.1 in a virtualenv under Python 2.7.13. test_jython_launcher passes for a user "Épreuve" on Windows and Cygwin as long as -S (don't import site) is given. Issue #2356 refers. 
files: Lib/test/test_jython_launcher.py | 8 +- src/shell/jython.exe | Bin src/shell/jython.py | 314 ++++++++++++------ 3 files changed, 203 insertions(+), 119 deletions(-) diff --git a/Lib/test/test_jython_launcher.py b/Lib/test/test_jython_launcher.py --- a/Lib/test/test_jython_launcher.py +++ b/Lib/test/test_jython_launcher.py @@ -31,7 +31,6 @@ # by the installer return executable - def get_uname(): _uname = None try: @@ -49,9 +48,8 @@ class TestLauncher(unittest.TestCase): - + def get_cmdline(self, cmd, env): - output = subprocess.check_output(cmd, env=env).rstrip() if is_windows: return subprocess._cmdline2list(output) @@ -76,7 +74,7 @@ k, v = arg[2:].split("=") props[k] = v return props - + def test_classpath_env(self): env = self.get_newenv() env["CLASSPATH"] = some_jar @@ -207,7 +205,7 @@ def test_file(self): self.assertCommand(['test.py']) - + def test_dash(self): self.assertCommand(['-i']) diff --git a/src/shell/jython.exe b/src/shell/jython.exe index 7c9cbe9eec239c5768c17f873726220b09966341..b7500204c603274a6bdb9ec15064bd27f31c14ac GIT binary patch [stripped] diff --git a/src/shell/jython.py b/src/shell/jython.py --- a/src/shell/jython.py +++ b/src/shell/jython.py @@ -20,19 +20,68 @@ is_windows = os.name == "nt" or (os.name == "java" and os._name == "nt") +# A note about encoding: +# +# A major motivation for this program is to launch Jython on Windows, where +# console and file encoding may be different. Command-line arguments and +# environment variables are presented in Python 2.7 as byte-data, encoded +# "somehow". It becomes important to know which decoding to use as soon as +# paths may contain non-ascii characters. It is not the console encoding. +# Experiment shows that sys.getfilesystemencoding() is generally applicable +# to arguments, environment variables and spawning a subprocess. +# +# On a Windows 10 box, this comes up with pseudo-codec 'mbcs'. This supports +# European accented characters pretty well. 
+# +# When localised to Chinese(simplified) the FS encoding mbcs includes many +# more points than cp936 (the console encoding), although it still struggles +# with European accented characters. + +ENCODING = sys.getfilesystemencoding() or "utf-8" + + +def get_env(envvar, default=None): + """ Return the named environment variable, decoded to Unicode.""" + v = os.environ.get(envvar, default) + # Tolerate default given as bytes, as we're bound to forget sometimes + if isinstance(v, bytes): + v = v.decode(ENCODING) + # Remove quotes sometimes necessary around the value + if v is not None and v.startswith('"') and v.endswith('"'): + v = v[1:-1] + return v + +def encode_list(args, encoding=ENCODING): + """ Convert list of Unicode strings to list of encoded byte strings.""" + r = [] + for a in args: + if not isinstance(a, bytes): a = a.encode(encoding) + r.append(a) + return r + +def decode_list(args, encoding=ENCODING): + """ Convert list of byte strings to list of Unicode strings.""" + r = [] + for a in args: + if not isinstance(a, unicode): a = a.decode(encoding) + r.append(a) + return r def parse_launcher_args(args): + """ Process the given argument list into two objects, the first part being + a namespace of checked arguments to the interpreter itself, and the rest + being the Python program it will run and its arguments. 
+ """ class Namespace(object): pass parsed = Namespace() - parsed.java = [] - parsed.properties = OrderedDict() - parsed.boot = False - parsed.jdb = False - parsed.help = False - parsed.print_requested = False - parsed.profile = False - parsed.jdb = None + parsed.boot = False # --boot flag given + parsed.jdb = False # --jdb flag given + parsed.help = False # --help or -h flag given + parsed.print_requested = False # --print flag given + parsed.profile = False # --profile flag given + parsed.properties = OrderedDict() # properties to give the JVM + parsed.java = [] # any other arguments to give the JVM it = iter(args) next(it) # ignore sys.argv[0] @@ -42,11 +91,11 @@ arg = next(it) except StopIteration: break - if arg.startswith("-D"): - k, v = arg[2:].split("=") + if arg.startswith(u"-D"): + k, v = arg[2:].split(u"=") parsed.properties[k] = v i += 1 - elif arg in ("-J-classpath", "-J-cp"): + elif arg in (u"-J-classpath", u"-J-cp"): try: next_arg = next(it) except StopIteration: @@ -55,24 +104,24 @@ bad_option("Bad option for -J-classpath") parsed.classpath = next_arg i += 2 - elif arg.startswith("-J-Xmx"): + elif arg.startswith(u"-J-Xmx"): parsed.mem = arg[2:] i += 1 - elif arg.startswith("-J-Xss"): + elif arg.startswith(u"-J-Xss"): parsed.stack = arg[2:] i += 1 - elif arg.startswith("-J"): + elif arg.startswith(u"-J"): parsed.java.append(arg[2:]) i += 1 - elif arg == "--print": + elif arg == u"--print": parsed.print_requested = True i += 1 - elif arg in ("-h", "--help"): + elif arg in (u"-h", u"--help"): parsed.help = True - elif arg in ("--boot", "--jdb", "--profile"): + elif arg in (u"--boot", u"--jdb", u"--profile"): setattr(parsed, arg[2:], True) i += 1 - elif arg == "--": + elif arg == u"--": i += 1 break else: @@ -92,13 +141,13 @@ if hasattr(self, "_uname"): return self._uname if is_windows: - self._uname = "windows" + self._uname = u"windows" else: uname = subprocess.check_output(["uname"]).strip().lower() if uname.startswith("cygwin"): - self._uname = 
"cygwin" + self._uname = u"cygwin" else: - self._uname = uname + self._uname = uname.decode(ENCODING) return self._uname @property @@ -114,22 +163,23 @@ return self._java_command def setup_java_command(self): + """ Sets java_home and java_command according to environment and parsed + launcher arguments --jdb and --help. + """ if self.args.help: self._java_home = None - self._java_command = "java" + self._java_command = u"java" return - - if "JAVA_HOME" not in os.environ: - self._java_home = None - self._java_command = "jdb" if self.args.jdb else "java" + + command = u"jdb" if self.args.jdb else u"java" + + self._java_home = get_env("JAVA_HOME") + if self._java_home is None or self.uname == u"cygwin": + # Assume java or jdb on the path + self._java_command = command else: - self._java_home = os.environ["JAVA_HOME"] - if self.uname == "cygwin": - self._java_command = "jdb" if self.args.jdb else "java" - else: - self._java_command = os.path.join( - self.java_home, "bin", - "jdb" if self.args.jdb else "java") + # Assume java or jdb in JAVA_HOME/bin + self._java_command = os.path.join(self._java_home, u"bin", command) @property def executable(self): @@ -139,28 +189,37 @@ # Modified from # http://stackoverflow.com/questions/3718657/how-to-properly-determine-current-script-directory-in-python/22881871#22881871 if getattr(sys, "frozen", False): # py2exe, PyInstaller, cx_Freeze - path = os.path.abspath(sys.executable) + # Frozen. Let it go with the executable path. + bytes_path = sys.executable else: - def inspect_this(): pass - path = inspect.getabsfile(inspect_this) - self._executable = os.path.realpath(path) + # Not frozen. Any object defined in this file will do. + bytes_path = inspect.getfile(JythonCommand) + # Python 2 thinks in bytes. Carefully normalise in Unicode. + path = os.path.realpath(bytes_path.decode(ENCODING)) + try: + # If possible, make this relative to the CWD. + # This helps manage multi-byte names in installation location. 
+ path = os.path.relpath(path, os.getcwdu()) + except ValueError: + # Many reasons why this might be impossible: use an absolute path. + path = os.path.abspath(path) + self._executable = path return self._executable @property def jython_home(self): if hasattr(self, "_jython_home"): return self._jython_home - if "JYTHON_HOME" in os.environ: - self._jython_home = os.environ["JYTHON_HOME"] - else: - self._jython_home = os.path.dirname(os.path.dirname(self.executable)) - if self.uname == "cygwin": - self._jython_home = subprocess.check_output(["cygpath", "--windows", self._jython_home]).strip() + self._jython_home = get_env("JYTHON_HOME") or os.path.dirname( + os.path.dirname(self.executable)) + if self.uname == u"cygwin": + # Even on Cygwin, we need a Windows-style path for this + home = unicode_subprocess(["cygpath", "--windows", home]) return self._jython_home @property def jython_opts(): - return os.environ.get("JYTHON_OPTS", "") + return get_env("JYTHON_OPTS", "") @property def classpath_delimiter(self): @@ -179,11 +238,9 @@ else: jars.append(os.path.join(self.jython_home, "javalib", "*")) elif not os.path.exists(os.path.join(self.jython_home, "jython.jar")): - bad_option("""{jython_home} contains neither jython-dev.jar nor jython.jar. + bad_option(u"""{} contains neither jython-dev.jar nor jython.jar. 
Try running this script from the 'bin' directory of an installed Jython or -setting {envvar_specifier}JYTHON_HOME.""".format( - jython_home=self.jython_home, - envvar_specifier="%" if self.uname == "windows" else "$")) +setting JYTHON_HOME.""".format(self.jython_home)) else: jars = [os.path.join(self.jython_home, "jython.jar")] self._jython_jars = jars @@ -194,14 +251,14 @@ if hasattr(self.args, "classpath"): return self.args.classpath else: - return os.environ.get("CLASSPATH", ".") + return get_env("CLASSPATH", ".") @property def java_mem(self): if hasattr(self.args, "mem"): return self.args.mem else: - return os.environ.get("JAVA_MEM", "-Xmx512m") + return get_env("JAVA_MEM", "-Xmx512m") @property def java_stack(self): @@ -213,7 +270,7 @@ @property def java_opts(self): return [self.java_mem, self.java_stack] - + @property def java_profile_agent(self): return os.path.join(self.jython_home, "javalib", "profile.jar") @@ -222,68 +279,84 @@ if "JAVA_ENCODING" not in os.environ and self.uname == "darwin" and "file.encoding" not in self.args.properties: self.args.properties["file.encoding"] = "UTF-8" - def convert(self, arg): - if sys.stdout.encoding: - return arg.encode(sys.stdout.encoding) - else: - return arg - def make_classpath(self, jars): return self.classpath_delimiter.join(jars) def convert_path(self, arg): - if self.uname == "cygwin": - if not arg.startswith("/cygdrive/"): - new_path = self.convert(arg).replace("/", "\\") + if self.uname == u"cygwin": + if not arg.startswith(u"/cygdrive/"): + return arg.replace(u"/", u"\\") else: - new_path = subprocess.check_output(["cygpath", "-pw", self.convert(arg)]).strip() - return new_path + arg = arg.replace('*', r'\*') # prevent globbing + return unicode_subprocess(["cygpath", "-pw", arg]) else: - return self.convert(arg) + return arg + + def unicode_subprocess(self, unicode_command): + """ Launch a command with subprocess.check_output() and read the + output, except everything is expected to be in Unicode. 
+ """ + cmd = [] + for c in unicode_command: + if isinstance(c, bytes): + cmd.append(c) + else: + cmd.append(c.encode(ENCODING)) + return subprocess.check_output(cmd).strip().decode(ENCODING) @property def command(self): + # Set default file encoding for just for Darwin (?) self.set_encoding() + + # Begin to build the Java part of the ultimate command args = [self.java_command] args.extend(self.java_opts) args.extend(self.args.java) + # Get the class path right (depends on --boot) classpath = self.java_classpath jython_jars = self.jython_jars if self.args.boot: - args.append("-Xbootclasspath/a:%s" % self.convert_path(self.make_classpath(jython_jars))) + args.append(u"-Xbootclasspath/a:%s" % self.convert_path(self.make_classpath(jython_jars))) else: classpath = self.make_classpath(jython_jars) + self.classpath_delimiter + classpath - args.extend(["-classpath", self.convert_path(classpath)]) + args.extend([u"-classpath", self.convert_path(classpath)]) if "python.home" not in self.args.properties: - args.append("-Dpython.home=%s" % self.convert_path(self.jython_home)) + args.append(u"-Dpython.home=%s" % self.convert_path(self.jython_home)) if "python.executable" not in self.args.properties: - args.append("-Dpython.executable=%s" % self.convert_path(self.executable)) + args.append(u"-Dpython.executable=%s" % self.convert_path(self.executable)) if "python.launcher.uname" not in self.args.properties: - args.append("-Dpython.launcher.uname=%s" % self.uname) - # Determines whether running on a tty for the benefit of + args.append(u"-Dpython.launcher.uname=%s" % self.uname) + + # Determine whether running on a tty for the benefit of # running on Cygwin. This step is needed because the Mintty # terminal emulator doesn't behave like a standard Microsoft # Windows tty, and so JNR Posix doesn't detect it properly. 
if "python.launcher.tty" not in self.args.properties: - args.append("-Dpython.launcher.tty=%s" % str(os.isatty(sys.stdin.fileno())).lower()) - if self.uname == "cygwin" and "python.console" not in self.args.properties: - args.append("-Dpython.console=org.python.core.PlainConsole") + args.append(u"-Dpython.launcher.tty=%s" % str(os.isatty(sys.stdin.fileno())).lower()) + if self.uname == u"cygwin" and "python.console" not in self.args.properties: + args.append(u"-Dpython.console=org.python.core.PlainConsole") + if self.args.profile: - args.append("-XX:-UseSplitVerifier") - args.append("-javaagent:%s" % self.convert_path(self.java_profile_agent)) + args.append(u"-XX:-UseSplitVerifier") + args.append(u"-javaagent:%s" % self.convert_path(self.java_profile_agent)) + for k, v in self.args.properties.iteritems(): - args.append("-D%s=%s" % (self.convert(k), self.convert(v))) - args.append("org.python.util.jython") + args.append(u"-D%s=%s" % (k, v)) + + args.append(u"org.python.util.jython") + if self.args.help: - args.append("--help") + args.append(u"--help") + args.extend(self.jython_args) return args def bad_option(msg): - print >> sys.stderr, """ + print >> sys.stderr, u""" {msg} usage: jython [option] ... [-c cmd | -m mod | file | -] [arg] ... Try `jython -h' for more information. @@ -312,19 +385,24 @@ """ def support_java_opts(args): + """ Generator from options intended for the JVM. Options beginning -D go + through unchanged, others are prefixed with -J. + """ + # Input is expected to be Unicode, but just in case ... 
+ if isinstance(args, bytes): args = args.decode(ENCODING) it = iter(args) while it: arg = next(it) - if arg.startswith("-D"): + if arg.startswith(u"-D"): yield arg - elif arg in ("-classpath", "-cp"): - yield "-J" + arg + elif arg in (u"-classpath", u"-cp"): + yield u"-J" + arg try: yield next(it) except StopIteration: bad_option("Argument expected for -classpath option in JAVA_OPTS") else: - yield "-J" + arg + yield u"-J" + arg # copied from subprocess module in Jython; see @@ -378,37 +456,36 @@ return argv - -def decode_args(sys_args): - args = [sys_args[0]] - - def get_env_opts(envvar): - opts = os.environ.get(envvar, "") - if is_windows: - return cmdline2list(opts) - else: - return shlex.split(opts) - - java_opts = get_env_opts("JAVA_OPTS") - jython_opts = get_env_opts("JYTHON_OPTS") - - args.extend(support_java_opts(java_opts)) - args.extend(sys_args[1:]) - - if sys.stdout.encoding: - if sys.stdout.encoding.lower() == "cp65001": - sys.exit("""Jython does not support code page 65001 (CP_UTF8). -Please try another code page by setting it with the chcp command.""") - args = [arg.decode(sys.stdout.encoding) for arg in args] - jython_opts = [arg.decode(sys.stdout.encoding) for arg in jython_opts] - - return args, jython_opts - +def get_env_opts(envvar): + """ Return a list of the values in the named environment variable, + split according to shell conventions, and decoded to Unicode. + """ + opts = os.environ.get(envvar, "") # bytes at this point + if is_windows: + opts = cmdline2list(opts) + else: + opts = shlex.split(opts) + return decode_list(opts) def main(sys_args): - sys_args, jython_opts = decode_args(sys_args) + # The entire program must work in Unicode + sys_args = decode_list(sys_args) + + # sys_args[0] is this script (which we'll replace with 'java' eventually). + # Insert options for the java command from the environment. 
+ sys_args[1:1] = support_java_opts(get_env_opts("JAVA_OPTS")) + + # Parse the composite arguments (yes, even the ones from JAVA_OPTS), + # and return the "unparsed" tail considered arguments for Jython itself. args, jython_args = parse_launcher_args(sys_args) + + # Build the data from which we can generate the command ultimately. + # Jython options supplied from the environment stand in front of the + # unparsed tail from the command line. + jython_opts = get_env_opts("JYTHON_OPTS") jython_command = JythonCommand(args, jython_opts + jython_args) + + # This is the "fully adjusted" command to launch, but still as Unicode. command = jython_command.command if args.profile and not args.help: @@ -416,23 +493,32 @@ os.unlink("profile.txt") except OSError: pass + if args.print_requested and not args.help: - if jython_command.uname == "windows": - print subprocess.list2cmdline(jython_command.command) + if jython_command.uname == u"windows": + # Add escapes and quotes necessary to Windows. + # Normally used for a byte strings but Python is tolerant :) + command_line = subprocess.list2cmdline(command) else: - print " ".join(pipes.quote(arg) for arg in jython_command.command) + # Just concatenate with spaces + command_line = u" ".join(command) + # It is possible the Unicode cannot be encoded for the console + enc = sys.stdout.encoding or 'ascii' + sys.stdout.write(command_line.encode(enc, 'replace')) else: - if not (is_windows or not hasattr(os, "execvp") or args.help or jython_command.uname == "cygwin"): + if not (is_windows or not hasattr(os, "execvp") or args.help or + jython_command.uname == u"cygwin"): # Replace this process with the java process. # # NB such replacements actually do not work under Windows, # but if tried, they also fail very badly by hanging. # So don't even try! 
+ command = encode_list(command) os.execvp(command[0], command[1:]) else: result = 1 try: - result = subprocess.call(command) + result = subprocess.call(encode_list(command)) if args.help: print_help() except KeyboardInterrupt: -- Repository URL: https://hg.python.org/jython From jython-checkins at python.org Sun May 21 05:06:50 2017 From: jython-checkins at python.org (jeff.allen) Date: Sun, 21 May 2017 09:06:50 +0000 Subject: [Jython-checkins] =?utf-8?q?jython_=28merge_default_-=3E_default?= =?utf-8?q?=29=3A_Merge_UTF-8_file_system_encoding_changes_to_trunk_=28fix?= =?utf-8?q?es_=231839=2C_=232356=29?= Message-ID: <20170521090148.39614.CC09A201A54220C7@psf.io> https://hg.python.org/jython/rev/3a62f20cc160 changeset: 8092:3a62f20cc160 parent: 8081:c382818607a0 parent: 8091:f4a6679623d7 user: Jeff Allen date: Sun May 21 08:49:49 2017 +0100 summary: Merge UTF-8 file system encoding changes to trunk (fixes #1839, #2356) files: CPythonLib.includes | 1 + Lib/encodings/_java.py | 8 +- Lib/javashell.py | 2 +- Lib/lib2to3/tests/test_main.py | 155 ++ Lib/ntpath.py | 560 ---------- Lib/subprocess.py | 38 +- Lib/sysconfig.py | 6 + Lib/test/regrtest.py | 1 - Lib/test/script_helper.py | 6 +- Lib/test/test_bytecodetools_jy.py | 10 +- Lib/test/test_exceptions.py | 3 - Lib/test/test_exceptions_jy.py | 5 +- Lib/test/test_httpservers.py | 3 + Lib/test/test_io.py | 1 + Lib/test/test_java_integration.py | 15 +- Lib/test/test_java_visibility.py | 11 +- Lib/test/test_jser.py | 4 +- Lib/test/test_jython_launcher.py | 8 +- Lib/test/test_os_jy.py | 85 +- Lib/test/test_runpy.py | 402 ------- Lib/test/test_ssl.py | 8 +- Lib/test/test_support.py | 9 +- Lib/test/test_sys.py | 2 - Lib/test/test_zipimport_jy.py | 6 +- Lib/test/test_zipimport_support.py | 20 +- build.xml | 3 + src/org/python/core/Py.java | 350 +++++- src/org/python/core/PyBaseCode.java | 8 +- src/org/python/core/PyBaseException.java | 17 +- src/org/python/core/PyBytecode.java | 9 +- src/org/python/core/PyException.java | 25 +- 
src/org/python/core/PyFile.java | 6 +- src/org/python/core/PyJavaPackage.java | 8 +- src/org/python/core/PyLong.java | 3 + src/org/python/core/PyNullImporter.java | 13 +- src/org/python/core/PyString.java | 53 +- src/org/python/core/PySyntaxError.java | 8 +- src/org/python/core/PySystemState.java | 66 +- src/org/python/core/PyTableCode.java | 6 +- src/org/python/core/PyUnicode.java | 4 +- src/org/python/core/StdoutWrapper.java | 35 +- src/org/python/core/SyspathArchive.java | 7 +- src/org/python/core/SyspathJavaLoader.java | 55 +- src/org/python/core/__builtin__.java | 8 +- src/org/python/core/imp.java | 29 +- src/org/python/core/io/FileIO.java | 13 +- src/org/python/core/packagecache/PathPackageManager.java | 20 +- src/org/python/modules/_imp.java | 82 +- src/org/python/modules/_py_compile.java | 36 +- src/org/python/modules/posix/PosixModule.java | 30 +- src/org/python/modules/zipimport/zipimporter.java | 19 +- src/org/python/util/jython.java | 10 +- src/shell/jython.exe | Bin src/shell/jython.py | 314 +++-- 54 files changed, 1160 insertions(+), 1446 deletions(-) diff --git a/CPythonLib.includes b/CPythonLib.includes --- a/CPythonLib.includes +++ b/CPythonLib.includes @@ -110,6 +110,7 @@ netrc.py nntplib.py numbers.py +ntpath.py nturl2path.py opcode.py optparse.py diff --git a/Lib/encodings/_java.py b/Lib/encodings/_java.py --- a/Lib/encodings/_java.py +++ b/Lib/encodings/_java.py @@ -162,12 +162,16 @@ def reset(self): self.buffer = "" + self.decoder.reset() def getstate(self): - return self.buffer or 0 + # No way to extract the internal state of a Java decoder. + return self.buffer or "", 0 def setstate(self, state): - self.buffer = state or "" + self.buffer, _ = state or ("", 0) + # No way to restore: reset possible EOF state. 
+ self.decoder.reset() class StreamWriter(NonfinalCodec, codecs.StreamWriter): diff --git a/Lib/javashell.py b/Lib/javashell.py --- a/Lib/javashell.py +++ b/Lib/javashell.py @@ -55,7 +55,7 @@ env = self._formatEnvironment( self.environment ) try: - p = Runtime.getRuntime().exec( shellCmd, env, File(os.getcwd()) ) + p = Runtime.getRuntime().exec( shellCmd, env, File(os.getcwdu()) ) return p except IOException, ex: raise OSError( diff --git a/Lib/lib2to3/tests/test_main.py b/Lib/lib2to3/tests/test_main.py new file mode 100644 --- /dev/null +++ b/Lib/lib2to3/tests/test_main.py @@ -0,0 +1,155 @@ +# -*- coding: utf-8 -*- +import sys +import codecs +import logging +import os +import re +import shutil +import StringIO +import sys +import tempfile +import unittest + +from lib2to3 import main + + +TEST_DATA_DIR = os.path.join(os.path.dirname(__file__), "data") +PY2_TEST_MODULE = os.path.join(TEST_DATA_DIR, "py2_test_grammar.py") + + +class TestMain(unittest.TestCase): + + if not hasattr(unittest.TestCase, 'assertNotRegex'): + # This method was only introduced in 3.2. + def assertNotRegex(self, text, regexp, msg=None): + import re + if not hasattr(regexp, 'search'): + regexp = re.compile(regexp) + if regexp.search(text): + self.fail("regexp %s MATCHED text %r" % (regexp.pattern, text)) + + def setUp(self): + self.temp_dir = None # tearDown() will rmtree this directory if set. + + def tearDown(self): + # Clean up logging configuration down by main. 
+ del logging.root.handlers[:] + if self.temp_dir: + shutil.rmtree(self.temp_dir) + + def run_2to3_capture(self, args, in_capture, out_capture, err_capture): + save_stdin = sys.stdin + save_stdout = sys.stdout + save_stderr = sys.stderr + sys.stdin = in_capture + sys.stdout = out_capture + sys.stderr = err_capture + try: + return main.main("lib2to3.fixes", args) + finally: + sys.stdin = save_stdin + sys.stdout = save_stdout + sys.stderr = save_stderr + + def test_unencodable_diff(self): + input_stream = StringIO.StringIO(u"print 'nothing'\nprint u'über'\n") + out = StringIO.StringIO() + out_enc = codecs.getwriter("ascii")(out) + err = StringIO.StringIO() + ret = self.run_2to3_capture(["-"], input_stream, out_enc, err) + self.assertEqual(ret, 0) + output = out.getvalue() + self.assertTrue("-print 'nothing'" in output) + self.assertTrue("WARNING: couldn't encode <stdin>'s diff for " + "your terminal" in err.getvalue()) + + def setup_test_source_trees(self): + """Setup a test source tree and output destination tree.""" + self.temp_dir = tempfile.mkdtemp() # tearDown() cleans this up. + + # Make the directory names unicode, in case the temporary directory has + # a non-ascii name, since refactor.py uses unicode strings internally. + # (Added for Jython but is test failure in CPython 2.7.13 too.) + self.temp_dir = self.temp_dir.decode(sys.getfilesystemencoding()) + + self.py2_src_dir = os.path.join(self.temp_dir, "python2_project") + self.py3_dest_dir = os.path.join(self.temp_dir, "python3_project") + os.mkdir(self.py2_src_dir) + os.mkdir(self.py3_dest_dir) + # Turn it into a package with a few files.
+ self.setup_files = [] + open(os.path.join(self.py2_src_dir, "__init__.py"), "w").close() + self.setup_files.append("__init__.py") + shutil.copy(PY2_TEST_MODULE, self.py2_src_dir) + self.setup_files.append(os.path.basename(PY2_TEST_MODULE)) + self.trivial_py2_file = os.path.join(self.py2_src_dir, "trivial.py") + self.init_py2_file = os.path.join(self.py2_src_dir, "__init__.py") + with open(self.trivial_py2_file, "w") as trivial: + trivial.write("print 'I need a simple conversion.'") + self.setup_files.append("trivial.py") + + def test_filename_changing_on_output_single_dir(self): + """2to3 a single directory with a new output dir and suffix.""" + self.setup_test_source_trees() + out = StringIO.StringIO() + err = StringIO.StringIO() + suffix = "TEST" + ret = self.run_2to3_capture( + ["-n", "--add-suffix", suffix, "--write-unchanged-files", + "--no-diffs", "--output-dir", + self.py3_dest_dir, self.py2_src_dir], + StringIO.StringIO(""), out, err) + self.assertEqual(ret, 0) + stderr = err.getvalue() + self.assertIn(" implies -w.", stderr) + self.assertIn( + "Output in %r will mirror the input directory %r layout" % ( + self.py3_dest_dir, self.py2_src_dir), stderr) + self.assertEqual(set(name+suffix for name in self.setup_files), + set(os.listdir(self.py3_dest_dir))) + for name in self.setup_files: + self.assertIn("Writing converted %s to %s" % ( + os.path.join(self.py2_src_dir, name), + os.path.join(self.py3_dest_dir, name+suffix)), stderr) + sep = re.escape(os.sep) + self.assertRegexpMatches( + stderr, r"No changes to .*/__init__\.py".replace("/", sep)) + self.assertNotRegex( + stderr, r"No changes to .*/trivial\.py".replace("/", sep)) + + def test_filename_changing_on_output_two_files(self): + """2to3 two files in one directory with a new output dir.""" + self.setup_test_source_trees() + err = StringIO.StringIO() + py2_files = [self.trivial_py2_file, self.init_py2_file] + expected_files = set(os.path.basename(name) for name in py2_files) + ret = 
self.run_2to3_capture( + ["-n", "-w", "--write-unchanged-files", + "--no-diffs", "--output-dir", self.py3_dest_dir] + py2_files, + StringIO.StringIO(""), StringIO.StringIO(), err) + self.assertEqual(ret, 0) + stderr = err.getvalue() + self.assertIn( + "Output in %r will mirror the input directory %r layout" % ( + self.py3_dest_dir, self.py2_src_dir), stderr) + self.assertEqual(expected_files, set(os.listdir(self.py3_dest_dir))) + + def test_filename_changing_on_output_single_file(self): + """2to3 a single file with a new output dir.""" + self.setup_test_source_trees() + err = StringIO.StringIO() + ret = self.run_2to3_capture( + ["-n", "-w", "--no-diffs", "--output-dir", self.py3_dest_dir, + self.trivial_py2_file], + StringIO.StringIO(""), StringIO.StringIO(), err) + self.assertEqual(ret, 0) + stderr = err.getvalue() + self.assertIn( + "Output in %r will mirror the input directory %r layout" % ( + self.py3_dest_dir, self.py2_src_dir), stderr) + self.assertEqual(set([os.path.basename(self.trivial_py2_file)]), + set(os.listdir(self.py3_dest_dir))) + + +if __name__ == '__main__': + unittest.main() diff --git a/Lib/ntpath.py b/Lib/ntpath.py deleted file mode 100644 --- a/Lib/ntpath.py +++ /dev/null @@ -1,560 +0,0 @@ -# Module 'ntpath' -- common operations on WinNT/Win95 pathnames -"""Common pathname manipulations, WindowsNT/95 version. - -Instead of importing this module directly, import os and refer to this -module as os.path. 
-""" - -import os -import sys -import stat -import genericpath -import warnings - -from genericpath import * - -__all__ = ["normcase","isabs","join","splitdrive","split","splitext", - "basename","dirname","commonprefix","getsize","getmtime", - "getatime","getctime", "islink","exists","lexists","isdir","isfile", - "ismount","walk","expanduser","expandvars","normpath","abspath", - "splitunc","curdir","pardir","sep","pathsep","defpath","altsep", - "extsep","devnull","realpath","supports_unicode_filenames","relpath"] - -# strings representing various path-related bits and pieces -curdir = '.' -pardir = '..' -extsep = '.' -sep = '\\' -pathsep = ';' -altsep = '/' -defpath = '.;C:\\bin' -if 'ce' in sys.builtin_module_names: - defpath = '\\Windows' -elif 'os2' in sys.builtin_module_names: - # OS/2 w/ VACPP - altsep = '/' -devnull = 'nul' - -# Normalize the case of a pathname and map slashes to backslashes. -# Other normalizations (such as optimizing '../' away) are not done -# (this is done by normpath). - -def normcase(s): - """Normalize case of pathname. - - Makes all characters lowercase and all slashes into backslashes.""" - return s.replace("/", "\\").lower() - - -# Return whether a path is absolute. -# Trivial in Posix, harder on the Mac or MS-DOS. -# For DOS it is absolute if it starts with a slash or backslash (current -# volume), or if a pathname after the volume letter and colon / UNC resource -# starts with a slash or backslash. - -def isabs(s): - """Test whether a path is absolute""" - s = splitdrive(s)[1] - return s != '' and s[:1] in '/\\' - - -# Join two (or more) paths. - -def join(a, *p): - """Join two or more pathname components, inserting "\\" as needed. - If any component is an absolute path, all previous path components - will be discarded.""" - path = a - for b in p: - b_wins = 0 # set to 1 iff b makes path irrelevant - if path == "": - b_wins = 1 - - elif isabs(b): - # This probably wipes out path so far. 
However, it's more - # complicated if path begins with a drive letter: - # 1. join('c:', '/a') == 'c:/a' - # 2. join('c:/', '/a') == 'c:/a' - # But - # 3. join('c:/a', '/b') == '/b' - # 4. join('c:', 'd:/') = 'd:/' - # 5. join('c:/', 'd:/') = 'd:/' - if path[1:2] != ":" or b[1:2] == ":": - # Path doesn't start with a drive letter, or cases 4 and 5. - b_wins = 1 - - # Else path has a drive letter, and b doesn't but is absolute. - elif len(path) > 3 or (len(path) == 3 and - path[-1] not in "/\\"): - # case 3 - b_wins = 1 - - if b_wins: - path = b - else: - # Join, and ensure there's a separator. - assert len(path) > 0 - if path[-1] in "/\\": - if b and b[0] in "/\\": - path += b[1:] - else: - path += b - elif path[-1] == ":": - path += b - elif b: - if b[0] in "/\\": - path += b - else: - path += "\\" + b - else: - # path is not empty and does not end with a backslash, - # but b is empty; since, e.g., split('a/') produces - # ('a', ''), it's best if join() adds a backslash in - # this case. - path += '\\' - - return path - - -# Split a path in a drive specification (a drive letter followed by a -# colon) and the path specification. -# It is always true that drivespec + pathspec == p -def splitdrive(p): - """Split a pathname into drive and path specifiers. Returns a 2-tuple -"(drive,path)"; either part may be empty""" - if p[1:2] == ':': - return p[0:2], p[2:] - return '', p - - -# Parse UNC paths -def splitunc(p): - """Split a pathname into UNC mount point and relative path specifiers. - - Return a 2-tuple (unc, rest); either part may be empty. - If unc is not empty, it has the form '//host/mount' (or similar - using backslashes). unc+rest is always the input path. - Paths containing drive letters never have an UNC part. - """ - if p[1:2] == ':': - return '', p # Drive letter present - firstTwo = p[0:2] - if firstTwo == '//' or firstTwo == '\\\\': - # is a UNC path: - # vvvvvvvvvvvvvvvvvvvv equivalent to drive letter - # \\machine\mountpoint\directories... 
- # directory ^^^^^^^^^^^^^^^ - normp = normcase(p) - index = normp.find('\\', 2) - if index == -1: - ##raise RuntimeError, 'illegal UNC path: "' + p + '"' - return ("", p) - index = normp.find('\\', index + 1) - if index == -1: - index = len(p) - return p[:index], p[index:] - return '', p - - -# Split a path in head (everything up to the last '/') and tail (the -# rest). After the trailing '/' is stripped, the invariant -# join(head, tail) == p holds. -# The resulting head won't end in '/' unless it is the root. - -def split(p): - """Split a pathname. - - Return tuple (head, tail) where tail is everything after the final slash. - Either part may be empty.""" - - d, p = splitdrive(p) - # set i to index beyond p's last slash - i = len(p) - while i and p[i-1] not in '/\\': - i = i - 1 - head, tail = p[:i], p[i:] # now tail has no slashes - # remove trailing slashes from head, unless it's all slashes - head2 = head - while head2 and head2[-1] in '/\\': - head2 = head2[:-1] - head = head2 or head - return d + head, tail - - -# Split a path in root and extension. -# The extension is everything starting at the last dot in the last -# pathname component; the root is everything before that. -# It is always true that root + ext == p. - -def splitext(p): - return genericpath._splitext(p, sep, altsep, extsep) -splitext.__doc__ = genericpath._splitext.__doc__ - - -# Return the tail (basename) part of a path. - -def basename(p): - """Returns the final component of a pathname""" - return split(p)[1] - - -# Return the head (dirname) part of a path. - -def dirname(p): - """Returns the directory component of a pathname""" - return split(p)[0] - -# Is a path a symbolic link? -# This will always return false on systems where posix.lstat doesn't exist. - -def islink(path): - """Test for symbolic link. - On WindowsNT/95 and OS/2 always returns false - """ - return False - -# alias exists to lexists -lexists = exists - -# Is a path a mount point? 
Either a root (with or without drive letter) -# or an UNC path with at most a / or \ after the mount point. - -def ismount(path): - """Test whether a path is a mount point (defined as root of drive)""" - unc, rest = splitunc(path) - if unc: - return rest in ("", "/", "\\") - p = splitdrive(path)[1] - return len(p) == 1 and p[0] in '/\\' - - -# Directory tree walk. -# For each directory under top (including top itself, but excluding -# '.' and '..'), func(arg, dirname, filenames) is called, where -# dirname is the name of the directory and filenames is the list -# of files (and subdirectories etc.) in the directory. -# The func may modify the filenames list, to implement a filter, -# or to impose a different order of visiting. - -def walk(top, func, arg): - """Directory tree walk with callback function. - - For each directory in the directory tree rooted at top (including top - itself, but excluding '.' and '..'), call func(arg, dirname, fnames). - dirname is the name of the directory, and fnames a list of the names of - the files and subdirectories in dirname (excluding '.' and '..'). func - may modify the fnames list in-place (e.g. via del or slice assignment), - and walk will only recurse into the subdirectories whose names remain in - fnames; this can be used to implement a filter, or to impose a specific - order of visiting. No semantics are defined for, or required of, arg, - beyond that arg is always passed to func. It can be used, e.g., to pass - a filename pattern, or a mutable object designed to accumulate - statistics. Passing None for arg is common.""" - warnings.warnpy3k("In 3.x, os.path.walk is removed in favor of os.walk.", - stacklevel=2) - try: - names = os.listdir(top) - except os.error: - return - func(arg, top, names) - for name in names: - name = join(top, name) - if isdir(name): - walk(name, func, arg) - - -# Expand paths beginning with '~' or '~user'. -# '~' means $HOME; '~user' means that user's home directory. 
-# If the path doesn't begin with '~', or if the user or $HOME is unknown, -# the path is returned unchanged (leaving error reporting to whatever -# function is called with the expanded path as argument). -# See also module 'glob' for expansion of *, ? and [...] in pathnames. -# (A function should also be defined to do full *sh-style environment -# variable expansion.) - -def expanduser(path): - """Expand ~ and ~user constructs. - - If user or $HOME is unknown, do nothing.""" - if path[:1] != '~': - return path - i, n = 1, len(path) - while i < n and path[i] not in '/\\': - i = i + 1 - - if 'HOME' in os.environ: - userhome = os.environ['HOME'] - elif 'USERPROFILE' in os.environ: - userhome = os.environ['USERPROFILE'] - elif not 'HOMEPATH' in os.environ: - return path - else: - try: - drive = os.environ['HOMEDRIVE'] - except KeyError: - drive = '' - userhome = join(drive, os.environ['HOMEPATH']) - - if i != 1: #~user - userhome = join(dirname(userhome), path[1:i]) - - return userhome + path[i:] - - -# Expand paths containing shell variable substitutions. -# The following rules apply: -# - no expansion within single quotes -# - '$$' is translated into '$' -# - '%%' is translated into '%' if '%%' are not seen in %var1%%var2% -# - ${varname} is accepted. -# - $varname is accepted. -# - %varname% is accepted. -# - varnames can be made out of letters, digits and the characters '_-' -# (though is not verifed in the ${varname} and %varname% cases) -# XXX With COMMAND.COM you can use any characters in a variable name, -# XXX except '^|<>='. - -def expandvars(path): - """Expand shell variables of the forms $var, ${var} and %var%. 
- - Unknown variables are left unchanged.""" - if '$' not in path and '%' not in path: - return path - import string - varchars = string.ascii_letters + string.digits + '_-' - res = '' - index = 0 - pathlen = len(path) - while index < pathlen: - c = path[index] - if c == '\'': # no expansion within single quotes - path = path[index + 1:] - pathlen = len(path) - try: - index = path.index('\'') - res = res + '\'' + path[:index + 1] - except ValueError: - res = res + path - index = pathlen - 1 - elif c == '%': # variable or '%' - if path[index + 1:index + 2] == '%': - res = res + c - index = index + 1 - else: - path = path[index+1:] - pathlen = len(path) - try: - index = path.index('%') - except ValueError: - res = res + '%' + path - index = pathlen - 1 - else: - var = path[:index] - if var in os.environ: - res = res + os.environ[var] - else: - res = res + '%' + var + '%' - elif c == '$': # variable or '$$' - if path[index + 1:index + 2] == '$': - res = res + c - index = index + 1 - elif path[index + 1:index + 2] == '{': - path = path[index+2:] - pathlen = len(path) - try: - index = path.index('}') - var = path[:index] - if var in os.environ: - res = res + os.environ[var] - else: - res = res + '${' + var + '}' - except ValueError: - res = res + '${' + path - index = pathlen - 1 - else: - var = '' - index = index + 1 - c = path[index:index + 1] - while c != '' and c in varchars: - var = var + c - index = index + 1 - c = path[index:index + 1] - if var in os.environ: - res = res + os.environ[var] - else: - res = res + '$' + var - if c != '': - index = index - 1 - else: - res = res + c - index = index + 1 - return res - - -# Normalize a path, e.g. A//B, A/./B and A/foo/../B all become A\B. -# Previously, this function also truncated pathnames to 8+3 format, -# but as this module is called "ntpath", that's obviously wrong! 
- -def normpath(path): - """Normalize path, eliminating double slashes, etc.""" - # Preserve unicode (if path is unicode) - backslash, dot = (u'\\', u'.') if isinstance(path, unicode) else ('\\', '.') - if path.startswith(('\\\\.\\', '\\\\?\\')): - # in the case of paths with these prefixes: - # \\.\ -> device names - # \\?\ -> literal paths - # do not do any normalization, but return the path unchanged - return path - path = path.replace("/", "\\") - prefix, path = splitdrive(path) - # We need to be careful here. If the prefix is empty, and the path starts - # with a backslash, it could either be an absolute path on the current - # drive (\dir1\dir2\file) or a UNC filename (\\server\mount\dir1\file). It - # is therefore imperative NOT to collapse multiple backslashes blindly in - # that case. - # The code below preserves multiple backslashes when there is no drive - # letter. This means that the invalid filename \\\a\b is preserved - # unchanged, where a\\\b is normalised to a\b. It's not clear that there - # is any better behaviour for such edge cases. - if prefix == '': - # No drive letter - preserve initial backslashes - while path[:1] == "\\": - prefix = prefix + backslash - path = path[1:] - else: - # We have a drive letter - collapse initial backslashes - if path.startswith("\\"): - prefix = prefix + backslash - path = path.lstrip("\\") - comps = path.split("\\") - i = 0 - while i < len(comps): - if comps[i] in ('.', ''): - del comps[i] - elif comps[i] == '..': - if i > 0 and comps[i-1] != '..': - del comps[i-1:i+1] - i -= 1 - elif i == 0 and prefix.endswith("\\"): - del comps[i] - else: - i += 1 - else: - i += 1 - # If the path is now empty, substitute '.' - if not prefix and not comps: - comps.append(dot) - return prefix + backslash.join(comps) - - -# Return an absolute path. 
-try: - from nt import _getfullpathname - -except ImportError: # no built-in nt module - maybe it's Jython ;) - - if os._name == 'nt' : - # on Windows so Java version of sys deals in NT paths - def abspath(path): - """Return the absolute version of a path.""" - try: - if isinstance(path, unicode): - # Result must be unicode - if path: - path = sys.getPath(path) - else: - # Empty path must return current working directory - path = os.getcwdu() - else: - # Result must be bytes - if path: - path = sys.getPath(path).encode('latin-1') - else: - # Empty path must return current working directory - path = os.getcwd() - except EnvironmentError: - pass # Bad path - return unchanged. - return normpath(path) - - else: - # not running on Windows - mock up something sensible - def abspath(path): - """Return the absolute version of a path.""" - try: - if isinstance(path, unicode): - # Result must be unicode - if path: - path = join(os.getcwdu(), path) - else: - # Empty path must return current working directory - path = os.getcwdu() - else: - # Result must be bytes - if path: - path = join(os.getcwd(), path) - else: - # Empty path must return current working directory - path = os.getcwd() - except EnvironmentError: - pass # Bad path - return unchanged. - return normpath(path) - -else: # use native Windows method on Windows - def abspath(path): - """Return the absolute version of a path.""" - - if path: # Empty path must return current working directory. - try: - path = _getfullpathname(path) - except WindowsError: - pass # Bad path - return unchanged. - elif isinstance(path, unicode): - path = os.getcwdu() - else: - path = os.getcwd() - return normpath(path) - -# realpath is a no-op on systems without islink support -realpath = abspath -# Win9x family and earlier have no Unicode filename support. 
-supports_unicode_filenames = (hasattr(sys, "getwindowsversion") and - sys.getwindowsversion()[3] >= 2) - -def _abspath_split(path): - abs = abspath(normpath(path)) - prefix, rest = splitunc(abs) - is_unc = bool(prefix) - if not is_unc: - prefix, rest = splitdrive(abs) - return is_unc, prefix, [x for x in rest.split(sep) if x] - -def relpath(path, start=curdir): - """Return a relative version of a path""" - - if not path: - raise ValueError("no path specified") - - start_is_unc, start_prefix, start_list = _abspath_split(start) - path_is_unc, path_prefix, path_list = _abspath_split(path) - - if path_is_unc ^ start_is_unc: - raise ValueError("Cannot mix UNC and non-UNC paths (%s and %s)" - % (path, start)) - if path_prefix.lower() != start_prefix.lower(): - if path_is_unc: - raise ValueError("path is on UNC root %s, start on UNC root %s" - % (path_prefix, start_prefix)) - else: - raise ValueError("path is on drive %s, start on drive %s" - % (path_prefix, start_prefix)) - # Work out how much of the filepath is shared by start and path. - i = 0 - for e1, e2 in zip(start_list, path_list): - if e1.lower() != e2.lower(): - break - i += 1 - - rel_list = [pardir] * (len(start_list)-i) + path_list[i:] - if not rel_list: - return curdir - return join(*rel_list) diff --git a/Lib/subprocess.py b/Lib/subprocess.py --- a/Lib/subprocess.py +++ b/Lib/subprocess.py @@ -438,6 +438,7 @@ import java.nio.ByteBuffer import org.python.core.io.RawIOBase import org.python.core.io.StreamIO + from org.python.core.Py import fileSystemDecode else: import select _has_poll = hasattr(select, 'poll') @@ -779,7 +780,7 @@ maintain those byte values (which may be butchered as Strings) for the subprocess if they haven't been modified. 
""" - # Determine what's safe to merge + # Determine what's necessary to merge (new or different) merge_env = dict((key, value) for key, value in env.iteritems() if key not in builder_env or builder_env.get(key) != value) @@ -789,8 +790,10 @@ for entry in entries: if entry.getKey() not in env: entries.remove() - - builder_env.putAll(merge_env) + # add anything new or different in env + for key, value in merge_env.iteritems(): + # If the new value is bytes, assume it to be FS-encoded + builder_env.put(key, fileSystemDecode(value)) class Popen(object): @@ -1308,9 +1311,6 @@ args = _cmdline2listimpl(args) else: args = list(args) - # NOTE: CPython posix (execv) will str() any unicode - # args first, maybe we should do the same on - # posix. Windows passes unicode through, however if any(not isinstance(arg, (str, unicode)) for arg in args): raise TypeError('args must contain only strings') args = _escape_args(args) @@ -1321,6 +1321,11 @@ if executable is not None: args[0] = executable + # NOTE: CPython posix (execv) will FS-encode any unicode args, but + # pass on bytes unchanged, because that's what the system expects. + # Java expects unicode, so we do the converse: leave unicode + # unchanged but FS-decode any supplied as bytes. + args = [fileSystemDecode(arg) for arg in args] builder = java.lang.ProcessBuilder(args) if stdin is None: @@ -1330,16 +1335,20 @@ if stderr is None: builder.redirectError(java.lang.ProcessBuilder.Redirect.INHERIT) - # os.environ may be inherited for compatibility with CPython + # os.environ may be inherited for compatibility with CPython. + # Elements taken from os.environ are FS-decoded to unicode. _setup_env(dict(os.environ if env is None else env), builder.environment()) + # The current working directory must also be unicode. 
if cwd is None: - cwd = os.getcwd() - elif not os.path.exists(cwd): - raise OSError(errno.ENOENT, os.strerror(errno.ENOENT), cwd) - elif not os.path.isdir(cwd): - raise OSError(errno.ENOTDIR, os.strerror(errno.ENOTDIR), cwd) + cwd = os.getcwdu() + else: + cwd = fileSystemDecode(cwd) + if not os.path.exists(cwd): + raise OSError(errno.ENOENT, os.strerror(errno.ENOENT), cwd) + elif not os.path.isdir(cwd): + raise OSError(errno.ENOTDIR, os.strerror(errno.ENOTDIR), cwd) builder.directory(java.io.File(cwd)) # Let Java manage redirection of stderr to stdout (it's more @@ -1890,9 +1899,10 @@ args = _cmdline2listimpl(command) args = _escape_args(args) args = _shell_command + args - cwd = os.getcwd() + cwd = os.getcwdu() - + # Python supplies FS-encoded arguments while Java expects String + args = [fileSystemDecode(arg) for arg in args] builder = java.lang.ProcessBuilder(args) builder.directory(java.io.File(cwd)) diff --git a/Lib/sysconfig.py b/Lib/sysconfig.py --- a/Lib/sysconfig.py +++ b/Lib/sysconfig.py @@ -5,6 +5,11 @@ import os from os.path import pardir, realpath +def fileSystemEncode(path): + if isinstance(path, unicode): + return path.encode(sys.getfilesystemencoding()) + return path + _INSTALL_SCHEMES = { 'posix_prefix': { 'stdlib': '{base}/lib/python{py_version_short}', @@ -116,6 +121,7 @@ def _safe_realpath(path): try: + path = fileSystemEncode(path) return realpath(path) except OSError: return path diff --git a/Lib/test/regrtest.py b/Lib/test/regrtest.py --- a/Lib/test/regrtest.py +++ b/Lib/test/regrtest.py @@ -1372,7 +1372,6 @@ test_mailbox # fails miserably and ruins other tests test_os_jy # Locale tests fail on Cygwin (but not Windows) # test_popen # Passes, but see http://bugs.python.org/issue1559298 - test_runpy # OSError: unlink() test_select_new # Hangs (Windows), though ok run singly test_urllib2 # file not on local host (likely Windows only) """, diff --git a/Lib/test/script_helper.py b/Lib/test/script_helper.py --- a/Lib/test/script_helper.py +++ 
b/Lib/test/script_helper.py @@ -20,6 +20,8 @@ from test.test_support import strip_python_stderr +_IS_JYTHON_WINDOWS = sys.platform.startswith('java') and os._name == 'nt' + # Executing the interpreter in a subprocess def _assert_python(expected_success, *args, **env_vars): cmd_line = [sys.executable] @@ -101,7 +103,9 @@ try: yield dirname finally: - shutil.rmtree(dirname) + # On Windows, unlink failures within rmtree often mask the true nature + # of a failing test (or sometimes a passing one). + shutil.rmtree(dirname, ignore_errors=_IS_JYTHON_WINDOWS) def make_script(script_dir, script_basename, source): script_filename = script_basename+os.extsep+'py' diff --git a/Lib/test/test_bytecodetools_jy.py b/Lib/test/test_bytecodetools_jy.py --- a/Lib/test/test_bytecodetools_jy.py +++ b/Lib/test/test_bytecodetools_jy.py @@ -69,7 +69,11 @@ """ProxyDebugDirectory used to be the only way to save proxied classes""" def setUp(self): - self.tmpdir = tempfile.mkdtemp() + tmp = tempfile.mkdtemp() + # Ensure Unicode since derived file paths are used in Java calls + if isinstance(tmp, bytes): + tmp = tmp.decode(sys.getfilesystemencoding()) + self.tmpdir = tmp def tearDown(self): test_support.rmtree(self.tmpdir) @@ -82,7 +86,7 @@ class C(Callable): def call(self): return 47 - + self.assertEqual(C().call(), 47) proxy_dir = os.path.join(self.tmpdir, "org", "python", "proxies") # If test script is run outside of regrtest, the first path is used; @@ -93,7 +97,7 @@ self.assertRegexpMatches( proxy_classes[0], r'\$C\$\d+.class$') - + def test_main(): test_support.run_unittest( diff --git a/Lib/test/test_exceptions.py b/Lib/test/test_exceptions.py --- a/Lib/test/test_exceptions.py +++ b/Lib/test/test_exceptions.py @@ -524,7 +524,6 @@ self.check_same_msg(Exception(), '') - @unittest.skipIf(is_jython, "FIXME: not working in Jython") def test_0_args_with_overridden___str__(self): """Check same msg for exceptions with 0 args and overridden __str__""" # str() and unicode() on an exception with 
overridden __str__ that @@ -550,7 +549,6 @@ self.assertRaises(UnicodeEncodeError, str, e) self.assertEqual(unicode(e), u'f\xf6\xf6') - @unittest.skipIf(is_jython, "FIXME: not working in Jython") def test_1_arg_with_overridden___str__(self): """Check same msg for exceptions with overridden __str__ and 1 arg""" # when __str__ is overridden and __unicode__ is not implemented @@ -575,7 +573,6 @@ for args in argslist: self.check_same_msg(Exception(*args), repr(args)) - @unittest.skipIf(is_jython, "FIXME: not working in Jython") def test_many_args_with_overridden___str__(self): """Check same msg for exceptions with overridden __str__ and many args""" # if __str__ returns an ascii string / ascii unicode string diff --git a/Lib/test/test_exceptions_jy.py b/Lib/test/test_exceptions_jy.py --- a/Lib/test/test_exceptions_jy.py +++ b/Lib/test/test_exceptions_jy.py @@ -70,11 +70,12 @@ # But the exception hook, via Py#displayException, does not fail when attempting to __str__ the exception args with test_support.captured_stderr() as s: sys.excepthook(RuntimeError, u"Drink \u2615", None) - self.assertEqual(s.getvalue(), "RuntimeError\n") + # At minimum, it tells us what kind of exception it was + self.assertEqual(s.getvalue()[:12], "RuntimeError") # It is fine with ascii values, of course with test_support.captured_stderr() as s: sys.excepthook(RuntimeError, u"Drink java", None) - self.assertEqual(s.getvalue(), "RuntimeError: Drink java\n") + self.assertEqual(s.getvalue(), "RuntimeError: Drink java\n") def test_main(): diff --git a/Lib/test/test_httpservers.py b/Lib/test/test_httpservers.py --- a/Lib/test/test_httpservers.py +++ b/Lib/test/test_httpservers.py @@ -378,6 +378,9 @@ @unittest.skipIf(hasattr(os, 'geteuid') and os.geteuid() == 0, "This test can't be run reliably as root (issue #13308).") +@unittest.skipIf((not hasattr(os, 'symlink')) and + sys.executable.encode('ascii', 'replace') != sys.executable, + "Executable path is not pure ASCII.") # these fail for CPython too
class CGIHTTPServerTestCase(BaseTestCase): class request_handler(NoLogRequestHandler, CGIHTTPRequestHandler): pass diff --git a/Lib/test/test_io.py b/Lib/test/test_io.py --- a/Lib/test/test_io.py +++ b/Lib/test/test_io.py @@ -2438,6 +2438,7 @@ self.assertEqual(f.errors, "replace") @unittest.skipUnless(threading, 'Threading required for this test.') + @unittest.skipIf(support.is_jython, "Not thread-safe: Jython issue 2588.") def test_threads_write(self): # Issue6750: concurrent writes could duplicate data event = threading.Event() diff --git a/Lib/test/test_java_integration.py b/Lib/test/test_java_integration.py --- a/Lib/test/test_java_integration.py +++ b/Lib/test/test_java_integration.py @@ -485,8 +485,11 @@ # script must lie within python.home for this test to work return policy = test_support.findfile("python_home.policy") - self.assertEquals(subprocess.call([sys.executable, "-J-Dpython.cachedir.skip=true", - "-J-Djava.security.manager", "-J-Djava.security.policy=%s" % policy, script]), + self.assertEquals( + subprocess.call([sys.executable, + "-J-Dpython.cachedir.skip=true", + "-J-Djava.security.manager", + "-J-Djava.security.policy=%s" % policy, script]), 0) def test_import_signal_fails_with_import_error_using_security(self): @@ -693,7 +696,9 @@ def test_proxy_serialization(self): # Proxies can be deserializable in a fresh JVM, including being able # to "findPython" to get a PySystemState. - tempdir = tempfile.mkdtemp() + # tempdir gets combined with unicode paths derived from class names, + # so make it a unicode object. 
+ tempdir = tempfile.mkdtemp().decode(sys.getfilesystemencoding()) old_proxy_debug_dir = org.python.core.Options.proxyDebugDirectory try: # Generate a proxy for Cat class; @@ -738,7 +743,9 @@ @unittest.skipUnless(find_executable('jar'), 'Need the jar command to run') def test_custom_proxymaker(self): # Verify custom proxymaker supports direct usage of Python code in Java - tempdir = tempfile.mkdtemp() + # tempdir gets combined with unicode paths derived from class names, + # so make it a unicode object. + tempdir = tempfile.mkdtemp().decode(sys.getfilesystemencoding()) try: SerializableProxies.serialized_path = tempdir import bark diff --git a/Lib/test/test_java_visibility.py b/Lib/test/test_java_visibility.py --- a/Lib/test/test_java_visibility.py +++ b/Lib/test/test_java_visibility.py @@ -13,6 +13,7 @@ from org.python.tests.multihidden import BaseConnection class VisibilityTest(unittest.TestCase): + def test_invisible(self): for item in dir(Invisible): self.assert_(not item.startswith("package")) @@ -178,6 +179,7 @@ class JavaClassTest(unittest.TestCase): + def test_class_methods_visible(self): self.assertFalse(HashMap.isInterface(), 'java.lang.Class methods should be visible on Class instances') @@ -198,6 +200,7 @@ self.assertEquals(3, s.b, "Defined fields should take precedence") class CoercionTest(unittest.TestCase): + def test_int_coercion(self): c = Coercions() self.assertEquals("5", c.takeInt(5)) @@ -234,6 +237,7 @@ self.assertEquals(c.tellClassNameObject(ht), "class java.util.Hashtable") class RespectJavaAccessibilityTest(unittest.TestCase): + def run_accessibility_script(self, script, error=AttributeError): fn = test_support.findfile(script) self.assertRaises(error, execfile, fn) @@ -254,6 +258,7 @@ self.run_accessibility_script("call_overridden_method.py") class ClassloaderTest(unittest.TestCase): + def test_loading_classes_without_import(self): cl = test_support.make_jar_classloader("../callbacker_test.jar") X = 
cl.loadClass("org.python.tests.Callbacker") @@ -265,11 +270,13 @@ self.assertEquals(None, called[0]) def test_main(): - test_support.run_unittest(VisibilityTest, + test_support.run_unittest( + VisibilityTest, JavaClassTest, CoercionTest, RespectJavaAccessibilityTest, - ClassloaderTest) + ClassloaderTest + ) if __name__ == "__main__": test_main() diff --git a/Lib/test/test_jser.py b/Lib/test/test_jser.py --- a/Lib/test/test_jser.py +++ b/Lib/test/test_jser.py @@ -15,7 +15,9 @@ class JavaSerializationTests(unittest.TestCase): def setUp(self): - self.sername = os.path.join(sys.prefix, "test.ser") + name = os.path.join(sys.prefix, "test.ser") + # As we are using java.io directly, ensure file name is a unicode + self.sername = name.decode(sys.getfilesystemencoding()) def tearDown(self): os.remove(self.sername) diff --git a/Lib/test/test_jython_launcher.py b/Lib/test/test_jython_launcher.py --- a/Lib/test/test_jython_launcher.py +++ b/Lib/test/test_jython_launcher.py @@ -31,7 +31,6 @@ # by the installer return executable - def get_uname(): _uname = None try: @@ -49,9 +48,8 @@ class TestLauncher(unittest.TestCase): - + def get_cmdline(self, cmd, env): - output = subprocess.check_output(cmd, env=env).rstrip() if is_windows: return subprocess._cmdline2list(output) @@ -76,7 +74,7 @@ k, v = arg[2:].split("=") props[k] = v return props - + def test_classpath_env(self): env = self.get_newenv() env["CLASSPATH"] = some_jar @@ -207,7 +205,7 @@ def test_file(self): self.assertCommand(['test.py']) - + def test_dash(self): self.assertCommand(['-i']) diff --git a/Lib/test/test_os_jy.py b/Lib/test/test_os_jy.py --- a/Lib/test/test_os_jy.py +++ b/Lib/test/test_os_jy.py @@ -198,14 +198,41 @@ def test_env(self): with test_support.temp_cwd(name=u"tempcwd-??"): + # os.environ is constructed with FS-encoded values (as in CPython), + # but it will accept unicode additions. newenv = os.environ.copy() - newenv["TEST_HOME"] = u"??" 
- p = subprocess.Popen([sys.executable, "-c", - 'import sys,os;' \ - 'sys.stdout.write(os.getenv("TEST_HOME").encode("utf-8"))'], - stdout=subprocess.PIPE, - env=newenv) - self.assertEqual(p.stdout.read().decode("utf-8"), u"??") + newenv["TEST_HOME"] = expected = u"??" + # Environment passed as UTF-16 String[] by Java, arrives FS-encoded. + for encoding in ('utf-8', 'gbk'): + # Emit the value of TEST_HOME explicitly encoded. + p = subprocess.Popen( + [sys.executable, "-c", + 'import sys, os;' \ + 'sys.stdout.write(os.getenv("TEST_HOME")' \ + '.decode(sys.getfilesystemencoding())' \ + '.encode("%s"))' \ + % encoding], + stdout=subprocess.PIPE, + env=newenv) + # Decode with chosen encoding + self.assertEqual(p.stdout.read().decode(encoding), u"??") + + def test_env_naively(self): + with test_support.temp_cwd(name=u"tempcwd-??"): + # os.environ is constructed with FS-encoded values (as in CPython), + # but it will accept unicode additions. + newenv = os.environ.copy() + newenv["TEST_HOME"] = expected = u"??" + # Environment passed as UTF-16 String[] by Java, arrives FS-encoded. + # However, emit TEST_HOME without thinking about the encoding. + p = subprocess.Popen( + [sys.executable, "-c", + 'import sys, os;' \ + 'sys.stdout.write(os.getenv("TEST_HOME"))'], + stdout=subprocess.PIPE, + env=newenv) + # Decode with default encoding utf-8 (because ... ?) + self.assertEqual(p.stdout.read().decode('utf-8'), expected) def test_getcwd(self): with test_support.temp_cwd(name=u"tempcwd-??") as temp_cwd: @@ -216,38 +243,46 @@ self.assertEqual(p.stdout.read().decode("utf-8"), temp_cwd) def test_listdir(self): - # It is hard to avoid Unicode paths on systems like OS X. Use - # relative paths from a temp CWD to work around this + # It is hard to avoid Unicode paths on systems like OS X. Use relative + # paths from a temp CWD to work around this. But when you don't, + # it behaves like this ... 
with test_support.temp_cwd() as new_cwd: - unicode_path = os.path.join(".", "unicode") - self.assertIs(type(unicode_path), str) - chinese_path = os.path.join(unicode_path, u"??") + + basedir = os.path.join(".", "unicode") + self.assertIs(type(basedir), bytes) + chinese_path = os.path.join(basedir, u"??") self.assertIs(type(chinese_path), unicode) home_path = os.path.join(chinese_path, u"??") os.makedirs(home_path) + FS = sys.getfilesystemencoding() + with open(os.path.join(home_path, "test.txt"), "w") as test_file: test_file.write("42\n") - # Verify works with str paths, returning Unicode as necessary - entries = os.listdir(unicode_path) - self.assertIn(u"??", entries) + # listdir(bytes) includes encoded form of ?? + entries = os.listdir(basedir) + self.assertIn(u"??".encode(FS), entries) + for entry in entries: + self.assertIs(type(entry), bytes) - # Verify works with Unicode paths + # listdir(unicode) includes unicode form of ?? entries = os.listdir(chinese_path) self.assertIn(u"??", entries) + for entry in entries: + self.assertIs(type(entry), unicode) # glob.glob builds on os.listdir; note that we don't use - # Unicode paths in the arg to glob + # Unicode paths in the arg to glob so the result is bytes self.assertEqual( glob.glob(os.path.join("unicode", "*")), - [os.path.join(u"unicode", u"??")]) + [os.path.join(u"unicode", u"??").encode(FS)]) self.assertEqual( glob.glob(os.path.join("unicode", "*", "*")), - [os.path.join(u"unicode", u"??", u"??")]) + [os.path.join(u"unicode", u"??", u"??").encode(FS)]) self.assertEqual( glob.glob(os.path.join("unicode", "*", "*", "*")), - [os.path.join(u"unicode", u"??", u"??", "test.txt")]) + [os.path.join(u"unicode", u"??", u"??", "test.txt").encode(FS)]) # Now use a Unicode path as well as in the glob arg self.assertEqual( @@ -263,11 +298,15 @@ # Verify Java integration. 
But we will need to construct # an absolute path since chdir doesn't work with Java # (except for subprocesses, like below in test_env) - for entry in entries: + for entry in entries: # list(unicode) + # new_cwd is bytes while chinese_path is unicode. + # But new_cwd is not guaranteed to be just ascii, so decode it. + new_cwd = new_cwd.decode(FS) entry_path = os.path.join(new_cwd, chinese_path, entry) f = File(entry_path) - self.assertTrue(f.exists(), "File %r (%r) should be testable for existence" % ( - f, entry_path)) + self.assertTrue(f.exists(), + "File %r (%r) should be testable for existence" % + (f, entry_path)) class LocaleTestCase(unittest.TestCase): diff --git a/Lib/test/test_runpy.py b/Lib/test/test_runpy.py deleted file mode 100644 --- a/Lib/test/test_runpy.py +++ /dev/null @@ -1,402 +0,0 @@ -# Test the runpy module -import unittest -import os -import os.path -import sys -import re -import tempfile -from test.test_support import verbose, run_unittest, forget -from test.script_helper import (temp_dir, make_script, compile_script, - make_pkg, make_zip_script, make_zip_pkg) - - -from runpy import _run_code, _run_module_code, run_module, run_path -# Note: This module can't safely test _run_module_as_main as it -# runs its tests in the current process, which would mess with the -# real __main__ module (usually test.regrtest) -# See test_cmd_line_script for a test that executes that code path - -# Set up the test code and expected results - -class RunModuleCodeTest(unittest.TestCase): - """Unit tests for runpy._run_code and runpy._run_module_code""" - - expected_result = ["Top level assignment", "Lower level reference"] - test_source = ( - "# Check basic code execution\n" - "result = ['Top level assignment']\n" - "def f():\n" - " result.append('Lower level reference')\n" - "f()\n" - "# Check the sys module\n" - "import sys\n" - "run_argv0 = sys.argv[0]\n" - "run_name_in_sys_modules = __name__ in sys.modules\n" - "if run_name_in_sys_modules:\n" - " 
module_in_sys_modules = globals() is sys.modules[__name__].__dict__\n" - "# Check nested operation\n" - "import runpy\n" - "nested = runpy._run_module_code('x=1\\n', mod_name='')\n" - ) - - def test_run_code(self): - saved_argv0 = sys.argv[0] - d = _run_code(self.test_source, {}) - self.assertEqual(d["result"], self.expected_result) - self.assertIs(d["__name__"], None) - self.assertIs(d["__file__"], None) - self.assertIs(d["__loader__"], None) - self.assertIs(d["__package__"], None) - self.assertIs(d["run_argv0"], saved_argv0) - self.assertNotIn("run_name", d) - self.assertIs(sys.argv[0], saved_argv0) - - def test_run_module_code(self): - initial = object() - name = "" - file = "Some other nonsense" - loader = "Now you're just being silly" - package = '' # Treat as a top level module - d1 = dict(initial=initial) - saved_argv0 = sys.argv[0] - d2 = _run_module_code(self.test_source, - d1, - name, - file, - loader, - package) - self.assertNotIn("result", d1) - self.assertIs(d2["initial"], initial) - self.assertEqual(d2["result"], self.expected_result) - self.assertEqual(d2["nested"]["x"], 1) - self.assertIs(d2["__name__"], name) - self.assertTrue(d2["run_name_in_sys_modules"]) - self.assertTrue(d2["module_in_sys_modules"]) - self.assertIs(d2["__file__"], file) - self.assertIs(d2["run_argv0"], file) - self.assertIs(d2["__loader__"], loader) - self.assertIs(d2["__package__"], package) - self.assertIs(sys.argv[0], saved_argv0) - self.assertNotIn(name, sys.modules) - - -class RunModuleTest(unittest.TestCase): - """Unit tests for runpy.run_module""" - - def expect_import_error(self, mod_name): - try: - run_module(mod_name) - except ImportError: - pass - else: - self.fail("Expected import error for " + mod_name) - - def test_invalid_names(self): - # Builtin module - self.expect_import_error("sys") - # Non-existent modules - self.expect_import_error("sys.imp.eric") - self.expect_import_error("os.path.half") - self.expect_import_error("a.bee") - 
self.expect_import_error(".howard") - self.expect_import_error("..eaten") - # Package without __main__.py - self.expect_import_error("multiprocessing") - - def test_library_module(self): - run_module("runpy") - - def _add_pkg_dir(self, pkg_dir): - os.mkdir(pkg_dir) - pkg_fname = os.path.join(pkg_dir, "__init__"+os.extsep+"py") - pkg_file = open(pkg_fname, "w") - pkg_file.close() - return pkg_fname - - def _make_pkg(self, source, depth, mod_base="runpy_test"): - pkg_name = "__runpy_pkg__" - test_fname = mod_base+os.extsep+"py" - pkg_dir = sub_dir = tempfile.mkdtemp() - if verbose: print " Package tree in:", sub_dir - sys.path.insert(0, pkg_dir) - if verbose: print " Updated sys.path:", sys.path[0] - for i in range(depth): - sub_dir = os.path.join(sub_dir, pkg_name) - pkg_fname = self._add_pkg_dir(sub_dir) - if verbose: print " Next level in:", sub_dir - if verbose: print " Created:", pkg_fname - mod_fname = os.path.join(sub_dir, test_fname) - mod_file = open(mod_fname, "w") - mod_file.write(source) - mod_file.close() - if verbose: print " Created:", mod_fname - mod_name = (pkg_name+".")*depth + mod_base - return pkg_dir, mod_fname, mod_name - - def _del_pkg(self, top, depth, mod_name): - for entry in list(sys.modules): - if entry.startswith("__runpy_pkg__"): - del sys.modules[entry] - if verbose: print " Removed sys.modules entries" - del sys.path[0] - if verbose: print " Removed sys.path entry" - for root, dirs, files in os.walk(top, topdown=False): - for name in files: - try: - os.remove(os.path.join(root, name)) - except OSError, ex: - if verbose: print ex # Persist with cleaning up - for name in dirs: - fullname = os.path.join(root, name) - try: - os.rmdir(fullname) - except OSError, ex: - if verbose: print ex # Persist with cleaning up - try: - os.rmdir(top) - if verbose: print " Removed package tree" - except OSError, ex: - if verbose: print ex # Persist with cleaning up - - def _check_module(self, depth): - pkg_dir, mod_fname, mod_name = ( - 
self._make_pkg("x=1\n", depth)) - forget(mod_name) - try: - if verbose: print "Running from source:", mod_name - d1 = run_module(mod_name) # Read from source - self.assertIn("x", d1) - self.assertTrue(d1["x"] == 1) - del d1 # Ensure __loader__ entry doesn't keep file open - __import__(mod_name) - os.remove(mod_fname) - if verbose: print "Running from compiled:", mod_name - d2 = run_module(mod_name) # Read from bytecode - self.assertIn("x", d2) - self.assertTrue(d2["x"] == 1) - del d2 # Ensure __loader__ entry doesn't keep file open - finally: - self._del_pkg(pkg_dir, depth, mod_name) - if verbose: print "Module executed successfully" - - def _check_package(self, depth): - pkg_dir, mod_fname, mod_name = ( - self._make_pkg("x=1\n", depth, "__main__")) - pkg_name, _, _ = mod_name.rpartition(".") - forget(mod_name) - try: - if verbose: print "Running from source:", pkg_name - d1 = run_module(pkg_name) # Read from source - self.assertIn("x", d1) - self.assertTrue(d1["x"] == 1) - del d1 # Ensure __loader__ entry doesn't keep file open - __import__(mod_name) - os.remove(mod_fname) - if verbose: print "Running from compiled:", pkg_name - d2 = run_module(pkg_name) # Read from bytecode - self.assertIn("x", d2) - self.assertTrue(d2["x"] == 1) - del d2 # Ensure __loader__ entry doesn't keep file open - finally: - self._del_pkg(pkg_dir, depth, pkg_name) - if verbose: print "Package executed successfully" - - def _add_relative_modules(self, base_dir, source, depth): - if depth <= 1: - raise ValueError("Relative module test needs depth > 1") - pkg_name = "__runpy_pkg__" - module_dir = base_dir - for i in range(depth): - parent_dir = module_dir - module_dir = os.path.join(module_dir, pkg_name) - # Add sibling module - sibling_fname = os.path.join(module_dir, "sibling"+os.extsep+"py") - sibling_file = open(sibling_fname, "w") - sibling_file.close() - if verbose: print " Added sibling module:", sibling_fname - # Add nephew module - uncle_dir = os.path.join(parent_dir, "uncle") - 
self._add_pkg_dir(uncle_dir) - if verbose: print " Added uncle package:", uncle_dir - cousin_dir = os.path.join(uncle_dir, "cousin") - self._add_pkg_dir(cousin_dir) - if verbose: print " Added cousin package:", cousin_dir - nephew_fname = os.path.join(cousin_dir, "nephew"+os.extsep+"py") - nephew_file = open(nephew_fname, "w") - nephew_file.close() - if verbose: print " Added nephew module:", nephew_fname - - def _check_relative_imports(self, depth, run_name=None): - contents = r"""\ -from __future__ import absolute_import -from . import sibling -from ..uncle.cousin import nephew -""" - pkg_dir, mod_fname, mod_name = ( - self._make_pkg(contents, depth)) - try: - self._add_relative_modules(pkg_dir, contents, depth) - pkg_name = mod_name.rpartition('.')[0] - if verbose: print "Running from source:", mod_name - d1 = run_module(mod_name, run_name=run_name) # Read from source - self.assertIn("__package__", d1) - self.assertTrue(d1["__package__"] == pkg_name) - self.assertIn("sibling", d1) - self.assertIn("nephew", d1) - del d1 # Ensure __loader__ entry doesn't keep file open - __import__(mod_name) - os.remove(mod_fname) - if verbose: print "Running from compiled:", mod_name - d2 = run_module(mod_name, run_name=run_name) # Read from bytecode - self.assertIn("__package__", d2) - self.assertTrue(d2["__package__"] == pkg_name) - self.assertIn("sibling", d2) - self.assertIn("nephew", d2) - del d2 # Ensure __loader__ entry doesn't keep file open - finally: - self._del_pkg(pkg_dir, depth, mod_name) - if verbose: print "Module executed successfully" - - def test_run_module(self): - for depth in range(4): - if verbose: print "Testing package depth:", depth - self._check_module(depth) - - def test_run_package(self): - for depth in range(1, 4): - if verbose: print "Testing package depth:", depth - self._check_package(depth) - - def test_explicit_relative_import(self): - for depth in range(2, 5): - if verbose: print "Testing relative imports at depth:", depth - 
self._check_relative_imports(depth) - - def test_main_relative_import(self): - for depth in range(2, 5): - if verbose: print "Testing main relative imports at depth:", depth - self._check_relative_imports(depth, "__main__") - - -class RunPathTest(unittest.TestCase): - """Unit tests for runpy.run_path""" - # Based on corresponding tests in test_cmd_line_script - - test_source = """\ -# Script may be run with optimisation enabled, so don't rely on assert -# statements being executed -def assertEqual(lhs, rhs): - if lhs != rhs: - raise AssertionError('%r != %r' % (lhs, rhs)) -def assertIs(lhs, rhs): - if lhs is not rhs: - raise AssertionError('%r is not %r' % (lhs, rhs)) -# Check basic code execution -result = ['Top level assignment'] -def f(): - result.append('Lower level reference') -f() -assertEqual(result, ['Top level assignment', 'Lower level reference']) -# Check the sys module -import sys -assertIs(globals(), sys.modules[__name__].__dict__) -argv0 = sys.argv[0] -""" - - def _make_test_script(self, script_dir, script_basename, source=None): - if source is None: - source = self.test_source - return make_script(script_dir, script_basename, source) - - def _check_script(self, script_name, expected_name, expected_file, - expected_argv0, expected_package): - result = run_path(script_name) - self.assertEqual(result["__name__"], expected_name) - self.assertEqual(result["__file__"], expected_file) - self.assertIn("argv0", result) - self.assertEqual(result["argv0"], expected_argv0) - self.assertEqual(result["__package__"], expected_package) - - def _check_import_error(self, script_name, msg): - msg = re.escape(msg) - self.assertRaisesRegexp(ImportError, msg, run_path, script_name) - - def test_basic_script(self): - with temp_dir() as script_dir: - mod_name = 'script' - script_name = self._make_test_script(script_dir, mod_name) - self._check_script(script_name, "", script_name, - script_name, None) - - def test_script_compiled(self): - with temp_dir() as script_dir: - 
mod_name = 'script' - script_name = self._make_test_script(script_dir, mod_name) - compiled_name = compile_script(script_name) - os.remove(script_name) - self._check_script(compiled_name, "", compiled_name, - compiled_name, None) - - def test_directory(self): - with temp_dir() as script_dir: - mod_name = '__main__' - script_name = self._make_test_script(script_dir, mod_name) - self._check_script(script_dir, "", script_name, - script_dir, '') - - def test_directory_compiled(self): - with temp_dir() as script_dir: - mod_name = '__main__' - script_name = self._make_test_script(script_dir, mod_name) - compiled_name = compile_script(script_name) - os.remove(script_name) - self._check_script(script_dir, "", compiled_name, - script_dir, '') - - def test_directory_error(self): - with temp_dir() as script_dir: - mod_name = 'not_main' - script_name = self._make_test_script(script_dir, mod_name) - msg = "can't find '__main__' module in %r" % script_dir - self._check_import_error(script_dir, msg) - - def test_zipfile(self): - with temp_dir() as script_dir: - mod_name = '__main__' - script_name = self._make_test_script(script_dir, mod_name) - zip_name, fname = make_zip_script(script_dir, 'test_zip', script_name) - self._check_script(zip_name, "", fname, zip_name, '') - - def test_zipfile_compiled(self): - with temp_dir() as script_dir: - mod_name = '__main__' - script_name = self._make_test_script(script_dir, mod_name) - compiled_name = compile_script(script_name) - zip_name, fname = make_zip_script(script_dir, 'test_zip', compiled_name) - self._check_script(zip_name, "", fname, zip_name, '') - - def test_zipfile_error(self): - with temp_dir() as script_dir: - mod_name = 'not_main' - script_name = self._make_test_script(script_dir, mod_name) - zip_name, fname = make_zip_script(script_dir, 'test_zip', script_name) - msg = "can't find '__main__' module in '%s'" % zip_name - self._check_import_error(zip_name, msg) - - def test_main_recursion_error(self): - with temp_dir() as 
script_dir, temp_dir() as dummy_dir: - mod_name = '__main__' - source = ("import runpy\n" - "runpy.run_path(%r)\n") % dummy_dir - script_name = self._make_test_script(script_dir, mod_name, source) - zip_name, fname = make_zip_script(script_dir, 'test_zip', script_name) - msg = "recursion depth exceeded" - self.assertRaisesRegexp(RuntimeError, msg, run_path, zip_name) - - - -def test_main(): - run_unittest(RunModuleCodeTest, RunModuleTest, RunPathTest) - -if __name__ == "__main__": - test_main() diff --git a/Lib/test/test_ssl.py b/Lib/test/test_ssl.py --- a/Lib/test/test_ssl.py +++ b/Lib/test/test_ssl.py @@ -27,7 +27,13 @@ HOST = support.HOST def data_file(*name): - return os.path.join(os.path.dirname(__file__), *name) + file = os.path.join(os.path.dirname(__file__), *name) + # Ensure we return unicode path. This tweak is not a divergence: + # CPython 2.7.13 fails the same way for a non-ascii location. + if isinstance(file, unicode): + return file + else: + return file.decode(sys.getfilesystemencoding()) # The custom key and certificate files used in test_ssl are generated # using Lib/test/make_ssl_certs.py. 
diff --git a/Lib/test/test_support.py b/Lib/test/test_support.py --- a/Lib/test/test_support.py +++ b/Lib/test/test_support.py @@ -490,8 +490,13 @@ def make_jar_classloader(jar): import os from java.net import URL, URLClassLoader + from java.io import File - url = URL('jar:file:%s!/' % jar) + if isinstance(jar, bytes): # Java will expect a unicode file name + jar = jar.decode(sys.getfilesystemencoding()) + jar_url = File(jar).toURI().toURL().toString() + url = URL(u'jar:%s!/' % jar_url) + if is_jython_nt: # URLJarFiles keep a cached open file handle to the jar even # after this ClassLoader is GC'ed, disallowing Windows tests @@ -509,7 +514,7 @@ if is_jython: # Jython disallows @ in module names TESTFN = '$test' - TESTFN_UNICODE = "$test-\xe0\xf2" + TESTFN_UNICODE = u"$test-\u87d2\u86c7" # = test python (Chinese) TESTFN_ENCODING = sys.getfilesystemencoding() elif os.name == 'riscos': TESTFN = 'testfile' diff --git a/Lib/test/test_sys.py b/Lib/test/test_sys.py --- a/Lib/test/test_sys.py +++ b/Lib/test/test_sys.py @@ -253,8 +253,6 @@ self.assert_(vi[3] in ("alpha", "beta", "candidate", "final")) self.assert_(isinstance(vi[4], int)) - @unittest.skipIf(test.test_support.is_jython_nt, - "FIXME: fails probably due to issue 2312") def test_ioencoding(self): # from v2.7 test import subprocess,os env = dict(os.environ) diff --git a/Lib/test/test_zipimport_jy.py b/Lib/test/test_zipimport_jy.py --- a/Lib/test/test_zipimport_jy.py +++ b/Lib/test/test_zipimport_jy.py @@ -51,8 +51,10 @@ A(path).somevar = 1 def test_main(): - test_support.run_unittest(SyspathZipimportTest) - test_support.run_unittest(ZipImporterDictTest) + test_support.run_unittest( + SyspathZipimportTest, + ZipImporterDictTest + ) if __name__ == "__main__": test_main() diff --git a/Lib/test/test_zipimport_support.py b/Lib/test/test_zipimport_support.py --- a/Lib/test/test_zipimport_support.py +++ b/Lib/test/test_zipimport_support.py @@ -240,6 +240,14 @@ print data self.assertIn(expected, data) + def 
assertNormalisedIn(self, target, data): + # bdb/pdb applies normcase to its filename before displaying. + # Also, it emerges as FS-encoded bytes, so do the same to the target. + target = os.path.normcase(target) + if not isinstance(target, bytes): + target = target.encode(sys.getfilesystemencoding()) + self.assertIn(target, data) + def test_pdb_issue4201(self): test_src = textwrap.dedent("""\ def f(): @@ -248,22 +256,22 @@ import pdb pdb.runcall(f) """) + with temp_dir() as d: script_name = make_script(d, 'script', test_src) p = spawn_python(script_name) p.stdin.write('l\n') data = kill_python(p) - # bdb/pdb applies normcase to its filename before displaying - # See CPython Issue 14255 (back-ported for Jython) - self.assertIn(os.path.normcase(script_name.encode('utf-8')), data) + # Back-port from CPython 3 (see CPython Issue 14255). + self.assertNormalisedIn(script_name, data) + zip_name, run_name = make_zip_script(d, "test_zip", script_name, '__main__.py') p = spawn_python(zip_name) p.stdin.write('l\n') data = kill_python(p) - # bdb/pdb applies normcase to its filename before displaying - # See CPython Issue 14255 (back-ported for Jython) - self.assertIn(os.path.normcase(run_name.encode('utf-8')), data) + # Back-port from CPython 3 (see CPython Issue 14255). + self.assertNormalisedIn(run_name, data) def test_main(): diff --git a/build.xml b/build.xml --- a/build.xml +++ b/build.xml @@ -236,6 +236,7 @@ output.dir = '${output.dir}' compile.dir = '${compile.dir}' exposed.dir = '${exposed.dir}' + gensrc.dir = '${gensrc.dir}' dist.dir = '${dist.dir}' apidoc.dir = '${apidoc.dir}' templates.dir = '${templates.dir}' @@ -434,6 +435,7 @@ + @@ -694,6 +696,7 @@ String), decoded if necessary + * from a Python bytes object, using the file system encoding. In Jython, this + * encoding is UTF-8, irrespective of the OS platform. This method is comparable with Python 3 + * os.fsdecode, but for Java use, in places such as the os module. 
If + * the argument is not a PyUnicode, it will be decoded using the nominal Jython + * file system encoding. If the argument is a PyUnicode, its + * String is returned. + * + * @param filename as bytes to decode, or already as unicode + * @return unicode version of path + */ + public static String fileSystemDecode(PyString filename) { + String s = filename.getString(); + if (filename instanceof PyUnicode || CharMatcher.ascii().matchesAllOf(s)) { + // Already encoded or usable as ASCII + return s; + } else { + // It's bytes, so must decode properly + assert "utf-8".equals(PySystemState.FILE_SYSTEM_ENCODING.toString()); + return codecs.PyUnicode_DecodeUTF8(s, null); + } + } + + /** + * As {@link #fileSystemDecode(PyString)} but raising ValueError if not a + * str or unicode. + * + * @param filename as bytes to decode, or already as unicode + * @return unicode version of the file name + */ + public static String fileSystemDecode(PyObject filename) { + if (filename instanceof PyString) { + return fileSystemDecode((PyString)filename); + } else { + throw Py.TypeError(String.format("coercing to Unicode: need string, %s type found", + filename.getType().fastGetName())); + } + } + + /** + * Return a PyString object we can use as a file name or file path in places where Python + * expects a bytes (that is a str) object in the file system encoding. + * In Jython, this encoding is UTF-8, irrespective of the OS platform. + *

+ * This is subtly different from CPython's use of "file system encoding", which tracks the + * platform's choice so that OS services may be called that have a bytes interface. Jython's + * interaction with the OS occurs via Java using String arguments representing Unicode values, + * so we have no need to match the encoding actually chosen by the platform (e.g. 'mbcs' on + * Windows). Rather we need a nominal Jython file system encoding, for use where the standard + * library forces byte paths on us (in Python 2). There is no reason for this choice to vary + * with OS platform. Methods receiving paths as bytes will + * {@link #fileSystemDecode(PyString)} them again for Java. + * + * @param filename as unicode to encode, or already as bytes + * @return encoded bytes version of path + */ + public static PyString fileSystemEncode(String filename) { + if (CharMatcher.ascii().matchesAllOf(filename)) { + // Just wrap it as US-ASCII is a subset of the file system encoding + return Py.newString(filename); + } else { + // It's non just US-ASCII, so must encode properly + assert "utf-8".equals(PySystemState.FILE_SYSTEM_ENCODING.toString()); + return Py.newString(codecs.PyUnicode_EncodeUTF8(filename, null)); + } + } + + /** + * Return a PyString object we can use as a file name or file path in places where Python + * expects a bytes (that is, str) object in the file system encoding. + * In Jython, this encoding is UTF-8, irrespective of the OS platform. This method is comparable + * with Python 3 os.fsencode. If the argument is a PyString, it is returned + * unchanged. If the argument is a PyUnicode, it is converted to a bytes using the + * nominal Jython file system encoding. + * + * @param filename as unicode to encode, or already as bytes + * @return encoded bytes version of path + */ + public static PyString fileSystemEncode(PyString filename) { + return (filename instanceof PyUnicode) ? 
fileSystemEncode(filename.getString()) : filename; + } + + /** + * Convert a PyList path to a list of Java String objects decoded from + * the path elements to strings guaranteed usable in the Java API. + * + * @param path a Python search path + * @return equivalent Java list + */ + private static List fileSystemDecode(PyList path) { + List list = new ArrayList<>(path.__len__()); + for (PyObject filename : path.getList()) { + list.add(fileSystemDecode(filename)); + } + return list; + } + public static PyStringMap newStringMap() { // enable lazy bootstrapping (see issue #1671) if (!PyType.hasBuilder(PyStringMap.class)) { @@ -1073,11 +1175,11 @@ } Py.getSystemState().callExitFunc(); } - //XXX: this needs review to make sure we are cutting out all of the Java - // exceptions. + + //XXX: this needs review to make sure we are cutting out all of the Java exceptions. private static String getStackTrace(Throwable javaError) { - ByteArrayOutputStream buf = new ByteArrayOutputStream(); - javaError.printStackTrace(new PrintStream(buf)); + CharArrayWriter buf = new CharArrayWriter(); + javaError.printStackTrace(new PrintWriter(buf)); String str = buf.toString(); int index = -1; @@ -1170,31 +1272,107 @@ ts.exception = null; } - public static void displayException(PyObject type, PyObject value, PyObject tb, - PyObject file) { + /** + * Print the description of an exception as a big string. The arguments are closely equivalent + * to the tuple returned by Python sys.exc_info, on standard error or a given + * byte-oriented file. Compare with Python traceback.print_exception. 
+ * + * @param type of exception + * @param value the exception parameter (second argument to raise) + * @param tb traceback of the call stack where the exception originally occurred + * @param file to print encoded string to, or null meaning standard error + */ + public static void displayException(PyObject type, PyObject value, PyObject tb, PyObject file) { + + // Output is to standard error, unless a file object has been given. StdoutWrapper stderr = Py.stderr; + + // As we format the exception in Unicode, we deal with encoding in this method + String encoding, errors = codecs.REPLACE; + if (file != null) { + // Ostensibly writing to a file: assume file content encoding (file.encoding) stderr = new FixedFileWrapper(file); + encoding = codecs.getDefaultEncoding(); + } else { + // Not a file, assume we should encode for the console + encoding = getAttr(Py.getSystemState().__stderr__, "encoding", null); } + + // But if the stream can tell us directly, of course we use that answer. + encoding = getAttr(stderr.myFile(), "encoding", encoding); + errors = getAttr(stderr.myFile(), "errors", errors); + flushLine(); + // The creation of the report operates entirely in Java String (to support Unicode). + try { + // Be prepared for formatting or printing to fail + PyString bytes = exceptionToBytes(type, value, tb, encoding, errors); + stderr.print(bytes); + } catch (Exception ex) { + // Looks like that exception just won't convert or print + value = Py.newString(""); + PyString bytes = exceptionToBytes(type, value, tb, encoding, errors); + stderr.print(bytes); + } + } + + /** Get a String attribute from an object or a return a default. 
*/ + private static String getAttr(PyObject target, String internedName, String def) { + PyObject attr = target.__findattr__(internedName); + if (attr == null) { + return def; + } else if (attr instanceof PyUnicode) { + return ((PyUnicode)attr).getString(); + } else { + return attr.__str__().getString(); + } + } + + /** + * Helper for {@link #displayException(PyObject, PyObject, PyObject, PyObject)}, falling back to + * US-ASCII as the last resort encoding. + */ + private static PyString exceptionToBytes(PyObject type, PyObject value, PyObject tb, + String encoding, String errors) { + String string = exceptionToString(type, value, tb); + String bytes; // not UTF-16 + try { + // Format the exception and stack-trace in all its glory + bytes = codecs.encode(Py.newUnicode(string), encoding, errors); + } catch (Exception ex) { + // Sometimes a working codec is just too much to ask + bytes = codecs.PyUnicode_EncodeASCII(string, string.length(), codecs.REPLACE); + } + return Py.newString(bytes); + } + + /** + * Format the description of an exception as a big string. The arguments are closely equivalent + * to the tuple returned by Python sys.exc_info. Compare with Python + * traceback.format_exception. 
+ * + * @param type of exception + * @param value the exception parameter (second argument to raise) + * @param tb traceback of the call stack where the exception originally occurred + * @return string representation of the traceback and exception + */ + static String exceptionToString(PyObject type, PyObject value, PyObject tb) { + + // Compose the stack dump, syntax error, and actual exception in this buffer: + StringBuilder buf; + if (tb instanceof PyTraceback) { - stderr.print(((PyTraceback) tb).dumpStack()); + buf = new StringBuilder(((PyTraceback)tb).dumpStack()); + } else { + buf = new StringBuilder(); } + if (__builtin__.isinstance(value, Py.SyntaxError)) { - PyObject filename = value.__findattr__("filename"); - PyObject text = value.__findattr__("text"); - PyObject lineno = value.__findattr__("lineno"); - stderr.print(" File \""); - stderr.print(filename == Py.None || filename == null ? - "" : filename.toString()); - stderr.print("\", line "); - stderr.print(lineno == null ? Py.newString("0") : lineno); - stderr.print("\n"); - if (text != Py.None && text != null && text.__len__() != 0) { - printSyntaxErrorText(stderr, value.__findattr__("offset").asInt(), - text.toString()); - } + // The value part of the exception is a syntax error: first emit that. + appendSyntaxError(buf, value); + // Now supersede it with just the syntax error message for the next phase. 
value = value.__findattr__("msg"); if (value == null) { value = Py.None; @@ -1203,26 +1381,46 @@ if (value.getJavaProxy() != null) { Object javaError = value.__tojava__(Throwable.class); - if (javaError != null && javaError != Py.NoConversion) { - stderr.println(getStackTrace((Throwable) javaError)); + // The value is some Java Throwable: append that too + buf.append(getStackTrace((Throwable)javaError)); } } - try { - stderr.println(formatException(type, value)); - } catch (Exception ex) { - stderr.println(formatException(type, Py.None)); + + // Formatting the value may raise UnicodeEncodeError: client must deal + buf.append(formatException(type, value)).append('\n'); + return buf.toString(); + } + + /** + * Helper to {@link #tracebackToString(PyObject, PyObject)} when the value in an exception turns + * out to be a syntax error. + */ + private static void appendSyntaxError(StringBuilder buf, PyObject value) { + + PyObject filename = value.__findattr__("filename"); + PyObject text = value.__findattr__("text"); + PyObject lineno = value.__findattr__("lineno"); + + buf.append(" File \""); + buf.append(filename == Py.None || filename == null ? "" : filename.toString()); + buf.append("\", line "); + buf.append(lineno == null ? Py.newString('0') : lineno); + buf.append('\n'); + + if (text != Py.None && text != null && text.__len__() != 0) { + appendSyntaxErrorText(buf, value.__findattr__("offset").asInt(), text.toString()); } } /** - * Print the two lines showing where a SyntaxError was caused. + * Generate two lines showing where a SyntaxError was caused. 
* - * @param out StdoutWrapper to print to + * @param buf to append with generated message text * @param offset the offset into text - * @param text a source code String line + * @param text a source code line */ - private static void printSyntaxErrorText(StdoutWrapper out, int offset, String text) { + private static void appendSyntaxErrorText(StringBuilder buf, int offset, String text) { if (offset >= 0) { if (offset > 0 && offset == text.length()) { offset--; @@ -1250,19 +1448,21 @@ text = text.substring(i, text.length()); } - out.print(" "); - out.print(text); + buf.append(" "); + buf.append(text); if (text.length() == 0 || !text.endsWith("\n")) { - out.print("\n"); + buf.append('\n'); } if (offset == -1) { return; } - out.print(" "); + + // The indicator line " ^" + buf.append(" "); for (offset--; offset > 0; offset--) { - out.print(" "); + buf.append(' '); } - out.print("^\n"); + buf.append("^\n"); } public static String formatException(PyObject type, PyObject value) { @@ -1290,26 +1490,40 @@ } buf.append(className); } else { - buf.append(useRepr ? type.__repr__() : type.__str__()); + // Never happens since Python 2.7? Do something sensible anyway. + buf.append(asMessageString(type, useRepr)); } + if (value != null && value != Py.None) { - // only print colon if the str() of the object is not the empty string - PyObject s = useRepr ? 
value.__repr__() : value.__str__(); - if (!(s instanceof PyString) || s.__len__() != 0) { - buf.append(": "); + String s = asMessageString(value, useRepr); + // Print colon and object (unless it renders as "") + if (s.length() > 0) { + buf.append(": ").append(s); } - buf.append(s); } + return buf.toString(); } + /** Defensive method to avoid exceptions from decoding (or import encodings) */ + private static String asMessageString(PyObject value, boolean useRepr) { + if (useRepr) + value = value.__repr__(); + if (value instanceof PyUnicode) { + return value.asString(); + } else { + // Carefully avoid decoding errors that would swallow the intended message + String s = value.__str__().getString(); + return PyString.encode_UnicodeEscape(s, false); + } + } + public static void writeUnraisable(Throwable unraisable, PyObject obj) { PyException pye = JavaError(unraisable); stderr.println(String.format("Exception %s in %s ignored", formatException(pye.type, pye.value, true), obj)); } - /* Equivalent to Python's assert statement */ public static void assert_(PyObject test, PyObject message) { if (!test.__nonzero__()) { @@ -1565,6 +1779,16 @@ } } + private static final String IMPORT_SITE_ERROR = "" + + "Cannot import site module and its dependencies: %s\n" + + "Determine if the following attributes are correct:\n" // + + " * sys.path: %s\n" + + " This attribute might be including the wrong directories, such as from CPython\n" + + " * sys.prefix: %s\n" + + " This attribute is set by the system property python.home, although it can\n" + + " be often automatically determined by the location of the Jython jar file\n\n" + + "You can use the -S option or python.import.site=false to not import the site module"; + public static boolean importSiteIfSelected() { if (Options.importSite) { try { @@ -1574,18 +1798,10 @@ } catch (PyException pye) { if (pye.match(Py.ImportError)) { PySystemState sys = Py.getSystemState(); - throw Py.ImportError(String.format("" - + "Cannot import site 
module and its dependencies: %s\n" - + "Determine if the following attributes are correct:\n" - + " * sys.path: %s\n" - + " This attribute might be including the wrong directories, such as from CPython\n" - + " * sys.prefix: %s\n" - + " This attribute is set by the system property python.home, although it can\n" - + " be often automatically determined by the location of the Jython jar file\n\n" - + "You can use the -S option or python.import.site=false to not import the site module", - pye.value.__getattr__("args").__getitem__(0), - sys.path, - sys.prefix)); + String value = pye.value.__getattr__("args").__getitem__(0).toString(); + List path = fileSystemDecode(sys.path); + String prefix = fileSystemDecode(PySystemState.prefix); + throw Py.ImportError(String.format(IMPORT_SITE_ERROR, value, path, prefix)); } else { throw pye; } @@ -2266,7 +2482,7 @@ } /* Here we would actually like to call cls.__findattr__("__metaclass__") * rather than cls.getType(). However there are circumstances where the - * metaclass doesn't show up as __metaclass__. On the other hand we need + * metaclass doesn't show up as __metaclass__. On the other hand we need * to avoid that checker refers to builtin type___subclasscheck__ or * type___instancecheck__. Filtering out checker-instances of * PyBuiltinMethodNarrow does the trick. We also filter out PyMethodDescr diff --git a/src/org/python/core/PyBaseCode.java b/src/org/python/core/PyBaseCode.java --- a/src/org/python/core/PyBaseCode.java +++ b/src/org/python/core/PyBaseCode.java @@ -170,7 +170,7 @@ } return call(state, frame, closure); } - + @Override public PyObject call(ThreadState state, PyObject arg1, PyObject arg2, PyObject arg3, PyObject arg4, PyObject globals, @@ -309,8 +309,10 @@ } public String toString() { - return String.format("", - co_name, Py.idstr(this), co_filename, co_firstlineno); + // Result must be convertible to a str (for __repr__()), but let's make it fully printable. 
+ String filename = PyString.encode_UnicodeEscape(co_filename, '"'); + return String.format("", + co_name, Py.idstr(this), filename, co_firstlineno); } protected abstract PyObject interpret(PyFrame f, ThreadState ts); diff --git a/src/org/python/core/PyBaseException.java b/src/org/python/core/PyBaseException.java --- a/src/org/python/core/PyBaseException.java +++ b/src/org/python/core/PyBaseException.java @@ -169,12 +169,17 @@ @ExposedMethod(doc = BuiltinDocs.BaseException___str___doc) final PyString BaseException___str__() { switch (args.__len__()) { - case 0: - return Py.EmptyString; - case 1: - return args.__getitem__(0).__str__(); - default: - return args.__str__(); + case 0: + return Py.EmptyString; + case 1: + PyObject arg = args.__getitem__(0); + if (arg instanceof PyString) { + return (PyString)arg; + } else { + return arg.__str__(); + } + default: + return args.__str__(); } } diff --git a/src/org/python/core/PyBytecode.java b/src/org/python/core/PyBytecode.java --- a/src/org/python/core/PyBytecode.java +++ b/src/org/python/core/PyBytecode.java @@ -116,11 +116,13 @@ throw Py.AttributeError(name); } + @Override public void __setattr__(String name, PyObject value) { // no writable attributes throwReadonly(name); } + @Override public void __delattr__(String name) { throwReadonly(name); } @@ -137,6 +139,7 @@ return new PyTuple(pystr); } + @Override public PyObject __findattr_ex__(String name) { // have to craft co_varnames specially if (name == "co_varnames") { @@ -149,7 +152,7 @@ return toPyStringTuple(co_freevars); } if (name == "co_filename") { - return new PyString(co_filename); + return Py.fileSystemEncode(co_filename); // bytes object expected by clients } if (name == "co_name") { return new PyString(co_name); @@ -1156,7 +1159,7 @@ "zap" this information, to prevent END_FINALLY from re-raising the exception. (But non-local gotos should still be resumed.) 
- */ + */ PyObject exit; PyObject u = stack.pop(), v, w; if (u == Py.None) { @@ -1350,7 +1353,7 @@ if (why != Why.RETURN) { retval = Py.None; } - } else { + } else { // store the stack in the frame for reentry from the yield; f.f_savedlocals = stack.popN(stack.size()); } diff --git a/src/org/python/core/PyException.java b/src/org/python/core/PyException.java --- a/src/org/python/core/PyException.java +++ b/src/org/python/core/PyException.java @@ -62,21 +62,31 @@ } private boolean printingStackTrace = false; + @Override public void printStackTrace() { Py.printException(this); } + @Override public Throwable fillInStackTrace() { return Options.includeJavaStackInExceptions ? super.fillInStackTrace() : this; } + @Override public synchronized void printStackTrace(PrintStream s) { if (printingStackTrace) { super.printStackTrace(s); } else { try { + /* + * Ensure that non-ascii characters are made printable. IOne would prefer to emit + * Unicode, but the output stream too often only accepts bytes. (s is not + * necessarily a console, e.g. during a doctest.) + */ + PyFile err = new PyFile(s); + err.setEncoding("ascii", "backslashreplace"); printingStackTrace = true; - Py.displayException(type, value, traceback, new PyFile(s)); + Py.displayException(type, value, traceback, err); } finally { printingStackTrace = false; } @@ -92,12 +102,9 @@ } } + @Override public synchronized String toString() { - ByteArrayOutputStream buf = new ByteArrayOutputStream(); - if (!printingStackTrace) { - printStackTrace(new PrintStream(buf)); - } - return buf.toString(); + return Py.exceptionToString(type, value, traceback); } /** @@ -332,10 +339,11 @@ public static String exceptionClassName(PyObject obj) { return obj instanceof PyClass ? 
((PyClass)obj).__name__ : ((PyType)obj).fastGetName(); } - - + + /* Traverseproc support */ + @Override public int traverse(Visitproc visit, Object arg) { int retValue; if (type != null) { @@ -357,6 +365,7 @@ return 0; } + @Override public boolean refersDirectlyTo(PyObject ob) { return ob != null && (type == ob || value == ob || traceback == ob); } diff --git a/src/org/python/core/PyFile.java b/src/org/python/core/PyFile.java --- a/src/org/python/core/PyFile.java +++ b/src/org/python/core/PyFile.java @@ -168,10 +168,6 @@ ArgParser ap = new ArgParser("file", args, kwds, new String[] {"name", "mode", "buffering"}, 1); PyObject name = ap.getPyObject(0); - if (!(name instanceof PyString)) { - throw Py.TypeError("coercing to Unicode: need string, '" + name.getType().fastGetName() - + "' type found"); - } String mode = ap.getString(1, "r"); int bufsize = ap.getInt(2, -1); file___init__(new FileIO((PyString) name, parseMode(mode)), name, mode, bufsize); @@ -179,7 +175,7 @@ } private void file___init__(RawIOBase raw, String name, String mode, int bufsize) { - file___init__(raw, new PyString(name), mode, bufsize); + file___init__(raw, Py.newStringOrUnicode(name), mode, bufsize); } private void file___init__(RawIOBase raw, PyObject name, String mode, int bufsize) { diff --git a/src/org/python/core/PyJavaPackage.java b/src/org/python/core/PyJavaPackage.java --- a/src/org/python/core/PyJavaPackage.java +++ b/src/org/python/core/PyJavaPackage.java @@ -138,9 +138,8 @@ if (name == "__dict__") return __dict__; if (name == "__mgr__") return Py.java2py(__mgr__); if (name == "__file__") { - if (__file__ != null) return new PyString(__file__); - - return Py.None; + // Stored as UTF-16 for Java but expected as bytes in Python + return __file__ == null ? 
Py.None : Py.fileSystemEncode(__file__); } return null; @@ -157,7 +156,8 @@ return; } if (attr == "__file__") { - __file__ = value.__str__().toString(); + // Stored as UTF-16 for Java but presented as bytes from Python + __file__ = Py.fileSystemDecode(value); return; } diff --git a/src/org/python/core/PyLong.java b/src/org/python/core/PyLong.java --- a/src/org/python/core/PyLong.java +++ b/src/org/python/core/PyLong.java @@ -295,6 +295,9 @@ @Override public Object __tojava__(Class c) { try { + if (c == Boolean.TYPE || c == Boolean.class) { + return new Boolean(!getValue().equals(BigInteger.ZERO)); + } if (c == Byte.TYPE || c == Byte.class) { return new Byte((byte)getLong(Byte.MIN_VALUE, Byte.MAX_VALUE)); } diff --git a/src/org/python/core/PyNullImporter.java b/src/org/python/core/PyNullImporter.java --- a/src/org/python/core/PyNullImporter.java +++ b/src/org/python/core/PyNullImporter.java @@ -20,7 +20,7 @@ public PyNullImporter(PyObject pathObj) { super(); - String pathStr = asPath(pathObj); + String pathStr = Py.fileSystemDecode(pathObj); if (pathStr.equals("")) { throw Py.ImportError("empty pathname"); } @@ -42,17 +42,6 @@ return Py.None; } - // FIXME Refactoring move helper function to a central util library - // FIXME Also can take in account working in zip file systems - - private static String asPath(PyObject pathObj) { - if (!(pathObj instanceof PyString)) { - throw Py.TypeError(String.format("coercing to Unicode: need string, %s type found", - pathObj.getType().fastGetName())); - } - return pathObj.toString(); - } - private static boolean isDir(String pathStr) { if (pathStr.equals("")) { return false; diff --git a/src/org/python/core/PyString.java b/src/org/python/core/PyString.java --- a/src/org/python/core/PyString.java +++ b/src/org/python/core/PyString.java @@ -79,7 +79,7 @@ } PyString(StringBuilder buffer) { - this(TYPE, new String(buffer)); + this(TYPE, buffer.toString()); } /** @@ -302,19 +302,51 @@ private static char[] hexdigit = 
"0123456789abcdef".toCharArray(); public static String encode_UnicodeEscape(String str, boolean use_quotes) { + char quote = use_quotes ? '?' : 0; + return encode_UnicodeEscape(str, quote); + } + + /** + * The inner logic of the string __repr__ producing an ASCII representation of the target + * string, optionally in quotations. The caller can determine whether the returned string will + * be wrapped in quotation marks, and whether Python rules are used to choose them through + * quote. + * + * @param str + * @param quoteChar '"' or '\'' use that, '?' = let Python choose, 0 or anything = no quotes + * @return encoded string (possibly the same string if unchanged) + */ + public static String encode_UnicodeEscape(String str, char quote) { + + // Choose whether to quote and the actual quote character + boolean use_quotes; + switch (quote) { + case '?': + use_quotes = true; + // Python rules + quote = str.indexOf('\'') >= 0 && str.indexOf('"') == -1 ? '"' : '\''; + break; + case '"': + case '\'': + use_quotes = true; + break; + default: + use_quotes = false; + break; + } + + // Allocate a buffer for the result (25% bigger and room for quotes) int size = str.length(); - StringBuilder v = new StringBuilder(str.length()); - - char quote = 0; + StringBuilder v = new StringBuilder(size + (size >> 2) + 2); if (use_quotes) { - quote = str.indexOf('\'') >= 0 && str.indexOf('"') == -1 ? '"' : '\''; v.append(quote); } + // Now chunter through the original string a character at a time for (int i = 0; size-- > 0;) { int ch = str.charAt(i++); - /* Escape quotes */ + // Escape quotes and backslash if ((use_quotes && ch == quote) || ch == '\\') { v.append('\\'); v.append((char)ch); @@ -368,10 +400,13 @@ v.append((char)ch); } } + if (use_quotes) { v.append(quote); } - return v.toString(); + + // Return the original string if we didn't quote or escape anything + return v.length() > size ? 
v.toString() : str; } private static ucnhashAPI pucnHash = null; @@ -3998,9 +4033,9 @@ * Implements PEP-3101 {}-formatting methods str.format() and * unicode.format(). When called with enclosingIterator == null, this * method takes this object as its formatting string. The method is also called (calls itself) - * to deal with nested formatting sepecifications. In that case, enclosingIterator + * to deal with nested formatting specifications. In that case, enclosingIterator * is a {@link MarkupIterator} on this object and value is a substring of this - * object needing recursive transaltion. + * object needing recursive translation. * * @param args to be interpolated into the string * @param keywords for the trailing args diff --git a/src/org/python/core/PySyntaxError.java b/src/org/python/core/PySyntaxError.java --- a/src/org/python/core/PySyntaxError.java +++ b/src/org/python/core/PySyntaxError.java @@ -16,17 +16,15 @@ String filename; - public PySyntaxError(String s, int line, int column, String text, - String filename) + public PySyntaxError(String s, int line, int column, String text, String filename) { super(Py.SyntaxError); - //XXX: null text causes Java error, though I bet I'm not supposed to - // get null text. + //XXX: null text causes Java error, though I bet I'm not supposed to get null text. 
if (text == null) { text = ""; } PyObject[] tmp = new PyObject[] { - new PyString(filename), new PyInteger(line), + Py.fileSystemEncode(filename), new PyInteger(line), new PyInteger(column), new PyString(text) }; diff --git a/src/org/python/core/PySystemState.java b/src/org/python/core/PySystemState.java --- a/src/org/python/core/PySystemState.java +++ b/src/org/python/core/PySystemState.java @@ -82,6 +82,9 @@ public final static PyString float_repr_style = Py.newString("short"); + /** Nominal Jython file system encoding (as sys.getfilesystemencoding()) */ + static final PyString FILE_SYSTEM_ENCODING = Py.newString("utf-8"); + public static boolean py3kwarning = false; public final static Class flags = Options.class; @@ -109,12 +112,25 @@ public static PackageManager packageManager; private static File cachedir; - private static PyList defaultPath; - private static PyList defaultArgv; - private static PyObject defaultExecutable; + private static PyList defaultPath; // list of bytes or unicode + private static PyList defaultArgv; // list of bytes or unicode + private static PyObject defaultExecutable; // bytes or unicode or None public static Properties registry; // = init_registry(); + /** + * A string giving the site-specific directory prefix where the platform independent Python + * files are installed; by default, this is based on the property python.home or + * the location of the Jython JAR. The main collection of Python library modules is installed in + * the directory prefix/Lib. This object should contain bytes in the file system + * encoding for consistency with use in the standard library (see sysconfig.py). + */ public static PyObject prefix; + /** + * A string giving the site-specific directory prefix where the platform-dependent Python files + * are installed; by default, this is the same as {@link #exec_prefix}. This object should + * contain bytes in the file system encoding for consistency with use in the standard library + * (see sysconfig.py). 
+ */ public static PyObject exec_prefix = Py.EmptyString; public static final PyString byteorder = new PyString("big"); @@ -504,7 +520,7 @@ } public PyObject getfilesystemencoding() { - return Py.None; + return FILE_SYSTEM_ENCODING; } @@ -840,10 +856,10 @@ } } if (prefix != null) { - PySystemState.prefix = Py.newString(prefix); + PySystemState.prefix = Py.fileSystemEncode(prefix); } if (exec_prefix != null) { - PySystemState.exec_prefix = Py.newString(exec_prefix); + PySystemState.exec_prefix = Py.fileSystemEncode(exec_prefix); } try { String jythonpath = System.getenv("JYTHONPATH"); @@ -1155,7 +1171,8 @@ } cachedir = new File(props.getProperty(PYTHON_CACHEDIR, CACHEDIR_DEFAULT_NAME)); if (!cachedir.isAbsolute()) { - cachedir = new File(prefix == null ? null : prefix.toString(), cachedir.getPath()); + String prefixString = prefix == null ? null : Py.fileSystemDecode(prefix); + cachedir = new File(prefixString, cachedir.getPath()); } } @@ -1174,16 +1191,17 @@ PyList argv = new PyList(); if (args != null) { for (String arg : args) { - argv.append(Py.newStringOrUnicode(arg)); + // For consistency with CPython and the standard library, sys.argv is FS-encoded. + argv.append(Py.fileSystemEncode(arg)); } } return argv; } /** - * Determine the default sys.executable value from the registry. - * If registry is not set (as in standalone jython jar), will use sys.prefix + /bin/jython(.exe) and the file may - * not exist. Users can create a wrapper in it's place to make it work in embedded environments. + * Determine the default sys.executable value from the registry. If registry is not set (as in + * standalone jython jar), we will use sys.prefix + /bin/jython(.exe) and the file may not + * exist. Users can create a wrapper in it's place to make it work in embedded environments. 
* Only if sys.prefix is null, returns Py.None * * @param props a Properties registry @@ -1191,26 +1209,26 @@ */ private static PyObject initExecutable(Properties props) { String executable = props.getProperty("python.executable"); - if (executable == null) { + File executableFile; + if (executable != null) { + // The executable from the registry is a Unicode String path + executableFile = new File(executable); + } else { if (prefix == null) { return Py.None; } else { - executable = prefix.asString() + File.separator + "bin" + File.separator; - if (Platform.IS_WINDOWS) { - executable += "jython.exe"; - } else { - executable += "jython"; - } + // The prefix is a unicode or encoded bytes object + executableFile = new File(Py.fileSystemDecode(prefix), + Platform.IS_WINDOWS ? "bin\\jython.exe" : "bin/jython"); } } - File executableFile = new File(executable); try { executableFile = executableFile.getCanonicalFile(); } catch (IOException ioe) { executableFile = executableFile.getAbsoluteFile(); } - return new PyString(executableFile.getPath()); + return Py.newStringOrUnicode(executableFile.getPath()); // XXX always bytes in CPython } /** @@ -1353,8 +1371,8 @@ PyList path = new PyList(); addPaths(path, props.getProperty("python.path", "")); if (prefix != null) { - String libpath = new File(prefix.toString(), "Lib").toString(); - path.append(new PyString(libpath)); + String libpath = new File(Py.fileSystemDecode(prefix), "Lib").toString(); + path.append(Py.fileSystemEncode(libpath)); // XXX or newUnicode? 
} if (standalone) { // standalone jython: add the /Lib directory inside JYTHON_JAR to the path @@ -1397,7 +1415,8 @@ private static void addPaths(PyList path, String pypath) { StringTokenizer tok = new StringTokenizer(pypath, java.io.File.pathSeparator); while (tok.hasMoreTokens()) { - path.append(new PyString(tok.nextToken().trim())); + // Use unicode object if necessary to represent the element + path.append(Py.newStringOrUnicode(tok.nextToken().trim())); // XXX or newUnicode? } } @@ -1540,6 +1559,7 @@ closer.cleanup(); } + @Override public void close() { cleanup(); } public static class PySystemStateCloser { diff --git a/src/org/python/core/PyTableCode.java b/src/org/python/core/PyTableCode.java --- a/src/org/python/core/PyTableCode.java +++ b/src/org/python/core/PyTableCode.java @@ -66,6 +66,7 @@ // co_lnotab, co_stacksize }; + @Override public PyObject __dir__() { PyString members[] = new PyString[__members__.length]; for (int i = 0; i < __members__.length; i++) @@ -80,11 +81,13 @@ throw Py.AttributeError(name); } + @Override public void __setattr__(String name, PyObject value) { // no writable attributes throwReadonly(name); } + @Override public void __delattr__(String name) { throwReadonly(name); } @@ -99,6 +102,7 @@ return new PyTuple(pystr); } + @Override public PyObject __findattr_ex__(String name) { // have to craft co_varnames specially if (name == "co_varnames") { @@ -111,7 +115,7 @@ return toPyStringTuple(co_freevars); } if (name == "co_filename") { - return new PyString(co_filename); + return Py.fileSystemEncode(co_filename); // bytes object expected by clients } if (name == "co_name") { return new PyString(co_name); diff --git a/src/org/python/core/PyUnicode.java b/src/org/python/core/PyUnicode.java --- a/src/org/python/core/PyUnicode.java +++ b/src/org/python/core/PyUnicode.java @@ -89,7 +89,7 @@ } PyUnicode(StringBuilder buffer) { - this(TYPE, new String(buffer)); + this(TYPE, buffer.toString()); } private static StringBuilder 
fromCodePoints(Iterator iter) { @@ -713,7 +713,7 @@ for (Iterator iter = newSubsequenceIterator(start, stop, step); iter.hasNext();) { buffer.appendCodePoint(iter.next()); } - return createInstance(new String(buffer)); + return createInstance(buffer.toString()); } @ExposedMethod(type = MethodType.CMP, doc = BuiltinDocs.unicode___getslice___doc) diff --git a/src/org/python/core/StdoutWrapper.java b/src/org/python/core/StdoutWrapper.java --- a/src/org/python/core/StdoutWrapper.java +++ b/src/org/python/core/StdoutWrapper.java @@ -102,28 +102,33 @@ } private String printToFile(PyFile file, PyObject o) { - String s; + // We must ensure o is a byte string before we write it to the stream + String bytes; + if (!(o instanceof PyUnicode)) { + o = o.__str__(); + } + // o is now a PyString, but it might be unicode or bytes if (o instanceof PyUnicode) { // Use the encoding and policy defined for the stream. (Each may be null.) - s = ((PyUnicode)o).encode(file.encoding, file.errors); + bytes = ((PyUnicode)o).encode(file.encoding, file.errors); } else { - s = o.__str__().toString(); + bytes = ((PyString)o).getString(); } - file.write(s); - return s; + file.write(bytes); + return bytes; } private String printToFileWriter(PyFileWriter file, PyObject o) { - // since we are outputting directly to a character stream, - // avoid doing an encoding - String s; - if (o instanceof PyString) { - s = ((PyString) o).getString(); + // since we are outputting directly to a character stream, avoid encoding + String chars; + if (o instanceof PyUnicode) { + chars = ((PyString) o).getString(); } else { - s = o.__str__().toString(); + // Bytes here are assumed to be code points, as in PyFileWriter.write() + chars = o.__str__().getString(); } - file.write(s); - return s; + file.write(chars); + return chars; } private void printToFileObject(PyObject file, PyObject o) { @@ -248,11 +253,11 @@ } public void print(String s) { - print(Py.newStringOrUnicode(s), false, false); + print(Py.newUnicode(s), 
false, false); } public void println(String s) { - print(Py.newStringOrUnicode(s), false, true); + print(Py.newUnicode(s), false, true); } public void print(PyObject o) { diff --git a/src/org/python/core/SyspathArchive.java b/src/org/python/core/SyspathArchive.java --- a/src/org/python/core/SyspathArchive.java +++ b/src/org/python/core/SyspathArchive.java @@ -1,4 +1,3 @@ - package org.python.core; import java.io.*; import java.util.zip.*; @@ -8,7 +7,8 @@ private ZipFile zipFile; public SyspathArchive(String archiveName) throws IOException { - super(archiveName); + // As a string-like object (on sys.path) an FS-encoded bytes object is expected + super(Py.fileSystemEncode(archiveName).getString()); archiveName = getArchiveName(archiveName); if(archiveName == null) { throw new IOException("path '" + archiveName + "' not an archive"); @@ -20,7 +20,8 @@ } SyspathArchive(ZipFile zipFile, String archiveName) { - super(archiveName); + // As a string-like object (on sys.path) an FS-encoded bytes object is expected + super(Py.fileSystemEncode(archiveName).getString()); this.zipFile = zipFile; } diff --git a/src/org/python/core/SyspathJavaLoader.java b/src/org/python/core/SyspathJavaLoader.java --- a/src/org/python/core/SyspathJavaLoader.java +++ b/src/org/python/core/SyspathJavaLoader.java @@ -26,20 +26,20 @@ public SyspathJavaLoader(ClassLoader parent) { super(parent); } - - /** + + /** * Returns a byte[] with the contents read from an InputStream. - * + * * The stream is closed after reading the bytes. 
- * - * @param input The input stream + * + * @param input The input stream * @param size The number of bytes to read - * + * * @return an array of byte[size] with the contents read * */ private byte[] getBytesFromInputStream(InputStream input, int size) { - try { + try { byte[] buffer = new byte[size]; int nread = 0; while(nread < size) { @@ -56,9 +56,9 @@ } } } - + private byte[] getBytesFromDir(String dir, String name) { - try { + try { File file = getFile(dir, name); if (file == null) { return null; @@ -71,7 +71,7 @@ } } - + private byte[] getBytesFromArchive(SyspathArchive archive, String name) { String entryname = name.replace('.', SLASH_CHAR) + ".class"; ZipEntry ze = archive.getEntry(entryname); @@ -79,7 +79,7 @@ return null; } try { - return getBytesFromInputStream(archive.getInputStream(ze), + return getBytesFromInputStream(archive.getInputStream(ze), (int)ze.getSize()); } catch (IOException e) { return null; @@ -98,11 +98,11 @@ } return pkg; } - + @Override protected Class findClass(String name) throws ClassNotFoundException { PySystemState sys = Py.getSystemState(); - ClassLoader sysClassLoader = sys.getClassLoader(); + ClassLoader sysClassLoader = sys.getClassLoader(); if (sysClassLoader != null) { // sys.classLoader overrides this class loader! 
return sysClassLoader.loadClass(name); @@ -114,13 +114,10 @@ PyObject entry = replacePathItem(sys, i, path); if (entry instanceof SyspathArchive) { SyspathArchive archive = (SyspathArchive)entry; - buffer = getBytesFromArchive(archive, name); + buffer = getBytesFromArchive(archive, name); } else { - if (!(entry instanceof PyUnicode)) { - entry = entry.__str__(); - } - String dir = entry.toString(); - buffer = getBytesFromDir(dir, name); + String dir = Py.fileSystemDecode(entry); + buffer = getBytesFromDir(dir, name); } if (buffer != null) { definePackageForClass(name); @@ -130,7 +127,7 @@ // couldn't find the .class file on sys.path throw new ClassNotFoundException(name); } - + @Override protected URL findResource(String res) { PySystemState sys = Py.getSystemState(); @@ -157,10 +154,7 @@ } continue; } - if (!(entry instanceof PyUnicode)) { - entry = entry.__str__(); - } - String dir = sys.getPath(entry.toString()); + String dir = sys.getPath(Py.fileSystemDecode(entry)); try { File resource = new File(dir, res); if (!resource.exists()) { @@ -179,7 +173,7 @@ throws IOException { List resources = new ArrayList(); - + PySystemState sys = Py.getSystemState(); res = deslashResource(res); @@ -204,10 +198,7 @@ } continue; } - if (!(entry instanceof PyUnicode)) { - entry = entry.__str__(); - } - String dir = sys.getPath(entry.toString()); + String dir = sys.getPath(Py.fileSystemDecode(entry)); try { File resource = new File(dir, res); if (!resource.exists()) { @@ -220,7 +211,7 @@ } return Collections.enumeration(resources); } - + static PyObject replacePathItem(PySystemState sys, int idx, PyList paths) { PyObject path = paths.__getitem__(idx); if (path instanceof SyspathArchive) { @@ -229,9 +220,9 @@ } try { - // this has the side affect of adding the jar to the PackageManager during the + // this has the side effect of adding the jar to the PackageManager during the // initialization of the SyspathArchive - path = new SyspathArchive(sys.getPath(path.toString())); + path = 
new SyspathArchive(sys.getPath(Py.fileSystemDecode(path))); } catch (Exception e) { return path; } diff --git a/src/org/python/core/__builtin__.java b/src/org/python/core/__builtin__.java --- a/src/org/python/core/__builtin__.java +++ b/src/org/python/core/__builtin__.java @@ -85,7 +85,7 @@ case 18: return __builtin__.eval(arg1); case 19: - __builtin__.execfile(arg1.asString()); + __builtin__.execfile(Py.fileSystemDecode(arg1)); return Py.None; case 23: return __builtin__.hex(arg1); @@ -141,7 +141,7 @@ case 18: return __builtin__.eval(arg1, arg2); case 19: - __builtin__.execfile(arg1.asString(), arg2); + __builtin__.execfile(Py.fileSystemDecode(arg1), arg2); return Py.None; case 20: return __builtin__.filter(arg1, arg2); @@ -191,7 +191,7 @@ case 18: return __builtin__.eval(arg1, arg2, arg3); case 19: - __builtin__.execfile(arg1.asString(), arg2, arg3); + __builtin__.execfile(Py.fileSystemDecode(arg1), arg2, arg3); return Py.None; case 21: return __builtin__.getattr(arg1, arg2, arg3); @@ -1629,7 +1629,7 @@ "dont_inherit"}, 3); PyObject source = ap.getPyObject(0); - String filename = ap.getString(1); + String filename = Py.fileSystemDecode(ap.getPyObject(1)); String mode = ap.getString(2); int flags = ap.getInt(3, 0); boolean dont_inherit = ap.getPyObject(4, Py.False).__nonzero__(); diff --git a/src/org/python/core/imp.java b/src/org/python/core/imp.java --- a/src/org/python/core/imp.java +++ b/src/org/python/core/imp.java @@ -296,6 +296,7 @@ return compileSource(name, makeStream(file), sourceFilename, mtime); } + /** Remove the last three characters of a file name and add the compiled suffix "$py.class". 
*/ public static String makeCompiledFilename(String filename) { return filename.substring(0, filename.length() - 3) + "$py.class"; } @@ -420,7 +421,8 @@ } if (moduleLocation != null) { - module.__setattr__("__file__", new PyString(moduleLocation)); + // Standard library expects __file__ to be encoded bytes + module.__setattr__("__file__", Py.fileSystemEncode(moduleLocation)); } else if (module.__findattr__("__file__") == null) { // Should probably never happen (but maybe with an odd custom builtins, or // Java Integration) @@ -545,10 +547,8 @@ return loadFromLoader(loader, moduleName); } } - if (!(p instanceof PyUnicode)) { - p = p.__str__(); - } - ret = loadFromSource(sys, name, moduleName, p.toString()); + // p could be unicode or bytes (in the file system encoding) + ret = loadFromSource(sys, name, moduleName, Py.fileSystemDecode(p)); if (ret != null) { return ret; } @@ -608,7 +608,7 @@ // display names are for identification purposes (e.g. __file__): when entry is // null it forces java.io.File to be a relative path (e.g. foo/bar.py instead of // /tmp/foo/bar.py) - String displayDirName = entry.equals("") ? null : entry.toString(); + String displayDirName = entry.equals("") ? 
null : entry; String displaySourceName = new File(new File(displayDirName, name), sourceName).getPath(); String displayCompiledName = new File(new File(displayDirName, name), compiledName).getPath(); @@ -624,8 +624,9 @@ if (caseok(dir, name) && (sourceFile.isFile() || compiledFile.isFile())) { pkg = true; } else { + String printDirName = PyString.encode_UnicodeEscape(displayDirName, '\''); Py.warning(Py.ImportWarning, String.format( - "Not importing directory '%s': missing __init__.py", dirName)); + "Not importing directory %s: missing __init__.py", printDirName)); } } } catch (SecurityException e) { @@ -642,7 +643,7 @@ compiledFile = new File(dirName, compiledName); } else { PyModule m = addModule(modName); - PyObject filename = new PyString(new File(displayDirName, name).getPath()); + PyObject filename = Py.newStringOrUnicode(new File(displayDirName, name).getPath()); m.__dict__.__setitem__("__path__", new PyList(new PyObject[] {filename})); } @@ -935,9 +936,6 @@ } } } - if (name.indexOf(File.separatorChar) != -1) { - throw Py.ImportError("Import by filename is not supported."); - } PyObject modules = Py.getSystemState().modules; PyObject pkgMod = null; String pkgName = null; @@ -981,6 +979,13 @@ return mod; } + /** Defend against attempt to import by filename (withdrawn feature). 
*/ + private static void checkNotFile(String name){ + if (name.indexOf(File.separatorChar) != -1) { + throw Py.ImportError("Import by filename is not supported."); + } + } + private static void ensureFromList(PyObject mod, PyObject fromlist, String name) { ensureFromList(mod, fromlist, name, false); } @@ -1023,6 +1028,7 @@ * @return an imported module (Java or Python) */ public static PyObject importName(String name, boolean top) { + checkNotFile(name); PyUnicode.checkEncoding(name); ReentrantLock importLock = Py.getSystemState().getImportLock(); importLock.lock(); @@ -1043,6 +1049,7 @@ */ public static PyObject importName(String name, boolean top, PyObject modDict, PyObject fromlist, int level) { + checkNotFile(name); PyUnicode.checkEncoding(name); ReentrantLock importLock = Py.getSystemState().getImportLock(); importLock.lock(); diff --git a/src/org/python/core/io/FileIO.java b/src/org/python/core/io/FileIO.java --- a/src/org/python/core/io/FileIO.java +++ b/src/org/python/core/io/FileIO.java @@ -67,22 +67,23 @@ * @see #FileIO(PyString name, String mode) */ public FileIO(String name, String mode) { - this(Py.newString(name), mode); + this(Py.newUnicode(name), mode); } /** - * Construct a FileIO instance for the specified file name. + * Construct a FileIO instance for the specified file name, which will be decoded using the + * nominal Jython file system encoding if it is a str/bytes rather than a + * unicode. * - * The mode can be 'r', 'w' or 'a' for reading (default), writing - * or appending. Add a '+' to the mode to allow simultaneous - * reading and writing. + * The mode can be 'r', 'w' or 'a' for reading (default), writing or appending. Add a '+' to the + * mode to allow simultaneous reading and writing. 
* * @param name the name of the file * @param mode a raw io file mode String */ public FileIO(PyString name, String mode) { parseMode(mode); - File absPath = new RelativeFile(name.toString()); + File absPath = new RelativeFile(Py.fileSystemDecode(name)); try { if ((appending && !(reading || plus)) || (writing && !reading && !plus)) { diff --git a/src/org/python/core/packagecache/PathPackageManager.java b/src/org/python/core/packagecache/PathPackageManager.java --- a/src/org/python/core/packagecache/PathPackageManager.java +++ b/src/org/python/core/packagecache/PathPackageManager.java @@ -40,12 +40,9 @@ + name; for (int i = 0; i < path.__len__(); i++) { + // Each entry in the path may be byte-encoded or unicode PyObject entry = path.pyget(i); - if (!(entry instanceof PyUnicode)) { - entry = entry.__str__(); - } - String dir = entry.toString(); - + String dir = Py.fileSystemDecode(entry); File f = new RelativeFile(dir, child); try { if (f.isDirectory() && imp.caseok(f, name)) { @@ -103,11 +100,8 @@ String child = jpkg.__name__.replace('.', File.separatorChar); for (int i = 0; i < path.__len__(); i++) { - PyObject entry = path.pyget(i); - if (!(entry instanceof PyUnicode)) { - entry = entry.__str__(); - } - String dir = entry.toString(); + // Each entry in the path may be byte-encoded or unicode + String dir = Py.fileSystemDecode(path.pyget(i)); if (dir.length() == 0) { dir = null; @@ -222,10 +216,8 @@ * true if path refers to a jar. 
*/ public void addClassPath(String path) { - PyList paths = new PyString(path).split(java.io.File.pathSeparator); - - for (int i = 0; i < paths.__len__(); i++) { - String entry = paths.pyget(i).toString(); + String[] paths = path.split(java.io.File.pathSeparator); + for (String entry: paths) { if (entry.endsWith(".jar") || entry.endsWith(".zip")) { addJarToPackages(new File(entry), true); } else { diff --git a/src/org/python/modules/_imp.java b/src/org/python/modules/_imp.java --- a/src/org/python/modules/_imp.java +++ b/src/org/python/modules/_imp.java @@ -68,14 +68,14 @@ * This needs to be consolidated with the code in (@see org.python.core.imp). * * @param name module name - * @param entry a path String + * @param entry a path String (Unicode file or directory name) * @param findingPackage if looking for a package only try to locate __init__ * @return null if no module found otherwise module information */ static ModuleInfo findFromSource(String name, String entry, boolean findingPackage, boolean preferSource) { String sourceName = "__init__.py"; - String compiledName = makeCompiledFilename(sourceName); + String compiledName = imp.makeCompiledFilename(sourceName); String directoryName = PySystemState.getPathLazy(entry); // displayDirName is for identification purposes: when null it // forces java.io.File to be a relative path (e.g. 
foo/bar.py @@ -97,7 +97,7 @@ } else { Py.writeDebug("import", "trying source " + dir.getPath()); sourceName = name + ".py"; - compiledName = makeCompiledFilename(sourceName); + compiledName = imp.makeCompiledFilename(sourceName); sourceFile = new File(directoryName, sourceName); compiledFile = new File(directoryName, compiledName); } @@ -152,8 +152,7 @@ throw Py.TypeError("must be a file-like object"); } PySystemState sys = Py.getSystemState(); - String compiledFilename = - makeCompiledFilename(sys.getPath(filename)); + String compiledFilename = imp.makeCompiledFilename(sys.getPath(filename)); mod = imp.createFromSource(modname.intern(), (InputStream)o, filename, compiledFilename); PyObject modules = sys.modules; @@ -161,15 +160,38 @@ return mod; } - public static PyObject load_compiled(String name, String pathname) { - return load_compiled(name, pathname, new PyFile(pathname, "rb", -1)); - } - public static PyObject reload(PyObject module) { return __builtin__.reload(module); } - public static PyObject load_compiled(String name, String pathname, PyObject file) { + /** + * Return a module with the given name, the result of executing the compiled code + * at the given pathname. If this path is a PyUnicode, it is used + * exactly; if it is a PyString it is taken to be file-system encoded. + * + * @param name the module name + * @param pathname to the compiled module (becomes __file__) + * @return the module called name + */ + public static PyObject load_compiled(String name, PyString pathname) { + String _pathname = Py.fileSystemDecode(pathname); + return _load_compiled(name, _pathname, new PyFile(_pathname, "rb", -1)); + } + + /** + * Return a module with the given name, the result of executing the compiled code + * in the given file stream. 
+ * + * @param name the module name + * @param pathname a file path that is not null (becomes __file__) + * @param file stream from which the compiled code is taken + * @return the module called name + */ + public static PyObject load_compiled(String name, PyString pathname, PyObject file) { + return _load_compiled(name, Py.fileSystemDecode(pathname), file); + } + + private static PyObject _load_compiled(String name, String pathname, PyObject file) { InputStream stream = (InputStream) file.__tojava__(InputStream.class); if (stream == Py.NoConversion) { throw Py.TypeError("must be a file-like object"); @@ -190,8 +212,10 @@ public static PyObject find_module(String name, PyObject path) { if (path == Py.None && PySystemState.getBuiltin(name) != null) { - return new PyTuple(Py.None, Py.newString(name), - new PyTuple(Py.EmptyString, Py.EmptyString, + return new PyTuple(Py.None, + Py.newString(name), + new PyTuple(Py.EmptyString, + Py.EmptyString, Py.newInteger(C_BUILTIN))); } @@ -199,14 +223,15 @@ path = Py.getSystemState().path; } for (PyObject p : path.asIterable()) { - ModuleInfo mi = findFromSource(name, p.toString(), false, true); + ModuleInfo mi = findFromSource(name, Py.fileSystemDecode(p), false, true); if(mi == null) { continue; } return new PyTuple(mi.file, - new PyString(mi.filename), - new PyTuple(new PyString(mi.suffix), - new PyString(mi.mode), + // File names generally expected in the FS encoding + Py.fileSystemEncode(mi.filename), + new PyTuple(Py.newString(mi.suffix), + Py.newString(mi.mode), Py.newInteger(mi.type))); } throw Py.ImportError("No module named " + name); @@ -216,7 +241,8 @@ PyObject mod = Py.None; PySystemState sys = Py.getSystemState(); int type = data.__getitem__(2).asInt(); - while(mod == Py.None) { + String filenameString = Py.fileSystemDecode(filename); + while (mod == Py.None) { String compiledName; switch (type) { case PY_SOURCE: @@ -226,8 +252,8 @@ } // XXX: This should load the accompanying byte code file instead, if it exists - 
String resolvedFilename = sys.getPath(filename.toString()); - compiledName = makeCompiledFilename(resolvedFilename); + String resolvedFilename = sys.getPath(filenameString); + compiledName = imp.makeCompiledFilename(resolvedFilename); if (name.endsWith(".__init__")) { name = name.substring(0, name.length() - ".__init__".length()); } else if (name.equals("__init__")) { @@ -241,19 +267,20 @@ } mod = imp.createFromSource(name.intern(), (InputStream)o, - filename.toString(), compiledName, mtime); + filenameString, compiledName, mtime); break; case PY_COMPILED: - mod = load_compiled(name, filename.toString(), file); + mod = _load_compiled(name, filenameString, file); break; case PKG_DIRECTORY: PyModule m = imp.addModule(name); m.__dict__.__setitem__("__path__", new PyList(new PyObject[] {filename})); m.__dict__.__setitem__("__file__", filename); - ModuleInfo mi = findFromSource(name, filename.toString(), true, true); + ModuleInfo mi = findFromSource(name, filenameString, true, true); type = mi.type; file = mi.file; - filename = new PyString(mi.filename); + filenameString = mi.filename; + filename = Py.newStringOrUnicode(filenameString); break; default: throw Py.ImportError("No module named " + name); @@ -264,8 +291,13 @@ return mod; } - public static String makeCompiledFilename(String filename) { - return imp.makeCompiledFilename(filename); + /** + * Variant of {@link imp#makeCompiledFilename(String)} dealing with encoded bytes. In the context + * where this is used from Python, a result in encoded bytes is preferable. 
+ */ + public static PyString makeCompiledFilename(PyString filename) { + filename = Py.fileSystemEncode(filename); + return Py.newString(imp.makeCompiledFilename(filename.getString())); } public static PyObject get_magic() { diff --git a/src/org/python/modules/_py_compile.java b/src/org/python/modules/_py_compile.java --- a/src/org/python/modules/_py_compile.java +++ b/src/org/python/modules/_py_compile.java @@ -12,22 +12,30 @@ public class _py_compile { public static PyList __all__ = new PyList(new PyString[] { new PyString("compile") }); - public static boolean compile(String filename, String cfile, String dfile) { - // Resolve relative path names. dfile is only used for error messages and should not be - // resolved + /** + * Java wrapper on the module compiler in support of of py_compile.compile. Filenames here will + * be interpreted as Unicode if they are PyUnicode, and as byte-encoded names if they only + * PyString. + * + * @param fileName actual source file name + * @param compiledName compiled filename + * @param displayName displayed source filename, only used for error messages (and not resolved) + * @return true if successful + */ + public static boolean compile(PyString fileName, PyString compiledName, PyString displayName) { + // Resolve source path and check it exists PySystemState sys = Py.getSystemState(); - filename = sys.getPath(filename); - cfile = sys.getPath(cfile); + String file = sys.getPath(Py.fileSystemDecode(fileName)); + File f = new File(file); + if (!f.exists()) { + throw Py.IOError(Errno.ENOENT, file); + } - File file = new File(filename); - if (!file.exists()) { - throw Py.IOError(Errno.ENOENT, Py.newString(filename)); - } - String name = getModuleName(file); - - byte[] bytes = org.python.core.imp.compileSource(name, file, dfile, cfile); - org.python.core.imp.cacheCompiledSource(filename, cfile, bytes); - + // Convert file in which to put the byte code and display name (each may be null) + String c = (compiledName == null) ? 
null : sys.getPath(Py.fileSystemDecode(compiledName)); + String d = (displayName == null) ? null : Py.fileSystemDecode(displayName); + byte[] bytes = org.python.core.imp.compileSource(getModuleName(f), f, d, c); + org.python.core.imp.cacheCompiledSource(file, c, bytes); return bytes.length > 0; } diff --git a/src/org/python/modules/posix/PosixModule.java b/src/org/python/modules/posix/PosixModule.java --- a/src/org/python/modules/posix/PosixModule.java +++ b/src/org/python/modules/posix/PosixModule.java @@ -57,6 +57,7 @@ import org.python.core.PyString; import org.python.core.PySystemState; import org.python.core.PyTuple; +import org.python.core.PyUnicode; import org.python.core.Untraversable; import org.python.core.imp; import org.python.core.io.FileIO; @@ -486,7 +487,8 @@ "getcwd() -> path\n\n" + "Return a string representing the current working directory."); public static PyObject getcwd() { - return Py.newStringOrUnicode(Py.getSystemState().getCurrentWorkingDir()); + // The return value is bytes in the file system encoding + return Py.fileSystemEncode(Py.getSystemState().getCurrentWorkingDir()); } public static PyString __doc__getcwdu = new PyString( @@ -676,9 +678,16 @@ throw Py.OSError("listdir(): an unknown error occurred: " + path); } + // Return names as bytes or unicode according to the type of the original argument PyList list = new PyList(); - for (String name : names) { - list.append(Py.newStringOrUnicode(path, name)); + if (path instanceof PyUnicode) { + for (String name : names) { + list.append(Py.newUnicode(name)); + } + } else { + for (String name : names) { + list.append(Py.fileSystemEncode(name)); + } } return list; } @@ -1343,25 +1352,24 @@ return environ; } for (Map.Entry entry : env.entrySet()) { + // The shell restricts names to a subset of ASCII and values are encoded byte strings. 
environ.__setitem__( - Py.newStringOrUnicode(entry.getKey()), - Py.newStringOrUnicode(entry.getValue())); + Py.newString(entry.getKey()), + Py.fileSystemEncode(entry.getValue())); } return environ; } /** - * Return a path as a String from a PyObject + * Return a path as a String from a PyObject, which must be str or + * unicode. If the path is a str (that is, bytes), it is + * interpreted into Unicode using the file system encoding. * * @param path a PyObject, raising a TypeError if an invalid path type * @return a String path */ private static String asPath(PyObject path) { - if (path instanceof PyString) { - return path.toString(); - } - throw Py.TypeError(String.format("coercing to Unicode: need string, %s type found", - path.getType().fastGetName())); + return Py.fileSystemDecode(path); } /** diff --git a/src/org/python/modules/zipimport/zipimporter.java b/src/org/python/modules/zipimport/zipimporter.java --- a/src/org/python/modules/zipimport/zipimporter.java +++ b/src/org/python/modules/zipimport/zipimporter.java @@ -20,6 +20,7 @@ import org.python.core.PySystemState; import org.python.core.PyTuple; import org.python.core.PyType; +import org.python.core.PyUnicode; import org.python.core.Traverseproc; import org.python.core.Visitproc; import org.python.core.util.FileUtil; @@ -80,7 +81,7 @@ @ExposedMethod final void zipimporter___init__(PyObject[] args, String[] kwds) { ArgParser ap = new ArgParser("__init__", args, kwds, new String[] {"path"}); - String path = ap.getString(0); + String path = Py.fileSystemDecode(ap.getPyObject(0)); zipimporter___init__(path); } @@ -113,10 +114,11 @@ pathFile = parentFile; } if (archive != null) { - files = zipimport._zip_directory_cache.__finditem__(archive); + PyUnicode archivePath = Py.newUnicode(archive); + files = zipimport._zip_directory_cache.__finditem__(archivePath); if (files == null) { files = readDirectory(archive); - zipimport._zip_directory_cache.__setitem__(archive, files); + 
zipimport._zip_directory_cache.__setitem__(archivePath, files); } } else { throw zipimport.ZipImportError("not a Zip file: " + path); @@ -172,11 +174,12 @@ */ @Override public String get_data(String path) { - return zipimporter_get_data(path); + return zipimporter_get_data(Py.newUnicode(path)); } @ExposedMethod - final String zipimporter_get_data(String path) { + final String zipimporter_get_data(PyObject opath) { + String path = Py.fileSystemDecode(opath); int len = archive.length(); if (len < path.length() && path.startsWith(archive + File.separator)) { path = path.substring(len + 1); @@ -246,7 +249,8 @@ final PyObject zipimporter_get_filename(String fullname) { ModuleCodeData moduleCodeData = getModuleCode(fullname); if (moduleCodeData != null) { - return Py.newStringOrUnicode(moduleCodeData.path); + // File names generally expected in the FS encoding at the Python level + return Py.fileSystemEncode(moduleCodeData.path); } return Py.None; } @@ -397,7 +401,8 @@ ZipEntry zipEntry = zipEntries.nextElement(); String name = zipEntry.getName().replace('/', File.separatorChar); - PyObject __file__ = Py.newStringOrUnicode(archive + File.separator + name); + // File names generally expected in the FS encoding at the Python level + PyObject __file__ = Py.fileSystemEncode(archive + File.separator + name); PyObject compress = Py.newInteger(zipEntry.getMethod()); PyObject data_size = new PyLong(zipEntry.getCompressedSize()); PyObject file_size = new PyLong(zipEntry.getSize()); diff --git a/src/org/python/util/jython.java b/src/org/python/util/jython.java --- a/src/org/python/util/jython.java +++ b/src/org/python/util/jython.java @@ -196,7 +196,7 @@ try { PyObject runpy = imp.importName("runpy", true); PyObject runmodule = runpy.__findattr__("_run_module_as_main"); - runmodule.__call__(Py.newStringOrUnicode(moduleName), Py.newBoolean(set_argv0)); + runmodule.__call__(Py.fileSystemEncode(moduleName), Py.newBoolean(set_argv0)); } catch (Throwable t) { Py.printException(t); 
interp.cleanup(); @@ -206,7 +206,7 @@ private static boolean runMainFromImporter(InteractiveConsole interp, String filename) { // Support http://bugs.python.org/issue1739468 - Allow interpreter to execute a zip file or directory - PyString argv0 = Py.newStringOrUnicode(filename); + PyString argv0 = Py.fileSystemEncode(filename); PyObject importer = imp.getImporter(argv0); if (!(importer instanceof PyNullImporter)) { /* argv0 is usable as an import source, so @@ -323,7 +323,7 @@ if (path == null) { path = ""; } - Py.getSystemState().path.insert(0, Py.newStringOrUnicode(path)); + Py.getSystemState().path.insert(0, Py.fileSystemEncode(path)); if (opts.jar) { try { runJar(opts.filename); @@ -341,8 +341,8 @@ } else { try { interp.globals.__setitem__(new PyString("__file__"), - new PyString(opts.filename)); - + // Note that __file__ is widely expected to be encoded bytes + Py.fileSystemEncode(opts.filename)); FileInputStream file; try { file = new FileInputStream(new RelativeFile(opts.filename)); diff --git a/src/shell/jython.exe b/src/shell/jython.exe index 7c9cbe9eec239c5768c17f873726220b09966341..b7500204c603274a6bdb9ec15064bd27f31c14ac GIT binary patch [stripped] diff --git a/src/shell/jython.py b/src/shell/jython.py --- a/src/shell/jython.py +++ b/src/shell/jython.py @@ -20,19 +20,68 @@ is_windows = os.name == "nt" or (os.name == "java" and os._name == "nt") +# A note about encoding: +# +# A major motivation for this program is to launch Jython on Windows, where +# console and file encoding may be different. Command-line arguments and +# environment variables are presented in Python 2.7 as byte-data, encoded +# "somehow". It becomes important to know which decoding to use as soon as +# paths may contain non-ascii characters. It is not the console encoding. +# Experiment shows that sys.getfilesystemencoding() is generally applicable +# to arguments, environment variables and spawning a subprocess. +# +# On a Windows 10 box, this comes up with pseudo-codec 'mbcs'. 
This supports +# European accented characters pretty well. +# +# When localised to Chinese(simplified) the FS encoding mbcs includes many +# more points than cp936 (the console encoding), although it still struggles +# with European accented characters. + +ENCODING = sys.getfilesystemencoding() or "utf-8" + + +def get_env(envvar, default=None): + """ Return the named environment variable, decoded to Unicode.""" + v = os.environ.get(envvar, default) + # Tolerate default given as bytes, as we're bound to forget sometimes + if isinstance(v, bytes): + v = v.decode(ENCODING) + # Remove quotes sometimes necessary around the value + if v is not None and v.startswith('"') and v.endswith('"'): + v = v[1:-1] + return v + +def encode_list(args, encoding=ENCODING): + """ Convert list of Unicode strings to list of encoded byte strings.""" + r = [] + for a in args: + if not isinstance(a, bytes): a = a.encode(encoding) + r.append(a) + return r + +def decode_list(args, encoding=ENCODING): + """ Convert list of byte strings to list of Unicode strings.""" + r = [] + for a in args: + if not isinstance(a, unicode): a = a.decode(encoding) + r.append(a) + return r def parse_launcher_args(args): + """ Process the given argument list into two objects, the first part being + a namespace of checked arguments to the interpreter itself, and the rest + being the Python program it will run and its arguments. 
+ """ class Namespace(object): pass parsed = Namespace() - parsed.java = [] - parsed.properties = OrderedDict() - parsed.boot = False - parsed.jdb = False - parsed.help = False - parsed.print_requested = False - parsed.profile = False - parsed.jdb = None + parsed.boot = False # --boot flag given + parsed.jdb = False # --jdb flag given + parsed.help = False # --help or -h flag given + parsed.print_requested = False # --print flag given + parsed.profile = False # --profile flag given + parsed.properties = OrderedDict() # properties to give the JVM + parsed.java = [] # any other arguments to give the JVM it = iter(args) next(it) # ignore sys.argv[0] @@ -42,11 +91,11 @@ arg = next(it) except StopIteration: break - if arg.startswith("-D"): - k, v = arg[2:].split("=") + if arg.startswith(u"-D"): + k, v = arg[2:].split(u"=") parsed.properties[k] = v i += 1 - elif arg in ("-J-classpath", "-J-cp"): + elif arg in (u"-J-classpath", u"-J-cp"): try: next_arg = next(it) except StopIteration: @@ -55,24 +104,24 @@ bad_option("Bad option for -J-classpath") parsed.classpath = next_arg i += 2 - elif arg.startswith("-J-Xmx"): + elif arg.startswith(u"-J-Xmx"): parsed.mem = arg[2:] i += 1 - elif arg.startswith("-J-Xss"): + elif arg.startswith(u"-J-Xss"): parsed.stack = arg[2:] i += 1 - elif arg.startswith("-J"): + elif arg.startswith(u"-J"): parsed.java.append(arg[2:]) i += 1 - elif arg == "--print": + elif arg == u"--print": parsed.print_requested = True i += 1 - elif arg in ("-h", "--help"): + elif arg in (u"-h", u"--help"): parsed.help = True - elif arg in ("--boot", "--jdb", "--profile"): + elif arg in (u"--boot", u"--jdb", u"--profile"): setattr(parsed, arg[2:], True) i += 1 - elif arg == "--": + elif arg == u"--": i += 1 break else: @@ -92,13 +141,13 @@ if hasattr(self, "_uname"): return self._uname if is_windows: - self._uname = "windows" + self._uname = u"windows" else: uname = subprocess.check_output(["uname"]).strip().lower() if uname.startswith("cygwin"): - self._uname = 
"cygwin" + self._uname = u"cygwin" else: - self._uname = uname + self._uname = uname.decode(ENCODING) return self._uname @property @@ -114,22 +163,23 @@ return self._java_command def setup_java_command(self): + """ Sets java_home and java_command according to environment and parsed + launcher arguments --jdb and --help. + """ if self.args.help: self._java_home = None - self._java_command = "java" + self._java_command = u"java" return - - if "JAVA_HOME" not in os.environ: - self._java_home = None - self._java_command = "jdb" if self.args.jdb else "java" + + command = u"jdb" if self.args.jdb else u"java" + + self._java_home = get_env("JAVA_HOME") + if self._java_home is None or self.uname == u"cygwin": + # Assume java or jdb on the path + self._java_command = command else: - self._java_home = os.environ["JAVA_HOME"] - if self.uname == "cygwin": - self._java_command = "jdb" if self.args.jdb else "java" - else: - self._java_command = os.path.join( - self.java_home, "bin", - "jdb" if self.args.jdb else "java") + # Assume java or jdb in JAVA_HOME/bin + self._java_command = os.path.join(self._java_home, u"bin", command) @property def executable(self): @@ -139,28 +189,37 @@ # Modified from # http://stackoverflow.com/questions/3718657/how-to-properly-determine-current-script-directory-in-python/22881871#22881871 if getattr(sys, "frozen", False): # py2exe, PyInstaller, cx_Freeze - path = os.path.abspath(sys.executable) + # Frozen. Let it go with the executable path. + bytes_path = sys.executable else: - def inspect_this(): pass - path = inspect.getabsfile(inspect_this) - self._executable = os.path.realpath(path) + # Not frozen. Any object defined in this file will do. + bytes_path = inspect.getfile(JythonCommand) + # Python 2 thinks in bytes. Carefully normalise in Unicode. + path = os.path.realpath(bytes_path.decode(ENCODING)) + try: + # If possible, make this relative to the CWD. + # This helps manage multi-byte names in installation location. 
+ path = os.path.relpath(path, os.getcwdu()) + except ValueError: + # Many reasons why this might be impossible: use an absolute path. + path = os.path.abspath(path) + self._executable = path return self._executable @property def jython_home(self): if hasattr(self, "_jython_home"): return self._jython_home - if "JYTHON_HOME" in os.environ: - self._jython_home = os.environ["JYTHON_HOME"] - else: - self._jython_home = os.path.dirname(os.path.dirname(self.executable)) - if self.uname == "cygwin": - self._jython_home = subprocess.check_output(["cygpath", "--windows", self._jython_home]).strip() + self._jython_home = get_env("JYTHON_HOME") or os.path.dirname( + os.path.dirname(self.executable)) + if self.uname == u"cygwin": + # Even on Cygwin, we need a Windows-style path for this + home = unicode_subprocess(["cygpath", "--windows", home]) return self._jython_home @property def jython_opts(): - return os.environ.get("JYTHON_OPTS", "") + return get_env("JYTHON_OPTS", "") @property def classpath_delimiter(self): @@ -179,11 +238,9 @@ else: jars.append(os.path.join(self.jython_home, "javalib", "*")) elif not os.path.exists(os.path.join(self.jython_home, "jython.jar")): - bad_option("""{jython_home} contains neither jython-dev.jar nor jython.jar. + bad_option(u"""{} contains neither jython-dev.jar nor jython.jar. 
Try running this script from the 'bin' directory of an installed Jython or -setting {envvar_specifier}JYTHON_HOME.""".format( - jython_home=self.jython_home, - envvar_specifier="%" if self.uname == "windows" else "$")) +setting JYTHON_HOME.""".format(self.jython_home)) else: jars = [os.path.join(self.jython_home, "jython.jar")] self._jython_jars = jars @@ -194,14 +251,14 @@ if hasattr(self.args, "classpath"): return self.args.classpath else: - return os.environ.get("CLASSPATH", ".") + return get_env("CLASSPATH", ".") @property def java_mem(self): if hasattr(self.args, "mem"): return self.args.mem else: - return os.environ.get("JAVA_MEM", "-Xmx512m") + return get_env("JAVA_MEM", "-Xmx512m") @property def java_stack(self): @@ -213,7 +270,7 @@ @property def java_opts(self): return [self.java_mem, self.java_stack] - + @property def java_profile_agent(self): return os.path.join(self.jython_home, "javalib", "profile.jar") @@ -222,68 +279,84 @@ if "JAVA_ENCODING" not in os.environ and self.uname == "darwin" and "file.encoding" not in self.args.properties: self.args.properties["file.encoding"] = "UTF-8" - def convert(self, arg): - if sys.stdout.encoding: - return arg.encode(sys.stdout.encoding) - else: - return arg - def make_classpath(self, jars): return self.classpath_delimiter.join(jars) def convert_path(self, arg): - if self.uname == "cygwin": - if not arg.startswith("/cygdrive/"): - new_path = self.convert(arg).replace("/", "\\") + if self.uname == u"cygwin": + if not arg.startswith(u"/cygdrive/"): + return arg.replace(u"/", u"\\") else: - new_path = subprocess.check_output(["cygpath", "-pw", self.convert(arg)]).strip() - return new_path + arg = arg.replace('*', r'\*') # prevent globbing + return unicode_subprocess(["cygpath", "-pw", arg]) else: - return self.convert(arg) + return arg + + def unicode_subprocess(self, unicode_command): + """ Launch a command with subprocess.check_output() and read the + output, except everything is expected to be in Unicode. 
+ """ + cmd = [] + for c in unicode_command: + if isinstance(c, bytes): + cmd.append(c) + else: + cmd.append(c.encode(ENCODING)) + return subprocess.check_output(cmd).strip().decode(ENCODING) @property def command(self): + # Set default file encoding for just for Darwin (?) self.set_encoding() + + # Begin to build the Java part of the ultimate command args = [self.java_command] args.extend(self.java_opts) args.extend(self.args.java) + # Get the class path right (depends on --boot) classpath = self.java_classpath jython_jars = self.jython_jars if self.args.boot: - args.append("-Xbootclasspath/a:%s" % self.convert_path(self.make_classpath(jython_jars))) + args.append(u"-Xbootclasspath/a:%s" % self.convert_path(self.make_classpath(jython_jars))) else: classpath = self.make_classpath(jython_jars) + self.classpath_delimiter + classpath - args.extend(["-classpath", self.convert_path(classpath)]) + args.extend([u"-classpath", self.convert_path(classpath)]) if "python.home" not in self.args.properties: - args.append("-Dpython.home=%s" % self.convert_path(self.jython_home)) + args.append(u"-Dpython.home=%s" % self.convert_path(self.jython_home)) if "python.executable" not in self.args.properties: - args.append("-Dpython.executable=%s" % self.convert_path(self.executable)) + args.append(u"-Dpython.executable=%s" % self.convert_path(self.executable)) if "python.launcher.uname" not in self.args.properties: - args.append("-Dpython.launcher.uname=%s" % self.uname) - # Determines whether running on a tty for the benefit of + args.append(u"-Dpython.launcher.uname=%s" % self.uname) + + # Determine whether running on a tty for the benefit of # running on Cygwin. This step is needed because the Mintty # terminal emulator doesn't behave like a standard Microsoft # Windows tty, and so JNR Posix doesn't detect it properly. 
if "python.launcher.tty" not in self.args.properties: - args.append("-Dpython.launcher.tty=%s" % str(os.isatty(sys.stdin.fileno())).lower()) - if self.uname == "cygwin" and "python.console" not in self.args.properties: - args.append("-Dpython.console=org.python.core.PlainConsole") + args.append(u"-Dpython.launcher.tty=%s" % str(os.isatty(sys.stdin.fileno())).lower()) + if self.uname == u"cygwin" and "python.console" not in self.args.properties: + args.append(u"-Dpython.console=org.python.core.PlainConsole") + if self.args.profile: - args.append("-XX:-UseSplitVerifier") - args.append("-javaagent:%s" % self.convert_path(self.java_profile_agent)) + args.append(u"-XX:-UseSplitVerifier") + args.append(u"-javaagent:%s" % self.convert_path(self.java_profile_agent)) + for k, v in self.args.properties.iteritems(): - args.append("-D%s=%s" % (self.convert(k), self.convert(v))) - args.append("org.python.util.jython") + args.append(u"-D%s=%s" % (k, v)) + + args.append(u"org.python.util.jython") + if self.args.help: - args.append("--help") + args.append(u"--help") + args.extend(self.jython_args) return args def bad_option(msg): - print >> sys.stderr, """ + print >> sys.stderr, u""" {msg} usage: jython [option] ... [-c cmd | -m mod | file | -] [arg] ... Try `jython -h' for more information. @@ -312,19 +385,24 @@ """ def support_java_opts(args): + """ Generator from options intended for the JVM. Options beginning -D go + through unchanged, others are prefixed with -J. + """ + # Input is expected to be Unicode, but just in case ... 
+ if isinstance(args, bytes): args = args.decode(ENCODING) it = iter(args) while it: arg = next(it) - if arg.startswith("-D"): + if arg.startswith(u"-D"): yield arg - elif arg in ("-classpath", "-cp"): - yield "-J" + arg + elif arg in (u"-classpath", u"-cp"): + yield u"-J" + arg try: yield next(it) except StopIteration: bad_option("Argument expected for -classpath option in JAVA_OPTS") else: - yield "-J" + arg + yield u"-J" + arg # copied from subprocess module in Jython; see @@ -378,37 +456,36 @@ return argv - -def decode_args(sys_args): - args = [sys_args[0]] - - def get_env_opts(envvar): - opts = os.environ.get(envvar, "") - if is_windows: - return cmdline2list(opts) - else: - return shlex.split(opts) - - java_opts = get_env_opts("JAVA_OPTS") - jython_opts = get_env_opts("JYTHON_OPTS") - - args.extend(support_java_opts(java_opts)) - args.extend(sys_args[1:]) - - if sys.stdout.encoding: - if sys.stdout.encoding.lower() == "cp65001": - sys.exit("""Jython does not support code page 65001 (CP_UTF8). -Please try another code page by setting it with the chcp command.""") - args = [arg.decode(sys.stdout.encoding) for arg in args] - jython_opts = [arg.decode(sys.stdout.encoding) for arg in jython_opts] - - return args, jython_opts - +def get_env_opts(envvar): + """ Return a list of the values in the named environment variable, + split according to shell conventions, and decoded to Unicode. + """ + opts = os.environ.get(envvar, "") # bytes at this point + if is_windows: + opts = cmdline2list(opts) + else: + opts = shlex.split(opts) + return decode_list(opts) def main(sys_args): - sys_args, jython_opts = decode_args(sys_args) + # The entire program must work in Unicode + sys_args = decode_list(sys_args) + + # sys_args[0] is this script (which we'll replace with 'java' eventually). + # Insert options for the java command from the environment. 
+ sys_args[1:1] = support_java_opts(get_env_opts("JAVA_OPTS")) + + # Parse the composite arguments (yes, even the ones from JAVA_OPTS), + # and return the "unparsed" tail considered arguments for Jython itself. args, jython_args = parse_launcher_args(sys_args) + + # Build the data from which we can generate the command ultimately. + # Jython options supplied from the environment stand in front of the + # unparsed tail from the command line. + jython_opts = get_env_opts("JYTHON_OPTS") jython_command = JythonCommand(args, jython_opts + jython_args) + + # This is the "fully adjusted" command to launch, but still as Unicode. command = jython_command.command if args.profile and not args.help: @@ -416,23 +493,32 @@ os.unlink("profile.txt") except OSError: pass + if args.print_requested and not args.help: - if jython_command.uname == "windows": - print subprocess.list2cmdline(jython_command.command) + if jython_command.uname == u"windows": + # Add escapes and quotes necessary to Windows. + # Normally used for a byte strings but Python is tolerant :) + command_line = subprocess.list2cmdline(command) else: - print " ".join(pipes.quote(arg) for arg in jython_command.command) + # Just concatenate with spaces + command_line = u" ".join(command) + # It is possible the Unicode cannot be encoded for the console + enc = sys.stdout.encoding or 'ascii' + sys.stdout.write(command_line.encode(enc, 'replace')) else: - if not (is_windows or not hasattr(os, "execvp") or args.help or jython_command.uname == "cygwin"): + if not (is_windows or not hasattr(os, "execvp") or args.help or + jython_command.uname == u"cygwin"): # Replace this process with the java process. # # NB such replacements actually do not work under Windows, # but if tried, they also fail very badly by hanging. # So don't even try! 
+ command = encode_list(command) os.execvp(command[0], command[1:]) else: result = 1 try: - result = subprocess.call(command) + result = subprocess.call(encode_list(command)) if args.help: print_help() except KeyboardInterrupt: -- Repository URL: https://hg.python.org/jython From jython-checkins at python.org Sun May 21 05:06:52 2017 From: jython-checkins at python.org (jeff.allen) Date: Sun, 21 May 2017 09:06:52 +0000 Subject: [Jython-checkins] jython: Use UTF-8 for file paths expressed in bytes. Message-ID: <20170521090144.68092.7CF14BACD08F5C52@psf.io> https://hg.python.org/jython/rev/1888a0b15f81 changeset: 8084:1888a0b15f81 user: Jeff Allen date: Thu Apr 20 23:20:46 2017 +0100 summary: Use UTF-8 for file paths expressed in bytes. This fairly extensive change regularises the approach to file and path names in the interests of handling non-ascii paths correctly. See notes to issue #2356. We are not finished with the consequential changes, but to commit work so far helps make it manageable. regrtest runs with 24 failed tests.
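Editorial illustration (not part of the changeset): a minimal sketch, in plain Python rather than Jython's Java, of the convention the summary describes. A path given as bytes is decoded with the file system encoding, and results go back to the caller in the same type as the argument, as the posix listdir() change in this archive does. The names fs_decode, fs_encode and names_like are hypothetical stand-ins, not Jython API, and UTF-8 is assumed as the encoding, following the commit summary.

```python
# Hypothetical illustration only; not code from the Jython repository.
FS_ENCODING = "utf-8"  # assumed, following the commit summary


def fs_decode(path):
    """Return path as text, decoding bytes with the file system encoding."""
    if isinstance(path, bytes):
        return path.decode(FS_ENCODING)
    return path


def fs_encode(path):
    """Return path as bytes, encoding text with the file system encoding."""
    if isinstance(path, bytes):
        return path
    return path.encode(FS_ENCODING)


def names_like(path, names):
    """Return names as bytes or text, matching the type of path."""
    if isinstance(path, bytes):
        return [fs_encode(n) for n in names]
    return [fs_decode(n) for n in names]
```

With these helpers, a bytes argument yields bytes names and a text argument yields text names, so a caller never receives a mixture of the two types.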
files: CPythonLib.includes | 1 + Lib/ntpath.py | 560 ---------- Lib/subprocess.py | 38 +- src/org/python/core/Py.java | 134 ++- src/org/python/core/PyBytecode.java | 9 +- src/org/python/core/PyFile.java | 4 - src/org/python/core/PyNullImporter.java | 13 +- src/org/python/core/PySystemState.java | 53 +- src/org/python/core/PyTableCode.java | 6 +- src/org/python/core/StdoutWrapper.java | 3 +- src/org/python/core/imp.java | 13 +- src/org/python/core/io/FileIO.java | 10 +- src/org/python/modules/_imp.java | 30 +- src/org/python/modules/posix/PosixModule.java | 18 +- 14 files changed, 224 insertions(+), 668 deletions(-) diff --git a/CPythonLib.includes b/CPythonLib.includes --- a/CPythonLib.includes +++ b/CPythonLib.includes @@ -110,6 +110,7 @@ netrc.py nntplib.py numbers.py +ntpath.py nturl2path.py opcode.py optparse.py diff --git a/Lib/ntpath.py b/Lib/ntpath.py deleted file mode 100644 --- a/Lib/ntpath.py +++ /dev/null @@ -1,560 +0,0 @@ -# Module 'ntpath' -- common operations on WinNT/Win95 pathnames -"""Common pathname manipulations, WindowsNT/95 version. - -Instead of importing this module directly, import os and refer to this -module as os.path. -""" - -import os -import sys -import stat -import genericpath -import warnings - -from genericpath import * - -__all__ = ["normcase","isabs","join","splitdrive","split","splitext", - "basename","dirname","commonprefix","getsize","getmtime", - "getatime","getctime", "islink","exists","lexists","isdir","isfile", - "ismount","walk","expanduser","expandvars","normpath","abspath", - "splitunc","curdir","pardir","sep","pathsep","defpath","altsep", - "extsep","devnull","realpath","supports_unicode_filenames","relpath"] - -# strings representing various path-related bits and pieces -curdir = '.' -pardir = '..' -extsep = '.' 
-sep = '\\' -pathsep = ';' -altsep = '/' -defpath = '.;C:\\bin' -if 'ce' in sys.builtin_module_names: - defpath = '\\Windows' -elif 'os2' in sys.builtin_module_names: - # OS/2 w/ VACPP - altsep = '/' -devnull = 'nul' - -# Normalize the case of a pathname and map slashes to backslashes. -# Other normalizations (such as optimizing '../' away) are not done -# (this is done by normpath). - -def normcase(s): - """Normalize case of pathname. - - Makes all characters lowercase and all slashes into backslashes.""" - return s.replace("/", "\\").lower() - - -# Return whether a path is absolute. -# Trivial in Posix, harder on the Mac or MS-DOS. -# For DOS it is absolute if it starts with a slash or backslash (current -# volume), or if a pathname after the volume letter and colon / UNC resource -# starts with a slash or backslash. - -def isabs(s): - """Test whether a path is absolute""" - s = splitdrive(s)[1] - return s != '' and s[:1] in '/\\' - - -# Join two (or more) paths. - -def join(a, *p): - """Join two or more pathname components, inserting "\\" as needed. - If any component is an absolute path, all previous path components - will be discarded.""" - path = a - for b in p: - b_wins = 0 # set to 1 iff b makes path irrelevant - if path == "": - b_wins = 1 - - elif isabs(b): - # This probably wipes out path so far. However, it's more - # complicated if path begins with a drive letter: - # 1. join('c:', '/a') == 'c:/a' - # 2. join('c:/', '/a') == 'c:/a' - # But - # 3. join('c:/a', '/b') == '/b' - # 4. join('c:', 'd:/') = 'd:/' - # 5. join('c:/', 'd:/') = 'd:/' - if path[1:2] != ":" or b[1:2] == ":": - # Path doesn't start with a drive letter, or cases 4 and 5. - b_wins = 1 - - # Else path has a drive letter, and b doesn't but is absolute. - elif len(path) > 3 or (len(path) == 3 and - path[-1] not in "/\\"): - # case 3 - b_wins = 1 - - if b_wins: - path = b - else: - # Join, and ensure there's a separator. 
- assert len(path) > 0 - if path[-1] in "/\\": - if b and b[0] in "/\\": - path += b[1:] - else: - path += b - elif path[-1] == ":": - path += b - elif b: - if b[0] in "/\\": - path += b - else: - path += "\\" + b - else: - # path is not empty and does not end with a backslash, - # but b is empty; since, e.g., split('a/') produces - # ('a', ''), it's best if join() adds a backslash in - # this case. - path += '\\' - - return path - - -# Split a path in a drive specification (a drive letter followed by a -# colon) and the path specification. -# It is always true that drivespec + pathspec == p -def splitdrive(p): - """Split a pathname into drive and path specifiers. Returns a 2-tuple -"(drive,path)"; either part may be empty""" - if p[1:2] == ':': - return p[0:2], p[2:] - return '', p - - -# Parse UNC paths -def splitunc(p): - """Split a pathname into UNC mount point and relative path specifiers. - - Return a 2-tuple (unc, rest); either part may be empty. - If unc is not empty, it has the form '//host/mount' (or similar - using backslashes). unc+rest is always the input path. - Paths containing drive letters never have an UNC part. - """ - if p[1:2] == ':': - return '', p # Drive letter present - firstTwo = p[0:2] - if firstTwo == '//' or firstTwo == '\\\\': - # is a UNC path: - # vvvvvvvvvvvvvvvvvvvv equivalent to drive letter - # \\machine\mountpoint\directories... - # directory ^^^^^^^^^^^^^^^ - normp = normcase(p) - index = normp.find('\\', 2) - if index == -1: - ##raise RuntimeError, 'illegal UNC path: "' + p + '"' - return ("", p) - index = normp.find('\\', index + 1) - if index == -1: - index = len(p) - return p[:index], p[index:] - return '', p - - -# Split a path in head (everything up to the last '/') and tail (the -# rest). After the trailing '/' is stripped, the invariant -# join(head, tail) == p holds. -# The resulting head won't end in '/' unless it is the root. - -def split(p): - """Split a pathname. 
- - Return tuple (head, tail) where tail is everything after the final slash. - Either part may be empty.""" - - d, p = splitdrive(p) - # set i to index beyond p's last slash - i = len(p) - while i and p[i-1] not in '/\\': - i = i - 1 - head, tail = p[:i], p[i:] # now tail has no slashes - # remove trailing slashes from head, unless it's all slashes - head2 = head - while head2 and head2[-1] in '/\\': - head2 = head2[:-1] - head = head2 or head - return d + head, tail - - -# Split a path in root and extension. -# The extension is everything starting at the last dot in the last -# pathname component; the root is everything before that. -# It is always true that root + ext == p. - -def splitext(p): - return genericpath._splitext(p, sep, altsep, extsep) -splitext.__doc__ = genericpath._splitext.__doc__ - - -# Return the tail (basename) part of a path. - -def basename(p): - """Returns the final component of a pathname""" - return split(p)[1] - - -# Return the head (dirname) part of a path. - -def dirname(p): - """Returns the directory component of a pathname""" - return split(p)[0] - -# Is a path a symbolic link? -# This will always return false on systems where posix.lstat doesn't exist. - -def islink(path): - """Test for symbolic link. - On WindowsNT/95 and OS/2 always returns false - """ - return False - -# alias exists to lexists -lexists = exists - -# Is a path a mount point? Either a root (with or without drive letter) -# or an UNC path with at most a / or \ after the mount point. - -def ismount(path): - """Test whether a path is a mount point (defined as root of drive)""" - unc, rest = splitunc(path) - if unc: - return rest in ("", "/", "\\") - p = splitdrive(path)[1] - return len(p) == 1 and p[0] in '/\\' - - -# Directory tree walk. -# For each directory under top (including top itself, but excluding -# '.' 
and '..'), func(arg, dirname, filenames) is called, where -# dirname is the name of the directory and filenames is the list -# of files (and subdirectories etc.) in the directory. -# The func may modify the filenames list, to implement a filter, -# or to impose a different order of visiting. - -def walk(top, func, arg): - """Directory tree walk with callback function. - - For each directory in the directory tree rooted at top (including top - itself, but excluding '.' and '..'), call func(arg, dirname, fnames). - dirname is the name of the directory, and fnames a list of the names of - the files and subdirectories in dirname (excluding '.' and '..'). func - may modify the fnames list in-place (e.g. via del or slice assignment), - and walk will only recurse into the subdirectories whose names remain in - fnames; this can be used to implement a filter, or to impose a specific - order of visiting. No semantics are defined for, or required of, arg, - beyond that arg is always passed to func. It can be used, e.g., to pass - a filename pattern, or a mutable object designed to accumulate - statistics. Passing None for arg is common.""" - warnings.warnpy3k("In 3.x, os.path.walk is removed in favor of os.walk.", - stacklevel=2) - try: - names = os.listdir(top) - except os.error: - return - func(arg, top, names) - for name in names: - name = join(top, name) - if isdir(name): - walk(name, func, arg) - - -# Expand paths beginning with '~' or '~user'. -# '~' means $HOME; '~user' means that user's home directory. -# If the path doesn't begin with '~', or if the user or $HOME is unknown, -# the path is returned unchanged (leaving error reporting to whatever -# function is called with the expanded path as argument). -# See also module 'glob' for expansion of *, ? and [...] in pathnames. -# (A function should also be defined to do full *sh-style environment -# variable expansion.) - -def expanduser(path): - """Expand ~ and ~user constructs. 
- - If user or $HOME is unknown, do nothing.""" - if path[:1] != '~': - return path - i, n = 1, len(path) - while i < n and path[i] not in '/\\': - i = i + 1 - - if 'HOME' in os.environ: - userhome = os.environ['HOME'] - elif 'USERPROFILE' in os.environ: - userhome = os.environ['USERPROFILE'] - elif not 'HOMEPATH' in os.environ: - return path - else: - try: - drive = os.environ['HOMEDRIVE'] - except KeyError: - drive = '' - userhome = join(drive, os.environ['HOMEPATH']) - - if i != 1: #~user - userhome = join(dirname(userhome), path[1:i]) - - return userhome + path[i:] - - -# Expand paths containing shell variable substitutions. -# The following rules apply: -# - no expansion within single quotes -# - '$$' is translated into '$' -# - '%%' is translated into '%' if '%%' are not seen in %var1%%var2% -# - ${varname} is accepted. -# - $varname is accepted. -# - %varname% is accepted. -# - varnames can be made out of letters, digits and the characters '_-' -# (though is not verifed in the ${varname} and %varname% cases) -# XXX With COMMAND.COM you can use any characters in a variable name, -# XXX except '^|<>='. - -def expandvars(path): - """Expand shell variables of the forms $var, ${var} and %var%. 
- - Unknown variables are left unchanged.""" - if '$' not in path and '%' not in path: - return path - import string - varchars = string.ascii_letters + string.digits + '_-' - res = '' - index = 0 - pathlen = len(path) - while index < pathlen: - c = path[index] - if c == '\'': # no expansion within single quotes - path = path[index + 1:] - pathlen = len(path) - try: - index = path.index('\'') - res = res + '\'' + path[:index + 1] - except ValueError: - res = res + path - index = pathlen - 1 - elif c == '%': # variable or '%' - if path[index + 1:index + 2] == '%': - res = res + c - index = index + 1 - else: - path = path[index+1:] - pathlen = len(path) - try: - index = path.index('%') - except ValueError: - res = res + '%' + path - index = pathlen - 1 - else: - var = path[:index] - if var in os.environ: - res = res + os.environ[var] - else: - res = res + '%' + var + '%' - elif c == '$': # variable or '$$' - if path[index + 1:index + 2] == '$': - res = res + c - index = index + 1 - elif path[index + 1:index + 2] == '{': - path = path[index+2:] - pathlen = len(path) - try: - index = path.index('}') - var = path[:index] - if var in os.environ: - res = res + os.environ[var] - else: - res = res + '${' + var + '}' - except ValueError: - res = res + '${' + path - index = pathlen - 1 - else: - var = '' - index = index + 1 - c = path[index:index + 1] - while c != '' and c in varchars: - var = var + c - index = index + 1 - c = path[index:index + 1] - if var in os.environ: - res = res + os.environ[var] - else: - res = res + '$' + var - if c != '': - index = index - 1 - else: - res = res + c - index = index + 1 - return res - - -# Normalize a path, e.g. A//B, A/./B and A/foo/../B all become A\B. -# Previously, this function also truncated pathnames to 8+3 format, -# but as this module is called "ntpath", that's obviously wrong! 
- -def normpath(path): - """Normalize path, eliminating double slashes, etc.""" - # Preserve unicode (if path is unicode) - backslash, dot = (u'\\', u'.') if isinstance(path, unicode) else ('\\', '.') - if path.startswith(('\\\\.\\', '\\\\?\\')): - # in the case of paths with these prefixes: - # \\.\ -> device names - # \\?\ -> literal paths - # do not do any normalization, but return the path unchanged - return path - path = path.replace("/", "\\") - prefix, path = splitdrive(path) - # We need to be careful here. If the prefix is empty, and the path starts - # with a backslash, it could either be an absolute path on the current - # drive (\dir1\dir2\file) or a UNC filename (\\server\mount\dir1\file). It - # is therefore imperative NOT to collapse multiple backslashes blindly in - # that case. - # The code below preserves multiple backslashes when there is no drive - # letter. This means that the invalid filename \\\a\b is preserved - # unchanged, where a\\\b is normalised to a\b. It's not clear that there - # is any better behaviour for such edge cases. - if prefix == '': - # No drive letter - preserve initial backslashes - while path[:1] == "\\": - prefix = prefix + backslash - path = path[1:] - else: - # We have a drive letter - collapse initial backslashes - if path.startswith("\\"): - prefix = prefix + backslash - path = path.lstrip("\\") - comps = path.split("\\") - i = 0 - while i < len(comps): - if comps[i] in ('.', ''): - del comps[i] - elif comps[i] == '..': - if i > 0 and comps[i-1] != '..': - del comps[i-1:i+1] - i -= 1 - elif i == 0 and prefix.endswith("\\"): - del comps[i] - else: - i += 1 - else: - i += 1 - # If the path is now empty, substitute '.' - if not prefix and not comps: - comps.append(dot) - return prefix + backslash.join(comps) - - -# Return an absolute path. 
-try: - from nt import _getfullpathname - -except ImportError: # no built-in nt module - maybe it's Jython ;) - - if os._name == 'nt' : - # on Windows so Java version of sys deals in NT paths - def abspath(path): - """Return the absolute version of a path.""" - try: - if isinstance(path, unicode): - # Result must be unicode - if path: - path = sys.getPath(path) - else: - # Empty path must return current working directory - path = os.getcwdu() - else: - # Result must be bytes - if path: - path = sys.getPath(path).encode('latin-1') - else: - # Empty path must return current working directory - path = os.getcwd() - except EnvironmentError: - pass # Bad path - return unchanged. - return normpath(path) - - else: - # not running on Windows - mock up something sensible - def abspath(path): - """Return the absolute version of a path.""" - try: - if isinstance(path, unicode): - # Result must be unicode - if path: - path = join(os.getcwdu(), path) - else: - # Empty path must return current working directory - path = os.getcwdu() - else: - # Result must be bytes - if path: - path = join(os.getcwd(), path) - else: - # Empty path must return current working directory - path = os.getcwd() - except EnvironmentError: - pass # Bad path - return unchanged. - return normpath(path) - -else: # use native Windows method on Windows - def abspath(path): - """Return the absolute version of a path.""" - - if path: # Empty path must return current working directory. - try: - path = _getfullpathname(path) - except WindowsError: - pass # Bad path - return unchanged. - elif isinstance(path, unicode): - path = os.getcwdu() - else: - path = os.getcwd() - return normpath(path) - -# realpath is a no-op on systems without islink support -realpath = abspath -# Win9x family and earlier have no Unicode filename support. 
-supports_unicode_filenames = (hasattr(sys, "getwindowsversion") and - sys.getwindowsversion()[3] >= 2) - -def _abspath_split(path): - abs = abspath(normpath(path)) - prefix, rest = splitunc(abs) - is_unc = bool(prefix) - if not is_unc: - prefix, rest = splitdrive(abs) - return is_unc, prefix, [x for x in rest.split(sep) if x] - -def relpath(path, start=curdir): - """Return a relative version of a path""" - - if not path: - raise ValueError("no path specified") - - start_is_unc, start_prefix, start_list = _abspath_split(start) - path_is_unc, path_prefix, path_list = _abspath_split(path) - - if path_is_unc ^ start_is_unc: - raise ValueError("Cannot mix UNC and non-UNC paths (%s and %s)" - % (path, start)) - if path_prefix.lower() != start_prefix.lower(): - if path_is_unc: - raise ValueError("path is on UNC root %s, start on UNC root %s" - % (path_prefix, start_prefix)) - else: - raise ValueError("path is on drive %s, start on drive %s" - % (path_prefix, start_prefix)) - # Work out how much of the filepath is shared by start and path. - i = 0 - for e1, e2 in zip(start_list, path_list): - if e1.lower() != e2.lower(): - break - i += 1 - - rel_list = [pardir] * (len(start_list)-i) + path_list[i:] - if not rel_list: - return curdir - return join(*rel_list) diff --git a/Lib/subprocess.py b/Lib/subprocess.py --- a/Lib/subprocess.py +++ b/Lib/subprocess.py @@ -438,6 +438,7 @@ import java.nio.ByteBuffer import org.python.core.io.RawIOBase import org.python.core.io.StreamIO + from org.python.core.Py import fileSystemDecode else: import select _has_poll = hasattr(select, 'poll') @@ -779,7 +780,7 @@ maintain those byte values (which may be butchered as Strings) for the subprocess if they haven't been modified. 
""" - # Determine what's safe to merge + # Determine what's necessary to merge (new or different) merge_env = dict((key, value) for key, value in env.iteritems() if key not in builder_env or builder_env.get(key) != value) @@ -789,8 +790,10 @@ for entry in entries: if entry.getKey() not in env: entries.remove() - - builder_env.putAll(merge_env) + # add anything new or different in env + for key, value in merge_env.iteritems(): + # If the new value is bytes, assume it to be FS-encoded + builder_env.put(key, fileSystemDecode(value)) class Popen(object): @@ -1308,9 +1311,6 @@ args = _cmdline2listimpl(args) else: args = list(args) - # NOTE: CPython posix (execv) will str() any unicode - # args first, maybe we should do the same on - # posix. Windows passes unicode through, however if any(not isinstance(arg, (str, unicode)) for arg in args): raise TypeError('args must contain only strings') args = _escape_args(args) @@ -1321,6 +1321,11 @@ if executable is not None: args[0] = executable + # NOTE: CPython posix (execv) will FS-encode any unicode args, but + # pass on bytes unchanged, because that's what the system expects. + # Java expects unicode, so we do the converse: leave unicode + # unchanged but FS-decode any supplied as bytes. + args = [fileSystemDecode(arg) for arg in args] builder = java.lang.ProcessBuilder(args) if stdin is None: @@ -1330,16 +1335,20 @@ if stderr is None: builder.redirectError(java.lang.ProcessBuilder.Redirect.INHERIT) - # os.environ may be inherited for compatibility with CPython + # os.environ may be inherited for compatibility with CPython. + # Elements taken from os.environ are FS-decoded to unicode. _setup_env(dict(os.environ if env is None else env), builder.environment()) + # The current working directory must also be unicode. 
if cwd is None: - cwd = os.getcwd() - elif not os.path.exists(cwd): - raise OSError(errno.ENOENT, os.strerror(errno.ENOENT), cwd) - elif not os.path.isdir(cwd): - raise OSError(errno.ENOTDIR, os.strerror(errno.ENOTDIR), cwd) + cwd = os.getcwdu() + else: + cwd = fileSystemDecode(cwd) + if not os.path.exists(cwd): + raise OSError(errno.ENOENT, os.strerror(errno.ENOENT), cwd) + elif not os.path.isdir(cwd): + raise OSError(errno.ENOTDIR, os.strerror(errno.ENOTDIR), cwd) builder.directory(java.io.File(cwd)) # Let Java manage redirection of stderr to stdout (it's more @@ -1890,9 +1899,10 @@ args = _cmdline2listimpl(command) args = _escape_args(args) args = _shell_command + args - cwd = os.getcwd() + cwd = os.getcwdu() - + # Python supplies FS-encoded arguments while Java expects String + args = [fileSystemDecode(arg) for arg in args] builder = java.lang.ProcessBuilder(args) builder.directory(java.io.File(cwd)) diff --git a/src/org/python/core/Py.java b/src/org/python/core/Py.java --- a/src/org/python/core/Py.java +++ b/src/org/python/core/Py.java @@ -84,6 +84,7 @@ throw new StreamCorruptedException("unknown singleton: " + which); } } + /* Holds the singleton None and Ellipsis objects */ /** The singleton None Python object **/ public final static PyObject None = new PyNone(); @@ -222,6 +223,10 @@ return new PyException(Py.IOError, args); } + public static PyException IOError(Constant errno, String filename) { + return new PyException(Py.IOError, Py.fileSystemEncode(filename)); // XXX newStringOrUnicode? + } + public static PyException IOError(Constant errno, PyObject filename) { int value = errno.intValue(); PyObject args = new PyTuple(Py.newInteger(value), PosixModule.strerror(value), filename); @@ -683,6 +688,103 @@ } } + /** + * Return a file name or path as Unicode (Java UTF-16 String), decoded if necessary + * from a Python bytes object, using the file system encoding. In Jython, this + * encoding is UTF-8, irrespective of the OS platform. 
This method is comparable with Python 3 + * os.fsdecode, but for Java use, in places such as the os module. If + * the argument is not a PyUnicode, it will be decoded using the nominal Jython + * file system encoding. If the argument is a PyUnicode, its + * String is returned. + * + * @param filename as bytes to decode, or already as unicode + * @return unicode version of path + */ + public static String fileSystemDecode(PyString filename) { + String s = filename.getString(); + if (filename instanceof PyUnicode || CharMatcher.ascii().matchesAllOf(s)) { + // Already encoded or usable as ASCII + return s; + } else { + // It's bytes, so must decode properly + assert "utf-8".equals(PySystemState.FILE_SYSTEM_ENCODING.toString()); + return codecs.PyUnicode_DecodeUTF8(s, null); + } + } + + /** + * As {@link #fileSystemDecode(PyString)} but raising ValueError if not a + * str or unicode. + * + * @param filename as bytes to decode, or already as unicode + * @return unicode version of the file name + */ + public static String fileSystemDecode(PyObject filename) { + if (filename instanceof PyString) { + return fileSystemDecode((PyString)filename); + } else + throw Py.TypeError(String.format("coercing to Unicode: need string, %s type found", + filename.getType().fastGetName())); + } + + /** + * Return a PyString object we can use as a file name or file path in places where Python + * expects a bytes (that is a str) object in the file system encoding. + * In Jython, this encoding is UTF-8, irrespective of the OS platform. + *

+ * This is subtly different from CPython's use of "file system encoding", which tracks the + * platform's choice so that OS services may be called that have a bytes interface. Jython's + * interaction with the OS occurs via Java using String arguments representing Unicode values, + * so we have no need to match the encoding actually chosen by the platform (e.g. 'mbcs' on + * Windows). Rather we need a nominal Jython file system encoding, for use where the standard + * library forces byte paths on us (in Python 2). There is no reason for this choice to vary + * with OS platform. Methods receiving paths as bytes will + * {@link #fileSystemDecode(PyString)} them again for Java. + * + * @param filename as unicode to encode, or already as bytes + * @return encoded bytes version of path + */ + public static PyString fileSystemEncode(String filename) { + if (CharMatcher.ascii().matchesAllOf(filename)) { + // Just wrap it as US-ASCII is a subset of the file system encoding + return Py.newString(filename); + } else { + // It's non just US-ASCII, so must encode properly + assert "utf-8".equals(PySystemState.FILE_SYSTEM_ENCODING.toString()); + return Py.newString(codecs.PyUnicode_EncodeUTF8(filename, null)); + } + } + + /** + * Return a PyString object we can use as a file name or file path in places where Python + * expects a bytes (that is, str) object in the file system encoding. + * In Jython, this encoding is UTF-8, irrespective of the OS platform. This method is comparable + * with Python 3 os.fsencode. If the argument is a PyString, it is returned + * unchanged. If the argument is a PyUnicode, it is converted to a bytes using the + * nominal Jython file system encoding. + * + * @param filename as unicode to encode, or already as bytes + * @return encoded bytes version of path + */ + public static PyString fileSystemEncode(PyString filename) { + return (filename instanceof PyUnicode) ? 
fileSystemEncode(filename.getString()) : filename; + } + + /** + * Convert a PyList path to a list of Java String objects decoded from + * the path elements to strings guaranteed usable in the Java API. + * + * @param path a Python search path + * @return equivalent Java list + */ + private static List<String> fileSystemDecode(PyList path) { + List<String> list = new ArrayList<>(path.__len__()); + for (PyObject filename : path.getList()) { + list.add(fileSystemDecode(filename)); + } + return list; + } + public static PyStringMap newStringMap() { // enable lazy bootstrapping (see issue #1671) if (!PyType.hasBuilder(PyStringMap.class)) { @@ -1282,7 +1384,7 @@ if (moduleName == null) { buf.append(""); } else { - String moduleStr = moduleName.toString(); + String moduleStr = Py.fileSystemDecode(moduleName); if (!moduleStr.equals("exceptions")) { buf.append(moduleStr); buf.append("."); @@ -1294,7 +1396,7 @@ } if (value != null && value != Py.None) { // only print colon if the str() of the object is not the empty string - PyObject s = useRepr ? value.__repr__() : value.__str__(); + PyObject s = useRepr ? 
value.__repr__() : value; if (!(s instanceof PyString) || s.__len__() != 0) { buf.append(": "); } @@ -1565,6 +1667,16 @@ } } + private static final String IMPORT_SITE_ERROR = "" + + "Cannot import site module and its dependencies: %s\n" + + "Determine if the following attributes are correct:\n" // + + " * sys.path: %s\n" + + " This attribute might be including the wrong directories, such as from CPython\n" + + " * sys.prefix: %s\n" + + " This attribute is set by the system property python.home, although it can\n" + + " be often automatically determined by the location of the Jython jar file\n\n" + + "You can use the -S option or python.import.site=false to not import the site module"; + public static boolean importSiteIfSelected() { if (Options.importSite) { try { @@ -1574,18 +1686,10 @@ } catch (PyException pye) { if (pye.match(Py.ImportError)) { PySystemState sys = Py.getSystemState(); - throw Py.ImportError(String.format("" - + "Cannot import site module and its dependencies: %s\n" - + "Determine if the following attributes are correct:\n" - + " * sys.path: %s\n" - + " This attribute might be including the wrong directories, such as from CPython\n" - + " * sys.prefix: %s\n" - + " This attribute is set by the system property python.home, although it can\n" - + " be often automatically determined by the location of the Jython jar file\n\n" - + "You can use the -S option or python.import.site=false to not import the site module", - pye.value.__getattr__("args").__getitem__(0), - sys.path, - sys.prefix)); + String value = pye.value.__getattr__("args").__getitem__(0).toString(); + List<String> path = fileSystemDecode(sys.path); + throw Py.ImportError( + String.format(IMPORT_SITE_ERROR, value, path, PySystemState.prefix)); } else { throw pye; } @@ -2266,7 +2370,7 @@ } /* Here we would actually like to call cls.__findattr__("__metaclass__") * rather than cls.getType(). However there are circumstances where the - * metaclass doesn't show up as __metaclass__. 
On the other hand we need + * metaclass doesn't show up as __metaclass__. On the other hand we need * to avoid that checker refers to builtin type___subclasscheck__ or * type___instancecheck__. Filtering out checker-instances of * PyBuiltinMethodNarrow does the trick. We also filter out PyMethodDescr diff --git a/src/org/python/core/PyBytecode.java b/src/org/python/core/PyBytecode.java --- a/src/org/python/core/PyBytecode.java +++ b/src/org/python/core/PyBytecode.java @@ -116,11 +116,13 @@ throw Py.AttributeError(name); } + @Override public void __setattr__(String name, PyObject value) { // no writable attributes throwReadonly(name); } + @Override public void __delattr__(String name) { throwReadonly(name); } @@ -137,6 +139,7 @@ return new PyTuple(pystr); } + @Override public PyObject __findattr_ex__(String name) { // have to craft co_varnames specially if (name == "co_varnames") { @@ -149,7 +152,7 @@ return toPyStringTuple(co_freevars); } if (name == "co_filename") { - return new PyString(co_filename); + return Py.fileSystemEncode(co_filename); // bytes object expected by clients } if (name == "co_name") { return new PyString(co_name); @@ -1156,7 +1159,7 @@ "zap" this information, to prevent END_FINALLY from re-raising the exception. (But non-local gotos should still be resumed.) 
- */ + */ PyObject exit; PyObject u = stack.pop(), v, w; if (u == Py.None) { @@ -1350,7 +1353,7 @@ if (why != Why.RETURN) { retval = Py.None; } - } else { + } else { // store the stack in the frame for reentry from the yield; f.f_savedlocals = stack.popN(stack.size()); } diff --git a/src/org/python/core/PyFile.java b/src/org/python/core/PyFile.java --- a/src/org/python/core/PyFile.java +++ b/src/org/python/core/PyFile.java @@ -168,10 +168,6 @@ ArgParser ap = new ArgParser("file", args, kwds, new String[] {"name", "mode", "buffering"}, 1); PyObject name = ap.getPyObject(0); - if (!(name instanceof PyString)) { - throw Py.TypeError("coercing to Unicode: need string, '" + name.getType().fastGetName() - + "' type found"); - } String mode = ap.getString(1, "r"); int bufsize = ap.getInt(2, -1); file___init__(new FileIO((PyString) name, parseMode(mode)), name, mode, bufsize); diff --git a/src/org/python/core/PyNullImporter.java b/src/org/python/core/PyNullImporter.java --- a/src/org/python/core/PyNullImporter.java +++ b/src/org/python/core/PyNullImporter.java @@ -20,7 +20,7 @@ public PyNullImporter(PyObject pathObj) { super(); - String pathStr = asPath(pathObj); + String pathStr = Py.fileSystemDecode(pathObj); if (pathStr.equals("")) { throw Py.ImportError("empty pathname"); } @@ -42,17 +42,6 @@ return Py.None; } - // FIXME Refactoring move helper function to a central util library - // FIXME Also can take in account working in zip file systems - - private static String asPath(PyObject pathObj) { - if (!(pathObj instanceof PyString)) { - throw Py.TypeError(String.format("coercing to Unicode: need string, %s type found", - pathObj.getType().fastGetName())); - } - return pathObj.toString(); - } - private static boolean isDir(String pathStr) { if (pathStr.equals("")) { return false; diff --git a/src/org/python/core/PySystemState.java b/src/org/python/core/PySystemState.java --- a/src/org/python/core/PySystemState.java +++ b/src/org/python/core/PySystemState.java @@ -82,6 
+82,9 @@ public final static PyString float_repr_style = Py.newString("short"); + /** Nominal Jython file system encoding (as sys.getfilesystemencoding()) */ + static final PyString FILE_SYSTEM_ENCODING = Py.newString("utf-8"); + public static boolean py3kwarning = false; public final static Class flags = Options.class; @@ -109,13 +112,13 @@ public static PackageManager packageManager; private static File cachedir; - private static PyList defaultPath; - private static PyList defaultArgv; - private static PyObject defaultExecutable; + private static PyList defaultPath; // list of bytes or unicode + private static PyList defaultArgv; // list of bytes or unicode + private static PyObject defaultExecutable; // bytes or unicode or None public static Properties registry; // = init_registry(); - public static PyObject prefix; - public static PyObject exec_prefix = Py.EmptyString; + public static PyObject prefix; // bytes or unicode + public static PyObject exec_prefix = Py.EmptyString; // bytes or unicode public static final PyString byteorder = new PyString("big"); public static final int maxint = Integer.MAX_VALUE; @@ -504,7 +507,7 @@ } public PyObject getfilesystemencoding() { - return Py.None; + return FILE_SYSTEM_ENCODING; } @@ -840,10 +843,10 @@ } } if (prefix != null) { - PySystemState.prefix = Py.newString(prefix); + PySystemState.prefix = Py.newStringOrUnicode(prefix); } if (exec_prefix != null) { - PySystemState.exec_prefix = Py.newString(exec_prefix); + PySystemState.exec_prefix = Py.newStringOrUnicode(exec_prefix); } try { String jythonpath = System.getenv("JYTHONPATH"); @@ -1174,16 +1177,16 @@ PyList argv = new PyList(); if (args != null) { for (String arg : args) { - argv.append(Py.newStringOrUnicode(arg)); + argv.append(Py.newStringOrUnicode(arg)); // XXX or always newUnicode? } } return argv; } /** - * Determine the default sys.executable value from the registry. 
- * If registry is not set (as in standalone jython jar), will use sys.prefix + /bin/jython(.exe) and the file may - * not exist. Users can create a wrapper in it's place to make it work in embedded environments. + * Determine the default sys.executable value from the registry. If registry is not set (as in + * standalone jython jar), we will use sys.prefix + /bin/jython(.exe) and the file may not + * exist. Users can create a wrapper in it's place to make it work in embedded environments. * Only if sys.prefix is null, returns Py.None * * @param props a Properties registry @@ -1191,26 +1194,26 @@ */ private static PyObject initExecutable(Properties props) { String executable = props.getProperty("python.executable"); - if (executable == null) { + File executableFile; + if (executable != null) { + // The executable from the registry is a Unicode String path + executableFile = new File(executable); + } else { if (prefix == null) { return Py.None; } else { - executable = prefix.asString() + File.separator + "bin" + File.separator; - if (Platform.IS_WINDOWS) { - executable += "jython.exe"; - } else { - executable += "jython"; - } + // The prefix is a unicode or encoded bytes object + executableFile = new File(Py.fileSystemDecode(prefix), + Platform.IS_WINDOWS ? 
"bin\\jython.exe" : "bin/jython"); } } - File executableFile = new File(executable); try { executableFile = executableFile.getCanonicalFile(); } catch (IOException ioe) { executableFile = executableFile.getAbsoluteFile(); } - return new PyString(executableFile.getPath()); + return Py.newStringOrUnicode(executableFile.getPath()); // XXX always bytes in CPython } /** @@ -1353,8 +1356,8 @@ PyList path = new PyList(); addPaths(path, props.getProperty("python.path", "")); if (prefix != null) { - String libpath = new File(prefix.toString(), "Lib").toString(); - path.append(new PyString(libpath)); + String libpath = new File(Py.fileSystemDecode(prefix), "Lib").toString(); + path.append(Py.fileSystemEncode(libpath)); // XXX or newStringOrUnicode or newUnicode? } if (standalone) { // standalone jython: add the /Lib directory inside JYTHON_JAR to the path @@ -1397,7 +1400,8 @@ private static void addPaths(PyList path, String pypath) { StringTokenizer tok = new StringTokenizer(pypath, java.io.File.pathSeparator); while (tok.hasMoreTokens()) { - path.append(new PyString(tok.nextToken().trim())); + // Use unicode object if necessary to represent the element + path.append(Py.newStringOrUnicode(tok.nextToken().trim())); } } @@ -1540,6 +1544,7 @@ closer.cleanup(); } + @Override public void close() { cleanup(); } public static class PySystemStateCloser { diff --git a/src/org/python/core/PyTableCode.java b/src/org/python/core/PyTableCode.java --- a/src/org/python/core/PyTableCode.java +++ b/src/org/python/core/PyTableCode.java @@ -66,6 +66,7 @@ // co_lnotab, co_stacksize }; + @Override public PyObject __dir__() { PyString members[] = new PyString[__members__.length]; for (int i = 0; i < __members__.length; i++) @@ -80,11 +81,13 @@ throw Py.AttributeError(name); } + @Override public void __setattr__(String name, PyObject value) { // no writable attributes throwReadonly(name); } + @Override public void __delattr__(String name) { throwReadonly(name); } @@ -99,6 +102,7 @@ return new 
PyTuple(pystr); } + @Override public PyObject __findattr_ex__(String name) { // have to craft co_varnames specially if (name == "co_varnames") { @@ -111,7 +115,7 @@ return toPyStringTuple(co_freevars); } if (name == "co_filename") { - return new PyString(co_filename); + return Py.fileSystemEncode(co_filename); // bytes object expected by clients } if (name == "co_name") { return new PyString(co_name); diff --git a/src/org/python/core/StdoutWrapper.java b/src/org/python/core/StdoutWrapper.java --- a/src/org/python/core/StdoutWrapper.java +++ b/src/org/python/core/StdoutWrapper.java @@ -105,7 +105,8 @@ String s; if (o instanceof PyUnicode) { // Use the encoding and policy defined for the stream. (Each may be null.) - s = ((PyUnicode)o).encode(file.encoding, file.errors); + s = ((PyUnicode)o).encode(file.encoding, "replace"); //FIXME: back to ... + // s = ((PyUnicode)o).encode(file.encoding, file.errors); } else { s = o.__str__().toString(); } diff --git a/src/org/python/core/imp.java b/src/org/python/core/imp.java --- a/src/org/python/core/imp.java +++ b/src/org/python/core/imp.java @@ -418,7 +418,8 @@ } if (moduleLocation != null) { - module.__setattr__("__file__", new PyString(moduleLocation)); + // Standard library expects __file__ to be encoded bytes + module.__setattr__("__file__", Py.fileSystemEncode(moduleLocation)); } else if (module.__findattr__("__file__") == null) { // Should probably never happen (but maybe with an odd custom builtins, or // Java Integration) @@ -543,10 +544,8 @@ return loadFromLoader(loader, moduleName); } } - if (!(p instanceof PyUnicode)) { - p = p.__str__(); - } - ret = loadFromSource(sys, name, moduleName, p.toString()); + // p could be unicode or bytes (in the file system encoding) + ret = loadFromSource(sys, name, moduleName, Py.fileSystemDecode(p)); if (ret != null) { return ret; } @@ -606,7 +605,7 @@ // display names are for identification purposes (e.g. 
__file__): when entry is // null it forces java.io.File to be a relative path (e.g. foo/bar.py instead of // /tmp/foo/bar.py) - String displayDirName = entry.equals("") ? null : entry.toString(); + String displayDirName = entry.equals("") ? null : entry; String displaySourceName = new File(new File(displayDirName, name), sourceName).getPath(); String displayCompiledName = new File(new File(displayDirName, name), compiledName).getPath(); @@ -640,7 +639,7 @@ compiledFile = new File(dirName, compiledName); } else { PyModule m = addModule(modName); - PyObject filename = new PyString(new File(displayDirName, name).getPath()); + PyObject filename = Py.newStringOrUnicode(new File(displayDirName, name).getPath()); // XXX fileSystemEncode? m.__dict__.__setitem__("__path__", new PyList(new PyObject[] {filename})); } diff --git a/src/org/python/core/io/FileIO.java b/src/org/python/core/io/FileIO.java --- a/src/org/python/core/io/FileIO.java +++ b/src/org/python/core/io/FileIO.java @@ -64,10 +64,10 @@ private boolean emulateAppend; /** - * @see #FileIO(PyString name, String mode) + * @see #FileIO(String name, String mode) */ - public FileIO(String name, String mode) { - this(Py.newString(name), mode); + public FileIO(PyString name, String mode) { + this(Py.fileSystemDecode(name), mode); } /** @@ -80,9 +80,9 @@ * @param name the name of the file * @param mode a raw io file mode String */ - public FileIO(PyString name, String mode) { + public FileIO(String name, String mode) { parseMode(mode); - File absPath = new RelativeFile(name.toString()); + File absPath = new RelativeFile(name); try { if ((appending && !(reading || plus)) || (writing && !reading && !plus)) { diff --git a/src/org/python/modules/_imp.java b/src/org/python/modules/_imp.java --- a/src/org/python/modules/_imp.java +++ b/src/org/python/modules/_imp.java @@ -68,7 +68,7 @@ * This needs to be consolidated with the code in (@see org.python.core.imp). 
* * @param name module name - * @param entry a path String + * @param entry a path String (Unicode file or directory name) * @param findingPackage if looking for a package only try to locate __init__ * @return null if no module found otherwise module information */ @@ -190,8 +190,10 @@ public static PyObject find_module(String name, PyObject path) { if (path == Py.None && PySystemState.getBuiltin(name) != null) { - return new PyTuple(Py.None, Py.newString(name), - new PyTuple(Py.EmptyString, Py.EmptyString, + return new PyTuple(Py.None, + Py.newString(name), + new PyTuple(Py.EmptyString, + Py.EmptyString, Py.newInteger(C_BUILTIN))); } @@ -199,14 +201,14 @@ path = Py.getSystemState().path; } for (PyObject p : path.asIterable()) { - ModuleInfo mi = findFromSource(name, p.toString(), false, true); + ModuleInfo mi = findFromSource(name, Py.fileSystemDecode(p), false, true); if(mi == null) { continue; } return new PyTuple(mi.file, - new PyString(mi.filename), - new PyTuple(new PyString(mi.suffix), - new PyString(mi.mode), + Py.newStringOrUnicode(mi.filename), + new PyTuple(Py.newString(mi.suffix), + Py.newString(mi.mode), Py.newInteger(mi.type))); } throw Py.ImportError("No module named " + name); @@ -216,7 +218,8 @@ PyObject mod = Py.None; PySystemState sys = Py.getSystemState(); int type = data.__getitem__(2).asInt(); - while(mod == Py.None) { + String filenameString = Py.fileSystemDecode(filename); + while (mod == Py.None) { String compiledName; switch (type) { case PY_SOURCE: @@ -226,7 +229,7 @@ } // XXX: This should load the accompanying byte code file instead, if it exists - String resolvedFilename = sys.getPath(filename.toString()); + String resolvedFilename = sys.getPath(filenameString); compiledName = makeCompiledFilename(resolvedFilename); if (name.endsWith(".__init__")) { name = name.substring(0, name.length() - ".__init__".length()); @@ -241,19 +244,20 @@ } mod = imp.createFromSource(name.intern(), (InputStream)o, - filename.toString(), compiledName, mtime); 
+ filenameString, compiledName, mtime); break; case PY_COMPILED: - mod = load_compiled(name, filename.toString(), file); + mod = load_compiled(name, filenameString, file); break; case PKG_DIRECTORY: PyModule m = imp.addModule(name); m.__dict__.__setitem__("__path__", new PyList(new PyObject[] {filename})); m.__dict__.__setitem__("__file__", filename); - ModuleInfo mi = findFromSource(name, filename.toString(), true, true); + ModuleInfo mi = findFromSource(name, filenameString, true, true); type = mi.type; file = mi.file; - filename = new PyString(mi.filename); + filenameString = mi.filename; + filename = Py.newStringOrUnicode(filenameString); break; default: throw Py.ImportError("No module named " + name); diff --git a/src/org/python/modules/posix/PosixModule.java b/src/org/python/modules/posix/PosixModule.java --- a/src/org/python/modules/posix/PosixModule.java +++ b/src/org/python/modules/posix/PosixModule.java @@ -486,7 +486,8 @@ "getcwd() -> path\n\n" + "Return a string representing the current working directory."); public static PyObject getcwd() { - return Py.newStringOrUnicode(Py.getSystemState().getCurrentWorkingDir()); + // The return value is bytes in the file system encoding + return Py.fileSystemEncode(Py.getSystemState().getCurrentWorkingDir()); } public static PyString __doc__getcwdu = new PyString( @@ -1343,25 +1344,24 @@ return environ; } for (Map.Entry entry : env.entrySet()) { + // The shell restricts names to a subset of ASCII and values are encoded byte strings. environ.__setitem__( - Py.newStringOrUnicode(entry.getKey()), - Py.newStringOrUnicode(entry.getValue())); + Py.newString(entry.getKey()), + Py.fileSystemEncode(entry.getValue())); } return environ; } /** - * Return a path as a String from a PyObject + * Return a path as a String from a PyObject, which must be str or + * unicode. If the path is a str (that is, bytes), it is + * interpreted into Unicode using the file system encoding. 
* * @param path a PyObject, raising a TypeError if an invalid path type * @return a String path */ private static String asPath(PyObject path) { - if (path instanceof PyString) { - return path.toString(); - } - throw Py.TypeError(String.format("coercing to Unicode: need string, %s type found", - path.getType().fastGetName())); + return Py.fileSystemDecode(path); } /** -- Repository URL: https://hg.python.org/jython From jython-checkins at python.org Sun May 21 05:06:53 2017 From: jython-checkins at python.org (jeff.allen) Date: Sun, 21 May 2017 09:06:53 +0000 Subject: [Jython-checkins] =?utf-8?q?jython=3A_Skip_test=5Fio_failure_in_?= =?utf-8?q?concurrent_access_to_a_buffered_file_=28=232588=29=2E?= Message-ID: <20170521090145.55566.05204B038C36D642@psf.io> https://hg.python.org/jython/rev/e02c01d28a50 changeset: 8088:e02c01d28a50 user: Jeff Allen date: Sat May 06 16:48:12 2017 +0100 summary: Skip test_io failure in concurrent access to a buffered file (#2588). Revealed while testing encoding for cp936/gbk, but probably not related. 
files: Lib/test/test_io.py | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_io.py b/Lib/test/test_io.py --- a/Lib/test/test_io.py +++ b/Lib/test/test_io.py @@ -2438,6 +2438,7 @@ self.assertEqual(f.errors, "replace") @unittest.skipUnless(threading, 'Threading required for this test.') + @unittest.skipIf(support.is_jython, "Not thread-safe: Jython issue 2588.") def test_threads_write(self): # Issue6750: concurrent writes could duplicate data event = threading.Event() -- Repository URL: https://hg.python.org/jython From jython-checkins at python.org Sun May 21 05:06:53 2017 From: jython-checkins at python.org (jeff.allen) Date: Sun, 21 May 2017 09:06:53 +0000 Subject: [Jython-checkins] =?utf-8?q?jython=3A_Fixes_encodings=2E=5Fjava_?= =?utf-8?q?for_incremental_mode_of_gbk/cp936?= Message-ID: <20170521090146.55899.AFE569D0337D67C9@psf.io> https://hg.python.org/jython/rev/05fc242d9dd2 changeset: 8089:05fc242d9dd2 user: Jeff Allen date: Sat May 13 17:11:05 2017 +0100 summary: Fixes encodings._java for incremental mode of gbk/cp936 Addresses failures evident on Windows with code page 936 when running test_io with non-ascii paths. Position 'cookie' not handled correctly. files: Lib/encodings/_java.py | 8 ++++++-- src/org/python/core/PyLong.java | 3 +++ 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/Lib/encodings/_java.py b/Lib/encodings/_java.py --- a/Lib/encodings/_java.py +++ b/Lib/encodings/_java.py @@ -162,12 +162,16 @@ def reset(self): self.buffer = "" + self.decoder.reset() def getstate(self): - return self.buffer or 0 + # No way to extract the internal state of a Java decoder. + return self.buffer or "", 0 def setstate(self, state): - self.buffer = state or "" + self.buffer, _ = state or ("", 0) + # No way to restore: reset possible EOF state. 
+ self.decoder.reset() class StreamWriter(NonfinalCodec, codecs.StreamWriter): diff --git a/src/org/python/core/PyLong.java b/src/org/python/core/PyLong.java --- a/src/org/python/core/PyLong.java +++ b/src/org/python/core/PyLong.java @@ -295,6 +295,9 @@ @Override public Object __tojava__(Class c) { try { + if (c == Boolean.TYPE || c == Boolean.class) { + return new Boolean(!getValue().equals(BigInteger.ZERO)); + } if (c == Byte.TYPE || c == Byte.class) { return new Byte((byte)getLong(Byte.MIN_VALUE, Byte.MAX_VALUE)); } -- Repository URL: https://hg.python.org/jython From jython-checkins at python.org Sun May 21 05:06:54 2017 From: jython-checkins at python.org (jeff.allen) Date: Sun, 21 May 2017 09:06:54 +0000 Subject: [Jython-checkins] =?utf-8?q?jython=3A_Fix_test=5Frunpy_failure_d?= =?utf-8?q?ue_to_unicode_zip-file_name=2C_observed_on_Linux=2E?= Message-ID: <20170521090147.39614.26C16A346168C01E@psf.io> https://hg.python.org/jython/rev/f4a6679623d7 changeset: 8091:f4a6679623d7 user: Jeff Allen date: Sat May 20 23:58:07 2017 +0100 summary: Fix test_runpy failure due to unicode zip-file name, observed on Linux. Under issue #2356, the sys.path entry for a zip archive became unicode, but correct use of FS-encoded bytes in this change allows us to reinstate (and pass) the CPython version of test_runpy. We also work-around the unlink() failure that made it necessary to suppress the test on Windows. 
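As an illustration of the distinction this summary draws between a unicode path and FS-encoded bytes, here is a minimal Python sketch. The helper names fs_encode and fs_decode are hypothetical, chosen only to mirror the idea behind Py.fileSystemEncode and Py.fileSystemDecode. They are not Jython's implementation, and the fallback to utf-8 when no file system encoding is reported is an assumption.

```python
import sys


def fs_encode(path, encoding=None):
    """Return path as bytes in the file system encoding (hypothetical helper)."""
    # Assumption: fall back to utf-8 if the platform reports no encoding.
    encoding = encoding or sys.getfilesystemencoding() or "utf-8"
    if isinstance(path, bytes):
        # Already a byte string: taken to be FS-encoded, returned unchanged.
        return path
    return path.encode(encoding)


def fs_decode(path, encoding=None):
    """Return path as a unicode string (hypothetical helper)."""
    encoding = encoding or sys.getfilesystemencoding() or "utf-8"
    if isinstance(path, bytes):
        # Byte strings are interpreted using the file system encoding.
        return path.decode(encoding)
    return path


# Round trip for a non-ascii archive name, with the encoding given explicitly.
name = u"caf\u00e9.zip"
encoded = fs_encode(name, "utf-8")
decoded = fs_decode(encoded, "utf-8")
```

With an explicit utf-8 encoding the round trip returns the original name, which is the property a sys.path entry needs if clients expect bytes but the archive name is not ascii.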
files: Lib/test/regrtest.py | 1 - Lib/test/script_helper.py | 6 +- Lib/test/test_runpy.py | 402 ------------ src/org/python/core/SyspathArchive.java | 9 +- 4 files changed, 10 insertions(+), 408 deletions(-) diff --git a/Lib/test/regrtest.py b/Lib/test/regrtest.py --- a/Lib/test/regrtest.py +++ b/Lib/test/regrtest.py @@ -1372,7 +1372,6 @@ test_mailbox # fails miserably and ruins other tests test_os_jy # Locale tests fail on Cygwin (but not Windows) # test_popen # Passes, but see http://bugs.python.org/issue1559298 - test_runpy # OSError: unlink() test_select_new # Hangs (Windows), though ok run singly test_urllib2 # file not on local host (likely Windows only) """, diff --git a/Lib/test/script_helper.py b/Lib/test/script_helper.py --- a/Lib/test/script_helper.py +++ b/Lib/test/script_helper.py @@ -20,6 +20,8 @@ from test.test_support import strip_python_stderr +_IS_JYTHON_WINDOWS = sys.platform.startswith('java') and os._name == 'nt' + # Executing the interpreter in a subprocess def _assert_python(expected_success, *args, **env_vars): cmd_line = [sys.executable] @@ -101,7 +103,9 @@ try: yield dirname finally: - shutil.rmtree(dirname) + # On Windows, unlink failures within rmtree often mask the true nature + # of a failing test (or sometimes a passing one). 
+ shutil.rmtree(dirname, ignore_errors=_IS_JYTHON_WINDOWS) def make_script(script_dir, script_basename, source): script_filename = script_basename+os.extsep+'py' diff --git a/Lib/test/test_runpy.py b/Lib/test/test_runpy.py deleted file mode 100644 --- a/Lib/test/test_runpy.py +++ /dev/null @@ -1,402 +0,0 @@ -# Test the runpy module -import unittest -import os -import os.path -import sys -import re -import tempfile -from test.test_support import verbose, run_unittest, forget -from test.script_helper import (temp_dir, make_script, compile_script, - make_pkg, make_zip_script, make_zip_pkg) - - -from runpy import _run_code, _run_module_code, run_module, run_path -# Note: This module can't safely test _run_module_as_main as it -# runs its tests in the current process, which would mess with the -# real __main__ module (usually test.regrtest) -# See test_cmd_line_script for a test that executes that code path - -# Set up the test code and expected results - -class RunModuleCodeTest(unittest.TestCase): - """Unit tests for runpy._run_code and runpy._run_module_code""" - - expected_result = ["Top level assignment", "Lower level reference"] - test_source = ( - "# Check basic code execution\n" - "result = ['Top level assignment']\n" - "def f():\n" - " result.append('Lower level reference')\n" - "f()\n" - "# Check the sys module\n" - "import sys\n" - "run_argv0 = sys.argv[0]\n" - "run_name_in_sys_modules = __name__ in sys.modules\n" - "if run_name_in_sys_modules:\n" - " module_in_sys_modules = globals() is sys.modules[__name__].__dict__\n" - "# Check nested operation\n" - "import runpy\n" - "nested = runpy._run_module_code('x=1\\n', mod_name='')\n" - ) - - def test_run_code(self): - saved_argv0 = sys.argv[0] - d = _run_code(self.test_source, {}) - self.assertEqual(d["result"], self.expected_result) - self.assertIs(d["__name__"], None) - self.assertIs(d["__file__"], None) - self.assertIs(d["__loader__"], None) - self.assertIs(d["__package__"], None) - 
self.assertIs(d["run_argv0"], saved_argv0) - self.assertNotIn("run_name", d) - self.assertIs(sys.argv[0], saved_argv0) - - def test_run_module_code(self): - initial = object() - name = "" - file = "Some other nonsense" - loader = "Now you're just being silly" - package = '' # Treat as a top level module - d1 = dict(initial=initial) - saved_argv0 = sys.argv[0] - d2 = _run_module_code(self.test_source, - d1, - name, - file, - loader, - package) - self.assertNotIn("result", d1) - self.assertIs(d2["initial"], initial) - self.assertEqual(d2["result"], self.expected_result) - self.assertEqual(d2["nested"]["x"], 1) - self.assertIs(d2["__name__"], name) - self.assertTrue(d2["run_name_in_sys_modules"]) - self.assertTrue(d2["module_in_sys_modules"]) - self.assertIs(d2["__file__"], file) - self.assertIs(d2["run_argv0"], file) - self.assertIs(d2["__loader__"], loader) - self.assertIs(d2["__package__"], package) - self.assertIs(sys.argv[0], saved_argv0) - self.assertNotIn(name, sys.modules) - - -class RunModuleTest(unittest.TestCase): - """Unit tests for runpy.run_module""" - - def expect_import_error(self, mod_name): - try: - run_module(mod_name) - except ImportError: - pass - else: - self.fail("Expected import error for " + mod_name) - - def test_invalid_names(self): - # Builtin module - self.expect_import_error("sys") - # Non-existent modules - self.expect_import_error("sys.imp.eric") - self.expect_import_error("os.path.half") - self.expect_import_error("a.bee") - self.expect_import_error(".howard") - self.expect_import_error("..eaten") - # Package without __main__.py - self.expect_import_error("multiprocessing") - - def test_library_module(self): - run_module("runpy") - - def _add_pkg_dir(self, pkg_dir): - os.mkdir(pkg_dir) - pkg_fname = os.path.join(pkg_dir, "__init__"+os.extsep+"py") - pkg_file = open(pkg_fname, "w") - pkg_file.close() - return pkg_fname - - def _make_pkg(self, source, depth, mod_base="runpy_test"): - pkg_name = "__runpy_pkg__" - test_fname = 
mod_base+os.extsep+"py" - pkg_dir = sub_dir = tempfile.mkdtemp() - if verbose: print " Package tree in:", sub_dir - sys.path.insert(0, pkg_dir) - if verbose: print " Updated sys.path:", sys.path[0] - for i in range(depth): - sub_dir = os.path.join(sub_dir, pkg_name) - pkg_fname = self._add_pkg_dir(sub_dir) - if verbose: print " Next level in:", sub_dir - if verbose: print " Created:", pkg_fname - mod_fname = os.path.join(sub_dir, test_fname) - mod_file = open(mod_fname, "w") - mod_file.write(source) - mod_file.close() - if verbose: print " Created:", mod_fname - mod_name = (pkg_name+".")*depth + mod_base - return pkg_dir, mod_fname, mod_name - - def _del_pkg(self, top, depth, mod_name): - for entry in list(sys.modules): - if entry.startswith("__runpy_pkg__"): - del sys.modules[entry] - if verbose: print " Removed sys.modules entries" - del sys.path[0] - if verbose: print " Removed sys.path entry" - for root, dirs, files in os.walk(top, topdown=False): - for name in files: - try: - os.remove(os.path.join(root, name)) - except OSError, ex: - if verbose: print ex # Persist with cleaning up - for name in dirs: - fullname = os.path.join(root, name) - try: - os.rmdir(fullname) - except OSError, ex: - if verbose: print ex # Persist with cleaning up - try: - os.rmdir(top) - if verbose: print " Removed package tree" - except OSError, ex: - if verbose: print ex # Persist with cleaning up - - def _check_module(self, depth): - pkg_dir, mod_fname, mod_name = ( - self._make_pkg("x=1\n", depth)) - forget(mod_name) - try: - if verbose: print "Running from source:", mod_name - d1 = run_module(mod_name) # Read from source - self.assertIn("x", d1) - self.assertTrue(d1["x"] == 1) - del d1 # Ensure __loader__ entry doesn't keep file open - __import__(mod_name) - os.remove(mod_fname) - if verbose: print "Running from compiled:", mod_name - d2 = run_module(mod_name) # Read from bytecode - self.assertIn("x", d2) - self.assertTrue(d2["x"] == 1) - del d2 # Ensure __loader__ entry doesn't 
keep file open - finally: - self._del_pkg(pkg_dir, depth, mod_name) - if verbose: print "Module executed successfully" - - def _check_package(self, depth): - pkg_dir, mod_fname, mod_name = ( - self._make_pkg("x=1\n", depth, "__main__")) - pkg_name, _, _ = mod_name.rpartition(".") - forget(mod_name) - try: - if verbose: print "Running from source:", pkg_name - d1 = run_module(pkg_name) # Read from source - self.assertIn("x", d1) - self.assertTrue(d1["x"] == 1) - del d1 # Ensure __loader__ entry doesn't keep file open - __import__(mod_name) - os.remove(mod_fname) - if verbose: print "Running from compiled:", pkg_name - d2 = run_module(pkg_name) # Read from bytecode - self.assertIn("x", d2) - self.assertTrue(d2["x"] == 1) - del d2 # Ensure __loader__ entry doesn't keep file open - finally: - self._del_pkg(pkg_dir, depth, pkg_name) - if verbose: print "Package executed successfully" - - def _add_relative_modules(self, base_dir, source, depth): - if depth <= 1: - raise ValueError("Relative module test needs depth > 1") - pkg_name = "__runpy_pkg__" - module_dir = base_dir - for i in range(depth): - parent_dir = module_dir - module_dir = os.path.join(module_dir, pkg_name) - # Add sibling module - sibling_fname = os.path.join(module_dir, "sibling"+os.extsep+"py") - sibling_file = open(sibling_fname, "w") - sibling_file.close() - if verbose: print " Added sibling module:", sibling_fname - # Add nephew module - uncle_dir = os.path.join(parent_dir, "uncle") - self._add_pkg_dir(uncle_dir) - if verbose: print " Added uncle package:", uncle_dir - cousin_dir = os.path.join(uncle_dir, "cousin") - self._add_pkg_dir(cousin_dir) - if verbose: print " Added cousin package:", cousin_dir - nephew_fname = os.path.join(cousin_dir, "nephew"+os.extsep+"py") - nephew_file = open(nephew_fname, "w") - nephew_file.close() - if verbose: print " Added nephew module:", nephew_fname - - def _check_relative_imports(self, depth, run_name=None): - contents = r"""\ -from __future__ import 
absolute_import -from . import sibling -from ..uncle.cousin import nephew -""" - pkg_dir, mod_fname, mod_name = ( - self._make_pkg(contents, depth)) - try: - self._add_relative_modules(pkg_dir, contents, depth) - pkg_name = mod_name.rpartition('.')[0] - if verbose: print "Running from source:", mod_name - d1 = run_module(mod_name, run_name=run_name) # Read from source - self.assertIn("__package__", d1) - self.assertTrue(d1["__package__"] == pkg_name) - self.assertIn("sibling", d1) - self.assertIn("nephew", d1) - del d1 # Ensure __loader__ entry doesn't keep file open - __import__(mod_name) - os.remove(mod_fname) - if verbose: print "Running from compiled:", mod_name - d2 = run_module(mod_name, run_name=run_name) # Read from bytecode - self.assertIn("__package__", d2) - self.assertTrue(d2["__package__"] == pkg_name) - self.assertIn("sibling", d2) - self.assertIn("nephew", d2) - del d2 # Ensure __loader__ entry doesn't keep file open - finally: - self._del_pkg(pkg_dir, depth, mod_name) - if verbose: print "Module executed successfully" - - def test_run_module(self): - for depth in range(4): - if verbose: print "Testing package depth:", depth - self._check_module(depth) - - def test_run_package(self): - for depth in range(1, 4): - if verbose: print "Testing package depth:", depth - self._check_package(depth) - - def test_explicit_relative_import(self): - for depth in range(2, 5): - if verbose: print "Testing relative imports at depth:", depth - self._check_relative_imports(depth) - - def test_main_relative_import(self): - for depth in range(2, 5): - if verbose: print "Testing main relative imports at depth:", depth - self._check_relative_imports(depth, "__main__") - - -class RunPathTest(unittest.TestCase): - """Unit tests for runpy.run_path""" - # Based on corresponding tests in test_cmd_line_script - - test_source = """\ -# Script may be run with optimisation enabled, so don't rely on assert -# statements being executed -def assertEqual(lhs, rhs): - if lhs != rhs: - 
raise AssertionError('%r != %r' % (lhs, rhs)) -def assertIs(lhs, rhs): - if lhs is not rhs: - raise AssertionError('%r is not %r' % (lhs, rhs)) -# Check basic code execution -result = ['Top level assignment'] -def f(): - result.append('Lower level reference') -f() -assertEqual(result, ['Top level assignment', 'Lower level reference']) -# Check the sys module -import sys -assertIs(globals(), sys.modules[__name__].__dict__) -argv0 = sys.argv[0] -""" - - def _make_test_script(self, script_dir, script_basename, source=None): - if source is None: - source = self.test_source - return make_script(script_dir, script_basename, source) - - def _check_script(self, script_name, expected_name, expected_file, - expected_argv0, expected_package): - result = run_path(script_name) - self.assertEqual(result["__name__"], expected_name) - self.assertEqual(result["__file__"], expected_file) - self.assertIn("argv0", result) - self.assertEqual(result["argv0"], expected_argv0) - self.assertEqual(result["__package__"], expected_package) - - def _check_import_error(self, script_name, msg): - msg = re.escape(msg) - self.assertRaisesRegexp(ImportError, msg, run_path, script_name) - - def test_basic_script(self): - with temp_dir() as script_dir: - mod_name = 'script' - script_name = self._make_test_script(script_dir, mod_name) - self._check_script(script_name, "", script_name, - script_name, None) - - def test_script_compiled(self): - with temp_dir() as script_dir: - mod_name = 'script' - script_name = self._make_test_script(script_dir, mod_name) - compiled_name = compile_script(script_name) - os.remove(script_name) - self._check_script(compiled_name, "", compiled_name, - compiled_name, None) - - def test_directory(self): - with temp_dir() as script_dir: - mod_name = '__main__' - script_name = self._make_test_script(script_dir, mod_name) - self._check_script(script_dir, "", script_name, - script_dir, '') - - def test_directory_compiled(self): - with temp_dir() as script_dir: - mod_name = 
'__main__' - script_name = self._make_test_script(script_dir, mod_name) - compiled_name = compile_script(script_name) - os.remove(script_name) - self._check_script(script_dir, "", compiled_name, - script_dir, '') - - def test_directory_error(self): - with temp_dir() as script_dir: - mod_name = 'not_main' - script_name = self._make_test_script(script_dir, mod_name) - msg = "can't find '__main__' module in %r" % script_dir - self._check_import_error(script_dir, msg) - - def test_zipfile(self): - with temp_dir() as script_dir: - mod_name = '__main__' - script_name = self._make_test_script(script_dir, mod_name) - zip_name, fname = make_zip_script(script_dir, 'test_zip', script_name) - self._check_script(zip_name, "", fname, zip_name, '') - - def test_zipfile_compiled(self): - with temp_dir() as script_dir: - mod_name = '__main__' - script_name = self._make_test_script(script_dir, mod_name) - compiled_name = compile_script(script_name) - zip_name, fname = make_zip_script(script_dir, 'test_zip', compiled_name) - self._check_script(zip_name, "", fname, zip_name, '') - - def test_zipfile_error(self): - with temp_dir() as script_dir: - mod_name = 'not_main' - script_name = self._make_test_script(script_dir, mod_name) - zip_name, fname = make_zip_script(script_dir, 'test_zip', script_name) - msg = "can't find '__main__' module in '%s'" % zip_name - self._check_import_error(zip_name, msg) - - def test_main_recursion_error(self): - with temp_dir() as script_dir, temp_dir() as dummy_dir: - mod_name = '__main__' - source = ("import runpy\n" - "runpy.run_path(%r)\n") % dummy_dir - script_name = self._make_test_script(script_dir, mod_name, source) - zip_name, fname = make_zip_script(script_dir, 'test_zip', script_name) - msg = "recursion depth exceeded" - self.assertRaisesRegexp(RuntimeError, msg, run_path, zip_name) - - - -def test_main(): - run_unittest(RunModuleCodeTest, RunModuleTest, RunPathTest) - -if __name__ == "__main__": - test_main() diff --git 
a/src/org/python/core/SyspathArchive.java b/src/org/python/core/SyspathArchive.java --- a/src/org/python/core/SyspathArchive.java +++ b/src/org/python/core/SyspathArchive.java @@ -1,14 +1,14 @@ - package org.python.core; import java.io.*; import java.util.zip.*; @Untraversable -public class SyspathArchive extends PyUnicode { +public class SyspathArchive extends PyString { private ZipFile zipFile; public SyspathArchive(String archiveName) throws IOException { - super(archiveName); + // As a string-like object (on sys.path) an FS-encoded bytes object is expected + super(Py.fileSystemEncode(archiveName).getString()); archiveName = getArchiveName(archiveName); if(archiveName == null) { throw new IOException("path '" + archiveName + "' not an archive"); @@ -20,7 +20,8 @@ } SyspathArchive(ZipFile zipFile, String archiveName) { - super(archiveName); + // As a string-like object (on sys.path) an FS-encoded bytes object is expected + super(Py.fileSystemEncode(archiveName).getString()); this.zipFile = zipFile; } -- Repository URL: https://hg.python.org/jython From jython-checkins at python.org Tue May 23 15:18:22 2017 From: jython-checkins at python.org (jeff.allen) Date: Tue, 23 May 2017 19:18:22 +0000 Subject: [Jython-checkins] =?utf-8?q?jython=3A_Improve_handling_of_Jython?= =?utf-8?q?_home_directory_in_launcher=2E?= Message-ID: <20170523191822.68071.864AAC2CC8302391@psf.io> https://hg.python.org/jython/rev/6a5e73d57b5b changeset: 8094:6a5e73d57b5b user: Jeff Allen date: Mon May 22 23:27:07 2017 +0100 summary: Improve handling of Jython home directory in launcher. Fixes regression in launcher that prevents running from bin directory. 
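The regression mentioned in this summary can be sketched as follows, assuming the launcher is invoked by a bare name from inside the bin directory, so that the executable path has no directory part. The function names are hypothetical and only contrast the two approaches. This is not the launcher's code.

```python
import os.path


def home_by_dirname_twice(executable):
    # Parent of the parent directory; collapses to '' for a bare file name.
    return os.path.dirname(os.path.dirname(executable))


def home_by_parent_join(executable):
    # Append '..' to the executable's directory, then normalise the result.
    return os.path.normpath(os.path.join(os.path.dirname(executable), ".."))


bare = "jython"  # executable given without any directory component
twice = home_by_dirname_twice(bare)
joined = home_by_parent_join(bare)
```

For a bare name the first function yields an empty string, which is not a usable home directory, while the second yields a relative path to the parent directory.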
files: src/shell/jython.exe | Bin src/shell/jython.py | 9 +++++++-- 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/src/shell/jython.exe b/src/shell/jython.exe index b7500204c603274a6bdb9ec15064bd27f31c14ac..04305a09cef83132a50cff9383940a0b395c4eb0 GIT binary patch [stripped] diff --git a/src/shell/jython.py b/src/shell/jython.py --- a/src/shell/jython.py +++ b/src/shell/jython.py @@ -210,11 +210,16 @@ def jython_home(self): if hasattr(self, "_jython_home"): return self._jython_home - self._jython_home = get_env("JYTHON_HOME") or os.path.dirname( - os.path.dirname(self.executable)) + home = get_env("JYTHON_HOME") + if home is None: + # Not just dirname twice in case dirname(executable) == '' + home = os.path.join(os.path.dirname(self.executable), u'..') + # This could be a relative path like .\.. + home = os.path.normpath(home) if self.uname == u"cygwin": # Even on Cygwin, we need a Windows-style path for this home = unicode_subprocess(["cygpath", "--windows", home]) + self._jython_home = home return self._jython_home @property -- Repository URL: https://hg.python.org/jython From jython-checkins at python.org Wed May 24 13:30:03 2017 From: jython-checkins at python.org (stefan.richthofer) Date: Wed, 24 May 2017 17:30:03 +0000 Subject: [Jython-checkins] =?utf-8?q?jython=3A_Implemented_workaround_for?= =?utf-8?q?_=232536_by_adding_warning_and_skipping_dangerous?= Message-ID: <20170524173001.98818.8FBB1A926DD9E0EC@psf.io> https://hg.python.org/jython/rev/2b06bc95594d changeset: 8095:2b06bc95594d user: Stefan Richthofer date: Wed May 24 19:29:21 2017 +0200 summary: Implemented workaround for #2536 by adding warning and skipping dangerous tests. Issue is kept open to discuss better future solutions. 
files: Lib/json/tests/test_recursion.py | 5 +++++ Lib/test/test_isinstance.py | 2 ++ NEWS | 1 + src/org/python/core/PyTableCode.java | 6 ++++++ 4 files changed, 14 insertions(+), 0 deletions(-) diff --git a/Lib/json/tests/test_recursion.py b/Lib/json/tests/test_recursion.py --- a/Lib/json/tests/test_recursion.py +++ b/Lib/json/tests/test_recursion.py @@ -1,4 +1,6 @@ from json.tests import PyTest, CTest +import unittest +from test import test_support class JSONTestObject: @@ -65,6 +67,7 @@ self.fail("didn't raise ValueError on default recursion") + @unittest.skipIf(test_support.is_jython, "See http://bugs.jython.org/issue2536.") def test_highly_nested_objects_decoding(self): # test that loading highly-nested objects doesn't segfault when C # accelerations are used. See #12017 @@ -83,6 +86,7 @@ with self.assertRaises(RuntimeError): self.loads(u'[' * 100000 + u'1' + u']' * 100000) + @unittest.skipIf(test_support.is_jython, "See http://bugs.jython.org/issue2536.") def test_highly_nested_objects_encoding(self): # See #12051 l, d = [], {} @@ -93,6 +97,7 @@ with self.assertRaises(RuntimeError): self.dumps(d) + @unittest.skipIf(test_support.is_jython, "See http://bugs.jython.org/issue2536.") def test_endless_recursion(self): # See #12051 class EndlessJSONEncoder(self.json.JSONEncoder): diff --git a/Lib/test/test_isinstance.py b/Lib/test/test_isinstance.py --- a/Lib/test/test_isinstance.py +++ b/Lib/test/test_isinstance.py @@ -246,11 +246,13 @@ if test_support.have_unicode: self.assertEqual(True, issubclass(str, (unicode, (Child, NewChild, basestring)))) + @unittest.skipIf(test_support.is_jython, "See http://bugs.jython.org/issue2536.") def test_subclass_recursion_limit(self): # make sure that issubclass raises RuntimeError before the C stack is # blown self.assertRaises(RuntimeError, blowstack, issubclass, str, str) + @unittest.skipIf(test_support.is_jython, "See http://bugs.jython.org/issue2536.") def test_isinstance_recursion_limit(self): # make sure that issubclass 
raises RuntimeError before the C stack is # blown diff --git a/NEWS b/NEWS --- a/NEWS +++ b/NEWS @@ -4,6 +4,7 @@ Jython 2.7.1rc1 Bugs fixed + - [ 2536 ] deadlocks in regrtests due to StackOverflowError in finally block (workaround, still open) - [ 2356 ] java.lang.IllegalArgumentException on startup on Windows if username not ASCII - [ 1839 ] sys.getfilesystemencoding() is None (now utf-8) - [ 2579 ] Pyc files are not loading for too large modules if path contains __pyclasspath__ diff --git a/src/org/python/core/PyTableCode.java b/src/org/python/core/PyTableCode.java --- a/src/org/python/core/PyTableCode.java +++ b/src/org/python/core/PyTableCode.java @@ -171,6 +171,12 @@ ret = funcs.call_function(func_id, frame, ts); } catch (Throwable t) { // Convert exceptions that occurred in Java code to PyExceptions + if (!(t instanceof Exception)) { + Py.warning(Py.RuntimeWarning, "PyTableCode.call caught a Throwable that is " + + "not an Exception:\n"+t+"\nJython internals might be in a bad state now " + + "that can cause deadlocks later on." + + "\nSee http://bugs.jython.org/issue2536 for details."); + } PyException pye = Py.JavaError(t); pye.tracebackHere(frame); -- Repository URL: https://hg.python.org/jython From jython-checkins at python.org Thu May 25 04:10:17 2017 From: jython-checkins at python.org (jeff.allen) Date: Thu, 25 May 2017 08:10:17 +0000 Subject: [Jython-checkins] =?utf-8?q?jython=3A_Make_zipimporter_archive_a?= =?utf-8?q?ttribute_FS-encoded_to_allow_for_non-ascii_paths=2E?= Message-ID: <20170525081017.64461.8F875AA961932FD5@psf.io> https://hg.python.org/jython/rev/097a1441a68f changeset: 8096:097a1441a68f user: Jeff Allen date: Thu May 25 08:27:05 2017 +0100 summary: Make zipimporter archive attribute FS-encoded to allow for non-ascii paths. 
files: src/org/python/modules/zipimport/zipimporter.java | 18 ++++++--- 1 files changed, 12 insertions(+), 6 deletions(-) diff --git a/src/org/python/modules/zipimport/zipimporter.java b/src/org/python/modules/zipimport/zipimporter.java --- a/src/org/python/modules/zipimport/zipimporter.java +++ b/src/org/python/modules/zipimport/zipimporter.java @@ -49,10 +49,15 @@ "a zipfile. ZipImportError is raised if 'archivepath' doesn't point to\n" + "a valid Zip archive."); - /** Pathname of the Zip archive */ - @ExposedGet + /** Path to the Zip archive */ public String archive; + /** Path to the Zip archive as FS-encoded str. */ + @ExposedGet(name = "archive") + public PyString getArchive() { + return Py.fileSystemEncode(archive); + } + /** File prefix: "a/sub/directory/" */ @ExposedGet public String prefix; @@ -516,12 +521,13 @@ @ExposedMethod(names = "__repr__") final String zipimporter_toString() { - String displayArchive = archive != null ? archive : "???"; + // __repr__ has to return bytes not unicode + String bytesName = archive != null ? Py.fileSystemEncode(archive).getString() : "???"; if (prefix != null && !"".equals(prefix)) { - return String.format("", - displayArchive, File.separatorChar, prefix); + return String.format("", bytesName, + File.separatorChar, prefix); } - return String.format("", displayArchive); + return String.format("", bytesName); } /** -- Repository URL: https://hg.python.org/jython From jython-checkins at python.org Fri May 26 18:56:28 2017 From: jython-checkins at python.org (stefan.richthofer) Date: Fri, 26 May 2017 22:56:28 +0000 Subject: [Jython-checkins] =?utf-8?q?jython=3A_Updated_several_extlibs_to?= =?utf-8?q?_most_recent_versions_as_of_this_writing=2E_Libs_where?= Message-ID: <20170526225627.4887.B8AC79CAEBAD90F6@psf.io> https://hg.python.org/jython/rev/f6b3ddbc1df8 changeset: 8097:f6b3ddbc1df8 user: Stefan Richthofer date: Sat May 27 00:52:34 2017 +0200 summary: Updated several extlibs to most recent versions as of this writing. 
Libs where an update attempt caused notable issues were left unmodified. files: NEWS | 8 +- b/.idea/libraries/extlibs.xml | 79 +++++--- build.xml | 88 +++++---- extlibs/asm-5.0.4.jar | Bin extlibs/asm-5.2.jar | Bin extlibs/asm-commons-5.0.4.jar | Bin extlibs/asm-commons-5.2.jar | Bin extlibs/asm-util-5.0.4.jar | Bin extlibs/asm-util-5.2.jar | Bin extlibs/bcpkix-jdk15on-1.54.jar | Bin extlibs/bcpkix-jdk15on-1.57.jar | Bin extlibs/bcprov-jdk15on-1.54.jar | Bin extlibs/bcprov-jdk15on-1.57.jar | Bin extlibs/commons-compress-1.12.jar | Bin extlibs/commons-compress-1.14.jar | Bin extlibs/guava-20.0.jar | Bin extlibs/guava-22.0-android.jar | Bin extlibs/icu4j-58.1.jar | Bin extlibs/icu4j-59_1.jar | Bin extlibs/jffi-1.2.13.jar | Bin extlibs/jffi-1.2.15.jar | Bin extlibs/jffi-aarch64-Linux.jar | Bin extlibs/jffi-ppc-AIX.jar | Bin extlibs/jffi-ppc64-Linux.jar | Bin extlibs/jffi-ppc64le-Linux.jar | Bin extlibs/jffi-x86_64-OpenBSD.jar | Bin extlibs/jline-2.14.2.jar | Bin extlibs/jline-2.14.3.jar | Bin extlibs/jnr-constants-0.9.5.jar | Bin extlibs/jnr-constants-0.9.9.jar | Bin extlibs/jnr-ffi-2.1.0.jar | Bin extlibs/jnr-ffi-2.1.5.jar | Bin extlibs/jnr-posix-3.0.31.jar | Bin extlibs/jnr-posix-3.0.41.jar | Bin extlibs/mysql-connector-java-5.1.42-bin.jar | Bin extlibs/mysql-connector-java-5.1.6.jar | Bin extlibs/netty-buffer-4.1.11.Final.jar | Bin extlibs/netty-buffer-4.1.6.Final.jar | Bin extlibs/netty-codec-4.1.11.Final.jar | Bin extlibs/netty-codec-4.1.6.Final.jar | Bin extlibs/netty-common-4.1.11.Final.jar | Bin extlibs/netty-common-4.1.6.Final.jar | Bin extlibs/netty-handler-4.1.11.Final.jar | Bin extlibs/netty-handler-4.1.6.Final.jar | Bin extlibs/netty-resolver-4.1.11.Final.jar | Bin extlibs/netty-resolver-4.1.6.Final.jar | Bin extlibs/netty-transport-4.1.11.Final.jar | Bin extlibs/netty-transport-4.1.6.Final.jar | Bin extlibs/postgresql-42.1.1.jre7.jar | Bin extlibs/postgresql-8.3-603.jdbc4.jar | Bin 50 files changed, 97 insertions(+), 78 deletions(-) diff --git a/NEWS 
b/NEWS --- a/NEWS +++ b/NEWS @@ -85,6 +85,13 @@ - [ 1767 ] Rich comparisons New Features + - Updated Netty to 4.1.11, ASM to 5.2, BouncyCastle to 1.57, Commons Compress to 1.14, + Guava to 22.0, ICU4J to 59.1, JFFI to 1.2.15, JNR-JFFI to 2.1.5, JNR-POSIX to 3.0.41, + JNR-Constants 0.9.9, JLine to 2.14.3, MySQL Connector to 5.1.42, PostgreSQL to 42.1.1 + Note: + You might find it strange that Jython bundles guava-22.0-android.jar rather than guava-22.0.jar. + This is the official way to support Java 7 with Guava > 20.0, also on non-Android platforms. + See https://github.com/google/guava/wiki/Release22#guava-release-220-release-notes. - Recognize cpython_cmd property to automatically build CPython bytecode for oversized functions (e.g. jython -J-Dcpython_cmd=python). This is especially convenient when installing things like SymPy via pip; it would frequently prompt you to provide yet @@ -110,7 +117,6 @@ Python level or for client code using PyBuffer via the "fully encapsulated" API. It risks breaking code that makes direct access to a byte array via PyBuffer, implements the PyBuffer interface, or extends implementation classes in org.python.core.buffer. - - Updated Netty to 4.1.4 - Fixed platform.mac_ver to provide actual info on Mac OS similar to CPython behavior. - Added uname function to posix module. The mostly Java-based implementation even works to some extend on non-posix systems (e.g. Windows). 
diff --git a/b/.idea/libraries/extlibs.xml b/b/.idea/libraries/extlibs.xml --- a/b/.idea/libraries/extlibs.xml +++ b/b/.idea/libraries/extlibs.xml @@ -1,45 +1,56 @@ [XML stripped] diff --git a/build.xml b/build.xml --- a/build.xml +++ b/build.xml @@ -142,31 +142,31 @@ [XML stripped] @@ -174,8 +174,8 @@ [XML stripped] @@ -552,36 +552,37 @@ [XML stripped] @@ -589,6 +590,7 @@ [XML stripped] @@ -601,17 +603,17 @@ [XML stripped] @@ -752,8 +754,8 @@ [XML stripped] diff --git a/extlibs/asm-5.0.4.jar b/extlibs/asm-5.0.4.jar deleted file mode 100644 index cdb283dd7f6d7d420ba0fca8ec23e1f38fc3b7c1..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 GIT binary patch [stripped] diff --git a/extlibs/asm-5.2.jar b/extlibs/asm-5.2.jar new file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..aea11818d5181162d58cb2c206cd2243f9357b30 GIT binary patch [stripped] diff --git a/extlibs/asm-commons-5.0.4.jar b/extlibs/asm-commons-5.0.4.jar deleted file mode 100644 index e89265f1e6c315689b63f11eed9397bb2e7ab9ff..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 GIT binary patch [stripped] diff --git a/extlibs/asm-commons-5.2.jar b/extlibs/asm-commons-5.2.jar new file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..cdd2e45c99bdeceafb8cb78a0fa195a2cdb53a37 GIT binary patch [stripped] diff --git a/extlibs/asm-util-5.0.4.jar b/extlibs/asm-util-5.0.4.jar deleted file mode 100644 index 59bf48fb0c14e21f9c85722e27035904d405219e..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 GIT binary patch [stripped] diff --git a/extlibs/asm-util-5.2.jar b/extlibs/asm-util-5.2.jar new file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..686c3f0d04245fc1ed2d6459d2ad265033e0d62d GIT binary patch [stripped] diff --git a/extlibs/bcpkix-jdk15on-1.54.jar
b/extlibs/bcpkix-jdk15on-1.54.jar deleted file mode 100644 index 86f7f0be194c0671460eea7acc2bc0ea479e304b..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 GIT binary patch [stripped] diff --git a/extlibs/bcpkix-jdk15on-1.57.jar b/extlibs/bcpkix-jdk15on-1.57.jar new file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..5ce7d5c5cc49c03102e3bb5248405b52431f9ebd GIT binary patch [stripped] diff --git a/extlibs/bcprov-jdk15on-1.54.jar b/extlibs/bcprov-jdk15on-1.54.jar deleted file mode 100644 index bd95185ae8129aa3781f43bc09a194c7bd4121b1..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 GIT binary patch [stripped] diff --git a/extlibs/bcprov-jdk15on-1.57.jar b/extlibs/bcprov-jdk15on-1.57.jar new file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..5a10986b3aac075df6f5400028cf2a6a0a8eb9fa GIT binary patch [stripped] diff --git a/extlibs/commons-compress-1.12.jar b/extlibs/commons-compress-1.12.jar deleted file mode 100644 index 4867705ea099d7ec5fd28e71451b3d4462d76b7d..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 GIT binary patch [stripped] diff --git a/extlibs/commons-compress-1.14.jar b/extlibs/commons-compress-1.14.jar new file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..7490eb8996c5dd7b9be15ff01519767b10e907dc GIT binary patch [stripped] diff --git a/extlibs/guava-20.0.jar b/extlibs/guava-20.0.jar deleted file mode 100644 index 632772f3a4d2197c0247cf32031fb489f1531446..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 GIT binary patch [stripped] diff --git a/extlibs/guava-22.0-android.jar b/extlibs/guava-22.0-android.jar new file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..53a6c5cc369889ac0345959a8f8791fd0550b5b2 GIT binary patch [stripped] diff --git a/extlibs/icu4j-58.1.jar b/extlibs/icu4j-58.1.jar deleted file mode 100644 index de0792be150a87b3b6cdeaec1daadd242420be62..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 GIT binary patch [stripped] diff --git a/extlibs/icu4j-59_1.jar b/extlibs/icu4j-59_1.jar new file mode 100644 index 
e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..3dc69c8a12f4d66001fe41d9e7ea26a41471c6bf GIT binary patch [stripped] diff --git a/extlibs/jffi-1.2.13.jar b/extlibs/jffi-1.2.13.jar deleted file mode 100644 index 44c539a307e591cc05d193ea4d099b1c6bcca624..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 GIT binary patch [stripped] diff --git a/extlibs/jffi-1.2.15.jar b/extlibs/jffi-1.2.15.jar new file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..0a100f9742d3e84700417890dc3b9ebb1ca35538 GIT binary patch [stripped] diff --git a/extlibs/jffi-aarch64-Linux.jar b/extlibs/jffi-aarch64-Linux.jar new file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..1d4438e3fdd467539cd11a738367b84d4a97002a GIT binary patch [stripped] diff --git a/extlibs/jffi-ppc-AIX.jar b/extlibs/jffi-ppc-AIX.jar index 8235a3484884243b1d17d5e19340e4166f3b0b55..eed0ab732698e3c105f16622b02ce388e94eefec GIT binary patch [stripped] diff --git a/extlibs/jffi-ppc64-Linux.jar b/extlibs/jffi-ppc64-Linux.jar index 8235a3484884243b1d17d5e19340e4166f3b0b55..cfae4bbf633cdf2c493464f99def993bf1b01728 GIT binary patch [stripped] diff --git a/extlibs/jffi-ppc64le-Linux.jar b/extlibs/jffi-ppc64le-Linux.jar new file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..7931dd1c2734ad45b5a3e8554a181b5179a37aae GIT binary patch [stripped] diff --git a/extlibs/jffi-x86_64-OpenBSD.jar b/extlibs/jffi-x86_64-OpenBSD.jar index 8235a3484884243b1d17d5e19340e4166f3b0b55..3e14e03e403e162bcb929763a5d3f8206cac4e90 GIT binary patch [stripped] diff --git a/extlibs/jline-2.14.2.jar b/extlibs/jline-2.14.2.jar deleted file mode 100644 index 02683de89cf5c6dbbe9abb0caf363347dc36ba07..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 GIT binary patch [stripped] diff --git a/extlibs/jline-2.14.3.jar b/extlibs/jline-2.14.3.jar new file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..1dd3ef4660637fff87aea40662f54a635f29e141 GIT binary patch [stripped] diff --git a/extlibs/jnr-constants-0.9.5.jar 
b/extlibs/jnr-constants-0.9.5.jar deleted file mode 100644 index 41bd7637e405ac68b493249f2607a1c45f18fc94..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 GIT binary patch [stripped] diff --git a/extlibs/jnr-constants-0.9.9.jar b/extlibs/jnr-constants-0.9.9.jar new file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..a877bc42b6df7755e5eab266139a1c5bd8076a4d GIT binary patch [stripped] diff --git a/extlibs/jnr-ffi-2.1.0.jar b/extlibs/jnr-ffi-2.1.0.jar deleted file mode 100644 index b3b3f031378bab1aeda322275862c5a368cf6b43..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 GIT binary patch [stripped] diff --git a/extlibs/jnr-ffi-2.1.5.jar b/extlibs/jnr-ffi-2.1.5.jar new file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..0ada90e99e8da8e25cdd32ef72bde55f602349f2 GIT binary patch [stripped] diff --git a/extlibs/jnr-posix-3.0.31.jar b/extlibs/jnr-posix-3.0.31.jar deleted file mode 100644 index ca14e7f8576de662403be519116816a8caa1085a..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 GIT binary patch [stripped] diff --git a/extlibs/jnr-posix-3.0.41.jar b/extlibs/jnr-posix-3.0.41.jar new file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..a5f72ca042cf9e7d87fb0f46b8cef8470dbe8938 GIT binary patch [stripped] diff --git a/extlibs/mysql-connector-java-5.1.42-bin.jar b/extlibs/mysql-connector-java-5.1.42-bin.jar new file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..4c6df38c142a66e1a8ed0a39358484eb71425a6b GIT binary patch [stripped] diff --git a/extlibs/mysql-connector-java-5.1.6.jar b/extlibs/mysql-connector-java-5.1.6.jar deleted file mode 100644 index 0539039f716034c4896c8eaa81c075c7fa3bc997..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 GIT binary patch [stripped] diff --git a/extlibs/netty-buffer-4.1.11.Final.jar b/extlibs/netty-buffer-4.1.11.Final.jar new file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..28d32ae3217b108d5f18e6b4c7f9b1a138963556 GIT binary patch [stripped] diff --git a/extlibs/netty-buffer-4.1.6.Final.jar 
b/extlibs/netty-buffer-4.1.6.Final.jar deleted file mode 100644 index eb9571b132f76e9b58491abca25c87c4865a788f..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 GIT binary patch [stripped] diff --git a/extlibs/netty-codec-4.1.11.Final.jar b/extlibs/netty-codec-4.1.11.Final.jar new file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..4e6c46b275dd8a053804768779c1c72e54cc6449 GIT binary patch [stripped] diff --git a/extlibs/netty-codec-4.1.6.Final.jar b/extlibs/netty-codec-4.1.6.Final.jar deleted file mode 100644 index 46e6d3457e0de1631961b474936ee53d411c4ff9..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 GIT binary patch [stripped] diff --git a/extlibs/netty-common-4.1.11.Final.jar b/extlibs/netty-common-4.1.11.Final.jar new file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..c1db5980b47cf3db66b97591c8ded843f12bd12a GIT binary patch [stripped] diff --git a/extlibs/netty-common-4.1.6.Final.jar b/extlibs/netty-common-4.1.6.Final.jar deleted file mode 100644 index f119295ae4dada6c4224580d7f05148369fdcd3d..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 GIT binary patch [stripped] diff --git a/extlibs/netty-handler-4.1.11.Final.jar b/extlibs/netty-handler-4.1.11.Final.jar new file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..19c510eada2ebb2ce3ae6508caa762407f48d742 GIT binary patch [stripped] diff --git a/extlibs/netty-handler-4.1.6.Final.jar b/extlibs/netty-handler-4.1.6.Final.jar deleted file mode 100644 index 283ae99a7dd1036ca05c40bf06bbc06250d2eeea..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 GIT binary patch [stripped] diff --git a/extlibs/netty-resolver-4.1.11.Final.jar b/extlibs/netty-resolver-4.1.11.Final.jar new file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..41340f7b49e7df18e8a609697992da38ff07b95f GIT binary patch [stripped] diff --git a/extlibs/netty-resolver-4.1.6.Final.jar b/extlibs/netty-resolver-4.1.6.Final.jar deleted file mode 100644 index 
1a9975797464c9b9b56bc2d487509037bfd296df..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 GIT binary patch [stripped] diff --git a/extlibs/netty-transport-4.1.11.Final.jar b/extlibs/netty-transport-4.1.11.Final.jar new file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..59843cf314ebcb8f0ea7c5025a45a9c9cd1c6f10 GIT binary patch [stripped] diff --git a/extlibs/netty-transport-4.1.6.Final.jar b/extlibs/netty-transport-4.1.6.Final.jar deleted file mode 100644 index b9610fee2e6f3068a92f26cd636414741a37ac5a..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 GIT binary patch [stripped] diff --git a/extlibs/postgresql-42.1.1.jre7.jar b/extlibs/postgresql-42.1.1.jre7.jar new file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..99b60a33847013f7f9b9b532cc92eccdb4e8baec GIT binary patch [stripped] diff --git a/extlibs/postgresql-8.3-603.jdbc4.jar b/extlibs/postgresql-8.3-603.jdbc4.jar deleted file mode 100644 index 0bf0de0577a1e52b9206ce72a1617f18af20d2d7..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 GIT binary patch [stripped] -- Repository URL: https://hg.python.org/jython From jython-checkins at python.org Fri May 26 20:08:08 2017 From: jython-checkins at python.org (jim.baker) Date: Sat, 27 May 2017 00:08:08 +0000 Subject: [Jython-checkins] =?utf-8?q?jython=3A_Update_README_to_highlight?= =?utf-8?q?_one_vital_contributor=27s_role_in_getting_2=2E7=2E1_out?= Message-ID: <20170527000808.4618.ED7A63E56B7D8A51@psf.io> https://hg.python.org/jython/rev/ab5434be88aa changeset: 8098:ab5434be88aa user: Jim Baker date: Fri May 26 20:07:13 2017 -0400 summary: Update README to highlight one vital contributor's role in getting 2.7.1 out files: README.txt | 14 +++++++++----- 1 files changed, 9 insertions(+), 5 deletions(-) diff --git a/README.txt b/README.txt --- a/README.txt +++ b/README.txt @@ -31,8 +31,12 @@ Please see ACKNOWLEDGMENTS for details about Jython's copyright, license, contributors, and mailing lists; and NEWS for detailed release notes, including bugs fixed, 
backwards breaking changes, and -new features. Thanks go to Amobee (http://www.amobee.com/) for -sponsoring this release. We also deeply thank all who contribute to -Jython, including - but not limited to - bug reports, patches, pull -requests, documentation changes, support emails, and fantastic -conversation on Freenode at #jython. +new features. Thanks go to Google for sponsoring Stefan Richthofer for +the Google Summer of Code; there are many others to thanks, but +Stefan's work proved instrumental for getting 2.7.1 out, all in +preparation for actual work on JyNI for the summer of 2017 +(http://jyni.org/). We also deeply thank all who contribute to Jython, +including - but not limited to - bug reports, patches, pull requests, +documentation changes, support emails, and fantastic conversation on +Freenode at #jython. Please join us there for your questions and +answers! -- Repository URL: https://hg.python.org/jython From jython-checkins at python.org Sat May 27 13:38:49 2017 From: jython-checkins at python.org (jim.baker) Date: Sat, 27 May 2017 17:38:49 +0000 Subject: [Jython-checkins] =?utf-8?q?jython=3A_Typo=2C_formatting=2C_mino?= =?utf-8?q?r_wording_for_README?= Message-ID: <20170527173849.64963.7FF5524539B7DA97@psf.io> https://hg.python.org/jython/rev/ba0584414168 changeset: 8099:ba0584414168 user: Jim Baker date: Sat May 27 13:38:37 2017 -0400 summary: Typo, formatting, minor wording for README files: README.txt | 56 +++++++++++++++++++++-------------------- 1 files changed, 29 insertions(+), 27 deletions(-) diff --git a/README.txt b/README.txt --- a/README.txt +++ b/README.txt @@ -1,16 +1,17 @@ Jython: Python for the Java Platform -Welcome to Jython 2.7.1 release candidate 1! +Welcome to Jython 2.7.1 release candidate 2! -This is the first release candidate of the 2.7.1 version of Jython. Along with -language and runtime compatibility with CPython 2.7.1, Jython 2.7 provides -substantial support of the Python ecosystem. 
This includes built-in support of -pip/setuptools (you can use with bin/pip) and a native launcher for Windows -(bin/jython.exe), with the implication that you can finally install Jython -scripts on Windows. +This is the first release candidate of the 2.7.1 version of +Jython. Along with language and runtime compatibility with CPython +2.7.1, Jython 2.7 provides substantial support of the Python +ecosystem. This includes built-in support of pip/setuptools (you can +use with bin/pip) and a native launcher for Windows (bin/jython.exe), +with the implication that you can finally install Jython scripts on +Windows. -* Note that if you have JYTHON_HOME set, you should unset it to avoid problems -with the installer and pip/setuptools. +**Note that if you have JYTHON_HOME set, you should unset it to avoid +problems with the installer and pip/setuptools.** Jim Baker presented a talk at PyCon 2015 about Jython 2.7, including demos of new features: https://www.youtube.com/watch?v=hLm3garVQFo @@ -18,25 +19,26 @@ The release was compiled on OSX using JDK 7 and requires a minimum of Java 7 to run. -Please try this release out and report any bugs at http://bugs.jython.org -You can test your installation of Jython (not the standalone JAR) by running -the regression tests, with the command: +Please try this release out and report any bugs at +http://bugs.jython.org You can test your installation of Jython (not +the standalone jar) by running the regression tests, with the command: jython -m test.regrtest -e -m regrtest_memo.txt -For Windows, there is a simple script to do this: jython_regrtest.bat. In -either case, the memo file regrtest_memo.txt will be useful in the bug report -if you see test failures. The regression tests can take about half an hour. +For Windows, there is a simple script to do this: jython_regrtest.bat. +In either case, the memo file regrtest_memo.txt will be useful in the +bug report if you see test failures. 
The regression tests can take +about half an hour. -Please see ACKNOWLEDGMENTS for details about Jython's copyright, -license, contributors, and mailing lists; and NEWS for detailed -release notes, including bugs fixed, backwards breaking changes, and -new features. Thanks go to Google for sponsoring Stefan Richthofer for -the Google Summer of Code; there are many others to thanks, but -Stefan's work proved instrumental for getting 2.7.1 out, all in -preparation for actual work on JyNI for the summer of 2017 -(http://jyni.org/). We also deeply thank all who contribute to Jython, -including - but not limited to - bug reports, patches, pull requests, -documentation changes, support emails, and fantastic conversation on -Freenode at #jython. Please join us there for your questions and -answers! +See ACKNOWLEDGMENTS for details about Jython's copyright, license, +contributors, and mailing lists; and NEWS for detailed release notes, +including bugs fixed, backwards breaking changes, and new +features. Thanks go to Google for sponsoring Stefan Richthofer for the +Google Summer of Code; there are so many others to thank, but Stefan's +work proved instrumental for getting 2.7.1 out, all in preparation for +his actual work on JyNI for the summer of 2017 +(http://jyni.org/). Motivation helps! We also deeply thank all who +contribute to Jython, including - but not limited to - bug reports, +patches, pull requests, documentation changes, support emails, and +fantastic conversation on Freenode at #jython. Join us there for your +questions and answers! 
-- Repository URL: https://hg.python.org/jython From jython-checkins at python.org Mon May 29 13:54:53 2017 From: jython-checkins at python.org (frank.wierzbicki) Date: Mon, 29 May 2017 17:54:53 +0000 Subject: [Jython-checkins] =?utf-8?q?jython=3A_Update_versions_for_releas?= =?utf-8?q?e=2E?= Message-ID: <20170529175420.105638.C7422FD79CC07A9C@psf.io> https://hg.python.org/jython/rev/3a7f75bd075a changeset: 8100:3a7f75bd075a user: Frank Wierzbicki date: Mon May 29 17:53:59 2017 +0000 summary: Update versions for release. files: build.xml | 8 ++++---- 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/build.xml b/build.xml --- a/build.xml +++ b/build.xml @@ -84,15 +84,15 @@ [XML stripped] @@ -398,7 +398,7 @@ [XML stripped] ======================= -- Repository URL: https://hg.python.org/jython From jython-checkins at python.org Mon May 29 13:55:00 2017 From: jython-checkins at python.org (frank.wierzbicki) Date: Mon, 29 May 2017 17:55:00 +0000 Subject: [Jython-checkins] =?utf-8?q?jython=3A_Version_wording_change=2E?= Message-ID: <20170529175459.19536.59D7F638C97D0147@psf.io> https://hg.python.org/jython/rev/850c2491cb25 changeset: 8101:850c2491cb25 user: Frank Wierzbicki date: Mon May 29 17:54:39 2017 +0000 summary: Version wording change. files: README.txt | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/README.txt b/README.txt --- a/README.txt +++ b/README.txt @@ -2,7 +2,7 @@ Welcome to Jython 2.7.1 release candidate 2! -This is the first release candidate of the 2.7.1 version of +This is the second release candidate of the 2.7.1 version of Jython. Along with language and runtime compatibility with CPython 2.7.1, Jython 2.7 provides substantial support of the Python ecosystem.
This includes built-in support of pip/setuptools (you can -- Repository URL: https://hg.python.org/jython From jython-checkins at python.org Mon May 29 13:56:57 2017 From: jython-checkins at python.org (frank.wierzbicki) Date: Mon, 29 May 2017 17:56:57 +0000 Subject: [Jython-checkins] =?utf-8?q?jython=3A_Added_tag_v2=2E7=2E1rc2_fo?= =?utf-8?q?r_changeset_850c2491cb25?= Message-ID: <20170529175626.19170.9FB04DD0AA14FF11@psf.io> https://hg.python.org/jython/rev/f66327aa5de9 changeset: 8102:f66327aa5de9 user: Frank Wierzbicki date: Mon May 29 17:56:06 2017 +0000 summary: Added tag v2.7.1rc2 for changeset 850c2491cb25 files: .hgtags | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/.hgtags b/.hgtags --- a/.hgtags +++ b/.hgtags @@ -104,3 +104,4 @@ 03f4808038f8bbc246b6d6a022aecfde087eeb91 v2.7.1rc1 03f4808038f8bbc246b6d6a022aecfde087eeb91 v2.7.1rc1 330556fdad478b61f93a548643743c3d0214fd40 v2.7.1rc1 +850c2491cb25a54846ba0aedf70062074b12e673 v2.7.1rc2 -- Repository URL: https://hg.python.org/jython From jython-checkins at python.org Mon May 29 14:27:37 2017 From: jython-checkins at python.org (frank.wierzbicki) Date: Mon, 29 May 2017 18:27:37 +0000 Subject: [Jython-checkins] =?utf-8?q?jython=3A_Remove_explicit_reference_?= =?utf-8?q?to_CPython_2=2E7_micro_version=2E?= Message-ID: <20170529182633.18984.A316F83CF11CF0BF@psf.io> https://hg.python.org/jython/rev/d4cd06b8c8c7 changeset: 8103:d4cd06b8c8c7 user: Frank Wierzbicki date: Mon May 29 18:26:11 2017 +0000 summary: Remove explicit reference to CPython 2.7 micro version. files: README.txt | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/README.txt b/README.txt --- a/README.txt +++ b/README.txt @@ -4,7 +4,7 @@ This is the second release candidate of the 2.7.1 version of Jython. Along with language and runtime compatibility with CPython -2.7.1, Jython 2.7 provides substantial support of the Python +2.7, Jython 2.7 provides substantial support of the Python ecosystem. 
This includes built-in support of pip/setuptools (you can use with bin/pip) and a native launcher for Windows (bin/jython.exe), with the implication that you can finally install Jython scripts on -- Repository URL: https://hg.python.org/jython