[Jython-checkins] jython: _io.open() in Java

jeff.allen jython-checkins at python.org
Sun Dec 9 21:29:38 CET 2012


http://hg.python.org/jython/rev/3cce7c6141a3
changeset:   6892:3cce7c6141a3
user:        Jeff Allen <ja...py at farowl.co.uk>
date:        Thu Dec 06 07:29:57 2012 +0000
summary:
  _io.open() in Java
Eliminates the Python implementation of open() in _jyio. The Java implementation does not yet support file descriptors (neither int nor Jython-style). Failures in test_io are slightly up, at fail/error/skip = 8/58/99.
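
For readers unfamiliar with the contract that open() has to honour, the following is a minimal sketch (not part of this changeset) of how the type of the returned stream depends on the mode and buffering arguments, as described in the open() docstring. It assumes a CPython-compatible io module; stream_type is a hypothetical helper introduced only for illustration. The Java open() in this changeset is intended to follow the same rules, but only for file names, not file descriptors.

```python
import io
import os
import tempfile


def stream_type(mode, buffering=-1):
    """Open a scratch file with io.open() and return the class name of the stream.

    Hypothetical helper for illustration only; it is not part of Jython.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)  # only the path is needed; io.open() opens the file by name
    try:
        with io.open(path, mode, buffering) as f:
            return type(f).__name__
    finally:
        os.remove(path)


if __name__ == "__main__":
    # Binary modes yield a buffered object chosen by the read/write/update flags.
    for mode in ("rb", "wb", "ab", "r+b"):
        print(mode, stream_type(mode))
    # Text mode wraps the buffered object in a TextIOWrapper.
    print("r", stream_type("r"))
    # buffering=0 is only allowed in binary mode and returns the raw FileIO.
    print("rb, buffering=0", stream_type("rb", 0))
```

Conflicting mode letters such as "rw", or duplicates such as "rr", are expected to raise ValueError before any stream is created.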

files:
  Lib/_jyio.py                             |  217 +---------
  Lib/test/test_io.py                      |    3 -
  src/org/python/modules/_io/OpenMode.java |  242 ++++++++++
  src/org/python/modules/_io/PyFileIO.java |   20 +-
  src/org/python/modules/_io/_io.java      |  253 ++++++++++-
  5 files changed, 506 insertions(+), 229 deletions(-)


diff --git a/Lib/_jyio.py b/Lib/_jyio.py
--- a/Lib/_jyio.py
+++ b/Lib/_jyio.py
@@ -40,11 +40,10 @@
 __metaclass__ = type
 
 # open() uses st_blksize whenever we can
-DEFAULT_BUFFER_SIZE = 8 * 1024  # bytes
+from _io import DEFAULT_BUFFER_SIZE
 
 # NOTE: Base classes defined here are registered with the "official" ABCs
-# defined in io.py. We don't use real inheritance though, because we don't
-# want to inherit the C implementations.
+# defined in io.py.
 
 
 class BlockingIOError(IOError):
@@ -58,217 +57,7 @@
         self.characters_written = characters_written
 
 
-def open(file, mode="r", buffering=-1,
-         encoding=None, errors=None,
-         newline=None, closefd=True):
-
-    r"""Open file and return a stream.  Raise IOError upon failure.
-
-    file is either a text or byte string giving the name (and the path
-    if the file isn't in the current working directory) of the file to
-    be opened or an integer file descriptor of the file to be
-    wrapped. (If a file descriptor is given, it is closed when the
-    returned I/O object is closed, unless closefd is set to False.)
-
-    mode is an optional string that specifies the mode in which the file
-    is opened. It defaults to 'r' which means open for reading in text
-    mode.  Other common values are 'w' for writing (truncating the file if
-    it already exists), and 'a' for appending (which on some Unix systems,
-    means that all writes append to the end of the file regardless of the
-    current seek position). In text mode, if encoding is not specified the
-    encoding used is platform dependent. (For reading and writing raw
-    bytes use binary mode and leave encoding unspecified.) The available
-    modes are:
-
-    ========= ===============================================================
-    Character Meaning
-    --------- ---------------------------------------------------------------
-    'r'       open for reading (default)
-    'w'       open for writing, truncating the file first
-    'a'       open for writing, appending to the end of the file if it exists
-    'b'       binary mode
-    't'       text mode (default)
-    '+'       open a disk file for updating (reading and writing)
-    'U'       universal newline mode (for backwards compatibility; unneeded
-              for new code)
-    ========= ===============================================================
-
-    The default mode is 'rt' (open for reading text). For binary random
-    access, the mode 'w+b' opens and truncates the file to 0 bytes, while
-    'r+b' opens the file without truncation.
-
-    Python distinguishes between files opened in binary and text modes,
-    even when the underlying operating system doesn't. Files opened in
-    binary mode (appending 'b' to the mode argument) return contents as
-    bytes objects without any decoding. In text mode (the default, or when
-    't' is appended to the mode argument), the contents of the file are
-    returned as strings, the bytes having been first decoded using a
-    platform-dependent encoding or using the specified encoding if given.
-
-    buffering is an optional integer used to set the buffering policy.
-    Pass 0 to switch buffering off (only allowed in binary mode), 1 to select
-    line buffering (only usable in text mode), and an integer > 1 to indicate
-    the size of a fixed-size chunk buffer.  When no buffering argument is
-    given, the default buffering policy works as follows:
-
-    * Binary files are buffered in fixed-size chunks; the size of the buffer
-      is chosen using a heuristic trying to determine the underlying device's
-      "block size" and falling back on `io.DEFAULT_BUFFER_SIZE`.
-      On many systems, the buffer will typically be 4096 or 8192 bytes long.
-
-    * "Interactive" text files (files for which isatty() returns True)
-      use line buffering.  Other text files use the policy described above
-      for binary files.
-
-    encoding is the name of the encoding used to decode or encode the
-    file. This should only be used in text mode. The default encoding is
-    platform dependent, but any encoding supported by Python can be
-    passed.  See the codecs module for the list of supported encodings.
-
-    errors is an optional string that specifies how encoding errors are to
-    be handled---this argument should not be used in binary mode. Pass
-    'strict' to raise a ValueError exception if there is an encoding error
-    (the default of None has the same effect), or pass 'ignore' to ignore
-    errors. (Note that ignoring encoding errors can lead to data loss.)
-    See the documentation for codecs.register for a list of the permitted
-    encoding error strings.
-
-    newline controls how universal newlines works (it only applies to text
-    mode). It can be None, '', '\n', '\r', and '\r\n'.  It works as
-    follows:
-
-    * On input, if newline is None, universal newlines mode is
-      enabled. Lines in the input can end in '\n', '\r', or '\r\n', and
-      these are translated into '\n' before being returned to the
-      caller. If it is '', universal newline mode is enabled, but line
-      endings are returned to the caller untranslated. If it has any of
-      the other legal values, input lines are only terminated by the given
-      string, and the line ending is returned to the caller untranslated.
-
-    * On output, if newline is None, any '\n' characters written are
-      translated to the system default line separator, os.linesep. If
-      newline is '', no translation takes place. If newline is any of the
-      other legal values, any '\n' characters written are translated to
-      the given string.
-
-    If closefd is False, the underlying file descriptor will be kept open
-    when the file is closed. This does not work when a file name is given
-    and must be True in that case.
-
-    open() returns a file object whose type depends on the mode, and
-    through which the standard file operations such as reading and writing
-    are performed. When open() is used to open a file in a text mode ('w',
-    'r', 'wt', 'rt', etc.), it returns a TextIOWrapper. When used to open
-    a file in a binary mode, the returned class varies: in read binary
-    mode, it returns a BufferedReader; in write binary and append binary
-    modes, it returns a BufferedWriter, and in read/write mode, it returns
-    a BufferedRandom.
-
-    It is also possible to use a string or bytearray as a file for both
-    reading and writing. For strings StringIO can be used like a file
-    opened in a text mode, and for bytes a BytesIO can be used like a file
-    opened in a binary mode.
-    """
-    if not isinstance(file, (basestring, int, long)):
-        raise TypeError("invalid file: %r" % file)
-    if not isinstance(mode, basestring):
-        raise TypeError("invalid mode: %r" % mode)
-    if not isinstance(buffering, (int, long)):
-        raise TypeError("invalid buffering: %r" % buffering)
-    if encoding is not None and not isinstance(encoding, basestring):
-        raise TypeError("invalid encoding: %r" % encoding)
-    if errors is not None and not isinstance(errors, basestring):
-        raise TypeError("invalid errors: %r" % errors)
-    modes = set(mode)
-    if modes - set("arwb+tU") or len(mode) > len(modes):
-        raise ValueError("invalid mode: %r" % mode)
-    reading = "r" in modes
-    writing = "w" in modes
-    appending = "a" in modes
-    updating = "+" in modes
-    text = "t" in modes
-    binary = "b" in modes
-    if "U" in modes:
-        if writing or appending:
-            raise ValueError("can't use U and writing mode at once")
-        reading = True
-    if text and binary:
-        raise ValueError("can't have text and binary mode at once")
-    if reading + writing + appending > 1:
-        raise ValueError("can't have read/write/append mode at once")
-    if not (reading or writing or appending):
-        raise ValueError("must have exactly one of read/write/append mode")
-    if binary and encoding is not None:
-        raise ValueError("binary mode doesn't take an encoding argument")
-    if binary and errors is not None:
-        raise ValueError("binary mode doesn't take an errors argument")
-    if binary and newline is not None:
-        raise ValueError("binary mode doesn't take a newline argument")
-    raw = FileIO(file,
-                 (reading and "r" or "") +
-                 (writing and "w" or "") +
-                 (appending and "a" or "") +
-                 (updating and "+" or ""),
-                 closefd)
-    line_buffering = False
-    if buffering == 1 or buffering < 0 and raw.isatty():
-        buffering = -1
-        line_buffering = True
-    if buffering < 0:
-        buffering = DEFAULT_BUFFER_SIZE
-        try:
-            bs = os.fstat(raw.fileno()).st_blksize
-        except (os.error, AttributeError):
-            pass
-        else:
-            if bs > 1:
-                buffering = bs
-    if buffering < 0:
-        raise ValueError("invalid buffering size")
-    if buffering == 0:
-        if binary:
-            return raw
-        raise ValueError("can't have unbuffered text I/O")
-    if updating:
-        buffer = BufferedRandom(raw, buffering)
-    elif writing or appending:
-        buffer = BufferedWriter(raw, buffering)
-    elif reading:
-        buffer = BufferedReader(raw, buffering)
-    else:
-        raise ValueError("unknown mode: %r" % mode)
-    if binary:
-        return buffer
-    text = TextIOWrapper(buffer, encoding, errors, newline, line_buffering)
-    text.mode = mode
-    return text
-
-
-class DocDescriptor:
-    """Helper for builtins.open.__doc__
-    """
-    def __get__(self, obj, typ):
-        return (
-            "open(file, mode='r', buffering=-1, encoding=None, "
-                 "errors=None, newline=None, closefd=True)\n\n" +
-            open.__doc__)
-
-
-class OpenWrapper:
-    """Wrapper for builtins.open
-
-    Trick so that open won't become a bound method when stored
-    as a class variable (as dbm.dumb does).
-
-    See initstdio() in Python/pythonrun.c.
-    """
-    __doc__ = DocDescriptor()
-
-    def __new__(cls, *args, **kwargs):
-        return open(*args, **kwargs)
-
-
-from _io import UnsupportedOperation
+from _io import (open, UnsupportedOperation)
 
 
 class _IOBase:
diff --git a/Lib/test/test_io.py b/Lib/test/test_io.py
--- a/Lib/test/test_io.py
+++ b/Lib/test/test_io.py
@@ -2951,9 +2951,6 @@
     py_io_ns.update((x.__name__, globs["Py" + x.__name__]) for x in mocks)
     # Avoid turning open into a bound method.
     py_io_ns["open"] = pyio.OpenWrapper
-    # XXX: While we use _jyio.py, the same trick is necessary for it too
-    import _jyio                              # XXX
-    c_io_ns["open"] = _jyio.OpenWrapper       # XXX
     for test in tests:
         if test.__name__.startswith("C"):
             for name, obj in c_io_ns.items():
diff --git a/src/org/python/modules/_io/OpenMode.java b/src/org/python/modules/_io/OpenMode.java
new file mode 100644
--- /dev/null
+++ b/src/org/python/modules/_io/OpenMode.java
@@ -0,0 +1,242 @@
+package org.python.modules._io;
+
+import org.python.core.Py;
+import org.python.core.PyException;
+
+/**
+ * An object able to check a file access mode provided as a String and represent it as boolean
+ * attributes and in a normalised form. Such a string is the mode argument of the several open()
+ * functions available in Python and certain constructors for stream-like objects.
+ */
+public class OpenMode {
+
+    /** Original string supplied as the mode */
+    public final String originalModeString;
+
+    /** Whether this file is opened for reading ('r') */
+    public boolean reading;
+
+    /** Whether this file is opened for writing ('w') */
+    public boolean writing;
+
+    /** Whether this file is opened in appending mode ('a') */
+    public boolean appending;
+
+    /** Whether this file is opened for updating ('+') */
+    public boolean updating;
+
+    /** Whether this file is opened in binary mode ('b') */
+    public boolean binary;
+
+    /** Whether this file is opened in text mode ('t') */
+    public boolean text;
+
+    /** Whether this file is opened in universal newlines mode ('U') */
+    public boolean universal;
+
+    /** Whether the mode contained some symbol other than the allowed ones */
+    public boolean other;
+
+    /** Set true when any invalid symbol or combination is discovered */
+    public boolean invalid;
+
+    /**
+     * Error message describing the way in which the mode is invalid, or null if no problem has been
+     * found. This field may be set by the {@link #validate()} method (or an override of it), or by
+     * client code. A non-null value will cause {@link #checkValid()} to raise a
+     * <code>ValueError</code>.
+     */
+    public String message;
+
+    /**
+     * Decode the given string to an OpenMode object, checking for duplicate or unrecognised mode
+     * letters. Valid letters are those in "rwa+btU". Errors in the mode string do not raise an
+     * exception: they are recorded in the {@link #invalid} and {@link #other} flags. After
+     * construction, a client should always call {@link #checkValid()} to complete validity checks.
+     *
+     * @param mode
+     */
+    public OpenMode(String mode) {
+
+        originalModeString = mode;
+        int n = mode.length();
+        boolean duplicate = false;
+
+        for (int i = 0; i < n; i++) {
+            char c = mode.charAt(i);
+
+            switch (c) {
+                case 'r':
+                    duplicate = reading;
+                    reading = true;
+                    break;
+                case 'w':
+                    duplicate = writing;
+                    writing = true;
+                    break;
+                case 'a':
+                    duplicate = appending;
+                    appending = true;
+                    break;
+                case '+':
+                    duplicate = updating;
+                    updating = true;
+                    break;
+                case 't':
+                    duplicate = text;
+                    text = true;
+                    break;
+                case 'b':
+                    duplicate = binary;
+                    binary = true;
+                    break;
+                case 'U':
+                    duplicate = universal;
+                    universal = true;
+                    break;
+                default:
+                    other = true;
+            }
+
+            // duplicate is set iff c was encountered previously
+            if (duplicate) {
+                invalid = true;
+                break;
+            }
+        }
+
+    }
+
+    /**
+     * Adjust and validate the flags decoded from the mode string. The method affects the flags
+     * where the presence of one flag implies another, then if the {@link #invalid} flag is not
+     * already <code>true</code>, it checks the validity of the flags against combinations allowed
+     * by the Python <code>io.open()</code> function. In the case of a violation, it sets the
+     * <code>invalid</code> flag, and sets {@link #message} to a descriptive message. The point of
+     * the qualification "if the <code>invalid</code> flag is not already <code>true</code>" is that
+     * the message should always describe the first problem discovered. If left blank, as in fact
+     * the constructor does, it will be filled by the generic message when {@link #checkValid()} is
+     * finally called. Clients may override this method (by sub-classing) to express the validation
+     * appropriate to their context.
+     * <p>
+     * The invalid combinations enforced here are those for the "raw" (ie non-text) file types:
+     * <ul>
+     * <li>universal & (writing | appending),</li>
+     * <li>text & binary,</li>
+     * <li>reading & writing,</li>
+     * <li>appending & (reading | writing)</li>
+     * </ul>
+     * See also {@link #validate(String, String, String)} for additional checks relevant to text
+     * files.
+     */
+    public void validate() {
+
+        // Implications
+        reading |= universal;
+
+        // Standard tests
+        if (!invalid) {
+            if (universal && (writing || appending)) {
+                message = "can't use U and writing mode at once";
+            } else if (text && binary) {
+                message = "can't have text and binary mode at once";
+            } else if (reading && writing || appending && (reading || writing)) {
+                message = "must have exactly one of read/write/append mode";
+            }
+            invalid |= (message != null);
+        }
+    }
+
+    /**
+     * Perform additional validation of the flags relevant to text files. If {@link #invalid} is not
+     * already <code>true</code>, and the mode includes {@link #binary}, then all the arguments to
+     * this call must be <code>null</code>. If the criterion is not met, then on return from the
+     * method, <code>invalid==true</code> and {@link #message} is set to a standard error message.
+     * This is the standard additional validation applicable to text files. (By "standard" we mean
+     * the test and messages that CPython <code>io.open</code> uses.)
+     *
+     * @param encoding argument to <code>open()</code>
+     * @param errors argument to <code>open()</code>
+     * @param newline argument to <code>open()</code>
+     */
+    public void validate(String encoding, String errors, String newline) {
+
+        // If the basic tests passed and binary mode is set, check that the text arguments are null
+        if (!invalid && binary) {
+            if (encoding != null) {
+                message = "binary mode doesn't take an encoding argument";
+            } else if (errors != null) {
+                message = "binary mode doesn't take an errors argument";
+            } else if (newline != null) {
+                message = "binary mode doesn't take a newline argument";
+            }
+            invalid = (message != null);
+        }
+    }
+
+    /**
+     * Call {@link #validate()} and raise an exception if the mode string is not valid, as signalled
+     * by either {@link #invalid} or {@link #other} being <code>true</code> after that call. If no
+     * more specific message has been assigned in {@link #message}, report the original mode
+     * string.
+     *
+     * @throws PyException (ValueError) if the mode string was invalid.
+     */
+    public void checkValid() throws PyException {
+
+        // Actually perform the check
+        validate();
+
+        // The 'other' flag reports alien symbols in the original mode string
+        invalid |= other;
+
+        // Finally, if invalid, report this as an error
+        if (invalid) {
+            if (message == null) {
+                // Duplicates discovered in the constructor or invalid symbols
+                message = String.format("invalid mode: '%.20s'", originalModeString);
+            }
+            throw Py.ValueError(message);
+        }
+    }
+
+    public String rawmode() {
+        StringBuilder m = new StringBuilder(2);
+        if (appending) {
+            m.append('a');
+        } else if (writing) {
+            m.append('w');
+        } else {
+            m.append('r');
+        }
+        if (updating) {
+            m.append('+');
+        }
+        return m.toString();
+    }
+
+    @Override
+    public String toString() {
+        StringBuilder m = new StringBuilder(4);
+        if (appending) {
+            m.append('a');
+        } else if (writing) {
+            m.append('w');
+        } else {
+            m.append('r');
+        }
+        if (updating) {
+            m.append('+');
+        }
+        if (text) {
+            m.append('t');
+        } else if (binary) {
+            m.append('b');
+        }
+        if (universal) {
+            m.append('U');
+        }
+        return m.toString();
+    }
+
+}
diff --git a/src/org/python/modules/_io/PyFileIO.java b/src/org/python/modules/_io/PyFileIO.java
--- a/src/org/python/modules/_io/PyFileIO.java
+++ b/src/org/python/modules/_io/PyFileIO.java
@@ -1,4 +1,4 @@
-/* Copyright (c) Jython Developers */
+/* Copyright (c)2012 Jython Developers */
 package org.python.modules._io;
 
 import java.nio.ByteBuffer;
@@ -71,9 +71,10 @@
         String mode = ap.getString(1, "r");
         boolean closefd = Py.py2boolean(ap.getPyObject(2, Py.True));
         // TODO: make this work with file channels so closefd=False can be used
-        if (!closefd)
-        	throw Py.ValueError("Cannot use closefd=False with file name");
-        
+        if (!closefd) {
+            throw Py.ValueError("Cannot use closefd=False with file name");
+        }
+
         FileIO___init__((PyString)name, mode, closefd);
         closer = new Closer(file, Py.getSystemState());
     }
@@ -83,7 +84,7 @@
         this.name = name;
         this.mode = mode;
         this.closefd = closefd;
-        this.file = new FileIO((PyString) name, mode.replaceAll("b", ""));
+        this.file = new FileIO(name, mode.replaceAll("b", ""));
     }
 
     private String parseMode(String mode) {
@@ -141,8 +142,9 @@
 
     @ExposedMethod(doc = "True if file supports random-access.")
     final boolean FileIO_seekable() {
-    	if (seekable == null)
-    		seekable = file.seek(0, 0) >= 0;
+    	if (seekable == null) {
+            seekable = file.seek(0, 0) >= 0;
+        }
     	return seekable;
     }
 
@@ -158,8 +160,9 @@
 
     @ExposedMethod(defaults = {"null"}, doc = BuiltinDocs.file_truncate_doc)
     final PyObject FileIO_truncate(PyObject position) {
-        if (position == null)
+        if (position == null) {
             return Py.java2py(FileIO_truncate());
+        }
     	return Py.java2py(FileIO_truncate(position.asLong()));
     }
 
@@ -311,6 +314,7 @@
         }
 
         /** For closing as part of a shutdown process */
+        @Override
         public Void call() {
             file.close();
             sys = null;
diff --git a/src/org/python/modules/_io/_io.java b/src/org/python/modules/_io/_io.java
--- a/src/org/python/modules/_io/_io.java
+++ b/src/org/python/modules/_io/_io.java
@@ -1,22 +1,23 @@
 /* Copyright (c)2012 Jython Developers */
 package org.python.modules._io;
 
+import org.python.core.ArgParser;
 import org.python.core.ClassDictInit;
 import org.python.core.Py;
 import org.python.core.PyException;
+import org.python.core.PyInteger;
 import org.python.core.PyObject;
 import org.python.core.PyString;
 import org.python.core.PyStringMap;
 import org.python.core.PyType;
 import org.python.core.imp;
+import org.python.core.io.IOBase;
 
 /**
- * The Python _io module.
+ * The Python _io module implemented in Java.
  */
 public class _io implements ClassDictInit {
 
-    public static final PyString __doc__ = new PyString("Java implementation of _io.");
-
     /**
      * This method is called when the module is loaded, to populate the namespace (dictionary) of
      * the module. The dictionary has been initialised at this point reflectively from the methods
@@ -26,7 +27,8 @@
      */
     public static void classDictInit(PyObject dict) {
         dict.__setitem__("__name__", new PyString("_io"));
-        dict.__setitem__("__doc__", __doc__);
+        dict.__setitem__("__doc__", new PyString(__doc__));
+        dict.__setitem__("DEFAULT_BUFFER_SIZE", DEFAULT_BUFFER_SIZE);
         dict.__setitem__("FileIO", PyFileIO.TYPE);
 
         // Define UnsupportedOperation exception by constructing the type
@@ -75,4 +77,247 @@
         return type;
     }
 
+    /** Default buffer size obtained from {@link IOBase#DEFAULT_BUFFER_SIZE}. */
+    private static final int _DEFAULT_BUFFER_SIZE = IOBase.DEFAULT_BUFFER_SIZE;
+
+    /** Default buffer size for export. */
+    public static final PyInteger DEFAULT_BUFFER_SIZE = new PyInteger(_DEFAULT_BUFFER_SIZE);
+
+    /**
+     * Open file and return a stream. Raise IOError upon failure. This is a port to Java of the
+     * CPython _io.open (Modules/_io/_iomodule.c) following the same logic, but expressed with the
+     * benefits of Java syntax.
+     *
+     * @param args array of arguments from Python call via Jython framework
+     * @param kwds array of keywords from Python call via Jython framework
+     * @return the stream object
+     */
+    public static PyObject open(PyObject[] args, String[] kwds) {
+
+        // Get the arguments to variables
+        ArgParser ap = new ArgParser("open", args, kwds, openKwds, 1);
+        PyObject file = ap.getPyObject(0);
+        String m = ap.getString(1, "r");
+        int buffering = ap.getInt(2, -1);
+        final String encoding = ap.getString(3, null);
+        final String errors = ap.getString(4, null);
+        final String newline = ap.getString(5, null);
+        boolean closefd = Py.py2boolean(ap.getPyObject(6, Py.True));
+
+        // Decode the mode string
+        OpenMode mode = new OpenMode(m) {
+
+            @Override
+            public void validate() {
+                super.validate();
+                validate(encoding, errors, newline);
+            }
+        };
+
+        mode.checkValid();
+
+        int line_buffering;
+
+        /*
+         * Create the Raw file stream. Let the constructor deal with the variants and argument
+         * checking.
+         */
+        // XXX open() doesn't yet support file descriptors or the "digested" mode
+        PyFileIO raw = new PyFileIO(file.toString(), mode.rawmode(), closefd);
+
+        // XXX Can this work: boolean isatty = raw.isatty() ? Or maybe:
+        // PyObject res = PyObject_CallMethod(raw, "isatty", NULL);
+        boolean isatty = false;
+
+        /*
+         * Work out a felicitous buffer size
+         */
+        if (buffering == 1 || (buffering < 0 && isatty)) {
+            buffering = -1;
+            line_buffering = 1;
+        } else {
+            line_buffering = 0;
+        }
+
+        if (buffering < 0) {
+            // Try to establish the default buffer size for this file using the OS.
+            buffering = _DEFAULT_BUFFER_SIZE;
+            // PyObject res = PyObject_CallMethod(raw, "fileno", NULL);
+            // if (fstat(fileno, &st) >= 0 && st.st_blksize > 1)
+            // buffering = st.st_blksize;
+        }
+
+        if (buffering < 0) {
+            throw Py.ValueError("invalid buffering size");
+        }
+
+        // If not buffering, return the raw file object
+        if (buffering == 0) {
+            if (!mode.binary) {
+                throw Py.ValueError("can't have unbuffered text I/O");
+            }
+            return raw;
+        }
+
+        // We are buffering, so wrap raw into a buffered file
+        PyObject bufferType = null;
+        PyObject io = imp.load("io");
+
+        if (mode.updating) {
+            bufferType = io.__getattr__("BufferedRandom");
+        } else if (mode.writing || mode.appending) {
+            bufferType = io.__getattr__("BufferedWriter");
+        } else if (mode.reading) {
+            bufferType = io.__getattr__("BufferedReader");
+        } else {
+            // Can it really still go wrong? I don't think so.
+            throw Py.ValueError(String.format("unknown mode: '%s'", mode.originalModeString));
+        }
+
+        PyInteger pyBuffering = new PyInteger(buffering);
+        PyObject buffer = bufferType.__call__(raw, pyBuffering);
+
+        // If binary, return the buffered file
+        if (mode.binary) {
+            return buffer;
+        }
+
+        /* We are opening in text mode, so wrap buffer into a TextIOWrapper */
+        PyObject textType = io.__getattr__("TextIOWrapper");
+        PyObject[] textArgs =
+                {buffer, ap.getPyObject(3, Py.None), ap.getPyObject(4, Py.None),
+                        ap.getPyObject(5, Py.None), Py.newInteger(line_buffering)};
+        PyObject wrapper = textType.__call__(textArgs);
+
+        return wrapper;
+    }
+
+    private static final String[] openKwds = {"file", "mode", "buffering", "encoding", "errors",
+            "newline", "closefd"};
+
+    public static final String __doc__ =
+            "The io module provides the Python interfaces to stream handling. The\n"
+                    + "builtin open function is defined in this module.\n" + "\n"
+                    + "At the top of the I/O hierarchy is the abstract base class IOBase. It\n"
+                    + "defines the basic interface to a stream. Note, however, that there is no\n"
+                    + "separation between reading and writing to streams; implementations are\n"
+                    + "allowed to throw an IOError if they do not support a given operation.\n"
+                    + "\n"
+                    + "Extending IOBase is RawIOBase which deals simply with the reading and\n"
+                    + "writing of raw bytes to a stream. FileIO subclasses RawIOBase to provide\n"
+                    + "an interface to OS files.\n" + "\n"
+                    + "BufferedIOBase deals with buffering on a raw byte stream (RawIOBase). Its\n"
+                    + "subclasses, BufferedWriter, BufferedReader, and BufferedRWPair buffer\n"
+                    + "streams that are readable, writable, and both respectively.\n"
+                    + "BufferedRandom provides a buffered interface to random access\n"
+                    + "streams. BytesIO is a simple stream of in-memory bytes.\n" + "\n"
+                    + "Another IOBase subclass, TextIOBase, deals with the encoding and decoding\n"
+                    + "of streams into text. TextIOWrapper, which extends it, is a buffered text\n"
+                    + "interface to a buffered raw stream (`BufferedIOBase`). Finally, StringIO\n"
+                    + "is an in-memory stream for text.\n" + "\n"
+                    + "Argument names are not part of the specification, and only the arguments\n"
+                    + "of open() are intended to be used as keyword arguments.\n";
+
+// + "\n"
+// + "data:\n"
+// + "\n"
+// + "DEFAULT_BUFFER_SIZE\n"
+// + "\n"
+// + "   An int containing the default buffer size used by the module's buffered\n"
+// + "   I/O classes. open() uses the file's blksize (as obtained by os.stat) if\n"
+// + "   possible.\n";
+
+    public static final String __doc__open =
+            "Open file and return a stream.  Raise IOError upon failure.\n" + "\n"
+                    + "file is either a text or byte string giving the name (and the path\n"
+                    + "if the file isn't in the current working directory) of the file to\n"
+                    + "be opened or an integer file descriptor of the file to be\n"
+                    + "wrapped. (If a file descriptor is given, it is closed when the\n"
+                    + "returned I/O object is closed, unless closefd is set to False.)\n" + "\n"
+                    + "mode is an optional string that specifies the mode in which the file\n"
+                    + "is opened. It defaults to 'r' which means open for reading in text\n"
+                    + "mode.  Other common values are 'w' for writing (truncating the file if\n"
+                    + "it already exists), and 'a' for appending (which on some Unix systems,\n"
+                    + "means that all writes append to the end of the file regardless of the\n"
+                    + "current seek position). In text mode, if encoding is not specified the\n"
+                    + "encoding used is platform dependent. (For reading and writing raw\n"
+                    + "bytes use binary mode and leave encoding unspecified.) The available\n"
+                    + "modes are:\n" + "\n"
+                    + "========= ===============================================================\n"
+                    + "Character Meaning\n"
+                    + "--------- ---------------------------------------------------------------\n"
+                    + "'r'       open for reading (default)\n"
+                    + "'w'       open for writing, truncating the file first\n"
+                    + "'a'       open for writing, appending to the end of the file if it exists\n"
+                    + "'b'       binary mode\n" + "'t'       text mode (default)\n"
+                    + "'+'       open a disk file for updating (reading and writing)\n"
+                    + "'U'       universal newline mode (for backwards compatibility; unneeded\n"
+                    + "          for new code)\n"
+                    + "========= ===============================================================\n"
+                    + "\n"
+                    + "The default mode is 'rt' (open for reading text). For binary random\n"
+                    + "access, the mode 'w+b' opens and truncates the file to 0 bytes, while\n"
+                    + "'r+b' opens the file without truncation.\n" + "\n"
+                    + "Python distinguishes between files opened in binary and text modes,\n"
+                    + "even when the underlying operating system doesn't. Files opened in\n"
+                    + "binary mode (appending 'b' to the mode argument) return contents as\n"
+                    + "bytes objects without any decoding. In text mode (the default, or when\n"
+                    + "'t' is appended to the mode argument), the contents of the file are\n"
+                    + "returned as strings, the bytes having been first decoded using a\n"
+                    + "platform-dependent encoding or using the specified encoding if given.\n"
+                    + "\n" + "buffering is an optional integer used to set the buffering policy.\n"
+                    + "Pass 0 to switch buffering off (only allowed in binary mode), 1 to select\n"
+                    + "line buffering (only usable in text mode), and an integer > 1 to indicate\n"
+                    + "the size of a fixed-size chunk buffer.  When no buffering argument is\n"
+                    + "given, the default buffering policy works as follows:\n" + "\n"
+                    + "* Binary files are buffered in fixed-size chunks; the size of the buffer\n"
+                    + "  is chosen using a heuristic trying to determine the underlying device's\n"
+                    + "  \"block size\" and falling back on `io.DEFAULT_BUFFER_SIZE`.\n"
+                    + "  On many systems, the buffer will typically be 4096 or 8192 bytes long.\n"
+                    + "\n"
+                    + "* \"Interactive\" text files (files for which isatty() returns True)\n"
+                    + "  use line buffering.  Other text files use the policy described above\n"
+                    + "  for binary files.\n" + "\n"
+                    + "encoding is the name of the encoding used to decode or encode the\n"
+                    + "file. This should only be used in text mode. The default encoding is\n"
+                    + "platform dependent, but any encoding supported by Python can be\n"
+                    + "passed.  See the codecs module for the list of supported encodings.\n"
+                    + "\n"
+                    + "errors is an optional string that specifies how encoding errors are to\n"
+                    + "be handled---this argument should not be used in binary mode. Pass\n"
+                    + "'strict' to raise a ValueError exception if there is an encoding error\n"
+                    + "(the default of None has the same effect), or pass 'ignore' to ignore\n"
+                    + "errors. (Note that ignoring encoding errors can lead to data loss.)\n"
+                    + "See the documentation for codecs.register for a list of the permitted\n"
+                    + "encoding error strings.\n" + "\n"
+                    + "newline controls how universal newlines works (it only applies to text\n"
+                    + "mode). It can be None, '', '\\n', '\\r', and '\\r\\n'.  It works as\n"
+                    + "follows:\n" + "\n"
+                    + "* On input, if newline is None, universal newlines mode is\n"
+                    + "  enabled. Lines in the input can end in '\\n', '\\r', or '\\r\\n', and\n"
+                    + "  these are translated into '\\n' before being returned to the\n"
+                    + "  caller. If it is '', universal newline mode is enabled, but line\n"
+                    + "  endings are returned to the caller untranslated. If it has any of\n"
+                    + "  the other legal values, input lines are only terminated by the given\n"
+                    + "  string, and the line ending is returned to the caller untranslated.\n"
+                    + "\n" + "* On output, if newline is None, any '\\n' characters written are\n"
+                    + "  translated to the system default line separator, os.linesep. If\n"
+                    + "  newline is '', no translation takes place. If newline is any of the\n"
+                    + "  other legal values, any '\\n' characters written are translated to\n"
+                    + "  the given string.\n" + "\n"
+                    + "If closefd is False, the underlying file descriptor will be kept open\n"
+                    + "when the file is closed. This does not work when a file name is given\n"
+                    + "and must be True in that case.\n" + "\n"
+                    + "open() returns a file object whose type depends on the mode, and\n"
+                    + "through which the standard file operations such as reading and writing\n"
+                    + "are performed. When open() is used to open a file in a text mode ('w',\n"
+                    + "'r', 'wt', 'rt', etc.), it returns a TextIOWrapper. When used to open\n"
+                    + "a file in a binary mode, the returned class varies: in read binary\n"
+                    + "mode, it returns a BufferedReader; in write binary and append binary\n"
+                    + "modes, it returns a BufferedWriter, and in read/write mode, it returns\n"
+                    + "a BufferedRandom.\n" + "\n"
+                    + "It is also possible to use a string or bytearray as a file for both\n"
+                    + "reading and writing. For strings StringIO can be used like a file\n"
+                    + "opened in a text mode, and for bytes a BytesIO can be used like a file\n"
+                    + "opened in a binary mode.\n";
 }

-- 
Repository URL: http://hg.python.org/jython

