[Python-bugs-list] [ python-Bugs-451890 ] Building with Large File Support fails
noreply@sourceforge.net
Sun, 09 Sep 2001 19:35:14 -0700
Bugs item #451890, was opened at 2001-08-16 18:00
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=451890&group_id=5470
Category: Build
Group: Python 2.2
Status: Open
Resolution: None
Priority: 5
Submitted By: Gerhard Häring (ghaering)
Assigned to: Guido van Rossum (gvanrossum)
Summary: Building with Large File Support fails
Initial Comment:
(At least) on Linux, building 2.2-HEAD fails when
building with Large File Support. The failure is in
Objects/fileobject.c, function _portable_ftell, line
262.
----------------------------------------------------------------------
>Comment By: Gerhard Häring (ghaering)
Date: 2001-09-09 19:35
Message:
Logged In: YES
user_id=163326
Just a quick update. I've tested your latest CVS changes and
I can seek and write with offsets above sys.maxint just
fine now. Out of the box (on my Linux). The filesystem must
support LFS, too, of course. Even reiserfs doesn't support
that w/o formatting the partition with "-v 2". I can't speak
for ext2, but I guess you must format the partition with
some special option, to support files > 2 GB.
(Just FYI, to save some time: for just testing seek, you can
open "/dev/null" or "/dev/zero".)
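A minimal sketch of the seek-only check mentioned above, written for a current Python 3 rather than the 2.2 interpreter under discussion. The helper name can_seek_past_2gb is hypothetical; the path is a parameter so that either a device file or an ordinary file can be probed.

```python
TWO_GB = 2 ** 31  # first offset that no longer fits in a signed 32-bit long


def can_seek_past_2gb(path):
    """Return True if seeking beyond 2 GB on `path` raises no error.

    Only the seek is attempted; nothing is written, so no disk space is used.
    """
    try:
        with open(path, "rb") as f:
            f.seek(TWO_GB + 1)
        return True
    except (OSError, OverflowError, ValueError):
        return False
```

For example, can_seek_past_2gb("/dev/zero") exercises only the seek path. It says nothing about whether the filesystem can actually store a file that large.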
----------------------------------------------------------------------
Comment By: Guido van Rossum (gvanrossum)
Date: 2001-09-09 19:09
Message:
Logged In: YES
user_id=6380
I've done this in CVS now, but now the largefile build even
triggers on systems where the kernel (or the filesystem?)
doesn't support large files, but glibc does. Seeking to a
position > 2GB works, but writing triggers an IOError
exception on flush() or close(). In some sense this is right
(the binary might be moved to another kernel). But on such a
system test_largefile now fails, because its test for
largefile "support" isn't good enough. What to do next? Put
some test for a largefile-supporting kernel in the configure
script, or in test_largefile?
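One way such a runtime test could look, as a hedged sketch in current Python 3 (not the code that went into test_largefile): seek past 2 GB, write one byte, and treat an error on write, flush() or close() as "no large file support here". The function name and the default offset are assumptions made for illustration.

```python
import os
import tempfile


def largefile_writes_work(directory=None, offset=2 ** 31 + 1):
    """Seek to `offset`, write one byte and flush; report whether it worked."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            try:
                f.seek(offset)
                f.write(b"x")
                f.flush()  # the failure described above surfaces here or on close()
            except (OSError, OverflowError):
                return False
        return True
    except OSError:
        # close() failed while flushing the buffered byte
        return False
    finally:
        os.remove(path)
```

On filesystems with sparse-file support the probe uses almost no disk space. Elsewhere it may briefly allocate about 2 GB, so a test suite would probably want to make it opt-in.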
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2001-09-09 01:44
Message:
Logged In: YES
user_id=21627
I'd recommend to always ac_define _LARGEFILE_SOURCE and
_FILE_OFFSET_BITS=64. It will be very hard to find in a
test what exactly they change. Instead, we should trust
that if they are recognized at all, they do the right
thing. If there is an early AC_DEFINE for them, they will
get into confdefs.h and influence the outcome of all later
tests (e.g. the one measuring off_t).
----------------------------------------------------------------------
Comment By: Guido van Rossum (gvanrossum)
Date: 2001-09-08 21:18
Message:
Logged In: YES
user_id=6380
Interesting! My test script for large files worked, so
_FILE_OFFSET_BITS and _LARGEFILE_SOURCE are defined in your
pyconfig.h, but apparently the test for
HAVE_LARGEFILE_SUPPORT failed, because that symbol is *not*
set in your pyconfig.h -- and everything else keys off it!
So the only symbol you really need to pass is
HAVE_LARGEFILE_SUPPORT, and as a workaround you can define
that yourself in pyconfig.h.
This symbol is defined by a bit of configure code that looks
like this in the m4 input:
    AC_MSG_CHECKING(whether to enable large file support)
    if test "$have_long_long" = yes -a \
        "$ac_cv_sizeof_off_t" -gt "$ac_cv_sizeof_long" -a \
        "$ac_cv_sizeof_long_long" -ge "$ac_cv_sizeof_off_t"; then
      AC_DEFINE(HAVE_LARGEFILE_SUPPORT)
      AC_MSG_RESULT(yes)
    else
      AC_MSG_RESULT(no)
    fi
Can you upload config.status? That should tell me which of
those symbols doesn't have the right value. My guess is that
off_t is measured at 32 bits because _FILE_OFFSET_BITS is
not defined as 64 at the point that the symbol is measured.
So I have to tweak more stuff... Back to the drawing board.
:-(
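To make the guess above concrete, here is the same three-part condition mirrored in Python. This is only an illustration of the logic, not part of the configure script; the function name is hypothetical and the byte sizes in the usage note are assumed values for a typical 32-bit Linux.

```python
def largefile_support_enabled(have_long_long, sizeof_off_t,
                              sizeof_long, sizeof_long_long):
    """Mirror the configure test: off_t must be wider than long,
    and long long must be wide enough to hold an off_t."""
    return bool(have_long_long
                and sizeof_off_t > sizeof_long
                and sizeof_long_long >= sizeof_off_t)
```

With off_t measured as 4 bytes (because _FILE_OFFSET_BITS=64 was not yet in effect), largefile_support_enabled(True, 4, 4, 8) is False, so HAVE_LARGEFILE_SUPPORT is never defined. With off_t measured as 8 bytes, largefile_support_enabled(True, 8, 4, 8) is True.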
----------------------------------------------------------------------
Comment By: Gerhard Häring (ghaering)
Date: 2001-09-08 13:10
Message:
Logged In: YES
user_id=163326
To find out the glibc version, you can invoke "glibcbug".
My default bug report says:
...
Release: libc-2.2.2
No, I don't get LFS support without manual work, with
CVS-HEAD and 2.2a3. I've uploaded my entire config.log file,
maybe you can make some sense of it. (It does find ftello and
fseeko, but my pyconfig.h doesn't define the needed macros.)
Come to think of it, I'll upload my pyconfig.h, too.
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2001-09-08 12:22
Message:
Logged In: NO
(This is Guido, in a hurry, not logged in :-)
Gerhard, I'm surprised you still had to pass options to
make. It works without those for me. (How do I tell the
version of glibc I'm using?)
Can you tell me what config.log says after
"checking for CFLAGS to enable large files"?
Have you tried 2.2a3?
----------------------------------------------------------------------
Comment By: Gerhard Häring (ghaering)
Date: 2001-09-08 12:12
Message:
Logged In: YES
user_id=163326
Guido, I can build the current CVS now with LFS, too (Linux
2.4, glibc 2.2). I saw you did a lot in the configure
script, but I still had to give options to the make command
(grabbed them from Sean's latest source RPMs).
This worked for me:
    ./configure
    make OPT="-g -O3 -D_FILE_OFFSET_BITS=64 -DHAVE_LARGEFILE_SUPPORT" \
         CFLAGS="-g -O3 -D_FILE_OFFSET_BITS=64 -DHAVE_LARGEFILE_SUPPORT"
Shouldn't the feature define HAVE_LARGEFILE_SUPPORT be
automatically added to pyconfig.h?
It would perhaps be a good idea to add the info on how to build
with LFS to the build instructions.
Thanks,
Gerhard
----------------------------------------------------------------------
Comment By: Guido van Rossum (gvanrossum)
Date: 2001-09-05 11:36
Message:
Logged In: YES
user_id=6380
Gerhard, can you try the current CVS? I've done a few things
to try and fix this. I can now build just fine on a pretty
recent Linux 2.4 kernel.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2001-09-03 02:23
Message:
Logged In: YES
user_id=21627
To fix the bug at hand (building fails), the following
strategy might be sufficient:
- produce an autoconf test that checks whether fpos_t is
integral, and "large"; define this by default for MSVC
- use this test in portable_fseek/portable_ftell.
I also wonder why the order in which APIs are tried is
different in fseek and ftell (fseek tries fseeko first,
ftell tries ftello only second).
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2001-08-20 13:19
Message:
Logged In: YES
user_id=31435
By itself, adding opaque getpos/setpos sounds pretty easy
(BTW, f{get,set}pos are std in C99).
Returning a usable 64-bit integer remains a x-platform
mess. The C99 rationale sez f{get,set}pos are the intended
way to work with large files, but they provide no way to
break the abstraction (Jeremy & I both looked in vain --
there is no defined way to extract the stream position from
an fpos_t object, neither to do arithmetic on one).
On Windows, f{get,set}pos are (currently) the only way to
get a 64-bit stream position from the MS C library (and MS
doesn't (currently) mix that in with a state encoding; the
Win32 API has other ways to deal with this).
----------------------------------------------------------------------
Comment By: Guido van Rossum (gvanrossum)
Date: 2001-08-20 06:21
Message:
Logged In: YES
user_id=6380
OK, so we need to add separate getpos() and setpos() methods
that return an opaque wrapper for an fpos_t. That sounds
like serious work, plus it will require changing Python apps
that use seek and tell.
So I think we should *also* continue to search for a way to
use a 64-bit seek offset for Python's seek() and tell()
methods -- I'm presuming this is hidden *somewhere* in the
fpos_t still, since the underlying OS certainly uses
lseek64(). If there's no way to extract it out of the
fpos_t, I propose to call lseek64() directly (after using a
fflush()) on the file descriptor.
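The flush-then-seek-on-the-descriptor idea can be sketched at the Python level, with os.lseek standing in for lseek64(). This is an analogy only: the proposal above concerns the C code in fileobject.c, and the helper name below is hypothetical.

```python
import os


def seek_via_descriptor(f, offset, whence=os.SEEK_SET):
    """Flush buffered data, then reposition the underlying descriptor directly.

    Returns the new absolute position as a plain integer.
    """
    f.flush()
    return os.lseek(f.fileno(), offset, whence)
```

The flush matters because the stdio (or Python) buffer does not know the descriptor has moved. The sketch is safest with a file opened unbuffered (buffering=0), since a buffered object may keep its own idea of the current position.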
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2001-08-19 22:24
Message:
Logged In: YES
user_id=31435
Noting that C99 *requires* fpos_t values to hold all the
info in an mbstate_t, in addition to stream position info.
So we have to expect others to follow glibc in this, and
eventually everyone. fpos_t cannot resolve to an array
type, but anything else is fair (in particular it need not
map to an integral type -- and probably won't anymore).
We have to give up belief that fpos_t is a number, because
it's not. We can believe that ftell returns a number,
because it does <wink> -- but ftell isn't suitable for
large file support.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2001-08-17 06:13
Message:
Logged In: YES
user_id=21627
This started in glibc 2.2, I believe, so it would appear in
Redhat 7, SuSE 7, etc.
To see the problem, you have to ./configure with
CFLAGS="-D_FILE_OFFSET_BITS=64" OPT="-O2 $(CFLAGS)"; see
pyconfig.h.
----------------------------------------------------------------------
Comment By: Guido van Rossum (gvanrossum)
Date: 2001-08-17 03:55
Message:
Logged In: YES
user_id=6380
Whoa. Interesting. Which Linux version is this?
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2001-08-17 00:21
Message:
Logged In: YES
user_id=21627
This fails because in glibc, fpos_t contains an mbstate_t
field, so that on restoring the file position, the
multibyte encoding state of the file can be restored.
I see two solutions here:
- Python could give up the guarantee that the ftell result
is a number, and return an object that embeds the fpos_t.
- Python could give up the guarantee that ftell/fseek
works in all cases, and only use ftell(o), which should
always return a number (at least in Posix). If that
approach is taken, an additional fgetpos/fsetpos call may
be appropriate.
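The first option could look roughly like this from the Python side. This is a hypothetical sketch: the names FilePosition, getpos and setpos are invented here, and the token wraps the result of tell() where a C implementation would embed an fpos_t. The point is that the token supports saving and restoring a position but no arithmetic.

```python
class FilePosition:
    """Opaque token for a stream position; deliberately not a number."""
    __slots__ = ("_stream", "_cookie")

    def __init__(self, stream, cookie):
        self._stream = stream
        self._cookie = cookie


def getpos(stream):
    """Capture the current position of `stream` as an opaque token."""
    return FilePosition(stream, stream.tell())


def setpos(stream, position):
    """Restore a position previously captured from the same stream."""
    if not isinstance(position, FilePosition) or position._stream is not stream:
        raise ValueError("position token does not belong to this stream")
    stream.seek(position._cookie)
```

Code that currently computes offsets from tell() results would have to change under this design, because a FilePosition cannot be added to or compared with an integer.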
----------------------------------------------------------------------