file.read() doesn't read the whole file

Sreejith K sreejithemk at gmail.com
Tue Mar 24 03:59:47 EDT 2009


On Mar 24, 7:15 am, "Gabriel Genellina" <gagsl-... at yahoo.com.ar>
wrote:
> En Mon, 23 Mar 2009 21:37:14 -0300, R. David Murray  
> <rdmur... at bitdance.com> escribió:
>
>
>
> > Steve Holden <st... at holdenweb.com> wrote:
> >> Sreejith K wrote:
> >> >> Try and write an example that shows the problem in fifteen lines or
> >> >> less. Much easier for us to focus on the issue that way.
>
> >> > import os
> >> > def read(length, offset):
> >> >   os.chdir('/mnt/gfs_local/')
> >> >   snap = open('mango.txt_snaps/snap1/0','r')
> >> >   snap.seek(offset)
> >> >   data = snap.read(length)
> >> >   print data
>
> >> > read(4096,0)
>
> >> > This code shows what actually happens inside the code I've written.
> >> > This prints the 4096 bytes from the file '0' which is only 654 bytes.
> >> > When we run the code we get the whole file. That's right. I also get
> >> > it. But when this read() function becomes the file class read()
> >> > function in fuse, the data printed is not the whole but only a few
> >> > lines from the beginning.
>
> >> This is confusing. I presume you to mean that when you make this
> >> function a method of some class it stops operating correctly?
>
> >> But I am not sure.
>
> >> I am still struggling to understand your problem. Sorry, it's just a
> >> language thing. If we take our time we will understand each other in the
> >> end.
>
> > You may be asking this question for pedagogical reasons, Steve, but
> > in case not...the OP is apparently doing a 'less xxxx' where xxxx is
> > the name of a file in a fuse filesystem (that is, a mounted filesystem
> > whose back end is some application code written by the OP).
> [...]
> > There are several steps between that 'snap.read' and less displaying on
> > the terminal whatever bytes it got back from its read call in whatever
> > way less chooses to display them....
>
> And that's why everyone is asking for a *real* log. Assumptions like "foo  
> must be 0 here" aren't enough. One needs *evidence*: a log file showing  
> the value of "foo" right when it is used. Then, one can begin to infer  
> what happens -- first step would be to determine *which* layer is (or is  
> not) responsible for the misbehavior.
>
> In this case, I'd like to see file.tell(), the requested size and the  
> returned data length, *right*at*the*read()*call*.
>
> --
> Gabriel Genellina

R. David Murray understood the problem very well. As you suggested, I
logged the data returned and see that it is actually the complete file.
But the problem is that when 'less'ing the file, only two lines are
displayed. So it's not about Python's read(). Usually in fuse
filesystems (as in some examples), when a file read occurs, fuse catches
it and calls python-fuse's read(length, offset) function. For small
files (when we read using 'less'), this will usually be the first block,
i.e. length 4096 with offset 0. We return what we read from the read()
method of fuse-python. But when I return the data I read, a 'less'
operation in my fuse filesystem shows only some lines, even though the
returned data is the whole file.

In my fuse filesystem implementation, when a read is called, instead
of returning the data read from the original file, I return the data
read from another file ('0') which resides in the <original-file>__snaps/
snap1 directory (I use this directory to store the blocks of original
files when a write occurs, so each file here would be at most 4096
bytes). I'm doing this because I want to make snapshots of files so that
I can restore the older file easily. The problem occurs when reading
this file and returning the data read.
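
The snapshot read described above can be sketched roughly as follows.
This is only an illustration, not the actual filesystem code: the helper
name read_snapshot_block and its log parameter are hypothetical, and the
fuse-python wiring is left out. It logs the requested size, the file
position after the read, and the returned length right at the read()
call.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # block size of a snapshot file, as described above


def read_snapshot_block(snap_path, length, offset, log=print):
    """Read up to `length` bytes at `offset` from a snapshot block file.

    Hypothetical helper: logs the request and the result right at the
    read() call so each layer can be checked separately.
    """
    with open(snap_path, 'rb') as snap:
        snap.seek(offset)
        data = snap.read(length)
        log('Read length: %d offset: %d' % (length, offset))
        log('snap.tell(): %d' % snap.tell())
        log('data size : %d' % len(data))
    return data


if __name__ == '__main__':
    # Demo: a temporary 654-byte file stands in for the snapshot file '0'.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, '0')
        with open(path, 'wb') as f:
            f.write(b'x' * 654)
        data = read_snapshot_block(path, BLOCK_SIZE, 0)
        assert len(data) == 654
```

Asking for 4096 bytes from a 654-byte file returns just the 654 bytes,
which matches the log below.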

Fuse has flush/release functions to properly close the opened file.
When reading (with less) the original file without snapshots, there is
no issue. But when reading the snapshot instead, the problem occurs. I
open the snapshot file with the same mode as the original file. Is there
anything I should do after read(), like the flush() I do for the
original file? I tried it, but with no success...

Log when reading from snapshot
=======
Read length: 4096 offset: 0
Snapshot 0 opened..
Snap read
===Data Begin===
Getting started -- pdb.set_trace()

To start, I'll show you the very simplest way to use the Python
debugger.

   1. Let's start with a simple program, epdb1.py.

              # epdb1.py -- experiment with the Python debugger, pdb
              a = "aaa"
              b = "bbb"
              c = "ccc"
              final = a + b + c
              print final


   2. Insert the following statement at the beginning of your Python
program. This statement imports the Python debugger module, pdb.

          import pdb

   3. Now find a spot where you would like tracing to begin, and
insert the following code:

          pdb.set_trace()
===Data End====
snap.tell(): 654
data size : 654
Original file flushed
Original file closed

The data is the whole file, but 'less' shows only two lines...


