Size in bytes of a string?

Michael Gilfix mgilfix at eecs.tufts.edu
Mon Jul 29 10:14:32 EDT 2002


  For the easy part, I suggest you simply lay out a dict object as you
described:

     dict[header_name] = (start, end)

  and then you can do as you wish:

     start_at_byte, end_at_byte = dict[header_name]
     f.seek(start_at_byte)
     content = f.read(end_at_byte - start_at_byte)  # read() takes a length
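
  One thing to keep in mind: read() takes a number of bytes, not an end
position, so if end_at_byte is an absolute offset, the length to read is
end_at_byte - start_at_byte. A minimal sketch, assuming the dict holds
absolute offsets with an exclusive end. The read_entry helper and the
sample data are made up for illustration, and io.BytesIO stands in for a
real file opened in binary mode:

```python
import io

def read_entry(f, index, header_name):
    """Return the bytes of one small file, given (start, end) offsets."""
    start_at_byte, end_at_byte = index[header_name]
    f.seek(start_at_byte)
    # read() takes a byte count, not an end position
    return f.read(end_at_byte - start_at_byte)

# Stand-in for open(path, "rb"); the layout below is invented.
f = io.BytesIO(b"a.txt\nhello\nb.txt\nworld!\n")
index = {"a.txt": (6, 12), "b.txt": (18, 25)}
content = read_entry(f, index, "b.txt")
```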

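  In a plain (non-unicode) string each char is 1 byte, so len(string) gives
you the number of bytes. Open the file in binary mode ('rb') so that a CR/LF
line ending counts as the 2 bytes it takes on disk. If your file format is
consistent, you can probably do something like: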

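     get_first_position()
     for line in f:                  # f opened in binary mode
       if is_header(line):
         record_last_position()      # end of the previous small file
         make_entry()                # dict[name] = (start, end)
         record_new_start()          # start of the next small file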

  That should build up your dict and help you do what you want.
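
  The loop above could be fleshed out roughly as follows. This is only a
sketch under assumptions: the "#FILE " prefix used by is_header() is a
hypothetical marker (your separator lines will look different), and
build_index is a name I made up. Positions are tracked by adding up
len(line) on a file read in binary mode, and the last entry has to be
closed after the loop since no header follows it:

```python
import io

def is_header(line):
    # Hypothetical rule: a header line starts with b"#FILE ".
    return line.startswith(b"#FILE ")

def build_index(f):
    """Map header name -> (start, end) byte offsets of the body that follows."""
    index = {}
    name = None   # header of the small file being scanned
    start = 0     # offset where its body begins
    pos = 0       # offset of the line about to be processed
    for line in f:                    # binary mode: len(line) counts bytes
        if is_header(line):
            if name is not None:
                index[name] = (start, pos)    # close the previous entry
            name = line[len(b"#FILE "):].strip().decode("ascii")
            start = pos + len(line)           # body begins after the header
        pos += len(line)
    if name is not None:
        index[name] = (start, pos)            # close the last entry
    return index

# Invented sample: the second small file uses CR/LF line endings.
data = b"#FILE a.txt\nhello\n#FILE b.txt\r\nworld!\r\n"
index = build_index(io.BytesIO(data))
```

  Because the lines are read as bytes, a CR/LF ending simply adds 2 to
len(line) and an LF ending adds 1, so nothing special is needed to measure
it.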

                          -- Mike

On Mon, Jul 29 @ 13:15, Shagshag13 wrote:
> "Shagshag13" <shagshag13 at yahoo.fr> wrote in message news: ai37ij$10tmvq$1 at ID-146704.news.dfncis.de...
> > [sorry for the previous incomplete post]
> >
> > hello,
> >
> > i have the following problem: i have huge files > 1.5 GB. each of these files contains some other small files and they are separated
> > by a header. i need to do some kind of random access to the small files, so i wish to write a script which would build some kind of
> > table of contents for these huge files.
> >
> > i would like to have something like
> > smallfile_header_name_i : start_at_byte : end_at_byte
> >
> > and i will do :
> > f.seek(start_at_byte)
> > smallfile_header_name_content = f.read(end_at_byte)
> >
> > but to do this i need to be able, at build time, to know how many bytes a string takes.
> >
> > (i think there is something with the size in bytes of CR/LF, but how do i measure it? i hope i won't have to use
> > start_at_line, end_at_line, as that would mean always reading the file from the start...).
> 
> i forgot to mention that these are text files, and that what i call a "header" is more of a separator ("name_of_the_file\n").
> 
> thanks in advance,

-- 
Michael Gilfix
mgilfix at eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html



