64 bit offsets?

MRAB python at mrabarnett.plus.com
Thu Oct 7 11:40:39 EDT 2010


On 06/10/2010 22:41, jay thompson wrote:
> Hello everyone,
>
> I'm trying to extract some data from a large memory-mapped file (the
> largest is ~30GB) with re.finditer() and the match object's start().
> Python's regular expression module is great, but the offset returned by
> start() is 32 bits (signed, so I can really only address 2GB). I was
> wondering if anyone here had some suggestions on how to get the long
> offsets I need. By the way, I can't break up the file because the
> pattern I'm looking for can occur anywhere and on any boundary.
>
> Also, is seek() limited to 32-bit addresses?
>
> This is what I have in Python 2.7 AMD64:
>
> import mmap
> import re
>
> with open(file_path, 'r+b') as f:
>      # Map the whole file read-only (length 0 means the entire file).
>      file_map = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
>
>      pattern = re.compile("pattern")
>
>      # Record the starting offset of every match in the mapping.
>      for match in pattern.finditer(file_map):
>          write_to_sqlite(match.start())
>
I would've thought that a 64-bit version of Python would have 64-bit
offsets. Is that not the case?
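
One quick way to confirm which kind of build is actually running is to ask
the interpreter itself. The following is only a minimal sketch: the helper
names pointer_width_bits() and is_64bit_build() are made up for illustration,
and a 64-bit result only rules out a 32-bit interpreter. It does not by
itself show how wide the offsets reported by any particular module are.

```python
import struct
import sys


def pointer_width_bits():
    """Return the size of a C pointer in this interpreter build, in bits."""
    return struct.calcsize("P") * 8


def is_64bit_build():
    """Return True when the interpreter's native index type exceeds 32 bits."""
    return sys.maxsize > 2**32


if __name__ == "__main__":
    print("pointer width: %d bits" % pointer_width_bits())
    print("sys.maxsize:   %d" % sys.maxsize)
    print("64-bit build:  %s" % is_64bit_build())
```

If this reports a 64-bit build and start() still wraps past 2GB, the limit
is more likely in the module than in the interpreter's word size.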


