[ANN] Speed up Charmap codecs with fastcharmap module

Tony Nelson *firstname*nlsnews at georgea*lastname*.com
Sun Oct 16 13:20:29 EDT 2005


Fastcharmap is a Python extension module that makes Charmap codecs 
about 5 times faster.

<http://georgeanelson.com/fastcharmap.htm> 

Usage:

    import fastcharmap
    fastcharmap.hook('codec_name')

Fastcharmap will then speed up calls that use that codec, such as 
unicode(str, 'codec_name') and str.encode('codec_name'), and won't 
interfere with Charmap codecs that haven't been hooked.
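
For anyone unfamiliar with the term, a Charmap codec converts each byte 
to one Unicode character through a lookup table.  The sketch below shows 
that idea in plain Python.  It is only an illustration, not how 
fastcharmap or the standard codecs are implemented: the names TABLE and 
charmap_decode are made up for the example, and it uses the byte-literal 
syntax of Python 2.6 and later rather than the Python 2.3 I developed on.

```python
# Build a 256-entry decoding table from the standard mac_roman codec:
# entry i is the Unicode character that byte value i maps to.
TABLE = bytes(bytearray(range(256))).decode('mac_roman')

def charmap_decode(data, table=TABLE):
    """Decode a byte string by looking up each byte in the table."""
    return u''.join([table[b] for b in bytearray(data)])

raw = b'caf\x8e'    # 0x8E is e-acute in Mac Roman
# The table lookup agrees with the stock codec.
assert charmap_decode(raw) == raw.decode('mac_roman')
```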

Documentation is in PyDoc form:

    import fastcharmap
    help(fastcharmap)

Fastcharmap is available as a standard Python source tarball, a binary 
tarball, and RPMs.  It isn't packaged for MS Windows yet (maybe soon), or 
for Mac OS X.  It is written in Python and Pyrex 0.9.3, but builds from 
the .c source when Pyrex is not available.  A C compiler is required for 
source installs.  I have only used it with Python 2.3.4 on Fedora Core 3 
Linux, but it should work on Python 2.4 and on other platforms.

As fastcharmap is an extension module, it might not be available on a 
particular computer.  I handle that this way in a program of mine:

    try:
        import fastcharmap
    except ImportError:
        print "fastcharmap not available"
    else:
        fastcharmap.hook('mac_roman')

This is done when a document is opened; a Gnome / GTK program is chatty 
on the console anyway, so one more message does no harm.
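
If a program uses several Charmap codecs, the same fallback could be 
wrapped in a small helper.  This is only a sketch: hook_codecs is a 
hypothetical name, and the only fastcharmap call it relies on is the 
hook() function shown under Usage.

```python
def hook_codecs(names):
    """Hook each named Charmap codec; return True if fastcharmap was found."""
    try:
        import fastcharmap
    except ImportError:
        # Extension not installed: the stock codecs simply stay in use.
        return False
    for name in names:
        fastcharmap.hook(name)
    return True

hooked = hook_codecs(['mac_roman'])
```

The caller can then decide whether to report the missing module, rather 
than printing from inside the import block.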

I wrote fastcharmap when I found that decoding a large amount of text 
was taking 3 times as long as loading the document from a file.  Python 
should be fast!  The application is a simple card-file program that can 
also open mbox files as cards.  I am using a 50 MB test file from a Mac 
that loads in about 4 seconds on my computer (wow!), and was decoding in 
about 13 seconds.  Now it takes 2 seconds to decode or encode.
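
To check the decode time on your own data, a rough timing sketch along 
these lines should do.  The helper name time_decode and the sample text 
are made up for illustration, and the byte literal assumes Python 2.6 or 
later; run it with and without hooking the codec and compare.

```python
import time

def time_decode(data, codec_name, repeat=3):
    """Return the best wall-clock time, in seconds, to decode data."""
    best = None
    for _ in range(repeat):
        start = time.time()
        data.decode(codec_name)
        elapsed = time.time() - start
        if best is None or elapsed < best:
            best = elapsed
    return best

# About 1.4 MB of Mac-style text; 0x8E is e-acute in Mac Roman.
sample = b'caf\x8e au lait\r' * 100000
print("decode: %.3f s" % time_decode(sample, 'mac_roman'))
```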

Python developers are working on faster Charmap codecs for a future 
version of Python.  Fastcharmap may be useful until then, and shouldn't 
cause any problems when the new codecs are available.

As this is my first Python module, I'd like some experienced module 
authors and packagers to comment on it before I add it to the Cheese 
Shop.
________________________________________________________________________
TonyN.:'                        *firstname*nlsnews at georgea*lastname*.com
      '                                  <http://www.georgeanelson.com/>


