ANN: PyTables 1.4 (A Hierarchical Database) released!

Francesc Altet faltet at carabos.com
Thu Dec 21 18:55:25 CET 2006


===========================
 Announcing PyTables 1.4
===========================

PyTables is a library for managing hierarchical datasets, designed to
cope efficiently with extremely large amounts of data and supporting
full 64-bit file addressing.  It is based on the HDF5 library for doing
the I/O and leverages the numarray/NumPy/Numeric packages so as to
deliver the data to the end user in convenient in-memory containers.

This is a new major release of PyTables, and probably the last major one
of the 1.x series (i.e. with numarray at the core). In it, we have
implemented better code to deal with table buffers, enhanced the
capability for reading native HDF5 files, improved support for 64-bit
platforms (but not with Python 2.5: see the ``Special Warning`` section
below), improved support for AIX, added optional automatic parent
creation, and fixed the traditional number of bugs.

Go to the PyTables web site to download the beast:
http://www.pytables.org/

or keep reading for more info about the new features and bugs fixed.


Changes in more depth
=====================

Improvements:

- Table buffers code refactored: now each Row read iterator has its own
  buffers, completely independent of its table (although write
  iterators still share a single buffer in the same table). This
  separation makes the buffering logic much clearer and less prone to
  errors (in fact, some such errors have been fixed).  Performance and
  memory consumption are about the same as before.

- When flushing the complete file (i.e. when calling File.flush()), only
  the buffers of those nodes that are alive (i.e. referenced from user
  code) are actually flushed. This brings much better efficiency (and
  also stability) to situations where one has to flush (and hence,
  close) files with many nodes in them.

- Better support for AIX by renaming the internal LONLONG_MAX C constant
  (it was used internally by the xlc compiler). Thanks to Brian Granger
  for the report.

- Added optional automatic parent creation support during node creation,
  copying and moving operations.  See the release notes for more
  information.

- Improved support for Python 2.4 and 64-bit platforms (but beware,
  there are still known issues when using Python 2.5 in combination with
  64-bit platforms). Thanks to Gerard Vermeulen for his patches for
  Win64 platforms.

- Implemented a workaround for a leak present in numarray --> Numeric
  conversions when using the array protocol, as can be seen in:

  http://comments.gmane.org/gmane.comp.python.numeric.general/12563

  The workaround can potentially be far slower than the array protocol
  (because a copy of the arrays is always made), but at least the new
  code doesn't leak anymore.
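
As a rough illustration of the "alive nodes only" flushing described in
the improvements above, here is a toy model in plain Python. It is not
PyTables code: ``Node``, ``ToyFile`` and ``create_node`` are hypothetical
names, and tracking nodes through a ``weakref.WeakValueDictionary`` is
only an assumption about how such behaviour could be obtained.

```python
import gc
import weakref


class Node:
    """Toy stand-in for a file node that buffers rows in memory."""

    def __init__(self, name):
        self.name = name
        self.buffer = []

    def append(self, row):
        self.buffer.append(row)


class ToyFile:
    """Hypothetical file object that tracks nodes via weak references."""

    def __init__(self):
        # Entries disappear once user code drops its last reference.
        self._alive = weakref.WeakValueDictionary()
        self.flushed = {}  # stands in for data written to disk

    def create_node(self, name):
        node = Node(name)
        self._alive[name] = node
        return node

    def flush(self):
        """Flush only nodes still referenced by user code; return names."""
        names = []
        for name, node in list(self._alive.items()):
            self.flushed.setdefault(name, []).extend(node.buffer)
            node.buffer = []
            names.append(name)
        return sorted(names)


f = ToyFile()
a = f.create_node("a")
b = f.create_node("b")
a.append(1)
b.append(2)
del b          # user code no longer references node "b"
gc.collect()   # make collection explicit on non-refcounting interpreters
flushed_names = f.flush()  # only node "a" is visited
```

A real implementation would also have to write out a node's buffer
before the node is discarded; the toy leaves that step out to keep the
flushing loop itself in focus.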

Bug fixes:

- Previously, when the in-memory size of a compound type was smaller
  than its size on disk (for example, when the type has padding or
  aligned fields), PyTables was unable to read it. This has been fixed,
  which allows reading general compound types in HDF5 files written
  with tools other than PyTables.

- When many tables with indexed columns were created simultaneously, a
  bug made PyTables crash. This has been fixed (for more info, see
  bug #26).

- Fixed a typo in the code that prevented recognizing complex data in
  non-PyTables files.

- Table.createIndex() now refuses to index complex columns.

- Now, it is possible to index several nested columns that hang from
  the same parent column. Fixes bug #24.

- Fixed a typo in the nctoh5 utility that prevented using filters
  properly. Thanks to Lou Wicker for reporting this.

- When setting/appending an in-memory array to an Array (or descendant)
  object and the two have mismatched byte orders, the array was
  set/appended without being byteswapped first. This has been fixed.
  Thanks to Elias Collas for the report.
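
To see why the byte-order fix in the list above matters, the following
sketch uses only the standard ``struct`` module (not PyTables or
numarray). It shows what happens when big-endian data is stored as if it
were little-endian, and how swapping the bytes of each item first gives
the intended values. ``byteswap32`` is a hypothetical helper written for
this illustration.

```python
import struct

values = [1, 256, 65536]

# Pack the same 32-bit integers in both byte orders.
little = struct.pack("<3i", *values)
big = struct.pack(">3i", *values)

# Reading big-endian bytes as little-endian silently yields wrong numbers.
misread = list(struct.unpack("<3i", big))


def byteswap32(raw):
    """Reverse the byte order of every 4-byte item in raw."""
    return b"".join(raw[i:i + 4][::-1] for i in range(0, len(raw), 4))


# Swapping first makes the data match the target byte order.
swapped = byteswap32(big)
recovered = list(struct.unpack("<3i", swapped))
```

Here ``misread`` differs from ``values`` while ``recovered`` equals it,
which is the difference between appending without and with a byteswap.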

Deprecated features:

- None

Backward-incompatible changes:

- Please see the ``RELEASE-NOTES.txt`` file.


Special Warning for Python 2.5 and 64-bit platform users
========================================================

Unfortunately, due to problems with the combination of numarray 1.5.2,
Python 2.5 and 64-bit platforms, PyTables cannot yet be used safely in
such a scenario.  This will be solved either when numarray addresses
this issue (hopefully with numarray 1.5.3), or when the PyTables 2.x
series (with NumPy at its core) is out.


Important note for Windows users
================================

If you want to use PyTables with Python 2.4 or 2.5 on Windows
platforms, you will need to get the HDF5 library compiled for MSVC 7.1,
aka .NET 2003.  It can be found at:
ftp://ftp.ncsa.uiuc.edu/HDF/HDF5/current/bin/windows/5-165-win-net.ZIP

Users of Python 2.3 on Windows will have to download the version of HDF5
compiled with MSVC 6.0, available at:
ftp://ftp.ncsa.uiuc.edu/HDF/HDF5/current/bin/windows/5-165-win.ZIP


Platforms
=========

This version has been extensively checked on quite a few platforms, such
as Linux on Intel32 (Pentium), Win on Intel32 (Pentium), Linux on Intel64
(Itanium2), FreeBSD on AMD64 (Opteron), Linux on PowerPC (and PowerPC64)
and MacOSX on PowerPC.  For other platforms, chances are that the code
can be easily compiled and run without further issues.  Please contact
us if you experience problems.


Resources
=========

Go to the PyTables web site for more details:

http://www.pytables.org

About the HDF5 library:

http://hdf.ncsa.uiuc.edu/HDF5/

About numarray:

http://www.stsci.edu/resources/software_hardware/numarray

About NumPy:

http://numpy.scipy.org/

To learn more about the company behind the PyTables development, see:

http://www.carabos.com/


Acknowledgments
===============

Thanks to the various users who provided feature improvements,
patches, bug reports, support and suggestions.  See the ``THANKS``
file in the distribution package for an (incomplete) list of
contributors.  Many thanks also to SourceForge, who have helped to make
and distribute this package!  And last but not least, a big thank you
to Acusim (http://www.acusim.com/) for sponsoring much of the work done
for releasing this version of PyTables.


Share your experience
=====================

Let us know of any bugs, suggestions, gripes, kudos, etc. you may
have.


----

  **Enjoy data!**

  -- The PyTables Team


