Try Gamera (http://dkc.mse.jhu.edu/gamera/), a Python framework for document page analysis.

Bill