[CentralOH] Comparing Databases for Test Suite
jep200404 at columbus.rr.com
jep200404 at columbus.rr.com
Sat Jan 5 02:52:58 CET 2013
What is a better way of easily comparing the data of some tables
of two databases?
I have Django code that modifies a Postgresql database. As part
of a test suite, I want to compare several tables of the
modified database to corresponding tables of a known good result
database, without writing much code.
Here's what I have so far. It works, but golly it's ugly.
I use the output of pgdump, of which the order of tables and rows
is not deterministic, so I use sort to impose the same order on
both dumps. There are timestamps in the tables, which can not
match, since the known good and test databases were modified
at different times, so I use sed to change timestamps to
a consistent string, 'timestamp-sentinel' that _will_ match.
[jep200404 at localhost ~]$ cat bin/snapshot
#!/usr/bin/env sh
# Takes a "snapshot" of a few tables.
pg_dump -a \
-t file \
-t author \
-t book \
-t article \
"$1"
[jep200404 at localhost ~]$
bin/snapshot known-good-catalog-result >known-good-catalog-result.pgdump
echo 'create database "catalog-test" with template "known-good-catalog-start";' \
| psql -f - -d foo
# foo is an empty database that is not used,
# except as something to appease psql with.
django-admin.py mogrify
bin/snapshot catalog-test >catalog-test-result.pgdump
date_of_known_good='2013-01-04'
today='2013-01-05'
sed -e 's/'"$date_of_known_good"\
' [012][0-9]:[0-5][0-9]:[0-5][0-9].[0-9]*+00/timestamp-sentinel/g' \
<known-good-catalog-result.pgdump | sort >known-good.glop
sed -e 's/'"$today"\
' [012][0-9]:[0-5][0-9]:[0-5][0-9].[0-9]*+00/timestamp-sentinel/g' \
<catalog-test-result.pgdump | sort >test.glop
if cmp known-good.glop test.glop; then
echo Good!
rm known-good.glop test.glop
exit 0
else
echo Bad!
exit 1
fi
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
http://pgfoundry.org/projects/pg-comparator/ looks interesting,
although I would likely have to write Django code to change the
timestamps to a single value before comparing. Two sed commands
are easier.
Separately and less importantly, can one easily compare the data
of tables of two databases in Python as opposed to the shell
stuff I did?
More information about the CentralOH
mailing list