Running queries on large data structure
Christoph Haas
email at christoph-haas.de
Thu Aug 3 13:22:12 EDT 2006
On Thursday 03 August 2006 17:45, hiaips wrote:
> Christoph,
>
> Several possibilities come to mind...
>
> From your description, maybe something like Postgres, MySql, or sqlite
> would not be the best option. (However, I'm wondering what your query
> requirements are
Imagine this example firewall rule:
| Source              | Destination  | Service | Action |
| 10.0.0.1            | 192.168.51.9 | tcp/22  | allow  |
| group_internal      |              | tcp/23  |        |
| 10.2.0.0/16         |              |         |        |
| 10.4.0.0-10.4.9.255 |              |         |        |
Here, group_internal is a group consisting of several IPs, and
'10.4.0.0-10.4.9.255' is a range of IP addresses. SQL has no native notion
of such criteria (although a network match is possible with
PostgreSQL's "inet" data type). So I would probably need to read the rule
from the database and, for each type (IP, network, group, IP range), run
a subroutine to determine whether the searched-for IP address is part of
it.
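
A minimal sketch of what such a per-type subroutine might look like,
assuming a Python version that ships the standard "ipaddress" module.
The function names and the contents of GROUPS are made up for
illustration; only the entry formats come from the rule above.

```python
import ipaddress

# Hypothetical group table; the member addresses are illustrative only.
GROUPS = {"group_internal": ["10.1.0.5", "10.1.0.6"]}

def entry_matches(entry, ip, groups=GROUPS):
    """Return True if address `ip` is covered by a single rule entry."""
    addr = ipaddress.ip_address(ip)
    if entry in groups:
        # Named group: the address matches if any member matches.
        return any(entry_matches(member, ip, groups) for member in groups[entry])
    if "-" in entry:
        # Inclusive range written as "first-last".
        first, last = (ipaddress.ip_address(p.strip()) for p in entry.split("-"))
        return first <= addr <= last
    if "/" in entry:
        # Network in CIDR notation.
        return addr in ipaddress.ip_network(entry, strict=False)
    # Otherwise treat the entry as a single host address.
    return addr == ipaddress.ip_address(entry)

def column_matches(entries, ip, groups=GROUPS):
    """Return True if `ip` matches any entry in one rule column."""
    return any(entry_matches(entry, ip, groups) for entry in entries)
```

With the Source column of the example rule held as a list of strings,
column_matches(source, "10.4.3.7") would be answered by the range
branch and column_matches(source, "10.2.200.1") by the network branch.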
> -- for example, if you really need the power of SQL,
> maybe you should just bite the bullet and map to an RDBMS schema, as
> painful as that may be.
That's how I do it now (in the old Perl program). I don't allow IP ranges
and expand the groups (like "group_internal") to all the members contained
within. Here the optimization lies in the database but it's just doing the
easy parts and not supporting the hard parts (e.g. IP ranges).
> A couple of other possibilities:
> 1. What about storing your data in XML and using XQuery to facilitate
> queries? If your data is deeply nested, as you say, this may be a good
> match.
I assume that XQuery can't do weird queries like IP ranges, or can it?
> 2. What about storing your data in the same syntax as a Python
> dictionary? Is that possible? (If it is, then your data *is* your
> code, and you don't have to worry about parsing it.)
Oh, yes, that's perfect. Perhaps a bit slower because I do the query code
myself. But I would like that. Just how do I keep the dictionary somewhere
on disk so that another process can use it?
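
One possible way to do this, sketched under the assumption that the
rules are plain dictionaries, lists and strings, is the standard
"pickle" module. The helper names save_rules and load_rules are
hypothetical, and a pickle file should only be loaded if it comes from
a trusted source.

```python
import pickle

def save_rules(rules, path):
    """Write the rule dictionary to `path` so another process can read it."""
    with open(path, "wb") as fh:
        pickle.dump(rules, fh)

def load_rules(path):
    """Read back a rule dictionary previously written by save_rules()."""
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

The writing process would call save_rules(rules, "rules.pickle") after
each change and the reading process load_rules("rules.pickle") before
each query. The standard "shelve" module is an alternative if the data
should be accessed key by key instead of being loaded as a whole.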
Thanks for the ideas.
Christoph