Category tree of some sort ...

Stephan Diehl stephan.diehl at gmx.net
Sat Dec 7 05:12:01 EST 2002


Thomas Weholt wrote:

> I got a bunch of content categories in form of a list of lists :
> 
> categories = [
>         ['computers','programming','python'],
>         ['computers','programming','c'],
>         ['computers','technical','os']
> ]
> 
> I need to put this into a tree-structure, a dictionary like this :
> 
> category_tree = {
>     'computers': ([], {
>         'programming': ([], {
>             'python': ([], {}),
>             'c': ([], {}),
>         }),
>         'technical': ([], {
>             'os': ([], {}),
>         }),
>     }),
> }
> 
> Each level holds a list of ids, pointing to resources on a local disk, the
> dict-part of each tuple is the base for a another, deeper level in the
> tree-structure.
> 
> If anybody has any other idea on how to effectively store information
> about files where the folders are names of categories and each file is
> assigned a unique id for look-up, please let me know.
> 
> Thanks,
> Thomas
> 

Obviously, there are many ways to represent such a structure. You might be 
better off giving your categories ids as well. This means that you 
either extend the dict structure or go for Tree objects anyway.
Assuming that you want to browse some resources by their categories, just 
keep a list of all resources and attach the category information to each one.

This could look like (taking your example):
category_tree = {
    'computers': (1, {
        'programming': (2, {
            'python': (3, {}),
            'c': (4, {}),
        }),
        'technical': (5, {
            'os': (6, {}),
        }),
    }),
}

resource = {
    100: (1, 2, 3),
    101: (5, 6),
}

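Here, the resources have ids starting at 100, and each one maps to the ids 
of the categories it is filed under. Now, if you are looking for documents 
about "programming" (id 2), you select every resource whose tuple contains 
that id, so you get the ones filed under "programming", "python" and "c".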
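
A rough sketch of how this could be wired up, starting from your list of 
lists. The helper names (build_tree, find_id, resources_for) are made up 
for this example, and ids are simply handed out in order of first 
appearance, which happens to give the same numbers as above:

```python
from itertools import count


def build_tree(paths):
    """Turn a list of category paths into {name: (id, subtree)}."""
    ids = count(1)  # assumption: ids 1, 2, 3, ... in order of first appearance
    tree = {}
    for path in paths:
        level = tree
        for name in path:
            if name not in level:
                level[name] = (next(ids), {})
            level = level[name][1]  # descend into the sub-dictionary
    return tree


def find_id(tree, name):
    """Return the id of the first category called `name`, or None."""
    for key, (cat_id, subtree) in tree.items():
        if key == name:
            return cat_id
        found = find_id(subtree, name)
        if found is not None:
            return found
    return None


def resources_for(resources, cat_id):
    """Sorted ids of all resources whose category tuple contains cat_id."""
    return sorted(rid for rid, path in resources.items() if cat_id in path)


categories = [
    ['computers', 'programming', 'python'],
    ['computers', 'programming', 'c'],
    ['computers', 'technical', 'os'],
]
category_tree = build_tree(categories)
resource = {100: (1, 2, 3), 101: (5, 6)}

programming = find_id(category_tree, 'programming')  # 2
print(resources_for(resource, programming))          # [100]
```

Scanning every resource is fine for a few hundred entries. Note that 
find_id returns the first match only, so it assumes category names are 
unique across the tree.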

Another point: in a way, you have just modelled a normal directory tree. 
If you really want to make something useful, you should allow for 
independent categories. In this case (if it's about computer books) these 
might be author, level, ...
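
To illustrate what independent categories could look like, here is a small 
sketch. The field names ('topic', 'author', 'level'), the sample records and 
the select() helper are all invented for this example:

```python
# Hypothetical records: every field is an independent way to classify a book.
books = {
    100: {'topic': ('computers', 'programming', 'python'),
          'author': 'Author A', 'level': 'beginner'},
    101: {'topic': ('computers', 'technical', 'os'),
          'author': 'Author B', 'level': 'advanced'},
    102: {'topic': ('computers', 'programming', 'c'),
          'author': 'Author A', 'level': 'advanced'},
}


def select(records, **criteria):
    """Sorted ids of records that match every given criterion.

    A tuple-valued field (such as the topic path) matches if it contains
    the wanted value; any other field must be equal to it.
    """
    hits = []
    for rid, fields in records.items():
        for key, wanted in criteria.items():
            value = fields.get(key)
            if isinstance(value, tuple):
                if wanted not in value:
                    break
            elif value != wanted:
                break
        else:  # no criterion failed
            hits.append(rid)
    return sorted(hits)


print(select(books, topic='programming'))                 # [100, 102]
print(select(books, author='Author A', level='advanced'))  # [102]
```
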
If you are about to build a real-world system with thousands of entries, 
you'll need to use a real database to keep query times low.
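
As one possible shape for that, here is a minimal sketch using the sqlite3 
module from the Python standard library. The table and column names and the 
resources_in() helper are assumptions made for this example, not a 
recommended schema:

```python
import sqlite3

conn = sqlite3.connect(':memory:')  # in-memory database, just for the demo
conn.executescript("""
    CREATE TABLE category (
        id        INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        parent_id INTEGER REFERENCES category(id)
    );
    CREATE TABLE resource_category (
        resource_id INTEGER NOT NULL,
        category_id INTEGER NOT NULL REFERENCES category(id),
        PRIMARY KEY (resource_id, category_id)
    );
    -- index so that look-ups by category do not scan the whole table
    CREATE INDEX idx_rc_category ON resource_category (category_id);
""")

# Same categories and resources as in the dictionaries above.
conn.executemany("INSERT INTO category VALUES (?, ?, ?)", [
    (1, 'computers', None), (2, 'programming', 1), (3, 'python', 2),
    (4, 'c', 2), (5, 'technical', 1), (6, 'os', 5),
])
conn.executemany("INSERT INTO resource_category VALUES (?, ?)", [
    (100, 1), (100, 2), (100, 3), (101, 5), (101, 6),
])


def resources_in(conn, name):
    """Sorted ids of the resources attached to the category called `name`."""
    rows = conn.execute(
        "SELECT rc.resource_id FROM resource_category rc "
        "JOIN category c ON c.id = rc.category_id "
        "WHERE c.name = ? ORDER BY rc.resource_id",
        (name,),
    )
    return [row[0] for row in rows]


print(resources_in(conn, 'programming'))  # [100]
print(resources_in(conn, 'os'))           # [101]
```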

Hope that helps

Stephan


