Storing a big amount of path names

srinivas devaki mr.eightnoteight at gmail.com
Fri Feb 12 01:16:41 EST 2016


On Feb 12, 2016 6:05 AM, "Paulo da Silva" <p_s_d_a_s_i_l_v_a_ns at netcabo.pt>
wrote:
>
> Hi!
>
> What is the best (shortest memory usage) way to store lots of pathnames
> in memory where:
>
> 1. Path names are pathname=(dirname,filename)
> 2. There many different dirnames but much less than pathnames
> 3. dirnames have in general many chars
>
> The idea is to share the common dirnames.
>
> More realistically not only the pathnames are stored but objects each
> object being a MyFile containing
> self.name - <base name>
> getPathname(self) - <full pathname>
> other stuff
>
> class MyFile:
>
>   __allfiles=[]
>
>   def __init__(self,dirname,filename):
>     self.dirname=dirname  # But I want to share this with other files
>     self.name=filename
>     MyFile.__allfiles.append(self)
>     ...
>
>   def getPathname(self):
>     return os.path.join(self.dirname,self.name)
>

What you want is a trie data structure, which stores each shared leading
part of your paths (the common base path) only once.

Instead of constructing a character trie, make it a string trie, i.e.
each directory name is a node and all of its files and folders are its
children. Each node is one of two types: a file or a folder.
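
A minimal sketch of such a string trie might look like the following. The
names Node and add_path are hypothetical (they are not from the original
post), and the sketch assumes '/'-separated paths and that no name is used
both as a file and as a directory in the same place:

```python
class Node:
    """One path component: a directory or a file (hypothetical name)."""
    __slots__ = ("name", "parent", "children")

    def __init__(self, name, parent=None, is_dir=True):
        self.name = name
        self.parent = parent
        # Directories get a dict of children; files keep None.
        self.children = {} if is_dir else None

    @property
    def is_dir(self):
        return self.children is not None


def add_path(root, path):
    """Insert a '/'-separated file path under root; return the file node."""
    parts = [p for p in path.split("/") if p]
    node = root
    for part in parts[:-1]:
        # Reuse the directory node if it already exists, else create it.
        child = node.children.get(part)
        if child is None:
            child = node.children[part] = Node(part, node)
        node = child
    leaf = Node(parts[-1], node, is_dir=False)
    node.children[parts[-1]] = leaf
    return leaf


root = Node("")
a = add_path(root, "/usr/share/doc/a.txt")
b = add_path(root, "/usr/share/doc/b.txt")
```

Here a and b share the same "doc" node, so the component strings "usr",
"share" and "doc" are stored once no matter how many files live there.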

If you think about it, this is the most intuitive way to represent the
file structure in your program.

You can recover the directory name of a file object by traversing its
parents up to the root.
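
As a rough illustration of that parent traversal, here is one way the
MyFile class from the question could rebuild its full pathname. DirNode
and the dirnode attribute are assumptions made for this sketch, not part
of the original code:

```python
import os


class DirNode:
    """A directory component linked to its parent (hypothetical name)."""
    __slots__ = ("name", "parent")

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


class MyFile:
    def __init__(self, dirnode, filename):
        self.dirnode = dirnode  # shared by every file in this directory
        self.name = filename

    def getPathname(self):
        # Collect names from the file up to the root, then reverse them.
        parts = [self.name]
        node = self.dirnode
        while node is not None:
            parts.append(node.name)
            node = node.parent
        parts.reverse()
        return os.path.join(*parts)


root = DirNode(os.sep)
docs = DirNode("docs", DirNode("home", root))
f = MyFile(docs, "notes.txt")
g = MyFile(docs, "todo.txt")
```

Each MyFile holds only a reference to its directory node, so the long
directory name is never copied per file; the cost is that getPathname
walks one node per directory level each time it is called.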

I hope this helps.

Regards
Srinivas Devaki
Junior (3rd yr) student at Indian School of Mines,(IIT Dhanbad)
Computer Science and Engineering Department
ph: +91 9491 383 249
telegram_id: @eightnoteight


