Best way to find starting directory

Wed Mar 20 01:58:11 EDT 2013

On 19/03/2013 17:03, Dave Angel wrote:
> On 03/19/2013 10:29 AM, Frank Millman wrote:
>> On 19/03/2013 14:46, Dave Angel wrote:
>>
>> As you say, there is a variety of types of data that one might want to store
>> externally. My current scenario is that, in my business/accounting
>> application, I use xml to store form definitions, report definitions,
>> etc, which are kept in the database (compressed). I am now constructing
>> some xml schemas to validate the xml files. I need to store the schemas
>> somewhere, so I have created a directory called 'schemas' under the main
>> directory. I need to access them from various parts of the application,
>> so I need a reliable way to locate the 'schemas' directory.
>
> Rather than having various parts of the code all figuring this sort of
> thing out for themselves, let them all call a common place.  If there's
> nothing in common but the directory, then save that as a global in some
> module that multiple modules can import.  But if there's more that
> could/should be shared, then make a whole module, or maybe a class,
> implementing that behavior.  If nothing else, it then means there's only
> one place to change when you change your mind.
>
>>
>> In theory I could store them somewhere different, and use a parameter to
>> provide the path. But they are only used within the context of the
>> application, so I think it makes sense to keep them alongside the 'py'
>> files that make up the application.
>>
>
> In putting them there, you are making two assumptions.  One is that only
> one user will ever run this, and two is that the user will not need two
> sets of those 'schemas'.  If the user is tracking two different
> companies, each with the same code, but different xml and different
> database, this would be the wrong place to put it.  But it's up to you
> to decide those assumptions, not I.
>

Maybe I did not explain very well. I fully expect a large number of 
users, tracking a large number of companies, to access the same schema 
file at the same time.

In fact I use lxml to parse the xml once it has been read from the 
database and decompressed. There are a limited number of 'types' of xml 
file (form definition, service definition, report definition, etc), and 
each type has its own schema. lxml will validate against a schema if you
build an XMLSchema object from the xsd file and pass it to the parser. I
create separate parsers, one for each type, when the program starts. But I
still need to tell lxml where to find each xsd file.

They are stored in a sub-directory called 'schemas'. Therefore when I 
create the parsers I have the following -

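import os
import __main__
from lxml import etree

# The 'schemas' directory sits alongside the script that started the program.
schema_path = os.path.join(
    os.path.dirname(__main__.__file__),
    'schemas')

# Compile the schema for form definitions.
form_schema = etree.XMLSchema(
    file=os.path.join(schema_path, 'form.xsd'))

# A parser that validates against form_schema while parsing.
form_parser = etree.XMLParser(
    schema=form_schema, attribute_defaults=True,
    remove_comments=True, remove_blank_text=True)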
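
For comparison, here is a minimal sketch of the 'common place' idea from
earlier in the thread. It assumes a shared module that lives in the main
directory next to 'schemas', and it locates the directory from that module's
own file rather than from __main__.__file__, which depends on how the program
was started and is not set in an interactive session. The helper names
(schema_dir, schema_files, make_parsers) and the 'service' and 'report' type
names are hypothetical; only 'form.xsd' appears in the code above.

```python
import os

# Hypothetical type names; each is assumed to have a '<name>.xsd' file.
SCHEMA_TYPES = ('form', 'service', 'report')


def schema_dir(anchor_file):
    """Return the 'schemas' directory beside the given source file."""
    return os.path.join(
        os.path.dirname(os.path.abspath(anchor_file)), 'schemas')


def schema_files(anchor_file, types=SCHEMA_TYPES):
    """Map each type name to the full path of its xsd file."""
    base = schema_dir(anchor_file)
    return {name: os.path.join(base, name + '.xsd') for name in types}


def make_parsers(anchor_file, types=SCHEMA_TYPES):
    """Build one validating lxml parser per type."""
    # Imported here so the path helpers can be used without lxml installed.
    from lxml import etree
    return {
        name: etree.XMLParser(
            schema=etree.XMLSchema(file=path),
            attribute_defaults=True,
            remove_comments=True, remove_blank_text=True)
        for name, path in schema_files(anchor_file, types).items()
    }
```

The shared module would call make_parsers(__file__) once at startup, and the
rest of the application would import the resulting dictionary from it.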

I hope that is clearer. If you can see anything wrong with this
approach, please let me know. I would much rather find out now than
when my 'large number of users' becomes a reality!

Thanks

Frank