Accessing Windows file metadata?

John Rhoads john.rhoads at philips.com
Thu Jan 12 16:06:50 EST 2006


Probably somebody more deeply into this will soon give you a better answer,
but I'll try to give a better-than-nothing answer.

The Properties metadata that you see in the shell can come from several
places. The oldest and most common source is Microsoft Office files. These
have a compound structure associated with OLE (Object Linking and Embedding),
and the metadata is just one of many things stored in there alongside the
Word text or Excel numbers you see in the application.

You get to it from Python via COM. See Mark Hammond and Andy Robinson's book
"Python Programming on Win32" for the general approach. Perhaps the easiest
way to get this metadata is to use DSOFILE.DLL, which Microsoft released as a
sample a long time ago (it's not an OS file; you have to install it yourself).
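
Something along these lines might be a starting point in Python. It is only
a sketch and I have not run it against every Office version. It assumes
pywin32 is installed and DSOFILE.DLL is registered, and that the component
is exposed under the ProgID "DSOFile.OleDocumentProperties" as in the
Microsoft sample. The function name read_summary and its dispatch parameter
are my own invention, there only so the COM call can be swapped out.

```python
import os

# Summary property names exposed by DSOFILE's SummaryProperties object.
SUMMARY_FIELDS = ("Title", "Subject", "Author", "Keywords", "Comments")


def read_summary(path, dispatch=None):
    """Return a dict of OLE summary properties for an Office file via DSOFILE.

    `dispatch` is a hypothetical hook (not part of DSOFILE or pywin32): any
    callable that takes a ProgID and returns a COM-like object.
    """
    if dispatch is None:
        # Imported lazily so this module still loads where pywin32 is absent.
        from win32com.client import Dispatch as dispatch
    doc = dispatch("DSOFile.OleDocumentProperties")
    doc.Open(os.path.abspath(path), True)  # second argument: open read-only
    try:
        summary = doc.SummaryProperties
        return {name: getattr(summary, name) for name in SUMMARY_FIELDS}
    finally:
        # Release the file even if reading a property fails.
        doc.Close()
```

Calling read_summary(r"C:\docs\report.doc") (a made-up path) should give you
roughly the metadata dictionary you were hoping for, at least for the
built-in summary fields.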


The following article gives a VBScript example that you can
use as a template for your Python.

http://www.microsoft.com/technet/community/columns/scripts/sg0305.mspx

As I said, the OLE metadata has been around for a long time. With Windows
2000, MS extended the idea to any NTFS file, so that you have an open-ended
ability to associate named attributes like author, title, keywords or custom
attributes of your own with any file on an NTFS partition.

Unfortunately DSOFILE does not help with this (at least I don't think so!),
so you'd need to deal with the OS at a somewhat uglier level. And obviously
you need Win2K or better.

Look at \win32com\test\testStorage.py in the win32 Python package for an
example.
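
Boiled down, the read side of that approach looks roughly like the sketch
below. It uses the same pywin32 calls as the test script
(pythoncom.StgOpenStorageEx with STGFMT_FILE, then IPropertySetStorage.Open
and ReadMultiple). The function name read_ntfs_summary and the choice of
three properties are mine, and I have not checked how this behaves on every
Windows version or on non-NTFS volumes.

```python
def read_ntfs_summary(path):
    """Read author, title and keywords from a file's summary property set."""
    # Imported lazily: both modules ship with pywin32 and exist only on Windows.
    import pythoncom
    from win32com import storagecon

    mode = storagecon.STGM_READ | storagecon.STGM_SHARE_EXCLUSIVE
    # STGFMT_FILE asks for the property sets attached to an ordinary file
    # rather than to an OLE compound document.
    pss = pythoncom.StgOpenStorageEx(
        path, mode, storagecon.STGFMT_FILE, 0, pythoncom.IID_IPropertySetStorage
    )
    # Open the standard "summary information" property set on that file.
    summary = pss.Open(pythoncom.FMTID_SummaryInformation, mode)
    ids = (
        storagecon.PIDSI_AUTHOR,
        storagecon.PIDSI_TITLE,
        storagecon.PIDSI_KEYWORDS,
    )
    author, title, keywords = summary.ReadMultiple(ids)
    return {"author": author, "title": title, "keywords": keywords}
```

If the file has no summary property set at all, I would expect the Open call
to raise pythoncom.com_error, so wrap it in try/except if you are scanning
many files.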

Hope this helps,


John Rhoads


> I'm looking for a method by which to access Windows files metadata and
> have not been able to find anything in the standard modules or via
> Google - what is the standard approach?
> 
> Shamefully I really do not understand Windows file system - e.g. is
> properties metadata attached to the file?    if I change that metadata
> do I change the file's hash?  how is the metadata structured?  or is
> the "properties" metadata simply derived upon access?
> 
> Either way, is there a module or method to access this metadata (I'd
> hope there was a metadata dictionary for each file, but that may be a
> sign I've been spoiled by Python) ?
> 
> EP
> 




