reverse engineering Excel spreadsheet
irstas at gmail.com
irstas at gmail.com
Sun Apr 1 12:13:10 EDT 2007
On Apr 1, 6:59 pm, Duncan Smith <buzz... at urubu.freeserve.co.uk> wrote:
> Hello,
> I am currently implementing (mainly in Python) 'models' that come
> to me as Excel spreadsheets, with little additional information. I am
> expected to use these models in a web application. Some contain many
> worksheets and various macros.
>
> What I'd like to do is extract the data and business logic so that I can
> figure out exactly what these models actually do and code it up. An
> obvious (I think) idea is to generate an acyclic graph of the cell
> dependencies so that I can identify which cells contain only data (no
> parents) and those that depend on other cells. If I could also extract
> the relationships (functions), then I could feasibly produce something
> in pure Python that would mirror the functionality of the original
> spreadsheet (using e.g. Matplotlib for plots and more reliable RNGs /
> statistical functions).
>
> The final application will be running on a Linux server, but I can use a
> Windows box (i.e. win32all) for processing the spreadsheets (hopefully
> not manually). Any advice on the feasibility of this, and how I might
> achieve it would be appreciated.
>
> I assume there are plenty of people who have a better knowledge of e.g.
> COM than I do. I suppose an alternative would be to convert to Open
> Office and use PyUNO, but I have no experience with PyUNO and am not
> sure how much more reliable the statistical functions of Open Office
> are. At the end of the day, the business logic will not generally be
> complex, it's extracting it from the spreadsheet that's awkward. Any
> advice appreciated. TIA. Cheers.
>
> Duncan
I'm not sure I understood what kind of information you want to
get out of the Excel sheet, sorry. But I hope this'll get you started
(at least it has a few nice tokens that might help you in googling):
import win32com.client
class Excel:
def __init__(self, filename):
self.closed = True
self.xlApp =
win32com.client.dynamic.Dispatch('Excel.Application')
self.xlBook = self.xlApp.Workbooks.Open(filename)
self.closed = False
def sheet(self, sheetName):
return self.xlBook.Worksheets(sheetName)
def __del__(self):
if not self.closed:
self.close()
def close(self):
self.xlBook.Close(SaveChanges=1)
self.xlApp.Quit()
self.closed = True
excel = Excel('file.xls')
sheet = excel.sheet(1)
print sheet.Cells(6, 3)
I used it a few years ago to read and populate spreadsheet cells.
More information about the Python-list
mailing list