Using python to convert PDF document to MSWord documents

Timothy Grant timothy.grant at gmail.com
Tue Sep 28 12:31:04 EDT 2004


----- Original Message -----
From: JEET <hjeet_in at yahoo.com>
Date: Tue, 28 Sep 2004 17:13:17 +0100 (BST)
Subject: Using python to convert PDF document to MSWord documents
To: python-list at python.org


 
 
Hello All, 
  
Can anyone please tell me whether there are any Python modules
available to convert PDF documents to MS Word documents? If not, can
you please suggest how I can achieve this?
  
Many thanks in advance, 
  
Regards 
Deb

======

What you ask is quite difficult. My understanding is that PDF is
closely related to PostScript (it shares the same imaging model) but
is a separate file format. Depending on
the nature of the PDF (is it encrypted, are there other special
provisions?) you may be able to strip the raw text from the file and
create an RTF file from it. However, you will lose all formatting in
this case. If the formatting is "standard" across all the PDFs, you may
be able to infer from the text something that will allow you to
restore some or all of it.
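
As a rough illustration of the RTF step, here is a minimal sketch that
wraps already-extracted text in a bare RTF document that Word should be
able to open. It assumes the raw text has been pulled out of the PDF by
some other means (that part is not shown). The names escape_rtf and
text_to_rtf, the font, and the font size are my own choices for
illustration, not part of any library.

```python
def escape_rtf(text):
    """Escape plain text so it can sit inside an RTF group."""
    # Normalise line endings so each line break becomes one \par.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    out = []
    for ch in text:
        if ch in "\\{}":
            # Backslash and braces are RTF syntax and must be escaped.
            out.append("\\" + ch)
        elif ch == "\n":
            out.append("\\par\n")
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Non-ASCII: emit \uN? per UTF-16 code unit, where N is a
            # signed 16-bit decimal and "?" is the fallback character.
            data = ch.encode("utf-16-le")
            for i in range(0, len(data), 2):
                unit = int.from_bytes(data[i:i + 2], "little")
                if unit > 32767:
                    unit -= 65536
                out.append("\\u%d?" % unit)
    return "".join(out)


def text_to_rtf(text):
    """Wrap plain text in a minimal RTF document (assumed font and size)."""
    header = "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Times New Roman;}}\\f0\\fs24 "
    return header + escape_rtf(text) + "}"
```

The result contains only ASCII characters, so it can be written to a
file such as out.rtf with an ordinary text-mode write. Only paragraph
breaks survive; fonts, tables, and layout from the PDF do not.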







-- 
Stand Fast,
    tjg.


