[Chennaipy] Chennaipy - Monday Module - 06 Jun 2022

selvi dct selvi.dct at gmail.com
Mon Jun 6 12:38:13 EDT 2022


Parsing a binary file format into plain text is usually hard. Today we will look
at a module that comes to the rescue by converting .docx files to Markdown (.md) files.

Module: docx2md

Installation: pip install docx2md
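
Before calling the converter from a script, it can help to confirm that the
package is importable. This is a minimal sketch using only the standard
library; the helper name is_installed is made up for this example and is not
part of docx2md.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the import system can locate the top-level module."""
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed("docx2md"):
        print("docx2md is available")
    else:
        print("docx2md is missing; run: pip install docx2md")
```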


Converts Microsoft Word document files (.docx extension) to Markdown files.


% python -m docx2md ~/Downloads/example.docx output.msd

# save output.msd

# save media/image1.png

# save media/image4.jpg

# save media/image3.gif

# save media/image2.png
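
The same command can be driven from Python. The sketch below wraps the
"python -m docx2md" invocation shown above with subprocess and collects the
paths from the "# save ..." lines. The function names (build_command,
saved_paths, convert) are hypothetical, and it is an assumption that the
"# save" lines arrive on stdout or stderr, so both streams are read.

```python
import re
import subprocess
import sys
from typing import List

# Matches lines such as "# save media/image1.png" from the run shown above.
SAVE_LINE = re.compile(r"^#\s*save\s+(\S.*?)\s*$")


def build_command(src: str, dst: str) -> List[str]:
    """Build the equivalent of `python -m docx2md SRC DST`."""
    return [sys.executable, "-m", "docx2md", src, dst]


def saved_paths(log_text: str) -> List[str]:
    """Collect the paths reported on '# save ...' lines, in order."""
    paths = []
    for line in log_text.splitlines():
        match = SAVE_LINE.match(line.strip())
        if match:
            paths.append(match.group(1))
    return paths


def convert(src: str, dst: str) -> List[str]:
    """Run the converter and return the files it reported saving.

    Assumption: the '# save' lines may be written to stdout or stderr,
    so both captured streams are scanned.
    """
    result = subprocess.run(
        build_command(src, dst), capture_output=True, text=True, check=True
    )
    return saved_paths(result.stdout + "\n" + result.stderr)
```

A call such as convert("example.docx", "output.md") would then return the
list of files the tool reported writing, provided docx2md is installed.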


% cat output.msd

<div class="break"></div>

# chapter 1

text of chapter 1

## section 1-1

text of section 1-1

### subsection 1-1-1

text of subsection 1-1-1

<div class="break"></div>

insert png

<img src="media/image1.png" id="image1">

insert bmp

<img src="media/image2.png" id="image2">

insert gif

<img src="media/image3.gif" id="image3">

insert jpg

<img src="media/image4.jpg" id="image4">

<div class="break"></div>

* aaaaa

* bbbbb

* ccccc

* ddddd

    * eeeee

* fffff

    * ggggg

* hhhhh

    * iiiii

* jjjjj

<table id="table1">

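Since the converted output refers to images through <img> tags, a quick check
that every referenced file exists can catch a broken conversion. This is a
rough sketch using the standard library's html.parser; the names
ImageCollector, image_sources and missing_images are invented here, and it
assumes the output is UTF-8 text and that image paths are relative to the
directory holding the Markdown file.

```python
from html.parser import HTMLParser
from pathlib import Path
from typing import List


class ImageCollector(HTMLParser):
    """Record the src attribute of every <img> tag that is fed in."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def image_sources(markdown_text: str) -> List[str]:
    """Return the image paths referenced in the text, in document order."""
    collector = ImageCollector()
    collector.feed(markdown_text)
    collector.close()
    return collector.sources


def missing_images(markdown_path: str) -> List[str]:
    """List referenced images that do not exist next to the Markdown file.

    Assumption: paths are relative to the Markdown file's directory.
    """
    md_file = Path(markdown_path)
    text = md_file.read_text(encoding="utf-8")
    return [s for s in image_sources(text) if not (md_file.parent / s).exists()]
```

An empty list from missing_images means every image mentioned in the output
was found on disk.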