[Tutor] How to Scrape Text from PDFs

Alan Gauld alan.gauld at yahoo.co.uk
Mon Jun 17 18:38:57 EDT 2019


On 17/06/2019 06:30, Cem Vardar wrote:
> some PDF files that have links for some websites and I need to extract these links 

There is a module that may help: PyPDF2

Here is a post showing how to extract the text from a PDF, which should
include the links, provided they appear as visible text on the page.

https://stackoverflow.com/questions/34837707/how-to-extract-text-from-a-pdf-file

There may even be more specialised tools for link extraction if you search a little further...
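
As a rough sketch of the approach above (not a tested solution for your
particular files), something like the following pulls the text out of each
page and then searches it for URLs. It assumes a recent PyPDF2 release that
provides PdfReader and extract_text(); older releases spell these
PdfFileReader and extractText(). The names find_urls, extract_pdf_text and
URL_PATTERN are my own inventions for illustration, and the regular
expression is deliberately simple, so it will miss unusual URLs.

```python
import re

# Rough pattern for http/https URLs; it stops at whitespace, quotes and brackets.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def find_urls(text):
    """Return the URLs found in text, in order, without duplicates."""
    urls = []
    for match in URL_PATTERN.findall(text):
        url = match.rstrip(".,;:")  # drop trailing sentence punctuation
        if url not in urls:
            urls.append(url)
    return urls


def extract_pdf_text(path):
    """Concatenate the text of every page in the PDF at path."""
    # Imported here so find_urls() can be used without PyPDF2 installed.
    from PyPDF2 import PdfReader  # assumption: PyPDF2 2.x or later

    reader = PdfReader(path)
    # extract_text() may return an empty string for scanned/image-only pages.
    return "\n".join(page.extract_text() or "" for page in reader.pages)
```

You would call it as find_urls(extract_pdf_text("myfile.pdf")), where
"myfile.pdf" is a placeholder for one of your own files. Bear in mind that
this only finds links that are printed as text on the page; links that exist
only as clickable areas will not show up in the extracted text.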




-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos



