Merge pdf files using information from two files

dieter dieter at handshake.de
Sat Sep 9 02:49:12 EDT 2017


accessnewbie at gmail.com writes:
> I have two files (right now they are spreadsheets but I can export them to any format). 
>
> File1 has StoreID (unique) in one column and a pdf map location in the second column. (Names not really sequenced numerically)
>
> 1     C:/maps/map1.pdf
> 2     C:/maps/map2.pdf
> 3     C:/maps/map3.pdf
> 4     C:/maps/map4.pdf
>
> File2 has 3 columns. Column1 is the County name (unique), Column2 holds the IDs of the stores that fall in that county, separated by commas, and Column3 is the warehouse that services those stores.
>
> County1    1,2    Warehouse1
> County2    1,3    Warehouse1
> County3    3      Warehouse4
> County4    2,4    Warehouse3
>
> Is it possible to compare both files, append the maps that belong to each county, and name the result county_warehouse.pdf?

This will not be easy: PDF is a page-layout-oriented format, not
a format designed for processing general structured
data (as, e.g., XML is).

You could use a package like "pdfminer" to get at the text content
of a PDF file. You will then need to write your own specialized code
to reconstruct the column information.
You could then use a PDF-generating package such as "reportlab"
to generate a new PDF file.
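
Whatever ends up handling the PDF side, the bookkeeping part of the
question (matching store IDs to map locations and deriving the
county_warehouse.pdf name) can be sketched with the standard library
alone. The following is only a rough sketch under assumptions of mine:
both tables are available as plain text with whitespace-separated
columns, no value contains a space, and the store IDs are separated
by commas without spaces, as in the sample rows. The names parse_rows
and plan_merges are made up for illustration.

```python
def parse_rows(text, n_columns):
    """Split each non-empty line into exactly n_columns fields.

    Assumes columns are separated by whitespace and that no value
    (e.g. a map path) contains whitespace itself.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != n_columns:
            raise ValueError("unexpected row: %r" % line)
        rows.append(fields)
    return rows


def plan_merges(stores_text, counties_text):
    """Return {output file name: [map paths]} with one entry per county."""
    # File1: StoreID -> map location
    map_by_store = {
        store_id: path for store_id, path in parse_rows(stores_text, 2)
    }
    plan = {}
    # File2: county, comma-separated store IDs, warehouse
    for county, store_ids, warehouse in parse_rows(counties_text, 3):
        output_name = "%s_%s.pdf" % (county, warehouse)
        # A store ID missing from File1 raises KeyError here.
        plan[output_name] = [map_by_store[s] for s in store_ids.split(",")]
    return plan


# Sample rows copied from the question.
STORES = """\
1     C:/maps/map1.pdf
2     C:/maps/map2.pdf
3     C:/maps/map3.pdf
4     C:/maps/map4.pdf
"""

COUNTIES = """\
County1    1,2    Warehouse1
County2    1,3    Warehouse1
County3    3      Warehouse4
County4    2,4    Warehouse3
"""

if __name__ == "__main__":
    for name, maps in plan_merges(STORES, COUNTIES).items():
        print(name, maps)
```

Each entry of the resulting dictionary names one output file and
lists, in order, the maps that would have to go into it; the PDF
handling itself is not covered by this sketch.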



