Merging pdf files based on a value in a field

accessnewbie at gmail.com accessnewbie at gmail.com
Fri Sep 15 12:06:18 EDT 2017


Suggestions to use PyPDF2 to append files did not pan out, so I had to go with the arcpy module. PyPDF2 does NOT merge correctly when producing multiple output files grouped on a shared value or key (essentially the group-by concept).

    import csv
    import glob
    import os
    import shutil

    import arcpy

    # clear out files from destination directory
    files = glob.glob(r'C:\maps\JoinedMaps\*')
    for f in files:
        os.remove(f)

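    # open csv file (raw string so the backslashes in the path are taken literally)
    f = open(r"C:\maps\Maps.csv", "r")
    ff = csv.reader(f)

    # set variable to establish previous row of csv file (for comparison)
    # note: the first row is only written out if the second row has the same county
    pre_line = next(ff)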

    # Iterate through csv file

    for cur_line in ff:
        # new file name and location based on value in column (county name)
        newPdfFile = r'C:\maps\JoinedMaps\County-' + cur_line[0] + '.pdf'
        # establish pdf files to be appended
        joinFile = pre_line[1]
        appendFile = cur_line[1]

        # If columns in both rows match
        if pre_line[0] == cur_line[0]: # <-- compare first column
            # If destination file already exists, append file referenced in current row
            if os.path.exists(newPdfFile):
                tempPdfDoc = arcpy.mapping.PDFDocumentOpen(newPdfFile)
                tempPdfDoc.appendPages(appendFile)
            # Otherwise create destination and append files referenced in both the previous and current row
            else:
                tempPdfDoc = arcpy.mapping.PDFDocumentCreate(newPdfFile)
                tempPdfDoc.appendPages(joinFile)
                tempPdfDoc.appendPages(appendFile)
            # save and delete temp file
            tempPdfDoc.saveAndClose()
            del tempPdfDoc
        else:
            # if no match, do not merge, just copy
            shutil.copyfile(appendFile, newPdfFile)

        # reset variable
        pre_line = cur_line

Final output looked like this:

County-County1 (2 pages - Map1 and Map2)
County-County2 (2 pages - Map1 and Map3)
County-County3 (1 page - Map3)
County-County2 (3 pages - Map2, Map3, and Map4)
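
The previous-row comparison above only works when rows for the same county sit next to each other in the csv file. A rough sketch of an alternative is to collect the file names per county first and do the PDF work afterwards. This is only an illustration, not what I actually ran: the function name group_pdfs_by_county and the sample rows are made up, and it assumes column 0 is the county and column 1 is the pdf path, as in the script above.

```python
import csv
from collections import OrderedDict


def group_pdfs_by_county(rows):
    """Map each county (column 0) to its list of pdf paths (column 1).

    Counties keep the order in which they first appear in the input.
    """
    groups = OrderedDict()
    for row in rows:
        if len(row) < 2:
            continue  # skip blank or malformed rows
        county, pdf_path = row[0], row[1]
        groups.setdefault(county, []).append(pdf_path)
    return groups


# Hypothetical sample rows standing in for Maps.csv (county, pdf path).
sample_lines = [
    "County1,Map1.pdf",
    "County1,Map2.pdf",
    "County2,Map3.pdf",
    "County3,Map4.pdf",
    "County2,Map5.pdf",  # not adjacent to the other County2 row
]

groups = group_pdfs_by_county(csv.reader(sample_lines))
for county, pdf_paths in groups.items():
    print("%s: %d file(s)" % (county, len(pdf_paths)))
```

Each list could then be written out in one go with the same arcpy calls used above (PDFDocumentCreate, one appendPages per path, then saveAndClose). A county with a single file needs no special case, and the csv file does not have to be sorted.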


