Using Python instead of Bash

Cem Karan cfkaran2 at gmail.com
Sun May 31 10:14:29 EDT 2015


> I help someone that has problems reading. For this I take photo's of
> text, use convert from ImageMagick to make a good contrast (original
> paper is grey) and use lpr to print it a little bigger.
> 
> Normally I would implement this in Bash, but I thought it a good idea
> to implement it in Python. This is my first try:
>    import glob
>    import subprocess
> 
>    treshold = 66
>    count = 0
>    for input in sorted(glob.glob('*.JPG')):
>        count += 1
>        output = '{0:02d}.png'.format(count)
>        print('Going to convert {0} to {1}'.format(input, output))
>        p = subprocess.Popen(['convert', '-threshold', '{0}%'.format(treshold), input, output])
>        p.wait()
>        print('Going to print {0}'.format(output))
>        p = subprocess.Popen(['lpr', '-o', 'fit-to-page', '-o', 'media=A4', output])
>        p.wait()
> 
> There have to be some improvements: display before printing,
> possibility to change threshold, … But is this a good start, or should
> I do it differently?


As a first try, I think it's pretty good, but to really answer your question, we could use a little more information.

- Are you using Python 2 or Python 3?  There are slightly easier ways to do this using concurrent.futures objects, but they are part of the standard library only under Python 3. (See https://docs.python.org/3/library/concurrent.futures.html)

- In either case, subprocess.call(), subprocess.check_call(), or subprocess.check_output() may be easier to use.  That said, your code is perfectly fine!  The only real difference is that subprocess.call() waits for the command to finish and returns its exit status, so you don't need the p.wait() calls from above.  (See https://docs.python.org/2.7/library/subprocess.html and https://docs.python.org/3/library/subprocess.html)
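
As a rough illustration of the second point, here is a minimal sketch of replacing Popen() plus wait() with a single blocking call.  The helper names build_convert_command and run_and_wait are made up for this example, and the demo at the bottom runs the Python interpreter itself as a stand-in command so that it works without ImageMagick installed.

```python
import subprocess
import sys


def build_convert_command(threshold, source, target):
    """Return the argument list the original loop passes to convert."""
    return ['convert', '-threshold', '{0}%'.format(threshold), source, target]


def run_and_wait(command):
    """Run command and block until it exits.

    check_call() returns 0 on success and raises
    subprocess.CalledProcessError if the exit status is non-zero,
    so no separate wait() is needed.
    """
    return subprocess.check_call(command)


if __name__ == '__main__':
    # Stand-in for the real convert call (an assumption for the demo):
    # the current Python interpreter is always available.
    run_and_wait([sys.executable, '-c', 'print("converted")'])
```

With ImageMagick installed, run_and_wait(build_convert_command(66, 'photo.JPG', '01.png')) would do the same job as the Popen()/wait() pair in your loop, and a failed conversion would raise an exception instead of passing silently.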



The following code does the conversion in parallel and submits the jobs to the printer serially.  That should ensure that the printed output is also in sorted order, but you might want to double check before relying on it too much.  The major problem with it is that you can't display the output before printing; since everything is running in parallel, you'll have race conditions if you try.  **I DID NOT TEST THIS CODE, I JUST TYPED IT OUT IN MY MAIL CLIENT!**  Please test it carefully before relying on it!

"""
import subprocess
import concurrent.futures
import glob
import os.path

_THRESHOLD = 66

def _collect_filenames():
    files = glob.glob('*.JPG')

    # I build a set of the real paths so that if you have
    # symbolic links that all point to the same file,
    # they are automatically collapsed to a single file
    real_files = {os.path.realpath(x) for x in files}
    base_files = [os.path.splitext(x)[0] for x in real_files]
    return base_files

def _convert(base_file_name):
    """
    This code is slightly different from your code.  Instead
    of using numbers as names, I use the base name of the file and
    append '.png' to it.  You may need to adjust this to ensure
    you don't overwrite anything.
    """
    input = base_file_name + ".JPG"
    output = base_file_name + ".png"
    subprocess.call(['convert', '-threshold', '{0}%'.format(_THRESHOLD), input, output])

def _print_files_in_order(base_files):
    base_files.sort()
    for f in base_files:
        output = f + ".png"
        subprocess.call(['lpr', '-o', 'fit-to-page', '-o', 'media=A4', output])

def driver():
    base_files = _collect_filenames()

    # If you use an executor as a context manager, then the
    # executor will wait until all of the submitted jobs finish
    # before it returns.  The submitted jobs will execute in
    # parallel.
    with concurrent.futures.ProcessPoolExecutor() as executor:
        for f in base_files:
            executor.submit(_convert, f)

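    _print_files_in_order(base_files)

# Run only when executed as a script; ProcessPoolExecutor may
# re-import this module in its worker processes.
if __name__ == '__main__':
    driver()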
"""

Thanks,
Cem Karan

