How to test input via subprocess.Popen with data from file

Roel Schroeven roel at roelschroeven.net
Fri Mar 11 04:30:44 EST 2022


On 11/03/2022 at 10:11, Roel Schroeven wrote:
> On 10/03/2022 at 13:16, Loris Bennett wrote:
>> Hi,
>>
>> I have a command which produces output like the
>> following:
>>
>>    Job ID: 9431211
>>    Cluster: curta
>>    User/Group: build/staff
>>    State: COMPLETED (exit code 0)
>>    Nodes: 1
>>    Cores per node: 8
>>    CPU Utilized: 01:30:53
>>    CPU Efficiency: 83.63% of 01:48:40 core-walltime
>>    Job Wall-clock time: 00:13:35
>>    Memory Utilized: 6.45 GB
>>    Memory Efficiency: 80.68% of 8.00 GB
>>
>> I want to parse this and am using subprocess.Popen and accessing the
>> contents via Popen.stdout.  However, for testing purposes I want to save
>> various possible outputs of the command as text files and use those as
>> inputs.
>>
>> What format should I use to pass data to the actual parsing function?
>>
> Is this a command that you run, that produces that output, and then
> stops (as opposed to a long-running program that generates a bunch of
> output from time to time)?
> Because in that case I would use subprocess.run() with
> capture_output=True instead of subprocess.Popen(). subprocess.run()
> returns a CompletedProcess instance which has stdout and stderr members
> that contain the captured output as byte sequences or strings,
> depending on the parameters you passed.
>
> So in your case I would simply read the content of each text file as a 
> whole into a string, and use subprocess.run() to get the command's 
> output also as a string. Then you can have a parse function that 
> accepts such strings, and works exactly the same for the text files as 
> for the command output. Your parse function can then use splitlines() 
> to access the lines individually. The data size is very small so it's 
> not a problem to have it all in memory at the same time (i.e. no need 
> to worry about trying to stream it instead).
>
Very simple example:

     import subprocess
     from pprint import pprint

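     def parse(state_data):
         # Build a dict from lines of the form "key: value".
         state_dict = {}
         for line in state_data.splitlines():
             # Split on the first ': ' only; skip lines without one.
             key, sep, value = line.strip().partition(': ')
             if sep:
                 state_dict[key] = value
         return state_dict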

     def read_from_command():
         return subprocess.run(['./jobstate'], capture_output=True,
                               check=True, encoding='UTF-8').stdout

     def read_from_file(fn):
         with open(fn, 'rt', encoding='UTF-8') as f:
             return f.read()

     pprint(parse(read_from_command()))
     pprint(parse(read_from_file('jobfile')))
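
For the testing part of the question, something along these lines might
work. It is only a sketch: the file name jobstate_completed.txt and the
test function name are made up for illustration, and parse() and
read_from_file() are repeated (in slightly more defensive form) so the
snippet runs on its own. The function is named so that pytest would
pick it up, but it can also simply be called directly.

```python
import tempfile
from pathlib import Path

# Sample text in the same "key: value" format as the command output.
SAMPLE = """\
Job ID: 9431211
Cluster: curta
State: COMPLETED (exit code 0)
CPU Efficiency: 83.63% of 01:48:40 core-walltime
"""

def parse(state_data):
    # One "key: value" pair per line; lines without ': ' are skipped.
    state_dict = {}
    for line in state_data.splitlines():
        key, sep, value = line.strip().partition(': ')
        if sep:
            state_dict[key] = value
    return state_dict

def read_from_file(fn):
    with open(fn, 'rt', encoding='UTF-8') as f:
        return f.read()

def test_parse_saved_output():
    # Write the sample to a temporary file (hypothetical name) and
    # parse it exactly as the real command output would be parsed.
    with tempfile.TemporaryDirectory() as tmp:
        sample_file = Path(tmp) / 'jobstate_completed.txt'
        sample_file.write_text(SAMPLE, encoding='UTF-8')
        result = parse(read_from_file(sample_file))
    assert result['Job ID'] == '9431211'
    assert result['State'] == 'COMPLETED (exit code 0)'
    assert result['CPU Efficiency'] == '83.63% of 01:48:40 core-walltime'
```

In a real test suite the sample files would presumably live next to the
tests instead of being written on the fly, with one file per kind of
output you want to cover.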

-- 
"Iceland is the place you go to remind yourself that planet Earth is a
machine... and that all organic life that has ever existed amounts to a greasy
film that has survived on the exterior of that machine thanks to furious
improvisation."
         -- Sam Hughes, Ra


