Input from a file as a command line argument

Robin Munn rmunn at pobox.com
Sat Apr 5 12:56:13 EST 2003


Teemu Luojola <teemu.luojola at pbezone.net> wrote:
> I have read through Python documentation and tried several searches, but 
> I can't find a simple answer to this question.
> 
> If I make an executable script in linux environment, how can I pass 
> input to that script from a file on command line? Like the following:
> 
> $ script.py input.txt

In this case, the filename would be found as sys.argv[1]. (You'd have to
import the sys module in your code, of course.) You would want to do
something like:

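    import sys

    try:
        filename = sys.argv[1]
    except IndexError:
        # No parameters at all: print usage information and exit
        usage()  # assumed to be defined elsewhere in your script
        sys.exit(1)
    # open() works in every Python version; file() exists only in Python 2
    input_file = open(filename, 'r')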

A couple of words of explanation of the above code. First, sys.argv is a
list that works very much like the argv[] array in C code: sys.argv[0]
will be the name of the current process (usually the filename of your
"main" Python source file, but that's not 100% guaranteed), sys.argv[1]
will be the first command-line parameter, etc. There is no sys.argc
because you can get similar results by doing len(sys.argv). Beware --
len(sys.argv) will always be equal to (number_of_parameters + 1) because
sys.argv[0] is guaranteed to contain something.
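
As a small illustration of that layout, here is a sketch that splits an
argv-style list into the program name and the parameters. The function
name describe_arguments and the sample list are made up for this example;
in a real script you would pass sys.argv itself.

```python
def describe_arguments(argv):
    """Return (program_name, parameters) split out of an argv-style list."""
    program_name = argv[0]   # what the script was invoked as
    parameters = argv[1:]    # everything typed after the script name
    return program_name, parameters


# What sys.argv would hold for the command line: script.py input.txt
sample_argv = ["script.py", "input.txt"]
name, params = describe_arguments(sample_argv)
print("program:", name)
print("parameter count:", len(params))   # always len(argv) - 1
```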

Now: the "try: ... except IndexError:..." block is an example of Python
exception handling. IndexError gets raised if you try to access an index
that's off the end of a list. In this script, that will only happen if
the script was run with no arguments at all, in which case we should
print out some usage information and exit with an error code.
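
To see the pieces together, here is a minimal sketch of a complete
script. The body of usage(), the main() wrapper, and the line-counting
step are illustrative choices of mine, not anything Python requires. It
calls open() rather than file() so that it also runs on Python 3.

```python
import sys


def usage(program_name):
    """Print a one-line usage message to stderr (hypothetical helper)."""
    sys.stderr.write("usage: %s INPUT_FILE\n" % program_name)


def main(argv):
    """Open the file named by the first parameter and count its lines."""
    try:
        filename = argv[1]
    except IndexError:
        # No parameters at all: print usage information and report failure
        usage(argv[0])
        return 1
    # The with statement closes the file even if the processing raises
    with open(filename, "r") as input_file:
        line_count = sum(1 for _ in input_file)
    print("%s: %d lines" % (filename, line_count))
    return 0


# In a real script the last line would be: sys.exit(main(sys.argv))
```

Returning the exit status from main() instead of calling sys.exit()
inside it is only a convenience: it lets you call main() from an
interactive session without the interpreter exiting.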

Next I create a file object with a call to file() or open(). In the
Python 2.2 of this post's era both names refer to the same built-in, but
Python 3 removed file(), so open() is the portable choice. The name
"file" is still a reminder of a very important point: do *not* use the
name "file" for your own variables! In Python 2 it shadows the built-in
and causes subtle and unexpected bugs later in your code. Use names like
"input_file" or "output_file", which are more descriptive anyway.

By the way, if any of this seems insultingly obvious to you, I
apologize. I figure it's better to explain too much and be redundant
than run the risk of explaining too little and leaving the other person
just as confused as before.

> or perhaps
> 
> $ script.py < input.txt

In this case, the file would be available as sys.stdin, and it would
already be open, so you wouldn't need to call file().
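
If you want one script to accept both invocations, a possible sketch is
to fall back to sys.stdin when no filename was given. The helper name
open_input and its stdin parameter are inventions for this example.

```python
import sys


def open_input(argv, stdin=sys.stdin):
    """Return the file named in argv[1] if present, otherwise stdin."""
    if len(argv) > 1:
        # A filename was given on the command line: script.py input.txt
        return open(argv[1], "r")
    # No filename: read whatever was redirected in, as in script.py < input.txt
    return stdin
```

A script would call open_input(sys.argv) and loop over the result. The
caller is still responsible for closing the file when it is not stdin.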

> I make now a comparison to a Perl script. In Perl I would write the 
> following to handle the input (line by line) from a file:
> 
> while (<>) {
> 	## (do something to $_)
> }

In Python that would be done as:

    for line in input_file:
        # Do something with line

or, if you are using stdin:

    for line in sys.stdin:
        # Do something with line
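
Perl's "while (<>)" reads from the files named on the command line and
falls back to stdin when there are none. The closest standard-library
analogue I know of in Python is the fileinput module. A minimal sketch,
where the function name read_lines is just something I picked for the
example:

```python
import fileinput


def read_lines(files=None):
    """Collect every input line, newline included, like Perl's while (<>)."""
    lines = []
    # files=None means: use the names in sys.argv[1:], or stdin if there are none
    with fileinput.input(files=files) as stream:
        for line in stream:
            lines.append(line)
    return lines
```

Called as read_lines() inside a script, this should handle both
"script.py input.txt" and "script.py < input.txt" with the same loop.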

> and I can execute the script simply with
> 
> $ script.perl < input.txt
> 
> and it would process each line in input.txt.
> 
> How would a similar Python code look like?

There you go. One last caveat: it's been over a year since I did any
Perl programming, so I don't remember whether the "while (<>)" idiom in
Perl strips newlines off the end of lines or not. In Python, newlines
are *not* stripped off the end of lines in the for loop. So take that
into account in your processing: if you want to strip the newlines off
the end, you might want to do something like this as the first line of
your loop:

    for line in sys.stdin:
        line = line.rstrip('\n')
        # Further processing

Hope this helps.

-- 
Robin Munn <rmunn at pobox.com>
http://www.rmunn.com/
PGP key ID: 0x6AFB6838    50FF 2478 CFFB 081A 8338  54F7 845D ACFD 6AFB 6838



