python noob, multiple file i/o

Jordan jordan.taylor2 at gmail.com
Fri Mar 16 08:36:13 EDT 2007


On Mar 16, 7:09 am, Laurent Rahuel <lrahuel.notg... at voila.fr> wrote:
> Maybe the walk method in the os module is what you need:
> http://docs.python.org/lib/os-file-dir.html
>
> Regards
>
> Jon Clements wrote:
> > On 16 Mar, 09:02, "Jon Clements" <jon... at googlemail.com> wrote:
> >> On 16 Mar, 03:56, "hiro" <Nun... at gmail.com> wrote:
>
> >> > Hi there,
>
> >> > I'm very new to Python. The problem I need to solve is: what's the
> >> > "best/simplest/cleanest" way to read in multiple files (ASCII), do
> >> > stuff to them, and write them out (ASCII)?
>
> >> > --
> >> > import os
>
> >> > filePath = 'O:/spam/eggs/'
> >> > for filename in os.listdir(filePath):   # straight from docs
> >> >     # iterate the function through all the files in the directory
> >> >     # write results to separate files  <- this is where I'm mostly stuck.
> >> >     pass
>
> >> > --
> >> > For clarity's sake, the file naming conventions for the files I'm
> >> > reading from are file.1.txt -> file.nth.txt
>
> >> > It's been a long day and I'm at my wits' end, so I apologize in
> >> > advance if I'm not making much sense here.
> >> > Example syntax would also be great, if you can share some recipes.
>
> >> I'd try the glob module.
>
> >> [code]
> >> import glob
>
> >> # Get a list of filenames matching wildcard criteria
> >> # (note that a relative path is resolved against the working directory of the program)
> >> matching_file_list = glob.glob('O:/spam/eggs/*.txt')
>
> >> # For each file that matches, open it and process it in some way...
> >> for filename in matching_file_list:
> >>     infile = file(filename)
> >>     outfile = file(filename + '.out','w')
> >>     # Process the input file line by line...
> >>     for line in infile:
> >>         # Do something more useful here: change line and write to outfile?
> >>         pass
> >>     # Be explicit with file closures
> >>     outfile.close()
> >>     infile.close()
> >> [/code]
>
> >> Of course, you can change the wild card criteria in the glob
> >> statement, and also then filter further using regular expressions to
> >> choose only files matching more specific criteria. This should be
> >> enough to get you started though.
>
> >> hth
>
> >> Jon.
>
> > Okies; postcoding before finishing your early morning coffee is not
> > the greatest of ideas!
>
> > I forgot to mention that glob will return matching directory names
> > as well as files. You'll need to check that os.path.isfile(filename)
> > returns True before processing it...
>
> > Jon.
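
The isfile check Jon mentions could look roughly like this. It is only a
sketch: the helper name matching_files is made up for illustration, and
the pattern is whatever you would otherwise hand to glob.glob.

```python
import glob
import os


def matching_files(pattern):
    """Return the paths matching pattern that are regular files, sorted."""
    # glob.glob can also return directories whose names match the
    # pattern, so keep only the entries os.path.isfile accepts.
    return sorted(p for p in glob.glob(pattern) if os.path.isfile(p))
```

For the original example that would be matching_files('O:/spam/eggs/*.txt'),
used in place of the bare glob.glob call.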

Also, Python advises using open() over file() (although I admit to
using file() myself more often than not).

Leaving the extension as .out is not necessarily convenient, either.
You had glob search for .txt, so how about keeping that extension on
the output files:
>> for filename in matching_file_list:
>>     infile = open(filename, 'r')  # 'r' is the default; added for clarity
>>     # Assumes the original file extension is .txt (4 characters)
>>     outfile = open(filename[:-4] + '.out.txt', 'w')
>>     # Process the input file line by line...
>>     for line in infile:
>>         # Do something here. You don't have to iterate line by line;
>>         # if you say what you want to do to each file, we can
>>         # probably help out.
>>         pass
>>     # Be explicit with file closures
>>     outfile.close()
>>     infile.close()

Might also add some try/except statements to be safe ;).
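
Something along these lines, as a rough sketch rather than a finished
recipe. It assumes a current Python 3, where the with statement closes
the files for you. The names process_file, output_name and transform are
made up for illustration; transform stands for whatever you want done to
each line.

```python
import os


def output_name(in_path):
    """Hypothetical naming rule: file.1.txt -> file.1.out.txt."""
    root, ext = os.path.splitext(in_path)
    return root + '.out' + ext


def process_file(in_path, out_path, transform):
    """Write transform(line) for each line of in_path to out_path.

    Returns True on success, False if a file could not be read or written.
    """
    try:
        # The with statement closes both files, even if transform raises.
        with open(in_path, 'r') as infile, open(out_path, 'w') as outfile:
            for line in infile:
                outfile.write(transform(line))
    except (IOError, OSError) as err:
        # Report the problem and let the caller move on to the next file.
        print('skipping %s: %s' % (in_path, err))
        return False
    return True
```

In the loop you would call process_file(filename, output_name(filename),
str.upper), with your own function in place of str.upper. One thing to
watch: the output names still end in .txt, so a second run with the same
*.txt pattern would pick up the output files too.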

Cheers,
Jordan



