Struggling with os.path.join and fileinput (was 'Path, strings, and lines')

Malik Rumi malik.a.rumi at gmail.com
Mon Jun 15 22:00:52 EDT 2015


On Saturday, June 13, 2015 at 1:25:52 PM UTC-5, MRAB wrote:
> On 2015-06-13 05:48, Malik Rumi wrote:
> > On Friday, June 12, 2015 at 3:31:36 PM UTC-5, Ian wrote:
> >> On Fri, Jun 12, 2015 at 1:39 PM, Malik Rumi wrote:
> >> >  I am trying to find a list of strings in a directory of files. Here is my code:
> >> >
> >> > # -*- coding: utf-8 -*-
> >> > import os
> >> > import fileinput
> >> >
> >> > s2 = os.listdir('/home/malikarumi/Projects/P5/shortstories')
> >>
> >> Note that the filenames that will be returned here are not fully
> >> qualified: you'll just get filename.txt, not
> >> /home/.../shortstories/filename.txt.
> >>
> >
> > Yes, that is what I want.
> >
> >> > for line in lines:
> >> >      for item in fileinput.input(s2):
> >>
> >> fileinput doesn't have the context of the directory that you listed
> >> above, so it's just going to look in the current directory.
> >
> > Can you explain a little more what you mean by fileinput lacking the context of s4?
> >
> listdir returns the names of the files that are in the folder, not
> their paths.
> 
> If you give fileinput only the names of the files, it'll assume they're
> in the current folder (directory), which they (probably) aren't. You
> need to give fileinput the complete _paths_ of the files, not just
> their names.
> 
> >>
> >> >          if line in item:
> >> >             with open(line + '_list', 'a+') as l:
> >> >                 l.append(filename(), filelineno(), line)
> >>
> >> Although it's not the problem at hand, I think you'll find that you
> >> need to qualify the filename() and filelineno() function calls with
> >> the fileinput module.
> >
> > By 'qualify', do you mean something like
> > l.append(fileinput.filename())?
> >
> >>
> >> > FileNotFoundError: [Errno 2] No such file or directory: 'THE LAND OF LOST TOYS~'
> >>
> >> And here you can see that it's failing to find the file because it's
> >> looking in the wrong directory. You can use the os.path.join function
> >> to add the proper directory path to the filenames that you pass to
> >> fileinput.
> >
> > I tried new code:
> >
> > # -*- coding: utf-8 -*-
> > import os
> > import fileinput
> >
> >
> os.join _returns_ its result.
> 
> > os.path.join('/Projects/Pipeline/4 Transforms', '/Projects/P5/shortstories/')
> > s2 = os.listdir('/Projects/P5/shortstories/')
> 
> At this point, s2 contains a list of _names_.
> 
> You pass those names to fileinput.input, but where are they? In which
> folder? It assumes they're in the current folder (directory), but
> they're not!
> 
> > for item in fileinput.input(s2):
> >       if 'penelope' in item:
> >          print(item)
> >
> > But still got the same errors even though the assignment of the path variable seems to have worked:
> >
> [snip]
> 
> Try this:
> 
>      filenames = os.listdir('/Projects/P5/shortstories/')
>      paths = [os.join('/Projects/P5/shortstories/', name) for name in names]
>      for item in fileinput.input(paths):

I have struggled with this for several hours without making much progress. I was not sure whether your 'names' variable was supposed to be the same as 'filenames'. Also, it should be 'os.path.join', not 'os.join'. Anyway, I thought you had some good ideas, so I worked with them, but I keep getting stuck at one particular point. Here is the current version of my code:

# -*- coding: utf-8 -*-
import os
import fileinput

path1 = os.path.join('Projects', 'P5', 'shortstories', '/')
path2 = os.path.join('Projects', 'P5')
targets = os.listdir(path1)
path3 = ((path1 + target) for target in targets)
path4 = os.path.join(path2,'list_stories')

with open(path4) as arrows:
    quiver = arrows.readlines()
<snip>

And here is my error message:

In [112]: %run algo_h1.py
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
/home/malikarumi/Projects/algo_h1.py in <module>()
      9 path4 = os.path.join(path2,'list_stories')
     10 
---> 11 with open(path4) as arrows:
     12     quiver = arrows.readlines()
     13     for arrow in quiver:

FileNotFoundError: [Errno 2] No such file or directory: 'Projects/P5/list_stories'

I have tried many different ways but can't get Python to find list_stories, the file holding the story names I want to search for in the texts listed in path3. When I got a simpler version of this to work, all the relevant files were in the same directory. This is a more realistic situation, but I can't make it work. Suggestions?
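
Two things in the code above are probably worth checking. First, os.path.join discards everything before an absolute component, so a join that ends with '/' returns just '/'. Second, 'Projects/P5/list_stories' is a relative path, so open() resolves it against the current working directory, which may not be the directory the script expects. The sketch below is only an illustration under stated assumptions, not a drop-in fix: it builds a throwaway tree with tempfile to stand in for the real layout, and the file story1.txt and the helper story_paths are hypothetical names made up for the example. The point is that every path is anchored at one absolute base directory.

```python
import os
import tempfile


def story_paths(base_dir):
    """Return full paths of the regular files directly inside base_dir.

    Hypothetical helper for this example: os.listdir gives bare names,
    so each name is joined back onto the directory it came from.
    """
    paths = []
    for name in sorted(os.listdir(base_dir)):
        full = os.path.join(base_dir, name)
        if os.path.isfile(full):
            paths.append(full)
    return paths


# An absolute component makes os.path.join throw away what came before it.
reset = os.path.join('Projects', 'P5', 'shortstories', '/')

# Throwaway directory tree standing in for the real Projects/P5 layout.
with tempfile.TemporaryDirectory() as root:
    p5 = os.path.join(root, 'Projects', 'P5')
    stories = os.path.join(p5, 'shortstories')
    os.makedirs(stories)
    with open(os.path.join(p5, 'list_stories'), 'w') as f:
        f.write('Penelope\n')
    with open(os.path.join(stories, 'story1.txt'), 'w') as f:
        f.write('Penelope went home.\n')

    # Because p5 is absolute, this works whatever the current directory is.
    list_path = os.path.join(p5, 'list_stories')
    with open(list_path) as arrows:
        quiver = [line.strip() for line in arrows]

    found = story_paths(stories)
```

In the real script the base would presumably be an absolute path such as the '/home/malikarumi/Projects/P5' prefix used earlier in the thread, in place of the temporary directory. Printing os.getcwd() just before the failing open() shows which directory a relative path is being resolved against.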

On a slightly different but closely related issue:

As I continued to work on this on my own, I learned that I could use the class fileinput.FileInput instead of fileinput.input. The supposed advantage is that the class allows many simultaneous instances (see http://stackoverflow.com/questions/21443601/runtimeerror-input-already-active-file-loop). I tried this with a very basic version of my code, one that had worked with fileinput.input, and FileInput worked just as well. Then I wanted to try a 'with' statement, because that would take care of closing the file objects for me. I took my formulation directly from the docs (https://docs.python.org/3.4/library/fileinput.html#fileinput.FileInput), but got a NameError:

In [81]: %run algo_g3.py
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
/home/malikarumi/Projects/P5/shortstories/algo_g3.py in <module>()
      4 ss = os.listdir()
      5 
----> 6 with FileInput(files = ss) as input:
      7     if 'Penelope' in input:
      8         print(input)

NameError: name 'FileInput' is not defined

Well, I am pretty much stumped by that. If it isn't right in the docs, what hope do I have? What am I missing here? Why did I get this error?

I decided to try tinkering with the parens:

with FileInput(files = (ss)) as input:

But that got me the same result:

NameError: name 'FileInput' is not defined

Then I changed how FileInput was called:

with fileinput.FileInput(ss) as input:

This time I got nothing. Zip. Zero:

In [83]: %run algo_g5.py

In [84]: 

In [84]: %run algo_g5.py

In [85]: 

Then I ran a different function

In [85]: fileinput.filename()

and got

RuntimeError: no active input()

Which means the file object is closed. But when? How? As part of the with statement that got the NameError? And since it is closed, why didn't running the last iteration of my script re-open it?
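
Some likely explanations for the three symptoms above, offered as a sketch and not a definitive diagnosis. 'import fileinput' binds only the module name, so a bare FileInput raises NameError unless it is written as fileinput.FileInput or imported with 'from fileinput import FileInput'. The silent run is consistent with "'Penelope' in input" being a membership test on the FileInput object itself: that iterates over every line and compares each whole line, newline included, with the string, so it is almost never true. And the module-level fileinput.filename() only knows about the global state created by fileinput.input(); with a FileInput instance, the methods have to be called on the instance. The example below uses temporary files a.txt and b.txt and a helper named lines_containing, all hypothetical names introduced for illustration.

```python
import fileinput
import os
import tempfile


def lines_containing(paths, needle):
    """Return (filename, line number in file, line) for lines containing needle."""
    hits = []
    with fileinput.FileInput(files=paths) as source:
        for line in source:
            # Substring test against each line, not against the FileInput object.
            if needle in line:
                hits.append((source.filename(),
                             source.filelineno(),
                             line.rstrip('\n')))
    return hits


with tempfile.TemporaryDirectory() as root:
    first = os.path.join(root, 'a.txt')
    second = os.path.join(root, 'b.txt')
    with open(first, 'w') as f:
        f.write('Nothing here.\nPenelope waited.\n')
    with open(second, 'w') as f:
        f.write('Penelope\n')

    # Membership on the object compares whole lines such as 'Penelope\n'
    # with 'Penelope', so it comes out False even though the name appears.
    with fileinput.FileInput(files=[first, second]) as source:
        whole_object_test = 'Penelope' in source

    hits = lines_containing([first, second], 'Penelope')
```

Note that filename() and filelineno() are called on source, the instance, and that the paths passed in are full paths built with os.path.join, so the result does not depend on the current directory.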
