[Tutor] File handling Tab separated files

Wolfgang Maier wolfgang.maier at biologie.uni-freiburg.de
Thu Apr 19 09:57:40 EDT 2018


On 04/19/2018 10:45 AM, Niharika Jakhar wrote:
> Hi
> I want to store a file from BioGRID database (tab separated file, big data)
> into a data structure(I prefer lists, please let me know if another would
> be better) and I am trying to print the objects.
> Here’s my code:
> class BioGRIDReader:
>      def __init__(self, filename):
>              with open('filename', 'r') as file_:
>              read_data = f.read()
>              for i in file_ :
>                  read_data = (i.split('\t'))
>                  return (objects[:100])
> 
> a = BioGRIDReader
> print (a.__init__(test_biogrid.txt))
> 

In addition to your immediate problem, which Steven explained already, 
you will run into more issues with the posted code:

1) in your open() call you have filename quoted.
This is kind of the opposite of the mistake Steven points out.
Here, filename really is an identifier known to Python, but by quoting 
it, you will always try to open a file literally named 'filename'.

2) wrong indentation after the "with open( ..." line.
with starts a block of indented code, in which you will have access to 
the opened file, so every line that uses file_ has to be indented one 
level deeper than the with line itself.

3) file_ is the identifier under which you will have access to the input 
file's content, but on the very next line you're trying to use f.read().
f won't have any meaning for Python at that point

4) Even if you used file_.read() at that step, it would be wrong because 
in the subsequent for loop you are trying to consume file_ line by line.
However, if you read() all of file_ before, there won't be anything left 
to loop over.

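5) You are reading the data into a list called read_data (which, as 
written, gets overwritten with the fields of the current line on every 
pass through the loop), but you are trying to return a slice of an 
identifier objects, which Python will not know about when it gets there.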

6) As Steven said, you shouldn't call __init__() directly,
but that also means that you should not return data from it.
Instead you might want to only parse the file contents in the __init__ 
method and store that data as an attribute in self (e.g., use self.data 
for this). Then you could use things like this:

a = BioGRIDReader('test_biogrid.txt')
print(a.data[:100])
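
For those two lines to work, the class could look roughly like the 
sketch below. This is just one way to put points 1) to 6) together, 
not the only one: the attribute name self.data is my suggestion from 
above, and I am assuming plain tab-separated text with one record per 
line.

```python
class BioGRIDReader:
    def __init__(self, filename):
        # One list of fields per line of the input file.
        self.data = []
        # filename is NOT quoted, so the value passed in is used.
        with open(filename, 'r') as file_:
            # Loop over the file object directly, without calling
            # read() first, so there is still something to iterate over.
            for line in file_:
                # Drop the trailing newline, then split on tabs.
                fields = line.rstrip('\n').split('\t')
                self.data.append(fields)
        # Nothing is returned from __init__; the parsed lines are
        # available afterwards as the data attribute of the instance.
```

Storing a list of lists is fine for a start. If the file turns out to 
be too big for that, you can still change what __init__ keeps without 
touching the code that uses a.data.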

Best,
Wolfgang



