[Tutor] IndexError: list index out of range

Alan Gauld alan.gauld at btinternet.com
Thu Jun 26 14:42:50 CEST 2014


On 26/06/14 09:18, Myunggyo Lee wrote:
> Hi all
> I just started to learn python language.

Welcome.

> I'm trying to figure out the reason of error but i couldn't find it.
> first imports short.txt(is attached to this mail)
> and read in dictionary named gpdic1
>

Others have pointed out the specific problems with your code.
However, your general approach could be different too, using some of
Python's features.

> # -*- coding: utf-8 -*-
> import string, os, sys, time, glob

You should not need to import string any more.
It is a remnant of very old Python code. Its functionality
is (almost) all in the built-in string objects nowadays.
Especially if you are using Python v3 - which you don't
tell us...

> inf2 =open('short.txt','r')
>
> gpdic1={}
> while 1:
>          line= inf2.readline()
>          if not line: break

A better way of doing this in Python is to use a
for loop.

You could rewrite the lines above as one line:

for line in open('short.txt'):   # 'r' is the default mode

>          lines = line[:-1].split(',')

Rather than stripping off the last character, it's usually better to use the
rstrip() method - there might be more than one character to be removed...

lines = line.rstrip().split(',')

Also, if this is a comma-separated file, as seems to be implied by the
split(), you will find a csv module in the standard library that can
make this much easier. It can read the data directly from the file
as a series of dictionaries via a DictReader object.


>          hgene = lines[1]
>          chr1 = lines[4]
>          hgstart = lines[5]
>          hgstop = lines[6]
>          tgene = lines[7]
>          chr2 = lines[10]
>          tgstart = lines[11]
>          tgstop = lines[12]
>
>          gpdic1["hgene"] = hgene
>          gpdic1["chr1"] = chr1
>          gpdic1["hgstart"] = hgstart
>          gpdic1["hgstop"] = hgstop
>          gpdic1["tgene"] = tgene
>          gpdic1["chr2"] = chr2
>          gpdic1["tgstart"] = tgstart
>          gpdic1["tgstop"] = tgstop

You could have done all of that directly and avoided the double assignments.

ie
gpdic1["hgene"] = lines[1]

etc...
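
Putting those pieces together, a minimal sketch might look like the following. It assumes Python 3. The column positions and keys come from your code, but COLUMNS, read_records and the sample data are names and values I made up for illustration. Skipping short lines is also an assumption on my part: a blank or truncated line is one common cause of this IndexError, but I can't see your file.

```python
import io

# Key -> column position, as used in the original code.
COLUMNS = {
    "hgene": 1, "chr1": 4, "hgstart": 5, "hgstop": 6,
    "tgene": 7, "chr2": 10, "tgstart": 11, "tgstop": 12,
}

def read_records(infile):
    """Return a list with one dict per sufficiently long line of infile."""
    records = []
    needed = max(COLUMNS.values()) + 1   # 13 fields are required
    for line in infile:
        fields = line.rstrip().split(',')
        if len(fields) < needed:
            # Assumption: blank or short lines are skipped, because
            # indexing them would raise IndexError.
            continue
        records.append({key: fields[pos] for key, pos in COLUMNS.items()})
    return records

# Made-up data standing in for short.txt.
sample = io.StringIO(
    "f0,GENE1,f2,f3,chr1,100,200,GENE2,f8,f9,chr2,300,400\n"
    "\n"
    "too,short\n"
)
records = read_records(sample)
print(records)
```

With a real file you would pass an open file object instead of the StringIO, for example inside a "with open('short.txt') as infile:" block so that the file gets closed for you.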


The csv.DictReader would do it all for you and correctly
give you one dict per line instead of just the last one.

import csv
datafile = open(...)
data = csv.DictReader(datafile)

data now yields a series of dictionaries (loop over it, or
wrap it in list()), each of which is like your gpdic1 above.
If the file does not include the required keys in the
first line you can provide them as a list of strings
to DictReader - see the manual...
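
As a rough sketch of supplying the keys yourself, assuming Python 3 and a file with no header line: the keys below are the ones from your code, placed at the column positions you indexed, while the placeholder names (f0, f2, ...) and the sample row are invented for illustration.

```python
import csv
import io

# One name per column. Placeholder names fill the columns the
# original code never used (hypothetical; pick better ones if you
# know what those columns hold).
FIELDNAMES = ["f0", "hgene", "f2", "f3", "chr1", "hgstart", "hgstop",
              "tgene", "f8", "f9", "chr2", "tgstart", "tgstop"]

# Made-up row standing in for a line of short.txt.
sample = io.StringIO("x,GENE1,x,x,chr1,100,200,GENE2,x,x,chr2,300,400\n")

reader = csv.DictReader(sample, fieldnames=FIELDNAMES)
rows = list(reader)   # DictReader is an iterator; list() collects every row

for row in rows:
    print(row["hgene"], row["chr1"], row["hgstart"], row["hgstop"])
```

For a real file, the csv documentation recommends opening it with newline='', i.e. open('short.txt', newline=''). A row with fewer fields than names does not raise an error; the missing values are filled with None by default.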

Python has a wealth of modules in its library and part of
becoming a good Python programmer is in learning what's
there and how to use it. That takes time and experience,
but the answer is often just a quick Google search away.

Finally, your data looks a lot like bioscience. There are some
specific Python modules (and versions) designed for
that area. You may find a Google search for Python and
bioscience throws up something useful.

hth
-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.flickr.com/photos/alangauldphotos


