Reading by positions plain text files

javivd javiervandam at gmail.com
Tue Nov 30 21:03:50 EST 2010


On Nov 30, 11:43 pm, Tim Harig <user... at ilthio.net> wrote:
> On 2010-11-30, javivd <javiervan... at gmail.com> wrote:
>
> > I have a case now in which another file has been provided (besides the
> > database) that tells me in which column of the file every variable is,
> > because there isn't any blank or tab character that separates the
> > variables; they are stuck together. This second file specifies the
> > variable name and its position:
>
> > VARIABLE NAME      POSITION (COLUMN) IN FILE
> > var_name_1                 123-123
> > var_name_2                 124-125
> > var_name_3                 126-126
> > ..
> > ..
> > var_name_N                 512-513 (last positions)
>
> I am unclear on the format of these positions.  They do not look like
> what I would expect from absolute references in the data.  For instance,
> 123-123 may only contain one byte??? which could change for different
> encodings and how you mark line endings.  Frankly, the use of the
> word columns in the header suggests that the data *is* separated by
> line endings rather than absolute position and the position refers to
> the line number. In which case, you can use splitlines() to break up
> the data and then address the proper line by index.  Nevertheless,
> you can use file.seek() to move to an absolute offset in the file,
> if that really is what you are looking for.

I work in a survey research firm. The data I'm talking about has a lot
of 0-1 variables, meaning yes or no answers to a lot of questions, so
only one character position (not byte) is needed. That explains the
123-123 kind of positions for many of the variables.

And no, MRAB, it's not a similar problem (at least as far as I
understood it). I have to associate the positions this file gives me
with the variable names it gives me for those positions.
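
To make the task concrete, here is a rough sketch of what I mean, not a
finished solution. It assumes the positions are 1-based, inclusive
character columns within each record line, and that every line of the
spec file looks like "name START-END". The function names and the tiny
sample data are made up for illustration.

```python
import re

# Matches lines such as "var_name_2    124-125"; header rows, ".." filler
# and blank lines do not match and are skipped.
SPEC_LINE = re.compile(r"^\s*(\S+)\s+(\d+)-(\d+)")

def parse_spec(lines):
    """Turn spec lines into (name, start, end) slice bounds.

    Assumes 1-based inclusive columns in the spec; converts them to
    Python's 0-based, end-exclusive slice indices.
    """
    fields = []
    for line in lines:
        match = SPEC_LINE.match(line)
        if match is None:
            continue
        name = match.group(1)
        first, last = int(match.group(2)), int(match.group(3))
        fields.append((name, first - 1, last))
    return fields

def parse_record(line, fields):
    """Slice one data line into a {variable name: raw text} dict."""
    return {name: line[start:end] for name, start, end in fields}

# Made-up sample data, much shorter than the real 513-column records.
spec = [
    "VARIABLE NAME      POSITION (COLUMN) IN FILE",
    "q1                 1-1",
    "age                2-3",
    "q2                 4-4",
]
fields = parse_spec(spec)
record = parse_record("1350", fields)
print(record)  # {'q1': '1', 'age': '35', 'q2': '0'}
```

When the data file is read as text (decoded strings rather than raw
bytes), the slices count characters, which is what the positions are
meant to describe.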

Thank you both, and sorry for my English!

J


