Reading by positions plain text files

Tim Harig usernet at ilthio.net
Sun Dec 12 14:09:56 EST 2010


On 2010-12-12, javivd <javiervandam at gmail.com> wrote:
> On Dec 1, 7:15 am, Tim Harig <user... at ilthio.net> wrote:
>> On 2010-12-01, javivd <javiervan... at gmail.com> wrote:
>> > On Nov 30, 11:43 pm, Tim Harig <user... at ilthio.net> wrote:
>> >> encodings and how you mark line endings.  Frankly, the use of the
>> >> word columns in the header suggests that the data *is* separated by
>> >> line endings rather than absolute position and the position refers to
>> >> the line number. In which case, you can use splitlines() to break up
>> >> the data and then address the proper line by index.  Nevertheless,

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Note that I specifically questioned the use of absolute file position vs.
position within a column.  These are two different things.  You use
different methods to extract each.

>> > I work in a survey research firm. The data I'm talking about has a lot
>> > of 0-1 variables, meaning yes or no to a lot of questions. So only one
>> > position of a character is needed (not byte), explaining the 123-123
>> > kind of positions of a lot of variables.
>>
>> Then file.seek() is what you are looking for; but, you need to be aware of
>> line endings and encodings as indicated.  Make sure that you open the file
>> using whatever encoding was used when it was generated or you could have
>> problems with multibyte characters affecting the offsets.
>
> f = open(r'c:c:\somefile.txt', 'w')

I suspect you don't need to use the c: twice.

> f.write('0123456789\n0123456789\n0123456789')

Note that the file you are writing contains three lines.  Is the data that
you are looking for located at an absolute position in the file or at a
position within an individual line?  If the latter, note that line endings
may be composed of more than a single character.
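
To make the line-ending point concrete, here is a rough sketch using the
same three lines of data.  The helper absolute_offset() is a name I made up
for illustration, and it assumes every line has the same length and contains
only single-byte characters:

```python
# The same three lines, once with Unix (\n) and once with DOS (\r\n) endings.
unix_data = b"0123456789\n0123456789\n0123456789"
dos_data = b"0123456789\r\n0123456789\r\n0123456789"


def absolute_offset(line_index, column, line_length, newline_length):
    """Zero-based byte offset of `column` on line `line_index`.

    Assumes every line holds exactly `line_length` single-byte characters
    followed by a line ending of `newline_length` bytes.
    """
    return line_index * (line_length + newline_length) + column


# Column 3 of the second line (index 1) sits at a different absolute offset
# depending on how the line endings were written.
unix_offset = absolute_offset(1, 3, 10, 1)
dos_offset = absolute_offset(1, 3, 10, 2)

unix_char = unix_data[unix_offset:unix_offset + 1]
dos_char = dos_data[dos_offset:dos_offset + 1]

# Using the Unix offset on the DOS data lands one character early.
wrong_char = dos_data[unix_offset:unix_offset + 1]
```

If the lines are not all the same length, this arithmetic does not apply and
splitting on line endings is the safer route.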

> f.write('0123456789\n0123456789\n0123456789')
              ^ position 3 using f.seek()

> for line in f:

Perhaps you meant:
	for character in f.read():
or
	for line in f.read().splitlines():

>     f.seek(3,0)

This will always take you back to the exact fourth position in the file
(indicated above).

> I used .seek() in this manner, but is not working.

It is working the way it is supposed to.

If you want the absolute position 3 in a file then:
	
	f = open('somefile.txt', 'r')
	f.seek(3)
	variable = f.read(1)
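
As a rough illustration of the encoding caveat, the sketch below compares a
byte offset with a character index on a file that starts with a multibyte
character.  It assumes Python 3 and UTF-8; the file name sample.txt and the
helpers read_byte_at() and read_char_at() are made up for the example:

```python
import os
import tempfile


def read_byte_at(path, offset):
    """Return the single byte at absolute, zero-based `offset`."""
    with open(path, 'rb') as f:
        f.seek(offset)
        return f.read(1)


def read_char_at(path, index, encoding):
    """Return the character at zero-based `index` after decoding the file.

    newline='' keeps line endings untranslated so indexes stay predictable.
    """
    with open(path, 'r', encoding=encoding, newline='') as f:
        return f.read()[index]


# Hypothetical sample file: the leading 'é' occupies two bytes in UTF-8.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with open(path, 'wb') as f:
    f.write('é123456789'.encode('utf-8'))

byte_3 = read_byte_at(path, 3)            # counts bytes
char_3 = read_char_at(path, 3, 'utf-8')   # counts characters
```

Here byte 3 and character 3 are not the same thing, which is why it matters
whether the positions you were given count bytes or characters.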

If you want position 3 within each line (that is, a column):
	with open('somefile.txt', 'r') as f:
		lines = f.read().splitlines()
	for line in lines:
		variable = line[3]
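
For the survey-style "123-123" positions you describe, something like the
following might be closer to what you need.  It is only a sketch: I am
assuming the positions are one-based and inclusive and that each record is
one line.  extract_field() and the sample records are invented for
illustration:

```python
def extract_field(line, start, end):
    """Return characters start..end of `line`.

    Both positions are assumed to be one-based and inclusive, so a range
    such as 123-123 yields exactly one character.
    """
    return line[start - 1:end]


# Hypothetical records: one respondent per line, one 0/1 answer per column.
records = "0110\n1001\n1111".splitlines()

# The answers stored at position 2-2 of every record.
answers = [extract_field(record, 2, 2) for record in records]

# A wider field, positions 3-4, works the same way.
pairs = [extract_field(record, 3, 4) for record in records]
```

If your positions turn out to be zero-based, drop the "- 1" and adjust the
end of the slice accordingly.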


