Need help reading damaged file
Anton Vredegoor
anton at vredegoor.doge.nl
Mon Oct 14 12:29:10 EDT 2002
On Mon, 14 Oct 2002 11:21:49 -0400, PoulsenL at capanalysis.com wrote:
>I have about 100+ files that are a dump of old tape from a database. Most
>of the data is good, but it is interspersed with damage that contains
>backspace characters and I _believe_ EOF characters. When we try to import
>the data it only imports 1/3 or 1/10, etc of the data depending on the file.
>I can pull it up in WinEdt and see that it contains far more lines. I
>created a script that reads through the file and counts the lines (not the
>most efficient script in the world, I'm sure). The problem is that the
>script suffers from the same problem as the import utility. It stops far
>short of the end of the file. Any help would be appreciated.
>
>Here is the script:
>
>import glob, os
>
>def countlines(a,b,c):
>    for file in c:
>        if os.path.isfile(b + '\\' + file):
>            input = open(b + '\\' + file)
>            print b + '\\' + file
>            x = 0
>            for y in input:
>                x += 1
>            print x
>
>os.path.walk('\\\\Server\\Dir\\',countlines, None)
From this it seems you are counting bytes instead of lines. I am not
exactly sure what the situation is, but I would try:
> input = open(b + '\\' + file,'rb')
And check if this makes any difference.
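One likely explanation, offered as an assumption rather than a diagnosis: on Windows, a file opened in text mode through the C runtime treats a Ctrl-Z byte (0x1A) as end of file, so reading stops at the first one in the damaged data. Binary mode reads every byte. Below is a minimal sketch of a counter that reads in binary mode and also reports how many Ctrl-Z and backspace (0x08) bytes it saw. The name count_lines_binary and the chunk_size parameter are hypothetical and not part of the original script.

```python
def count_lines_binary(path, chunk_size=1 << 16):
    """Count lines in `path` by reading raw bytes, so that no byte
    value (such as Ctrl-Z, 0x1A) can end the read early.

    Returns a tuple (lines, ctrl_z_bytes, backspace_bytes).
    Hypothetical helper for illustration only.
    """
    lines = 0
    ctrl_z = 0
    backspaces = 0
    last = b""
    f = open(path, "rb")  # binary mode: no newline or EOF translation
    try:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            lines += chunk.count(b"\n")
            ctrl_z += chunk.count(b"\x1a")
            backspaces += chunk.count(b"\x08")
            last = chunk[-1:]
    finally:
        f.close()
    # A final line with no trailing newline still counts as a line.
    if last and last != b"\n":
        lines += 1
    return lines, ctrl_z, backspaces
```

If the Ctrl-Z count comes back nonzero for a file that imports only partially, that would support the text-mode explanation. Reading in fixed-size chunks is a design choice here so that a damaged file with very long "lines" does not have to fit in memory at once.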
Anton.