Replace high-bit characters in file.

Sun Mar 31 17:22:12 EST 2002

"Brian Quinlan" <brian at sweetapp.com> writes:

> > I need a script that replaces national characters in a (dBase)file.
> > 
> > The original file is encoded in Latin-1 but I need it to be in CP850
> > instead.
> 
> # This is not tested, of course
> file_contents = open('dbasefile', 'rb').read()
> file_contents = unicode(file_contents, 'latin-1').encode('cp850')
> # file_contents is now a string containing the contents of the 
> # file in cp850 format. You can write that string to a file, if you
> # want. 
> open('dbasefile.out', 'wb').write(file_contents)
> 
> Is that what you wanted?

Probably not. If a dBase file is a binary format, recoding the file
as a whole may well modify its non-text parts too. You really have
to understand the structure of the file and apply the transformation
only to the text fragments.
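
A minimal sketch of that idea, written for present-day Python 3 and
assuming the classic dBase III table layout: a 32-byte header holding
the record count and the header and record sizes, followed by 32-byte
field descriptors terminated by 0x0D, followed by fixed-width records.
The function name recode_dbf_text is made up for this example, memo
files are ignored, and other dBase variants may differ, so check the
layout against your own files first.

```python
import struct


def recode_dbf_text(data, src="latin-1", dst="cp850"):
    """Recode only the character ('C') fields of a dBase III-style table.

    Hypothetical helper; the offsets below assume the dBase III layout.
    """
    buf = bytearray(data)

    # Header (little-endian): record count at offset 4,
    # header size at offset 8, record size at offset 10.
    n_records, header_len, record_len = struct.unpack_from("<IHH", buf, 4)

    # Field descriptors: 32 bytes each, starting at offset 32 and
    # terminated by 0x0D. Byte 11 is the type, byte 16 the length.
    text_fields = []     # (offset within record, length) of 'C' fields
    pos, offset = 32, 1  # offset 1 skips the per-record deletion flag
    while pos < header_len and buf[pos] != 0x0D:
        field_type, field_len = chr(buf[pos + 11]), buf[pos + 16]
        if field_type == "C":
            text_fields.append((offset, field_len))
        offset += field_len
        pos += 32

    # Rewrite each character field in place; everything else is untouched.
    for i in range(n_records):
        start = header_len + i * record_len
        for off, length in text_fields:
            a = start + off
            # Both encodings use one byte per character, so the
            # replacement has exactly the same length as the original.
            buf[a:a + length] = buf[a:a + length].decode(src).encode(dst)
    return bytes(buf)
```

Usage would be along the lines of reading the file in 'rb' mode,
passing the bytes to recode_dbf_text, and writing the result in 'wb'
mode. A Latin-1 byte in the range 0x80-0x9F has no CP850 equivalent,
so encode() raises UnicodeEncodeError for it instead of silently
writing a wrong character.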

In general, this is more complicated, because the size of a string
may change under the conversion. In this specific case that can't
happen: Latin-1 and CP850 are both single-byte encodings.
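
The size point is easy to check. A small sketch in Python 3; the
sample word is only an illustration, and UTF-8 stands in here for any
encoding that needs more than one byte for some characters:

```python
latin1 = b"R\xe4ksm\xf6rg\xe5s"    # ten bytes of Latin-1 text
text = latin1.decode("latin-1")

cp850 = text.encode("cp850")  # still one byte per character
utf8 = text.encode("utf-8")   # two bytes for each non-ASCII letter

# Same length for CP850, a longer string for UTF-8.
print(len(latin1), len(cp850), len(utf8))
```

With a target encoding like the second one, a fixed-width field could
overflow, and the result would have to be truncated or the file
restructured.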

Regards,
Martin