Converting text file to different encoding.
Peter Otten
__peter__ at web.de
Fri Apr 17 11:06:33 EDT 2015
Chris Angelico wrote:
> On Sat, Apr 18, 2015 at 12:26 AM, <subhabrata.banerji at gmail.com> wrote:
>> I tried to do as follows,
>> >>> import codecs
>> >>> sourceEncoding = "iso-8859-1"
>> >>> targetEncoding = "utf-8"
>> >>> source = open("source1","w")
>> >>> string1="String type"
>> >>> str1=str(string1)
>> >>> source.write(str1)
>> >>> source.close()
>> >>> target = open("target", "w")
>> >>> source=open("source1","r")
>> >>> target.write(unicode(source.read(), sourceEncoding).encode(targetEncoding))
>> >>>
>>
>> am I going ok?
>
> Here's how I'd do it.
>
> $ python3
> >>> with open("source1", encoding="iso-8859-1") as source, open("target",
> ... "w", encoding="utf-8") as target:
> ...     target.write(source.read())
This approach is also viable in Python 2.6 and 2.7 if you use io.open()
instead of the builtin.
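
A minimal sketch of that portable variant. convert() is a hypothetical helper name, not something from the standard library, and the default encodings are simply the ones from the question. The with statements are nested because Python 2.6 does not accept several context managers in a single with:

```python
import io


def convert(source_path, target_path,
            source_encoding="iso-8859-1", target_encoding="utf-8"):
    """Re-encode a text file (convert() is a hypothetical helper name)."""
    # io.open() accepts the encoding argument on Python 2.6, 2.7 and 3.x.
    with io.open(source_path, encoding=source_encoding) as source:
        with io.open(target_path, "w", encoding=target_encoding) as target:
            # Decode the whole file to text, then write it back out
            # encoded in the target encoding.
            target.write(source.read())
```

With the file names used above, the call would be convert("source1", "target").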
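To limit memory consumption for big files you can replace

    target.write(source.read())

with

    shutil.copyfileobj(source, target)

(after an "import shutil"). copyfileobj() copies in fixed-size chunks, so the
whole file is never held in memory at once.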
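
A sketch of the chunked copy, written for Python 3. convert_streaming() is a hypothetical name chosen for illustration, and the default encodings are again the ones from the question:

```python
import shutil


def convert_streaming(source_path, target_path,
                      source_encoding="iso-8859-1", target_encoding="utf-8"):
    """Re-encode a text file chunk by chunk (hypothetical helper name)."""
    with open(source_path, encoding=source_encoding) as source, \
            open(target_path, "w", encoding=target_encoding) as target:
        # copyfileobj() loops over source.read(n) / target.write(...),
        # so decoding and encoding happen one chunk at a time.
        shutil.copyfileobj(source, target)
```

Both files are opened in text mode, so the decoding and encoding are still done by the file objects; copyfileobj() only moves the text between them.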
If you want to be sure that line endings are preserved, open both files with

    io.open(..., newline="")  # disable newline translation
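
To illustrate, here is a sketch that passes newline="" on both sides, so a file with "\r\n" line endings keeps them after conversion. convert_keep_newlines() is a hypothetical name and the default encodings are taken from the question:

```python
import io
import shutil


def convert_keep_newlines(source_path, target_path,
                          source_encoding="iso-8859-1",
                          target_encoding="utf-8"):
    """Re-encode a text file without touching its line endings
    (hypothetical helper name)."""
    # newline="" on the reader: "\r\n" and "\r" are passed through as-is.
    with io.open(source_path, encoding=source_encoding,
                 newline="") as source:
        # newline="" on the writer: "\n" is not replaced by os.linesep.
        with io.open(target_path, "w", encoding=target_encoding,
                     newline="") as target:
            shutil.copyfileobj(source, target)
```

Without newline="", reading would turn "\r\n" into "\n", and writing would turn "\n" into the platform's default line separator.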