[Tutor] Unknown encoded file types.

dn PyTutor at DancesWithMice.info
Sun Feb 7 03:32:56 EST 2021


On 07/02/2021 21.07, mhysnm1964 at gmail.com wrote:
> Hi all,
> 
> Windows 10, python 3.8 is what I am using.
> 
> I have hundreds of small plain-text files, each under 5 KB, which I am
> concatenating into one big text file. The issue is that I am getting
> encoding errors. I have tried opening them with the encoding parameter of
> the "with open" statement, but some of the files throw UTF encoding errors,
> so it looks like they are not in that format. The only reliable way I have
> managed to open the files is in binary mode.
> 
> with open(filename, 'rb') as fp:
>     content = fp.read()
> 
> I don't need to process the content, which is why I am not using fp.readline().
> 
> Is there any way to identify a file's encoding before opening it, so that I
> can convert it? I have seen some information on the net but don't understand it.


Which operating system is in use on the machine the data files came from?

Which locale is set?

Which encoding value(s) have been passed to open()?

Is it likely that the filenames and/or contents include European or
other characters outside the US-ASCII set?
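
To help answer the questions above, here is a minimal sketch (not a definitive method) that reports the locale's default encoding and tries a short list of candidate encodings against the raw bytes of a file. The names guess_encoding and CANDIDATES are hypothetical, and the candidate list is an assumption that should be adjusted to wherever the files came from. A successful decode only shows that an encoding is possible, not that it is the right one.

```python
import locale

# Assumed candidates: UTF-8 first (it is strict), then the common
# Windows Western European code page. Adjust for your data.
CANDIDATES = ("utf-8", "cp1252")


def guess_encoding(raw, candidates=CANDIDATES):
    """Return the first candidate that decodes raw without error, else None."""
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


def default_encoding():
    """Encoding that open() uses when no encoding argument is given."""
    return locale.getpreferredencoding(False)
```

Typical use would be to read each file in binary mode, as in the original post, pass the bytes to guess_encoding(), and note which files return None or something other than the expected encoding.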
-- 
Regards,
=dn

