UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 10442: character maps to <undefined>

pjmclenon at gmail.com pjmclenon at gmail.com
Sat Oct 20 08:24:37 EDT 2018


On Saturday, October 13, 2018 at 7:24:14 PM UTC-4, MRAB wrote:
> On 2018-10-14 00:13, pjmclenon at gmail.com wrote:
> > On Wednesday, June 13, 2018 at 7:14:06 AM UTC-4, INADA Naoki wrote:
> >> > 1st is this script is from a library module online open source
> >> 
> >> If it's open source, why didn't you show the link to the source?
> >> I assume your code is this:
> >> 
> >> https://github.com/siddharth2010/String-Search/blob/6770c7a1e811a5d812e7f9f7c5c83a12e5b28877/createIndex.py
> >> 
> >> And self.collFile is opened here:
> >> 
> >> https://github.com/siddharth2010/String-Search/blob/6770c7a1e811a5d812e7f9f7c5c83a12e5b28877/createIndex.py#L91
> >> 
> >> You need to add `encoding='utf-8'` argument.
> > 
> > 
> > 
> > Hello, I used this recommendation in one of my Python projects and it worked fine. I'm writing today because I have the same Unicode error at a slightly different line of code. I added the utf-8 encoding but I still get the same error.
> > 
> > Here is my line of code:
> > 
> > with open(join("docs", path)) as f:
> > 
> > Where can I add the encoding="utf8" argument?
> > Does anyone on this forum happen to know?
> > 
> > OK, thank you. Jessica
> > 
> with open(join("docs", path), encoding="utf-8") as f:

Hello MRAB and Google forum,

I now have what seems to be a decode error, very close to the line in my script
where you solved my 2 previous encoding errors in Python 3.

The error now is:
**************
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 83064: invalid start byte
*****************
and it seems to refer to this line of my code:
***********
data = f.read()
***************
which is part of this block of code:
********************
    # Read content of files
    for path in files:
        with open(join("docs", path), encoding="utf-8") as f:
        #with open(join("docs", path)) as f:
            data = f.read()
            process_data(data)
***********************************************

Would the fix be one of these?
**********************
data = f.read(), decoding = "utf-8"  #OR
data = f.read(), decoding = "ascii" # is this the right fix, or the previous one, or are both wrong?

Thanks for any solutions.
I'm on Python 3.

jessica
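
A note on the two lines proposed above: neither is valid Python as written, because `read()` on a file opened in text mode takes no `decoding` argument. The decoding is chosen by the `encoding` argument to `open()`, and "ascii" would reject more bytes than "utf-8" does. The error suggests that at least one file in docs is not UTF-8. Below is a minimal sketch, assuming the non-UTF-8 files are Windows-1252 (where byte 0xb0 is the degree sign). That encoding is a guess, and `read_text` is a hypothetical helper name, not part of the original script.

```python
def read_text(path, encodings=("utf-8", "cp1252")):
    """Return the decoded text of a file, trying each encoding in order.

    The encoding list is an assumption; adjust it to match the real files.
    """
    # Read raw bytes so that decoding can be retried without reopening.
    with open(path, "rb") as f:
        raw = f.read()

    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            # This encoding does not fit the file; try the next one.
            continue

    # Last resort: decode as UTF-8 and replace each undecodable byte
    # with U+FFFD so that processing can continue.
    return raw.decode("utf-8", errors="replace")
```

In the loop from the message, the `with open(...)` block could then become `process_data(read_text(join("docs", path)))`. If losing a few characters is acceptable, passing `errors="replace"` to `open()` alongside `encoding="utf-8"` is a shorter alternative.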





