Are you sure every character in the original text is in the "gb2312" charset? Encoding with "utf8" seems to work for this character (u'\xa0'), but I don't know whether the result is what you want. Could you post a subset of str_data as unicode?
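One way to check is to list the characters that the target codec cannot represent. The following is only a rough sketch: `find_unencodable` is a helper name I made up, and `sample` is a hypothetical stand-in for your actual str_data, which I haven't seen.

```python
# -*- coding: utf-8 -*-

def find_unencodable(text, encoding="gb2312"):
    """Return the distinct characters in `text` that `encoding` cannot encode, sorted."""
    bad = set()
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            # This character has no representation in the target charset.
            bad.add(ch)
    return sorted(bad)


# Hypothetical sample: two Chinese characters, a no-break space, some ASCII.
sample = u"\u4e2d\u6587\xa0text"

# Print the offending characters in escaped form so invisible ones show up.
print([repr(ch) for ch in find_unencodable(sample)])
```

If this reports only u'\xa0' (a no-break space), replacing it with a plain space before encoding may be enough, e.g. `sample.replace(u'\xa0', u' ').encode('gb2312')`. Whether that is acceptable depends on what the text is used for.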