Python 2.x or 3.x, which is faster?

Steven D'Aprano steve at pearwood.info
Wed Mar 9 10:33:30 EST 2016


On Thu, 10 Mar 2016 01:54 am, Chris Angelico wrote:

> I have a source of occasional text files that basically just dumps
> stuff on me without any metadata, and I have to figure out (a) what
> the encoding is, and (b) what language the text is in.

https://pypi.python.org/pypi/chardet

> then I have two levels of heuristics to try to guess a
> most-likely encoding

I'm curious, what do you do?



(I stress that trying to guess the character set or encoding from the text
itself is a second-last-ditch tactic, for when you really don't know and
can't find out what the encoding is. The final, last-ditch tactic is to
just say "bugger it, I'll pretend it's Latin-1" and get a mess of
mojibake, but at least any ASCII characters will decode all right, and as an
English speaker, that's all that's important to me :-)
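
For what it's worth, here is a rough sketch of that fallback order, using
only the standard library. The function name decode_best_effort and its
candidates parameter are made up for illustration; a detector such as
chardet could supply the candidate list, but that part is left out here.
Treat it as a sketch under those assumptions, not a recommended recipe.

```python
def decode_best_effort(data, candidates=("utf-8",)):
    """Return (text, encoding_used) for bytes of unknown encoding.

    Hypothetical helper: tries each candidate strictly, then falls
    back to Latin-1 as the last-ditch tactic.
    """
    for name in candidates:
        try:
            # Strict decoding raises if the bytes are invalid in this encoding.
            return data.decode(name), name
        except UnicodeDecodeError:
            continue
    # Latin-1 maps every byte 0x00-0xFF to a code point, so this never
    # fails. ASCII bytes come out right; anything else may be mojibake.
    return data.decode("latin-1"), "latin-1"


if __name__ == "__main__":
    utf8_bytes = "caf\u00e9".encode("utf-8")
    latin1_bytes = "caf\u00e9".encode("latin-1")
    print(decode_best_effort(utf8_bytes))    # valid UTF-8, decoded as such
    print(decode_best_effort(latin1_bytes))  # invalid UTF-8, falls back
    # Pretending UTF-8 data is Latin-1 keeps the ASCII part and mangles the rest.
    print(repr(utf8_bytes.decode("latin-1")))
```

Note that the Latin-1 step cannot tell you whether the guess was right; it
only guarantees that decoding won't raise.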



-- 
Steven



