[Chicago] Chardet help
Clyde Forrester
clydeforrester at gmail.com
Sun Mar 10 18:38:23 CET 2013
According to Firefox, the encoding is windows-1252.
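
One way to check that guess is to decode the same bytes both ways and compare. The snippet below is only a rough sketch, not something I ran against the actual file: the sample bytes and the helper names (decode_bytes, read_text) are made up for illustration. It assumes Python 3 and that the file really is windows-1252, as Firefox reports.

```python
def decode_bytes(raw, encoding="cp1252"):
    """Decode raw bytes with an explicitly chosen encoding."""
    return raw.decode(encoding)


def read_text(path, encoding="cp1252"):
    """Read a whole text file, decoding it with the given encoding."""
    with open(path, "r", encoding=encoding) as f:
        return f.read()


# Hypothetical sample: the Italian word "citta" with a grave accent on the
# final "a", stored as windows-1252 bytes (0xE0 is a-grave in that encoding).
sample = b"citt\xe0"

as_cp1252 = decode_bytes(sample, "cp1252")      # reads as Italian
as_latin2 = decode_bytes(sample, "iso-8859-2")  # same byte, different letter

# Both decodes succeed, which is why a detector can plausibly pick either.
same_length = len(as_cp1252) == len(as_latin2)
differs = as_cp1252 != as_latin2
```

Both encodings are single-byte, so neither decode raises an error; only the accented letters come out differently. If the words look right with cp1252, then read_text("uniq_words_in_corpus.txt") should be enough, with no detection step at all.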
On 3/10/2013 10:00 AM, Tathagata Dasgupta wrote:
> Good morning Chipy,
> Some encoding foo to spoil the Sunday morning motivational beverage!
>
> I am trying to read a file
> (https://dl.dropbox.com/u/18146922/uniq_words_in_corpus.txt) written
> in Italian - and after a bit of trial and error decided to go with
> chardet.
>
>
> def getEncoding(infile):
>     import chardet
>     # Read raw bytes; chardet.detect() works on undecoded data.
>     rawdata = open(infile, "rb").read()
>     result = chardet.detect(rawdata)
>     charenc = result['encoding']
>     print(charenc)
>
> That gives me ISO-8859-2.
>