Movie (MPAA) ratings and Python?
Ned Batchelder
ned at nedbatchelder.com
Wed Dec 11 13:35:17 EST 2013
On 12/10/13 6:50 PM, Dan Stromberg wrote:
>
> On Tue, Dec 10, 2013 at 1:07 PM, Petite Abeille
> <petite.abeille at gmail.com> wrote:
>
>
> On Dec 10, 2013, at 6:25 AM, Dan Stromberg <drsalists at gmail.com>
> wrote:
>
> > The IMDB flat text file probably came the closest, but it appears
> > to have encoding issues; it's apparently nearly windows-1255, but
> > not quite.
>
> It's ISO-8859-1.
>
> Thanks - that reads well from CPython 3.3.
>
> Now the question becomes: Why did chardet tell me it was windows-1255? :)
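
It probably told you it was Windows-1252 (I'm assuming the last 5 is a
typo).

Windows-1252 is effectively a superset of ISO-8859-1. The two differ
only in bytes 0x80-0x9F, where ISO-8859-1 has rarely used control codes
and Windows-1252 has printable characters such as curly quotes and the
euro sign. So any ordinary text that is correct ISO-8859-1 is also
correct Windows-1252. In addition, it's not uncommon to find text marked
as ISO-8859-1 that in fact has characters that make it Windows-1252.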
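
Here is a small sketch of the overlap, using only Python's built-in
codecs. The sample strings are made up for illustration and have nothing
to do with the IMDB file itself:

```python
# Sample text is hypothetical; only the standard codecs are used.

# Bytes in the 0xA0-0xFF range mean the same thing in both encodings.
shared = "café".encode("iso-8859-1")
shared_latin1 = shared.decode("iso-8859-1")
shared_cp1252 = shared.decode("cp1252")
print(shared_latin1 == shared_cp1252)

# Bytes 0x93 and 0x94 are curly quotes in Windows-1252, but invisible
# C1 control characters in ISO-8859-1.
curly = b"\x93quoted\x94"
as_cp1252 = curly.decode("cp1252")
as_latin1 = curly.decode("iso-8859-1")
print(repr(as_cp1252))
print(repr(as_latin1))
```

Decoding as ISO-8859-1 never raises an error, because every byte value
maps to some character. That is why a file that is really Windows-1252
can appear to read fine as ISO-8859-1 until you hit the quotes.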
--
Ned Batchelder, http://nedbatchelder.com