Movie (MPAA) ratings and Python?

Dan Stromberg drsalists at gmail.com
Wed Dec 11 18:07:35 EST 2013


On Wed, Dec 11, 2013 at 10:35 AM, Ned Batchelder <ned at nedbatchelder.com> wrote:

> On 12/10/13 6:50 PM, Dan Stromberg wrote:
> > Now the question becomes: Why did chardet tell me it was windows-1255?  :)
>
> It probably told you it was Windows-1252 (I'm assuming the last 5 is a
> typo).
>
> Windows-1252 is a super-set of ISO-8859-1, so any text that is correct
> ISO-8859-1 is also correct Windows-1252.  In addition, it's not uncommon to
> find text marked as ISO-8859-1 that in fact has characters that make it
> Windows-1252.
>

$ chardet mpaa-ratings-reasons.list
mpaa-ratings-reasons.list: windows-1255 (confidence: 0.97)

I'm aware that chardet is playing guessing games, though one would hope it
would guess well most of the time, and give a reasonable confidence rating.
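
To illustrate why a detector has to guess at all, here is a minimal sketch
using only Python's standard codecs (not chardet itself). The sample bytes and
the helper name decode_candidates are made up for illustration; they are not
taken from mpaa-ratings-reasons.list. The same byte string decodes without
error under all three encodings, so validity alone cannot separate them, and a
detector has to fall back on statistics about which characters are likely.

```python
# Hypothetical sample bytes; not taken from mpaa-ratings-reasons.list.
SAMPLE = b"Am\xe9lie"


def decode_candidates(data, encodings):
    """Return {encoding: decoded text, or None if the bytes are invalid}."""
    results = {}
    for name in encodings:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


results = decode_candidates(SAMPLE, ["latin-1", "cp1252", "cp1255"])
for name, text in results.items():
    # ascii() makes the non-ASCII code points visible as escapes.
    print(name, ascii(text))
```

Byte 0xE9 is an accented Latin "e" in ISO-8859-1 and Windows-1252, but a
Hebrew letter in Windows-1255. The first two encodings themselves only differ
in the 0x80-0x9F range, where ISO-8859-1 has control characters and
Windows-1252 has printable ones such as curly quotes.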