Python 3.x stuffing utf-8 into SQLite db

Zachary Ware zachary.ware+pylist at gmail.com
Mon Feb 9 15:05:43 EST 2015


On Mon, Feb 9, 2015 at 11:32 AM, Skip Montanaro
<skip.montanaro at gmail.com> wrote:
> LibreOffice spit out a CSV file
> (with those three odd bytes). My script sucked in the CSV file and
> inserted data into my SQLite db.

If all else fails, you can try ftfy to fix things:
http://ftfy.readthedocs.org/en/latest/

   >>> import ftfy
   >>> ftfy.fix_text('Anderson Barracuda Masters - 2010 St. Patrick’s Day Swim Meet')
   "Anderson Barracuda Masters - 2010 St. Patrick's Day Swim Meet"

It also seems to agree that there was a bad (en|de)coding with cp1252
at some point; the steps it reports imply the text was UTF-8 bytes that
got decoded as cp1252.

   >>> ftfy.fixes.fix_encoding_and_explain('Anderson Barracuda Masters - 2010 St. Patrick’s Day Swim Meet')
   ('Anderson Barracuda Masters - 2010 St. Patrick’s Day Swim Meet', [('encode', 'sloppy-windows-1252'), ('decode', 'utf-8')])
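
For what it's worth, the same round trip can be reproduced with nothing
but the standard library. This is only a sketch: it assumes the sole
damage is UTF-8 bytes misread as cp1252, and the variable names are
made up for illustration. Note that the stdlib cp1252 codec is
stricter than ftfy's 'sloppy-windows-1252' (it raises on the few bytes
cp1252 leaves undefined), so ftfy is still the safer bet on real data.

```python
# Stdlib-only sketch; assumes the only problem is UTF-8 read as cp1252.
original = "St. Patrick\u2019s Day"  # U+2019 RIGHT SINGLE QUOTATION MARK

# Simulate the bug: the UTF-8 bytes E2 80 99 get decoded as cp1252,
# turning one curly apostrophe into three unrelated characters.
garbled = original.encode("utf-8").decode("cp1252")

# Undo it by running the two steps ftfy reported, in the same order:
# encode as cp1252, then decode as UTF-8.
repaired = garbled.encode("cp1252").decode("utf-8")

print(ascii(garbled))        # shows the three stray code points
print(repaired == original)  # True
```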

-- 
Zach


