removing BOM prepended by codecs?

J. Bagg j.bagg at kent.ac.uk
Tue Sep 24 13:32:53 EDT 2013


My editor is JEdit. I use it on a Win 7 machine but have everything set 
up for *nix files as that is the machine I'm normally working on.

The files are mailed to me as updates. The library where the indexers 
work does use MS computers, but this is restricted to EndNote with an 
exporter into the old Bib-Refer format which we use. I then run the 
files through a Python program that checks the Unicode for new 
characters, creates an ASCII transliteration of the main fields and 
checks for errors.
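
For anyone unfamiliar with that kind of check, here is a rough sketch of 
the idea. It is not the actual program: the function names to_ascii and 
non_ascii_chars are made up for illustration, and it assumes that 
dropping combining marks after NFKD normalisation is an acceptable 
transliteration, which will not hold for every script.

```python
import unicodedata


def to_ascii(text):
    """Return an ASCII approximation of text.

    NFKD splits accented letters into a base letter plus combining
    marks; encoding with errors="ignore" then drops anything that is
    still not ASCII (hypothetical helper, not the real program).
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def non_ascii_chars(text):
    """Return the set of characters in text outside the ASCII range."""
    return {ch for ch in text if ord(ch) > 127}
```

Characters with no ASCII decomposition (for example Cyrillic or CJK) are 
simply dropped by this approach, so a real transliteration table would 
be needed for those.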

The problem is occurring at the search stage. This stage creates a script 
with directives to search particular years and then puts the results 
into a file in /tmp. The process is left over from an old CGI version 
but is efficient and so has been kept. The search is done by a very 
old C program that a colleague wrote back in the 90s, with more recent 
updates. I'm in the process of porting this to Python as it is getting 
too difficult to maintain.
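
On the BOM itself, a minimal sketch of two ways to deal with it in the 
Python rewrite, assuming the files are UTF-8 and written for Python 3. 
The helper names strip_bom and read_text_without_bom are hypothetical. 
The "utf-8-sig" codec removes a leading BOM when decoding and, 
conversely, prepends one when encoding, so plain "utf-8" is the one to 
use for output that should not carry a BOM.

```python
import codecs


def strip_bom(data):
    """Remove a leading UTF-8 BOM from a bytes object, if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def read_text_without_bom(path):
    """Read a UTF-8 text file, discarding a BOM if the file has one."""
    # "utf-8-sig" decodes ordinary UTF-8 and skips a leading BOM.
    with open(path, encoding="utf-8-sig") as handle:
        return handle.read()
```

strip_bom works on raw bytes, which may be handy when the search results 
in /tmp are read in binary mode; read_text_without_bom is the simpler 
choice when the file is opened as text.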

J


