removing BOM prepended by codecs?

Piet van Oostrum piet at vanoostrum.org
Tue Sep 24 23:34:04 EDT 2013


"J. Bagg" <j.bagg at kent.ac.uk> writes:

> I've checked the original files using od and they don't have BOMs.
>
> I'll remove them in the servlet. The overhead is probably small enough
> unless somebody is doing a massive search. We have a limit anyway to
> prevent somebody stealing the entire set of data.
>
> I started writing the Python search because the ancient C search had
> started putting out BOMs. I'm actually mystified, because our home Linux
> box does not add BOMs even though it runs Python 2.7, but my work one
> does even though it has the same version. The only difference is
> Fedora 18 vs. Fedora 17.
>
> The BOMs are certainly there:
>
> <86> <AD><FB>%R 10C0203z-621
> %A François-Xavier Le_Bourdonnec
>
> 0000000 206     255 373   %   R       1   0   C   0   2   0   3   z   -
>
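That is not a BOM or a UTF-8 signature (SIG). A UTF-8 signature is the
byte sequence EF BB BF (octal 357 273 277), and the UTF-16 BOMs are
FE FF or FF FE. The bytes in your dump, octal 206 255 373 (hex 86 AD FB),
match none of these, and they are not even valid UTF-8: 0x86 is a
continuation byte and cannot begin a sequence.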
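
A quick way to check this kind of thing from Python is sketched below. It
is only an illustration: the names `prefix`, `KNOWN_BOMS`, `find_bom` and
`is_valid_utf8` are made up for the example, and `prefix` is my
transcription of the quoted od dump (octal 206, a space, 255, 373, then
the record text), so treat the exact bytes as an assumption. The `codecs`
BOM constants are from the standard library.

```python
import codecs

# Leading bytes as transcribed from the quoted od dump (an assumption):
# octal 206, space, 255, 373, followed by the start of the record.
prefix = b"\x86 \xad\xfb%R 10C0203z-621"

# Longer marks come first because the UTF-32 LE BOM (FF FE 00 00)
# begins with the UTF-16 LE BOM (FF FE).
KNOWN_BOMS = [
    ("UTF-32 LE", codecs.BOM_UTF32_LE),
    ("UTF-32 BE", codecs.BOM_UTF32_BE),
    ("UTF-8 signature", codecs.BOM_UTF8),
    ("UTF-16 LE", codecs.BOM_UTF16_LE),
    ("UTF-16 BE", codecs.BOM_UTF16_BE),
]


def find_bom(data):
    """Return the name of the BOM that data starts with, or None."""
    for name, bom in KNOWN_BOMS:
        if data.startswith(bom):
            return name
    return None


def is_valid_utf8(data):
    """Return True if data decodes cleanly as UTF-8, else False."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(find_bom(prefix))       # None: no known BOM at the start
print(is_valid_utf8(prefix))  # False: 0x86 cannot start a UTF-8 sequence
```

If `find_bom` returns None and the decode fails, the junk at the start of
the output is not something a codec prepended, so the place to look is
whatever produces those bytes before they reach the search.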
-- 
Piet van Oostrum <piet at vanoostrum.org>
WWW: http://pietvanoostrum.com/
PGP key: [8DAE142BE17999C4]


