Using utidylib, empty string returned in some cases

Gabriel Genellina gagsl-py2 at yahoo.com.ar
Tue Jan 22 20:14:14 EST 2008


On Tue, 22 Jan 2008 15:35:16 -0200, Boris <savinovboris at gmail.com>
wrote:

> I'm using Debian Linux, Python 2.4.4, and utidylib
> (http://utidylib.berlios.de/). I wrote simple functions to get a web
> page, convert it from windows-1251 to UTF-8, and then I'd like to clean
> the HTML with it.

Why the intermediate conversion? I don't know utidylib, but can't you feed
it the original page, in its original encoding? If the page itself
contains a "meta http-equiv" tag declaring its content type and charset,
that declaration will no longer match the actual bytes once you re-encode
the page.
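
To illustrate the mismatch, here is a minimal sketch. It does not use
utidylib at all, and it is written for Python 3 rather than the 2.4.4
mentioned above. The sample page and the declared_charset helper are made
up for the example; a real page may declare its charset differently.

```python
import re

# Hypothetical page as it might arrive from the server, in windows-1251.
# The body text is the Russian word for "hello", written with escapes.
original = (
    '<html><head>'
    '<meta http-equiv="Content-Type" '
    'content="text/html; charset=windows-1251">'
    '</head><body>\u041f\u0440\u0438\u0432\u0435\u0442</body></html>'
).encode('windows-1251')


def declared_charset(page_bytes):
    """Return the charset named in the page's meta tag, or None.

    Illustrative helper only: it looks for an unquoted charset=NAME.
    """
    match = re.search(rb'charset=([\w-]+)', page_bytes)
    return match.group(1).decode('ascii') if match else None


# Re-encode the bytes to UTF-8 without touching the markup.
reencoded = original.decode('windows-1251').encode('utf-8')

# The meta tag still claims windows-1251 ...
stale = declared_charset(reencoded)

# ... so anything that trusts the declaration misreads the text.
misread = reencoded.decode(stale, errors='replace')
correct = original.decode(declared_charset(original))

print(stale)               # the declaration was not updated
print(misread == correct)  # the two decoded texts differ
```

If the conversion really is needed, the meta tag would have to be
rewritten (or removed) at the same time, so that the declaration and the
bytes agree again.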

-- 
Gabriel Genellina



