unicode() vs. s.decode()

garabik-news-2005-05 at kassiopeia.juls.savba.sk
Fri Aug 7 13:41:38 EDT 2009


Thorsten Kampe <thorsten at thorstenkampe.de> wrote:
 
> lines". That *is* *exactly* nothing.
> 
> Another guy claims he gets times between 2.9 and 6.2 seconds when 
> running decode/unicode in various manifestations over "18 million 


Those times are over a sample of 600000 words (sorry for not explaining
myself clearly enough for everyone to understand), while my current
project is 18e6 words; that is, the overall running time will be
87 vs. 186 seconds, which is fairly noticeable.
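
For anyone wanting to repeat this kind of measurement, here is a minimal
sketch using timeit. It assumes Python 3, where the closest analogues of
unicode(s, 'utf-8') and s.decode('utf-8') are str(b, 'utf-8') and
b.decode('utf-8'). The word list is made up for illustration and is not
the real corpus, so the absolute numbers will differ from the ones quoted
in this thread.

```python
import timeit

# Hypothetical sample: 600000 short UTF-8 encoded words (not the real corpus).
words = ["slovo%d" % i for i in range(600000)]
encoded = [w.encode("utf-8") for w in words]


def with_method():
    # Decode each word with the bytes.decode() method.
    return [b.decode("utf-8") for b in encoded]


def with_constructor():
    # Decode each word with the str() constructor
    # (Python 3 counterpart of Python 2's unicode()).
    return [str(b, "utf-8") for b in encoded]


# Take the best of three single passes over the sample for each variant.
t_method = min(timeit.repeat(with_method, number=1, repeat=3))
t_constructor = min(timeit.repeat(with_constructor, number=1, repeat=3))
print("b.decode(): %.3f s, str(b, ...): %.3f s" % (t_method, t_constructor))
```

Both variants produce the same list of strings; only the running time is
compared.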

> words" (or is it 600 million?) and says "the differences are pretty 
> significant". 

600 million is the size of the whole corpus; that translates to
48 minutes vs. 1h43min. That is already a huge difference (going to
lunch at noon or waiting another hour until the run finishes, and
you can bet it is _very_ noticeable when I am hungry :-)).

With 9 different versions of the corpus (that is what we are really
using now), that goes to 7.2 hours (or even less with Python 3.1!) vs. 15
hours. Being able to re-run the whole corpus generation in one working
day (and then go on with the next issues) vs. working overtime or
delivering the corpus one day later is a huge difference. Like being
one day behind schedule.
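
The extrapolation above is plain linear scaling of the sample timings.
A minimal sketch, assuming running time grows linearly with the number
of words (the function name scaled is made up for illustration; the
input figures are the ones quoted in this thread):

```python
# Timings for the 600000-word sample, in seconds (figures from this thread).
SAMPLE_WORDS = 600000
FAST, SLOW = 2.9, 6.2


def scaled(seconds, words, runs=1):
    """Scale a sample timing linearly to `words` words, repeated `runs` times."""
    return seconds * words / SAMPLE_WORDS * runs


for label, words, runs in [
    ("current project", 18e6, 1),
    ("whole corpus", 600e6, 1),
    ("9 corpus versions", 600e6, 9),
]:
    fast = scaled(FAST, words, runs)
    slow = scaled(SLOW, words, runs)
    print("%-18s %7.0f s vs. %7.0f s  (%.2f h vs. %.2f h)"
          % (label, fast, slow, fast / 3600, slow / 3600))
```

This gives 87 vs. 186 seconds for 18e6 words, about 48 vs. 103 minutes
for the whole corpus, and 7.25 vs. 15.5 hours for nine versions.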

> I think I don't have to comment on that.

Indeed, the numbers are self-explanatory.

> 
> If you increase the number of loops to one million or one billion or 
> whatever even the slightest completely negligible difference will occur. 
> The same thing will happen if you just increase the corpus of words to a 
> million, trillion or whatever. The performance implications of that are 
> exactly none.
> 

I am not sure I understood that. Must be my English :-)

-- 
 -----------------------------------------------------------
| Radovan Garabík http://kassiopeia.juls.savba.sk/~garabik/ |
| __..--^^^--..__    garabik @ kassiopeia.juls.savba.sk     |
 -----------------------------------------------------------
Antivirus alert: file .signature infected by signature virus.
Hi! I'm a signature virus! Copy me into your signature file to help me spread!


