[Tutor] Is there a better way to get a current mid-rate Yen quote with Python?

Alan Gauld alan.gauld at btinternet.com
Sat Jul 26 01:11:50 CEST 2008


"Dick Moores" <rdm at rcblue.com> wrote 

>>Certainly Beautiful Soup will not be much longer and a lot more 
>>elegant and probably more resilient.
> 
> Alan, expand a bit, please. Longer? Resilient?

Longer as in lines of code. BS is good for extracting several 
different parts from the soup, but just to pull out one very 
specific item the setup and so on may mean that the 
BS version actually works out the same length or longer 
than your code.

Resilient as in able to handle unexpected changes in 
the HTML used by the site or slight changes in formatting 
of the results etc.

>>But to extract a single piece of text in a well defined location 
>>then your approach although somewhat crude will work just fine.
> 
> Crude? What's crude about my code? 

It's just a little bit too hard-coded for my liking; all that 
splitting and searching means there's a lot of work going on
to extract a small piece of data. You iterate over the whole 
page to do the first split, then you iterate over the whole 
thing again to find the line you want. Consider this line:

if 'JPY' in x and '>USDJPY=X<' in x:

Since JPY is in both strings the first check is effectively 
redundant but still requires a string search over the line.
A well-crafted regex would probably be faster than the 
double 'in' test and provide better checking by including
allowances for extra spaces or case changes etc.
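
As a rough sketch of what I mean, a single regex can replace both 'in' 
tests. The sample lines and the has_symbol() name are made up for 
illustration; I am assuming the symbol appears as the text of a tag, as 
the '>USDJPY=X<' test implies.

```python
import re

# One pattern replaces both 'in' tests. It tolerates whitespace around
# the symbol and differences in letter case.
SYMBOL_RE = re.compile(r">\s*USDJPY=X\s*<", re.IGNORECASE)

def has_symbol(line):
    """Return True if the line contains USDJPY=X as the text of a tag."""
    return SYMBOL_RE.search(line) is not None

# Hypothetical lines, invented for this example only.
print(has_symbol('<td><a href="#">USDJPY=X</a></td>'))    # True
print(has_symbol('<td><a href="#"> usdjpy=x </a></td>'))  # True
print(has_symbol('<td>JPY</td>'))                         # False
```

Compiling the pattern once at module level means the loop over the 
lines only pays for the search itself.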

Then having found the string once with 'in' you then have 
to find it again with split(). You could just have done 
a find() the first time and stored the index as a basis for 
the slicing later.
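
Something along these lines, where the line is searched once and the 
index is kept for the slice. The sample line and the function name are 
invented for illustration, not taken from your code.

```python
MARKER = ">USDJPY=X<"

def text_after_marker(line, marker=MARKER):
    """Return the text that follows marker, or None if it is absent."""
    index = line.find(marker)  # a single search over the line
    if index == -1:
        return None
    # Reuse the stored index instead of searching again with split().
    return line[index + len(marker):]

# Hypothetical line, invented for this example only.
sample = '<a href="#">USDJPY=X</a><b>106.85</b>'
print(text_after_marker(sample))
```

find() returns -1 when the marker is missing, so the same call does 
the membership test and gives the position for slicing.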

You also use lots of very specific slicing values to extract the 
data - that's where you lose resilience compared to a parser 
approach like BS. Again I suspect a regex might work better 
in extracting the value. And hard coding the url in the function 
also adds to its fragility.
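
A minimal sketch of both ideas, assuming the rate is the first decimal 
number that follows the symbol in the page. That assumption, the 
function names and the sample HTML are all mine and would need checking 
against the real page. The url is passed in rather than hard coded.

```python
import re
from urllib.request import urlopen

def extract_rate(html, symbol="USDJPY=X"):
    """Return the first decimal number after symbol, or None.

    Assumes the rate is the first number of the form 123.45 that
    follows the symbol; the real page layout may differ.
    """
    pattern = re.compile(
        r">\s*" + re.escape(symbol) + r"\s*<.*?(\d+\.\d+)",
        re.IGNORECASE | re.DOTALL,
    )
    match = pattern.search(html)
    return float(match.group(1)) if match else None

def fetch_rate(url, symbol="USDJPY=X"):
    """Download the page at url and extract the rate for symbol."""
    with urlopen(url) as response:
        html = response.read().decode("utf-8", errors="replace")
    return extract_rate(html, symbol)

# Hypothetical page fragment, invented for this example only.
sample = '<a href="#">USDJPY=X</a>\n<b> 106.85 </b>'
print(extract_rate(sample))
```

Keeping the extraction separate from the download also means the 
pattern can be tried against saved text without going to the network.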

Stylistically all those single character variable names hurt 
readability and maintainability too. 

> I want to improve, so please tell 

It will work as is, but it could be tidied up a bit is all.

Alan G.


