Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

Magnus Lycka lycka at carmen.se
Mon Aug 29 03:04:45 EDT 2005


Robert Kern wrote:
> If I may digress for a bit, my advisor is currently working on a project
> that is processing seafloor depth datasets starting from a few decades
> ago. A lot of this data was originally to be processed using FORTRAN
> software, so in the idiom of much FORTRAN software from those days, 9999
> is often used to mark missing data. Unfortunately, 9999 is a perfectly
> valid datum in most of the unit systems used by the various datasets.
> 
> Now he has to find a grad student to trawl through the datasets and
> clean up the really invalid 9999's (as well as other such fun tasks like
> deciding if a dataset that says it's using feet is actually using meters).

I'm afraid this didn't end with FORTRAN. It's not that long ago
that I wrote a program for my wife that combined a data editor
with a graph display, so that she could clean up time lines with
length and weight data for children (from an international research
project performed during the 90's). 99cm is not unreasonable as a
length, but if you see it in a graph with other length measurements,
it's easy to spot most of the false ones, just like a mistyped year
in a date (common at the beginning of a new year).

Perhaps graphics can help this grad student too? It's certainly much
easier to spot deviations in curves than in an endless line of
numbers if the curves would normally be reasonably smooth.
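
If plotting isn't handy, the same idea can be roughed out in code: compare
each value with the median of its neighbours and flag the ones that jump
too far. This is only a sketch under my own assumptions, not what my
program actually did; the function name, the window size and the
max_jump threshold are all made up for illustration and would need
tuning for real data.

```python
from statistics import median


def flag_suspect_points(values, window=2, max_jump=10.0):
    """Return indices of values that sit far from the median of their neighbours.

    window and max_jump are illustrative assumptions, not recommended settings.
    """
    values = list(values)
    suspects = []
    for i, value in enumerate(values):
        lo = max(0, i - window)
        hi = min(len(values), i + window + 1)
        # Neighbours on both sides, excluding the point itself.
        neighbours = values[lo:i] + values[i + 1:hi]
        if not neighbours:
            continue
        if abs(value - median(neighbours)) > max_jump:
            suspects.append(i)
    return suspects


# Hypothetical length series (cm) with two 99 placeholders mixed in.
lengths = [70.0, 71.5, 73.0, 99.0, 75.5, 77.0,
           78.0, 79.5, 99.0, 82.0, 83.0, 84.5]
suspect_indices = flag_suspect_points(lengths)
```

A 99 that fits the curve, as in [96.0, 97.5, 99.0, 100.0, 101.5], is
left alone, which is the point: the value is only suspicious in context.
Two bad values close together can drag the local median along with them,
so a human should still look at whatever gets flagged.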


