My son wants me to teach him Python

Chris Angelico rosuav at gmail.com
Fri Jun 14 03:21:52 EDT 2013


On Fri, Jun 14, 2013 at 4:13 PM, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:
> Here's another Pepsi Challenge for you:
>
> There is a certain directory on your system containing 50 text files, and
> 50 non-text files. You know the location of the directory. You want to
> locate all the text files in this directory containing the word
> "halibut", then replace the word "halibut" with "trout", but only if the
> file name begins with a vowel.

That sounds extremely contrived, to be honest. Try this one: Massage a
set of MySQL dump files (text, pure SQL) so they can be imported into
PostgreSQL. I'll leave out my Wednesday's encoding headaches (MySQL
produced so-called "UTF-8" output that contained "don\222t" - boggle!)
and restrict this challenge to one thing:

CREATE TABLE blah
(
    blah INT(11) blah blah
);

All through the CREATE TABLE statements, integer fields are followed
by (11), and smallint fields by something else - (9) I think? - and
you have no guarantee that they'll be exactly these numbers, but they
will immediately follow the word INT.

Okay. I can hear some of you screaming "Regular expression!!", and
others yelling "Search across files, any good editor can do that!!". I
happened to use sed for the job. Bear in mind, there are heaps of
other files in the directory, so do this only on *.sql.
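
For the Python-minded, the same job can be sketched roughly as below. This is not the sed command I used, just an illustration. The names INT_WIDTH, strip_int_widths and convert_directory are made up for this example, and it assumes the dump files are valid UTF-8 and that a display width is always a parenthesised number straight after a type name ending in INT.

```python
import re
from pathlib import Path

# Assumption: the display width is "(digits)" immediately after INT,
# optionally prefixed as in TINYINT, SMALLINT, MEDIUMINT or BIGINT.
INT_WIDTH = re.compile(r"\b((?:TINY|SMALL|MEDIUM|BIG)?INT)\(\d+\)", re.IGNORECASE)


def strip_int_widths(sql: str) -> str:
    """Drop the display width, keeping the type name as it was written."""
    return INT_WIDTH.sub(r"\1", sql)


def convert_directory(directory: str) -> int:
    """Rewrite every *.sql file in the directory in place.

    Other files are left alone. Returns the number of files changed.
    """
    changed = 0
    for path in sorted(Path(directory).glob("*.sql")):
        original = path.read_text(encoding="utf-8")  # assumes valid UTF-8
        converted = strip_int_widths(original)
        if converted != original:
            path.write_text(converted, encoding="utf-8")
            changed += 1
    return changed
```

Note that this touches every match in the file, not only those inside CREATE TABLE statements, so string data that happens to contain something like "int(5)" would be altered too. Work on a copy of the dumps.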

Any point-and-click solution to this is likely to end up cheating and
calling on some system that uses text strings (e.g. regexps). I'd like
to see any solution that proves me wrong, if only out of morbid
curiosity. I'm 100% confident that it won't be faster than me with
sed, or a Perl fanatic with a good regex.

ChrisA


