searching through a string and pulling characters

John Machin sjmachin at lexicon.net
Mon Aug 18 17:59:27 EDT 2008


On Aug 19, 6:40 am, Alexnb <alexnbr... at gmail.com> wrote:
> This is similar to my last post,

Oh, goodie goodie goodie, I love guessing games!

> but a little different. Here is what I would
> like to do.
>
> Lets say I have a text file. The contents look like this, only there is A
> LOT of the same thing.
>
> () A registry mark given by underwriters (as at Lloyd's) to ships in
> first-class condition. Inferior grades are indicated by A 2 and A 3.
> () The first three letters of the alphabet, used for the whole alphabet.
> () In church or chapel style; -- said of compositions sung in the old church
> style, without instrumental accompaniment; as, a mass a capella, i. e., a
> mass purely vocal.
> () Astride; with a part on each side; -- used specif. in designating the
> position of an army with the wings separated by some line of demarcation, as
> a river or road.

This looks like the "values" part of an abbreviation/acronym
dictionary ... what has happened to the "keys" part (A1, ABC, AC, ?
astride?, ...)?

Does "()" appear always at the start of a line (perhaps preceded by
some whitespace), or can it appear in the middle of a line?

Are you sure about "A 2" and "A 3"? I would have expected "A2" and
"A3". In other words, is the above an exact copy of some input or have
you re-typed it?

"()" is a strange way of delimiting things ...

OK, here's my guess: You have acquired a database with two tables.
Table K maps e.g. "ABC" to 2. Table V maps 2 to "The first three
letters of the alphabet, used for the whole alphabet." You have used
some utility or done "select '() ' + column2 from V".

>
> Now, I am talking 1000's of these. I need to do something like this. I will
> have a number, and what I want to do is go through this text file, just like
> the example. The trick is this, those "()'s" are what I need to match, so if
> the number is 245 I need to find the 245th () and then get all the text
> from after it until the next (). If you have an idea about the best way to
> do this I would love your help.

The best way to do this is to write a small simple Python script. I
suggest that you try this, and if you have difficulties, post your
attempt here together with a lucid description of the perceived
problem.
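
For reference, a minimal sketch of such a script might look like the
following. It assumes that the whole file fits in memory, that "()"
never occurs inside an entry's own text, and that line breaks within
an entry can be collapsed to single spaces. The names nth_entry and
"defs.txt" are made up for illustration.

```python
def nth_entry(text, n, marker="()"):
    """Return the text of the n-th (1-based) entry introduced by marker."""
    parts = text.split(marker)
    # parts[0] is whatever precedes the first marker, so entry n is parts[n].
    if not 1 <= n < len(parts):
        raise IndexError("no entry number %d" % n)
    # Collapse line breaks and runs of whitespace into single spaces.
    return " ".join(parts[n].split())


if __name__ == "__main__":
    # "defs.txt" is a hypothetical file name.
    with open("defs.txt") as f:
        print(nth_entry(f.read(), 245))
```

Every call splits the whole text again, which is fine for a one-off
lookup but wasteful if you need many of them.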

However, searching through a large file (how many MB?) looking for the
nth occurrence of "()" doesn't sound like a good idea after about the
10th time you do it. Perhaps it might be worth the extra effort to
process the text file once and insert the results in a (say) SQLite
database so that later you can do "select column2 from V where
column1 = 245".
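
A rough sketch of that one-off load, using the sqlite3 module from the
standard library, could look like this. The table and column names
follow the guess above (V, column1, column2); the function names and
the assumption that "()" never appears inside an entry are mine, so
treat it as a starting point rather than a finished tool.

```python
import sqlite3


def build_db(text, db_path=":memory:", marker="()"):
    """Load each marker-delimited entry into table V, numbered from 1."""
    conn = sqlite3.connect(db_path)
    # Start from a clean table so that re-running the load is harmless.
    conn.execute("DROP TABLE IF EXISTS V")
    conn.execute(
        "CREATE TABLE V (column1 INTEGER PRIMARY KEY, column2 TEXT)"
    )
    # Skip whatever precedes the first marker.
    entries = text.split(marker)[1:]
    rows = [(i, " ".join(e.split())) for i, e in enumerate(entries, start=1)]
    conn.executemany("INSERT INTO V (column1, column2) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def lookup(conn, n):
    """Return the text of entry n, or None if there is no such entry."""
    row = conn.execute(
        "SELECT column2 FROM V WHERE column1 = ?", (n,)
    ).fetchone()
    return row[0] if row else None
```

Pass a real file name as db_path to keep the table between runs; each
later lookup is then a single indexed query instead of a scan of the
text file.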

A really silly question: You say "I will have a number" (e.g. 245);
what is the source or provenance of this ordinal? A random number
generator? Inscription on a ticket passed through a wicket? "select
column2 from K where column1 = 'A1'"? IOW, perhaps you may need to
consider the larger problem.

Cheers,
John
