Suggestions for how to approach this problem?

Dave Hansen iddw at hotmail.com
Tue May 8 16:34:28 EDT 2007


On May 8, 3:00 pm, John Salerno <johnj... at NOSPAMgmail.com> wrote:
> Marc 'BlackJack' Rintsch wrote:
> > I think I have vague idea how the input looks like, but it would be
> > helpful if you show some example input and wanted output.
>
> Good idea. Here's what it looks like now:
>
> 1.  Levy, S.B. (1964)  Isologous interference with ultraviolet and X-ray
> irradiated
> bacteriophage T2.  J. Bacteriol. 87:1330-1338.
> 2.  Levy, S.B. and T. Watanabe (1966)  Mepacrine and transfer of R
> factor.  Lancet 2:1138.
> 3.  Takano, I., S. Sato, S.B. Levy and T. Watanabe (1966)  Episomic
> resistance factors in
> Enterobacteriaceae.  34.  The specific effects of the inhibitors of DNA
> synthesis on the
> transfer of R factor and F factor.  Med. Biol. (Tokyo)  73:79-83.

Questions:

1) Do the citation numbers always begin in column 1?

2) Are the citation numbers always followed by a period and then at
least one whitespace character?

If so, I'd probably use a regular expression like ^[0-9]+\.[ \t] to
find the beginning of each cite.  Then I would run each cite
through a state machine that reduces every run of consecutive whitespace
characters (space, tab, newline) to a single space, and separate
the cites with newlines.
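
A minimal sketch of that idea in Python, assuming the answer to both
questions is yes.  The names CITE_START and normalize_citations are made
up for illustration, and str.split()/join() stand in for an explicit
character-by-character state machine.  Note that a wrapped continuation
line that happens to start with something like "34. " would be mistaken
for a new cite.

```python
import re

# A cite starts with digits in column 1, then a period, then a space or tab.
CITE_START = re.compile(r"^[0-9]+\.[ \t]")


def normalize_citations(text):
    """Return one line per cite, with whitespace runs collapsed to one space."""
    cites = []    # finished, normalized cites
    current = []  # raw lines of the cite being collected

    def flush():
        if current:
            # split() with no arguments splits on any run of whitespace,
            # so joining with " " collapses spaces, tabs and newlines.
            cites.append(" ".join(" ".join(current).split()))
            current.clear()

    for line in text.splitlines():
        if CITE_START.match(line):
            flush()               # a new cite begins; emit the previous one
            current.append(line)
        elif current:
            current.append(line)  # continuation of the current cite
        # lines before the first cite are ignored

    flush()
    return "\n".join(cites)
```

Something like print(normalize_citations(open("cites.txt").read()))
would then give one cite per line, ready for pasting into Word
(cites.txt is just a placeholder file name).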

Final formatting can be done with paragraph styles in Word.

HTH,
   -=Dave




