Suggestions for how to approach this problem?
Dave Hansen
iddw at hotmail.com
Tue May 8 16:34:28 EDT 2007
On May 8, 3:00 pm, John Salerno <johnj... at NOSPAMgmail.com> wrote:
> Marc 'BlackJack' Rintsch wrote:
> > I think I have a vague idea what the input looks like, but it would be
> > helpful if you showed some example input and the wanted output.
>
> Good idea. Here's what it looks like now:
>
> 1. Levy, S.B. (1964) Isologous interference with ultraviolet and X-ray
> irradiated
> bacteriophage T2. J. Bacteriol. 87:1330-1338.
> 2. Levy, S.B. and T. Watanabe (1966) Mepacrine and transfer of R
> factor. Lancet 2:1138.
> 3. Takano, I., S. Sato, S.B. Levy and T. Watanabe (1966) Episomic
> resistance factors in
> Enterobacteriaceae. 34. The specific effects of the inhibitors of DNA
> synthesis on the
> transfer of R factor and F factor. Med. Biol. (Tokyo) 73:79-83.
Questions:
1) Do the citation numbers always begin in column 1?
2) Are the citation numbers always followed by a period and then at
least one whitespace character?
If so, I'd probably use a regular expression like ^[0-9]+\.[ \t] to
find the beginning of each cite. Then I would output each cite
through a state machine that reduces each run of consecutive
whitespace characters (space, tab, newline) to a single space,
separating each cite with a newline.
Final formatting can be done with paragraph styles in Word.
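
A rough sketch of the regex-plus-whitespace-collapsing idea above, assuming
the answer to both questions is yes and that the input contains nothing but
citations. The function name normalize_citations is made up for
illustration, and str.split()/join() stands in for a hand-written state
machine. Note that a wrapped continuation line that happens to begin with
a number, a period, and a space would be mistaken for a new cite.

```python
import re

# Assumption: every cite starts in column 1 with digits, a period,
# and then a space or tab.
CITE_START = re.compile(r"^[0-9]+\.[ \t]")


def normalize_citations(lines):
    """Yield each cite as one line, with whitespace runs collapsed."""
    current = []
    for line in lines:
        # A new cite number closes the cite collected so far.
        if CITE_START.match(line) and current:
            yield " ".join(" ".join(current).split())
            current = []
        # Skip blank lines; keep everything else as part of the cite.
        if line.strip():
            current.append(line)
    # Flush the last cite, which has no following number to close it.
    if current:
        yield " ".join(" ".join(current).split())
```

Writing the results out with "\n".join(normalize_citations(open(path)))
would give one cite per line, ready for pasting into Word (path being
whatever file holds the raw list).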
HTH,
-=Dave