Suggestions for how to approach this problem?

James Stroud jstroud at mbi.ucla.edu
Wed May 9 18:44:31 EDT 2007


John Salerno wrote:
> John Salerno wrote:
> 
>> So I need to remove the line breaks too, but of course not *all* of 
>> them because each reference still needs a line break after it.
> 
> 
> After doing a bit of search and replace for tabs with my text editor, I
> think I've narrowed down the problem to just this:
> 
> I need to remove all newline characters that are not at the end of a
> citation (and replace them with a single space). That is, those that are
> not followed by the start of a new numbered citation. This seems to
> involve a look-ahead RE, but I'm not sure how to write those. This is
> what I came up with:
> 
> 
> \n(?=(\d)+)
> 
> (I can never remember if I need parentheses around '\d' or if the + 
> should be inside it or not!)

I included code in my previous post that parses the entire bib. It 
makes use of the numbering and eliminates the most probable, but still 
fairly rare, potential ambiguity. You might want to check out that code, 
as my testing showed that it works with your example.
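
On the look-ahead itself: (?=...) is a positive look-ahead, so the 
pattern quoted above matches the newlines that *are* followed by a 
digit, which is the opposite of what you describe. A negative 
look-ahead, (?!...), is probably closer. No parentheses are needed 
around \d; \d+ is enough. Below is a minimal sketch only, not the code 
from my previous post. The sample text, the name unwrap, and the 
assumption that a citation starts with digits, a period, and whitespace 
are all made up for illustration.

```python
import re

# Hypothetical sample: two numbered citations, each wrapped across lines.
text = (
    "1. Smith J, Jones K. A long title that\n"
    "wraps onto a second line. 2001.\n"
    "2. Doe A. Another title\n"
    "that also wraps. 2003.\n"
)

# Assumption: a new citation starts with digits, a period, and whitespace.
# (?!...) is a negative look-ahead: match a newline only when it is NOT
# followed by the start of a citation and is NOT the final newline (\Z).
joiner = re.compile(r"\n(?!\d+\.\s|\Z)")

def unwrap(bib):
    """Replace newlines inside a citation with a single space."""
    return joiner.sub(" ", bib)

print(unwrap(text))
```

This sketch does not check that the numbers run in sequence, so a 
wrapped line that happens to begin with a number and a period would be 
mistaken for the start of a new citation.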

James


