ascii txt to LaTeX
marco
marco.rossini at gmx.ch
Wed Apr 23 17:40:34 EDT 2003
i wrote a python program for converting a word-wrapped(!) ascii text into
LaTeX format. it actually doesn't do much: it just finds(!) and places
paragraph breaks (\n\n) and replaces quotation marks and umlauts. it does
routine stuff: why do boring things myself if my computer can do them for me?
feel free to use it (GPL). also i would appreciate critique on my code.
probably someone has made such a (or: a better) program before, but i don't
care, it was easy for me to do and it's all _I_ need.
marco
#!/usr/bin/python
# Converts a regular word-wrapped ascii text into formatted LaTeX
# detects paragraphs, replaces quotation marks, replaces umlauts
# Program written by Marco Rossini <marco.rossini at gmx.ch>
# Copyright: GPL
from sys import argv
from sys import exit
from math import ceil
from string import find
from string import join
from string import strip
from string import replace
from string import whitespace
from string import punctuation
if len(argv) != 2: exit("txt2latex: Argument error!")
try: f = file(argv[1],"r")
except: exit("txt2latex: File not found!")
# GET THE MAXIMAL NUMBER OF CHARACTERS PER LINE
array = f.readlines()
linelength = 0
for i in range(len(array)):
    array[i] = strip(array[i])
    if len(array[i]) > linelength: linelength = len(array[i])
# GUESS IF IT'S A PARAGRAPH BREAK
for i in range(len(array)-1):
    nleft = linelength - len(array[i])
    nright = find(array[i+1]," ")
    if nright == -1: nright = len(array[i+1])
    # if it is, append \n\n to the line, else a space
    if nright+1 <= nleft:
        if len(array[i]) > 0: array[i] += '\n\n'
    else:
        array[i] += ' '
# the lines are joined, the text is NOT word wrapped anymore
text = join(array,"")
# Replace quotation marks intelligently
i = 0
while i < len(text):
    # ... before a word (the start of the text counts as whitespace)
    if text[i] == '\"' and (i == 0 or find(whitespace,text[i-1]) >= 0):
        text = text[:i] + "``" + text[i+1:]
        i += 1
    # ... before a word (single)
    if text[i] == '\'' and (i == 0 or find(whitespace,text[i-1]) >= 0):
        text = text[:i] + "`" + text[i+1:]
    # ... after a word, followed by whitespace or the end of the text
    if text[i] == '\"' and (i+1 == len(text) or find(whitespace,text[i+1]) >= 0):
        text = text[:i] + "''" + text[i+1:]
        i += 1
    # ... after a word, followed by punctuation
    if text[i] == '\"' and find(punctuation,text[i+1]) >= 0:
        text = text[:i] + "''" + text[i+1:]
        i += 1
    i += 1
# Here replacements for umlauts. modify if you like to.
text = replace(text,"ä","\\\"{a}")
text = replace(text,"Ä","\\\"{A}")
text = replace(text,"ö","\\\"{o}")
text = replace(text,"Ö","\\\"{O}")
text = replace(text,"ü","\\\"{u}")
text = replace(text,"Ü","\\\"{U}")
# handle output
print text
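
The paragraph-guessing step in the script above is the least obvious part, so here is a minimal sketch of the same heuristic as a standalone function. It is written for Python 3 rather than the Python 2 of the original, and the name join_wrapped is hypothetical, not part of the posted program. The idea: if the first word of the next line would still have fit on the current line, the line break was probably not forced by word wrapping, so it is treated as a paragraph break.

```python
def join_wrapped(lines):
    """Join word-wrapped lines into one string, guessing paragraph breaks.

    Hypothetical helper, assuming the same heuristic as the script above:
    a line break counts as a paragraph break when the first word of the
    following line (plus one space) would have fit in the room left over.
    """
    lines = [line.strip() for line in lines]
    # The longest line is taken as the wrap width of the text.
    width = max((len(line) for line in lines), default=0)
    parts = []
    for current, following in zip(lines, lines[1:]):
        room_left = width - len(current)
        first_word = following.split(" ", 1)[0]
        if len(first_word) + 1 <= room_left:
            # Paragraph break; blank lines themselves contribute nothing.
            if current:
                parts.append(current + "\n\n")
        else:
            # Ordinary wrapped line: continue the paragraph with a space.
            parts.append(current + " ")
    if lines:
        parts.append(lines[-1])
    return "".join(parts)


sample = [
    "The quick brown fox jumps",
    "over the lazy dog.",
    "A second paragraph starts",
    "here.",
]
print(join_wrapped(sample))
```

Because the wrap width is guessed from the longest line, a paragraph whose last line happens to fill the whole width will not be detected as ending there; that limitation is inherited from the heuristic itself.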