cgi - cleaning tabs and returns out of textarea

Bengt Richter bokr at oz.net
Mon Jan 28 13:16:45 EST 2002


On Mon, 28 Jan 2002 09:17:41 -0500, Glenn Stauffer <java at dejazzd.com> wrote:

>
>I have a cgi utility which I wrote to process form data and save key-value
>pairs in a database.  Since it is a generic utility, I need to handle any
>type of data and store it in a form that can be converted into a
>tab-delimited download.
>
>In many browsers, tabs and carriage returns can be embedded in a textarea
>field.
>
>I wrote this function to strip these characters:
>
>def clean(text, tab_width):
>	text = text.strip() + ' '
>	return text.expandtabs(tab_width)
>
>The problem I've run into is that the return/linefeed characters are embedded
>within the value returned from the form and strip() won't work.  I wrote
>another function that tests each character and strips the carriage
>return/linefeeds, but I'm finding that  browsers often replace a return with
>a carriage return/linefeed and I end up with two spaces in the text.
>
>I'm working on figuring out the best way to do this, but thought I'd send to
>the list to see if someone could steer me in the right direction.

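ISTM your tab results may not be what you want after eliminating line ends:
the tab stops will then be measured from the start of the whole joined string
instead of from the start of each line.
Or did you mean to preserve the lines and just clean them up one by one?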
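
A small sketch of that tab effect, using a made-up sample string (not from the
original post): expanding tabs before joining the lines and after joining them
gives different spacing.

```python
# Hypothetical sample: the tab sits on the second line.
text = "abcd\nde\tf"

# Expand first: the column count restarts after '\n', so the tab
# pads "de" out to column 4 of its own line (two spaces).
per_line = " ".join(text.expandtabs(4).split("\n"))

# Join first: the tab now sits at column 7 of the joined string
# and pads only to column 8 (one space).
joined = " ".join(text.split("\n")).expandtabs(4)

print(repr(per_line))
print(repr(joined))
```

Which of the two is right depends on whether the original line layout matters
for the download.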

Anyway, look into the re module. You could do something like:

 >>> import re
 >>> def clean(text, tab_width):
 ...     return ' '.join([z for z in re.compile('[\r\n]+').split(text) if z]).expandtabs(tab_width)
 ...
 >>> s
 'Tab ->\t<- between arrows\ncr/lf at end of this line\r\ntwo spaces ->  <-between arrows\ntwo blank lines after this\n\n\nand the final line ends with \\n here.\n'
 >>> print s
 Tab ->  <- between arrows
 cr/lf at end of this line
 two spaces ->  <-between arrows
 two blank lines after this


 and the final line ends with \n here.

 >>> clean(s,6)
 'Tab ->      <- between arrows cr/lf at end of this line two spaces ->  <-between arrows two blank lines after this and the final line ends with \\n here.'
 >>> clean(s,4)
 'Tab ->  <- between arrows cr/lf at end of this line two spaces ->  <-between arrows two blank lines after this and the final line ends with \\n here.'
 >>>

(The output immediately above has obviously wrapped, maybe more than once, unless you have wrap turned off ;-)
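
For reference, a sketch of the same idea written for current Python 3, with the
pattern compiled once and re.sub() used instead of split/join. The names
LINE_ENDS and sample are made up for this illustration; the behaviour is meant
to match the clean() above, not to replace it.

```python
import re

# Precompiled pattern: any run of CR and/or LF characters counts as one
# line break, so "\r\n" and "\n\n\n" each collapse to a single separator.
LINE_ENDS = re.compile(r"[\r\n]+")

def clean(text, tab_width):
    """Join the lines of text with single spaces, then expand tabs."""
    # Strip line-end characters from both edges so no stray space is added there.
    joined = LINE_ENDS.sub(" ", text.strip("\r\n"))
    return joined.expandtabs(tab_width)

# Hypothetical textarea value with a tab, a CR/LF pair and blank lines.
sample = "one\ttwo\r\nthree\n\n\nfour\n"
print(repr(clean(sample, 4)))
```

If the exact spacing inside the text does not matter, " ".join(text.split())
is a simpler alternative: it collapses every run of whitespace, tabs and
repeated spaces included, into a single space.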

HTH
Regards,
Bengt Richter



