[Tutor] stx, etx (\x02, \x03)

Danny Yoo dyoo at hashcollision.org
Mon Sep 28 01:55:32 CEST 2015


On Tue, Sep 22, 2015 at 5:37 AM, richard kappler <richkappler at gmail.com> wrote:
> I have a file with several lines. I need to prepend each line with \x02 and
> append each line with \x03 for reading into Splunk. I can get the \x02 at
> the beginning of each line, no problem, but can't get the \x03 to go on the
> end of the line. Instead it goes to the beginning of the next line.

Hi Richard,

Just to check: what operating system are you running your program in?
Also, what version of Python?

This detail may matter.  Your question is about line endings, and line
ending conventions are platform-specific.  Windows uses '\r\n', Linux
and Mac OS X use '\n', and classic Mac OS used '\r', which can be
infuriatingly non-uniform.

If you are using Python 3, "universal newline" support is on by
default for files opened in text mode, so you just have to consider
'\n' as the line terminator in your text files.

If you're in Python 2, you can open the file in universal newline mode.

##############################
with open('input/test.xml', 'rU') as f1: ...
##############################

See:

    https://docs.python.org/2/library/functions.html#open

for details.
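
If you want to see what universal newline handling actually does, here
is a small sketch.  The temporary file and the names "translated" and
"untranslated" are made up for the illustration and are not part of
your program.  It uses io.open, which behaves like Python 3's open()
and is also available in Python 2.6 and later.

```python
import io
import os
import tempfile

# Write a small file with Windows-style line endings, byte for byte.
# (The file name is hypothetical; it lives in a throwaway temp directory.)
path = os.path.join(tempfile.mkdtemp(), 'crlf.txt')
with open(path, 'wb') as f:
    f.write(b'one\r\ntwo\r\n')

# Default text mode (newline=None): '\r\n' is translated to '\n' on read.
with io.open(path, 'r') as f:
    translated = f.readlines()

# newline='' switches the translation off, so the '\r' stays visible.
with io.open(path, 'r', newline='') as f:
    untranslated = f.readlines()

print(repr(translated))
print(repr(untranslated))
```

With translation on, every line ends in a plain '\n' no matter which
platform produced the file.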



> I have tried:
>
> #!/usr/bin/env python
>
> with open('input/test.xml', 'r') as f1:
>     with open('mod1.xml', 'a') as f2:
>         for line in f1:
>             s = ('\x02' + line + '\x03')
>             f2.write(s)
>
> as well as the same script but using .join, to no avail.

Question: can you explain why the program is opening 'mod1.xml' in 'a'
append mode?  Why not in 'w' write mode?


This may be important because multiple runs of the program will append
to the end of the file, so if you inspect the output file, you may be
confusing the output of prior runs with that of the current run.  Also,
it's likely that the output file will be malformed, since there should
just be one XML document per file.  In summary: opening the output
file in append mode looks a bit dubious here.
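
To make the difference concrete, here is a rough sketch that runs the
same write twice in each mode.  The helper run() and the temporary
output file are invented for the example.

```python
import os
import tempfile

# Hypothetical output file in a throwaway temp directory.
path = os.path.join(tempfile.mkdtemp(), 'out.xml')

def run(mode):
    # Stand-in for one run of the program: write one tiny document.
    with open(path, mode) as f:
        f.write('<doc/>\n')

# Two runs in append mode: the second run adds to the first.
run('a')
run('a')
with open(path) as f:
    appended = f.read()

# Two runs in write mode: each run starts from an empty file.
run('w')
run('w')
with open(path) as f:
    written = f.read()

print(repr(appended))
print(repr(written))
```

After the two 'a' runs the file holds two documents back to back,
while after the 'w' runs it holds just one.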



To your other question:

> What am I missing?

Likely, the lines being returned from f1 still have a line terminator
at the end.  You'll want to interpose the '\x03' right before the line
terminator.  Mark Laurence's suggestion to use:

    s = '\x02' + line[:-1] + '\x03\n'

looks ok to me.
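
One caveat with line[:-1]: it assumes every line ends in exactly one
'\n'.  If the last line of the file has no terminator, it will chop
off a real character, and if a '\r\n' sneaks through untranslated, the
'\r' ends up before the '\x03'.  Here is a slightly more defensive
sketch.  The names STX, ETX, wrap_line and wrap_file are my own, and
I'm assuming you want to keep whatever terminator the line had.

```python
STX = '\x02'
ETX = '\x03'

def wrap_line(line):
    """Return line with STX in front and ETX just before its terminator."""
    # Strip trailing '\r' and '\n' characters, then recover what was
    # stripped so it can be put back after the ETX.
    body = line.rstrip('\r\n')
    terminator = line[len(body):]
    return STX + body + ETX + terminator

def wrap_file(src_path, dst_path):
    """Copy src_path to dst_path, wrapping every line in STX/ETX."""
    # 'w' rather than 'a', so each run starts with a fresh output file.
    with open(src_path, 'r') as src, open(dst_path, 'w') as dst:
        for line in src:
            dst.write(wrap_line(line))

print(repr(wrap_line('<a>hello</a>\n')))
```

You would call it as wrap_file('input/test.xml', 'mod1.xml').  Note
that rstrip('\r\n') removes any run of trailing '\r' and '\n'
characters, which is what we want for a single line read from a file.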

