Working with binary data, S-records (long)

Fri Mar 21 16:43:53 EST 2003

Hans-Joachim,

I am looking at the PyTables modules for handling HDD binary sector
data.

http://groups.google.com/groups?dq=&hl=en&lr=&ie=UTF-8&group=comp.lang.python.announce&selm=mailman.1048090984.682.clpa-moderators%40python.org

Cheers,

-Alan 

"Hans-Joachim Widmaier" <hjwidmaier at web.de> wrote in message news:<pan.2003.03.20.18.58.24.448261 at web.de>...
> We have a program to download (Flash-)Eprom images to an embedded device.
> It reads the image data from a file in Motorola S-record format, augments
> the image with some device specific data and sends the thing via a serial
> port to the device, usually after downloading a small piece of software,
> a second-stage bootloader, first, in the same manner. This program is
> currently written in C and runs under Linux and in a DOS shell under older
> Windows versions. Over the years said program accumulated a lot of cruft
> and grew command line options like weeds. Now my boss asked me to make a
> new version which offers a GUI and runs under Windows 2000 and XP as well.
> Which is perfectly fine with me, 'cause after finding pyserial that'll be
> more like a fun job (I've already written a similar thing in Python that
> reads an ELF program file and downloads the needed section to another
> device - it worked in a few hours; a C version would have taken weeks to
> write and debug).
> 
> Time to get to the topic. Reading S-records in Python is not all that much
> fun (ok, it isn't in C either). I've thought about doing it in C and
> returning a string, but that would lose the address information. And
> creating more complex Python data types in C is something I've never done.
> And I don't want to compile under Windows. Thus I wrote a pure Python
> reader, which looks like (this is the whole class so far):
> --------------------------
> import operator
> 
> class SRecord:
>     def __init__(self, init=0xff, checkcs=True):
>         self.udata  = []
>         self.data   = []
>         self.tail   = {}
>         self.offset = 0
>         self.size   = 0
>         self.start  = None
>         self.comm   = []
>         self.init   = init
>         self.check  = checkcs
> 
>     def readrecord(self, line):
>         """Read one line as an S-record and return type, address, data and checksum."""
>         type = line[:2]
>         data = [int(line[i:i + 2], 16) for i in range(2, len(line), 2)]
>         cs   = (reduce(operator.add, data) + 1) & 0xff  # must come out as 0
>         if type in ('S1', 'S9'):
>             adr = (data[1] << 8) + data[2]
>             fd  = 3
>         elif type in ('S2', 'S8'):
>             adr = (data[1] << 16) + (data[2] << 8) + data[3]
>             fd  = 4
>         elif type in ('S3', 'S7'):
>             adr = (long(data[1]) << 24) + (data[2] << 16) + (data[3] << 8) + data[4]
>             fd  = 5
>         elif type == 'S0':      # comment
>             return 'C', 0, data[3:-1], cs
>         else:
>             raise ValueError, "Not a valid S-record"
>         if type > 'S6':         # start address
>             type = 'S'
>         else:                   # data
>             type = 'D'
>         return type, adr, data[fd:-1], cs
> 
>     def readrecords(self, records):
>         """Read a list (of lines) of S-records."""
>         recno = -1
>         for line in records:
>             recno += 1
>             line = line.rstrip()
>             type, adr, data, cs = self.readrecord(line)
>             if cs and self.check:
>                 raise ValueError, "Checksum error in record %d" % recno
>             if type == 'D':
>                 self.udata.append((adr, data))
>             elif type == 'S':
>                 self.start = adr
>             else:
>                 self.comm.append("".join(map(chr, data)))
>         if not self.udata:
>             return
>         self.udata.sort()
>         loadr = self.udata[0][0]
>         hiadr = self.udata[-1][0] + len(self.udata[-1][1])
>         size  = hiadr - loadr
>         self.data = [self.init] * size
>         for adr, data in self.udata:
>             dlen  = len(data)
>             adr  -= loadr
>             self.data[adr:adr + dlen] = data
>         self.offset = loadr
>         self.size   = size
> 
> -----------------------
> On my development machine (1.7 GHz) it's reasonably fast with a file worth
> 100 KB. But I'm afraid it'll suck on our production machines, which run at
> 166 MHz (give or take some). I thought about using array, but it's lacking
> a method to create a big array without creating a list or string first.
> 
> Anyway, does anyone see a way to speed this up? I'm not going to inline
> readrecord(), as I don't care about 10 %. I'm asking if you see a real
> flaw in my algorithm.
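One candidate for a speed-up is the per-byte `int(..., 16)` list comprehension: the hex decoding and the checksum sum can both be handed to built-ins that run in C. The following is only a sketch under assumptions, written in Python 3 rather than the Python 2 of the post; `parse_srecord` and the `ADDRESS_WIDTH` table are illustrative names, and unlike `readrecord()` it returns the decoded address for S0 records instead of a fixed 0. No timings were taken.

```python
# Address width in bytes for each supported record type.
ADDRESS_WIDTH = {'S0': 2, 'S1': 2, 'S2': 3, 'S3': 4,
                 'S7': 4, 'S8': 3, 'S9': 2}

def parse_srecord(line):
    """Split one S-record line into (kind, address, payload, checksum_ok)."""
    line = line.strip()
    rectype = line[:2]
    if rectype not in ADDRESS_WIDTH:
        raise ValueError("Not a valid S-record: %r" % line)
    # Decode count, address, data and checksum bytes in one call.
    raw = bytes.fromhex(line[2:])
    # Count + address + data + checksum must sum to 0xFF (mod 256).
    checksum_ok = (sum(raw) & 0xFF) == 0xFF
    width = ADDRESS_WIDTH[rectype]
    address = int.from_bytes(raw[1:1 + width], 'big')
    payload = raw[1 + width:-1]
    if rectype == 'S0':
        kind = 'C'              # comment / header
    elif rectype in ('S7', 'S8', 'S9'):
        kind = 'S'              # start address
    else:
        kind = 'D'              # data
    return kind, address, payload, checksum_ok

# Hand-built S1 record: 3 data bytes (01 02 03) at address 0x1000.
kind, address, payload, ok = parse_srecord("S1061000010203E3")
```

The payload comes back as an immutable `bytes` object rather than a list of integers, which also makes the later slice assignment into the image cheaper.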
> 
> <pipe-dreaming mode on>
> Whenever I play with binary data in Python, a dream of a mutable string
> data type crops up. Doing byte fiddling with strings is quite ok as long
> as the data is comparatively small. But when the thing gets largish, the
> slicing, copying and reassembling are getting increasingly inelegant, not
> to say "un-pythonic." Even if that hypothetical mutable string type
> wouldn't be returned by read() and wouldn't be accepted by write(),
> conversion from and to normal immutable strings should be cheap.
> <pipe-dreaming mode off>
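A type along these lines was added to the language later: Python 2.6 and Python 3 provide `bytearray`, a mutable sequence of bytes that supports in-place slice assignment and converts to an immutable `bytes` object with a single copy. A minimal sketch of the image-assembly step from `readrecords()` using it, in Python 3; `build_image` and the sample records are made up for illustration.

```python
def build_image(records, fill=0xFF):
    """Merge (address, payload) pairs into one contiguous bytearray.

    Returns (offset, image); gaps between records keep the `fill` value.
    """
    if not records:
        return 0, bytearray()
    records = sorted(records)
    low = records[0][0]
    high = max(adr + len(data) for adr, data in records)
    image = bytearray([fill]) * (high - low)
    for adr, data in records:
        start = adr - low
        # In-place slice assignment; no rebuilding of the whole buffer.
        image[start:start + len(data)] = data
    return low, image

offset, image = build_image([(0x1004, b'\x04\x05'), (0x1000, b'\x01\x02')])
frozen = bytes(image)   # immutable copy, e.g. for use as a dict key
```

Binary file objects accept a `bytearray` in `write()` directly, so the conversion to `bytes` is often not needed at all.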
> 
> Hope you're not distracted by the German comments, and just presuming you
> know what S-records are,
> 
> Hans-Joachim