Parsing a serial stream too slowly

Nick Dokos nicholas.dokos at hp.com
Mon Jan 23 18:28:24 EST 2012


M.Pekala <mcdpekala at gmail.com> wrote:

> On Jan 23, 5:00 pm, Jon Clements <jon... at googlemail.com> wrote:
> > On Jan 23, 9:48 pm, "M.Pekala" <mcdpek... at gmail.com> wrote:
> >
> > > Hello, I am having some trouble with a serial stream on a project I am
> > > working on. I have an external board that is attached to a set of
> > > sensors. The board polls the sensors, filters them, formats the
> > > values, and sends the formatted values over a serial bus. The serial
> > > stream comes out like $A1234$$B-10$$C987$, where "$A.*$" is a sensor
> > > value, "$B.*$" is a sensor value, "$C.*$" is a sensor value, etc.
> >
> > > When one sensor is running my python script grabs the data just fine,
> > > removes the formatting, and throws it into a text control box. However
> > > when 3 or more sensors are running, I get output like the following:
> >
> > > Sensor 1: 373
> > > Sensor 2: 112$$M-160$G373
> > > Sensor 3: 763$$A892$
> >
> > > I am fairly certain this means that my code is running too slow to
> > > catch all the '$' markers. Below is the snippet of code I believe is
> > > the cause of this problem...
> >
> > > def OnSerialRead(self, event):
> > >         text = event.data
> > >         self.sensorabuffer = self.sensorabuffer + text
> > >         self.sensorbbuffer = self.sensorbbuffer + text
> > >         self.sensorcbuffer = self.sensorcbuffer + text
> >
> > >         if sensoraenable:
> > >                 sensorresult = re.search(r'\$A.*\$.*', self.sensorabuffer )
> > >                         if sensorresult:
> > >                                 s = sensorresult.group(0)
> > >                                 s = s[2:-1]
> > >                                 if self.sensor_enable_chkbox.GetValue():
> > >                                         self.SensorAValue = s
> > >                                 self.sensorabuffer = ''
> >
> > >         if sensorbenable:
> > >                 sensorresult = re.search(r'\$A.*\$.*', self.sensorbenable)
> > >                         if sensorresult:
> > >                                 s = sensorresult.group(0)
> > >                                 s = s[2:-1]
> > >                                 if self.sensor_enable_chkbox.GetValue():
> > >                                         self.SensorBValue = s
> > >                                 self.sensorbenable= ''
> >
> > >         if sensorcenable:
> > >                 sensorresult = re.search(r'\$A.*\$.*', self.sensorcenable)
> > >                         if sensorresult:
> > >                                 s = sensorresult.group(0)
> > >                                 s = s[2:-1]
> > >                                 if self.sensor_enable_chkbox.GetValue():
> > >                                         self.SensorCValue = s
> > >                                 self.sensorcenable= ''
> >
> > >         self.DisplaySensorReadings()
> >
> > > I think that regex is too slow for this operation, but I'm uncertain
> > > of another method in python that could be faster. A little help would
> > > be appreciated.
> >
> > You sure that's your code? Your re.search()'s are all the same.
> 
> Whoops, you are right. The search for the second should be
> re.search(r'\$B.*\$.*', self.sensorbbuffer), and for the third
> re.search(r'\$C.*\$.*', self.sensorcbuffer).
> 

The regex is probably still wrong: the '.*' parts are greedy and run to
the end of the string, so r'\$A.*\$.*' will e.g. match all of your
initial example "$A1234$$B-10$$C987$". The slice s[2:-1] then drops only
the leading '$A' and the final '$', and s ends up as
"1234$$B-10$$C987" - I doubt that's what you want:

>>> import re
>>> sensor_result = "$A123$$B456$$C789$$A456$"
>>> r = re.search(r'\$A.*\$.*', sensor_result)
>>> s = r.group(0)
>>> s = s[2:-1]
>>> s
'123$$B456$$C789$$A456'

Is this perhaps closer to what you want?

>>> r = re.search(r'\$A[^$]+\$', sensor_result)
>>> r.group(0)
'$A123$'
>>> 
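
If only the value between the markers is wanted, a capturing group avoids
the s[2:-1] slicing altogether. The following is a minimal sketch, not
code from the original program: extract_value is a hypothetical helper
name, and it assumes a value never contains a '$'.

```python
import re

def extract_value(tag, text):
    """Return the payload of the first complete $<tag>...$ frame, or None.

    Hypothetical helper for illustration; assumes payloads contain no '$'.
    """
    # [^$]+ stops at the next '$', so the match cannot run into later frames.
    m = re.search(r'\$' + re.escape(tag) + r'([^$]+)\$', text)
    return m.group(1) if m else None

stream = "$A1234$$B-10$$C987$"
print(extract_value("A", stream))
print(extract_value("B", stream))
print(extract_value("D", stream))  # no such frame in the stream
```

Returning None for a missing frame lets the caller keep the previous
reading until a complete frame has arrived.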

I'm sure there are more problems too - e.g. why are there three buffers?
If they all start empty and get modified the same way, they will all
contain the same string - are they modified differently in the part of
the program you have not shown? They will presumably need to be trimmed
appropriately to indicate which part has been consumed already. And, as
somebody pointed out already, the searches should probably be against
the *buffer* variables rather than the *enable* variables.
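
One way to address the buffer questions above is to keep a single buffer,
pull out every complete frame, and keep only the unconsumed tail for the
next read. This is a rough sketch under assumptions that are not in the
original post: feed and FRAME are made-up names, sensor tags are taken to
be single uppercase letters, and a value is taken never to contain '$'.

```python
import re

# Assumed frame layout: '$', one uppercase tag letter, the value, '$'.
FRAME = re.compile(r'\$([A-Z])([^$]+)\$')

def feed(buffer, chunk):
    """Append chunk to buffer and return (readings, remaining_buffer).

    readings maps a tag letter to the most recent value found for it.
    remaining_buffer is whatever follows the last complete frame, so a
    frame that is split across two reads is finished on the next call.
    """
    buffer += chunk
    readings = {}
    consumed = 0
    for m in FRAME.finditer(buffer):
        readings[m.group(1)] = m.group(2)
        consumed = m.end()
    return readings, buffer[consumed:]

# Simulate frames arriving split across three serial reads.
buf = ""
for chunk in ("$A12", "34$$B-1", "0$$C987$"):
    readings, buf = feed(buf, chunk)
    print(readings, repr(buf))
```

In the event handler, the returned dictionary could be used to update the
displayed values for whichever sensors are enabled. The sketch does not
bound the buffer size, so a stream that never produces a complete frame
would need some extra trimming.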

Nick



