suggestions for VIN parsing

Denis McMahon denismfmcmahon at gmail.com
Fri Dec 26 14:15:53 EST 2014


On Thu, 25 Dec 2014 17:02:33 -0700, Vincent Davis wrote:

> I would like to parse the VIN, frame and engine numbers found on this
> page (below).

First of all, you need to enumerate the different patterns that are 
possible, for example:

H + 3-5 digits -> twins Unit 350cc & 500cc '57 - '69

3 (or more?) digits + N -> twins Pre-Unit 500cc & 650cc '50
3-5 digits + NA -> twins Pre-Unit 500cc & 650cc '51 - '52
4-6 digits -> twins Pre-Unit 500cc & 650cc '52 - '60
D + 3-5 digits -> twins Pre-Unit 500cc & 650cc '60 - '62

DU + 3-5 digits -> twins Unit 650cc '63 - '69

etc etc etc

You need to define these closely enough that no engine / frame number 
can match two different expressions.

Then you create regular expressions for each pattern, and test a given 
engine / frame number against each re in turn until you get a match.
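
As a rough sketch of that step, the patterns listed above could be written 
as a table of compiled regular expressions and tried in turn. The names 
PATTERNS and classify are made up for illustration, the table covers only 
the example patterns from this message, and reading "3 (or more?) digits" 
as "three or more" is an assumption. fullmatch() anchors each expression to 
the whole string, so "D" cannot accidentally match a "DU" number.

```python
import re

# Hypothetical pattern table: only the example patterns from this message.
# Each regex captures the numeric part in group 1.
PATTERNS = [
    (re.compile(r"H([0-9]{3,5})"), "twins Unit 350cc & 500cc '57 - '69"),
    # Assumption: "3 (or more?) digits" is taken as three or more digits.
    (re.compile(r"([0-9]{3,})N"), "twins Pre-Unit 500cc & 650cc '50"),
    (re.compile(r"([0-9]{3,5})NA"), "twins Pre-Unit 500cc & 650cc '51 - '52"),
    (re.compile(r"([0-9]{4,6})"), "twins Pre-Unit 500cc & 650cc '52 - '60"),
    (re.compile(r"D([0-9]{3,5})"), "twins Pre-Unit 500cc & 650cc '60 - '62"),
    (re.compile(r"DU([0-9]{3,5})"), "twins Unit 650cc '63 - '69"),
]


def classify(number):
    """Return (description, numeric part) for the first pattern that
    matches the whole string, or None if nothing matches."""
    text = number.strip().upper()
    for regex, description in PATTERNS:
        match = regex.fullmatch(text)  # whole-string match, Python 3.4+
        if match:
            return description, int(match.group(1))
    return None
```

For example, classify("DU5825") returns ("twins Unit 650cc '63 - '69", 5825), 
and an unrecognised string such as "XYZ" returns None.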

You may then need to extract the digits and letters from the pattern to 
determine the actual month / year data. Here's an example algorithm:

if matches H + 3-5 digits:
	get integer value of numeric part
	if num >= 101 and num <= 760:
		print "Unit 350cc & 500cc, 1957"
	if num >= 761 and num <= 5484:
		print "Unit 350cc & 500cc, 1958"
.....
if matches DU + 3-5 digits:
	get integer value of numeric part
	if num >= 101 and num <= 5824:
		print "Unit 650cc, 1963"
	if num >= 5825 and num <= 13374:
		print "Unit 650cc, 1964"
	
etc etc etc
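
The algorithm above might look something like this in Python. This is only 
a sketch: YEAR_RANGES and lookup are names invented here, and the table 
holds just the four ranges quoted above, so the elided rows would have to 
be filled in from the source page.

```python
import re

# Hypothetical lookup table: prefix -> (model description, year ranges).
# Only the ranges quoted in the algorithm above are included.
YEAR_RANGES = {
    "H": ("Unit 350cc & 500cc", [(101, 760, 1957), (761, 5484, 1958)]),
    "DU": ("Unit 650cc", [(101, 5824, 1963), (5825, 13374, 1964)]),
}

# Prefix letters followed by 3-5 digits, matched against the whole string.
NUMBER_RE = re.compile(r"(H|DU)([0-9]{3,5})")


def lookup(number):
    """Return 'model, year' for a known number, or None if the number
    does not match a pattern or falls outside the listed ranges."""
    match = NUMBER_RE.fullmatch(number.strip().upper())
    if match is None:
        return None
    prefix, digits = match.groups()
    num = int(digits)  # integer value of the numeric part
    model, ranges = YEAR_RANGES[prefix]
    for low, high, year in ranges:
        if low <= num <= high:
            return "%s, %d" % (model, year)
    return None
```

With this table, lookup("H101") gives "Unit 350cc & 500cc, 1957" and 
lookup("DU5825") gives "Unit 650cc, 1964". Keeping the ranges in data rather 
than in a chain of if statements makes it easier to add the remaining years.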

Note, I think the 1981 model year ran KCA - DCA prefixes, not as shown on 
the website you quoted.

-- 
Denis McMahon, denismfmcmahon at gmail.com


