Problem with Lexical Scope

Sat Dec 17 03:59:44 EST 2005

>Note that NotImplemented is not the same as NotImplementedError -- and I wasn't suggesting
>raising an exception, just returning a distinguishable "True" value, so that
>a test suite (which I think you said the above was from) can test that the "when"
>guard logic is working vs just passing back a True which can't distinguish whether
>the overall test passed or was inapplicable.

Ahh, sorry about that. I got ya. that's not a bad idea.

>BTW, is this data coming from an actual DBMS system? If so, wouldn't
>there be native facilities to do all this validation? Or is Python
>just too much more fun than SQL? ;-)

Well, the "first usage" is going to come from a RDBMS, yeap. We use
MySQL 4.1.x right now. I know with Oracle/MSSQL, it's easy to put all
kinds of fancy constraints on data. I'm aware of no such facilities in
MySQL though I would love it if there was one to apply these kinds of
flexible validation routines.

Worst case scenario I can use it to do testing of what is pulled out of
the database :)

>Not to distract you, but I'm wondering what your rules would look like represented in simple
>rule-per-line text, if you assume that all names are data base field names
>unless otherwise declared, and just e.g. "name!" would be short for "have(name)",
>then just use some format like (assuming 'rules' is a name for a particular group of rules)

>    functions:: "is_ssn" # predeclare user-defined function names
>    operators:: "and", "<" # might be implicit

>    rules: first_name!; last_name!; ssn! and is_ssn(ssn); birth_date! and hire_date! and (birth_date < hire_date)

>or with identation, and messages after a separating comma a la assert

That's really a great idea!

When I was trying to figure out how to combine these rules and figure
out what kind of flexibility I wanted, I actually did write some
declarative strings. It was a lot more verbose than what you wrote, but
it helped with the concepts. Your approach is good.

My little pseduo-lang was like:

REQUIRE last_name or first_name
WHEN HAVE last_name, surname THEN REQUIRE last_name != surname

Well, I didn't go on and work with the error strings. I decided that
this thing wouldn't be very useful without being able to work with
hierarchical data.

So I abstracted out the idea of referencing the record by key and
implemented a little "data resolution" protocol using PLY. I like the
protocol because I think it will be something that can be reused fairly
often, especially for mini-langs.

def testNestedData(self):
		data = {
			'first_name' : 'John',
			'last_name' : 'Smith',
			'surname' : '',
			'phone' : '205-442-0841',
			'addresses' : {
				'home' : {
					'street' : '123 Elm St',
					'city' : 'Birmingham',
					'state' : 'AL',
					'postal' : '35434'
				},
				'work' : {
					'street' : '101 Century Square',
					'city' : 'Centre',
					'state' : 'AL',
					'postal' : '35344'
				}
			},
			'aliases' : [
				'John Smith',
				'Frank Beanz'
			]
		}

		rule = when(have('addresses.work'), all(
						have('addresses.work.city'),
						have('addresses.work.state'),
						have('addresses.work.postal')))

		assert repr(rule(data))
		data['addresses']['work']['city'] = ''
		assert rule(data) == False
		data['addresses']['work']['city'] = 'Centre'

It works the same as the rules do:

def match(test, field):
    """ match uses a function that takes a single string and returns a
boolean
    and converts it into a validation rule.
    """
    field = drep(field)
    def rule(record):
        value = field(record)
        if value:
            return bool(test(value))
        else:
            return True
    return rule

def check(test, fields):
    """ check uses a function that takes a variable arguments and
applies the given
    fields to that function for the test.
    """
    fields = [drep(field) for field in fields]
    def rule(record):
        values = [field(record) for field in fields]
        return test(*values)
    return rule

The data resolution algorithm uses dots and square braces. Dot's
resolve on dicts and square braces resolve on lists. It doesn't support
slicing, but it easily could. There is a special [:] notation that
turns the return into a list. all rules after the [:] are then applied
successively to each list entry.

  def testDeep(self):
        data = {
            'a' : 'eh-yah',
            'b' : {'x' : 'xylophone',
                    'y': 'yo-yo',
                    'z': 'zebra',
                    'A': [
                        1, 2, 3, 4
                    ]}
        }
        assert drep('b')(data) == {'y': 'yo-yo', 'x': 'xylophone', 'z':
'zebra', 'A': [1, 2, 3, 4]}
        assert drep('b.x')(data) == 'xylophone'
        assert drep('b.A[:]')(data) == [1, 2, 3, 4]
        assert drep('b.A')(data) == [1, 2, 3, 4]
        assert drep('b.A[1]')(data) == 2

   def testListResolution(self):
        data = {"foo" : [
                  {"bar" : 1},
                  {"bar" : 2},
                  {"bar" : 3}
                 ]}
        assert drep("foo[:].bar")(data) == [1, 2, 3]
        assert drep("foo[2].bar")(data) == 3

I know the code COULD also use some funky thing to break apart dicts as
well, but I didn't think of that when I was writing the lexer/parser.

Combining this with a little mini-language like the one you described
would be nice.