[Tutor] Initialize values from a text input file

Steven D'Aprano steve at pearwood.info
Tue Jan 4 00:46:58 CET 2011


Tim Johnson wrote:
> I'm just have a little fun here, but I bet that comments will help
> further elighten me on the subtleties of python.
> 
> consider the following console session:
>>>> L = ['foo','bar']
>>>> locals()[L[0]] = L[1]

This will not do what you think it does. In general, local variables are 
not writable except the normal way by `name = value` assignment. Try 
this function:

def spam():
     x = 1
     print(x)
     locals()['x'] = 2
     print(x)


The fact that assignment to locals() works in the top-level global scope 
is an accident of implementation.



>>>> foo
> 'bar'
>>>> 'foobar' in locals()
> False
>>>> 'foo' in locals()
> True
>>>> locals()
> {'__builtins__': <module '__builtin__' (built-in)>, 'L': ['foo',
> 'bar'], '__package__': None, '__name__': '__main__', 'foo': 'bar',
> '__doc__': None}
> 
> Given that I have a text file as input with names and values
> with a seperator (TAB, for instance)
> I could initialize variables in a local scope, or I could build a
> dictionary. Actually the dictionary is preferable for the targeted
> application.

Have you considered using the ConfigParser module?


> However, for the sake of edification, I would welcome comments on
> the session above, and let's pretend that `L' is the result of a
> line read from a file and split on '\t'.

(1) This won't work except in the global scope.

(2) Even if it did work, do you trust the source of the text? Taking 
external data provided by arbitrary untrusted users and turning it into 
variables is a good way to have your computer owned by bad guys. This 
technique would make your software vulnerable to code injection attacks. 
You *might* be safe, if you never eval() or exec() the strings coming 
out of the file. But I wouldn't bet the security of my application on 
that -- especially if it were running in a web browser or on a server.

Actually, no, that's wrong -- even if you never use eval() or exec(), 
you're still vulnerable to Denial Of Service attacks if the attacker can 
put junk you don't expect into the file. Something as trivial as:

number = abc

in the input file could crash your application, if you aren't performing 
sufficient data validation when you use "number". Or they can shadow 
builtins and stop your code from working:

len = 123
# later...
len(some_list)  # this will fail

This isn't a theoretical threat, something like 3 or 4 out of 5 security 
vulnerabilities these days are code injection attacks, and there are a 
LOT of security vulnerabilities. (The majority of these are SQL injections.)

At the very least, you should pass the items through a white list of 
allowed variable names, and validate the values. Do not use a black list 
of forbidden variables! The problem with the black list idea is that it 
assumes that you can think of every possible vulnerability ahead of 
time. You can't.


-- 
Steven


More information about the Tutor mailing list