[Tutor] Parsing /etc/passwd

Hugo Arts hugo.yoshi at gmail.com
Wed Oct 12 15:49:43 CEST 2011


On Wed, Oct 12, 2011 at 3:41 PM, Gerhardus Geldenhuis
<gerhardus.geldenhuis at gmail.com> wrote:
> Hi
> I wrote the following code:
>   f = open('/etc/passwd', 'r')
>   users = f.read()
>   userelements = re.findall(r'(\w+):(\w+):(\w+):(\w+):(\w+):(\w+):(\w+)',
> users)
>   print userelements
>   for user in userelements:
>     (username, encrypwd, uid, gid, gecos, homedir, usershell) = user  #
> unpack the tuple into 7 vars
>     print username
>
> but I get no results so my parsing must be wrong but I am not sure why.
> Incidentally while googling I found the
> module http://docs.python.org/library/pwd.html which I will eventually use
> but I am first curious to fix and understand the problem before I throw away
> this code.
> Regards
>

The homedir and usershell fields are paths, and paths contain slashes.
The \w character class matches only [A-Za-z0-9_], that is, letters,
digits, and the underscore. A slash therefore never matches \w, and so
the whole pattern fails on every line. The same thing happens with a
gecos field that is empty or contains spaces or commas.
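
To see the failure concretely, here is a small sketch. The sample line
is made up for illustration (it is not read from a real system), and the
"loose" pattern is just one possible alternative, assuming every field
is any run of characters other than a colon or newline:

```python
import re

# Hypothetical sample line in /etc/passwd format, for illustration only.
line = "root:x:0:0:root:/root:/bin/bash"

# The pattern from the question: \w+ cannot match the '/' in the paths.
strict = r'(\w+):(\w+):(\w+):(\w+):(\w+):(\w+):(\w+)'

# One possible alternative: seven fields of non-colon, non-newline characters.
loose = ':'.join(['([^:\n]*)'] * 7)

strict_matches = re.findall(strict, line)
loose_matches = re.findall(loose, line)

print(strict_matches)  # empty list, nothing matched
print(loose_matches)   # one tuple holding the seven fields
```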

On another note, the structure of the /etc/passwd file is pretty
simple (seven fields separated by colons), so I don't think you need
regexes. Simply use split:

users = f.readlines()
for user in users:
    # strip the trailing newline first, or it ends up in usershell
    fields = user.rstrip('\n').split(':')
    (username, encrypwd, uid, gid, gecos, homedir, usershell) = fields
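
If the file might contain blank or malformed lines, the tuple unpacking
raises a ValueError. A slightly more defensive sketch of the split
approach could look like the following; parse_passwd and the sample
text are names and data I made up for the example, not anything from
the standard library:

```python
def parse_passwd(text):
    """Return a list of 7-field lists, one per well-formed line."""
    entries = []
    for line in text.splitlines():  # splitlines() drops the newlines
        fields = line.split(':')
        if len(fields) != 7:
            continue  # skip blank or malformed lines
        entries.append(fields)
    return entries

# Hypothetical sample data, including a blank line that gets skipped.
sample = (
    "root:x:0:0:root:/root:/bin/bash\n"
    "\n"
    "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n"
)

entries = parse_passwd(sample)
for username, encrypwd, uid, gid, gecos, homedir, usershell in entries:
    print(username + ' ' + usershell)
```

For the real file you would pass in the result of f.read() instead of
the sample string.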

HTH,
Hugo

