How to test characters of a string

De ongekruisigde ongekruisigde at news.eternal-september.org
Wed Jun 8 14:57:27 EDT 2022


On 2022-06-08, 2QdxY4RzWzUUiLuE at potatochowder.com <2QdxY4RzWzUUiLuE at potatochowder.com> wrote:
> On 2022-06-09 at 04:15:46 +1000,
> Chris Angelico <rosuav at gmail.com> wrote:
>
>> On Thu, 9 Jun 2022 at 04:14, <2QdxY4RzWzUUiLuE at potatochowder.com> wrote:
>> >
>> > On 2022-06-09 at 03:18:56 +1000,
>> > Chris Angelico <rosuav at gmail.com> wrote:
>> >
>> > > On Thu, 9 Jun 2022 at 03:15, <2QdxY4RzWzUUiLuE at potatochowder.com> wrote:
>> > > >
>> > > > On 2022-06-08 at 08:07:40 -0000,
>> > > > De ongekruisigde <ongekruisigde at news.eternal-september.org> wrote:
>> > > >
>> > > > > Depending on the problem a regular expression may be the much simpler
>> > > > > solution. I love them for e.g. text parsing and use them all the time.
>> > > > > Unrivaled when e.g. parts of text have to be extracted, e.g. from lines
>> > > > > like these:
>> > > > >
>> > > > >   root:x:0:0:System administrator:/root:/run/current-system/sw/bin/bash
>> > > > >   dhcpcd:x:995:991::/var/empty:/run/current-system/sw/bin/nologin
>> > > > >   nm-iodine:x:996:57::/var/empty:/run/current-system/sw/bin/nologin
>> > > > >   avahi:x:997:996:avahi-daemon privilege separation user:/var/empty:/run/current-system/sw/bin/nologin
>> > > > >   sshd:x:998:993:SSH privilege separation user:/var/empty:/run/current-system/sw/bin/nologin
>> > > > >   geoclue:x:999:998:Geoinformation service:/var/lib/geoclue:/run/current-system/sw/bin/nologin
>> > > > >
>> > > > > Compare a regexp solution like this:
>> > > > >
>> > > > >   >>> g = re.search(r'([^:]*):([^:]*):(\d+):(\d+):([^:]*):([^:]*):(.*)$' , s)
>> > > > >   >>> print(g.groups())
>> > > > >   ('geoclue', 'x', '999', '998', 'Geoinformation service', '/var/lib/geoclue', '/run/current-system/sw/bin/nologin')
>> > > > >
>> > > > > to the code one would require to process it manually, with all the edge
>> > > > > cases. The regexp surely reads much simpler (?).
>> > > >
>> > > > Uh...
>> > > >
>> > > >     >>> import pwd # https://docs.python.org/3/library/pwd.html
>> > > >     >>> [x for x in pwd.getpwall() if x[0] == 'geoclue']
>> > > >     [pwd.struct_passwd(pw_name='geoclue', pw_passwd='x', pw_uid=992, pw_gid=992, pw_gecos='Geoinformation service', pw_dir='/var/lib/geoclue', pw_shell='/sbin/nologin')]
>> > >
>> > > That's great if the lines are specifically coming from your system's
>> > > own /etc/passwd, but not so much if you're trying to compare passwd
>> > > files from different systems, where you simply have the files
>> > > themselves.
>> >
>> > In addition to pwent to get specific entries from the local password
>> > database, POSIX has fpwent to get a specific entry from a stream that
>> > looks like /etc/passwd.  So even POSIX agrees that if you think you have
>> > to process this data manually, you're doing it wrong.  Python exposes
>> > neither function directly (at least not in the pwd module or the os
>> > module; I didn't dig around or check PyPI).
>> 
>> So...... we can go find some other way of calling fpwent, or we can
>> just parse the file ourselves. It's a very VERY simple format.
>
> If you insist:
>
>     >>> s = 'nm-iodine:x:996:57::/var/empty:/run/current-system/sw/bin/nologin'
>     >>> print(s.split(':'))
>     ['nm-iodine', 'x', '996', '57', '', '/var/empty', '/run/current-system/sw/bin/nologin']
>
> Hesitantly, because this is the Python mailing list, I claim (a) ':' is
> simpler than r'([^:]*):([^:]*):(\d+):(\d+):([^:]*):([^:]*):(.*)$', and
> (b) str.split covers pretty much the same edge cases as re.search.

Ah, but you don't check that fields 2 and 3 (0-based) are numeric! But
agreed, it's not the best of examples.
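
To make the comparison concrete, here is a rough sketch of both approaches
with the numeric check added to the split version. The function names
parse_with_regex and parse_with_split are made up for illustration, and
str.isdecimal is only my approximation of what \d+ accepts.

```python
import re

# Pattern quoted earlier in the thread; \d+ forces fields 2 and 3 to be numeric.
PASSWD_RE = re.compile(r'([^:]*):([^:]*):(\d+):(\d+):([^:]*):([^:]*):(.*)$')


def parse_with_regex(line):
    """Return the seven captured fields, or None if the line does not match."""
    m = PASSWD_RE.match(line)
    return m.groups() if m else None


def parse_with_split(line):
    """Return the seven fields, or None if the count or numeric check fails."""
    fields = line.split(':')
    if len(fields) != 7:
        return None
    # Explicit stand-in for the two \d+ groups (assumption: isdecimal is
    # close enough to \d+ for this purpose; empty strings are rejected).
    if not all(f.isdecimal() for f in fields[2:4]):
        return None
    return tuple(fields)


good = 'nm-iodine:x:996:57::/var/empty:/run/current-system/sw/bin/nologin'
bad = 'nm-iodine:x:abc:57::/var/empty:/run/current-system/sw/bin/nologin'

print(parse_with_regex(good) == parse_with_split(good))
print(parse_with_regex(bad), parse_with_split(bad))
```

The two still differ on lines with extra colons: the final (.*) group
swallows them, while the split version rejects anything that does not have
exactly seven fields.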


-- 
<StevenK> You're rewriting parts of Quake in *Python*?
<knghtbrd> MUAHAHAHA

