[python-ldap] ANN: python-ldap 2.5.2

Raphaël Barrois raphael.barrois at m4x.org
Mon Nov 20 15:18:49 EST 2017


On Mon, 20 Nov 2017 20:23:34 +0100
Michael Ströder <michael at stroeder.com> wrote:

> Christian Heimes wrote:
> > The option for Python 3 support in python-ldap is fantastic news.
> > Raphaël Barrois, Petr, Miro, and me have been working on a Python 3 fork
> > [1] of python-ldap for a while. The fork has been shipped in Fedora for
> > several releases and is used in production, e.g. FreeIPA. The fork
> > stayed as close to python-ldap as possible because we never lost hope to
> > get all Python 3 changes into upstream one day. I'm sure that my
> > co-maintainers gladly agree to get rid of the fork ASAP.  
> 
> Hmm. The main obstacle for back-porting pyldap is that I'd like to keep
> python-ldap binary-only and still let the calling app do the Unicode
> decode/encode stuff if needed. It seems you're endorsing the opposite way.
> 
> Ciao, Michael.
> 

Hi,

The question of bytes/unicode in LDAP is rather complicated, and depends on the objects we're considering.

Per the LDAP RFCs, a few fields *MUST* be valid UTF-8 strings:
- An object's distinguishedName;
- An attribute name (but not its value);
- A server URI;
- And a couple of others.

This is the choice we made in pyldap:
- If the field is mandated to be a valid UTF-8 string, expose it as a "unicode" object;
- Otherwise, it's bytes.

In other words, a typical query would be (with explicit type markers for readability):

  >>> conn.search_ext_s(base=u'ou=people,dc=example,dc=org', scope=ldap.SCOPE_ONELEVEL, filterstr=u'(objectClass=inetOrgPerson)')
  [
    (u'uid=rbarrois,ou=people,dc=example,dc=org', {
      u'cn': [b"Rapha\xc3\xabl"],
      u'sn': [b"Barrois"],
      u'memberOf': [b"cn=test,ou=groups,dc=example,dc=org", b"cn=admin,ou=groups,dc=example,dc=org"],
    }),
  ]

In the python-ldap style, each and every program calling `search_ext_s` would have to:
- UTF-8 encode the `base` and `filterstr` parameters (although they must always be valid UTF-8);
- UTF-8 decode each object's DN (although it can only be a valid UTF-8 string);
- UTF-8 decode each attribute name (only UTF-8 is allowed as well).
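
As a rough sketch of the boilerplate this pushes onto every caller, assuming the binding hands back bytes everywhere: the helper name `decode_result` and the sample entry are hypothetical, made up for illustration, and are not part of python-ldap or pyldap.

```python
def decode_result(raw_entries, encoding="utf-8"):
    """Decode DNs and attribute names; leave attribute values as bytes.

    Hypothetical helper: it only illustrates what a caller would have to
    write by hand when the binding returns bytes for every field.
    """
    decoded = []
    for dn, attrs in raw_entries:
        decoded.append((
            dn.decode(encoding),  # DNs are always UTF-8 text
            # Attribute names are UTF-8 text; values stay untouched.
            {name.decode(encoding): values for name, values in attrs.items()},
        ))
    return decoded


# Sample data shaped like a bytes-only search result (made up).
raw = [
    (b"uid=rbarrois,ou=people,dc=example,dc=org", {
        b"cn": [b"Rapha\xc3\xabl"],
        b"sn": [b"Barrois"],
    }),
]
entries = decode_result(raw)
```

The values are deliberately left as bytes here; only the fields that the protocol guarantees to be text are decoded.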

Beyond that, reading an attribute *value* depends on the underlying schema and, as you designed in python-ldap, *MUST* be handled specifically by the application: the only common type here is "bytes".
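
To illustrate that last point, here is a minimal sketch of application-side value handling. The `TEXT_ATTRIBUTES` set and the `read_values` helper are assumptions invented for this example; a real application would derive the decision from the directory schema rather than hard-code it.

```python
# Hypothetical, application-chosen list of attributes known to hold
# UTF-8 text. Anything else is treated as opaque binary data.
TEXT_ATTRIBUTES = {"cn", "sn", "memberOf"}


def read_values(attrs, name):
    """Return the values of `name`, decoded only if known to be text."""
    values = attrs.get(name, [])
    if name in TEXT_ATTRIBUTES:
        return [value.decode("utf-8") for value in values]
    return list(values)  # binary attribute: keep raw bytes


# Sample attribute dict with text keys and bytes values (made up).
attrs = {
    "cn": [b"Rapha\xc3\xabl"],
    "jpegPhoto": [b"\xff\xd8\xff\xe0"],
}
names = read_values(attrs, "cn")
photo = read_values(attrs, "jpegPhoto")
```

Only the application knows that `cn` is text while `jpegPhoto` is not, which is why the library itself cannot safely decode values.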


-- 
Raphaël


