[ python-Bugs-1583946 ] SSL "issuer" and "server" names cannot be parsed

SourceForge.net noreply at sourceforge.net
Fri Oct 27 14:54:33 CEST 2006


Bugs item #1583946, was opened at 2006-10-24 14:32
Message generated for change (Comment added) made by akuchling
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1583946&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: John Nagle (nagle)
Assigned to: Nobody/Anonymous (nobody)
Summary: SSL "issuer" and "server" names cannot be parsed

Initial Comment:
(Python 2.5 library)

    The Python SSL object offers two methods for
obtaining the info from an SSL certificate, "server()"
and "issuer()".  These return strings.

    The actual values in the certificate are a series
of key/value pairs in ASN.1 binary format.  But what
"server()" and "issuer()" return are single strings,
with the key/value pairs separated by "/". 

    However, "/" is a valid character in certificate
data. So parsing such strings is ambiguous, and
potentially exploitable.

    This is more than a theoretical problem.  The
issuer field of Verisign certificates has a "/" in the
middle of a text field:

"/O=VeriSign Trust Network/OU=VeriSign,
Inc./OU=VeriSign International Server CA - Class
3/OU=www.verisign.com/CPS Incorp.by Ref. LIABILITY
LTD.(c)97 VeriSign".

Note the 

  "OU=Terms of use at www.verisign.com/rpa (c)00"

with a "/" in the middle of the value field.  Oops.

    Worse, this is potentially exploitable.  By
ordering a low-level certificate with a "/" in the
right place, you can create the illusion (at least for
flawed implementations like this one) that the
certificate belongs to someone else.  Just order a
certificate from GoDaddy, enter something like this in
the "Name" field

    "Myphonyname/C=US/ST=California/L=San Jose/O=eBay
Inc./OU=Site Operations/CN=signin.ebay.com"

and Python code will be spoofed into thinking you're eBay.
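
A minimal sketch of the spoofing scenario described above, assuming an
application splits the one-line string on "/" and trusts the last CN it
finds.  The function name naive_parse and the "Attacker LLC" organization
are hypothetical; the Name value is the one quoted above.

```python
def naive_parse(oneline):
    """Split an X509_NAME_oneline-style string on "/" (the flawed approach)."""
    fields = []
    for part in oneline.split("/"):
        if not part:
            continue  # skip the empty piece before the leading "/"
        key, sep, value = part.partition("=")
        # A piece without "=" is really the tail of the previous value,
        # but a naive parser cannot know that.
        fields.append((key, value if sep else None))
    return fields


# The attacker controls only the text of the Name (CN) field.
spoofed_name = ("Myphonyname/C=US/ST=California/L=San Jose/O=eBay Inc."
                "/OU=Site Operations/CN=signin.ebay.com")
oneline = "/C=US/O=Attacker LLC/CN=" + spoofed_name

parsed = naive_parse(oneline)
last_cn = [value for key, value in parsed if key == "CN"][-1]
print(last_cn)  # prints signin.ebay.com
```

The single CN value comes back as seven separate fields, and nothing in
the string distinguishes them from fields the certificate authority
actually vouched for.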

   Fortunately, browsers don't use Python code.

   The actual bug is in

    python/trunk/Modules/_ssl.c

at

    if ((self->server_cert = SSL_get_peer_certificate(self->ssl))) {
        X509_NAME_oneline(X509_get_subject_name(self->server_cert),
                          self->server, X509_NAME_MAXLEN);
        X509_NAME_oneline(X509_get_issuer_name(self->server_cert),
                          self->issuer, X509_NAME_MAXLEN);

The "X509_NAME_oneline" function takes an X509_NAME
structure, which is the certificate system's
representation of a list, and flattens it into a
printable string.  This is a debug function, not one
for use in production code.  The SSL documentation for
"X509_NAME_oneline" says:

    "The functions X509_NAME_oneline() and
X509_NAME_print() are legacy functions which produce a
non standard output form, they don't handle multi
character fields and have various quirks and
inconsistencies.  Their use is strongly discouraged in
new applications."

What OpenSSL callers are supposed to do is call
X509_NAME_entry_count() to get the number of entries in
an X509_NAME structure, then get each entry with
X509_NAME_get_entry().  A few more calls will obtain
the name/value pair from the entry, as UTF8 strings,
which should be converted to Python Unicode strings.
OpenSSL has all the proper support, but Python's shim
doesn't interface to it correctly. 

X509_NAME_oneline() doesn't handle Unicode; it converts
non-ASCII values to "\xnn" format. Again, it's for
debug output only.
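
A rough illustration of that "\xnn" escaping, as a Python approximation
and not OpenSSL's actual code.  It assumes the value is stored as UTF-8
and that every byte outside the printable ASCII range is written as a
backslash, "x" and two uppercase hex digits; the function name
oneline_escape is hypothetical.

```python
def oneline_escape(value):
    """Approximate how a non-ASCII value looks after being flattened."""
    out = []
    for byte in bytearray(value.encode("utf-8")):
        if byte < 0x20 or byte > 0x7E:
            # Non-printable or non-ASCII byte: emit it as \xNN.
            out.append("\\x%02X" % byte)
        else:
            out.append(chr(byte))
    return "".join(out)


print(oneline_escape("M\u00fcller GmbH"))
```

The caller gets back a string of backslash escapes and has to guess the
original encoding to recover the real name.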

So what's needed are two new functions for Python's SSL
sockets to replace "issuer" and "server".  The new
functions should return lists of Unicode strings
representing the key/value pairs. (A list is needed,
not a dictionary; two strings with the same key
are both possible and common.)
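
To illustrate why a list is needed, here is a small sketch using the
fields of the VeriSign issuer quoted above.  The list-of-pairs shape is
only my assumption about what the new functions might return, not an
existing API.

```python
# Hypothetical return value: one (key, value) pair per certificate entry.
issuer = [
    ("O", "VeriSign Trust Network"),
    ("OU", "VeriSign, Inc."),
    ("OU", "VeriSign International Server CA - Class 3"),
    ("OU", "www.verisign.com/CPS Incorp.by Ref. LIABILITY LTD.(c)97 VeriSign"),
]

# The list keeps every entry, in certificate order.
ous = [value for key, value in issuer if key == "OU"]
print(len(ous))  # prints 3

# A dictionary silently keeps only the last value for each repeated key.
as_dict = dict(issuer)
print(len(as_dict))  # prints 2
```

Note that the "/" inside the last OU value causes no trouble here,
because the values are never flattened into one string.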

The reason this now matters is that new "high
assurance" certs, the ones that tell you how much a
site can be trusted, are now being deployed, and to use
them effectively, you need that info.  Support for them
is in Internet Explorer 7, so they're going to be
widespread soon. Python needs to catch up.

And, of course, this needs to be fixed as part of
Unicode support.  


                John Nagle
                Animats


----------------------------------------------------------------------

>Comment By: A.M. Kuchling (akuchling)
Date: 2006-10-27 08:54

Message:
Logged In: YES 
user_id=11375

I've reworded the description in the documentation to say
something like this: "Returns a string describing the issuer
of the server's certificate.  Useful for debugging purposes;
do not parse the content of this string because its format
can't be parsed unambiguously."

For adding new features: please submit a patch.  Python's
maintainers probably don't use SSL in any sophisticated way
and therefore have no idea what shape better SSL/X.509
support would take.



----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2006-10-25 14:05

Message:
Logged In: YES 
user_id=21627

Notice that RFC 2253 has been superseded by RFC 4514 (see my
earlier message). However, I really see no reason to fix this:
even if the ambiguity problems were fixed, you *still* should
not use the issuer and subject names in a security-relevant
context.


----------------------------------------------------------------------

Comment By: John Nagle (nagle)
Date: 2006-10-25 13:26

Message:
Logged In: YES 
user_id=5571

Actually, they don't do what they're "designed to do". 
According to the Python library documentation for SSL
objects, the server method "Returns a string containing the
ASN.1 distinguished name identifying the server's
certificate. (See below for an example showing what
distinguished names look like.)" The example "below" is
missing from the documentation, so the documentation gives
us no clue of what to expect.  

There are several standardized representations for ASN.1
information.  See
"http://www.oss.com/asn1/tutorial/Explain.html"  Most are
binary. The only standard textual form is "XER", which is an
XML representation of ASN.1 encoded information.  It's
essentially the same representation used for parameters in
SOAP. 

So, given the documentation and the standard, what should be
coming out is the XML representation of that data. 

Here's an entire X.509 certificate in XML:

http://www.gnu.org/software/gnutls/manual/html_node/An-X_002e509-certificate.html

The "issuer" field can be seen in there.  It's awfully
bulky.  And making SSL dependent on the SOAP module probably
isn't desirable.  But that's an ASN.1 distinguished name in
XML format, per the standard. 

That's probably not what's wanted by most users, although
the ability to retrieve an entire certificate in XML format
would be useful.

However, there's another standard string encoding, which is
defined in RFC2253.  This is comma-separated UTF-8 with
backslash escapes for special characters.  That's reliably
parseable. There's an OpenSSL function,
"X509_NAME_print_ex", which does this formatting, but it
doesn't output to a string.  That's the right mechanism if
it can be invoked in some way to yield a string.  It should
be invoked with flags = XN_FLAG_RFC2253 (which includes
ASN1_STRFLGS_RFC2253 for the value strings), which yields a
UTF8 string, which of course should become a Python Unicode
string.
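
As a sketch of why the comma-separated form is reliably parseable, here
is the escaping idea in Python.  This is not OpenSSL's output, only an
illustration under stated assumptions: the escaping rules follow my
reading of the RFC, each entry is a single key/value pair (no
multi-valued "+" entries), and the names escape_value and format_dn are
hypothetical.

```python
SPECIAL = set(',+"\\<>;')


def escape_value(value):
    """Backslash-escape the characters that would otherwise be ambiguous."""
    out = []
    last = len(value) - 1
    for i, ch in enumerate(value):
        if ch in SPECIAL:
            out.append("\\" + ch)
        elif ch == "\x00":
            out.append("\\00")
        elif i == 0 and ch in " #":
            out.append("\\" + ch)  # leading space or "#"
        elif i == last and ch == " ":
            out.append("\\" + ch)  # trailing space
        else:
            out.append(ch)
    return "".join(out)


def format_dn(pairs):
    """Join (key, value) pairs with commas, last certificate entry first."""
    return ",".join("%s=%s" % (key, escape_value(value))
                    for key, value in reversed(pairs))


subject = [
    ("C", "US"),
    ("O", "Example, Inc."),
    ("CN", "evil,CN=signin.ebay.com"),
]
print(format_dn(subject))
```

Because every separator character inside a value is escaped, a parser
that honors the backslashes recovers exactly three entries, and the
embedded ",CN=" stays part of the attacker's own CN value.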

Now if someone can figure out how to get a string, instead
of file output, out of OpenSSL's "X509_NAME_print_ex", we're
home. 

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2006-10-25 04:38

Message:
Logged In: YES 
user_id=21627

The bug is not in the server() and issuer() methods
(which do exactly what they are meant to do); the bug is in
applications which assume that the result of these methods
can be parsed. As you point out, it cannot. The functions,
as is, don't present a security problem. If their result is
presented as-is to the user, the user can determine herself
whether she recognizes the entity referred-to in the
distinguished name.

Notice that it is certainly possible to produce an
unambiguous string representation of a distinguished name;
RFC 4514 specifies an algorithm to do so (for use within LDAP).

Also notice that the SSL module does little to actually
support trust: there is no verification of server-side
certs, no access to extensions of a certificate, etc. So an
application and a user should *not* trust the issuer name it
received, anyway (unless there is an independent verification
that the server certificate can be trusted).

All that said: If you think you need this functionality,
please provide a patch to implement it.

----------------------------------------------------------------------

Comment By: John Nagle (nagle)
Date: 2006-10-24 18:40

Message:
Logged In: YES 
user_id=5571

The problem isn't in the version of OpenSSL used in Python,
which is at 0.9.8a.  OpenSSL has had the necessary functions
for years.  But Python isn't using them.

It's in  "python/trunk/Modules/_ssl.c", as described above.  

----------------------------------------------------------------------

Comment By: Gregory P. Smith (greg)
Date: 2006-10-24 18:05

Message:
Logged In: YES 
user_id=413

Yes, OpenSSL 0.9.8d or later should be used for a new binary
release.

http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-4343

http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-3738

http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-2940

http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-2937


----------------------------------------------------------------------
