[Python-Dev] XML codec?

"Martin v. Löwis" martin at v.loewis.de
Fri Nov 9 13:01:57 CET 2007


> Because you can force the encoder to use a specified encoding. If you do
> this and the unicode string starts with an XML declaration

So what if the unicode string doesn't start with an XML declaration?
Will it add one? If so, what version number will it use?

>>> OK, so should I put the C code into a _xml module?
>> I don't see the need for C code at all.
> 
> Doing the bit fiddling for
> Modules/_codecsmodule.c::detect_xml_encoding_str() in C felt like the
> right thing to do.

Hmm. I don't think a sequence like

+    if (strlen>0)
+    {
+        if (*str++ != '<')
+            return 1;
+        if (strlen>1)
+        {
+            if (*str++ != '?')
+                return 1;
+            if (strlen>2)
+            {
+                if (*str++ != 'x')
+                    return 1;
+                if (strlen>3)
+                {
+                    if (*str++ != 'm')
+                        return 1;
+                    if (strlen>4)
+                    {
+                        if (*str++ != 'l')
+                            return 1;
+                        if (strlen>5)
+                        {
+                            if (*str != ' ' && *str != '\t' && *str != '\r' && *str != '\n')
+                                return 1;

is well-maintainable C. I feel it is much better to write

  if not s.startswith("<?xml"):
     return 1

What bit fiddling are you referring to specifically that you think
is better done in C than in Python?
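
As a rough sketch only (not taken from the patch), the nested checks quoted
above could be written in Python along the following lines. It assumes the
intent of the C code is to reject input that cannot begin with an XML
declaration while staying undecided on input that is still too short. The
function name and the True/False/None return convention are hypothetical
and chosen purely for illustration.

```python
XML_DECL_PREFIX = "<?xml"
XML_WHITESPACE = " \t\r\n"


def starts_like_xml_declaration(s):
    """Hypothetical helper mirroring the nested C checks.

    Returns False if s cannot begin with an XML declaration,
    True if it begins with "<?xml" followed by whitespace,
    and None if s is too short to decide either way.
    """
    # Compare only as many characters as are available so far.
    head = s[:len(XML_DECL_PREFIX)]
    if not XML_DECL_PREFIX.startswith(head):
        return False
    # The prefix matches so far, but the character after it is not there yet.
    if len(s) <= len(XML_DECL_PREFIX):
        return None
    # "<?xml" must be followed by whitespace, as in the innermost C test.
    return s[len(XML_DECL_PREFIX)] in XML_WHITESPACE
```

With this convention, starts_like_xml_declaration("<?x") stays undecided
(None), while "<?xml-stylesheet" is rejected because the character after
the prefix is not whitespace.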

Regards,
Martin

