[Python-checkins] r42537 - in python/branches/release24-maint: Misc/NEWS Parser/tokenizer.c

neal.norwitz python-checkins at python.org
Tue Feb 21 10:19:47 CET 2006


Author: neal.norwitz
Date: Tue Feb 21 10:19:45 2006
New Revision: 42537

Modified:
   python/branches/release24-maint/Misc/NEWS
   python/branches/release24-maint/Parser/tokenizer.c
Log:
Backport 41753:
  Bug #1378022, UTF-8 files with a leading BOM crashed the interpreter.
  Also bug #1435487 (dup).


Modified: python/branches/release24-maint/Misc/NEWS
==============================================================================
--- python/branches/release24-maint/Misc/NEWS	(original)
+++ python/branches/release24-maint/Misc/NEWS	Tue Feb 21 10:19:45 2006
@@ -12,6 +12,8 @@
 Core and builtins
 -----------------
 
+- Bug #1378022, UTF-8 files with a leading BOM crashed the interpreter.
+
 - Patch #1400181, fix unicode string formatting to not use the locale.
   This is how string objects work.  u'%f' could use , instead of .
   for the decimal point.  Now both strings and unicode always use periods.

Modified: python/branches/release24-maint/Parser/tokenizer.c
==============================================================================
--- python/branches/release24-maint/Parser/tokenizer.c	(original)
+++ python/branches/release24-maint/Parser/tokenizer.c	Tue Feb 21 10:19:45 2006
@@ -289,6 +289,12 @@
 			PyMem_DEL(cs);
 		}
 	}
+	if (!r) {
+		cs = tok->encoding;
+		if (!cs)
+			cs = "with BOM";
+		PyErr_Format(PyExc_SyntaxError, "encoding problem: %s", cs);
+	}
 	return r;
 }
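
The hunk above raises a SyntaxError whenever decoding failed (`r` is NULL), and it falls back to the label "with BOM" when no encoding name was recorded. A minimal standalone sketch of that message-building logic follows. The helper names `encoding_label` and `format_encoding_problem` are hypothetical and are not part of CPython; the sketch uses `snprintf` in place of `PyErr_Format` so it can be compiled without the interpreter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not a CPython function): mirrors the patch's
 * fallback, where a missing encoding name is reported as "with BOM". */
const char *
encoding_label(const char *encoding)
{
    return encoding != NULL ? encoding : "with BOM";
}

/* Hypothetical helper (not a CPython function): builds the same text the
 * patch passes to PyErr_Format, writing it into a caller-supplied buffer.
 * Returns the number of characters the full message needs, as snprintf does. */
int
format_encoding_problem(char *buf, size_t size, const char *encoding)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "encoding problem: %s",
                    encoding_label(encoding));
}
```

Called with a NULL encoding, the sketch produces "encoding problem: with BOM"; called with a name such as "utf-8", it produces "encoding problem: utf-8".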
 

