[Python-Dev] beta1 coming real soon
"Martin v. Löwis"
martin at v.loewis.de
Tue Jun 13 20:33:59 CEST 2006
Walter Dörwald wrote:
> And passing MB_ERR_INVALID_CHARS in a call to MultiByteToWideChar()
> doesn't help either, because AFAICT there's no information about the
> error location. What could work would be to try MultiByteToWideChar()
> with various string lengths to try to determine whether the error is due
> to an incomplete byte sequence or invalid data. But that sounds ugly and
> slow to me.
That's all true, yes.
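For what it's worth, the probing idea can be sketched in portable C. This is only an illustration under stated assumptions: strict_decode_ok() is a hypothetical stand-in for a MultiByteToWideChar() call with MB_ERR_INVALID_CHARS (it reports success or failure for the whole buffer and nothing else), the two-byte encoding rules are a toy and not a real code page, and probe() and MAX_CHAR_BYTES are names made up for the example.

```c
#include <stddef.h>

/* Hypothetical stand-in for MultiByteToWideChar(..., MB_ERR_INVALID_CHARS, ...):
   returns 1 if the whole buffer decodes, 0 otherwise, with no error position.
   Toy DBCS rules (an assumption, not a real code page):
   0x00-0x7F are single bytes; 0x81-0xFE are lead bytes that need a
   trail byte in 0x40-0xFE; every other byte is invalid. */
static int strict_decode_ok(const unsigned char *s, size_t n)
{
    size_t i = 0;
    while (i < n) {
        if (s[i] <= 0x7F) {
            i += 1;
        } else if (s[i] >= 0x81 && s[i] <= 0xFE) {
            if (i + 1 >= n || s[i + 1] < 0x40 || s[i + 1] > 0xFE)
                return 0;
            i += 2;
        } else {
            return 0;
        }
    }
    return 1;
}

enum probe_result { PROBE_OK, PROBE_INCOMPLETE, PROBE_INVALID };

/* Longest character in the toy encoding, in bytes. */
#define MAX_CHAR_BYTES 2

/* Retry with shorter lengths: if dropping fewer than MAX_CHAR_BYTES
   trailing bytes makes the buffer decode, guess "incomplete";
   otherwise guess "invalid". Each retry decodes the buffer again. */
static enum probe_result probe(const unsigned char *s, size_t n)
{
    size_t cut;
    if (strict_decode_ok(s, n))
        return PROBE_OK;
    for (cut = 1; cut < MAX_CHAR_BYTES && cut <= n; cut++) {
        if (strict_decode_ok(s, n - cut))
            return PROBE_INCOMPLETE;
    }
    return PROBE_INVALID;
}
```

Note that this is only a heuristic: a lone invalid byte at the very end of the buffer is reported as incomplete until more data arrives, and every probe decodes the whole buffer again, which is the "ugly and slow" part.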
>> but can't possibly work for ISO-2022.
>
> So does that mean that IsDBCSLeadByte() returns garbage in this case?
IsDBCSLeadByteEx is documented to only validate lead bytes for selected
code pages; MSDN versions differ in what these code pages are. The
current online version says
"This function validates leading byte values only in the following code
pages: 932, 936, 949, 950, and 1361."
whereas my January 2006 MSDN (DVD version) says
"IsDBCSLeadByteEx does not validate any lead byte in multi-byte
character set (MBCS) code pages, for example, code pages 52696, 54936,
51949 and 5022x."
Whether this limitation also applies to IsDBCSLeadByte, I cannot tell:
- maybe they forgot to document the limitation there as well
- maybe you can't use one of the unsupported code pages as CP_ACP,
  so the problem cannot occur
- maybe IsDBCSLeadByte does indeed work correctly in these cases, when
  IsDBCSLeadByteEx doesn't
The last is difficult to believe, though, as IsDBCSLeadByte is likely
implemented as
    BOOL IsDBCSLeadByte(BYTE TestChar)
    {
        return IsDBCSLeadByteEx(GetACP(), TestChar);
    }
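To illustrate what a lead-byte test can and cannot do for detecting a truncated buffer, here is a small portable sketch. It is not how any actual decoder is implemented: is_lead_byte() is a hypothetical predicate standing in for IsDBCSLeadByteEx(), its 0x81-0xFE range is an assumption rather than a real code page, and dangling_lead_bytes() is a made-up name.

```c
#include <stddef.h>

/* Hypothetical predicate standing in for IsDBCSLeadByteEx();
   the 0x81-0xFE range is an assumption, not a real code page. */
static int is_lead_byte(unsigned char c)
{
    return c >= 0x81 && c <= 0xFE;
}

/* Return 1 if the buffer ends in a lead byte whose trail byte is
   missing, else 0. The scan has to start at the beginning of the
   buffer: trail byte values may overlap the lead byte range, so
   looking at the last byte alone is not enough. */
static size_t dangling_lead_bytes(const unsigned char *s, size_t n)
{
    size_t i = 0;
    while (i < n) {
        if (is_lead_byte(s[i])) {
            if (i + 1 == n)
                return 1;   /* lead byte with nothing after it */
            i += 2;         /* skip lead byte and its trail byte */
        } else {
            i += 1;
        }
    }
    return 0;
}
```

A scan like this only makes sense for a plain double-byte encoding. ISO-2022 encodings are stateful and switch character sets with escape sequences, so a per-byte lead-byte test has nothing to go on there.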
Regards,
Martin