unicode encoding usability problem

Mon Feb 21 00:31:41 EST 2005

On Sat, 19 Feb 2005 18:44:27 +0100, Fredrik Lundh <fredrik at pythonware.com>  
wrote:

> "aurora" <aurora00 at gmail.com> wrote:
>
>> I don't want to mix them. But how could I find them? How do I know
>> this statement can be a potential problem
>>
>>   if a==b:
>>
>> where a and b can be instantiated individually, far away from the line
>> of code that puts them together?
>
> if you don't know where a and b come from, how can you be sure that
> your program works at all?  how can you be sure they're both strings?
>
> ("a op b" can fail in many ways, depending on what "a", "b", and "op"
> are)
>

a and b are both strings. The issue is whether each one is an 8-bit string
or a unicode string.
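
To make the distinction concrete, here is a minimal sketch of a comparison
that refuses to mix the two kinds instead of silently comparing them. It
assumes a current Python 3 interpreter, where str is the unicode type and
bytes is the 8-bit type (in the Python of this thread those would be unicode
and str). The name strict_equal is hypothetical, not part of any library.

```python
def strict_equal(a, b):
    """Compare a and b, but raise if exactly one of them is a text string.

    Hypothetical helper: anything that is not str (for example bytes)
    counts as "not text" here.
    """
    a_is_text = isinstance(a, str)
    b_is_text = isinstance(b, str)
    if a_is_text != b_is_text:
        # One side is text and the other is not: report it loudly.
        raise TypeError(
            "mixed string kinds: %s vs %s"
            % (type(a).__name__, type(b).__name__)
        )
    return a == b
```

A guard like this only helps at the places where it is actually called, so
it narrows the search for mixed comparisons rather than replacing it.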

>> Things work fine and unit tests pass until the first non-ASCII
>> characters come in, and then the program breaks.
>
> if you have unit tests, why don't they include Unicode tests?
>
> </F>

How do I structure the test cases to guarantee coverage? It is not
practical to test every combination of unicode and 8-bit strings. Adding
non-ASCII characters to the test data would probably make problems show up
earlier, but it is arduous, and it is hard to spot any case you have left
out.
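
One possible way to keep the combinations manageable is to generate them
from a short list of samples rather than writing each case by hand. The
sketch below assumes a current Python 3 interpreter (str is unicode, bytes
is 8-bit) and UTF-8 data. SAMPLES, to_text, and variants are hypothetical
names made up for illustration; to_text stands in for whatever code is under
test.

```python
import unittest

# Hypothetical sample data: one ASCII case plus a few non-ASCII ones.
SAMPLES = ["plain", "caf\u00e9", "\u65e5\u672c\u8a9e", "\u0416\u0443\u043a"]


def to_text(value, encoding="utf-8"):
    """Hypothetical code under test: return text, decoding 8-bit input."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def variants(text, encoding="utf-8"):
    """Yield the same value as a text string and as an 8-bit string."""
    yield text
    yield text.encode(encoding)


class MixedStringTests(unittest.TestCase):
    def test_every_sample_in_both_forms(self):
        # Every sample is tried in all four text/8-bit pairings.
        for sample in SAMPLES:
            for a in variants(sample):
                for b in variants(sample):
                    self.assertEqual(to_text(a), to_text(b))
```

This does not guarantee coverage. It only makes sure that every sample added
to the list is exercised in both forms, so a left-out case is a missing
sample rather than a missing pairing.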