byte count unicode string

Wed Sep 20 04:26:20 EDT 2006

MonkeeSage wrote:
> John Machin wrote:
>> The answer is, "You can't", and the rationale would have to be that
>> nobody thought of a use case for counting the length of the UTF-8 form
>> but not creating the UTF-8 form. What is your use case?
> 
> Playing DA here, what if you need to send the byte-count on a server
> via a header, but need the utf8 representation for the actual data?

So what - you need it in the end, don't you?

The runtime complexity of the calculation will be the same either way: 
whether you encode or only count, you have to consider each character, 
so it's O(n).
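
To illustrate the O(n) point, here is a minimal sketch of counting the 
UTF-8 length without building the encoded form. It assumes Python 3, 
where str is a sequence of code points; the name utf8_byte_count is made 
up for this example and is not a builtin. Lone surrogates, which a strict 
encode would reject, are simply counted as three bytes here.

```python
def utf8_byte_count(text):
    """Return the number of bytes text would occupy in UTF-8.

    Hypothetical helper: one pass over the code points, no encoded copy.
    """
    total = 0
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:          # ASCII range: 1 byte
            total += 1
        elif cp < 0x800:       # up to U+07FF: 2 bytes
            total += 2
        elif cp < 0x10000:     # rest of the BMP: 3 bytes
            total += 3
        else:                  # supplementary planes: 4 bytes
            total += 4
    return total


# Sanity check against the real encoder.
sample = "h\u00e9llo \u20ac"
assert utf8_byte_count(sample) == len(sample.encode("utf-8"))
```

Both this loop and a plain encode touch every character once, so neither 
beats the other in complexity; the loop only avoids allocating the bytes.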

Of course, encoding adds a second copy of the data next to the original 
unicode object, which is stored internally as UCS-2 or UCS-4, so memory 
consumption can roughly double.

But then - if that really is a problem, how would you work with that 
string anyway?

In that case you have to resort to slicing: encode the string piece by 
piece and sum the sizes of the parts, which remedies the memory problem 
easily.
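
A rough sketch of that slicing approach, again assuming Python 3 (where 
slicing a str never splits a code point). The function name and the 
default chunk size are arbitrary choices for illustration.

```python
def utf8_byte_count_chunked(text, chunk_size=4096):
    """Sum the UTF-8 sizes of fixed-size slices of text.

    Hypothetical helper: only one encoded chunk is alive at a time, so the
    extra memory is bounded by the chunk size, not by the whole string.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    total = 0
    for start in range(0, len(text), chunk_size):
        total += len(text[start:start + chunk_size].encode("utf-8"))
    return total


# The chunked total matches encoding the whole string at once.
big = "gr\u00fc\u00dfe \u20ac " * 1000
assert utf8_byte_count_chunked(big, chunk_size=100) == len(big.encode("utf-8"))
```

Each slice is itself a temporary copy, so the peak overhead is one slice 
plus its encoded bytes.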

Diez