About size of Unicode string
Fredrik Lundh
fredrik at pythonware.com
Mon Jun 13 06:48:07 EDT 2005
Frank Abel Cancio Bello wrote:
> Can I find out how many bytes a string object holds, independently of its encoding?
strings hold characters, not bytes. an encoding is used to convert a
stream of characters to a stream of bytes. if you need to know the
number of bytes needed to hold an encoded string, you need to know
the encoding.
(and in some cases, including UTF-8, you need to *do* the encoding
before you can tell how many bytes you get)
> Is the "len" function the right way to get it?
len() on the encoded string, yes.
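
As an illustration of the point above, here is a minimal sketch in Python 3 syntax (where the text type is `str` and encoded data is `bytes`; the original thread predates this). The helper name `encoded_size` and the sample text are made up for the example. It shows that the same seven characters take a different number of bytes under each encoding, and that the count only becomes known once the text has been encoded.

```python
def encoded_size(text, encoding):
    """Hypothetical helper: number of bytes `text` occupies once encoded."""
    return len(text.encode(encoding))


# Sample text chosen for illustration: c, a, f, e-acute, space, euro sign, 5
sample = "caf\u00e9 \u20ac5"

# len() on the text itself counts characters, not bytes
char_count = len(sample)

# len() on the encoded result counts bytes, and depends on the encoding
sizes = {
    enc: encoded_size(sample, enc)
    for enc in ("utf-8", "utf-16-le", "utf-32-le", "cp1252")
}

print("characters:", char_count)
for enc, size in sizes.items():
    print(enc, "->", size, "bytes")
```

In UTF-8 the accented letter takes two bytes and the euro sign three, which is why the byte count cannot be derived from the character count alone.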
> Laci, look at the following code:
>
> import urllib2
> request = urllib2.Request(url= 'http://localhost:6000')
> data = 'data to send\n'.encode('utf_8')
> request.add_data(data)
> request.add_header('content-length', str(len(data)))
> request.add_header('content-encoding', 'UTF-8')
> file = urllib2.urlopen(request)
>
> Is it always true that "the size of the entity-body" is "len(data)",
> independently of the encoding of "data"?
your data variable contains bytes, not characters, so the answer is "yes".
on the other hand, the add_header line for content-length isn't really
needed -- if you leave it out, urllib2 will add the content-length header
all by itself.
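
The automatic header can be checked without a server. The sketch below is an assumption-laden illustration, not part of the original exchange: it uses Python 3's `urllib.request` (the successor to `urllib2`, where the body is passed as `data=` instead of through `add_data`), and a hypothetical `CapturingHTTPHandler` that records the prepared request instead of sending it. The URL is the one from the question; nothing is actually connected to.

```python
import email.message
import io
import urllib.request
import urllib.response


class CapturingHTTPHandler(urllib.request.HTTPHandler):
    """Hypothetical test double: keeps the prepared request, sends nothing."""

    def __init__(self):
        super().__init__()
        self.captured = None

    def http_open(self, req):
        # By the time http_open runs, the inherited request pre-processing
        # step has already filled in any missing standard headers.
        self.captured = req
        response = urllib.response.addinfourl(
            io.BytesIO(b""), email.message.Message(), req.full_url, 200
        )
        response.msg = "OK"  # the default error processor reads this attribute
        return response


# Encode first: the request body is bytes, so len(data) is the entity size
data = "na\u00efve data\n".encode("utf-8")  # 11 characters, 12 bytes

# Note: no Content-Length header is set by hand
request = urllib.request.Request("http://localhost:6000", data=data)

handler = CapturingHTTPHandler()
opener = urllib.request.build_opener(handler)
opener.open(request)

content_length = handler.captured.get_header("Content-length")
print("Content-Length added by the library:", content_length)
print("len(data):", len(data))
```

The two printed values should agree, since the library computes the header from the length of the byte string it is given.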
</F>