[Python-Dev] Should ftplib use UTF-8 instead of latin-1 encoding?

Brett Cannon bcannon at gmail.com
Fri Jan 23 19:15:18 CET 2009


On Fri, Jan 23, 2009 at 00:31, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Giampaolo Rodola' wrote:
>> Hi,
>> while attempting to port pyftpdlib [1] to Python 3 I have noticed that
>> ftplib differs from the previous 2.x version in that it uses latin-1
>> to encode everything it sends over the FTP command channel, but by
>> reading RFC-2640 [2] it seems that UTF-8 should be preferred instead.
>> I'm far from being an expert on encodings, plus the RFC is quite hard
>> to understand, so sorry in advance if I have misunderstood the whole
>> thing.
>
> I read it that a conforming client MUST issue a FEAT command, to
> determine whether the server supports UTF8. One would have to go
> back to the original FTP RFC, but it seems that, in the absence
> of server UTF8 support, all path names must be 7-bit clean (which
> means that ASCII should be the default encoding).
>
> In any case, Brett changed the encoding to latin1 in r58378, maybe
> he can comment.
>

If I remember correctly, something along the lines of Martin's comment
about being 7-bit clean is required, but some servers don't follow the
standard, so I swapped it to Latin-1. But that was so long ago that I
don't remember where in the RFC I gleaned the details from. If I misread
the RFC and it is UTF-8, then all the better, to make more of the world
move over to Unicode.
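
For illustration, the FEAT check Martin describes could look roughly like
the sketch below. This is not what ftplib does; choose_encoding and its
fallback parameter are hypothetical names, and the parsing assumes the
FEAT reply has the RFC 2389 shape, where each feature line starts with a
single space. The fallback defaults to ASCII per Martin's reading; passing
"latin-1" would mirror the more lenient choice made in r58378.

```python
def choose_encoding(feat_reply, fallback="ascii"):
    """Pick a command-channel encoding from the text of a FEAT reply.

    Hypothetical helper: returns "utf-8" if the server lists the UTF8
    feature, otherwise the given fallback.
    """
    for line in feat_reply.splitlines():
        # Feature lines are indented by one space; the status lines
        # ("211-Features:" and "211 End") are not.
        if not line.startswith(" "):
            continue
        words = line.split()
        if words and words[0].upper() == "UTF8":
            return "utf-8"
    return fallback


if __name__ == "__main__":
    sample = "211-Features:\n MDTM\n UTF8\n211 End"
    print(choose_encoding(sample))
```

With a connected ftplib.FTP instance, one might feed it the string
returned by ftp.sendcmd("FEAT") and assign the result to ftp.encoding.
Servers that do not implement FEAT answer with an error reply, which
sendcmd raises as ftplib.error_perm, so a caller would presumably catch
that and use the fallback.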

-Brett

