[ python-Bugs-1257525 ] Encodings iso8859_1 and latin_1 are redundant
SourceForge.net
noreply at sourceforge.net
Fri Aug 12 16:30:25 CEST 2005
Bugs item #1257525, was opened at 2005-08-12 14:22
Message generated for change (Comment added) made by lemburg
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1257525&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Unicode
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: liturgist (liturgist)
Assigned to: M.-A. Lemburg (lemburg)
Summary: Encodings iso8859_1 and latin_1 are redundant
Initial Comment:
./lib/encodings contains both:
iso8859_1.py
latin_1.py
Only one should be present. Martin says that latin_1
is faster. Using the 'iso' name would correlate better
with the other ISO encodings provided.
If the latin_1 code is faster, then it should be in the
iso8859_1.py file. If an automated process produces
the 'iso*' encodings, then it should either produce the
faster code or stop producing iso8859_1.
Regardless, one of the files should be removed.
----------------------------------------------------------------------
>Comment By: M.-A. Lemburg (lemburg)
Date: 2005-08-12 16:30
Message:
Logged In: YES
user_id=38388
To answer your questions:
Yes, the encoding is the same for both latin-1 and iso8859-1.
Specifying latin-1 instead of iso8859-1 will allow the code
to use short-cuts.
You have to grep for 'latin-1'.
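
A minimal sketch of the equivalence claim above, assuming a current Python 3 interpreter rather than the 2.4 tree under discussion. The helper name same_bytes and the list of spellings are illustrative only; they do not come from the tracker item.

```python
# Check that several spellings of the Latin-1 codec name produce
# identical bytes for every code point the encoding can represent.

def same_bytes(text, names=("latin_1", "iso8859_1", "latin-1", "iso-8859-1")):
    """Return True if every codec name in `names` encodes `text` identically."""
    encoded = {name: text.encode(name) for name in names}
    return len(set(encoded.values())) == 1

# All 256 code points that Latin-1 / ISO 8859-1 covers.
sample = "".join(chr(i) for i in range(256))

print(same_bytes(sample))
```

This only shows that the output is the same; it says nothing about which name takes the faster path internally.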
----------------------------------------------------------------------
Comment By: liturgist (liturgist)
Date: 2005-08-12 16:01
Message:
Logged In: YES
user_id=197677
Where could one see some of the "shortcuts" in the Unicode
integration code that make using "latin_1" faster in the
runtime? I grepped *.py and *.c, but could not readily
identify any candidates.
----------------------------------------------------------------------
Comment By: liturgist (liturgist)
Date: 2005-08-12 15:12
Message:
Logged In: YES
user_id=197677
Ok. How about if we specify iso8859_1 as "(see latin_1)" in
the documentation?
The code will work the same regardless of which encoding
name the developer uses. Right?
----------------------------------------------------------------------
Comment By: M.-A. Lemburg (lemburg)
Date: 2005-08-12 14:49
Message:
Logged In: YES
user_id=38388
Good point.
The iso8859_1.py codec should be removed and added as an alias
to latin-1.
Martin is right: the latin-1 codec is not only faster, but
the Unicode integration code also has a lot of short-cuts
for the "latin-1" encoding, so overall performance is better
if you use that name for the encoding.
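
A hedged sketch of how an alias arrangement like the one proposed above can be inspected, assuming a current Python 3 interpreter (the 2.4 tree discussed in this thread may behave differently). The helper name resolve is hypothetical; encodings.aliases.aliases and codecs.lookup are standard library names.

```python
import codecs
from encodings import aliases

def resolve(name):
    """Return (alias table target or None, name reported by the codec registry)."""
    # The alias table is keyed by lowercase names with underscores.
    normalized = name.lower().replace("-", "_")
    return aliases.aliases.get(normalized), codecs.lookup(name).name

for spelling in ("iso8859_1", "latin_1", "latin-1"):
    print(spelling, resolve(spelling))
```

If an alias entry exists for a spelling, the first element names the module that actually serves it; the second element is whatever name the resolved codec reports for itself.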
----------------------------------------------------------------------