[issue10581] Review and document string format accepted in numeric data type constructors

Marc-Andre Lemburg report at bugs.python.org
Fri Jun 14 09:53:40 CEST 2013


Marc-Andre Lemburg added the comment:

On 14.06.2013 03:43, Alexander Belopolsky wrote:
> 
> Alexander Belopolsky added the comment:
> 
> The PEP 393 implementation has already added a fast path to the decimal encoding:
> 
> http://hg.python.org/cpython/diff/8beaa9a37387/Objects/unicodeobject.c#l1.3735
> 
> What we can do, however, is improve the performance of converting non-ASCII numerals by looking up only the first digit's value and converting the rest with a simple offset calculation:
> 
> value = code - (first_code - first_value)
> if not 0 <= value < 10:
>     raise ValueError(code)  # or fall back to a UCD lookup
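
To make the quoted offset idea concrete, here is a minimal Python sketch.
It is only an illustration, not the code in unicodeobject.c. The helper
name decimal_digits_to_ascii is hypothetical, and the sketch assumes that
the ten decimal digits of a script occupy consecutive code points, so that
one UCD lookup on the first digit is enough to locate the script's zero.

```python
import unicodedata


def decimal_digits_to_ascii(text):
    """Hypothetical helper: map a run of Unicode decimal digits to ASCII digits."""
    if not text:
        raise ValueError("empty digit string")
    first_code = ord(text[0])
    # Single UCD lookup; raises ValueError if the first char is not a decimal digit.
    first_value = unicodedata.decimal(text[0])
    zero_code = first_code - first_value  # code point of this script's digit zero
    out = []
    for ch in text:
        value = ord(ch) - zero_code
        if not 0 <= value < 10:
            # Different script or not a digit at all: fall back to the UCD lookup,
            # which raises ValueError for non-digits.
            value = unicodedata.decimal(ch)
        out.append(chr(ord("0") + value))
    return "".join(out)
```

For a string of Arabic-Indic digits, only the first character goes through
unicodedata.decimal(); the remaining digits are converted by subtraction.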

I'm not sure whether just relying on PEP 393 is good enough.

Of course, you can special-case the conversion based on the
string kind, but that's only one form of optimization.

Slicing operations don't recheck the max code point
used in the substring. As a result, a slice may very well
be of the UCS2 kind, even though the text itself is ASCII.

Apart from the fast path based on the string kind,
I think the decimal encoder would also have to scan the
string for non-ASCII code points. If it finds non-ASCII
code points, it would have to call the normalizer and
restart the scan based on the normalized string.
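
A rough Python sketch of the scan-then-normalize flow described above,
independent of the string kind. The names normalize_numeric and
encode_decimal are hypothetical, and the exact rules of the normalizer
(Unicode decimal digits become ASCII digits, Unicode whitespace becomes a
plain space) are an assumption made for illustration.

```python
import unicodedata


def normalize_numeric(text):
    """Hypothetical normalizer: Unicode digits -> ASCII digits, Unicode spaces -> ' '."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        elif ch.isspace():
            out.append(" ")
        else:
            digit = unicodedata.decimal(ch, None)
            if digit is None:
                raise ValueError(f"invalid character {ch!r} in numeric string")
            out.append(str(digit))
    return "".join(out)


def encode_decimal(text):
    """Return ASCII bytes suitable for a numeric parser."""
    # Scan the actual code points instead of trusting the storage kind.
    if any(ord(ch) > 127 for ch in text):
        # Non-ASCII found: normalize, then continue with the normalized string.
        text = normalize_numeric(text)
    return text.encode("ascii")
```

Pure ASCII input never reaches the normalizer, so the common case pays
only for the scan.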

-- 
Marc-Andre Lemburg
eGenix.com

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue10581>
_______________________________________

