[Python] Pourquoi certaines comparaisons float <integer quatre fois plus lent que d'autres?


Answers

En utilisant gmpy2 avec des flottants et des entiers de précision arbitraires, il est possible d'obtenir des performances de comparaison plus uniformes:

~ $ ptipython
Python 3.5.1 |Anaconda 4.0.0 (64-bit)| (default, Dec  7 2015, 11:16:01) 
Type "copyright", "credits" or "license" for more information.

IPython 4.1.2 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: import gmpy2

In [2]: from gmpy2 import mpfr

In [3]: from gmpy2 import mpz

In [4]: gmpy2.get_context().precision=200

In [5]: i1=562949953421000

In [6]: i2=562949953422000

In [7]: f=562949953420000.7

In [8]: i11=mpz('562949953421000')

In [9]: i12=mpz('562949953422000')

In [10]: f1=mpfr('562949953420000.7')

In [11]: f<i1
Out[11]: True

In [12]: f<i2
Out[12]: True

In [13]: f1<i11
Out[13]: True

In [14]: f1<i12
Out[14]: True

In [15]: %timeit f<i1
The slowest run took 10.15 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 441 ns per loop

In [16]: %timeit f<i2
The slowest run took 12.55 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 152 ns per loop

In [17]: %timeit f1<i11
The slowest run took 32.04 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 269 ns per loop

In [18]: %timeit f1<i12
The slowest run took 36.81 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 231 ns per loop

In [19]: %timeit f<i11
The slowest run took 78.26 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 156 ns per loop

In [20]: %timeit f<i12
The slowest run took 21.24 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 194 ns per loop

In [21]: %timeit f1<i1
The slowest run took 37.61 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 275 ns per loop

In [22]: %timeit f1<i2
The slowest run took 39.03 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 259 ns per loop
Question

Lors de la comparaison de flottants à des entiers, certaines paires de valeurs prennent beaucoup plus de temps à être évaluées que d'autres valeurs d'une amplitude similaire.

Par exemple:

>>> import timeit
>>> timeit.timeit("562949953420000.7 < 562949953421000") # run 1 million times
0.5387085462592742

Mais si le flottant ou l'entier est plus petit ou plus grand d'une certaine quantité, la comparaison s'exécute beaucoup plus rapidement:

>>> timeit.timeit("562949953420000.7 < 562949953422000") # integer increased by 1000
0.1481498428446173
>>> timeit.timeit("562949953423001.8 < 562949953421000") # float increased by 3001.1
0.1459577925548956

Changer l'opérateur de comparaison (par exemple en utilisant == ou > place) n'affecte pas les heures de manière notable.

Ce n'est pas seulement lié à l'ampleur, car choisir des valeurs plus grandes ou plus petites peut entraîner des comparaisons plus rapides, donc je soupçonne que c'est un peu malheureux que les bits s'alignent.

Clairement, la comparaison de ces valeurs est plus que suffisante pour la plupart des cas d'utilisation. Je suis simplement curieux de savoir pourquoi Python semble avoir plus de difficultés avec certaines paires de valeurs qu'avec d'autres.