python - एन के नीचे सभी प्राइम्स सूचीबद्ध करने का सबसे तेज़ तरीका




math optimization (20)

यह सबसे अच्छा एल्गोरिदम है जो मैं आ सकता हूं।

def get_primes(n):
    numbers = set(range(n, 1, -1))
    primes = []
    while numbers:
        p = numbers.pop()
        primes.append(p)
        numbers.difference_update(set(range(p*2, n+1, p)))
    return primes

>>> timeit.Timer(stmt='get_primes.get_primes(1000000)', setup='import   get_primes').timeit(1)
1.1499958793645562

क्या इसे और भी तेज बनाया जा सकता है?

इस कोड में एक दोष है: चूंकि numbers एक असाधारण सेट है, इसलिए कोई गारंटी नहीं है कि numbers.pop() सेट से सबसे कम संख्या को हटा देगा। फिर भी, यह कुछ इनपुट संख्याओं के लिए (कम से कम मेरे लिए) काम करता है:

>>> sum(get_primes(2000000))
142913828922L
#That's the correct sum of all numbers below 2 million
>>> 529 in get_primes(1000)
False
>>> 529 in get_primes(530)
True

अपने स्वयं के प्राइम फाइंडिंग कोड को लिखना निर्देशक है, लेकिन हाथ में तेजी से विश्वसनीय लाइब्रेरी रखना भी उपयोगी है। मैंने सी ++ लाइब्रेरी प्राइमवीव के चारों ओर एक रैपर लिखा, जिसका नाम primesieve-python

इसे pip install primesieve कोशिश करें

import primesieve
primes = primesieve.generate_primes(10**8)

मैं तुलना की गति को देखने के लिए उत्सुक होगा।


एन <9,080,191 धारणा पर मिलर-राबिन की प्रारंभिक परीक्षा का एक निर्धारिक कार्यान्वयन

import sys
import random

def miller_rabin_pass(a, n):
    d = n - 1
    s = 0
    while d % 2 == 0:
        d >>= 1
        s += 1

    a_to_power = pow(a, d, n)
    if a_to_power == 1:
        return True
    for i in xrange(s-1):
        if a_to_power == n - 1:
            return True
        a_to_power = (a_to_power * a_to_power) % n
    return a_to_power == n - 1


def miller_rabin(n):
    for a in [2, 3, 37, 73]:
      if not miller_rabin_pass(a, n):
        return False
    return True


n = int(sys.argv[1])
primes = [2]
for p in range(3,n,2):
  if miller_rabin(p):
    primes.append(p)
print len(primes)

विकिपीडिया ( http://en.wikipedia.org/wiki/Miller–Rabin_primality_test ) पर आलेख के मुताबिक एन = 9 380, 1 9 1 के लिए = 2,3,37, और 73 यह तय करने के लिए पर्याप्त है कि एन संयुक्त है या नहीं।

और मैंने मूल मिलर-राबिन के परीक्षण के संभाव्य कार्यान्वयन से स्रोत कोड को अनुकूलित किया: http://en.literateprograms.org/Miller-Rabin_primality_test_(Python)


पर्याप्त रूप से बड़े एन के साथ वास्तव में सबसे तेज़ समाधान के लिए प्राइम की पूर्व-गणना की सूची डाउनलोड करना होगा, इसे एक टुपल के रूप में स्टोर करें और ऐसा कुछ करें:

for pos,i in enumerate(primes):
    if i > N:
        print primes[:pos]

यदि N > primes[-1] केवल तभी अधिक प्राइम की गणना करें और अपनी कोड में नई सूची को सहेजें, तो अगली बार यह उतनी ही तेज होगी।

हमेशा बॉक्स के बाहर सोचो।


पायथन 3 के लिए

def rwh_primes2(n):
    correction = (n%6>1)
    n = {0:n,1:n-1,2:n+4,3:n+3,4:n+2,5:n+1}[n%6]
    sieve = [True] * (n//3)
    sieve[0] = False
    for i in range(int(n**0.5)//3+1):
      if sieve[i]:
        k=3*i+1|1
        sieve[      ((k*k)//3)      ::2*k]=[False]*((n//6-(k*k)//6-1)//k+1)
        sieve[(k*k+4*k-2*k*(i&1))//3::2*k]=[False]*((n//6-(k*k+4*k-2*k*(i&1))//6-1)//k+1)
    return [2,3] + [3*i+1|1 for i in range(1,n//3-correction) if sieve[i]]

यदि आप पहिया को फिर से शुरू नहीं करना चाहते हैं, तो आप प्रतीकात्मक गणित लाइब्रेरी sympy स्थापित कर सकते हैं (हाँ यह पाइथन 3 संगत है)

pip install sympy

और primerange फ़ंक्शन का उपयोग करें

from sympy import sieve
primes = list(sieve.primerange(1, 10**6))

यदि आपके पास एन पर नियंत्रण है, तो सभी प्राइम्स को सूचीबद्ध करने का सबसे तेज़ तरीका उनको प्रीकंप्यूट करना है। गंभीरता से। प्रीकंप्यूटिंग एक नजरअंदाज ऑप्टिमाइज़ेशन है।


यहां वह कोड है जिसका उपयोग मैं आमतौर पर पायथन में प्राइम उत्पन्न करने के लिए करता हूं:

$ python -mtimeit -s'import sieve' 'sieve.sieve(1000000)' 
10 loops, best of 3: 445 msec per loop
$ cat sieve.py
from math import sqrt

def sieve(size):
 prime=[True]*size
 rng=xrange
 limit=int(sqrt(size))

 for i in rng(3,limit+1,+2):
  if prime[i]:
   prime[i*i::+i]=[False]*len(prime[i*i::+i])

 return [2]+[i for i in rng(3,size,+2) if prime[i]]

if __name__=='__main__':
 print sieve(100)

यह यहां पोस्ट किए गए तेज़ समाधानों से प्रतिस्पर्धा नहीं कर सकता है, लेकिन कम से कम यह शुद्ध पायथन है।

इस सवाल को पोस्ट करने के लिए धन्यवाद। मैंने वास्तव में बहुत कुछ सीखा।


संबंधित प्रश्न (प्राइम जेनरेटर और बेंचमार्क सहित) से निपटना:
पायथन में बिटस्टिंग / बिट ऑपरेशंस को गति दें?

तेज़ और अधिक मेमोरी-वार शुद्ध पायथन कोड:

def primes(n):
    """ Returns  a list of primes < n """
    sieve = [True] * n
    for i in xrange(3,int(n**0.5)+1,2):
        if sieve[i]:
            sieve[i*i::2*i]=[False]*((n-i*i-1)/(2*i)+1)
    return [2] + [i for i in xrange(3,n,2) if sieve[i]]

या आधे चलनी से शुरू करें

def primes1(n):
    """ Returns  a list of primes < n """
    sieve = [True] * (n/2)
    for i in xrange(3,int(n**0.5)+1,2):
        if sieve[i/2]:
            sieve[i*i/2::i] = [False] * ((n-i*i-1)/(2*i)+1)
    return [2] + [2*i+1 for i in xrange(1,n/2) if sieve[i]]

तेज और अधिक स्मृति-आधारित numpy कोड:

import numpy
def primesfrom3to(n):
    """ Returns a array of primes, 3 <= p < n """
    sieve = numpy.ones(n/2, dtype=numpy.bool)
    for i in xrange(3,int(n**0.5)+1,2):
        if sieve[i/2]:
            sieve[i*i/2::i] = False
    return 2*numpy.nonzero(sieve)[0][1::]+1

एक चलनी के एक तिहाई से शुरू एक तेज विविधता:

import numpy
def primesfrom2to(n):
    """ Input n>=6, Returns a array of primes, 2 <= p < n """
    sieve = numpy.ones(n/3 + (n%6==2), dtype=numpy.bool)
    for i in xrange(1,int(n**0.5)/3+1):
        if sieve[i]:
            k=3*i+1|1
            sieve[       k*k/3     ::2*k] = False
            sieve[k*(k-2*(i&1)+4)/3::2*k] = False
    return numpy.r_[2,3,((3*numpy.nonzero(sieve)[0][1:]+1)|1)]

उपरोक्त कोड का एक (हार्ड-टू-कोड) शुद्ध-पायथन संस्करण होगा:

def primes2(n):
    """ Input n>=6, Returns a list of primes, 2 <= p < n """
    n, correction = n-n%6+6, 2-(n%6>1)
    sieve = [True] * (n/3)
    for i in xrange(1,int(n**0.5)/3+1):
      if sieve[i]:
        k=3*i+1|1
        sieve[      k*k/3      ::2*k] = [False] * ((n/6-k*k/6-1)/k+1)
        sieve[k*(k-2*(i&1)+4)/3::2*k] = [False] * ((n/6-k*(k-2*(i&1)+4)/6-1)/k+1)
    return [2,3] + [3*i+1|1 for i in xrange(1,n/3-correction) if sieve[i]]

दुर्भाग्यवश शुद्ध-पायथन असाइनमेंट करने के सरल और तेज संख्यात्मक तरीके को अपनाते नहीं हैं, और लूप के अंदर लेंस () को कॉल करते हैं [False]*len(sieve[((k*k)/3)::2*k]) बहुत धीमी है। इसलिए मुझे इनपुट को सही करने के लिए सुधार करना पड़ा (और अधिक गणित से बचें) और कुछ चरम (दर्दनाक) गणित-जादू करें।
व्यक्तिगत रूप से मुझे लगता है कि यह एक शर्म की बात है कि numpy (जो कि व्यापक रूप से उपयोग किया जाता है) पायथन मानक पुस्तकालय (पायथन 3 रिलीज के बाद 2 साल और कोई numpy संगतता) का हिस्सा नहीं है, और वाक्यविन्यास और गति में सुधार पूरी तरह से अनदेखा प्रतीत होता है पायथन डेवलपर्स।


सुंदरम की चलनी का उपयोग करते हुए, मुझे लगता है कि मैंने शुद्ध-पायथन के रिकॉर्ड को तोड़ दिया:

def sundaram3(max_n):
    numbers = range(3, max_n+1, 2)
    half = (max_n)//2
    initial = 4

    for step in xrange(3, max_n+1, 2):
        for i in xrange(initial, half, step):
            numbers[i-1] = 0
        initial += 2*(step+1)

        if initial > half:
            return [2] + filter(None, numbers)

Comparasion:

C:\USERS>python -m timeit -n10 -s "import get_primes" "get_primes.get_primes_erat(1000000)"
10 loops, best of 3: 710 msec per loop

C:\USERS>python -m timeit -n10 -s "import get_primes" "get_primes.daniel_sieve_2(1000000)"
10 loops, best of 3: 435 msec per loop

C:\USERS>python -m timeit -n10 -s "import get_primes" "get_primes.sundaram3(1000000)"
10 loops, best of 3: 327 msec per loop

चेतावनी: हार्डवेयर या पायथन के संस्करण में मतभेदों के कारण समय-समय पर परिणाम भिन्न हो सकते हैं।

नीचे एक स्क्रिप्ट है जो कई कार्यान्वयन की तुलना करती है:

मेरे ध्यान में sieve_wheel_30 लाने के लिए stephan लिए बहुत धन्यवाद। प्रिम्स रॉमर्ट rwh_primes को primesfrom2to, primesfrom3to, rwh_primes, rwh_primes1, और rwh_primes2 के लिए जाता है।

सादा पायथन विधियों का परीक्षण, psyco के साथ , n = 1000000 के लिए, rwh_primes1 सबसे तेज़ परीक्षण था।

+---------------------+-------+
| Method              | ms    |
+---------------------+-------+
| rwh_primes1         | 43.0  |
| sieveOfAtkin        | 46.4  |
| rwh_primes          | 57.4  |
| sieve_wheel_30      | 63.0  |
| rwh_primes2         | 67.8  |    
| sieveOfEratosthenes | 147.0 |
| ambi_sieve_plain    | 152.0 |
| sundaram3           | 194.0 |
+---------------------+-------+

सादा पायथन विधियों का परीक्षण, बिना psyco के , n = 1000000 के लिए, rwh_primes2 सबसे तेज़ था।

+---------------------+-------+
| Method              | ms    |
+---------------------+-------+
| rwh_primes2         | 68.1  |
| rwh_primes1         | 93.7  |
| rwh_primes          | 94.6  |
| sieve_wheel_30      | 97.4  |
| sieveOfEratosthenes | 178.0 |
| ambi_sieve_plain    | 286.0 |
| sieveOfAtkin        | 314.0 |
| sundaram3           | 416.0 |
+---------------------+-------+

परीक्षण की सभी विधियों में से, n = 1000000 के लिए numpy की अनुमति , primesfrom2to सबसे तेज़ परीक्षण किया गया था।

+---------------------+-------+
| Method              | ms    |
+---------------------+-------+
| primesfrom2to       | 15.9  |
| primesfrom3to       | 18.4  |
| ambi_sieve          | 29.3  |
+---------------------+-------+

समय कमांड का उपयोग करके मापा गया था:

python -mtimeit -s"import primes" "primes.{method}(1000000)"

{method} साथ प्रत्येक विधि नामों द्वारा प्रतिस्थापित किया गया है।

primes.py:

#!/usr/bin/env python
import psyco; psyco.full()
from math import sqrt, ceil
import numpy as np

def rwh_primes(n):
    # https://.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Returns  a list of primes < n """
    sieve = [True] * n
    for i in xrange(3,int(n**0.5)+1,2):
        if sieve[i]:
            sieve[i*i::2*i]=[False]*((n-i*i-1)/(2*i)+1)
    return [2] + [i for i in xrange(3,n,2) if sieve[i]]

def rwh_primes1(n):
    # https://.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Returns  a list of primes < n """
    sieve = [True] * (n/2)
    for i in xrange(3,int(n**0.5)+1,2):
        if sieve[i/2]:
            sieve[i*i/2::i] = [False] * ((n-i*i-1)/(2*i)+1)
    return [2] + [2*i+1 for i in xrange(1,n/2) if sieve[i]]

def rwh_primes2(n):
    # https://.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Input n>=6, Returns a list of primes, 2 <= p < n """
    correction = (n%6>1)
    n = {0:n,1:n-1,2:n+4,3:n+3,4:n+2,5:n+1}[n%6]
    sieve = [True] * (n/3)
    sieve[0] = False
    for i in xrange(int(n**0.5)/3+1):
      if sieve[i]:
        k=3*i+1|1
        sieve[      ((k*k)/3)      ::2*k]=[False]*((n/6-(k*k)/6-1)/k+1)
        sieve[(k*k+4*k-2*k*(i&1))/3::2*k]=[False]*((n/6-(k*k+4*k-2*k*(i&1))/6-1)/k+1)
    return [2,3] + [3*i+1|1 for i in xrange(1,n/3-correction) if sieve[i]]

def sieve_wheel_30(N):
    # http://zerovolt.com/?p=88
    ''' Returns a list of primes <= N using wheel criterion 2*3*5 = 30

Copyright 2009 by zerovolt.com
This code is free for non-commercial purposes, in which case you can just leave this comment as a credit for my work.
If you need this code for commercial purposes, please contact me by sending an email to: info [at] zerovolt [dot] com.'''
    __smallp = ( 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59,
    61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139,
    149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227,
    229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311,
    313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401,
    409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491,
    499, 503, 509, 521, 523, 541, 547, 557, 563, 569, 571, 577, 587, 593, 599,
    601, 607, 613, 617, 619, 631, 641, 643, 647, 653, 659, 661, 673, 677, 683,
    691, 701, 709, 719, 727, 733, 739, 743, 751, 757, 761, 769, 773, 787, 797,
    809, 811, 821, 823, 827, 829, 839, 853, 857, 859, 863, 877, 881, 883, 887,
    907, 911, 919, 929, 937, 941, 947, 953, 967, 971, 977, 983, 991, 997)

    wheel = (2, 3, 5)
    const = 30
    if N < 2:
        return []
    if N <= const:
        pos = 0
        while __smallp[pos] <= N:
            pos += 1
        return list(__smallp[:pos])
    # make the offsets list
    offsets = (7, 11, 13, 17, 19, 23, 29, 1)
    # prepare the list
    p = [2, 3, 5]
    dim = 2 + N // const
    tk1  = [True] * dim
    tk7  = [True] * dim
    tk11 = [True] * dim
    tk13 = [True] * dim
    tk17 = [True] * dim
    tk19 = [True] * dim
    tk23 = [True] * dim
    tk29 = [True] * dim
    tk1[0] = False
    # help dictionary d
    # d[a , b] = c  ==> if I want to find the smallest useful multiple of (30*pos)+a
    # on tkc, then I need the index given by the product of [(30*pos)+a][(30*pos)+b]
    # in general. If b < a, I need [(30*pos)+a][(30*(pos+1))+b]
    d = {}
    for x in offsets:
        for y in offsets:
            res = (x*y) % const
            if res in offsets:
                d[(x, res)] = y
    # another help dictionary: gives tkx calling tmptk[x]
    tmptk = {1:tk1, 7:tk7, 11:tk11, 13:tk13, 17:tk17, 19:tk19, 23:tk23, 29:tk29}
    pos, prime, lastadded, stop = 0, 0, 0, int(ceil(sqrt(N)))
    # inner functions definition
    def del_mult(tk, start, step):
        for k in xrange(start, len(tk), step):
            tk[k] = False
    # end of inner functions definition
    cpos = const * pos
    while prime < stop:
        # 30k + 7
        if tk7[pos]:
            prime = cpos + 7
            p.append(prime)
            lastadded = 7
            for off in offsets:
                tmp = d[(7, off)]
                start = (pos + prime) if off == 7 else (prime * (const * (pos + 1 if tmp < 7 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # 30k + 11
        if tk11[pos]:
            prime = cpos + 11
            p.append(prime)
            lastadded = 11
            for off in offsets:
                tmp = d[(11, off)]
                start = (pos + prime) if off == 11 else (prime * (const * (pos + 1 if tmp < 11 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # 30k + 13
        if tk13[pos]:
            prime = cpos + 13
            p.append(prime)
            lastadded = 13
            for off in offsets:
                tmp = d[(13, off)]
                start = (pos + prime) if off == 13 else (prime * (const * (pos + 1 if tmp < 13 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # 30k + 17
        if tk17[pos]:
            prime = cpos + 17
            p.append(prime)
            lastadded = 17
            for off in offsets:
                tmp = d[(17, off)]
                start = (pos + prime) if off == 17 else (prime * (const * (pos + 1 if tmp < 17 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # 30k + 19
        if tk19[pos]:
            prime = cpos + 19
            p.append(prime)
            lastadded = 19
            for off in offsets:
                tmp = d[(19, off)]
                start = (pos + prime) if off == 19 else (prime * (const * (pos + 1 if tmp < 19 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # 30k + 23
        if tk23[pos]:
            prime = cpos + 23
            p.append(prime)
            lastadded = 23
            for off in offsets:
                tmp = d[(23, off)]
                start = (pos + prime) if off == 23 else (prime * (const * (pos + 1 if tmp < 23 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # 30k + 29
        if tk29[pos]:
            prime = cpos + 29
            p.append(prime)
            lastadded = 29
            for off in offsets:
                tmp = d[(29, off)]
                start = (pos + prime) if off == 29 else (prime * (const * (pos + 1 if tmp < 29 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # now we go back to top tk1, so we need to increase pos by 1
        pos += 1
        cpos = const * pos
        # 30k + 1
        if tk1[pos]:
            prime = cpos + 1
            p.append(prime)
            lastadded = 1
            for off in offsets:
                tmp = d[(1, off)]
                start = (pos + prime) if off == 1 else (prime * (const * pos + tmp) )//const
                del_mult(tmptk[off], start, prime)
    # time to add remaining primes
    # if lastadded == 1, remove last element and start adding them from tk1
    # this way we don't need an "if" within the last while
    if lastadded == 1:
        p.pop()
    # now complete for every other possible prime
    while pos < len(tk1):
        cpos = const * pos
        if tk1[pos]: p.append(cpos + 1)
        if tk7[pos]: p.append(cpos + 7)
        if tk11[pos]: p.append(cpos + 11)
        if tk13[pos]: p.append(cpos + 13)
        if tk17[pos]: p.append(cpos + 17)
        if tk19[pos]: p.append(cpos + 19)
        if tk23[pos]: p.append(cpos + 23)
        if tk29[pos]: p.append(cpos + 29)
        pos += 1
    # remove exceeding if present
    pos = len(p) - 1
    while p[pos] > N:
        pos -= 1
    if pos < len(p) - 1:
        del p[pos+1:]
    # return p list
    return p

def sieveOfEratosthenes(n):
    """sieveOfEratosthenes(n): return the list of the primes < n."""
    # Code from: <[email protected]>, Nov 30 2006
    # http://groups.google.com/group/comp.lang.python/msg/f1f10ced88c68c2d
    if n <= 2:
        return []
    sieve = range(3, n, 2)
    top = len(sieve)
    for si in sieve:
        if si:
            bottom = (si*si - 3) // 2
            if bottom >= top:
                break
            sieve[bottom::si] = [0] * -((bottom - top) // si)
    return [2] + [el for el in sieve if el]

def sieveOfAtkin(end):
    """sieveOfAtkin(end): return a list of all the prime numbers <end
    using the Sieve of Atkin."""
    # Code by Steve Krenzel, <[email protected]>, improved
    # Code: https://web.archive.org/web/20080324064651/http://krenzel.info/?p=83
    # Info: http://en.wikipedia.org/wiki/Sieve_of_Atkin
    assert end > 0
    lng = ((end-1) // 2)
    sieve = [False] * (lng + 1)

    x_max, x2, xd = int(sqrt((end-1)/4.0)), 0, 4
    for xd in xrange(4, 8*x_max + 2, 8):
        x2 += xd
        y_max = int(sqrt(end-x2))
        n, n_diff = x2 + y_max*y_max, (y_max << 1) - 1
        if not (n & 1):
            n -= n_diff
            n_diff -= 2
        for d in xrange((n_diff - 1) << 1, -1, -8):
            m = n % 12
            if m == 1 or m == 5:
                m = n >> 1
                sieve[m] = not sieve[m]
            n -= d

    x_max, x2, xd = int(sqrt((end-1) / 3.0)), 0, 3
    for xd in xrange(3, 6 * x_max + 2, 6):
        x2 += xd
        y_max = int(sqrt(end-x2))
        n, n_diff = x2 + y_max*y_max, (y_max << 1) - 1
        if not(n & 1):
            n -= n_diff
            n_diff -= 2
        for d in xrange((n_diff - 1) << 1, -1, -8):
            if n % 12 == 7:
                m = n >> 1
                sieve[m] = not sieve[m]
            n -= d

    x_max, y_min, x2, xd = int((2 + sqrt(4-8*(1-end)))/4), -1, 0, 3
    for x in xrange(1, x_max + 1):
        x2 += xd
        xd += 6
        if x2 >= end: y_min = (((int(ceil(sqrt(x2 - end))) - 1) << 1) - 2) << 1
        n, n_diff = ((x*x + x) << 1) - 1, (((x-1) << 1) - 2) << 1
        for d in xrange(n_diff, y_min, -8):
            if n % 12 == 11:
                m = n >> 1
                sieve[m] = not sieve[m]
            n += d

    primes = [2, 3]
    if end <= 3:
        return primes[:max(0,end-2)]

    for n in xrange(5 >> 1, (int(sqrt(end))+1) >> 1):
        if sieve[n]:
            primes.append((n << 1) + 1)
            aux = (n << 1) + 1
            aux *= aux
            for k in xrange(aux, end, 2 * aux):
                sieve[k >> 1] = False

    s  = int(sqrt(end)) + 1
    if s  % 2 == 0:
        s += 1
    primes.extend([i for i in xrange(s, end, 2) if sieve[i >> 1]])

    return primes

def ambi_sieve_plain(n):
    s = range(3, n, 2)
    for m in xrange(3, int(n**0.5)+1, 2): 
        if s[(m-3)/2]: 
            for t in xrange((m*m-3)/2,(n>>1)-1,m):
                s[t]=0
    return [2]+[t for t in s if t>0]

def sundaram3(max_n):
    # https://.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/2073279#2073279
    numbers = range(3, max_n+1, 2)
    half = (max_n)//2
    initial = 4

    for step in xrange(3, max_n+1, 2):
        for i in xrange(initial, half, step):
            numbers[i-1] = 0
        initial += 2*(step+1)

        if initial > half:
            return [2] + filter(None, numbers)

################################################################################
# Using Numpy:
def ambi_sieve(n):
    # http://tommih.blogspot.com/2009/04/fast-prime-number-generator.html
    s = np.arange(3, n, 2)
    for m in xrange(3, int(n ** 0.5)+1, 2): 
        if s[(m-3)/2]: 
            s[(m*m-3)/2::m]=0
    return np.r_[2, s[s>0]]

def primesfrom3to(n):
    # https://.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Returns a array of primes, p < n """
    assert n>=2
    sieve = np.ones(n/2, dtype=np.bool)
    for i in xrange(3,int(n**0.5)+1,2):
        if sieve[i/2]:
            sieve[i*i/2::i] = False
    return np.r_[2, 2*np.nonzero(sieve)[0][1::]+1]    

def primesfrom2to(n):
    # https://.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Input n>=6, Returns a array of primes, 2 <= p < n """
    sieve = np.ones(n/3 + (n%6==2), dtype=np.bool)
    sieve[0] = False
    for i in xrange(int(n**0.5)/3+1):
        if sieve[i]:
            k=3*i+1|1
            sieve[      ((k*k)/3)      ::2*k] = False
            sieve[(k*k+4*k-2*k*(i&1))/3::2*k] = False
    return np.r_[2,3,((3*np.nonzero(sieve)[0]+1)|1)]

if __name__=='__main__':
    import itertools
    import sys

    def test(f1,f2,num):
        print('Testing {f1} and {f2} return same results'.format(
            f1=f1.func_name,
            f2=f2.func_name))
        if not all([a==b for a,b in itertools.izip_longest(f1(num),f2(num))]):
            sys.exit("Error: %s(%s) != %s(%s)"%(f1.func_name,num,f2.func_name,num))

    n=1000000
    test(sieveOfAtkin,sieveOfEratosthenes,n)
    test(sieveOfAtkin,ambi_sieve,n)
    test(sieveOfAtkin,ambi_sieve_plain,n) 
    test(sieveOfAtkin,sundaram3,n)
    test(sieveOfAtkin,sieve_wheel_30,n)
    test(sieveOfAtkin,primesfrom3to,n)
    test(sieveOfAtkin,primesfrom2to,n)
    test(sieveOfAtkin,rwh_primes,n)
    test(sieveOfAtkin,rwh_primes1,n)         
    test(sieveOfAtkin,rwh_primes2,n)

स्क्रिप्ट परीक्षण चलाना कि सभी कार्यान्वयन एक ही परिणाम देते हैं।


First time using python, so some of the methods I use in this might seem a bit cumbersome. I just straight converted my c++ code to python and this is what I have (albeit a tad bit slowww in python)

#!/usr/bin/env python
import time

def GetPrimes(n):

    Sieve = [1 for x in xrange(n)]

    Done = False
    w = 3

    while not Done:

        for q in xrange (3, n, 2):
            Prod = w*q
            if Prod < n:
                Sieve[Prod] = 0
            else:
                break

        if w > (n/2):
            Done = True
        w += 2

    return Sieve



start = time.clock()

d = 10000000
Primes = GetPrimes(d)

count = 1 #This is for 2

for x in xrange (3, d, 2):
    if Primes[x]:
        count+=1

elapsed = (time.clock() - start)
print "\nFound", count, "primes in", elapsed, "seconds!\n"

pythonw Primes.py

Found 664579 primes in 12.799119 seconds!

#!/usr/bin/env python
import time

def GetPrimes2(n):

    Sieve = [1 for x in xrange(n)]

    for q in xrange (3, n, 2):
        k = q
        for y in xrange(k*3, n, k*2):
            Sieve[y] = 0

    return Sieve



start = time.clock()

d = 10000000
Primes = GetPrimes2(d)

count = 1 #This is for 2

for x in xrange (3, d, 2):
    if Primes[x]:
        count+=1

elapsed = (time.clock() - start)
print "\nFound", count, "primes in", elapsed, "seconds!\n"

pythonw Primes2.py

Found 664579 primes in 10.230172 seconds!

#!/usr/bin/env python
import time

def GetPrimes3(n):

    Sieve = [1 for x in xrange(n)]

    for q in xrange (3, n, 2):
        k = q
        for y in xrange(k*k, n, k << 1):
            Sieve[y] = 0

    return Sieve



start = time.clock()

d = 10000000
Primes = GetPrimes3(d)

count = 1 #This is for 2

for x in xrange (3, d, 2):
    if Primes[x]:
        count+=1

elapsed = (time.clock() - start)
print "\nFound", count, "primes in", elapsed, "seconds!\n"

python Primes2.py

Found 664579 primes in 7.113776 seconds!


Here is an interesting technique to generate prime numbers (yet not the most efficient) using python's list comprehensions:

noprimes = [j for i in range(2, 8) for j in range(i*2, 50, i)]
primes = [x for x in range(2, 50) if x not in noprimes]

You can find the example and some explanations right here


I collected several prime number sieves over time. The fastest on my computer is this:

from time import time
# 175 ms for all the primes up to the value 10**6
def primes_sieve(limit):
    a = [True] * limit
    a[0] = a[1] = False
    #a[2] = True
    for n in xrange(4, limit, 2):
        a[n] = False
    root_limit = int(limit**.5)+1
    for i in xrange(3,root_limit):
        if a[i]:
            for n in xrange(i*i, limit, 2*i):
                a[n] = False
    return a

LIMIT = 10**6
s=time()
primes = primes_sieve(LIMIT)
print time()-s

I know the competition is closed for some years. ...

Nonetheless this is my suggestion for a pure python prime sieve, based on omitting the multiples of 2, 3 and 5 by using appropriate steps while processing the sieve forward. Nonetheless it is actually slower for N<10^9 than @Robert William Hanks superior solutions rwh_primes2 and rwh_primes1. By using a ctypes.c_ushort sieve array above 1.5* 10^8 it is somehow adaptive to memory limits.

10^6

$ python -mtimeit -s"import primeSieveSpeedComp" "primeSieveSpeedComp.primeSieveSeq(1000000)" 10 loops, best of 3: 46.7 msec per loop

to compare:$ python -mtimeit -s"import primeSieveSpeedComp" "primeSieveSpeedComp.rwh_primes1(1000000)" 10 loops, best of 3: 43.2 msec per loop to compare: $ python -m timeit -s"import primeSieveSpeedComp" "primeSieveSpeedComp.rwh_primes2(1000000)" 10 loops, best of 3: 34.5 msec per loop

10^7

$ python -mtimeit -s"import primeSieveSpeedComp" "primeSieveSpeedComp.primeSieveSeq(10000000)" 10 loops, best of 3: 530 msec per loop

to compare:$ python -mtimeit -s"import primeSieveSpeedComp" "primeSieveSpeedComp.rwh_primes1(10000000)" 10 loops, best of 3: 494 msec per loop to compare: $ python -m timeit -s"import primeSieveSpeedComp" "primeSieveSpeedComp.rwh_primes2(10000000)" 10 loops, best of 3: 375 msec per loop

10^8

$ python -mtimeit -s"import primeSieveSpeedComp" "primeSieveSpeedComp.primeSieveSeq(100000000)" 10 loops, best of 3: 5.55 sec per loop

to compare: $ python -mtimeit -s"import primeSieveSpeedComp" "primeSieveSpeedComp.rwh_primes1(100000000)" 10 loops, best of 3: 5.33 sec per loop to compare: $ python -m timeit -s"import primeSieveSpeedComp" "primeSieveSpeedComp.rwh_primes2(100000000)" 10 loops, best of 3: 3.95 sec per loop

10^9

$ python -mtimeit -s"import primeSieveSpeedComp" "primeSieveSpeedComp.primeSieveSeq(1000000000)" 10 loops, best of 3: 61.2 sec per loop

to compare: $ python -mtimeit -n 3 -s"import primeSieveSpeedComp" "primeSieveSpeedComp.rwh_primes1(1000000000)" 3 loops, best of 3: 97.8 sec per loop

to compare: $ python -m timeit -s"import primeSieveSpeedComp" "primeSieveSpeedComp.rwh_primes2(1000000000)" 10 loops, best of 3: 41.9 sec per loop

You may copy the code below into ubuntus primeSieveSpeedComp to review this tests.

def primeSieveSeq(MAX_Int):
    if MAX_Int > 5*10**8:
        import ctypes
        int16Array = ctypes.c_ushort * (MAX_Int >> 1)
        sieve = int16Array()
        #print 'uses ctypes "unsigned short int Array"'
    else:
        sieve = (MAX_Int >> 1) * [False]
        #print 'uses python list() of long long int'
    if MAX_Int < 10**8:
        sieve[4::3] = [True]*((MAX_Int - 8)/6+1)
        sieve[12::5] = [True]*((MAX_Int - 24)/10+1)
    r = [2, 3, 5]
    n = 0
    for i in xrange(int(MAX_Int**0.5)/30+1):
        n += 3
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 2
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 1
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 2
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 1
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 2
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 3
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 1
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
    if MAX_Int < 10**8:
        return [2, 3, 5]+[(p << 1) + 1 for p in [n for n in xrange(3, MAX_Int >> 1) if not sieve[n]]]
    n = n >> 1
    try:
        for i in xrange((MAX_Int-2*n)/30 + 1):
            n += 3
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 2
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 1
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 2
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 1
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 2
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 3
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 1
            if not sieve[n]:
                r.append((n << 1) + 1)
    except:
        pass
    return r

I tested some active , i computed it with hungred millions number

The winners are the functions that use numpy library,

Note : It would also interesting make a memory utilization test :)

Sample code

Complete code on my github repository

#!/usr/bin/env python

import lib
import timeit
import sys
import math
import datetime

import prettyplotlib as ppl
import numpy as np

import matplotlib.pyplot as plt
from prettyplotlib import brewer2mpl

primenumbers_gen = [
    'sieveOfEratosthenes',
    'ambi_sieve',
    'ambi_sieve_plain',
    'sundaram3',
    'sieve_wheel_30',
    'primesfrom3to',
    'primesfrom2to',
    'rwh_primes',
    'rwh_primes1',
    'rwh_primes2',
]

def human_format(num):
    # https://.com/questions/579310/formatting-long-numbers-as-strings-in-python?answertab=active#tab-top
    magnitude = 0
    while abs(num) >= 1000:
        magnitude += 1
        num /= 1000.0
    # add more suffixes if you need them
    return '%.2f%s' % (num, ['', 'K', 'M', 'G', 'T', 'P'][magnitude])


if __name__=='__main__':

    # Vars
    n = 10000000 # number itereration generator
    nbcol = 5 # For decompose prime number generator
    nb_benchloop = 3 # Eliminate false positive value during the test (bench average time)
    datetimeformat = '%Y-%m-%d %H:%M:%S.%f'
    config = 'from __main__ import n; import lib'
    primenumbers_gen = {
        'sieveOfEratosthenes': {'color': 'b'},
        'ambi_sieve': {'color': 'b'},
        'ambi_sieve_plain': {'color': 'b'},
         'sundaram3': {'color': 'b'},
        'sieve_wheel_30': {'color': 'b'},
# # #        'primesfrom2to': {'color': 'b'},
        'primesfrom3to': {'color': 'b'},
        # 'rwh_primes': {'color': 'b'},
        # 'rwh_primes1': {'color': 'b'},
        'rwh_primes2': {'color': 'b'},
    }


    # Get n in command line
    if len(sys.argv)>1:
        n = int(sys.argv[1])

    step = int(math.ceil(n / float(nbcol)))
    nbs = np.array([i * step for i in range(1, int(nbcol) + 1)])
    set2 = brewer2mpl.get_map('Paired', 'qualitative', 12).mpl_colors

    print datetime.datetime.now().strftime(datetimeformat)
    print("Compute prime number to %(n)s" % locals())
    print("")

    results = dict()
    for pgen in primenumbers_gen:
        results[pgen] = dict()
        benchtimes = list()
        for n in nbs:
            t = timeit.Timer("lib.%(pgen)s(n)" % locals(), setup=config)
            execute_times = t.repeat(repeat=nb_benchloop,number=1)
            benchtime = np.mean(execute_times)
            benchtimes.append(benchtime)
        results[pgen] = {'benchtimes':np.array(benchtimes)}

fig, ax = plt.subplots(1)
plt.ylabel('Computation time (in second)')
plt.xlabel('Numbers computed')
i = 0
for pgen in primenumbers_gen:

    bench = results[pgen]['benchtimes']
    avgs = np.divide(bench,nbs)
    avg = np.average(bench, weights=nbs)

    # Compute linear regression
    A = np.vstack([nbs, np.ones(len(nbs))]).T
    a, b = np.linalg.lstsq(A, nbs*avgs)[0]

    # Plot
    i += 1
    #label="%(pgen)s" % locals()
    #ppl.plot(nbs, nbs*avgs, label=label, lw=1, linestyle='--', color=set2[i % 12])
    label="%(pgen)s avg" % locals()
    ppl.plot(nbs, a * nbs + b, label=label, lw=2, color=set2[i % 12])
print datetime.datetime.now().strftime(datetimeformat)

ppl.legend(ax, loc='upper left', ncol=4)

# Change x axis label
ax.get_xaxis().get_major_formatter().set_scientific(False)
fig.canvas.draw()
labels = [human_format(int(item.get_text())) for item in ax.get_xticklabels()]

ax.set_xticklabels(labels)
ax = plt.gca()

plt.show()

I'm slow responding to this question but it seemed like a fun exercise. I'm using numpy which might be cheating and I doubt this method is the fastest but it should be clear. It sieves a Boolean array referring to its indices only and elicits prime numbers from the indices of all True values. No modulo needed.

import numpy as np
def ajs_primes3a(upto):
    mat = np.ones((upto), dtype=bool)
    mat[0] = False
    mat[1] = False
    mat[4::2] = False
    for idx in range(3, int(upto ** 0.5)+1, 2):
        mat[idx*2::idx] = False
    return np.where(mat == True)[0]

It's all written and tested. So there is no need to reinvent the wheel.

python -m timeit -r10 -s"from sympy import sieve" "primes = list(sieve.primerange(1, 10**6))"

gives us a record breaking 12.2 msec !

10 loops, best of 10: 12.2 msec per loop

If this is not fast enough, you can try PyPy:

pypy -m timeit -r10 -s"from sympy import sieve" "primes = list(sieve.primerange(1, 10**6))"

which results in:

10 loops, best of 10: 2.03 msec per loop

The answer with 247 up-votes lists 15.9 ms for the best solution. Compare this!!!


My guess is that the fastest of all ways is to hard code the primes in your code.

So why not just write a slow script that generates another source file that has all numbers hardwired in it, and then import that source file when you run your actual program.

Of course, this works only if you know the upper bound of N at compile time, but thus is the case for (almost) all project Euler problems.

PS: I might be wrong though iff parsing the source with hard-wired primes is slower than computing them in the first place, but as far I know Python runs from compiled .pyc files so reading a binary array with all primes up to N should be bloody fast in that case.


The fastest method I've tried so far is based on the here function:

import itertools as it
def erat2a( ):
    D = {  }
    yield 2
    for q in it.islice(it.count(3), 0, None, 2):
        p = D.pop(q, None)
        if p is None:
            D[q*q] = q
            yield q
        else:
            x = q + 2*p
            while x in D:
                x += 2*p
            D[x] = p

See this answer for an explanation of the speeding-up.


This is an elegant and simpler solution to find primes using a stored list. Starts with a 4 variables, you only have to test odd primes for divisors, and you only have to test up to a half of what number you are testing as a prime (no point in testing whether 9, 11, 13 divide into 17). It tests previously stored primes as divisors.`

    # Program to calculate Primes
 primes = [1,3,5,7]
for n in range(9,100000,2):
    for x in range(1,(len(primes)/2)):
        if n % primes[x] == 0:
            break
    else:
        primes.append(n)
print primes




primes