Image Credits: Github
The Art Of O(1) Primality Testing
August 01, 2020
What we are going to discuss is not a new algorithm, but a great C++ trick.
Check out the following program:
class Sieve { public: bool isPrime[MAX+1] = {}; constexpr Sieve() { isPrime[2] = true; for(int i = 3; i <= MAX; i += 2) isPrime[i] = true; for(int i = 3; i*i <= MAX; i++) { if(isPrime[i]) { for(int j = i*i; j <= MAX; j += i) isPrime[j] = false; } } } };
What we see above is an implementation of the traditional Sieve of Eratosthenes with a slight modification for optimization.
The constexpr keyword
constexpr
is a C++ 11 feature that instructs the compiler to evaluate objects or functions declared as constexpr
at the time of compilation, enabling the evaluated expression to be used by other constant objects or functions!
In simple words, this implies instructing the compiler to perform the computations related to the given function or object during the compilation stage, rather than at runtime!
How exactly is compile-time evaluation advantageous over runtime evaluation? After all, the evaluation is going to take the same time whether done at compile time or runtime…
Good question.
What we need to observe is that once the code is finalized, it is compiled once and the resultant binary file/executable is executed multiple times.
Ergo, if there is a computation that requires to be done every time we execute the file, wouldn’t it be better if we could do it once and store the results so that we don’t have to compute it every time the file is run?
That’s exactly what we achieve with the help of the constexpr
keyword. The idea is to spend time during the compilation stage and cut down on the time spent during the execution phase.
But hey, memory is allocated while compiling the code. Where are all the computed primes stored?
Again, good question.
The code snippet shown above, computes the sieve for numbers up to MAX
during compilation and stores in a table 1s and 0s, depending on the primality of the numbers, right in the binary file.
To know how that looks like, refer to this: https://godbolt.org/z/h7r3K7
But with freedom, comes its limitations. To name a few:
- Functions declared as
constexpr
cannot be ofvoid
return type. - Certain operators, like the prefix increment operator, cannot be used inside
constexpr
functions. - Most importantly, there is a cap on the number of iterations that a loop, declared inside of a
contexpr
function, can support. The limit has been set in the neighborhood of 2.6x104 iterations.
The last restriction restricts us to compute sieves which involve no more than the allowed number of iterations. We can, however, work around this restriction by using the -fconstexpr-loop-limit
parameter during the compilation stage!
Putting it all together
The following piece of code compares the difference between compile-time and runtime evaluation of the sieve:
#include <bits/stdc++.h>
#define MAX 200000
using namespace std;
class Sieve
{
public:
bool isPrime[MAX+1] = {};
constexpr Sieve()
{
isPrime[2] = true;
for(int i = 3; i <= MAX; i += 2)
isPrime[i] = true;
for(int i = 3; i*i <= MAX; i++)
{
if(isPrime[i])
{
for(int j = i*i; j <= MAX; j += i)
isPrime[j] = false;
}
}
}
};
bool slowerSieve(int n)
{
return (Sieve{}).isPrime[n];
}
bool fasterSieve(int n)
{
static constexpr Sieve s;
return s.isPrime[n];
}
int main()
{
srand(time(NULL));
int size = 1000;
vector<int> numbers(size);
for(int i = 0; i < size; i++)
numbers[i] = rand() % MAX;
int ans = 0;
auto start_time = chrono::high_resolution_clock::now();
for(auto i : numbers)
ans += slowerSieve(i);
auto end_time = chrono::high_resolution_clock::now();
cout << "\t\tRuntime Evaluation: \nNumber of Prime numbers: " << ans << "\nTime elapsed (ms): " << chrono::duration_cast<chrono::milliseconds>(end_time - start_time).count() <<"\n\n";
ans = 0;
start_time = chrono::high_resolution_clock::now();
for(auto i : numbers)
ans += fasterSieve(i);
end_time = chrono::high_resolution_clock::now();
cout << "\t\tCompiletime Evaluation: \nNumber of Prime numbers: "<< ans << "\nTime elapsed (ms): " << chrono::duration_cast<chrono::milliseconds>(end_time - start_time).count() <<"\n";
return 0;
}
Please note: calling slowerSieve
for each integer in the vector causes a fresh calculation of the sieve each time.
Of course, recomputing the sieve a thousand times doesn't make sense, but this has been done intentionally, for the sole purpose of showing that the algorithm is not computing the sieve each time a call is made to the fasterSieve
function.
Output:
Runtime Evaluation:
Number of Prime numbers: 109
Time elapsed (ms): 1315
Compile-time Evaluation:
Number of Prime numbers: 109
Time elapsed (ms): 0
Conclusion
To conclude, yes it does check for primality in O(1) time complexity, but it comes at a cost. It not only increases the time needed to compile the program, but also causes a significant increase in the size of the built executable. Even though this trick finds little practical use, it's fascinating to know about it.
Thank you for stopping by!
If you find anything to be incorrect or have any suggestions, I would love to hear from you. Please send me an email at sarthak1607@gmail.com .