So we all know that a prime number has only two factors, being one and itself? A first attempt at writing a function to tell if a given number is prime might look like this:
constexpr bool isPrime(int n) {
for (int i = 2; (i * i) < n; ++i) {
if (!(n % i)) {
return false;
}
}
return true;
}
The structure is straightforward, we loop over the semi-open range [2,√n) and break on the first occurrence (if any) of the loop counter being a factor of n.
However this is inefficient as non-prime numbers are needlessly checked for being factors, to avoid this we would need a list of prime numbers handy, again up to √n. (We’ve also restricted ourselves to checking int
s.) Of course, being C++ programmers we don’t say global state, we say data encapsulated by a class. Here is a sketch of a suitable class:
template <typename T>
class Primes<T> {
vector<T> primes_cache{ 2, 3, 5, 7, 11 /* ... */ };
public:
bool isPrime(T n) {
for (const auto& p : primes_cache) {
if (!(n % p)) {
return false;
}
if ((p * p) >= n) {
return true;
}
}
throw runtime_error("Insufficient primes in cache");
}
};
Here we loop over the primes_cache
vector using a range-for loop, again checking for factors. Importantly, we use a generic class introduced with the template
keyword; a correct definition of a Primes object would thus be:
Primes<unsigned> primes{};
A way to use the single member function would be:
bool twenty_five_isprime = primes.isPrime(25);
However we’re still not done, as we need a way to populate the cache as needed to avoid throwing an exception. To avoid code duplication, this would need to call isPrime()
which could result in infinite recursion if we’re not careful. Here is a program which demonstrates the complete class:
#include <vector>
#include <stdexcept>
#include <iostream>
using namespace std;
template <typename T = unsigned>
class Primes {
vector<T> primes_cache{ 2, 3 };
typename vector<T>::size_type cache_step;
void generate(typename vector<T>::size_type step) {
auto new_size = primes_cache.size() + step;
auto p = primes_cache.back();
while (primes_cache.size() < new_size) {
if (isPrime(++p, false)) {
primes_cache.push_back(p);
}
}
}
public:
Primes(typename vector<T>::size_type step = 1000)
: cache_step{ step } {
generate((step > 2) ? step - 2 : step);
}
template <typename U>
bool isPrime(const U& n, bool gen = true) {
while (gen && ((primes_cache.back() * primes_cache.back()) <= n)) {
generate(cache_step);
}
for (const auto& p : primes_cache) {
if (!(n % p)) {
return false;
}
if ((p * p) >= n) {
return true;
}
}
throw runtime_error("Insufficient primes in cache");
}
};
int main() {
Primes<unsigned long long> primes{};
for (unsigned i = 2; i < 200; ++i) {
if (primes.isPrime(i)) {
cout << i << ' ';
}
}
cout << '\n';
}
Line 10 sets the default type for the primes cache; note this can be different from the type of the number being checked which is line 28. Lines 14-22 define a new private member function called generate()
which populates the primes cache, calling isPrime()
as needed with the generate flag set to false. Lines 29-42 are similar to code seen before, with primes_cache.back()
being the value of the largest prime in the cache.
For maximum safety (that is, minimal risk of integer overflow) no types are hardcoded into the definition of this class, in particular the local variables are declared auto
and cache_step
uses type vector<T>::size_type
. This is good C++ practice, and something you should emulate in your own code; any incompatibilities can most likely be resolved at compile-time. Of course, we are free to use arbitrary precision number types instead of unsigned
or unsigned long long
, but while correct these will probably not be useful in practice with current hardware as they will be too slow.
The other consideration is multithreading: this class is not thread safe as there are no checks against writing to primes_cache
. This could be solved quite easily using a mutex member variable and a lock guard for the while-loop in the generate()
function, thus allowing multiple threads to call isPrime()
and potentially achieving quicker performance on multi-core machines.
If you’re interested in creating a list of known primes another way, there is the Sieve of Eratosthenes which could be a use case for std::vector<bool>
. Of course, it is always good practice to benchmark against an alternative method with regards to space and time.
Resources: Download or browse the source code