Numbers to English words

A classic course assignment is to write a program that prints out an arbitrarily long decimal number in words, grouped into units, thousands, millions and so on. This article looks at how to implement this efficiently in Modern C++, using head recursion. Recursion is where the logic of a function is generalized so that it can usefully call itself a (finite) number of times; in head recursion the recursive call is made first, before the function does its own work for the current call. The more commonly found (and simpler to understand) form is tail recursion, where the recursive call is the last thing the function does, so its result (the return value) can be used directly as the final required value. In case you’re not up to speed on tail recursion, the classic example is a function which computes n! (that is, the factorial function from Mathematics), usually written with an accumulator parameter.
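As a point of comparison, here is a small sketch of the factorial function written two ways. The names factorial and factorial_acc are illustrative only and do not appear in the program developed below. In the first version the multiplication happens after the recursive call returns; in the second the running product is carried in an accumulator argument, so the recursive call is the last thing the function does.

```cpp
#include <cstdint>

// Plain recursion: the multiply happens after the recursive call returns.
constexpr std::uint64_t factorial(unsigned n) {
    return n <= 1 ? 1 : n * factorial(n - 1);
}

// Accumulator form: the recursive call is the final action (a tail call).
// 'acc' is an assumed helper parameter holding the product so far.
constexpr std::uint64_t factorial_acc(unsigned n, std::uint64_t acc = 1) {
    return n <= 1 ? acc : factorial_acc(n - 1, acc * n);
}

// Checked at compile time (requires C++17 for message-less static_assert).
static_assert(factorial(5) == 120);
static_assert(factorial_acc(5) == 120);
```

Both versions overflow a 64-bit result beyond 20!, so they are only meant to show the shape of the recursion.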

With head recursion, on the other hand, the result of the recursive function cannot usually be passed “down the line” using the return value, so some other method of storage is needed, such as global state or a reference parameter. The following C++ fragment is a recursive function (using global state, i.e. std::cout) that begins to give us what we are looking for:

unsigned long number = 987654321;
const char *triplets[] = { "", "thousand", "million", "billion" };

void number_to_words(unsigned long n, const unsigned triplet = 0) {
    if (n / 1000) {
        number_to_words(n / 1000, triplet + 1); // recursive call
    }
    if (n %= 1000) {
        std::cout << n << triplets[triplet];
    }
}

A complete program would use import std.core; (or #include <iostream>) and have a suitable main() function that calls number_to_words(number). Such a program is left as an exercise; it produces the output:

987million654thousand321

Often, writing recursive functions requires a “leap-of-faith” at the point of generalization, and with head recursion this has to occur at the start of the function. The last call to be made (with arguments 987 and 2) is the first to produce output, “987million” (the operation n %= 1000 has no effect as the value of n is under 1000 for this call). When this call returns, the next output is “654thousand”; the call returns again before outputting “321”. If at this point you see both the light and the logic, and know how to produce a working program, stop reading now and go right ahead! If, on the other hand, your head hurts or you’re struggling to see the logic, then maybe sit down with a pen and paper and step through the recursive calls, using enclosing rectangles for the call states of the function number_to_words().
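If pen and paper are not to hand, the same stepping-through can be approximated in code. The sketch below is a variant of the fragment above that appends to a log string instead of printing; the name trace_number_to_words and the enter/emit labels are made up for this illustration.

```cpp
#include <string>

// Illustrative helper (not part of the original program): records the order
// in which calls are entered and the order in which output would be produced.
void trace_number_to_words(unsigned long n, std::string& log, unsigned triplet = 0) {
    log += "enter(" + std::to_string(n) + "," + std::to_string(triplet) + ") ";
    if (n / 1000) {
        trace_number_to_words(n / 1000, log, triplet + 1); // recursive call comes first
    }
    if (n %= 1000) {
        // reached only after every deeper call has already returned
        log += "emit(" + std::to_string(n) + "," + std::to_string(triplet) + ") ";
    }
}
```

Called with 987654321 and an empty string, the log shows all three calls being entered before anything is emitted, and the emits then appear in the order 987, 654, 321 as each call returns.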

Hopefully we can agree that we also need a function that outputs any number between one and nine hundred ninety-nine (inclusive), which we’ll call a triplet as it handles up to three digits. Here is a version which returns a std::string, and also calls itself (never more than one level deep, for the hundreds digit or the units digit) in order to avoid unnecessary code duplication:

string number_triplet(unsigned n) {
    string s{};
    if (n / 100) {
        s += number_triplet(n / 100) + " hundred" + (n % 100 ? " " : "");
    }
    n %= 100;
    if (n < 20) {
        switch (n) {
            case 1: s += "one"; break;
            case 2: s += "two"; break;
            case 3: s += "three"; break;
            case 4: s += "four"; break;
            case 5: s += "five"; break;
            case 6: s += "six"; break;
            case 7: s += "seven"; break;
            case 8: s += "eight"; break;
            case 9: s += "nine"; break;
            case 10: s += "ten"; break;
            case 11: s += "eleven"; break;
            case 12: s += "twelve"; break;
            case 13: s += "thirteen"; break;
            case 14: s += "fourteen"; break;
            case 15: s += "fifteen"; break;
            case 16: s += "sixteen"; break;
            case 17: s += "seventeen"; break;
            case 18: s += "eighteen"; break;
            case 19: s += "nineteen"; break;
        }
    }
    else {
        switch (n / 10) {
            case 2: s += "twenty"; break;
            case 3: s += "thirty"; break;
            case 4: s += "forty"; break;
            case 5: s += "fifty"; break;
            case 6: s += "sixty"; break;
            case 7: s += "seventy"; break;
            case 8: s += "eighty"; break;
            case 9: s += "ninety"; break;
        }
        if (n % 10) {
            s += '-' + number_triplet(n % 10);
        }
    }
    return s;
}

The logic is fairly simple: number words are appended to an initially empty string, starting with hundreds (if any), then tens (if any), then units, in the style “one hundred twenty-three”; this follows American English convention. It is correct for every input from 1 to 999 (it also handles numbers from 1100 to 9999 as “eleven hundred” etc. but this is an unintended bonus which we don’t need, and won’t use). An input of zero produces an empty string.
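For readers who prefer lookup tables to switch statements, the same hundreds, tens and units logic can be sketched more compactly. This is an alternative written for illustration, not the version used in the final program; the name triplet_words and the two tables are assumptions, and the function is only meant for inputs from 0 to 999.

```cpp
#include <string>

// Table-driven sketch of a 0..999 converter (hypothetical alternative to
// number_triplet). Inputs above 999 are outside what this sketch supports.
std::string triplet_words(unsigned n) {
    static const char* const units[] = {
        "", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen" };
    static const char* const tens[] = {
        "", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety" };
    std::string s;
    if (n / 100) {
        s += units[n / 100];          // hundreds digit, 1..9
        s += " hundred";
        if (n % 100) s += ' ';        // space only if more words follow
    }
    n %= 100;
    if (n < 20) {
        s += units[n];                // 0 appends nothing
    } else {
        s += tens[n / 10];
        if (n % 10) {
            s += '-';                 // American style: "twenty-three"
            s += units[n % 10];
        }
    }
    return s;
}
```

The trade-off is that the tables drop the “eleven hundred” bonus described above, since the hundreds lookup is not recursive.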

So that’s us pretty much done; all that’s needed is to handle spaces between the millions and thousands correctly and have number_to_words() accept a std::string reference parameter. Also, we’ve made the triplets array a std::vector so that range-checked at() access is possible (this means that if the number is too big, at() throws std::out_of_range rather than the program producing garbage output, i.e. undefined behavior). To lessen the risk of this, numbers up to vigintillions are allowed, and to fully test this functionality, support for Boost’s cpp_int type is included via a macro test:

// numbers_as_words.cpp : output any positive integer in American usage English
// Version 1.01 (2020/09/12), MIT License, (c) Richard Spencer 2019

// Optionally compile with BOOST_CPP_INT defined to use arbitrary precision
// instead of unsigned long long as the Integer type

// Usage: Filename as single argument to test output from "one" to "one million"
//        Run without arguments to enter interactive mode

#include <iostream>
#include <fstream>
#include <string>
#include <vector>

#ifdef BOOST_CPP_INT
#include <boost/multiprecision/cpp_int.hpp>
using Integer = boost::multiprecision::cpp_int;
#else
using Integer = unsigned long long int;
#endif

using std::cin;
using std::cout;
using std::ofstream;
using std::string;
using std::vector;

string number_triplet(unsigned n) {
    string s{};
    if (n / 100) {
        s += number_triplet(n / 100) + " hundred" + (n % 100 ? " " : "");
    }
    n %= 100;
    if (n < 20) {
        switch (n) {
            case 1: s += "one"; break;
            case 2: s += "two"; break;
            case 3: s += "three"; break;
            case 4: s += "four"; break;
            case 5: s += "five"; break;
            case 6: s += "six"; break;
            case 7: s += "seven"; break;
            case 8: s += "eight"; break;
            case 9: s += "nine"; break;
            case 10: s += "ten"; break;
            case 11: s += "eleven"; break;
            case 12: s += "twelve"; break;
            case 13: s += "thirteen"; break;
            case 14: s += "fourteen"; break;
            case 15: s += "fifteen"; break;
            case 16: s += "sixteen"; break;
            case 17: s += "seventeen"; break;
            case 18: s += "eighteen"; break;
            case 19: s += "nineteen"; break;
        }
    }
    else {
        switch (n / 10) {
            case 2: s += "twenty"; break;
            case 3: s += "thirty"; break;
            case 4: s += "forty"; break;
            case 5: s += "fifty"; break;
            case 6: s += "sixty"; break;
            case 7: s += "seventy"; break;
            case 8: s += "eighty"; break;
            case 9: s += "ninety"; break;
        }
        if (n % 10) {
            s += '-' + number_triplet(n % 10);
        }
    }
    return s;
}

template<typename Number>
void number_to_words(Number n, string& s, const unsigned triplet = 0) {
    if (Number m = n / 1000; m) {
        number_to_words(m, s, triplet + 1);
    }
    // Scale names, one per group of three digits; at() below throws
    // std::out_of_range if the number needs a name beyond this list
    static const vector triplets = {
        "", "thousand", "million", "billion", "trillion", "quadrillion",
        "quintillion", "sextillion", "septillion", "octillion", "nonillion",
        "decillion", "undecillion", "duodecillion", "tredecillion",
        "quattuordecillion", "quindecillion", "sexdecillion", "septendecillion",
        "octodecillion", "novemdecillion", "vigintillion" };
    if (unsigned t = static_cast<unsigned>(n % 1000); t) {
        s += (s.empty() ? "" : " ") + number_triplet(t) + (triplet ? " " : "") + triplets.at(triplet);
    }
}

int main(const int argc, const char *argv[]) {
    if (argc == 2) {
        ofstream file{argv[1]};
        for (Integer i = 1; i <= 1000000; ++i) {
            string s;
            number_to_words(i, s);
            file << s << '\n';
        }
        return 0;
    }
    Integer i{};
    cout << "Please enter a positive number (zero to quit): ";
    cin >> i;
    while (i) {
        string s;
        number_to_words(i, s);
        cout << s << '\n';
        cout << "Please enter another number: ";
        cin >> i;
    }
}
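The range-checked at() access mentioned before the listing can be seen in isolation in the sketch below. The helper name scale_word and the "(out of range)" marker are invented for this example; the program above does not catch the exception, so there an oversized number would end the program with an uncaught std::out_of_range instead.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: looks up a scale word and reports failure in the
// returned string instead of reading past the end of the vector.
std::string scale_word(const std::vector<std::string>& scales, unsigned index) {
    try {
        return scales.at(index); // throws std::out_of_range if index >= scales.size()
    } catch (const std::out_of_range&) {
        return "(out of range)";
    }
}
```

By contrast, operator[] performs no such check, and an index past the end is undefined behavior.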

That’s almost there. Some changes would be needed to handle zero, negative numbers, British English usage (also Australian and South African) such as “one hundred and twenty-three”, or the common American preference for, as an example, “thirty-seven hundred” over “three thousand seven hundred”. Also, you may wish to experiment with another arbitrary-precision integer type, which is possible as the function number_to_words() has been written as a template (generic) function.
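As one possible starting point for the zero and negative cases, a thin wrapper can deal with them before handing the magnitude to the existing code. Everything here is an assumption made for illustration: the name signed_number_to_words, the choice of “minus” as the prefix (some would prefer “negative”), and the positive_words callback, which stands in for a small adapter around number_to_words().

```cpp
#include <functional>
#include <string>

// Hypothetical wrapper: 'positive_words' stands in for a function that
// converts a positive value to words, e.g. an adapter around number_to_words().
std::string signed_number_to_words(
        long long n,
        const std::function<std::string(unsigned long long)>& positive_words) {
    if (n == 0) {
        return "zero"; // number_to_words() alone would leave the string empty
    }
    if (n < 0) {
        // Negate in unsigned arithmetic so the most negative value is safe too.
        const unsigned long long magnitude = 0ULL - static_cast<unsigned long long>(n);
        return "minus " + positive_words(magnitude);
    }
    return positive_words(static_cast<unsigned long long>(n));
}
```

Passing the conversion in as a callback keeps the sketch independent of the Integer type chosen in the listing; an arbitrary-precision type would need its own sign test and negation.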

Resources: Download or browse the source code
