What future for C++?

For the first new blog post of 2022 I thought I’d compare Modern C++ with some of its contemporary languages in terms of syntax and library support. No language exists in a vacuum, and (almost) all languages borrow ideas from each other; the other four languages I picked were (in no particular order): Rust, D, Swift and Kotlin.

All four of these are very much C family languages (curly braces and semi-colons) targeted at different domains. Rust is pitched as a systems programming language and a competitor to C in terms of performance (and is currently being reviewed as a possible second implementation language for the Linux kernel). D is a competitor to C++, being similarly a development of plain C into a true object-oriented language, albeit begun decades after C++ was born (and with a design decision for D2 to abandon backwards compatibility of syntax). Swift is a recent managed (commercial) language from Apple pitched as the successor to Objective-C, which compiles via SIL (Swift Intermediate Language) and the LLVM infrastructure to object code. Finally, Kotlin is a language which targets the JVM, meaning it is well suited to developing Android apps; the language and its ecosystem are fully supported by Android Studio.

Continue reading “What future for C++?”

Writing a pseudocode compiler (6) – Front end

With the scanner and parser classes complete, we need to create a main program which uses them both. The parser needs to know about the scanner (the parser class has a member variable which is a pointer to the scanner object), so the scanner needs to be created first. A C++ flex scanner reads by default from the standard input, and whilst member functions for switching streams are available, it was decided that using the member function std::istream::rdbuf() to link a disk-file to std::cin was a better solution, and similarly for std::cout, where class Tree’s member output object is hard-wired:

Continue reading “Writing a pseudocode compiler (6) – Front end”

Writing a pseudocode compiler (5) – Rules, statements and expressions

In this part we’re going to look at how the bison grammar rules, specified in the file “grammar.y”, can be made to match against the stream of tokens which the parser requests (this process being called syntax directed translation). Ultimately we want to output correctly generated code by walking the abstract syntax tree, created in memory by the parser program we wrote. We’ll also take a look at how the global symbol table plays a part in early type checking, as the syntax tree is being constructed.

Continue reading “Writing a pseudocode compiler (5) – Rules, statements and expressions”

Writing a pseudocode compiler (4) – Generating a parser

Using a “parser generator” such as bison means that the framework for creating this key part of the front-end is more rigidly imposed than if it were written from scratch. On the other hand, it also means that some of the hard work is already done for us. If the compiler back-end has to output assembly language or object code, the temptation would be to rewrite the compiler in its input language. (In fact this action is known as “bootstrapping”, and many professionals would only take a programming language (and its compiler) seriously if such an action is feasible.) However, it may be desirable to keep the original “bootstrap” compiler up-to-date as the language is developed further.

The bison program “grammar.y”, written for this project, is quite large, at just under 700 lines, so documenting every part in detail is beyond the scope of this mini-series. However, we’ll pick out a few key parts to describe, and the remainder should be understandable from examining this file.

Continue reading “Writing a pseudocode compiler (4) – Generating a parser”

Writing a pseudocode compiler (3) – Generating a scanner

This article takes a look at the (only) part of the compiler which directly processes the source text (or “source code”), this being the “scanner”. Some amount of theory lies behind the pattern-matching actions of this part of a compiler; however, unless you have a need to implement one from scratch (something which can be true for commercial-grade compilers), you can safely follow best practices by employing a “scanner generator”.

The purpose of a (hand-written or semi-auto-generated) scanner is to convert (or reduce) textual patterns into a stream of numerical tokens. It really is as simple as that. (Well, almost!) A working knowledge of regular expressions (“regex”) is really a prerequisite, although patterns for common usages, such as floating-point representation of numbers, can be found and utilized without the need to produce them off the top of your head. The textual patterns which are matched by the regex(es) have a special name: lexemes.

Continue reading “Writing a pseudocode compiler (3) – Generating a scanner”

Writing a pseudocode compiler (2) – Abstract syntax tree

In this article we’ll look at some of the design decisions to be made when implementing an abstract syntax tree in C++, called “abstract” because it is a (slight) simplification of the source text. There is no bison (or flex) code involved, just pure C++ and textual output of JavaScript. The AST has to contain sufficient information about the source to enable generation of the output code. (This is almost true; usually at least a symbol table is needed as well, which is covered in a later article.)

Firstly, for the name of the base class of the class hierarchy I chose class Tree, as that is what it represents; other common names could be Node or AST. I decided against using smart pointers, or even a destructor, so memory claimed via new is only released when the program exits. (This would probably be the case even if we used the obvious choice of std::shared_ptr, so little would be gained except to slow the compiler down slightly.) Here is the relevant part of the class definition:

Continue reading “Writing a pseudocode compiler (2) – Abstract syntax tree”

Writing a pseudocode compiler (1) – Setting the scene

Having an interest in computer science usually means that one gravitates towards use and implementation of compiled languages. (Writing programs that write programs is more fun than writing programs!) Having explored the use of flex and bison by generating an expression evaluator (calculator program) in a previous mini-series, I wanted to progress onto implementing a full compiler. At the time of writing the compiler is feature-complete (for the pseudocode specification which was used – see under “Resources” below); however, it has several known (and probably more unknown) bugs and needs a slight code clean-up. The source code (with Linux and Windows build scripts) and a 32-bit Windows .exe are available to download from this page.

Continue reading “Writing a pseudocode compiler (1) – Setting the scene”

A use case for perfect forwarding

The Standard Library classes (such as std::string and std::vector) are not usually regarded as being able to be derived from. They are little black boxes whose functionality is clearly specified in the C++ Standard and which should not be redefined. However, let us put this to one side for a moment and consider that it should be possible, if not always desirable.

Say we want to add two new member functions to std::string, being utf8at() (which takes an index and returns a char32_t), and an overload to append() (which takes a char32_t). There are a number of ways of doing this:

Continue reading “A use case for perfect forwarding”

Calling C++ code from Kotlin

In case you didn’t already know, Kotlin is a fairly new yet surprisingly mature programming language that targets the JVM (Java Virtual Machine). Simply put, Kotlin compiles to the same bytecode that Java compiles to (compilation to JavaScript is also supported). Kotlin is fully supported for writing apps using Android Studio, hence its surge in popularity, and can call native C/C++ code in the same way that Java can, through the JNI (Java Native Interface), being the subject of this article.

The header file jni.h is a prerequisite for creating a suitable C++ module (actually a Dynamic Link Library .dll under Windows or a Shared Library .so under Linux/Android). This is in fact a C header, which is fully compatible with C++ too. Since the goal is C++ called from Kotlin, you will be pleased to learn that no actual C or Java code is required. First, though, let’s take a look at the client (caller) code before looking at the native (callee) code (apologies for lack of syntax highlighting):

Continue reading “Calling C++ code from Kotlin”

A std::format primer

As mentioned in a previous post, the popular C++ {fmt} library has been added to Modern C++ (in the form of the standard header <format>). This library’s development into standardization has seen a number of considerable improvements over the original, which of course remains available to non-bleeding-edge versions of compilers. The latest Microsoft C++ compiler (version 19.29 at the time of writing) supports std::format and supporting functions (by using either #include <format> or import std.core; with /std:c++latest in the options). This compiler was used to test the example code below, also requiring using namespace std;.

Continue reading “A std::format primer”

C++ folding expressions

Modern C++ has support for passing parameter packs to functions. What may not be apparent is that in addition to forwarding them to other functions, or recursively to the same function, they can be manipulated in a type-safe manner by the same operation applied to each one in turn. The concept of folding (as this application is known) may be familiar to programmers coming from a functional programming background, or to those with a background in Math.

In C++, folding expressions come in left and right forms, both unary and binary. They all operate on a parameter pack, with the expansion being implied by the use of an ellipsis (...). The binary left form can be demonstrated using std::cout inside a variadic function:

Continue reading “C++ folding expressions”

Templates in C++ primer (4)

In this article we’re going to look at an application of Substitution Failure Is Not An Error (SFINAE). The use case described actually uses SFINAE twice, firstly to create a traits template very similar to the one already seen in this mini-series, and secondly to switch a function template instantiation on or off.

So let’s first define the problem. Other languages allow a class to define a .toString() method (or similarly named, Python uses .__str__()) in order to allow output of objects of that class. As a starting point we might try to define a template function along the lines of:

template<typename T>
ostream& operator<< (ostream& os, const T& obj) {
    return os << obj.to_string();
}

This naïve approach might work depending on your library implementation and code structure; problems arise when (as you might guess) an ambiguity occurs as to whether template instantiation should be attempted.

Continue reading “Templates in C++ primer (4)”

Templates in C++ primer (3)

In this article we’re going to look at how to output a std::tuple to a std::ostream, such as the familiar std::cout. The method used is an example of TMP (template metaprogramming), where the compiler generates code for us at compile time. In this case we need it to output every element of a std::tuple (with a separator) in a fully generic (any combination of types and tuple sizes) and type-safe way (as we would expect from C++).

But let’s not get ahead of ourselves. Firstly, let’s appreciate that templates can be used to output a std::pair (the element type for all associative containers, thus giving this code a use case). The actual effort involved is not great, literally just a one-line function (template):

Continue reading “Templates in C++ primer (3)”

Templates in C++ primer (2)

In the first part of this mini-series we looked at the different types of template parameter and the syntax involved in using them. In this part we’re going to look at two idioms that involve use of templates in C++, both of which have acronyms: CRTP and SFINAE.

Curiously Recurring Template Pattern

From the earliest days of the availability of templates for C++ (around the mid-nineties, shortly before the publication of the C++98 standard), CRTP has been so named and known about. The pattern it describes is not difficult to comprehend – a base class has a template type parameter which is the type of a (single) derived class. What does take some explaining is that this has a use case in fully describing compile-time polymorphism (as opposed to run-time polymorphism which uses virtual functions, or compile-time “duck-typing” which is the usual behavior of template type parameters).
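The shape of the pattern can be sketched as follows; the class names Shape, Circle and Square and the function describe_any are hypothetical, chosen only to illustrate a base class calling into its derived class without any virtual functions.

```cpp
#include <string>

// The base class is parameterized on the type of the class deriving from it.
template<typename Derived>
class Shape {
public:
    std::string describe() const {
        // The cast is resolved at compile time: no vtable, no run-time dispatch.
        return "Shape: " + static_cast<const Derived&>(*this).name();
    }
};

// Each derived class passes itself as the template argument.
class Circle : public Shape<Circle> {
public:
    std::string name() const { return "circle"; }
};

class Square : public Shape<Square> {
public:
    std::string name() const { return "square"; }
};

// Works for any CRTP-derived shape; the concrete type is known to the compiler.
template<typename T>
std::string describe_any(const Shape<T>& shape) {
    return shape.describe();
}
```

Note that Shape<Circle> and Shape<Square> are unrelated types, so unlike a virtual base they cannot be stored together in one container without further work.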

Continue reading “Templates in C++ primer (2)”

Templates in C++ primer (1)

Types of template

There are three different types of templated entity in C++:

  • Class templates
  • Function templates
  • Variable templates

Of these, class templates are the most flexible as they can optionally be either fully or partially specialized. Function templates (including member function templates) can optionally be fully specialized (only). A template declaration or definition differs from a normal declaration or definition by being prefixed with: template<parameters...>
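The following sketch shows one of each kind, assuming C++14 or later (for the variable template); all of the names (Kind, twice, pi) are hypothetical and chosen only for illustration.

```cpp
#include <string>

// Class template, with one partial and one full specialization.
template<typename T>
struct Kind {
    static const char* name() { return "general"; }
};

template<typename T>
struct Kind<T*> {                 // partial specialization: any pointer type
    static const char* name() { return "pointer"; }
};

template<>
struct Kind<bool> {               // full specialization: exactly bool
    static const char* name() { return "bool"; }
};

// Function template, with a full specialization.
template<typename T>
T twice(T value) {
    return value + value;
}

template<>
inline std::string twice<std::string>(std::string value) {
    return value + " " + value;   // deliberately different, to show which one is called
}

// Variable template.
template<typename T>
constexpr T pi = T(3.1415926535897932385L);
```

A call such as twice(21) uses the general function template, while twice(std::string("ab")) picks up the specialization; Kind<int*>::name() selects the partial specialization.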

Types of template parameters

There are (again) three different types of template parameters in C++:

Continue reading “Templates in C++ primer (1)”