Modules Are Not the Move (Yet?)

In the post where I outlined my goals in refactoring my engine, one of my goals was to try out cutting-edge language features. Specifically, I wanted to try out Modules, an intriguing new language feature, standardized in C++20, that will bring compilation/linking of C++ programs into the modern era. After messing around with them for a while using MSVC, I’m not convinced they are ready for adoption. In this post I’ll talk about the features I was excited for and the reasons I will not be using Modules yet.

What Are Modules?

The best, least jargony, description of Modules I have seen so far is from a vector of bools blog post. It says,

C++ modules are slated to be the biggest change in C++ since its inception. The design of modules has several essential goals in mind:

  1. Top-down isolation - The “importer” of a module cannot affect the content of the module being imported. The state of the compiler (preprocessor) in the importing source has no bearing on the processing of the imported code.
  2. Bottom-up isolation - The content of a module does not affect the state of the preprocessor in the importing code.
  3. Lateral isolation - If two modules are imported by the same file, there is no “cross-talk” between them. The ordering of the import statements is insignificant.
  4. Physical encapsulation - Only entities which are explicitly declared as exported by a module will be visible to consumers. Non-exported entities within a module will not affect name lookup in other modules (barring some possible strangeness with ADL. Long story…)
  5. Modular interfaces - The current module design enforces that for any given module, the entire public interface of that module is declared in a single TU called the “module-interface unit” (MIU). The implementation of subsets of the module interface may be defined in different TUs called “partitions.”

To paraphrase, Modules will move us away from copying and pasting header files into every file that includes them, and that has a lot of implications!
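To make the isolation points above concrete, here is a minimal sketch of what textual inclusion allows today. The "header" is pasted inline so the example fits in one file; `widget.h`, `Widget`, and both macros are hypothetical names invented for illustration.

```cpp
#include <cstddef>

// Includer state: a macro that happens to be defined before the "#include".
#define WIDGET_BUFFER_SIZE 16

// --- begin: contents of a hypothetical "widget.h", as if pasted by #include ---
#ifndef WIDGET_BUFFER_SIZE
#define WIDGET_BUFFER_SIZE 64  // the default the header author intended
#endif

struct Widget {
    char buffer[WIDGET_BUFFER_SIZE];
};

// The header also defines a macro that leaks into everything after it.
#define WIDGET_VERSION 3
// --- end: "widget.h" ---

// The includer changed the header's meaning: the buffer is 16 bytes, not 64.
constexpr std::size_t widget_size = sizeof(Widget);

// The header changed the includer's preprocessor state: the macro is visible here.
constexpr int version_seen_by_includer = WIDGET_VERSION;
```

With a named module (declared with `export module`, consumed with `import`), the importer's macro should not reach the module's code, and the module's macros should not reach the importer. That is my reading of the top-down and bottom-up isolation goals quoted above.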

The Good

For my use cases, the main benefit of Modules will be optimizations they enable when working with libraries and containers implemented entirely in headers.

Working with external libraries when writing C++ is often a major headache. To ensure ABI compatibility, you typically have to build libraries from source, which can involve painful compilation errors and very limited documentation to help when things go awry. In response to this, a lot of libraries opt to instead be “header only”, meaning the entire library is written in header files and does not need to be compiled separately. You can just #include it where needed and it will “just work”.

The nlohmann json library is a perfect example of this. This library makes adding json support to any C++ project a breeze. Just download a couple headers and json support is only a #include away. There’s a cost to this, though. The library’s json.hpp file is just under 25k lines of code at the time I am writing this. Certainly we can use the json_fwd.hpp file to forward declare types from the library and limit how many places we include the full 25k line header, but we are probably going to have multiple translation units that want to handle json. Each TU will then need to include the full header, parse it, and compile it. This is not great for compile times.
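The forward-declaration trick can be sketched without the real library. Everything below is hypothetical (the `demo` namespace, `Document`, `field_count`, and the file names in the comments); it only shows the shape of the pattern, not nlohmann's actual headers.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// --- hypothetical "document_fwd.hpp": declares the name only, cheap to include ---
namespace demo {
struct Document;
}

// --- hypothetical "report.hpp": mentions the type by reference only,
//     so the forward declaration is all it needs ---
namespace demo {
std::size_t field_count(const Document& doc);
}

// --- hypothetical "document.hpp": the full (expensive) definition ---
namespace demo {
struct Document {
    std::vector<std::string> keys;
};
}

// --- hypothetical "report.cpp": touches members, so it must see the full definition ---
namespace demo {
std::size_t field_count(const Document& doc) {
    return doc.keys.size();
}
}
```

Files that only pass a `Document` around by reference or pointer get away with the cheap header. Any TU that actually reads or writes the data still pays for the full one, which is the cost described above.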

What if we only had to parse the header once? What if the templated code could be compiled to an abstract syntax tree-like format to be reused by each TU?

Modules offered the promise of providing both of these, drastically speeding up compilation times.

The Bad

Very Limited Compiler Support

Modules were standardized in 2020, but at the time of writing this (June 2024), only one compiler, MSVC, is declared to have full support for them. If you take the plunge and modularize your projects, you are locked into the compilers that support Modules. While I’m certain that other compilers will soon have support as well, it looks like it will be quite a while before the feature matures and works consistently across all of them.

Code Browsing Around Modules is a Work in Progress

As projects become large, a good indexer becomes a godsend. Having the ability to confidently identify usages of functions/variables, step into functions, refactor code, etc. makes a big difference in developer productivity. Using MSVC, I was frequently underwhelmed by IntelliSense’s performance around modularized code. It struggled to recognize symbols defined in modules and to provide autocomplete suggestions, and it sometimes needed to rescan files before it would stop flagging usage of external code as undefined. Maybe it’s a minor complaint, but if I’m going to stop using header files in favor of modules, I still want my IDE to function as normal.

It’s Difficult to Find Answers to Questions About Modules

Because Modules are very new, not many people are using them, and there is not a wealth of questions and answers about them online. When I ran into an issue or something I didn’t understand about them, I had a lot of trouble finding help. I expect this to improve over time, but right now it’s rough out there.

For posterity, I found these sources to be the most helpful in my journey to understand and use modules:

A three part series on Modules by vector of bools

A practical tutorial on how to actually get started with them by Microsoft

Can You Trust Them to Work?

I was most intrigued by Modules’ ability to optimize the compilation of templated code by not requiring each TU to fully recompile the generic functions. As such, I immediately set out to test their capabilities and see if they really could handle the complexities of generic programming. Unfortunately, one of the first things I tested, explicit template specialization, resulted in link errors or, when it did manage to build, incorrect behavior. I created a bug report, but I’m not hopeful it will be addressed any time soon.
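For readers unfamiliar with the feature, here is a minimal header-era sketch of explicit template specialization. This is not the code from the bug report; `Kind` and `classify` are hypothetical names, and the example only shows the behavior one would expect a modularized version to preserve.

```cpp
// Hypothetical categories used only for this illustration.
enum class Kind { Generic, Integer, Text };

// Primary template: the fallback used for any type without a specialization.
template <typename T>
constexpr Kind classify() {
    return Kind::Generic;
}

// Explicit (full) specializations: these must be declared before the first
// use that would otherwise instantiate the primary template for that type.
template <>
constexpr Kind classify<int>() {
    return Kind::Integer;
}

template <>
constexpr Kind classify<const char*>() {
    return Kind::Text;
}

// Expected selection: the specialization wins whenever one exists.
constexpr Kind int_kind = classify<int>();        // Kind::Integer
constexpr Kind double_kind = classify<double>();  // Kind::Generic (no specialization)
```

In a modularized version, I would expect the primary template and its specializations to be exported from the module interface, with importers seeing exactly the same selection. The silent failure to worry about is an importer quietly getting the primary template where a specialization exists.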

This gives me great pause. If this is broken in Modules, what else is? How can I write any templated code and have confidence it will work?

I don’t have an answer to these questions. The times the explicit specialization code did manage to build but exhibited incorrect behavior at runtime really scare me. This was a toy example, but if this were to happen in a much more complicated implementation, it would be a nightmare to diagnose and debug.

Conclusion

To use Modules right now you need to be comfortable being on the absolute bleeding edge. If you are trying to make something that will be maintainable and correct, I think it’s too early to use them. I still have high hopes for them; I just don’t think we are there yet. Maybe around the time C++26 is implemented we’ll start to see more mature tooling emerge around Modules? 🤷