The Power of Namespaces in Computing

In recent years, containers have become one of the most powerful tools in computing. Containers are essentially isolated areas of code that perceive the outside world as whatever they need it to be. Technologies like Docker have revolutionized software deployment and usage, allowing each containerized application to define its own environment. On Linux, Flatpak is gaining popularity because it addresses the issues of fragmentation and lack of standardization by allowing each application to specify its required system environment. Using relative paths instead of absolute paths is often recommended for making programs more portable. Similarly, we solved the problem of running out of IP addresses by using Network Address Translation (NAT), which splits one IP address into many. These are all examples of the fundamental concept of namespaces. ¹

At a basic level, a namespace defines the context in which a program operates. It is the “world” that a program sees. For example, in C++, when you use using namespace std;, you can access all functions starting with std:: without the prefix. Another example is the file system: after navigating to a directory with cd folder, you can access files within it without specifying the full path. Namespaces provide more than convenience; they enable multiple instances of the same program to run without resource conflicts.

Containers heavily rely on namespaces to achieve their isolation. Each container runs in its own set of namespaces, which isolate the container’s view of the operating system. This includes process IDs, hostnames, user IDs, file systems, and network interfaces. For example, the PID namespace ensures that processes inside a container cannot see processes outside of it, and the network namespace allows containers to have their own network interfaces and IP addresses. This is a godsend for many of my (and regrettably other people’s) spaghetti code. These programs use obscure techniques that sometimes rely on libraries that can only be installed on the system level, or that they hard code the path of their dependencies using absolute paths. Sometimes they require dependencies are conflict with other dependencies. These would be very difficult problems to solve otherwise, but by leveraging namespaces, we can use containers that create a self-contained environment that behaves as if it is running on a separate machine, while actually sharing the same underlying kernel with other containers and the host system. Containers are functionally just a virtual machine without the runtime cost of a virtual machine.

A key feature of namespaces is that they are managed by the system, not by the program. For instance, you don’t need to know the current directory’s full path to access a file within it-the operating system handles that. This abstraction simplifies programming and reduces errors. For example, in a well-structured build system, you can override the compiler by setting the CC environment variable. However, using a namespace like chroot allows you to override the compiler for any Makefile without modifying the code.

An interesting application of namespaces is NAT. Recently, I created a Linux network namespace to use a VPN for a specific application while keeping other apps unaffected. This was easily achieved with NAT: I connected the two network namespaces, set a default route inside the namespace, and let NAT handle the rest. Essentially, connections from within the namespace go through two layers of NAT, but routing remains transparent to both the client within the namespace and the server on the internet. No changes are required on either end.

With IPv6, this setup would be less trivial. Internet service providers usually provide a large /98 block, offering many addresses. However, without NAT, managing network namespaces becomes complex. The router would need to assign multiple IP addresses to my computer, but the exact number needed is uncertain. As more layers of network namespaces are added, you would run out of IP addresses long before exhausting other resources. An alternative would be a protocol to request additional IP addresses from upstream, but such a protocol, if it exists, would be less efficient than NAT. It would require proper configuration of all devices involved. In contrast, NAT is transparent to all actors in the chain, and the router doesn’t need to manage the virtual networks inside my computer, even if there are millions of IP addresses.

This is the power of namespaces. They allow each entity to have its own reality while seamlessly communicating with others.

I wrote this blog post, and then asked chatgpt to edit it for me. ↩