iDeviceTech

 

A Mach-O System

In the '80s, a team at Carnegie Mellon University began working on a next-generation UNIX system. Traditional UNIX systems, like the 4.2BSD system that they were using at the time, had a single process for the kernel. Everything that the kernel was responsible for was part of a single binary, with no protection between the various parts. The goal of Mach was to separate out all of these parts and provide a mechanism for joining them together.

A full discussion of this philosophy and its advantages and disadvantages would take up much more space than I have for this article, so I'll simplify things somewhat. The microkernel approach, taken by Mach, was not a great success. In the end, they had a version of the BSD kernel running as a process on top of their microkernel and calling down to Mach instead of executing privileged instructions.

This had some advantages. One was that every user on a powerful system could have his or her own, completely isolated, BSD kernel and userland. Effectively, it was an early form of paravirtualization.

It also had some major disadvantages. The biggest of these was speed. On a modern system, there are some trade-offs between speed and scalability. A modern laptop has two to four cores, and a desktop may have 16. Cheap servers are starting to have large core counts, as high-end servers have had for a decade or so. Back in the '80s, however, when Mach was created, single-processor machines were the norm. On a modern SMP system, if you split your code up into separate processes, then you get some overhead from their communication and some speed increase from running them in parallel.
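To make that trade-off concrete, here is a minimal sketch of a toy cost model in Python. The function name `run_time` and every number in it are assumptions invented for illustration, not measurements of any real system: the work is split evenly, and each message adds a fixed cost.

```python
def run_time(work, n_parts, cores, msg_count, msg_cost):
    """Estimated wall-clock time for `work` units split into `n_parts` processes.

    Toy model (an assumption, not a measurement): the parts run in
    parallel on up to `cores` processors, and every message exchanged
    between them adds a fixed cost.
    """
    parallelism = min(n_parts, cores)
    return work / parallelism + msg_count * msg_cost


# One monolithic kernel on one processor: no messages at all.
monolithic = run_time(work=100.0, n_parts=1, cores=1, msg_count=0, msg_cost=0.5)

# The same work split into four processes, still on one processor:
# nothing runs in parallel, so only the message overhead is added.
split_1cpu = run_time(work=100.0, n_parts=4, cores=1, msg_count=40, msg_cost=0.5)

# The same split on a four-core machine: the overhead is still paid,
# but the parallel speedup outweighs it in this made-up case.
split_4cpu = run_time(work=100.0, n_parts=4, cores=4, msg_count=40, msg_cost=0.5)

print(monolithic, split_1cpu, split_4cpu)  # 100.0 120.0 45.0
```

With these made-up figures, splitting the work only pays off once there are several processors to run the pieces on. On a single processor, the split version is strictly slower.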

Mach just got the slowdown. To make matters worse, it got a lot of this slowdown. In Mach, there was only one form of interprocess communication: message passing. This is a very clean abstraction, but Mach managed to pick exactly the wrong way of implementing it. Every Mach message send required checking the sending and receiving port access rights and some complex memory mapping operations, which resulted in a Mach message send being much slower than a system call. On Mach-based UNIX systems, a UNIX system call meant sending a message from the userspace process to the BSD process, which may then send more messages to other processes, and then waiting for the reply.
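The shape of that overhead can be sketched with a small Python model. This is not Mach's real API: the classes `Port`, `Task`, and `ToyMachKernel`, and the "disk-server" process, are hypothetical names made up for the example. The sketch only shows that every send and receive involves a rights check, and that one emulated UNIX call turns into several messages.

```python
from collections import deque

SEND, RECEIVE = "send", "receive"


class Port:
    """A toy message queue standing in for a kernel-managed port."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()


class Task:
    """A toy process that holds rights to named ports."""

    def __init__(self, name):
        self.name = name
        self.rights = set()  # (port name, right) pairs

    def grant(self, port, right):
        self.rights.add((port.name, right))


class ToyMachKernel:
    """Checks rights on every operation and counts the messages sent."""

    def __init__(self):
        self.messages_sent = 0

    def send(self, sender, port, payload):
        if (port.name, SEND) not in sender.rights:
            raise PermissionError(f"{sender.name} has no send right for {port.name}")
        port.queue.append(payload)
        self.messages_sent += 1

    def receive(self, receiver, port):
        if (port.name, RECEIVE) not in receiver.rights:
            raise PermissionError(f"{receiver.name} has no receive right for {port.name}")
        return port.queue.popleft()


kernel = ToyMachKernel()
app, bsd, disk = Task("app"), Task("bsd-server"), Task("disk-server")
bsd_port, disk_port = Port("bsd"), Port("disk")
app_reply, bsd_reply = Port("app-reply"), Port("bsd-reply")

for task, port, right in [
    (app, bsd_port, SEND), (app, app_reply, RECEIVE),
    (bsd, bsd_port, RECEIVE), (bsd, disk_port, SEND),
    (bsd, bsd_reply, RECEIVE), (bsd, app_reply, SEND),
    (disk, disk_port, RECEIVE), (disk, bsd_reply, SEND),
]:
    task.grant(port, right)


def emulated_read():
    """One UNIX-style read, expressed purely as messages."""
    kernel.send(app, bsd_port, "read")             # app -> BSD process
    request = kernel.receive(bsd, bsd_port)
    kernel.send(bsd, disk_port, "fetch for " + request)  # BSD -> another process
    kernel.receive(disk, disk_port)
    kernel.send(disk, bsd_reply, "block data")     # reply to the BSD process
    data = kernel.receive(bsd, bsd_reply)
    kernel.send(bsd, app_reply, data)              # reply to the app
    return kernel.receive(app, app_reply)


result = emulated_read()
print(result, kernel.messages_sent)  # block data 4
```

In this model a single read costs four message sends, each with its own rights check, before the application sees any data. A conventional kernel would handle the same request with one trap.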

The XNU kernel uses Mach at the core. This is one of the things that everybody knows about OS X, but it's quite misleading. Unlike most other Mach-based operating systems, such as GNU HURD (which, contrary to popular belief, does exist), XNU implements UNIX system calls in the same way as a BSD system. You set some registers and issue a system call instruction (or an interrupt) and have the kernel code called directly from the trap handler. You can see this quite easily in Activity Monitor. The vim process that I am using to write this has, so far, issued 64K UNIX system calls, but only 88 Mach system calls, and has sent and received fewer than 200 Mach messages.
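The direct path can be pictured with another rough Python sketch. The class `ToyTrapKernel`, its two tables, the call numbers, and the return values are all hypothetical and do not reflect XNU's real numbering or handlers. The point is only that the trap handler looks up a function and calls it, with no message in between, and that the two kinds of call can be counted separately, much as Activity Monitor reports them.

```python
class ToyTrapKernel:
    """Toy model of a trap handler that dispatches straight to kernel code."""

    def __init__(self):
        self.unix_calls = 0
        self.mach_calls = 0
        # Hypothetical tables mapping a call number to a kernel function.
        self.unix_table = {1: self.sys_getpid}
        self.mach_table = {1: self.trap_task_self}

    def sys_getpid(self):
        return 42  # stand-in value for the caller's process ID

    def trap_task_self(self):
        return "task-port"  # stand-in for a port name

    def trap(self, kind, number, *args):
        """Look up the handler for this call and invoke it directly."""
        if kind == "unix":
            self.unix_calls += 1
            handler = self.unix_table[number]
        elif kind == "mach":
            self.mach_calls += 1
            handler = self.mach_table[number]
        else:
            raise ValueError(f"unknown call class: {kind}")
        return handler(*args)


toy = ToyTrapKernel()
for _ in range(5):
    pid = toy.trap("unix", 1)   # a typical program makes mostly UNIX calls
port = toy.trap("mach", 1)      # and only the occasional Mach call

print(pid, port, toy.unix_calls, toy.mach_calls)  # 42 task-port 5 1
```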

So what does XNU use Mach for? The simplest answer is as an abstraction layer, along the lines of the Windows HAL, but this doesn't tell the entire story. The closest analog in a modern system is the nanokernel in the Symbian microkernel. Mach is responsible for managing the CPU and memory and for providing some basic abstractions to the top of the kernel, such as tasks and threads.

The BSD part of the kernel calls down to the Mach part, but it can call these functions directly, without requiring a Mach message. This means that most of the disadvantages of Mach are eliminated. So are Mach messages still used? Wave the mouse over an application while inspecting it in Activity Monitor, and you get an immediate answer: Yes.

Mach messages are used for a lot of things in OS X, not least of which is the delivery of events from the user. Other processes are also free to use Mach-level IPC. This has some advantages. For example, it's easy to check who the other party in a Mach communication is, which is useful when implementing things like the Keychain, which relies on the ability to permit communication based on the program making the request.
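As a rough illustration of why that check is useful, here is a Python sketch of a service that releases a secret only to approved programs. It is not the Keychain's real interface: `Program`, `Message`, `ToyIPCKernel`, `SecretStore`, and the `com.example` identifiers are hypothetical. The assumption being modeled is that the kernel, not the sender, attaches the sender's identity to each message.

```python
class Program:
    """A toy client; its identity is known to the kernel."""

    def __init__(self, identity):
        self.identity = identity


class Message:
    """A request as the service sees it."""

    def __init__(self, payload, sender_id):
        self.payload = payload
        self.sender_id = sender_id  # stamped by the kernel, not by the sender


class SecretStore:
    """A toy service that checks the sender before answering."""

    def __init__(self):
        self.secrets = {}
        self.allowed = {}  # item name -> set of permitted program identities

    def add(self, item, secret, allowed_programs):
        self.secrets[item] = secret
        self.allowed[item] = set(allowed_programs)

    def handle(self, message):
        item = message.payload
        if message.sender_id in self.allowed.get(item, set()):
            return self.secrets[item]
        return None  # refuse: this program is not on the item's list


class ToyIPCKernel:
    """Delivers messages and records who really sent them."""

    def deliver(self, sender, service, payload):
        # The sender supplies only the payload; it cannot choose sender_id.
        return service.handle(Message(payload, sender.identity))


ipc = ToyIPCKernel()
store = SecretStore()
store.add("mail-password", "example-secret", {"com.example.mail"})

mail = Program("com.example.mail")
other = Program("com.example.other")

granted = ipc.deliver(mail, store, "mail-password")
refused = ipc.deliver(other, store, "mail-password")
print(granted, refused)  # example-secret None
```

Because the identity travels with the message rather than inside the payload, the service does not have to trust anything the requesting program says about itself.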