 Published on ONLamp.com (http://www.onlamp.com/)


An Interview with OpenBSD's Marc Espie

by Federico Biancuzzi
03/18/2004

Editor's Note: As with FreeBSD's ports and NetBSD's packages, OpenBSD's ports system is a compelling reason to use the system. Its designers and maintainers are, too often, unsung heroes. That's one reason Federico Biancuzzi sat down to interview OpenBSD's Marc Espie. Along the way, they discussed security, licensing, and future plans for the system.

Federico Biancuzzi: Can you give us a short introduction to yourself and your role as open source developer?

Marc Espie: Like most OpenBSD developers, I am very interested in the stability and robustness and security of the whole system. Which means that I do a lot of development outside of my own area: see bug, fix bug. It's as simple as that.

That said, as far as OpenBSD is concerned, I'm the main person responsible for a few main areas:

And I'm one of the guys who works on gcc and binutils on a continuing basis.

FB: Three months after the release of OpenBSD 3.4, the errata page already contains 11 patches. Most of them fix kernel bugs. Do you think that in the future we could see more bugs in the kernel than in userland?

ME: I don't have any opinion. It's hard to predict the future, especially where bugs are concerned.

FB: Is this because of a lack of kernel developers? This could explain why someone claimed on full-disclosure that searching for "XXX" and/or "FIXME" strings in obsd kernel is a guaranteed way to locate a ring 0 vulnerability.

ME: This is a publicity stunt. Those kinds of childish comments, made by people who don't even give their full name, never cease to amuse me.

If it were that easy to find and exploit vulnerabilities in OpenBSD, it would happen more often.

We could use more developers. There are things we currently do not do because we don't have the manpower for them. There's a scarce pool of talented developers, though. If those wannabe-hackers made a real try at improving our kernel source, that would be incredibly more useful than jibes on bugtraq. Of course, writing good kernel code (or userland code, for that matter) is much harder...


FB: A lot of work has been done removing unsafe string functions. What other activities are part of proactive source-auditing?

ME: Evangelism. If we remove unsafe string functions and other projects add them back, it's like Sisyphus' story. We have to rewrite the same patches again and again when we import some external software.

This is very important and central to OpenBSD: we deal with Unix programming. We're not in this to invent our own little interfaces that NOBODY else in the world has. We're in this to do Safe Unix programming, and introduce a few new techniques when they are necessary.

Evolve the OS, not Revolutionize it. This is in violent contrast to Linux.

We have had a lot of success explaining the issues and getting a lot of people to switch from strcpy/strcat to strlcpy/strlcat.
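To make the difference concrete, here is a minimal sketch of the truncation check that strlcpy-style copying allows. Because strlcpy is not available in every C library, the sketch uses a hypothetical stand-in, sketch_strlcpy, written to follow the semantics documented in strlcpy(3); copy_checked is also an illustrative name, not something from OpenBSD.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in with the documented strlcpy(3) semantics:
 * copy at most dstsize - 1 bytes, always NUL-terminate when
 * dstsize > 0, and return strlen(src) so truncation is detectable. */
size_t sketch_strlcpy(char *dst, const char *src, size_t dstsize)
{
    size_t srclen = strlen(src);

    if (dstsize > 0) {
        size_t n = (srclen < dstsize - 1) ? srclen : dstsize - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}

/* Illustrative wrapper: returns 0 on success, -1 if src did not fit
 * and the copy was truncated. */
int copy_checked(char *dst, size_t dstsize, const char *src)
{
    return sketch_strlcpy(dst, src, dstsize) >= dstsize ? -1 : 0;
}
```

With strcpy, an oversized source silently overruns the destination. Here the destination always stays NUL-terminated, and a single comparison of the return value against the buffer size tells the caller that data was lost.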

Weirdly enough, the Linux people are about the only major group of people that has constantly stayed deaf to these arguments. The chief opponent to strlcpy in glibc is most certainly Ulrich Drepper, who argues that good programmers don't need strlcpy, since they don't make mistakes while copying strings. This is a very mystifying point of view, since bugtraq daily proves that a lot of Linux and free software programmers are not that bright, and need all the help they can get.

(Considering the shining, flaming personality of Drepper, and the fact that he is always Right, this is not that surprising, though).

That said, the source audit is work never finished. It's more a question of process than of a specific bug being hunted. The situation goes like this:

FB: Could you talk about the new static bounds checker?

ME: The static bounds checker is mostly the work of Anil Madhavapeddy.

It was an idea of mine originally that he took much further than my idle thinking. My original remark was that code, such as:

char buffer[BUFSIZE];
...
read(fd, buffer, sizeof(buffer));

Often becomes:

char *buffer = malloc(BUFSIZE);
...
read(fd, buffer, sizeof(buffer));

At some step during development, this has become code that is, of course, wrong, and can often stay uncovered for months or years (if it's error-recovery code that doesn't get invoked very often, for instance).
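The reason the second version is wrong can be checked directly: sizeof applied to an array measures the array, while sizeof applied to a pointer measures only the pointer. The following is a small sketch; the function names and the value of BUFSIZE are assumptions chosen for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

#define BUFSIZE 512   /* illustrative value, not from the interview */

/* With an array, sizeof yields the full buffer size in bytes. */
size_t size_seen_for_array(void)
{
    char buffer[BUFSIZE];

    return sizeof(buffer);
}

/* After the switch to malloc, the same expression yields the size of
 * the pointer itself (typically 4 or 8 bytes), not BUFSIZE. */
size_t size_seen_for_pointer(void)
{
    char *buffer = malloc(BUFSIZE);
    size_t n = sizeof(buffer);

    free(buffer);
    return n;
}
```

In the read() example this particular mistake makes the call read only a pointer's worth of bytes, which is why it can go unnoticed for so long. The same confusion in the other direction, or with a hand-written size, is what produces overflows.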

Anil took it one step further and introduced an extension attribute to gcc: bounded, that can tie two function parameters, so that you can say, "Here is the buffer and the corresponding size, try to check that it fits."

With a few small changes to gcc, and with declaring that read is such a function, gcc is now able to detect erroneous code, such as:

char buffer[BUFSIZE];
...
read(fd, buffer, BUFSIZE * 2);

Along the same lines, we also added a sentinel checker that can verify that varargs functions such as execl(cmd, arg1, arg2, NULL) are indeed null-terminated (when the arguments are constant), a non-null checker that verifies that a parameter cannot be a NULL pointer (to allow distinction between functions such as err(3), which takes a format string that can be NULL, and the standard library printf(3), which takes a format string that can NOT be NULL), and some other extra static checks.

The idea is that lots of interfaces in the systems have some extra conditions that are easy to enforce by the compiler, provided we add the correct annotations.

This is just an extension of the default behavior of gcc, which already does some similar magic for printf and scanf.
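As a rough illustration of such annotations, current GCC and Clang accept attributes named format, sentinel, and nonnull. Whether they behave exactly like the OpenBSD checkers described in this interview is an assumption, and the OpenBSD-specific bounded attribute is not shown. The SKETCH_* macros and both functions below are hypothetical names.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#if defined(__GNUC__)
#define SKETCH_PRINTF_LIKE(fmt, first) __attribute__((format(printf, fmt, first)))
#define SKETCH_SENTINEL                __attribute__((sentinel))
#define SKETCH_NONNULL(...)            __attribute__((nonnull(__VA_ARGS__)))
#else   /* other compilers: annotations vanish, code still builds */
#define SKETCH_PRINTF_LIKE(fmt, first)
#define SKETCH_SENTINEL
#define SKETCH_NONNULL(...)
#endif

/* The annotations live on the prototypes, where callers can see them. */
int count_args(const char *first, ...) SKETCH_SENTINEL;
int log_line(char *out, size_t outsize, const char *fmt, ...)
    SKETCH_PRINTF_LIKE(3, 4) SKETCH_NONNULL(1, 3);

/* Counts the strings that precede the terminating NULL, in the style
 * of execl(). A call without the final NULL can be flagged at
 * compile time thanks to the sentinel annotation. */
int count_args(const char *first, ...)
{
    va_list ap;
    const char *s;
    int n = 0;

    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, const char *))
        n++;
    va_end(ap);
    return n;
}

/* printf-style formatting into a caller-supplied buffer. The format
 * annotation lets the compiler check arguments against fmt, and the
 * nonnull annotation lets it flag a literal NULL for out or fmt. */
int log_line(char *out, size_t outsize, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(out, outsize, fmt, ap);
    va_end(ap);
    return n;
}
```

The functions behave the same with or without the annotations. The attributes only give the compiler enough information to warn about calls that break the interface's extra conditions.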

FB: Could you talk about ProPolice and random loading libraries?

ME: ProPolice is a gcc extension developed by Hiroaki Etoh, from IBM, based on older concepts such as StackGuard. ProPolice makes several advances compared to StackGuard:

So then, what is ProPolice, and what does it do?

When you have a buffer allocated on the stack, pirates may try to overflow that buffer to change values beyond it and modify the behavior of the running program.

During compilation, ProPolice detects such buffers, and adds code that inserts a so-called "canary" beyond the end of the buffer, and checks that the canary is still alive and well at various moments. (The canary image is a mining analogy: in olden times, miners used to go underground with a small bird in a cage. As long as the bird was singing, the air was not foul.)

Thus, ProPolice detects buffer overflows on the stack, and terminates the program early. It is useful as a prevention device, because one can get early warning of potential buffer overflows in programs, and fix the bug at the source, and also as a warning system that someone might be trying to enter your system.
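The canary idea can be simulated in plain C. This is only a sketch of the concept, not ProPolice's actual mechanism: the real canary is inserted by the compiler into the stack frame, and a failed check terminates the program. The struct layout, the fixed canary value, and the function name below are all assumptions made for illustration.

```c
#include <stddef.h>

/* Fixed value for illustration only; a real canary would not be a
 * compile-time constant an attacker could guess. */
#define SKETCH_CANARY 0x5ca1ab1eUL

/* Stand-in for a stack frame: a buffer followed by a canary. */
struct sketch_frame {
    char buffer[16];
    unsigned long canary;
};

/* Copies len bytes of input over the frame without checking them
 * against sizeof(buffer), as a buggy copy would, then inspects the
 * canary. Returns 0 if the canary survived, -1 if it was clobbered. */
int copy_and_check(const char *input, size_t len)
{
    struct sketch_frame frame;
    unsigned char *raw = (unsigned char *)&frame;
    size_t i;

    frame.canary = SKETCH_CANARY;       /* "prologue": plant the canary */

    /* Clamp to the struct so the simulation itself stays in bounds. */
    if (len > sizeof(frame))
        len = sizeof(frame);
    for (i = 0; i < len; i++)
        raw[i] = (unsigned char)input[i];

    /* "epilogue": a changed canary means the buffer was overrun. */
    return frame.canary == SKETCH_CANARY ? 0 : -1;
}
```

An input that fits in the 16-byte buffer leaves the canary intact. Anything longer spills into the canary first, so the overflow is noticed before whatever lies beyond it would be used.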

Integrating ProPolice in OpenBSD has been hard work. ProPolice has found tons of bugs in various programs that shipped with the system. It's also been the first real-scale test of ProPolice itself, with a lot of hard work from Hiroaki Etoh and Miod Vallat (and Peter Valchev and Christian Weisgerber...). ProPolice itself modifies gcc a wee little bit. But, like most programs of its size, gcc itself is buggy, partly due to its gigantic design that is not quite sane in places. In a typical release of gcc, you don't see the bugs, because the corresponding code paths are never taken. Add ProPolice, and suddenly you're sending gcc through some dark venues that have seen less attention, and all of a sudden you are fixing actual, genuine bugs in gcc.

Now, if you pay attention, you'll realize that doing the same work for gcc 3.x [for] ProPolice is going to be necessary. Hopefully, at some point in the future, we'll get enough attention from the Free Software Foundation that ProPolice is going to be a standard part of gcc, and hence become as well or as little tested as gcc proper...

FB: What is the plan for gcc3 introduction?

ME: gcc3 sets a big challenge, because it's sooo much slower than 2.95, compile-time wise. This is a big, controversial point of contention with the Free Software Foundation. Basically, gcc3 puts a lot of nails in the coffins of elderly, slower architectures. One possible route (like NetBSD) would be to cross-compile those architectures, which will help kill them. A full build is a very good test suite for kernel behavior -- more bugs in pmap have been found and fixed by Theo de Raadt, Art Grabowski, Miod Vallat, and similar-minded people by making a release on a slow machine than by letting the machine just sit there and act as a firewall with a cross-compiled system. Another route is to use gcc3 on architectures that really require it, or don't mind the slowdown, and keep gcc 2.95 for other architectures.

There is some faint hope that maybe gcc 3.4 might almost be as fast as gcc 2.95 is...

That said, all of the source compiles with gcc3; arches under development, such as arm and x86-64, are based upon gcc3; and I don't know yet whether sparc64 will ship a gcc2 release or a gcc3 release.

As usual, the project goes slowly and carefully.

Enough issues were fixed so that you could compile a whole system and run it with gcc3. That's what my laptop is running, in fact.

FB: Could you explain how W^X works?

ME: In simple words, if you can write stuff into memory, and then tell the processor, "Go execute that," then you have the basis for a security hole. You just need a buffer overflow or something similar to be able to write arbitrary code in memory, and go execute it.

W^X is, basically, being very selective about what areas get written to, and what areas are marked as executable. Make sure those areas are, as far as possible, distinct.

Traditional Unix already says that program text should not be writable. W^X involves changes in various areas of the system, most notably the kernel, and the dynamic loader, so that other areas are NOT left writable and executable at the same time.

For instance:

In order to implement this, it has been necessary to have program sections that differentiate between "normal program text" and "relocation area" in a very fine way. This has driven the switch of i386 a.out to ELF, because ELF allows for this kind of distinction, and a.out does not.

On many modern processors, one can directly mark memory as writable or executable.

Some other less-gifted processors (with a history of design bugs that plague them to this day), like Intel's family, don't have this distinction. That's to be expected: this distinction is only useful for security, and security is not a selling argument for Intel. The OpenBSD kernel then uses a hack that allows memory over a given boundary to be effectively marked as non-executable, thus allowing the implementation of W^X. Neat trick, that.
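From userland, the W^X rule can be sketched with the standard mmap(2) and mprotect(2) calls: a region is filled while it is writable but not executable, then switched to executable but not writable. This is only an illustration of the policy, not the kernel or dynamic-loader changes described above; violates_wx and load_then_seal are hypothetical names.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANON on glibc; harmless elsewhere */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Returns nonzero if a protection mask breaks the W^X rule, that is,
 * if it asks for memory that is writable and executable at once. */
int violates_wx(int prot)
{
    return (prot & PROT_WRITE) && (prot & PROT_EXEC);
}

/* Copies len bytes (len must be nonzero) into a fresh mapping while
 * it is writable, then flips the mapping to read+execute. The mapping
 * is never writable and executable at the same moment.
 * Returns the mapping, or NULL on failure. */
void *load_then_seal(const void *code, size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANON, -1, 0);

    if (p == MAP_FAILED)
        return NULL;
    memcpy(p, code, len);
    if (mprotect(p, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(p, len);
        return NULL;
    }
    return p;
}
```

A buffer overflow into such a region either hits memory that cannot be written or lands in memory that cannot be executed, which is the property W^X is after.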

FB: Someone from the PaX team asked for credit on your W^X system. What happened?

ME: Independent invention. Nobody in OpenBSD was even aware of PaX until *after* W^X happened. There are even substantial differences that show this.

FB: Systrace has been included since the 3.2 release, but it doesn't seem to be used anywhere. Only since 3.4 has there been an option to protect ports building. Why?

ME: There are a few reasons for that. First, Systrace was developed by Niels Provos, who is no longer developing code for OpenBSD due to disagreements with our project leader, Theo de Raadt. Pity.

Second, Systrace has taken a bit of time to mature. There were a few issues with it that needed to be solved. Then, it's hard to have genuine example uses of Systrace in a "generic" system.

OpenBSD more or less provides Systrace as part of a toolbox that you can use to secure the system. Also, Systrace can be used to change the security model of Unix in a rather cataclysmic way. And that we won't do. We still run normal Unix, and changing the semantics of Unix privileges in such a fundamental way may have wide-ranging and unpredictable consequences.

Finally, ports-building under Systrace has been a long endeavor that has taken several people dozens of days to make work (most notably Nikolay Sturm). It's not surprising it is only now robust enough to actually be usable.

One small detail you forgot to mention is that a *lot* of system users and groups have appeared in OpenBSD over the last few releases... If you look very carefully, there are very few programs that are still setuid root in the system. This might look like a quiet change, but this is more important, security-wise, than Systrace.

FB: FreeBSD 5.x and NetBSD-current have introduced new code to handle threads. What is the plan for OpenBSD?

ME: 3.4 has switched from libc_r to libpthread. Most of the effort is currently focused on userland. Thread support in 3.4 is MUCH better than it was in 3.3; a lot of programs that need threads (MySQL, KDE, etc.) work MUCH better than they used to. This is mostly due to Marco S. Hyman's work.

FB: What is a priority for future releases and what's going on in 3.4-current branch?

ME: The priority for future releases is to get stuff more stable, more usable, and more secure ... as usual.

Beyond that, gee.... My personal priority is to finish replacing the old pkg_* tools with a new set of improved pkg_* tools. The switch has already occurred in OpenBSD post-3.4, and so far, people are happy with the new tools. The new tools have a simpler design, are easier to debug, are more powerful, and will permit a lot of things that were simply not possible to implement under a reasonable timeframe with the old tools.

Already, the new tools do more checks for consistency, have better behavior with respect to dependencies, and can extract stuff in place without the need for a temporary area.

Hopefully, there will be on-the-fly package signature checking, smart policies for adding packages through ftp or scp, and system updates before 3.5 is released. Some of these issues are hard to tackle, so don't hold your breath. It may take some time to be ready for public usage.

Speaking for myself, this is part of a larger plan to improve our ports and packages. I've been rewriting our infrastructure from the ground up, transforming bsd.port.mk into something manageable, efficient, sensible, and useful. Adding features such as fake building (ensure packages and ports install do the same thing), flavors and multi-packages (uniform encoding of variations in building), and, lately, full documentation for bsd.port.mk (there's a bsd.port.mk(5) manpage) and fast computation of dependencies (new in 3.4!!!) are just the tip of the iceberg.

The next natural step was to replace the pkg_* tools, as the old ones were hindering progress. This plan is advancing nicely.

FB: Are they developed with Perl?

ME: About the new pkg_* tools, yes, they are in Perl. And they're in the tree now, just grab a copy of current, and you'll be able to see them.

FB: Why does OpenBSD keep a branch tag also for ports? Don't you think that most people would prefer FreeBSD-like "always" updated ports?

ME: Two reasons:

The result is that, for each release, a very large subset of our ports collection builds correctly on a large array of machines. And we keep the infrastructure simple.

It's as simple as that -- we focus on quality and robustness. And we don't have the resources to keep it working over long periods.

Keeping important patches alive for past releases already takes enough resources as it is. We don't have enough competent people that would simply build the ports tree over several releases, analyze errors, and ensure that the ports tree builds correctly ... realize that there are over 2,000 packages built from the ports tree. At any one point, there are normally less than 10 packages broken on i386. A little bit more for PowerPC and sparc64, but not that many. And still a bit more on other architectures.

There is a core team (in an informal sense, there is no core in OpenBSD, just people that you notice because they do most of the work) responsible for handling most packages and keeping them working. You can see their work in commits, and that's at most two handfuls of people at any given point. Which amounts to each person being roughly responsible for tracking problems in 200 ports or more. Yes, we have a much larger number of maintainers. But most of these people don't have the resources nor the will to make sure their ports work and keep working on a wide set of architectures. If we could keep the number down to 10 for all architectures, then we'd be very happy already.

Several well-wishers have come forward in the past, and offered to do that.

Nothing but words. And we work with hard facts.

In comparison to FreeBSD, how many FreeBSD ports work at any given time? How many are broken? Looking at the FreeBSD mailing lists, I see a lot of lists of broken ports they are going to remove if no one steps forward and adopts them. And how many architectures does FreeBSD truly support?

I don't want to disparage their efforts, but I think our relative situations are very comparable. They chose to have an eternal ports collection and they pay the price in complexity and stability and manpower.

We have made a different compromise, and I much prefer our results: lots of reliable packages all the time.

FB: Is there any plan to provide binary updates, maybe using the ports system?

ME: We're thinking about it. It's not going to happen yet.

FB: Binary compatibility is strategic for a niche OS. It seems that writing compatibility would be a good way to discover what features you don't yet support. Of course, OpenBSD is a mature OS, but have you found any improvements that needed to be made?

ME: Binary compatibility is necessary to run a select few programs that software houses have not seen fit to port to OpenBSD. For instance, I run maple for Linux, adom for Linux, acroread (though gv works fine), and when I'm sick, Linux Java machines. I used to run Netscape, but we have a working Mozilla AND a working Konqueror now. I used to run RealPlayer, but MPlayer now does a fine job of it.

As far as I'm concerned, improvements that need to be made are more along the lines of actual standard compliance. You know, the stuff that is called POSIX, or ISO C, or Single Unix...

FB: With the many different machines supported by OpenBSD, how was the binary distribution available on the FTP server created? Did you have all machines available and do all the builds yourself?

ME: The binary distributions were made by following release(8) and ports(7) (see BULK PACKAGE BUILDING) and loads of careful testing. A few people did the builds for this release, mostly using their own machines, often donated to them by people friendly to the project, or bought using the project's money. I personally didn't have time this round.

Obviously, there are some huge variations in build-time between a modern i386, which can rebuild src in under one hour, and an elderly Sun3 that needs a week to achieve the same result.

FB: It seems that there will be two new supported platforms: Pegasos and cats. Isn't this a dispersion of resources? Do you have enough manpower to handle this and still improve the most-used platforms (i386 and macppc and sparc)?

ME: OpenBSD is still improving, so I guess having two more platforms doesn't detract.

FB: What is the plan for SMP support?

ME: Some people are working on it. As usual with OpenBSD, we don't make announcements until things are ready. On the performance front, you can see interesting advances in Asymmetric Multiprocessing, more specifically the growing support for crypto-accelerators.

FB: Any plan to develop something like user-level Linux or virtualization extensions for the FreeBSD kernel? Or maybe add support for xen? (xen is a fast and free PC virtualizer, like VMware.)

ME: Not to my knowledge. But OpenBSD is a lot of volunteers' work, so I can't know what everybody is working on.

FB: A lot of attention is given to fixing licenses and reducing the amount of GPLd software in the base system. Is there any chance for big-and-bad software like sendmail to be replaced? And gcc?

ME: Yes. This has already happened, in fact. We have replaced some fairly major pieces of non-BSD software with BSD stuff. To wit:

We might get traditional nroff at some point, and some other stuff, who knows?

FB: Another license war has started and it seems worse than before. Does OpenBSD really want to fork XFree starting from the last 4.4.0-RC2?

ME: Yes.

FB: Why do you think a lot of open source projects are modifying their licenses, limiting users' freedom?

ME: Because they're not doing their technical job in the first place, and playing with the license gets them more attention, in the short term.

XFree changed their license because they thought they were not getting enough recognition. And they don't get enough recognition because they don't communicate enough. Also, because they have internal trouble.

I don't know all that much about the Apache license. What I know is that their code is not that great, from a security standpoint, and that we haven't updated to the new license.

The main issue is not exactly the *freedom* of the license, but the *proliferation* of different licenses. And the fact that most of them have not been written by lawyers. The XFree people tell us they haven't really changed their license, that what they do is not even new. Well, the wording is new. It isn't a tested license, and it's not 100% obvious what it says, exactly. A bit similar to the Apache license.

You've got two different worlds, here: the corporate world, where law is very important; and a portion of the free software world, who drafts licenses based on goodwill. Well, law is NOT goodwill. We don't know how the new XFree license will stand in court. The old one is better tested. And besides, there is not even a consensus in the XFree license.

Another issue is that the groups who write this software are not even closed. We do resent the fact that some member of the group (David Dawes) changed the license, almost unilaterally. There are hundreds of people, all over the place, who contribute to XFree routinely. Heck, I contribute to XFree: I test stuff, I find bugs, I give back changes to the imake infrastructure in there. None of us were consulted, not really.

If you allow the first change, who knows where the NEXT change will go?

Federico Biancuzzi is a freelance interviewer. His interviews have appeared in publications such as ONLamp.com, LinuxDevCenter.com, SecurityFocus.com, NewsForge.com, Linux.com, TheRegister.co.uk, ArsTechnica.com, the Polish print magazine BSD Magazine, and the Italian print magazine Linux&C.




Copyright © 2009 O'Reilly Media, Inc.