Vanishing Features of the 2.6 Kernel
Bye Bye Task Queues
The kernel often has to defer work, scheduling it for later execution, though usually as soon as possible. This is most common in interrupt service routines, where work is divided between a "top half" and a "bottom half."
A typical top half stores incoming data and primes a device to be ready for new interrupts as quickly as possible. Other interrupts may even be disabled during this process, which highlights the necessity of quick execution. The bottom half performs less time-critical processing, such as filtering, copying to user space, etc. This helps increase device throughput.
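The split described above can be modeled in ordinary user-space C. The following is only a sketch under stated assumptions: the names `struct ring`, `top_half()`, and `bottom_half()` are hypothetical, there is no real interrupt involved, and the "filter" (dropping zero bytes) is an arbitrary stand-in for less time-critical processing.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 64   /* capacity of the staging buffer (arbitrary) */

/* Hypothetical staging buffer shared by the two halves. */
struct ring {
    unsigned char buf[RING_SIZE];
    size_t head;   /* total bytes ever stored   */
    size_t tail;   /* total bytes ever consumed */
};

/*
 * "Top half": do the bare minimum. Store the incoming byte and return.
 * Returns 0 on success, -1 if the buffer is full and the byte is dropped.
 */
int top_half(struct ring *r, unsigned char byte)
{
    if (r->head - r->tail == RING_SIZE)
        return -1;
    r->buf[r->head % RING_SIZE] = byte;
    r->head++;
    return 0;
}

/*
 * "Bottom half": runs later, when time is less critical. Drains the
 * buffer, filters out zero bytes, and copies the rest to the caller.
 * Returns the number of bytes written to out.
 */
size_t bottom_half(struct ring *r, unsigned char *out, size_t out_len)
{
    size_t n = 0;

    assert(out != NULL);
    while (r->tail != r->head && n < out_len) {
        unsigned char b = r->buf[r->tail % RING_SIZE];
        r->tail++;
        if (b != 0)            /* example filter only */
            out[n++] = b;
    }
    return n;
}
```

In this model, a caller would invoke `top_half()` once per arriving byte and call `bottom_half()` whenever it has time to spare. The point of the design is that the first function never does more than one store and two integer updates.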
Ancient kernels used a fixed set of 32 bottom half queues. These were superseded by task queues, several of which were maintained by the kernel. Others could be created for specific purposes.
The task queue implementation had inherent limitations, and its use was deprecated throughout the 2.4 series. For one thing, only one task queue could run at a time systemwide. For another, most task queues ran outside of process context, which made it impossible for them to sleep. It also exposed them to all the other vulnerabilities of code running at "interrupt time."
The 2.4 kernel introduced tasklets as a partial replacement. Multiple tasklets of different types can run simultaneously. Tasklets always run on the CPU that scheduled them, which minimizes cache thrashing and, since it serializes things, simplifies re-entrancy problems and race conditions. However, tasklets still run in an interrupt-like context.
In the 2.5 kernel, task queues were gradually minimized and written out of existence. A new replacement called work queues took their place. Tasks are still placed on queues, but they run in process context, which permits sleeping. Unlike task queues, each work queue is tied to a set of threads, one per CPU, so a sleeping task doesn't block other work. One can also specify a minimum time period before the task is performed.
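The queueing idea itself, a function plus its data placed on a queue and run no earlier than some minimum delay, can be sketched in plain C. This is a single-threaded user-space model, not the kernel interface: every name here (`struct work_queue`, `wq_queue_delayed()`, `wq_run_due()`) is hypothetical, time is an abstract tick counter passed in by the caller, and the per-CPU threads and the ability to sleep are not modeled at all.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
typedef void (*work_fn)(void *data);

struct work_item {
    work_fn fn;                 /* function to run later            */
    void *data;                 /* argument handed to fn            */
    unsigned long not_before;   /* earliest tick at which fn may run */
    int pending;                /* 1 until the item has been run    */
};

#define WQ_CAPACITY 16

struct work_queue {
    struct work_item items[WQ_CAPACITY];
    size_t count;               /* slots used; never reclaimed in this sketch */
};

void wq_init(struct work_queue *wq)
{
    wq->count = 0;
}

/* Queue fn(data) to run no earlier than now + delay. Returns -1 if full. */
int wq_queue_delayed(struct work_queue *wq, work_fn fn, void *data,
                     unsigned long now, unsigned long delay)
{
    struct work_item *w;

    assert(fn != NULL);
    if (wq->count == WQ_CAPACITY)
        return -1;
    w = &wq->items[wq->count++];
    w->fn = fn;
    w->data = data;
    w->not_before = now + delay;
    w->pending = 1;
    return 0;
}

/* Run every pending item whose delay has elapsed; return how many ran. */
int wq_run_due(struct work_queue *wq, unsigned long now)
{
    int ran = 0;
    size_t i;

    for (i = 0; i < wq->count; i++) {
        struct work_item *w = &wq->items[i];
        if (w->pending && now >= w->not_before) {
            w->pending = 0;
            w->fn(w->data);
            ran++;
        }
    }
    return ran;
}

/* Example task: increment the integer that data points to. */
void add_one(void *data)
{
    (*(int *)data)++;
}
```

A caller would queue `add_one` with a delay of zero for work that should run at the next opportunity, or with a positive delay to postpone it, and then call `wq_run_due()` periodically with the current tick. In the kernel, the role of that periodic caller is played by the worker threads described above.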
As a wrinkle, only GPL-licensed modules will be able to create their own work queues. Other modules will have to live with a default work queue maintained by the kernel.
In-Kernel Web Server Acceleration
Web servers often dish up static file content, and each request requires at least two system calls and context switches. The 2.4 kernel included the khttpd in-kernel Web accelerator to handle such requests directly within the kernel. More complicated requests and requests with questionable security were passed off to the external Web server, such as Apache.
While khttpd could raise the number of handled requests by a factor of two to five, a later kernel patch called TUX (also known as the Red Hat Content Accelerator), by Ingo Molnar of Red Hat, achieved much higher speeds and included more advanced features. TUX can coordinate with kernel and user-space modules and daemons that provide dynamic content. It also provides caching and can send a mixture of dynamically generated data and pre-generated objects. TUX uses zero-copy networking, can run its own CGI engine, and can be used as an FTP server.
As a result, khttpd became less popular and less maintained. Many developers assumed TUX would take its place in the 2.6 kernel.
At the same time, important advances in user-space Web servers have helped them to reach performance levels previously available only to in-kernel Web accelerators. For example, the x15 server from Chromium uses a small pool of threads (4-8 per CPU) to control all network connections and all network and disk I/O. Real-time signals notify the server whenever data appears on a socket or when output is possible on a connection; no polling ever occurs. x15 avoids launching a thread for each connection and also benefits from zero-copy networking and other kernel enhancements. Dan Kegel has written an excellent summary of some of the issues involved in what he calls the C10K problem.
Many developers had been unhappy about pushing Web services into the kernel, feeling it was a slippery slope. Why not absorb all sorts of user-space facilities inside the kernel? Returning these features to user-space is thus quite welcome. TUX can also be applied as a patch and is available on all Red Hat systems.
The three issues we have highlighted are likely to affect a considerable number of users. Other worthy changes include:
kdev_t, which encodes device major and minor numbers, has morphed from what was effectively a 16-bit quantity into a structure. Eventually the structure will contain more device-specific information. The major and minor bit fields should expand to a total of 32 bits, which will permit more devices to be registered uniquely.
The API for block drivers has undergone a significant overhaul as part of the major enhancement of I/O operations.
pcibios functions have been exterminated.
kiobuf, kiovec mechanism of pinning down user pages to permit direct access has been replaced by the
Kernel building and its interface have been reworked. The old Tcl/Tk graphical interface to xconfig has been replaced with a prettier and more functional GUI based on the Qt graphical libraries.
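To make the kdev_t item above concrete, here is a small sketch of how major and minor numbers can be packed into a single integer. The 16-bit form uses the traditional split of 8 bits each. The 32-bit form uses a 12-bit major and a 20-bit minor, which is an assumption chosen for illustration; the text only says the total should reach 32 bits. All type and function names are hypothetical, and the structure form of kdev_t is not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Traditional layout: 8-bit major in the high byte, 8-bit minor in the low. */
typedef uint16_t old_dev_t;

old_dev_t old_mkdev(unsigned major, unsigned minor)
{
    assert(major < 256 && minor < 256);   /* at most 256 of each */
    return (old_dev_t)((major << 8) | minor);
}

unsigned old_major(old_dev_t dev) { return dev >> 8; }
unsigned old_minor(old_dev_t dev) { return dev & 0xffu; }

/* Assumed 32-bit layout for illustration: 12-bit major, 20-bit minor. */
#define WIDE_MAJOR_BITS 12
#define WIDE_MINOR_BITS 20
#define WIDE_MINOR_MASK ((1u << WIDE_MINOR_BITS) - 1)

typedef uint32_t wide_dev_t;

wide_dev_t wide_mkdev(unsigned major, unsigned minor)
{
    assert(major < (1u << WIDE_MAJOR_BITS) && minor <= WIDE_MINOR_MASK);
    return ((wide_dev_t)major << WIDE_MINOR_BITS) | minor;
}

unsigned wide_major(wide_dev_t dev) { return dev >> WIDE_MINOR_BITS; }
unsigned wide_minor(wide_dev_t dev) { return dev & WIDE_MINOR_MASK; }
```

With the narrow layout, a minor number of 300 simply cannot be represented; with the wider one, `wide_mkdev(8, 300)` round-trips through `wide_major()` and `wide_minor()` unchanged. That extra room is what allows more devices to be registered uniquely.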
Housecleaning is almost an obsession in Linux: features which have grown old or weak are euthanized, good ideas which no one ever used are obliterated, and sometimes mistakes are surgically removed before they grow out of control. This helps keep the kernel lean and understandable. Its growth in size is probably entirely due to new hardware devices and architectures, rather than new general features. Perhaps other deprecated features are yet to be removed before the new kernel debuts.