

Linux Compatibility on BSD for the PPC Platform: Part 5
Pages: 1, 2, 3

Emulating the U-dot zone

Linux's gdb was able to load, present its prompt, and display the online help, but it could not do anything actually useful. It could not even launch a Linux program. This was not surprising, because the Linux ptrace() system call emulation was not yet implemented for the PowerPC.

In Unix systems, the ptrace() system call is used almost exclusively by debuggers such as gdb. It provides facilities for reading and writing the CPU register values during program execution, single-stepping the program, and reading and writing the memory allocated to the traced program. All these operations are requested through ptrace() commands such as PEEKTEXT, POKETEXT, GETREGS, SETREGS, and so on. See the ptrace(2) man page for more information about ptrace() commands.

Linux ptrace() emulation is split into two parts: a machine-independent part, located in sys/compat/linux/common/linux_misc.c, and a machine-dependent part, linux_sys_ptrace_arch(), activated through the LINUX_SYS_PTRACE_ARCH macro in sys/compat/linux/common/linux_trace.h.

The linux_sys_ptrace_arch() function is located in sys/compat/linux/arch/powerpc/linux_ptrace.c for the PowerPC. The machine-independent part can handle some commands, such as reading or writing to the traced process memory, by calling the NetBSD native ptrace implementation: sys_ptrace().

The machine-dependent part of ptrace() emulation, linux_sys_ptrace_arch(), should ideally implement the PEEKUSER, POKEUSER, GETREGS, SETREGS, GETFPREGS, and SETFPREGS commands. The easiest way to write the linux_sys_ptrace_arch() function for the PowerPC was to start from the i386 version and change what was really machine dependent. This includes all references to CPU registers, and all references to data structures that do not exist on the PowerPC, for instance, the u_debugreg field of struct linux_user.

Operations on registers are quite straightforward to implement. Linux binaries expect reading and writing through a pt_regs structure, defined in Linux's header linux/include/asm-ppc/ptrace.h. The job is to get the registers and rearrange them appropriately.
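As a rough illustration of "get the registers and rearrange them", the sketch below copies a NetBSD-style register set into a Linux-style pt_regs layout. Both structures are simplified stand-ins declared here, not the real kernel headers. The names pc, ctr, lr, lnip, lctr, and llink follow the snippet shown later in this article; every other field, and the helper name sketch_regs_to_linux(), is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the NetBSD/powerpc register set (assumed layout). */
struct sketch_reg {
    unsigned long fixreg[32];   /* general-purpose registers */
    unsigned long lr;           /* link register */
    unsigned long cr;           /* condition register */
    unsigned long xer;          /* fixed-point exception register */
    unsigned long ctr;          /* count register */
    unsigned long pc;           /* program counter */
};

/* Simplified stand-in for the Linux/powerpc pt_regs (assumed layout). */
struct sketch_linux_pt_regs {
    unsigned long lgpr[32];     /* general-purpose registers */
    unsigned long lnip;         /* next instruction pointer */
    unsigned long lctr;         /* count register */
    unsigned long llink;        /* link register */
    unsigned long lxer;         /* fixed-point exception register */
    unsigned long lccr;         /* condition register */
};

/* Hypothetical helper: rearrange native registers into the Linux order. */
void sketch_regs_to_linux(const struct sketch_reg *regs,
                          struct sketch_linux_pt_regs *out)
{
    size_t i;

    assert(regs != NULL && out != NULL);
    memset(out, 0, sizeof(*out));

    for (i = 0; i < 32; i++)
        out->lgpr[i] = regs->fixreg[i];

    out->lnip  = regs->pc;
    out->lctr  = regs->ctr;
    out->llink = regs->lr;
    out->lxer  = regs->xer;
    out->lccr  = regs->cr;
}
```

The reverse direction (used by a SETREGS-style command) would be the same assignments with source and destination swapped.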

The two tricky operations are PEEKUSER and POKEUSER, which read and write the user structure. Before explaining how we emulate these two commands, let us first introduce the user structure, also known as the U-dot zone.

When running several processes at once, the Unix kernel needs to maintain some information for each process. This process information is split into kernel-memory-based and user-process-memory-based parts. The kernel part of the information is stored in the struct proc, which is defined in sys/sys/proc.h on NetBSD. This structure contains data that must remain in main memory at all times (kernel memory is never swapped out). Kernel-based process information includes, for instance, the user owning the process. That information must always remain resident in main memory because we do not want a ps -aux to cause some pages of each swapped out process to be reloaded into main memory.

The user-based, or "userland", process information is called the user structure. The information contained in the user structure is only needed when the process is running. On NetBSD, the user structure is defined as struct user, in sys/sys/user.h. On Linux, this is the struct user, defined in linux/include/asm-ppc/user.h. In kernel code, user structures used to be named "u", and therefore accessed in C through the u.<field> syntax. Hence the "U-dot" name.

The NetBSD U-dot zone is rather small, because most of the fields in this structure were moved to other locations, including the kernel stack or struct proc. On the other hand, Linux stores lots of information in the U-dot zone, such as text, data, and stack locations and sizes. It also uses the U-dot zone to save the user values of the CPU registers of the traced process when entering kernel space. Linux's gdb reads the U-dot zone to get and set the register values of traced processes. For reading, this works because the traced process is stopped when gdb does the operation: when the process was stopped, the CPU entered kernel space and the registers were saved in the U-dot zone, so gdb reads the latest register values of the traced process. For writing, it works because when the kernel runs the traced process again, it restores the modified registers from the U-dot zone.

Now, let us examine how PEEKUSER and POKEUSER are emulated in NetBSD.

These two ptrace() commands are used with three other arguments: the PID of the traced process, the address of the target field in the U-dot zone relative to the beginning of the U-dot zone, and a data field, used for write operations. As you can imagine, it is not trivial to emulate operations on the U-dot zone, because they involve manipulating fields of the U-dot zone that do not exist in NetBSD's U-dot zone: registers, stack location and size, and so on. We therefore have to check the target address, and return a value from another place in the kernel depending on the target address.

The LUSR_OFF macro helps. It returns the offset of a given field from the beginning of the U-dot zone, which is the form in which PEEKUSER and POKEUSER receive their target address. Here is the definition of LUSR_OFF, from

#define LUSR_OFF(member) offsetof(struct linux_user, member)
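To make the offset arithmetic concrete, here is a minimal sketch built around that macro. The two structures are reduced stand-ins that hold only fields named in this article, and placing the saved registers first in struct linux_user is an assumption of the example. The helper sketch_addr_is_register() is hypothetical; it only shows that a target address can be classified by comparing it with field offsets.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for Linux's pt_regs, reduced to three fields (assumed layout). */
struct linux_pt_regs {
    unsigned long lnip;
    unsigned long lctr;
    unsigned long llink;
};

/* Stand-in for Linux's struct user, reduced to fields named in the article. */
struct linux_user {
    struct linux_pt_regs regs;  /* assumed to be the first member */
    unsigned long u_tsize;
    unsigned long u_dsize;
    unsigned long u_ssize;
    unsigned long start_code;
    unsigned long start_stack;
};

#define LUSR_OFF(member) offsetof(struct linux_user, member)

/* Hypothetical helper: 1 if addr lies inside the saved-register area. */
int sketch_addr_is_register(size_t addr)
{
    /* Unsigned subtraction wraps to a huge value if addr is below regs. */
    size_t rel = addr - LUSR_OFF(regs);

    return rel < sizeof(struct linux_pt_regs);
}
```

Because offsetof() is evaluated at compile time, comparing addr with LUSR_OFF(field) costs no more than comparing it with a constant.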

And here is some code that emulates reading the stack size, code location, and stack location from Linux's U-dot zone. As you can see, we grab the relevant information from locations reachable through the struct proc (p is a pointer to the struct proc of the current process):

if (addr == LUSR_OFF(u_ssize))
    *retval = p->p_vmspace->vm_ssize;
else if (addr == LUSR_OFF(start_code))
    *retval = (register_t) p->p_vmspace->vm_taddr;
else if (addr == LUSR_OFF(start_stack))
    *retval = (register_t) p->p_vmspace->vm_minsaddr;
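The fragment above can be wrapped into a small, self-contained function to show the whole lookup, including what happens when the offset matches no emulated field. This is only a sketch: struct sketch_vmspace is a mock holding the three values used above, struct linux_user is a reduced stand-in, and returning EIO for an unknown offset is an assumption of this example, not something the article states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Reduced stand-in for Linux's struct user (assumed layout). */
struct linux_user {
    unsigned long u_tsize;
    unsigned long u_dsize;
    unsigned long u_ssize;
    unsigned long start_code;
    unsigned long start_stack;
};

#define LUSR_OFF(member) offsetof(struct linux_user, member)

/* Mock of the vmspace fields that the article's snippet reads. */
struct sketch_vmspace {
    unsigned long vm_ssize;     /* stack size */
    unsigned long vm_taddr;     /* start of the text segment */
    unsigned long vm_minsaddr;  /* stack location */
};

/*
 * Hypothetical PEEKUSER lookup: translate a Linux U-dot offset into a
 * value taken from the native structure. Returns 0 on success, or EIO
 * (an assumption of this sketch) when the offset is not emulated.
 */
int sketch_peekuser(const struct sketch_vmspace *vm, size_t addr, long *retval)
{
    assert(vm != NULL && retval != NULL);

    if (addr == LUSR_OFF(u_ssize))
        *retval = (long)vm->vm_ssize;
    else if (addr == LUSR_OFF(start_code))
        *retval = (long)vm->vm_taddr;
    else if (addr == LUSR_OFF(start_stack))
        *retval = (long)vm->vm_minsaddr;
    else
        return EIO;

    return 0;
}
```

The caller never learns that the value came from somewhere other than a Linux-style user structure; it only sees the offset it asked for and the value it got back.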

And here is a code snippet that emulates reading traced process registers from the U-dot zone:

error = process_read_regs(t, regs);
/* (snip) */
if (addr == LUSR_REG_OFF(lnip))
    *retval = regs->pc;
else if (addr == LUSR_REG_OFF(lctr))
    *retval = regs->ctr;
else if (addr == LUSR_REG_OFF(llink))
    *retval = regs->lr;
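The write direction (POKEUSER) is not shown in this article. A plausible sketch, assuming it mirrors the read path, is to fetch the saved registers, patch the one selected by the offset, and store the set back so the process resumes with the new value. Everything below is a mock: struct sketch_proc, sketch_read_regs(), and sketch_write_regs() stand in for the kernel's process structure and register accessors, the LUSR_REG_OFF definition is an assumption modeled on LUSR_OFF, and EIO for an unknown offset is a choice made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in for Linux's pt_regs, reduced to three fields (assumed layout). */
struct linux_pt_regs {
    unsigned long lnip;
    unsigned long lctr;
    unsigned long llink;
};

/* Assumed definition, modeled on LUSR_OFF. */
#define LUSR_REG_OFF(member) offsetof(struct linux_pt_regs, member)

/* Mock native register set and mock traced process. */
struct sketch_reg {
    unsigned long pc;
    unsigned long ctr;
    unsigned long lr;
};

struct sketch_proc {
    struct sketch_reg saved;    /* registers saved while the process is stopped */
};

/* Mock accessors standing in for the kernel's register read/write helpers. */
static int sketch_read_regs(const struct sketch_proc *t, struct sketch_reg *regs)
{
    *regs = t->saved;
    return 0;
}

static int sketch_write_regs(struct sketch_proc *t, const struct sketch_reg *regs)
{
    t->saved = *regs;
    return 0;
}

/* Hypothetical POKEUSER: read the registers, modify one, write them back. */
int sketch_pokeuser(struct sketch_proc *t, size_t addr, unsigned long data)
{
    struct sketch_reg regs;
    int error;

    assert(t != NULL);

    error = sketch_read_regs(t, &regs);
    if (error != 0)
        return error;

    if (addr == LUSR_REG_OFF(lnip))
        regs.pc = data;
    else if (addr == LUSR_REG_OFF(lctr))
        regs.ctr = data;
    else if (addr == LUSR_REG_OFF(llink))
        regs.lr = data;
    else
        return EIO;             /* offset not emulated; nothing is written */

    return sketch_write_regs(t, &regs);
}
```

Working on a local copy of the register set means an unrecognized offset leaves the saved registers untouched.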

With ptrace() implemented, gdb was able to start the traced program, but a remaining bug made it unable to get a backtrace or to trace the program. We will examine the problem in the next section.
