More Venix reconstruction work

More Venix reconstruction work

With the simulator working well enough to run many / most of the Venix binaries (the C compiler being a notable except), I thought I'd turn my hand to some reconstruction work. You know, the whole reason that I started this thing up.

System Calls

There's no easier code to write in Unix that does something useful than interfacing to system calls. These calls are usually 'load these registers (or this block) with those values and trap to the kernel'. Venix is no exception to this rule.

Venix has about 60 system calls it implements. They are so regular I thought I'd be able to write a generator for all the system calls, except maybe pipe. I thought this because FreeBSD generates the glue for all its system calls, though pipe has been an exception because it needs to return two values.

Little did I know there's really 74 .s files associated with the system calls. Only about 50 of the system calls are regular. The rest are irregular in a number of different ways.

Return Values and weird pointers

There are 5 system calls that require some special handling just because the return values are weird. These include time(2) which you pass a pointer to a long to put the value of time into (that's done in userland in Venix, rather than with a copyout call that other systems use). This mirrors what's done on the PDP-11, so it's no real surprise here. pipe(2) also falls into this category. You pass it an array, and the system call caller is responsible for stuffing the data back into this array. wait(2) is the same way.

stime(2) is similar, but in the opposite direction: It loads the values from a pointer into a register rather than having the kernel just copy that value into the kernel. That's weird because plenty of other things do it with pointers.

Variations on a theme

dup(2) can be generated automatically, but dup2(2) can't. dup2 is the variant where you set the new fd rather than allowing the kernel to pick one for you. Rather than having two system calls, you just add 64 to the fd and call dup. What's weird is that dup(2) is documented to take one argument, but the dup.o file, when disassembled, clearly passes two arguments. This means that there's tack garbage for one of them (a bug!). dup2(2) makes sense to pass two args, but dup(2)? Really? So that's the first bug I've found in the generated code.

brk(2) and sbrk(2) are similar, but they also have to keep track of where the actual break point in the address space is.  And it's a little weirder than that for some NMAGIC binaries that put the stack at the top of memory (right?) and have the heap grow between the top of bss (ebss) and the bottom of the stack. I suspect bugs in this area of my emulator since the C compiler is one of the few binaries with this sort of odd arrangement.

Then there's the exec(2) family of calls. They are all a bit different in terms of calling them, but in assembler you can morph them all into one system call. Sweet, eh? Turns out to be hard in 'C' to pull this off portably, but this predates those worries. Both PDP-11 and the 8086 port use this same trick.

4 arguments are hard.

There's 3 system calls that have 4 arguments. 2 (lseek(2) and locking(2)) do it one way, the other does it a second way (ptrace(2)). And the 2 that appear to do it right (in that it follows the same convention as the 3 arg call) have the same bug.

Most of the 3 arg calls look something like the following:
        push    bp
        mov     bp,sp
        mov     bx,#3
        mov     ax,*4(bp)
        mov     dx,*6(bp)
        mov     cx,*8(bp)
        int     0xf1
which is simple and straight forward. The 2 call variants don't load anything into cx, the one arg calls skip dx, etc.

But the 4 arg ones are weird. In that they are really 3 args with one of the args being too fat. Let's look at lseek, which has three args, but one of them is a long. lseek takes an int, a long and a second int, let's see how it does it's thing:

        push bp
        push si
        mov bp,sp
        mov bx,#19
        mov ax,*6(bp)
        mov dx,*8(bp)
        mov cx,*10(bp)
        mov si,*12(bp)
        int 0xf1
Notice how the offsets are all buggered up. This code will work, and the offsets are correct, but only because the code botched the preamble to setup bp so we can access the args. The push si interrupts that, so bp has the wrong value, so you have to offset everything by 2. Another bug found through the powers of disassembly. So I mostly generate lseek, then hand tweak it to make sure it's the same file.


Then there's signals. They are hard, and they do all this wonderful weird stuff with trampolines and the like. This one file is by far the longest one.

Getting the same .o

A few years ago, I found the Minix disassembler dis88 floating around. I've been steadily hacking on it to produce good quality disassembled code. It's tough, though, since there's so many different rules. As I'm doing this reconstruction, I'm learning more. I'll go into those on another post.

But to make things as testable as possible, I've created a gensys script. This generates all the system calls I can, and tries to test the ones I can't. It does this by using the emulator (86sim) to run the assembler. we then compare the disassembled output between the original and the new one and report diffs. No diffs, I'm done! Like I said, the emulator is coming along nicely.

I tried running this same process on the Rainbow, and it was so slow I could only do one or two items in the time it took me to iterate through 5 or 6 different problems and rebuild everything.  The emulator is starting to save time for the investment in writing it...  We'll see if it is all worth it in the end.

No comments: