20200804

Bootstrapping 2.11BSD (no patches) from 2.11BSD pl 195

Bootstrapping 2.11BSD (original)

I've had the sources for what I think is the original 2.11BSD for some time now. However, how do I know these sources are good? That's a very good question. I have a series of tests that I'm doing to verify that the sources are consistent with what we know, or have some kind of known deviation / reconstruction when not. They had passed only the first of my many tests (they were consistent with the patches themselves, but nothing else). It was time to see if I could build what I'd made.
Click to close image, click and drag to move. Use arrow keys for next and previous.

One thing we know for sure: The 2.11BSD release happened. This means that sources for the release must be buildable, in some way. The 2.11 BSD release notes don't mention any reproducibility issues. Presumably the documented way will work. However patches 106-111 fix dozens of build issues that affected reproducibility of the build. In addition, one should be able to build twice in a row and get identical results, modulo a few binaries that encode dates and such. Experience has shown that many programs in /usr/local or /usr/new are the worst offenders. I've made the decision that if make install doesn't install it from the top level, then it won't be in the release I recreate. Though I also made the decision that building some man pages by hand was also OK to make that happen...

Part of building it twice is building it at all. In patches 158-178, the binary format of the .o files changes to accommodate longer symbol names. As a result, the binaries in the 195 image don't produce binaries that work on the unpatched release (well, the binaries themselves do, but the .o's are wrong, as are all programs that read symbols). In addition, there's issues just building everything on the 195 image: as, nm, and ld don't even build, and without those, you won't get far. In fact, the 195 assembler won't even assemble the assembler I've recreated. Since the straight forward way won't work, I thought I'd document what does.

For a background on the toolchains, please see an earlier blog post. It goes over all the basics of toolchains, which I assume people are familiar with.

Bootstrapping as

So, we have to bootstrap the assembler. The 2.11pl195 assembler won't assemble it properly. The v7 assembler will. However, building it on a v7 system isn't the solution: the resulting binaries won't run on the 2.11BSD system. The system call format changed with 2.11BSD, so even the 2.10BSD binaries won't run. One advantage, though, of either the 2.11BSD or the V7 assembler is that it will run under apout.

Apout is a tool that the unix-1972 crowd over at tuhs created to run PDP-11 binaries on modern hardware. It doesn't implement all the system calls. The C compiler which forks other things won't run, for example. However, the assembler will. And the loader. And cpp. Why's the last one important? Well, if we have cpp, then we can assemble the 2.11BSD system call glue in libc.

The assembler is written in fairly low-level code. It calls half a dozen system calls, so this is easy, right? For the system calls, one needed only cpp and the assembler to create them. However, there's one other function it calls: signal. Signal used to be a system call when as was written. In 2BSD, Berkeley reworked how signals worked, so they created a compatibility shim written in C for the old way. That presents a problem... Getting the C compiler going was a lot of effort because it was so many passes and I'd have to string them all together by hand. My solution was to look at the sources and notice that it was just called to register an atexit function to cleanup tmp files when SIGINT was received. This is important for real, old-school PDP-11 hardware that measured the speed in hundreds of thousands of operations a second (or worse!). It would mean that ^C would clean up the temp files. But for bootstrapping? It's not really needed. So I created a .s file that was just '_signal: rts pc', which does nothing but satisfy linkage...

To make things simple, I used ld's partial link functionality to link all the .o's together to create a bootstrap.o. This took the place of libc. So I was able to bootstrap the assembler using the V7 as and ld binaries as well as the 2.11BSD cpp binary to pre-process the 2.11 sources. I did this twice, once for each pass of the assembler. I added the code to the script that I use to create the 2.11BSD (original) tree. This script took care of copying the results into the 2.11BSD tree. It was able to assemble itself, so on to the next step.

Now that I had the assembler bootstrapped, I could move on to the next things. Here we shift from the FreeBSD host that was creating the 2.11BSD (original) tree to a 2.11BSD pl 195 simh image that had a copy of this tree (which I'll call ur2.11 below to distinguish it from 2.11BSD pl 195 which I'll just call '195 below) mounted on /scratch. FYI: the 'ur' prefix means 'original' and it's often used in linguistics to describe the original version of something, now lost but reconstructed.

Bootstrapping ranlib (and to get there ld and nm)

So, one of the things you need is something called ranlib. It reads through a library and collects a table of all the symbols in that library and puts it in the first member of that archive. ld then uses that to pull in what it needs from the library. This eliminates the need to worry about cycles and other strange things. Normally, without a table of contents, ld will just make a single pass through the .a file, pulling in everything that's needed. When there's no cycles in the dependencies, this works great when you create the library with 'lorder *.o | tsort' so that it can be pulled in with one pass. If there are cycles, the library has to be pulled in multiple times to resolve them all.

libc, of course, has cycles. So, how do we fix that? Well, we need to build ranlib (since the newer ranlib uses a different table of contents format, because why would it be easy). To make matters worse, 2.11BSD changed the archive format to the portable archive format from the old PDP-11 format.

So, to build ranlib, we need libc and ld. For libc, we need nm because the lorder shell script uses it and I didn't want to hack the build process. Let's focus on the first two of those. In an ideal world, we could just build them on the '195 image. For once in this project, that's entirely possible, but with a caveat. The include files have changed, so I needed to build this on the 195 system, but using the ur2.11 includes (not the '195 ones, they had been rototilled in the 158-178 patch sequence for the new binary format). I needed to do this in the '195 system because it could create new binaries (but chrooted to the ur2.11 system could not). I was able to do this simply enough:
cd /scratch/usr/src/bin
cc -o ld -O -i ld.c -I/scratch/usr/include
cc -o nm -O -i nm.c -I/scratch/usr/include
 Now I had everything I needed to bootstrap ranlib... almost....

Drop into the chroot

As readers of my blog know, I recently did some search into chroot. The reason was this effort. I'd recalled reading that it was added into 4.2BSD, etc. So I went looking and found an interesting story (that I've already told).

Now you know why I was looking: the next step is to chroot into /scratch. Once we're there, we need to do a few things. First, let's copy things over:
chroot /scratch
cd usr/src/bin/as
cp as /bin
cp as2 /lib
rm as as2
cd ..
cp nm ld /bin
rm nm ld
cp /bin/true /usr/bin/ranlib
OK. That gives us a working assembler, loader and nm. What about cc? Don't we need to rebuild it? Turns out, no. It's already working, creating perfectly fine assembler. Since we just swapped out the assembler, we're good: it produces the new format. And the loader, it can combine them into binaries that will run (we're quite fortunate that the '195 loader can create binaries that work on ur2.11). What about ar(1)? Well, we don't have to bootstrap that either (at least not yet) since the format is the same, even if the program was imported from 4.3BSD in the 158-178 patch series. Finally, we avoid an extra step later by copying /bin/true to ranlib. This means the ranlib in the ur2.11 tree right now (which came from '195) won't create an entry in libc.a we have to delete later.

Building libc.a and crt0.o

So, next up, we need to rebuild libc and crt0.o. cc uses these to create working binaries, and we need cc to rebuild ranlib. Thankfully, it's relatively straight forward to rebuild libc and install it:
cd /usr/src/lib/libc
make clean
# Hack around make sometimes failing to descend on some runs
(cd pdp/compat-4.1; make)
(cd pdp; make)
make
make install
make clean
so now we've replaced the '195 libc.a with it's newer format binaries with ur2.11 libc.a with the proper for this version format. When building, you may have noticed tsort reported a cycle in the dependency graph. It's safe to ignore that for now, we'll work around it in a minute. Depending on dates of directories, you may need to build deep directories by hand because directories in the future aren't considered out of date so aren't rebuilt...

Building ranlib (for real)

Now we can build ranlib, and use it to add a table of contents to libc.a. We'll need to specify libc.a twice in order for it to resolve the circular dependency. When linking libraries w/o the ranlib table of contents, ld only makes one pass through the library. So, if we list it twice, it will get the rest of the dependencies when it makes a second pass through the library. Since all the other symbols are resolved, we don't wind up with two copies of anything.
cd /usr/src/usr.bin
cc -o ranlib -O -i ranlib.c -lc
cp ranlib /usr/bin
ranlib /lib/libc.a
So, now we have a sane libc.a and ranlib.

Finishing up the Bootstrapping

OK. We could go on from here and make a lot of progress. Along the way, though, we'll discover that there's some programs whose Makefile assumes certain things about ar, or want to exec the strip program, etc. So we'll build those now and install them to make for smoother sailing later. All the other dependencies are properly handled.
cd /usr/src/bin
make ar strip
cp ar strip /bin
And we're not quite done. install groks the binary format, so it has to be bootstrapped now before we use install -s as part of many make install targets:
cd /usr/src/usr.bin
make xinstall
cp xinstall /usr/bin/install 

Doing the Build

At this point, the simple way to build is to do the following
cd /usr/src
make clean
make all
make install
make clean
make all
make install
which builds everything twice. This is far from optimal, but will work. The things that fail the first time around, due to missing libraries and such, will succeed the second time through.

One could look in the sources and find there's another process, 'make build' which installs the includes (well, that's commented out, and that caused version skew between /usr/src/include and /usr/include), builds and installs libc, builds and installs the C compiler, rebuilds libc, rebuilds and reinstalls the C compiler, then builds and installs usr.lib before building and installing 'bin usr.bin etc ucb new games' directories. This works mostly OK. However, in our situation, this leaves a big hole: there's programs in /usr/src/usr.lib that need other libraries in /usr/src/usr.lib, so they fail to build in the make build scenario. Plus, I've had it fail in the second build of libc for reasons unknown (it just fails to descend into the pdp directory, which it had no trouble doing the first time).

So if you go and look at the bootstrap program, you'll see the following crazy dance that it does. Of course, it knows it's already built libc once (and it lacks the above workaround for libc, but the actual automation has it):
cd /usr/src
make clean
cd lib
for i in ccom cpp c2 libc ccom cpp c2; do
    (cd $i; make all install)
done
cd ../usr.lib
for i in lib[0-9A-Za-z]*; do
    (cd $i; make all install; make clean)
done
ln /usr/lib/libom.a /usr/lib/libm.a
cd ..
make all
make install
make clean
which is similar enough to 'make build' but avoids the holes in it and avoids having to build absolutely everything twice (though it does build libc and the C compiler 3 times total, which likely is overkill). The funky pattern for building libraries is because there's a lib.b that's installed (it's just a text file with what appears to be B code in it). The link for libm afterwards is to mimic what the make install target does in usr.lib since we're not using it and libom.a is used for libm.a on 2.11BSD. Since we remove all .a's in creating the root, we have to recreate this here.

In the end, we're left with a complete user land that we can then move to the next phase with. Once we have a kernel, we can rebuild the release tapes which I'll leave as a topic for another day. With the boot block rework, the disk label changes and the changing needs of the 2.11BSD community, rebuilding them for ur2.11 is somewhat different than 2.11BSD patch 469.

As a workaround for some build issues, I also needed to build a number of man pages so the program associated with them would be properly installed... Suffice to say I rebuilt all the man pages in the end as part of the bootstrap script, but they aren't strictly required to run the system.

Building the Kernel

Normally, one would re config the kernel and build it. However, in 2.11BSD as released, there were a number of hacks made to the kernel Makefile to get it to fit into memory. Normally, one would hack these things in /sys/conf/Make.sunix so configuring GENERIC wouldn't destroy any carefully worked out overlay, but that wasn't done initially. So, we have to be careful how we build.

Also, in the initial version, the root partition was hard coded into the kernel. There was a script call /GENALLSYS that would create all versions of the: rpunix, raunix, xpunix, hkunix, etc.When installing, one needs to know the proper one to use. So, putting that all together, we can just do this:

cd /usr/src/sys/GENERIC
make && make install && (cd / ; cp unix genunix; sh -x /GENALLSYS)

which builds all possible bootable kernels... 

Building all the Standalone Programs

When we built everything, a few things still weren't build: the boot loader, the autoconfig and boot program (which is different than the boot loader). One just needs to build in /sys/mdec, /sys/autoconfig and /sys/pdpstand:
cd /sys/mdec
make && make install && make clean
cd /sys/autoconfig
make && make install && make clean
cd /sys/pdpstand
make && make install && make clean
Once one has mdec installed, one needs to dd the blocks onto the disk to make it bootable. When I was bootstrapping this disk, I did it with the intention of making a bootable system. I had to add /usr to /etc/fstab too, but all the things I did might fill another blog entry...

Conclusion

Building entire systems is messy, and has always been messy. Unless you skipped to the conclusion, I suspect that you've already formed this opinion about the 2.11BSD build process.  I've managed to enshrine everything above into build.sh and build2.sh to make things automated. Using this technique I've managed to build a ur2.11BSD boot disk, created boot tapes and installed from those tapes. Automation was key, though, to recording all the right steps in the right order.

No comments: