Showing posts with label historic unix. Show all posts

20200803

Missing 2.11BSD patches

2.11BSD Missing Patches

While looking into some date anomalies in the final image (since I'd like to get the dates right), I discovered that a number of source directories had dates slightly newer than the date in the announcement. This led me to discover some missing patches in a couple of different places.

The Anomaly

I've automated the system generation, tape generation and installing from tapes to allow me to make small tweaks and get end to end testing. As part of this, after the system is installed, I'll do a test boot, similar to the following, as if I'd installed the system on April 10th, 1991 and booted it on April 15th. The boot looks something like this:
sim> boot rq

boot: 73Boot
: ra(0,0)unix

2.11 BSD UNIX #1: Fri Mar 15 15:48:55 PST 1991
    root@wlonex.imsd.contel.com:/usr/src/sys/GENERIC

phys mem  = 4186112
avail mem = 4008640
user mem  = 307200

Apr 10 13:50:01 init: configure system
ra 0 csr 172150 vector 154 attached
rl 0 csr 174400 vector 160 attached
tms 0 csr 174500 vector 260 attached
ts 0 csr 172520 vector 224 attached
xp 0 csr 176700 vector 254 attached
erase, kill ^U, intr ^C
# date 9104151234
date: can't write wtmp file.
Mon Apr 15 12:34:00 PDT 1991
# Fast boot ... skipping disk checks
/dev/ra0c on /usr: Device busy
checking quotas: done.
Assuming non-networking system ...
preserving editor files
clearing /tmp
standard daemons: update cron accounting.
starting lpd
starting local daemons:.
Mon Apr 15 12:34:01 PDT 1991


2.10 BSD UNIX (my.domain.name) (console)

login:
I set the date in single user then bring it up to multiuser. In one of my tests, I found the following:
-rw-r--r--  1 imp  imp   9777 Aug 31  1991 alloc.c
-rw-r--r--  1 imp  imp   4817 Aug 31  1991 alloc11.c
-rw-r--r--  1 imp  imp  12474 Aug 31  1991 doprnt.c
-rw-r--r--  1 imp  imp   3299 Feb 23  1987 doprnt11.s
-rw-r--r--  1 imp  imp    831 Aug 31  1991 printf.c
-rw-r--r--  1 imp  imp  20446 Aug 31  1991 sh.c
-rw-r--r--  1 imp  imp   1771 Aug 31  1991 sh.char.c
which I thought was quite strange. There shouldn't be any files dated newer than the release in the tree. I know that patched files don't get their times set right, so I exclude those from my search (I plan on fixing that bug later). The above files (and others, it's just a short list) shouldn't be there. So I started looking...
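The search itself is easy to sketch. The following is a minimal illustration, not the script I actually use: the function name `newer_than` is made up for this example, and the cutoff you pass in (for instance the March 14th, 1991 announcement date) is your own choice.

```python
import os
from datetime import datetime

def newer_than(root, cutoff):
    """Yield (path, mtime) for files under root modified after cutoff."""
    limit = cutoff.timestamp()
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            # lstat so that symlinks are judged by their own date
            mtime = os.lstat(path).st_mtime
            if mtime > limit:
                yield path, datetime.fromtimestamp(mtime)

if __name__ == "__main__":
    # "root-2.11" is the extracted tree name used in the diffs below;
    # adjust to wherever your copy lives.
    for path, when in newer_than("root-2.11/usr/src", datetime(1991, 3, 14)):
        print(when.strftime("%b %d %Y"), path)
```

Files touched by a known patch would still need to be filtered out of this list by hand or with a separate exclusion list.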

The Diffs

Running diffs against 2.10.1 I discovered that the csh files were almost all the same. However, a typical diff looked like:

diff -ur root-2.10.1/usr/src/bin/csh/alloc.c root-2.11/usr/src/bin/csh/alloc.c
--- root-2.10.1/usr/src/bin/csh/alloc.c 1987-02-08 15:27:23.000000000 -0700
+++ root-2.11/usr/src/bin/csh/alloc.c   1991-08-31 01:03:00.000000000 -0600
@@ -4,10 +4,10 @@
  * specifies the terms and conditions for redistribution.
  */

-#ifndef lint
+#if    !defined(lint) && defined(DOSCCS)
 /* From "@(#)malloc.c  5.5 (Berkeley) 2/25/86"; */
 static char *sccsid = "@(#)alloc.c     5.3 (Berkeley) 3/29/86";
-#endif not lint
+#endif

 /*
  * malloc.c (Caltech) 2/21/82

which removed the SCCS IDs from the binary to save size. Other changes included introducing overlays for the first time. This indicated a size issue. Let's take a look at what else was going on around 31 Aug 91 in the patch stream. Looking, we find that this is just after patch 18 (which fixed a long vs int bug in test) as well as patch 17, which updated pcc. This sounds like a size hack by someone who had just updated the compiler, or was testing with pcc (the normal system compiler wasn't pcc, but the earlier Thompson compiler). Another of the changes also fixes an issue with character handling, which other patches have done to reduce the size of binaries that got too big.

So, in context, this change makes perfect sense. The only trouble is that it wasn't posted to comp.bugs.2bsd, nor did it make it into Steven Schultz's patch repo. And csh isn't the only troublesome one. There are issues in rn and games/warp.

The Catch Up Patch

There was a catchup kit that was issued officially as Patch 80 (though it omitted patch 79). Looking in that patch kit we find this change! So it's a change that was intended. So what to do? And it turns out there's 5 such patches (but only 4 of them made it into the patch kit... I'll talk about #5 in a minute).

I've decided to look at the dates of each of these patches and pretend they happened just after the patch whose date is closest. I've updated my mk211bsd script to extract these from the catch up patch.
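The "closest date" rule can be sketched as follows. This is only an illustration, not what mk211bsd does internally. The function name is made up, and the dates for patches 17 and 18 are placeholders I invented for the example; only the May 4th date for patch 1, the May 15th tftp date and the August 31st csh file dates come from this post.

```python
from datetime import date

def place_after_nearest(numbered, missing):
    """Map each missing patch to the numbered patch closest to it in date."""
    placement = {}
    for name, when in missing.items():
        placement[name] = min(
            numbered, key=lambda n: abs((numbered[n] - when).days)
        )
    return placement

numbered = {
    1: date(1991, 5, 4),    # from the post
    17: date(1991, 8, 20),  # placeholder date, not the real one
    18: date(1991, 8, 28),  # placeholder date, not the real one
}
missing = {"tftp": date(1991, 5, 15), "csh": date(1991, 8, 31)}

print(place_after_nearest(numbered, missing))
```

With these inputs the tftp update lands after patch 1 and the csh changes land after patch 18.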

Oh, and there were a number of new programs added in the catchup patch as well. These must be deleted too, but I'd already noticed that and deleted them.

tftp changes

So, on May 4th, 1991 a patch to dd.c was posted to comp.bugs.2bsd. It's also included in the official archive as patch 1. The release announcement was dated March 14th, 1991. But there's tftp files dated May 15th, 1991. What's up with those? Turns out, this is another missed patch (but one that's assumed to be in place by the catch-up patch, because it's not in there). Well, it's partially in the patches, partially in the scripts. It's an update of tftp and tftpd to a new version. It was posted to comp.bugs.2bsd on May 15th, 1991, but isn't in the official list of patches. So not only do we have to dig it out of the catch-up patches (from two different files), we also have to restore the old man pages from 2.10.1BSD, but in a different place, so this patch will be a patch + rm (which going backwards is patch + cp).

csh changes

As discussed above, these are various hacks to get the size of csh down.

warp changes

The changes here are around the config script used to generate files for the build. The changes use full path names, and cope with the new shadow password format changes.

pcc changes

[[ edit -- these were poorly documented: they are in patch 17, but not called out specifically to apply ]]

rn changes

The catch up kit includes a number of minor patches to rn that were never formally published and never got a patch number of their own.

My Work

So, how does this affect me? Well, it means that I need to understand the catch up patch a lot better. I had hoped to use it later as a cross-check against my work. I didn't anticipate that I'd be using it sooner, to get the missing bits I'd found using other techniques. I've had to update mk211bsd to extract the bits, as well as create a couple of hints files to help me undo the changes.

And when the time comes to replay all the patches, I'll need to take these anomalies into account as well. But that's a problem for future me.

I also have to complete the audit of the weird file dates. There's 63 of them right now, 29 in /usr/src still. Some are clearly old man pages that I can remove. Some are the result of running 'configure' or a similar script (rn has 7 of these). Some are config files that change over time (like for the root name server). Some may be just left-over detritus of a running system. I need to see which ones fall into which categories and update accordingly. This may dovetail back into needing to bring them back to make sure I can march back to pl195 and get the same system. Since I started with ~1500 such anomalies, I think being down to 63 is quite good. And there's others elsewhere in the system...

Current status

As you might guess, if I'm finding things like this, that means I'm getting closer. I've shared a lot of this on my @bsdimp twitter account, but now is a good time for a wrap up. Here's what's done currently:
  1. Script to undo all the patches, including helper 'hints' scripts, where possible from existing artifacts.
  2. Missing patches reconstructed and integrated into the build
  3. Automated installing 2.11BSDpl195 image
  4. Automated bootstrapping back from 195 -> 0. There's a number of interesting problems here that I'll blog about soon
  5. Building the 2.11BSD pl 0 tapes automatically
  6. Test installing the pl 0 system from the pl 0 tapes.
The missing bits include
  1. Getting the dates right (or failing that plausible) for the patched files
  2. Finishing the date audit and tracking all anomalies to ground.
  3. Cleaning up my helper scripts off the image
  4. Creating a github repo with all the patches in it
  5. Reproducing the build on a second system
  6. Getting the ownership right for some files (eg using the mtree hack to get the ownership and permissions right, generating it from the pl195 tape/image, etc)
  7. Getting dates right on /, right now restor(8) doesn't restore the date in one at a time mode, so these are all wrong.
  8. Fixing tmscp boot. It's broken. The tmscpboot.s, ported from tkboot.s, only existed for a short period of time and has been lost. My reconstruction has issues (it won't boot), and I've not delved into why.
  9. Creating automation to ensure that the 'catch up' kit will apply cleanly.
And of course, I need to figure out the best way to publish the artifacts when I think I'm done.

20200727

When Unix learned to reboot(2).

History of Reboot(2)

Recently, a friend asked me the history of halt, and when did we have to stop with the sync / sync / sync dance before running halt or reboot. The two are related, it turns out.

That sync; sync; sync Thing...

If you go looking around the net, you'll find some people giving advice like "when shutting down, type 'sync; sync; sync; halt' to be safe." There's good reasons behind this advice which aren't immediately clear and are interesting to explore. Before exploring, I'd been told that the reason for the sync dance was a driver bug in v6 that was fixed 45 years ago... But it turns out whoever told me that must have been mistaken because the code tells a different story...

The sync program called the sync system call and exited (and still does). The sync system call in research editions of unix was implemented as approximately:
foreach mountpoint
    write superblock with bwrite
for each dirty inode
    write the inode with iupdat
bflush
 
It would step through the fixed list of buffers in the system, writing the dirty ones out. It used bwrite() to do this, which was synchronous. Each write had to complete before the next one started. iupdat will read the inode off the disk, update that inode, and write it out, again synchronously. bflush writes everything with bwrite, but marks the buffers as B_ASYNC, which means in that case it won't wait. And nothing else waits either. So the recommendation to type sync 3 times, one line at a time, was to give time for the buffers to drain (the subsequent syncs would schedule no new I/O on a quiet system). Typing all three on one line with semicolons didn't give this time...

If you look at the recommendation, it's actually quite smart. Typing sync one line at a time, waiting for the prompt each time, would schedule a lot of I/O the first time, then give the operator a harmless task to do for a few seconds that would allow the I/O to complete before they did anything else. The kernel avoided all kinds of nasty deadlocks that later systems would face when they implemented waiting for the I/O to complete.
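To make the timing argument concrete, here is a toy model of the behaviour described above. It is not kernel code: the class, the number of dirty buffers and the drain rate are all invented for illustration. The only property it borrows from the real thing is that sync starts the writes and returns without waiting.

```python
class ToyBufferCache:
    """Toy model: sync() only queues writes; the disk drains them over time."""

    def __init__(self, dirty, writes_per_second):
        self.dirty = dirty          # buffers not yet scheduled for writing
        self.in_flight = 0          # writes scheduled but not yet on disk
        self.rate = writes_per_second

    def sync(self):
        # Like bflush with B_ASYNC: start the I/O and return right away.
        self.in_flight += self.dirty
        self.dirty = 0

    def wait(self, seconds):
        # Time passing (the operator typing) lets queued writes complete.
        self.in_flight -= min(self.in_flight, int(self.rate * seconds))

    def safe_to_halt(self):
        return self.dirty == 0 and self.in_flight == 0


# "sync; sync; sync" on one line: no time passes between the calls.
fast = ToyBufferCache(dirty=100, writes_per_second=30)
for _ in range(3):
    fast.sync()
print("one line:", fast.safe_to_halt())

# One sync per line, assuming about two seconds of typing after each.
slow = ToyBufferCache(dirty=100, writes_per_second=30)
for _ in range(3):
    slow.sync()
    slow.wait(2)
print("three lines:", slow.safe_to_halt())
```

In this model the second and third sync schedule nothing new; their only contribution is the time it takes to type them.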

Edit: One bit of lore that was passed on to me was the first sync returned right away, but the second one blocked.... I've found no evidence of that in BSD or System V based systems... although there is an increasing amount of protection against multiple threads being in the sync code as concurrency in Unix increased.

Why not do this in reboot(2)?

None of the versions of Research Unix had a system call to reboot. To restart things, one killed init with SIGHUP, which would in turn kill everything else and fork a new shell in single user. There was no other way to restart the system, and bad things happened if init actually died. But there was no clean reboot option, nor any way to stop the kernel cleanly (apart from the power switch).
Looking at the sources, there was one small hint that something was planned, but never executed. All the system calls were defined in /usr/include/sys.s. A close examination shows the following:
lock    = 53.
ioctl   = 54.
reboot  = 55.
mpx     = 56.
setinf  = 59.
which proves me wrong, right? Well, maybe not. Looking at sysent.c, we see the following:
1, 0, syslock,          /* 53 = lock user in core */
3, 0, ioctl,            /* 54 = ioctl */
0, 0, nosys,            /* 55 = readwrite (in abeyance) */
4, 0, mpxchan,          /* 56 = creat mpx comm channel */
0, 0, nosys,            /* 57 = reserved for USG */
0, 0, nosys,            /* 58 = reserved for USG */
3, 0, exece,            /* 59 = exece */
which lists 'nosys' as the handler, so there's no implementation. There's no reboot system call. I'll also note that there's another disconnect: system call 59 is listed as setinf (whatever that is), but is implemented as exece.
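The cross-check between the two files can be sketched like this. The two tables below are transcribed from the excerpts above; the function name is made up, and treating a missing entry as nosys is an assumption of this sketch.

```python
# System call names from the sys.s excerpt.
SYS_S = {53: "lock", 54: "ioctl", 55: "reboot", 56: "mpx", 59: "setinf"}

# Handlers from the sysent.c excerpt.
SYSENT = {
    53: "syslock", 54: "ioctl", 55: "nosys", 56: "mpxchan",
    57: "nosys", 58: "nosys", 59: "exece",
}

def named_but_unimplemented(names, handlers):
    """Return syscall numbers that have a name but dispatch to nosys."""
    return sorted(
        number for number in names
        if handlers.get(number, "nosys") == "nosys"
    )

print(named_but_unimplemented(SYS_S, SYSENT))
```

Only slot 55 shows up: it has a name in sys.s, but nothing behind it in the kernel.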

Enter 4BSD

The first reference to reboot(2) I can find is in 4.0BSD. In sysent.c, we see the following:
3, 0, ioctl,            /* 54 = ioctl */
1, 0, reboot,           /* 55 = reboot */
4, 0, mpxchan,          /* 56 = creat mpx comm channel */
where reboot landed in the slot allocated for it. There's a new command in /etc/ that calls it (by the syscall number, not the normal wrapper). It wasn't in 2.8BSD, but is also present in a similar form in 2.9BSD and later. 3BSD still has the placeholder pointing at nosys. Since the 2BSD evolution tracked 4BSD, I'll not call it out further.

In 4.0BSD (1980), reboot() just calls a machine dependent boot() routine. It called update(), which scheduled the writes as described above. It printed that it was waiting for the IO to finish. However, the 'wait for it' code was basically 'sleep(5)' and so if all the data didn't get out in 5 seconds, bad things would happen. So the "sync sync sync halt" dance was still useful advice. It would get the ball rolling and easily double the amount of time that the data had to make it to the disk, depending on the typist... 4.1c (1982) bumped this to 10s and had ifdef'd out code to try to wait for all the dirty buffers to clear.

In 4.2BSD (1983), the wait for the writes code is engaged. It tries up to 20 times to walk the list of bufs in the system to let the buffers drain. So progress is made here. 4.3 adds a delay that's 40ms * iter (total of 8.4s)... which remains through at least 4.4BSD (1993)...  So while things got better, so did systems, and more and more I/O could be piled up. Successor BSD systems improved on this as well, including various ways to solve it (mostly when systems got big enough so update(8) couldn't flush all the I/O in 30s before the next sync call started).
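The 8.4s figure is just the sum of the per-iteration delays. A quick check, assuming the 20 iterations are numbered 1 through 20 (if the loop counted from 0 to 19 instead, the total would be 7.6s):

```python
def total_delay_ms(iterations=20, step_ms=40):
    """Sum of step_ms * iter for iter = 1..iterations."""
    return sum(step_ms * i for i in range(1, iterations + 1))

print(total_delay_ms() / 1000, "seconds")
```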

AT&T Unix

Meanwhile, on the AT&T side of things...  System III (1980) didn't have anything. System Vr1 doesn't have anything. None of the Programmer Work Bench releases (PWB) have a reboot(2).

System Vr2 (1984) defined a new uadmin(2) system call. It acted like an indirect system call (you passed it what you wanted it to do as the first arg). One of these, A_REBOOT, is called from fsck, but the kernel doesn't implement it.

Fast forward to System Vr3 (1987) and we find an implementation. It's basically a call to umount for the / filesystem followed by a call to reset the CPU. The other filesystems are unmounted as part of shutdown process before uadmin(A_REBOOT, ..) gets called, so only / remained mounted by the time it's called. This call to umount flushes the dirty buffers, and cleanly unwinds everything else so nothing is pending when the call to the CPU reset happens. So finally the 'sync sync sync halt' problem had been solved... Well, maybe... There's no timeout, nor any way to avoid deadlock. Still, a fairly clean solution to the issue, especially relative to what BSD was doing at the time.

Commercial Unix

Even leaked source from the early days of commercial Unix is hard to come by. By the time SunOS gets to 4.1, however, there was a vfs_syncall() which the MD boot() function called to synchronize everything (it only returned when all the scheduled I/O was done). I've not checked to see if there was a halting problem here or not, but empirical evidence of rebooting a lot of Suns suggests that it was rarely a problem in practice if any of the I/Os somehow got wedged... Or that I was lucky enough not to have a large enough fleet of machines where flaky disk problems start to show up... I can't find any earlier versions of the Sun sources, so it's hard to know when this solution entered into the tree (the code I have looked at dates from 1994, 11 years after the initial release). I suspect Sun solved this problem early, but have no proof of this beyond a hunch.

Other Unixes, that aren't just System V ports, are hard to find in source form, so I can't say from original sources whether or not they solved the issue prior to System Vr3. There were also a lot of 4.2BSD and 4.3BSD ports that didn't survive to be examined either.

The one exception to this general rule was a copy of the Unisoft 1.0 kernel I found on bitsavers. It dates from 1986 (so relatively late). It has a reboot system call (number 64, not 55). That system call calls update(), like 4BSD's, but then does a big for() loop (1 to 1,000,000) before calling a routine that resets the CPU. This kernel is a V7 port, and most likely got the idea (and maybe the code) from one of the 4BSD releases. This kernel appears to be the basis of Sony's SUNIX (which appears to be a 7th Edition-based Unix that predated Sony's NEWS-OS based on 4.2BSD). NEWS-OS likely behaved like 4.2BSD, but I can't confirm that due to lack of sources. If you know more about SUNIX or NEWS-OS, please leave a comment.

Linux

Linux's sync call is synchronous. You get the same guarantees as you do from fsync. This behavior was introduced in 1.3.20, released in 1995. Prior to that the same sync dance advice was useful since early versions of Linux were more aggressively asynchronous in their handling of disk writes than other contemporary systems. While this helped it compete in benchmarks, it caused data integrity problems when Linux machines started to be put into production (which was one of the reasons motivating the change). Modern Linux systems flush out all the dirty buffers as part of the shutdown sequence and wait for the flush to complete before proceeding to reboot, turning the system off or halting.

Conclusion

For years, I'd been told the reason for the 3 sync dance was a driver bug, long since fixed, in a DEC disk driver in v6. However, digging into it shows that there were decent reasons for doing this dance, even after Unix learned to reboot() itself.

20200721

Adding Networking to the 2.11BSD pl 195 system

Adding Networking to 2.11BSD pl 195

So, now that I have an auto-installer, it would be nice, sometimes, to have networking. It also gives us a chance to push the envelope a little and learn about kernel builds. These instructions are for FreeBSD, alas, and other systems may differ. However, the first section is the only part that's FreeBSD specific.

Simh config changes

I added the following to my simh.ini file:
SET XQ ENABLED
SET XQ TYPE=DEQNA
SET XQ MAC=08-00-4b-13-37-12
ATTACH XQ tap:tap0
and I also have the following in my /etc/rc.conf file:
cloned_interfaces="tap0 bridge0"
ifconfig_bridge0="addm em0 addm tap0 up"
ifconfig_tap0="up"
and the following in my /etc/sysctl.conf file:
net.link.tap.user_open=1
net.link.tap.up_on_open=1
These create a tap0 device, add it to the bridge0 device along with em0. em0 is my ethernet device, configured elsewhere.

For other OSes, you'll need to do something different for this section, but the rest of the blog is still useful.

Building a Networking Kernel

The GENERIC kernel that's installed lacks networking. It works on a wide variety of machines, but since the installation is from tape, there's no networking support (it's too big). So you'll need to build a networking kernel. This can be a bit tricky in the general case. However, given the machines selected, there's a close match to the SMS kernel. The instructions in the installation guide, however, are a bit off. Well, not wrong, per se, but configuring kernels on 2.11BSD can be tricky. The documented instructions are straightforward enough:
# cd /usr/src/sys/conf
# ./config KERNEL
# cd ../KERNEL
# make all
# make install
which are familiar to old-time BSD hackers. Configure the kernel, cd to the build dir, then build and install it. However, with 2.11BSD, space is at a premium. This means that sometimes the make all will fail because something is too big. The instructions for this are vague (move things around until it works, here's a few constraints to work with).

2.11BSD uses overlays to fit into the 64kB address space that the kernel has to work with. Well, on the PDP-11s that 2.11BSD runs on, there's actually 128kB of address space: 64kB for instructions and 64kB for data. These machines have separate I&D spaces, as this is called, managed by the MMU. And since things are managed by an MMU, the kernel uses an overlay scheme that maps the upper 8kB into an 'overlay' region, which the linker arranges to flip between as needed using one of the MMU segments. So this limits the 'base' part of the kernel to 56kB, and each overlay to 8kB. You can have many overlays (the kernel has 8 predefined). There's a small performance hit for calling routines in an overlay, but it's not too bad unless the routine is called all the time. Think of it as a fairly static form of the dynamic paging that the VAX and others would introduce later. Finally, data isn't overlaid: the sum of the data and BSS sections has to be strictly less than 64kB. Many changes in the 469 patches concern saving space in the kernel in different ways. For a primer on the details of text, data and bss, see the traditional toolchain blog post.
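The three size limits can be written down as a small checker. This is a sketch of the constraints as stated in this post only; the real linker may enforce more than this, and the function name and the sample sizes are made up for illustration.

```python
KB = 1024
BASE_LIMIT = 56 * KB      # base text segment
OVERLAY_LIMIT = 8 * KB    # each individual overlay
DATA_LIMIT = 64 * KB      # data + bss must be strictly below this

def check_layout(base, overlays, data, bss):
    """Return a list of human-readable problems; empty means it fits."""
    problems = []
    if base > BASE_LIMIT:
        problems.append(f"base is {base - BASE_LIMIT} bytes too big")
    for number, size in enumerate(overlays, start=1):
        if size > OVERLAY_LIMIT:
            problems.append(
                f"overlay {number} is {size - OVERLAY_LIMIT} bytes too big"
            )
    if data + bss >= DATA_LIMIT:
        problems.append("data + bss does not fit below 64kB")
    return problems

# Made-up sizes: the base is 512 bytes over, everything else fits.
print(check_layout(base=56 * KB + 512, overlays=[8000, 7000],
                   data=30000, bss=20000))
```

Moving an object file from the base into an overlay shrinks the first number and grows one of the overlay numbers, which is the whole game when a build doesn't fit.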

So the trouble comes when you configure too many things into the kernel (or the options align just so in a bad way to create a base > 56kB or any one overlay > 8kB). So, when this happens, you have to move things around a little (or a lot). The canonical way to do that is to edit the /usr/src/sys/KERNEL/Makefile to move different .o files around so things fit. And when I've done this movement, I've had to do a make clean and start over to get the kernel to link properly (some of the helper programs built aren't properly rebuilt). Oh, and if the overflow is really big, the build stops at a step where you think there's undefined symbols. But they are just from the network stack. I'll explain that in a future blog since it's a little non-standard, but kinda clever in how they shoe-horned a 62kB network stack into the kernel that's already out of space...

So, if you were to follow those directions for the SMS kernel at patchlevel 195, you'd hit this snag. One nice thing about the config shell script is that it saves a copy of the old Makefile. The SMS kernel comes pre-configured (eg pre-tweaked to be small enough). Diffing the makefiles reveals the fix: it moves vm_swp.o from the BASE set of objects to the OV6 set of objects. These are the swapping routines that swap things out. So while this is less than ideal, this is for an I/O operation, so the slight overhead of MMU segment flipping is acceptable. However, knowing what you can move and why it's OK is a black art that's likely learned by a lot of trial and error (and a good backup copy of Unix for the inevitable ooops).

tl;dr: don't follow the directions. Instead, run "cd /usr/src/sys/SMS; make all; cp /unix /genunix; make install". But don't reboot just yet. Oh, and only do the cp if you've not installed a kernel before. This will be your backup to boot if you install a bad kernel, and it's best to never change that.

This is the issue, I think, that I'm running into with the kernel reconstruction for my as released project.

2.11BSD Network Configuration

Edit /etc/netstart. You'll need to fill in these lines:
hostname=my.domain.name
netmask=255.255.255.0
broadcast=127.255.255.255
default=127.0.0.0
as appropriate. I leave the default route alone, since I don't want this machine on the internet. You'll also need to uncomment the qe0 line and change 192.26.147.13 to the IP address for this machine. Then, when you reboot, you'll be able to get to the machine.
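If you're unsure what the broadcast value should be for your address and netmask, it can be computed on the host side. This is just a convenience sketch using Python's ipaddress module; the function name is made up, and 192.26.147.13 is only the sample address from the stock netstart file, not one you should use.

```python
import ipaddress

def netstart_values(address, netmask):
    """Compute the netmask and broadcast strings for /etc/netstart."""
    iface = ipaddress.ip_interface(f"{address}/{netmask}")
    return {
        "netmask": str(iface.netmask),
        "broadcast": str(iface.network.broadcast_address),
    }

print(netstart_values("192.26.147.13", "255.255.255.0"))
```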

You may want to populate /etc/hosts. And maybe /etc/networks. Following the documentation for 4.2BSD or 4.3BSD network configuration will get the job done.

Create a user

Since you don't want to login as root, create a user using vipw. The sms account can be used as a template for what to do. You'll want to add this user to the 'wheel' group in /etc/group. You'll also want to make a home directory, chown it to this user, populate dot files, etc. And you'll likely want to set a password, but since you are using telnet, the password will be transmitted in the clear, so don't use a valuable one.

If you really do want to login as root, mark the PTYs as secure in /etc/ttys. You'll likely not want to do this. But it's a useful hack if you forgot to start simh in a 'screen' session and you want to hack as root from another room. OK for the short term, but you really don't want to leave it in this state.

Reboot

Use halt to halt 2.11BSD and get back to the simh> prompt. Type boot rq0 and hit return at the : prompt. Hit ^D at the # prompt for single user and then log in. ifconfig qe0 should show everything configured correctly. ftp and telnet are configured by default, for better or worse.

20191010

Video Footage of the first PDP-7 to run Unix

Hunting down Ken's PDP-7: video footage found

In my prior blog post, I traced Ken's scrounged PDP-7 to SN 34. In this post I'll show that we have actual video footage of that PDP-7 thanks to an old film from Bell Labs. This gives us almost a minute of footage of the PDP-7 Ken later used to create Unix.

The Incredible Machine

The Incredible Machine is a Bell Labs film released in 1968 and available on youtube here: https://www.youtube.com/watch?v=iwVu2BWLZqA It outlines a number of innovative computer things that Bell Labs was doing around audio and visual things with different PDP machines from DEC. Pretty cool, right? Especially because this film features the song "Daisy" sung by a computer, a plot point that would feature heavily in Stanley Kubrick's 2001: A Space Odyssey (although that plot point was set in 1962, and was based on work done by IBM with the first song sung by a computer).

I'll concentrate on footage from 9:19 to 10:31 in the film. This footage talks about making computer made music. If you listen to the audio, it sounds quite quaint, although when it was made it was cutting edge.

Making the case that it's a PDP-7

Here's a screen shot from 9:43 in The Incredible Machine. From it we can make the case that we're looking at a PDP-7, hereafter called TIM PDP-7.
The screen shot looks a little boring until we compare it against two photos of PDP-7s from the archives. The first one is a photo from a DEC PDP-7 sales catalog that's available online at https://www.soemtron.org/pdp7.html. The second photo is of SN115, a machine in Oslo from the Institute of Physics that's been semi-restored (picture also from Soemtron).
I've superimposed the three photos together and highlighted 4 areas of convergence with numbers:
  1. The register panel that reports the status of an expansion cabinet. This is clearly visible in both photos in similar places.
  2. The control panel. It's clearly the same between these two photos. The control panel is used to examine and modify memory contents of the system as well as displaying internal registers of the PDP-7.
  3. The paper tape reader (option 444B). This reader is also visible from 9:19 to 9:30 in The Incredible Machine reading in a new program.
  4. The PDP-7 name badge. Although it's quite obscure in these photos, it's clearly the same.
So, I think it is safe to conclude that the computer in this footage is a PDP-7. We have two different pictures of actual PDP-7s that the computer in The Incredible Machine clearly corresponds to. I'll leave it as an exercise for the reader to exclude all the other machines from that era, though my experience suggests that the register and control panels should be enough.

Hunting the Serial Number for this PDP-7

So we have found footage of a PDP-7 from Bell Labs. That's cool, can we push the envelope further and track down which serial number TIM PDP-7 might be? Let's look at the key features of the machine in the picture above and the video footage.
  • Option 444B, paper tape reader (Also seen in 9:19-9:30 in The Incredible Machine)
  • Option 340 Display (seen 10:06-10:14 and 10:22-10:25)
  • Option 370 High Speed Light Pen (seen 10:06-10:25 as well)
So, if we look at the PDP-7 field service list available at https://www.soemtron.org/downloads/decinfo/18bitservicelist1972.pdf (itself an excerpt of a more complete one at bitsavers), we find there's two machines with the display and light pen: SN34 and SN149.

Ken's machine (SN34) has all these options:
The other candidate machine (SN149) in the list has them as well:

So how can we decide which is which?

If we look at dates, we see that the SN34 machine was in place early enough to be in a 1968 film with an installation date of 1965. SN149 appears to be too late with a 1969 date. However, that's not conclusive. The other fields are blank and SN148 and SN150 both have 1967 dates. It's weakly suggestive, so we need more. We can't eliminate it based on dates, as pleasing as it would be to do so.

We may be able to eliminate the TIM PDP-7 as SN149 because TIM PDP-7 clearly had the Option 444B paper tape reader, and SN149 doesn't list that in the field service log. Based on this we can exclude SN149, but only weakly because the paper tape readers were common.

Can we make the case stronger? The service logs show that SN149 has an Option 550/TU55, which is a DECtape and controller, while SN34 does not. Ken Thompson has confirmed there was no DECtape, just paper tape, on the machine he used. If we could confirm this machine didn't have a DECtape, our case would be strong for it being SN34.

Looking at the footage is hard because it is so dark. Even so, we can see a blank panel over the Option 444B paper tape reader shown starting at 9:19, though it's hard to be sure. If we look at the 9:43 frame above, we can't tell. When the color balance is adjusted we see the following:
we can clearly see here the card reader from the initial footage and what appears to be a blank panel above. There's no tell-tale circles that would indicate an installed DECtape there. Single stepping the video with this enhancement shows no other targets. There is something weird just over the younger gentleman's head, but it's not a DECtape.

Looking at the field log, the DECtape components were serviced in 1969, after this film was made. It's not clear if this was when the parts were added, or if they were merely repaired or replaced. After studying the field service log for a while, I think we'd bias our data towards replacement rather than installation. Especially since there's no other bulk input media, like a paper-tape, listed.

Pulling it all together: we have clearly found a PDP-7. There were only 4 PDP-7s shipped to AT&T. Only two had the 340 display option clearly seen in the film. Of those two, one had DECtape, the other had a paper-tape reader. We know from Ken that his had a paper-tape reader. There's no DECtape evident in this film, but clear evidence of the paper-tape reader. It's not known where either SN34 or SN149 lived inside Bell Labs, but we know that Ken used a machine that had been cast off from the Visual and Acoustics Department. While the film doesn't list the internal departments that contributed to it, the computer generated music strongly suggests it could have been the Visual and Acoustics Department. Taken together, we can say that three lines of evidence support that the PDP-7 in The Incredible Machine from 9:19 to 10:30 would later be used by Ken to create Unix.
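The elimination argument boils down to a set comparison. The sketch below only encodes the options discussed in this post, not the full field service entries, and it inherits the same caveat as the argument itself: the service list may be incomplete. The names are made up for illustration.

```python
# Options seen on the machine in the film.
OBSERVED = {"444B", "340", "370"}

# Options that would be visible if present, but aren't (the DECtape).
NOT_SEEN = {"550", "TU55"}

# Only the options discussed in this post, per candidate machine.
CANDIDATES = {
    "SN34": {"444B", "340", "370"},
    "SN149": {"340", "370", "550", "TU55"},
}

def consistent(observed, not_seen, candidates):
    """Keep candidates that list every observed option and no unseen one."""
    return [
        serial for serial, options in candidates.items()
        if observed <= options and not (not_seen & options)
    ]

print(consistent(OBSERVED, NOT_SEEN, CANDIDATES))
```

SN149 fails on both counts (no 444B listed, DECtape listed), which is why the two weak arguments reinforce each other.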

20190709

The PDP-7 Where Unix Began

Serial Number of First Unix System

In preparation for a talk on Seventh Edition Unix this fall, I stumbled upon a service list from DEC for all known PDP-7 machines. From that list, and other sources, I believe that PDP-7 serial number 34 was the original Unix machine.
PDP-7 System from DEC sales literature

Building The Case

We know from simh sources, the restored PDP-7 Unix version 0 sources, and recollections from the time that the original machine used by Ken Thompson and Dennis Ritchie had, or likely had, the following hardware:
  1. 8K words of memory (option 149B)
  2. A tape reader (option 444B)
  3. A tape punch (option 75D)
  4. A 1MB disk drive (RB09, same as an RD10)
  5. A tty controller for a teletype (option 649)
  6. A standard video display (option 340)
  7. A custom video display (Unknown option number)
  8. A keyboard for input (also option 340)


We know from the service list that Bell Labs had three PDP-7s and one PDP-7A. Several of these machines had the standard options (the tape reader and teletype) and extra memory. Only one system, serial number 34, also had a disk drive, a custom unknown board that could be a Bell Custom display, and the standard display. In addition, that system shipped to Bell Labs in 1965 and appears to have been refurbished in 1969. This timeline matches the oral histories describing a discarded PDP-7 used to bring up the system in late 1969.

Here's the full table of all the systems shipped to Bell Labs with each system's options, taken from the 18-bit service list provided by Bob Supnik. You can check the list for other contenders.

Serial Number | Option # | Option Name | Ship Date
PDP-7 #3      |          |  | 1100?
              | 7        | PDP7 CPU unit | ?
              | 75D      | Perforated paper tape punch and control |
              | 173      | Data interrupt multiplexer | 07-68
              | 177B     | Extended arithmetic element | 1128?
              | 444B     | Perforated tape reader and control |
              | 550A     | DECtape dual magnetic tape control | 12-67
              | 649      | Teleprinter and control |
              | CR01B    | 100 cpm card reader and control |
              | TU55     | Single DECtape transport | 12-67
              | TU55     | Single DECtape transport | 12-67
              | TU55     | Single DECtape transport | 03-69
PDP-7 #34     |          |  | 01-69
              | 75D      | Perforated paper tape punch and control | 07-65
              | 149B     | Core memory module 8K, extends in 8K blocks | 07-65
              | 177      | Extended arithmetic element | 07-65
              | 340      | Precision incremental CRT display | 07-65
              | 342      | Symbol generator for 340 display, first 64 characters | 07-65
              | 370      | High speed light pen | 07-65
              | 444B     | Perforated tape reader and control | 07-65
              | 649      | Teleprinter and control | 07-65
              | CR01B    | 100 cpm card reader and control | 12-66
              | PDP7     | CPU unit | 07-65
              | RC09     | RB09 disk? | 01-69
              | 76 05477 | Custom Bell Labs Display? | 01-69
PDP-7 #44     |          |  | 11-65
              | 75D      | Perforated paper tape punch and control |
              | 149B     | Core memory module 8K, extends in 8K blocks | 11-65
              | 177      | Extended arithmetic element |
              | 444B     | Perforated tape reader and control |
              | 649      | Teleprinter and control |
              | PDP7     | CPU unit | 11-65
PDP-7A #149   |          |  | 03-69
              | 149      | Core memory module 4K, extends subsequent 4K blocks |
              | 175      | Information collector expansion |
              | 175      | Information collector expansion | 03-69
              | 177B     | Extended arithmetic element |
              | 340      | Precision incremental CRT display |
              | 347C     | CPU CRT subroutine interface |
              | 370      | High speed light pen |
              | 550      | DECtape dual magnetic tape control | 03-69
              | 637      | Bit synchronous data communication system |
              | CR01B    | 100 cpm card reader and control |
              | KA71A    | I/O device package |
              | KA77A    | Processor unit (PDP-7/A) |
              | KB03     | Device selector expansion | 03-69
              | TU55     | Single DECtape transport | 03-69
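
For anyone who wants to check the list for other contenders mechanically, here is a minimal sketch in Python. It is only an illustration: the option sets are hand-transcribed from the table above, the names `SYSTEMS` and `candidates` are made up for this example, and where the source rows run cells together (the first machine's serial number, option 347C) the split used here is an assumption.

```python
# Option numbers per machine, hand-transcribed from the service-list table.
# Assumption: the first machine is read as serial #3, and "347C" is read
# as a single option number; the flattened source rows are ambiguous.
SYSTEMS = {
    "PDP-7 #3": {"7", "75D", "173", "177B", "444B", "550A", "649",
                 "CR01B", "TU55"},
    "PDP-7 #34": {"75D", "149B", "177", "340", "342", "370", "444B",
                  "649", "CR01B", "PDP7", "RC09", "76 05477"},
    "PDP-7 #44": {"75D", "149B", "177", "444B", "649", "PDP7"},
    "PDP-7A #149": {"149", "175", "177B", "340", "347C", "370", "550",
                    "637", "CR01B", "KA71A", "KA77A", "KB03", "TU55"},
}


def candidates(systems, required):
    """Return the sorted names of systems that list every required option."""
    return sorted(name for name, opts in systems.items() if required <= opts)


# Systems listing the 340 display.
print(candidates(SYSTEMS, {"340"}))

# Systems listing the 340 display and the 444B paper tape reader.
print(candidates(SYSTEMS, {"340", "444B"}))

# Systems listing the disk entry (RC09 in the table).
print(candidates(SYSTEMS, {"RC09"}))

# Systems listing a DECtape transport.
print(candidates(SYSTEMS, {"TU55"}))
```

With the data as transcribed, the 340 display narrows the field to serial numbers 34 and 149, and adding either the paper tape reader or the disk leaves only serial number 34. The filter only reflects what the service list records, so an option that was installed but never logged would not show up.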

Another surprise

V0 Unix could run on only one of the PDP-7s. Of the 99 PDP-7s produced, only two had disks. Serial number 14 had an RA01 listed, presumably a disk, though of a different type. In addition to the PDP-7 being obsolete in 1970, no other PDP-7 could run Unix, limiting its appeal outside of Bell Labs. By porting Unix to the PDP-11 in 1970, the group ensured Unix would live on into the future. The PDP-9 and PDP-15 were both upgrades of the PDP-7, so to be fair, PDP-7 Unix did have a natural upgrade path (though the PDP-11 outsold the 18-bit systems by roughly 600,000 to 1,000). Ken Thompson reports in a private email that there were two PDP-9s and one PDP-15 at Bell Labs that could run a version of the PDP-7 Unix, though those machines were viewed as born obsolete.


Please see this follow-up post, where I make the case that footage of the PDP-7 Ken would later use has been found on YouTube.