20191010

Video Footage of the first PDP-7 to run Unix

Hunting down Ken's PDP-7: video footage found

In my prior blog post, I traced Ken's scrounged PDP-7 to SN 34. In this post I'll show that we have actual video footage of that PDP-7, thanks to an old film from Bell Labs. This gives us almost a minute of footage of the PDP-7 Ken later used to create Unix.

The Incredible Machine

The Incredible Machine is a Bell Labs film released in 1968 and available on YouTube (https://www.youtube.com/watch?v=iwVu2BWLZqA). It outlines a number of innovative things Bell Labs was doing with audio and visuals on different PDP machines from DEC. Pretty cool, right? Especially because this film features the song "Daisy" sung by a computer, a plot point that would feature heavily in Stanley Kubrick's 2001: A Space Odyssey (although that plot point was set in 1962, and was based on work done by IBM with the first song sung by a computer).

I'll concentrate on footage from 9:19 to 10:31 in the film. This footage talks about making computer-generated music. If you listen to the audio, it sounds quite quaint, although it was cutting edge when it was made.

Making the case that it's a PDP-7

Here's a screen shot from 9:43 in The Incredible Machine. From it we can make the case that we're looking at a PDP-7, hereafter called TIM PDP-7.
The screen shot looks a little boring until we compare it against two photos of PDP-7s from the archives. The first one is a photo from a DEC PDP-7 sales catalog that's available online at https://www.soemtron.org/pdp7.html. The second photo is of SN115, a machine in Oslo from the Institute of Physics that's been semi-restored (picture also from Soemtron).
I've superimposed the three photos together and highlighted 4 areas of convergence with numbers:
  1. The register panel that reports the status of an expansion cabinet. This is clearly visible in both photos in similar places.
  2. The control panel. It's clearly the same between these two photos. The control panel is used to examine and modify memory contents of the system as well as displaying internal registers of the PDP-7.
  3. The paper tape reader (option 444B). This reader is also visible from 9:19 to 9:30 in The Incredible Machine reading in a new program.
  4. The PDP-7 name badge. Although it's quite obscure in these photos, it's clearly the same.
So, I think it is safe to conclude that the computer in this footage is a PDP-7. We have two different pictures of actual PDP-7s that the computer in The Incredible Machine clearly corresponds to. I'll leave it as an exercise for the reader to exclude all the other machines from that era, though my experience suggests that the register and control panels should be enough.

Hunting the Serial Number for this PDP-7

So we have found footage of a PDP-7 from Bell Labs. That's cool. Can we push the envelope further and track down which serial number TIM PDP-7 might be? Let's look at the key features of the machine in the picture above and the video footage.
  • Option 444B, paper tape reader (Also seen in 9:19-9:30 in The Incredible Machine)
  • Option 340 Display (seen 10:06-10:14 and 10:22-10:25)
  • Option 370 High Speed Light Pen (seen 10:06-10:25 as well)
So, if we look at the PDP-7 field service list available at https://www.soemtron.org/downloads/decinfo/18bitservicelist1972.pdf (itself an excerpt of a more complete one at bitsavers), we find there are two machines with the display and light pen: SN34 and SN149.

Ken's machine (SN34) has all these options:
The other candidate machine (SN149) in the list has them as well:

So how can we decide which is which?

If we look at dates, we see that the SN34 machine was in place early enough to be in a 1968 film with an installation date of 1965. SN149 appears to be too late with a 1969 date. However, that's not conclusive. The other fields are blank and SN148 and SN150 both have 1967 dates. It's weakly suggestive, so we need more. We can't eliminate it based on dates, as pleasing as it would be to do so.

We may be able to eliminate SN149 as the TIM PDP-7, because TIM PDP-7 clearly had the Option 444B paper tape reader and SN149 doesn't list that in the field service log. Based on this we can exclude SN149, but only weakly, because paper tape readers were common.

Can we make the case stronger? The service logs show that SN149 has an Option 550/TU55, which is a DECtape and controller, while SN34 does not. Ken Thompson has confirmed there was no DECtape, just paper tape, on the machine he used. If we could confirm this machine didn't have a DECtape, our case would be strong for it being SN34.

Looking at the footage is hard because it is so dark. Even so, we can see a blank panel over the Option 444B paper tape reader shown starting at 9:19, though it's hard to be sure. If we look at the 9:43 frame above, we can't tell. When the color balance is adjusted we see the following:
We can clearly see here the card reader from the initial footage and what appears to be a blank panel above. There are no tell-tale circles that would indicate an installed DECtape there. Single stepping the video with this enhancement shows no other targets. There is something weird just over the younger gentleman's head, but it's not a DECtape.

Looking at the field log, the DECtape components were serviced in 1969, after this film was made. It's not clear if this was when the parts were added, or if they were merely repaired or replaced. After studying the field service log for a while, I think we'd bias our data towards replacement rather than installation, especially since there's no other bulk input media, like a paper tape reader, listed.

Pulling it all together: we have clearly found a PDP-7. There were only 4 PDP-7s shipped to AT&T. Only two had the 340 display option clearly seen in the film. Of those two, one had DECtape, the other had a paper-tape reader. We know from Ken that his had a paper-tape reader. There's no DECtape evident in this film, but clear evidence of the paper-tape reader. It's not known where either SN34 or SN149 lived inside Bell Labs, but we know that Ken used a machine that had been cast off from the Visual and Acoustics Department. While the film doesn't list the internal departments that contributed to it, the computer generated music strongly suggests it could have been the Visual and Acoustics Department. Taken together, we can say that three lines of evidence support that the PDP-7 in The Incredible Machine from 9:19 to 10:30 would later be used by Ken to create Unix.

20190709

The PDP-7 Where Unix Began

Serial Number of First Unix System

In preparation for a talk on Seventh Edition Unix this fall, I stumbled upon a service list from DEC for all known PDP-7 machines. From that list, and other sources, I believe that PDP-7 serial number 34 was the original Unix machine.
PDP-7 System from DEC sales literature

Building The Case

We know from simh sources, the restored PDP-7 Unix version 0 sources, and recollections from the time that the original machine used by Ken Thompson and Dennis Ritchie had, or likely had, the following hardware:
  1. 8K words of memory (option 149B)
  2. A tape reader (option 444B)
  3. A tape punch (option 75D)
  4. A 1MB disk drive (RB09 same as an RD10)
  5. A tty controller for a teletype (option 649)
  6. A standard video display (option 340)
  7. A custom video display (Unknown option number)
  8. A keyboard for input (also option 340)


We know from the service list that Bell Labs had three PDP-7s and one PDP-7A. Several of these machines had the standard options (the tape reader and teletype) and extra memory. Only one system, serial number 34, also had a disk drive, a custom unknown board that could be a Bell Custom display, and the standard display. In addition, that system shipped to Bell Labs in 1965 and appears to have been refurbished in 1969. This timeline matches the oral histories describing a discarded PDP-7 used to bring up the system in late 1969.

Here's the full table of all the systems shipped to Bell Labs with each system's options, taken from the 18 bit service list provided by Bob Supnik. You can check the list for other contenders.

Serial Number  Option #  Ship Date  Option Name
PDP-7 #31                100?
               7         ?          PDP7 CPU unit
               75D                  Perforated paper tape punch and control
               173       07-68      Data interrupt multiplexer
               177B      1128?      Extended arithmetic element
               444B                 Perforated tape reader and control
               550A      12-67      DECtape dual magnetic tape control
               649                  Teleprinter and control
               CR01B                100 cpm card reader and control
               TU55      12-67      Single DECtape transport
               TU55      12-67      Single DECtape transport
               TU55      03-69      Single DECtape transport
PDP-7 #34                01-69
               75D       07-65      Perforated paper tape punch and control
               149B      07-65      Core memory module 8K, extends in 8K blocks
               177       07-65      Extended arithmetic element
               340       07-65      Precision incremental CRT display
               342       07-65      Symbol generator for 340 display, first 64 characters
               370       07-65      High speed light pen
               444B      07-65      Perforated tape reader and control
               649       07-65      Teleprinter and control
               CR01B     12-66      100 cpm card reader and control
               PDP7      07-65      CPU unit
               RC09      01-69      RB09 disk?
               76 05477  01-69      Custom Bell Labs Display?
PDP-7 #44                11-65
               75D                  Perforated paper tape punch and control
               149B      11-65      Core memory module 8K, extends in 8K blocks
               177                  Extended arithmetic element
               444B                 Perforated tape reader and control
               649                  Teleprinter and control
               PDP7      11-65      CPU unit
PDP-7A #149              03-69
               149                  Core memory module 4K, extends subsequent 4K blocks
               175                  Information collector expansion
               175       03-69      Information collector expansion
               177B                 Extended arithmetic element
               340                  Precision incremental CRT display
               347C                 CPU CRT subroutine interface
               370                  High speed light pen
               550       03-69      DECtape dual magnetic tape control
               637                  Bit synchronous data communication system
               CR01B                100 cpm card reader and control
               KA71A                I/O device package
               KA77A                Processor unit (PDP-7/A)
               KB03      03-69      Device selector expansion
               TU55      03-69      Single DECtape transport
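
Checking the list for contenders by hand is tedious, so here is a minimal sketch of the same check in C. The option strings are transcribed from the table above as I read it (duplicates dropped; the split of "347C" from its description is a guess), and the names struct machine, bell_labs, has_option and has_all are made up for this illustration.

```c
#include <assert.h>
#include <string.h>

#define MAX_OPTS 16

/* One Bell Labs machine and its option list, transcribed from the table.
 * The list is NULL-terminated. */
struct machine {
    int serial;
    const char *opts[MAX_OPTS];
};

const struct machine bell_labs[] = {
    { 31,  { "7", "75D", "173", "177B", "444B", "550A", "649", "CR01B",
             "TU55", NULL } },
    { 34,  { "75D", "149B", "177", "340", "342", "370", "444B", "649",
             "CR01B", "PDP7", "RC09", "76 05477", NULL } },
    { 44,  { "75D", "149B", "177", "444B", "649", "PDP7", NULL } },
    { 149, { "149", "175", "177B", "340", "347C", "370", "550", "637",
             "CR01B", "KA71A", "KA77A", "KB03", "TU55", NULL } },
};

/* Returns 1 if the machine lists the given option, 0 otherwise. */
int has_option(const struct machine *m, const char *opt)
{
    int i;

    assert(m != NULL && opt != NULL);
    for (i = 0; i < MAX_OPTS && m->opts[i] != NULL; i++)
        if (strcmp(m->opts[i], opt) == 0)
            return 1;
    return 0;
}

/* Returns 1 if the machine lists every option in the NULL-terminated
 * array want, 0 as soon as one is missing. */
int has_all(const struct machine *m, const char *const *want)
{
    for (; *want != NULL; want++)
        if (!has_option(m, *want))
            return 0;
    return 1;
}
```

Calling has_all() on each entry with { "149B", "444B", "75D", "649", "340", "RC09", NULL } leaves only serial number 34. Asking just for the display and light pen ({ "340", "370", NULL }) leaves 34 and 149.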

Another surprise

V0 Unix could run on only one of the PDP-7s. Of the 99 PDP-7s produced, only two had disks. Serial number 14 had an RA01 listed, presumably a disk, though of a different type. In addition to the PDP-7 being obsolete in 1970, no other PDP-7 could run Unix, limiting its appeal outside of Bell Labs. By porting Unix to the PDP-11 in 1970, the group ensured Unix would live on into the future. The PDP-9 and PDP-15 were both upgrades of the PDP-7, so to be fair, PDP-7 Unix did have a natural upgrade path (though the PDP-11 outsold the 18-bit systems by ~600,000 to ~1000). Ken Thompson reports in a private email that there were 2 PDP-9s and 1 PDP-15 at Bell Labs that could run a version of the PDP-7 Unix, though those machines were viewed as born obsolete.


Please see this followup post where I make the case that footage of the PDP-7 Ken would later use has been found on youtube....

20190211

Strange Code

Now That's Weird

I was trying to compile some ancient code I pulled off the net. It is related to the Venix stuff I've been doing on and off of late.
put = bp->b_nleft;
if (put > cnt)
    put = cnt;
bp->b_nleft -= put;
to = bp->b_ptr;
asm("movc3 r8,(r11),(r7)");
bp->b_ptr += put;
p += put;
cnt -= put;
goto top;
So that's weird, right? What the heck is that movc3 doing in the middle of that code?

This code originally ran on BSD 4.1. The only system that version of Unix ran on was a VAX (later versions were more widely ported, but 4.1 was more of a limited distribution version). OK, looking up the movc3 instruction on vax references online, we see it is the "Move Character" instruction. r8 is the length. srcaddr is (r11) and dstaddr is (r7). So in effect, someone has done an inline of bcopy() here. Now, that's half of the problem. The other half is puzzling out what is in r7, r8 and r11 at the time of this call. In a perfect world, I'd just crank up the compiler to tell me. We live in an imperfect world where spinning up a 4.1 BSD system takes a substantial amount of time.

Fortunately, we can guess. cnt -= put gives us our first clue. We're decrementing by how much we copied, it seems. So r8 (the length) is put. OK. Now, we have this nice variable named 'to' that was most likely in the dstaddr (so r7), and we update it after (to appears only to be here for the side effect, so that's nice). But what's the from? The only thing it could logically be is 'p' since we += it by put as well.

So my best guess is that the asm can be replaced by memcpy(to, p, put); and life will be good. My spidey sense also tells me that we don't need memmove here because they aren't overlapping ranges.
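
To make that guess concrete, here is a minimal sketch of one pass of the loop in portable C. Only the field names b_nleft and b_ptr and the variables put, cnt, p and to come from the fragment above. The struct layout, the types and the function name copy_step are assumptions for illustration, and the register assignment (r8 = put, r11 = p, r7 = to) is still just the guess worked out above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the buffer structure; only the two
 * fields the fragment touches are modeled. */
struct buf {
    int   b_nleft;   /* bytes of room left in the buffer */
    char *b_ptr;     /* next free byte in the buffer */
};

/* One pass of the copy loop with the VAX movc3 replaced by memcpy().
 * Returns the number of bytes moved so the caller can do
 * "p += put; cnt -= put;" as the original does. */
int copy_step(struct buf *bp, const char *p, int cnt)
{
    int put = bp->b_nleft;
    char *to;

    if (put > cnt)
        put = cnt;
    assert(put >= 0);              /* memcpy takes an unsigned length */
    bp->b_nleft -= put;
    to = bp->b_ptr;
    memcpy(to, p, (size_t)put);    /* was: asm("movc3 r8,(r11),(r7)") */
    bp->b_ptr += put;
    return put;
}
```

The sketch keeps the original's statement order, so the memcpy() call lands exactly where the asm was.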

20181217

Adding additional revisions

Adding other directories

Sometimes you need to add commits from other places / directories to a repo you've slimmed down. This post uses the timed example to offer some advice.

SMM.doc

Fortunately for us, the SMM.doc directory was moved late in the game. As such, it was easy to edit the commit stream to remove that commit, and then replay all the commits that came after it. Fortunately, there was only one, which removed clause 3 (the advertising clause). That was done by hand, committed and the original commit message pasted into the log. I then used git rebase to order this commit in the right place temporally.

etc/rc.d/timed

For this directory, I followed a different path. After looking at this file (or, as it is currently called, libexec/rc.d/timed), I determined there were only a few real commits. Since there were only 10 commits, I just created a dumb script to run in the FreeBSD root of a github mirror repo:
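#!/bin/sh
# Export each commit that touched etc/rc.d/timed as a numbered patch file.

j=1
d=/tmp/timed-junk
mkdir -p "$d"
# tail -r (BSD) reverses the log so the oldest commit comes first.
for i in $(grep ^commit /tmp/3 | awk '{print $2;}' | tail -r); do
        # Rewrite /etc/rc.d to /rc.d in the patch text.
        git show "$i" etc/rc.d/timed | sed -e 's=/etc/rc.d=/rc.d=g' > "$d/$(printf %04d "$j")"
        j=$(($j + 1))
done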
Where /tmp/3 had 'git log etc/rc.d/timed' filtered to remove all the bogus commits (e.g. the merge ones).

Once I had these in place, I was able to then import them into my repo by cd'ing to the root and running
git am --patch-format=stgit /tmp/timed-junk/*
I oopsed and let a merge commit sneak through, and if you do that too, you can just delete the file in /tmp/timed-junk. Also, I don't know why it didn't autodetect the format, but with an explicit format it just worked.

This produced 9 commits that resulted in the same timed file as was in svn. I cheated a little and omitted the movement commits, and since this is in git, $FreeBSD$ isn't expanded. This time, I didn't bother to sort them into the stream chronologically since I have no automation to do that and 9 commits by hand was more than I had time for.

Push the result

Since I rebased, I had to do a forced push. Should someone come along and want to make this a port, I'll do the sorting of commits then and do another forced push then publish the final results under FreeBSD's github account rather than my own personal one.

Splitting up a git repo -- Single directory

FreeBSD has a large, sprawling svn repo that was once a CVS repo. There are times that things in that repo have outlived their usefulness. Sometimes those items are best moved to a FreeBSD port. One easy way to manage a port is to toss it into a github repo and have the port point there. This article discusses how to do that. While it should be 'easy', getting a clean history has a few gotchas.

Clone the FreeBSD repo

If you are contemplating moving something out of the FreeBSD repo, one way to do that is to take the FreeBSD github mirror and trim it. When doing this, I always use a new repo, but in theory you could do it in an existing repo. Given how much is tossed away, it's best to use a fresh copy to avoid disaster. Disk space is cheap, right? Here's what I used to kick off pulling timed into a separate repo.

git clone https://github.com/freebsd/freebsd timed

First pass at trimming

When one googles the topic, git filter-branch comes up. The canonical answer is a good starting point:
git filter-branch --prune-empty --subdirectory-filter usr.sbin/timed master
will do the first pass. This will leave just timed as the top level directory. For the moment, we'll leave aside the stray timed files elsewhere in the tree. That gets 'complicated', which we'll explore in the second part of this blog. It would be good to drop a 'go back' tag here:
git checkout -b timed-trimmed
Now, this gives a decently trimmed tree. However, there are some problems. --prune-empty is a lie, or to be more charitable, it is incompletely implemented. It doesn't prune every single thing, especially merge commits. Those are retained, but should be omitted. So the next step is to use the very flexible history rewriting "feature" of git to remove them.

Next, use git rebase to rebase things. There may be smoother ways to do this, but I find the first version in the tree with git log | tail, and then do my rebase like so:
git rebase -i HASH_OF_FIRST_COMMIT master
Now the fun part starts. For each commit you suspect of being a merge commit, you have to see if that hash is included in the output of git log --merges. Remove all those commits. However, this can be hard. If you have a *LOT* to sort through, it's easier to make one pass for the obvious cvs2svn and MFC commits, but if you miss one it's not the end of the world. It's also a good idea to save an unmodified version of this file. It will come in handy later if your efforts lead to only a couple of missing commits.

You can 'fix' the todo as you go, though this is tricky. Basically, when you hit an error, it's because the prior commit deleted everything as part of its 'merge'. So, to back up one, you need to just do this set of magic, I'll show, then explain:
git rev-parse HEAD
Get the hash this prints.
git rebase --edit-todo
Add 'pick <hash>', using the hash from the previous step. This will make sure we keep the commit we're about to toss.
git reset --hard HEAD^
This resets the current mess (which is still in the todo list) back to one before HEAD. At this point, you're back one commit in the rebase and have effectively skipped the troublesome commit.
git rebase --continue
This reapplies what was HEAD and then proceeds. See Git Rebase Stepping Forward and Back for more info. It turns out things are a bit tricky in that you want to make sure you are dumping the bad commit, and keeping earlier ones, and the behavior, especially when multiple branch merges happen, is a bit variable.

In the timed example, I needed to do this about a dozen times. I suspect that will be fairly typical as I had similar issues when I created the ctm repo.

Test: did I do it right?

Next, you need to make sure that you did things correctly. There's always a chance you'll drop commits that shouldn't be dropped. I ran a diff against my base FreeBSD tree and noticed that at least one commit was lost, so I had to go find it. It turned out I had missed 3 or 4 commits, so I used git reflog to find the result of the filter-branch and started over.

After I redid things (which I've not reproduced here, the second time was much easier) I did the diff and found one commit missing. I found it in my original todo file, so was able to do the rebase again, add it to the end (to make sure it applies) and then move it to the right place chronologically....

Push the result for testing

I created a new upstream (in my personal space, not FreeBSD's, which I don't have permission to push to). I created a new repo, then pushed the 'master' branch from this repo upstream. I know I'm missing the docs (which ironically were copied away early on) and the rc files. Those will be covered in the second part of this.

20181116

Backing up git repos on multiple machines to central repo w/o collision

Git Tree Management

Here's a quick column about git. It's not a complete how-to or tutorial, but more an interesting way to manage multiple trees.

The problem: I have a dozen trees on a half dozen machines. I'd like to at least back up all the branches in these trees to github. Trouble is, I don't want branch names to step on each other. This can happen for a number of reasons. Let's say I called something 'junk' by habit on N trees and don't want a push to screw that up...

Git's world view: To understand git, you have to understand that it is a graph of versioned trees with labels. Each node in the tree has the familiar hash, and some of the hashes have refs that the git branch command groks. It's all just a directed graph with labels under the covers.

Normally, when you clone a repo, all its branch names magically change from foo to origin/foo (for some value of origin).

Enter refspecs

Turns out this has been thought of before. The answer is simple:
fred% git push origin foo:fred/foo
will push the foo branch to your origin and rewrite its name to fred/foo. Don't forget to push master too.

Now when you go to barney and fetch, you'll have a bunch of remote branches named origin/fred/foo, etc.

Costs

Since I'm doing this with a number of git svn trees, the cost is kinda high since git svn creates new, unique git hashes for all the upstream revisions to git svn rebase. It also means that you'll need to learn how to use the --onto arg of git-rebase if you want to move a branch from one repo like this to another:
barney% git checkout -b foo origin/fred/foo
barney% git rebase -i --onto master origin/fred/master foo
You need --onto because you're effectively creating a new name space on your local machine for the new branch. The rebase will now properly take just those commits from foo, and then play them back onto master on the current machine and leave you with a 'foo' branch for the results.

20181110

Most things in the VENIX emulator are working

SUCCESS

So I got tired of the terrible progress I was making chasing down issues. I thought if I could just create a simple program and get that working, I'd have much better luck.

So I wrote a simple K&R style C program:
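int a=123;                              /* lives in initialized data */
int b;                                  /* lives in bss */
extern char *etext, *edata, *end;       /* linker-provided segment ends */
main() {
        int c;                          /* lives on the stack */
        /* segment registers */
        printf("CS: %x DS: %x ES: %x SS: %x\n", getcs(), getds(), getes(), getss());
        /* one address from each of data, bss and stack */
        printf("&data = %x &bss = %x &stack = %x\n", &a, &b, &c);
        printf("etext = %x edata = %x end = %x\n", &etext, &edata, &end);
}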
which printed the segment addresses and then the locations of the segments, and I ran it on my old Rainbow running Venix.

I made some interesting discoveries. First, that there are two kinds of stacks (low and high) in addition to there being two kinds of binary (OMAGIC and NMAGIC). So my loader was all wrong. Next, I discovered I needed to jump to a_entry, not 0, to make low stack binaries work (all the ones I'd been testing so far were low stack, but somehow mostly worked when the stack and text segments were swapped).

Armed with this knowledge, I built 4 binaries (no flags, -z, -i, -i -z) to test all 4 cases. The -z ones worked (yea!) while the non-z ones didn't. My loader was right in this case, but I was returning EFAULT for all the writes. Why? Because I had a check in there to make sure the address was between 0 and brk. High stack binaries also have a valid area from sp() to 0xffff. When I added that change, all 4 test programs worked.
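
Here is a minimal sketch of what the corrected address check could look like. It is not the emulator's actual code: struct proc_mem, its fields and the function addr_ok are names invented for this illustration, and it assumes a single 64K data segment with brk and sp as 16-bit offsets into it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical view of the emulated process; field names are
 * illustrative and not taken from the emulator's source. */
struct proc_mem {
    uint16_t brk;         /* current break: end of data, bss and heap */
    uint16_t sp;          /* current stack pointer */
    int      high_stack;  /* nonzero if the stack sits at the top of the segment */
};

/* Returns 1 if the byte range [addr, addr + len) is valid for the
 * process, 0 if the syscall should fail with EFAULT. */
int addr_ok(const struct proc_mem *m, uint16_t addr, uint16_t len)
{
    uint32_t end = (uint32_t)addr + len;  /* widened so wraparound is visible */

    assert(m != 0);
    if (end > 0x10000u)                   /* runs off the end of the segment */
        return 0;
    if (end <= m->brk)                    /* the original check: 0 .. brk */
        return 1;
    if (m->high_stack && addr >= m->sp)   /* the added check: sp .. 0xffff */
        return 1;
    return 0;
}
```

With only the first test in place, any buffer living on a high stack is rejected, which matches the EFAULT on every write described above.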

Of course, getting them from the Rainbow to the server was a challenge. The key thing to remember here is that you need to use 'set line /dev/com1.m' on the Rainbow so that kermit will work on the login port. I also had to down-clock to 2400 baud to make it reliable.

So, I started testing a lot of programs that failed to work before. Sort(1) is now working. ls isn't, but comes closer (it tries really hard to interpret a modern FreeBSD dirent as a v7 one and that's not so good, but that's fixable). nm is still giving me problems, for reasons unknown. I have enough things working, though, that I can start to try out as, ld and friends. Maybe even cc (though I'd need to get both fork and signals working for that driver program). /bin/sh fails missing dup() (and likely a bunch of others).
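
For the ls problem, one possible fix is to hand the emulated program directory entries in the V7 layout instead of the host's. The sketch below assumes the classic V7 entry, a 16-bit inode number followed by a 14-byte name, stored little-endian for the 8088. The names struct v7_direct and to_v7_direct are made up here, and rejecting inode numbers that don't fit in 16 bits is only a placeholder policy. A real emulator would need some mapping for those.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define V7_DIRSIZ 14

/* V7 directory entry as seen by the emulated program: 16 bytes total.
 * The name is NUL-padded, and not NUL-terminated when it is exactly
 * 14 characters long. */
struct v7_direct {
    uint8_t d_ino[2];             /* inode number, low byte first */
    char    d_name[V7_DIRSIZ];
};

/* Fill one V7 entry from a host inode number and file name.
 * Returns 0 on success, -1 if the name or inode can't be represented.
 * Inode 0 is rejected because V7 treats it as an empty slot. */
int to_v7_direct(struct v7_direct *out, unsigned long ino, const char *name)
{
    size_t len;

    assert(out != NULL && name != NULL);
    len = strlen(name);
    if (len == 0 || len > V7_DIRSIZ || ino == 0 || ino > 0xffffUL)
        return -1;
    out->d_ino[0] = (uint8_t)(ino & 0xff);
    out->d_ino[1] = (uint8_t)(ino >> 8);
    memset(out->d_name, 0, V7_DIRSIZ);
    memcpy(out->d_name, name, len);
    return 0;
}
```

Storing the inode as two explicit bytes keeps the layout independent of the host's byte order and struct padding.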

So excellent progress in the last few days.