20200627

Whither chroot?

Chroot Origins

This blog post will examine original artifacts to clear up some confusion about where chroot(2) and chroot(8) came from. The answer turns out to be simple, and the confusion was understandable. This shows the benefits of groups like TUHS in preserving Unix history, and how the kindness of Caldera and Lucent in releasing the historic Unix systems has helped in our understanding of the evolution of Unix. 

EDIT: After this post was initially published, it was revised with more links to historic artifacts (inline and in the Appendix) and a screenshot of Wikipedia. The Wikipedia chroot entry has since been updated.

tl;dr: chroot(2) came from 7th Edition Unix

chroot is system call 61 in 7th Edition Unix from Bell Labs. There is no chroot system call in 6th Edition or earlier. All derivatives of 7th edition have chroot(2) for at least 2 decades after the 7th Edition release in 1979.

What confusion?

At the time of writing, Wikipedia's entry for chroot suggested that Bill Joy had something to do with its creation in the BSD world. It turns out the entry was confused because earlier literature on the topic is also confused.

What Sparked the Confusion?

Poul-Henning Kamp created the jail system for FreeBSD. This system takes a chroot environment to the next level in terms of security. As a security device, chroot was terrible because it's fairly easy to jailbreak out of a chroot if you are root. The short version is to open '/' to get a reference to it. Then chroot to some directory further down the tree. Then fchdir to the fd you saved from '/'. Now chdir("..") a bunch of times. This will walk you back to the real root. Now chroot(".") and you are out. There are lots of variations on this theme, dozens of papers in the literature, and an almost infinite number of ways to leak references to FDs outside the jail...
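The escape described above can be sketched in C. This is only an illustration of the classic sequence, not code from any of the historic sources: the function name escape_chroot and its subdir parameter are made up for this example, it assumes the caller is root inside the chroot, and some modern systems restrict or refuse one or more of these steps. Instead of calling chdir("..") a fixed number of times, the sketch walks up until "." and ".." are the same directory, which is one of the many variations on the theme.

```c
#define _DEFAULT_SOURCE /* glibc: make sure chroot() is declared */
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: attempt the classic chroot escape.
 * `subdir` is any directory below the current root.
 * Returns 0 on success, -1 on failure with errno set.
 */
int escape_chroot(const char *subdir)
{
    struct stat cur, parent;
    int saved_errno;
    int fd;

    /* 1. Keep a reference to the current root directory. */
    fd = open("/", O_RDONLY);
    if (fd < 0)
        return -1;

    /* 2. Move the root further down the tree. */
    if (chroot(subdir) < 0)
        goto fail;

    /* 3. Step back to the old root through the saved descriptor. */
    if (fchdir(fd) < 0)
        goto fail;

    /* 4. Walk up until ".." no longer changes: the real root. */
    for (;;) {
        if (stat(".", &cur) < 0 || stat("..", &parent) < 0)
            goto fail;
        if (cur.st_dev == parent.st_dev && cur.st_ino == parent.st_ino)
            break;
        if (chdir("..") < 0)
            goto fail;
    }

    /* 5. Make the directory we reached the root again. */
    if (chroot(".") < 0)
        goto fail;

    close(fd);
    return 0;

fail:
    saved_errno = errno; /* close() may overwrite errno */
    close(fd);
    errno = saved_errno;
    return -1;
}
```

A caller would invoke something like escape_chroot("tmp") from inside the chroot and check the return value. Without root privileges the chroot call in step 2 fails and the function returns -1.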

One of the wonderful things he did was to create an extensive set of docs and write a paper about the jail(2) facilities. In this paper Mr Kamp wrote:
[CHROOT]
Dr. Marshall Kirk McKusick, private communication: ``According to the SCCS logs, the chroot call was added by Bill Joy on March 18, 1982 approximately 1.5 years before 4.2BSD was released. That was well before we had ftp servers of any sort (ftp did not show up in the source tree until January 1983). My best guess as to its purpose was to allow Bill to chroot into the /4.2BSD build directory and build a system using only the files, include files, etc contained in that tree. That was the only use of chroot that I remember from the early days.''
This paper was presented at the 2nd International System Administration and Networking Conference "SANE 2000" May 22-25, 2000 in Maastricht, The Netherlands and is published in the proceedings.

In 2000, the BSD SCCS tree was not publicly available. Dr McKusick had access to it through his role with the Computer Systems Research Group (CSRG), which produced the 4BSD releases. This predated various litigation that suggested 32V had no copyright, and the Ancient Unix License that SCO granted for 32V, so it was necessarily private per agreements between AT&T and The University of California at Berkeley.

What Actually Happened in 1982?

What happened was a shuffling of the deck chairs. The commit log, made as root, from March 18, 1982 says:

rearrange for kirk

SCCS-vsn: 4.21
and introduces chroot to ufs_syscalls.c. If you read the diffs, it also introduced 'open', 'creat' and several others to this file. These system calls are known to be in the PDP-7 Unix implementation, so it's unlikely that they were really introduced in this commit. One problem that makes this harder to track is that SCCS didn't track renames, and ufs_syscalls.c was renamed to vfs_syscalls.c in 4.4BSD.  It's quite clearly in ufs_syscalls.c in 4.1cBSD:
/*
 * Change notion of root (``/'') directory.
 */
chroot()
{

        if (suser())
                chdirec(&u.u_rdir);
}
which is identical to the code that Bill Joy added to ufs_syscalls.c. It was moved from sys4.c between 4.1BSD and 4.1c as part of the UFS work, and differs only by the BSD stylistic change of adding a blank line before the rest of the code when there are no local variables:
chroot()
{
        if (suser())
                chdirec(&u.u_rdir);
}
which, apart from the comment, is identical. Without beating a dead horse (too late?), this code is the same all the way back to 4BSD, 3BSD, 32V, 2.8BSD and finally to V7:
chroot()
{
        if (suser())
                chdirec(&u.u_rdir);
}
The code is identical from V7 all the way through 4.2BSD, which is when, according to the footnote in the jail paper, it was added. This is direct evidence that the footnote was in error.

So what was the rearrangement for Kirk? It was to move things around in the kernel to make the system calls more generic. It was code motion, nothing more, that Dr. McKusick was reporting in the private email to Mr Kamp. Now that the SCCS tree is public, via a translation to svn by John Baldwin, we can see the above.

chroot(2) Conclusions

Given that the code was moved around a lot, Dr. McKusick's mistake is understandable. Given that the code is identical to the V7 code, and it was somewhere in all the extant versions between the two (2BSD, 32V, 3BSD, 4.0BSD, 4.1BSD, 4.1cBSD and 4.2BSD), modulo a trivial whitespace change, we can conclude that Bill Joy did not introduce chroot into 4.2BSD; instead, it was moved around a lot from the original V7 code.

The FreeBSD chroot(2) manual has been updated to correct this mistake.

But what about chroot(8)?

There's some confusion about chroot(8) as well. Until recently, the FreeBSD chroot(8) manual page said:
HISTORY
     The chroot utility first appeared in 4.4BSD.
However, that too is in error (or was at least not precise enough). The error comes from the 4.4BSD release itself, which has identical text. In a sense this is not wrong: 4.4BSD was the first full release in the Berkeley world that included chroot(8). Its first appearance on any BSD tape, though, was in the interim 4.3BSD-Reno release.

But what about the AT&T world? There, more system calls are wrapped in programs to make them easier to use in shell scripts. It turns out that System III had a usr/src/cmd/chroot.c, which I won't quote here, that's a different chroot than the one that appeared in BSD (the code looks completely different, apart from the elements that have to be the same...). So, the history has been corrected to read:
HISTORY
     The chroot utility first appeared in AT&T System III UNIX and
     4.3BSD-Reno.
to represent the first time in each of the two branches of Unix after the 7th Edition that it appeared.
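To illustrate what wrapping the system call in a program amounts to, here is a minimal sketch of the core of a chroot(8)-style utility. It is not the System III or the BSD source; the function name run_in_root and its parameters are invented for this example, and a real utility would also parse arguments and report errors.

```c
#define _DEFAULT_SOURCE /* glibc: make sure chroot() is declared */
#include <unistd.h>

/*
 * Hypothetical helper: change the root directory to `newroot`,
 * then run the command in argv inside it. Only returns on failure
 * (-1, with errno set by the call that failed).
 */
int run_in_root(const char *newroot, char *const argv[])
{
    /* Change the process's notion of "/". Requires privilege. */
    if (chroot(newroot) < 0)
        return -1;

    /* chroot(2) does not move the working directory, so do it here. */
    if (chdir("/") < 0)
        return -1;

    /* Replace this process with the requested command. */
    execvp(argv[0], argv);
    return -1; /* reached only if execvp() failed */
}
```

A main() built on this would pass its first argument as newroot and the remaining arguments as the command, which is what makes the system call usable from a shell script.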

And that concludes today's software archeology deep dive on chroot...

Appendix

Here's the evolution of the chroot(2) implementation, as seen from TUHS. You'll need to search for 'chroot()' in each of these source files since the current TUHS web site doesn't allow line number links.
AT&T Unix: V7, 32V, System III, System V

I'd also like to plug the Historic Unix Repo, which also helps with navigation and allows line number links. Here's a link to the 4.1c version, for example. I recalled this after I'd found all the TUHS references, or I'd have done all of them like that.

1 comment:

Unknown said...

https://blog.dionresearch.com/2020/05/data-infrastructures-for-rest-of-us-iii.html

mentions chroot(2) being in the 7th Edition manual as well.