20181217

Adding additional revisions

Adding other directories

Sometimes you need to add commits from other places / directories to a repo you've slimmed down. This post uses the timed example to offer some advice.

SMM.doc

Fortunately for us, the SMM.doc directory was moved late in the game. As such, it was easy to edit the commit stream to remove that commit and then replay all the commits that came after it. Fortunately, there was only one: the commit that removed clause 3 (the advertising clause). I redid that change by hand, committed it, and pasted the original commit message into the log. I then used git rebase to put this commit in the right place chronologically.

etc/rc.d/timed

For this directory, I followed a different path. After looking at the history of this file (now called libexec/rc.d/timed), I determined there were only a few real commits. Since there were only 10 commits, I just created a dumb script to run in the FreeBSD root of a github mirror repo:
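#!/bin/sh
# Export each commit that touched etc/rc.d/timed as a numbered patch file.

j=1
d=/tmp/timed-junk
mkdir -p "$d"
# /tmp/3 lists commits newest first; tail -r (BSD) reverses to oldest first.
for i in $(grep ^commit /tmp/3 | awk '{print $2;}' | tail -r); do
        # Rewrite etc/rc.d paths to rc.d in the patch.
        git show "$i" -- etc/rc.d/timed | sed -e s=/etc/rc.d=/rc.d=g > "$d/$(printf %04d "$j")"
        j=$((j + 1))
done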
Here /tmp/3 holds the output of 'git log etc/rc.d/timed', filtered to remove all the bogus commits (e.g., the merge ones).
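The grep, awk, and tail -r pipeline above depends on the BSD-only tail -r to reverse the list. Below is a minimal sketch of a portable alternative, not the script that was actually run. The helper name hashes_oldest_first is hypothetical, and it assumes the input is ordinary 'git log' output from which the unwanted entries have already been removed.

```shell
#!/bin/sh
# Hypothetical helper: read `git log` text on stdin and print the commit
# hashes oldest first. awk does the reversal, so neither `tail -r` (BSD)
# nor `tac` (GNU) is needed.
hashes_oldest_first() {
    awk '/^commit / { h[++n] = $2 }
         END { for (i = n; i >= 1; i--) print h[i] }'
}

# Example with canned input standing in for the filtered /tmp/3:
printf 'commit bbb222\nAuthor: someone\n\ncommit aaa111\nAuthor: someone\n' |
    hashes_oldest_first
```

With the canned input, the older hash (aaa111) is printed before the newer one (bbb222). Commit message lines are indented in 'git log' output, so only the header lines match the ^commit pattern.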

Once I had these in place, I was able to import them into my repo by cd'ing to its root and running
git am --patch-format=stgit /tmp/timed-junk/*
I oopsed and let a merge commit sneak through; if you do that too, you can just delete the offending file in /tmp/timed-junk. Also, I don't know why git am didn't autodetect the format, but with an explicit format it just worked.

This produced 9 commits that resulted in the same timed file as was in svn. I cheated a little and omitted the movement commits, and since this is in git, $FreeBSD$ isn't expanded. This time, I didn't bother to sort them into the stream chronologically since I have no automation to do that and 9 commits by hand was more than I had time for.
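Because svn expands $FreeBSD$ and git leaves it bare, a plain cmp of the two files will report a difference even when the contents otherwise match. Here is a small sketch of one way to check "same file apart from the keyword line". The helper name same_ignoring_id and the example paths are hypothetical, and it assumes the keyword only appears on lines you are happy to ignore entirely.

```shell
#!/bin/sh
# Hypothetical helper: succeed when two files are identical apart from
# lines carrying a $FreeBSD...$ keyword (expanded by svn, left bare by git).
same_ignoring_id() {
    a=$(mktemp) && b=$(mktemp) || return 2
    grep -v '\$FreeBSD' "$1" > "$a"
    grep -v '\$FreeBSD' "$2" > "$b"
    cmp -s "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return $rc
}

# For example (both paths are hypothetical):
#   same_ignoring_id ~/svn/head/etc/rc.d/timed ~/timed/rc.d/timed &&
#       echo "contents match"
```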

Push the result

Since I rebased, I had to do a forced push. Should someone come along and want to make this a port, I'll sort the commits then, do another forced push, and publish the final results under FreeBSD's github account rather than my own personal one.

Splitting up a git repo -- Single directory

FreeBSD has a large, sprawling svn repo that was once a CVS repo. There are times that things in that repo have outlived their usefulness. Sometimes those items are best moved to a FreeBSD port. One easy way to manage a port is to toss it into a github repo and have the port point there. This article discusses how to do that. While it should be 'easy', getting a clean history has a few gotchas.

Clone the FreeBSD repo

If you are contemplating moving something out of the FreeBSD repo, one way to do that is to take the FreeBSD github mirror and trim it. When doing this, I always use a new repo, but in theory you could do it in an existing repo. Given how much is tossed away, it's best to use a fresh copy to avoid disaster. Disk space is cheap, right? Here's what I used to kick off pulling timed into a separate repo.

git clone https://github.com/freebsd/freebsd timed

First pass at trimming

When one googles the topic, git filter-branch comes up. The canonical answer is a good starting point:
git filter-branch --prune-empty --subdirectory-filter usr.sbin/timed master
will do the first pass. This will leave just timed as the top level directory. For the moment, we'll leave aside the stray timed files elsewhere in the tree. That gets 'complicated', which I'll explore in the second part of this blog. It would be good to drop a 'go back' marker here (a branch, in this case):
git checkout -b timed-trimmed
Now, this gives a decently trimmed tree. However, there are some problems. --prune-empty is a lie, or to be more charitable, it is incompletely implemented. It doesn't prune every single thing, especially merge commits. Those are retained, but should be omitted. So the next step is to use the very flexible history rewriting "feature" of git to remove them.
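Before rewriting anything, it helps to know how many merge commits survived the filter. The sketch below lists them. The helper name list_merges and the output file name are hypothetical, and it assumes the leftovers are recorded as true multi-parent merges, which is what --merges selects.

```shell
#!/bin/sh
# Sketch: print the hashes of merge commits still reachable from a branch
# (default: master). --merges selects commits with more than one parent.
list_merges() {
    git log --merges --format=%H "${1:-master}"
}

# Inside the trimmed repo one might run:
#   list_merges master > /tmp/merges    # /tmp/merges is a hypothetical name
#   wc -l < /tmp/merges
```

Keeping the list in a file makes it easy to grep for a hash later, while editing the rebase todo list.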

Next, use git rebase to rebase things. There may be smoother ways to do this, but I find the first commit in the tree with git log | tail, and then do my rebase like so:
git rebase -i HASH_OF_FIRST_COMMIT master
Now the fun part starts. For each commit you suspect of being a merge commit, you have to check whether its hash appears in the output of git log --merges, and remove every commit that does. However, this can be hard. If you have a *LOT* to sort through, it's easier to make one pass for the obvious cvs2svn and MFC commits; if you miss one, it's not the end of the world. It's also a good idea to save an unmodified copy of the todo file. It will come in handy later if your efforts lead to only a couple of missing commits.
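Checking each todo entry against the list of merges by eye gets old fast. Here is a rough sketch of doing the cross-check with awk. It is an illustration, not what was done for timed: the helper name drop_merges_from_todo and the file names are hypothetical, and it assumes you have a file of full merge hashes (one per line, for instance from git log --merges --format=%H) and a copy of the todo list in the usual 'pick HASH subject' form.

```shell
#!/bin/sh
# Hypothetical helper: print a rebase todo list, leaving out every "pick"
# line whose (possibly abbreviated) hash is a prefix of a hash listed in
# the merges file. Usage: drop_merges_from_todo MERGES_FILE TODO_FILE
drop_merges_from_todo() {
    awk -v mf="$1" '
        BEGIN {
            # Load the merge hashes, one per line.
            while ((getline line < mf) > 0) {
                split(line, f, " ")
                if (f[1] != "") merges[f[1]] = 1
            }
        }
        $1 == "pick" {
            for (m in merges)
                if (index(m, $2) == 1) next   # todo hash is a merge: drop
        }
        { print }' "$2"
}

# For example (file names are hypothetical):
#   cp todo todo.orig                  # keep the unmodified list around
#   drop_merges_from_todo /tmp/merges todo.orig > todo
```

The output still deserves a read before it goes back into the editor, since commits like the cvs2svn and MFC ones are not necessarily recorded as merges and would pass straight through.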

You can 'fix' the todo as you go, though this is tricky. Basically, when you hit an error, it's because the prior commit deleted everything as part of its 'merge'. So, to back up one, you need to do this bit of magic. I'll show each step, then explain it:
git rev-parse HEAD
Get the hash this prints.
git rebase --edit-todo
Add 'pick HASH', using the hash from the previous step. This makes sure we keep the commit we're about to toss.
git reset --hard HEAD^
This resets the current mess (which is still in the todo list) back to one before HEAD. At this point, you're back one commit in the rebase and have effectively skipped the troublesome commit.
git rebase --continue
This reapplies what was HEAD and then proceeds. See Git Rebase Stepping Forward and Back for more info. It turns out things are a bit tricky: you want to make sure you are dumping the bad commit and keeping the earlier ones, and the behavior is a bit variable, especially when multiple branch merges happen.

In the timed example, I needed to do this about a dozen times. I suspect that will be fairly typical as I had similar issues when I created the ctm repo.

Test: did I do it right?

Next, you need to make sure that you did things correctly. There's always a chance you'll drop commits that shouldn't be dropped. I ran a diff against my base FreeBSD tree and noticed that at least one commit was lost, so I had to go find it. It turned out I had missed 3 or 4 commits, so I used git reflog to find the result of the filter-branch and started over.
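The diff in question can be as simple as a recursive compare of the original subdirectory against the new repo's working tree. A minimal sketch follows, where the helper name compare_trees and both paths are hypothetical.

```shell
#!/bin/sh
# Sketch: recursively compare the original subdirectory with the new
# repo's working tree, skipping the .git directory.
# Usage: compare_trees ORIGINAL_DIR NEW_REPO_DIR
compare_trees() {
    diff -r -x .git "$1" "$2"
}

# For example (both paths are hypothetical):
#   compare_trees ~/freebsd/usr.sbin/timed ~/timed && echo "trees match"
```

diff exits non-zero when anything differs, so the echo only fires on a clean match. If you created the timed-trimmed branch right after the filter-branch step, it still points at that result and may save a trip through git reflog when starting over.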

After I redid things (which I've not reproduced here; the second time was much easier), I did the diff and found one commit missing. I found it in my original todo file, so I was able to do the rebase again, add it to the end (to make sure it applied), and then move it to the right place chronologically.
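This is where the saved copy of the todo file earns its keep. Hashes change during a rebase, so the sketch below compares commit subjects instead. It is only an illustration: the helper name missing_from_branch and the file names are hypothetical, and it assumes subjects are unique enough to tell commits apart, which repeated subjects such as a bare "MFC" would defeat.

```shell
#!/bin/sh
# Hypothetical helper: print the "pick" lines from a saved todo list whose
# subject does not appear among the subjects of the rewritten branch.
# Usage: missing_from_branch SAVED_TODO SUBJECTS_FILE
missing_from_branch() {
    awk -v sf="$2" '
        BEGIN { while ((getline line < sf) > 0) seen[line] = 1 }
        $1 == "pick" {
            subject = $0
            # Strip "pick HASH " and, if present, a "# " before the subject.
            sub(/^pick +[^ ]+ +(# +)?/, "", subject)
            if (!(subject in seen)) print
        }' "$1"
}

# For example (file names are hypothetical):
#   git log --format=%s master > /tmp/subjects
#   missing_from_branch /tmp/todo.orig /tmp/subjects
```

Each line it prints is a candidate for the missing commit, ready to be added to the end of the todo list on the next rebase.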

Push the result for testing

I created a new upstream (in my personal space, not FreeBSD's, which I don't have permission to push to). I created a new repo, then pushed the 'master' branch from this repo upstream. I know I'm missing the docs (which ironically were copied away early on) and the rc files. Those will be covered in the second part of this.
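For completeness, the push itself is only two commands. A small sketch, assuming the empty repo already exists on the hosting side; the helper name push_to_new_upstream and the URL are hypothetical.

```shell
#!/bin/sh
# Sketch: register a new remote called "upstream" and push master to it.
# Usage: push_to_new_upstream REMOTE_URL
push_to_new_upstream() {
    git remote add upstream "$1" &&
    git push -q upstream master
}

# For example (the URL is hypothetical):
#   push_to_new_upstream git@github.com:example-user/timed.git
```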